WO2017135896A1 - An imaging system and method for estimating three-dimensional shape and/ or behaviour of a subject - Google Patents
- Publication number
- WO2017135896A1 (PCT/SG2017/050048)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- container
- subject
- imaging system
- dimensional
- image
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/50—Depth or shape recovery
- G06T7/55—Depth or shape recovery from multiple images
- G06T7/557—Depth or shape recovery from multiple images from light fields, e.g. from plenoptic cameras
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
Definitions
- Such apparatus may be specially constructed for the required purposes, or may comprise a computer or other device selectively activated or reconfigured by a computer program stored in the computer.
- the algorithms and displays presented herein are not inherently related to any particular computer or other apparatus.
- Various machines may be used with programs in accordance with the teachings herein.
- the construction of more specialized apparatus to perform the required method steps may be appropriate.
- the structure of a computer will appear from the description below.
- the present specification also implicitly discloses a computer program, in that it would be apparent to the person skilled in the art that the individual steps of the method described herein may be put into effect by computer code.
- the computer program is not intended to be limited to any particular programming language and implementation thereof. It will be appreciated that a variety of programming languages and coding thereof may be used to implement the teachings of the disclosure contained herein.
- the computer program is not intended to be limited to any particular control flow. There are many other variants of the computer program, which can use different control flows without departing from the spirit or scope of the invention.
- the computer readable medium may include storage devices such as magnetic or optical disks, memory chips, or other storage devices suitable for interfacing with a computer.
- the computer readable medium may also include a hard-wired medium such as exemplified in the Internet system, or wireless medium such as exemplified in the GSM mobile telephone system.
- the computer program when loaded and executed on such a computer effectively results in an apparatus that implements the steps of the preferred method.
- FIG. 1 shows an imaging system according to one embodiment.
- Figure 1 shows an imaging system 100 for estimating three-dimensional shape and/or behaviour of a subject 118(a), 118(b).
- the imaging system 100 comprises a container 108 configured to receive the subject 118(a), 118(b); and an image capture device 114 optically coupled to the container and configured to capture a two-dimensional image of the subject 118(a), 118(b) in the container 108.
- the captured image comprises a plurality of micro-images, each of which represents a separate portion of a two-dimensional representation of a three-dimensional light-field in the container 108 and includes depth information of the subject 118(a), 118(b) in the container 108.
- the three-dimensional shape and/or behaviour of the subject 118(a), 118(b) in the container 108 is estimated based on the captured two-dimensional image of the subject 118(a), 118(b) in the container 108.
- the container 108 has a patterned background 120, wherein the image capture device 114 is configured to capture the image of the subject 118(a), 118(b) in the container against the patterned background 120 of the container 108.
- the subject may be a fish (e.g., zebrafish) or a mammal (e.g., a rat).
- the patterned background 120 is a randomized background comprising a high contrast two-dimensional random or pseudorandom sequence.
- the patterned background 120 comprises a two-dimensional non-repeating pattern of digital squares (e.g., an arbitrary set of lattice points) at a specific resolution. Additionally or alternatively, a resolution of the randomized background of the container 108 matches a field of view of optics of the image capture device 114.
- the image capture device 114 is a plenoptic depth camera.
- the plenoptic camera works by analyzing features in its field of view to estimate depth. If part of the image contains insufficient contrast then the depth estimation is dominated by noise.
- the addition of a patterned surface behind the subject reduces background noise for an increased signal to noise ratio.
- a randomized sequence is chosen because it is understood that the camera system operates by comparing matching images on neighboring lens regions; a repeating pattern may therefore confuse the camera system.
- the resolution (frequency of the sequence) is important and may be matched to the micro lens resolution as well as external optics; too small a pattern may not be resolved and too large a pattern may present plain regions.
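- as an illustration, such a high-contrast pseudorandom background could be generated as follows. This is a minimal sketch, not taken from the patent; the cell count, cell size and file name are placeholder assumptions to be matched to the micro lens resolution and the external optics.

```python
import numpy as np
from PIL import Image

rng = np.random.default_rng(seed=42)  # pseudorandom and reproducible

cells = (64, 64)   # number of pattern cells (assumed; match to optics)
cell_px = 16       # printed size of each cell in pixels (assumed)

# High-contrast binary (black/white) non-repeating pattern.
pattern = rng.integers(0, 2, size=cells, dtype=np.uint8) * 255

# Upscale each cell to its printed size without interpolation,
# so every cell stays a flat black or white square.
sheet = np.kron(pattern, np.ones((cell_px, cell_px), dtype=np.uint8))

Image.fromarray(sheet, mode="L").save("background_pattern.png")
```

- a binary pattern maximizes contrast, and the non-repeating sequence avoids the spurious matches between neighboring micro images noted above.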
- the image capture device 114 may be further configured to derive a three-dimensional contour of an exterior of the subject so as to estimate the three-dimensional shape and/or behaviour of the subject 118(a), 118(b) in the container 108.
- the imaging system 100 may further comprise a plurality of illuminating devices 106, each of the plurality of illuminating devices 106 being disposed at one side of the container 108 and configured to provide illumination to the three-dimensional light-field in the container 108.
- the imaging system described further comprises one or more illuminating devices configured to project textured patterns onto the subject in the container to improve an optical contrast of the subject 118(a), 118(b) in the container 108.
- the plurality of illuminating devices 106 may be fibre-optic spot lights.
- the optical illumination of the subject 118(a), 118(b) is provided by four high-intensity fibre-optic spot lights on each side of the container 108.
- a strong direct light (e.g., microscope-quality fibre-optic light sources) is used.
- the directional nature of the light source allows the subject to be illuminated at a different level than the background, improving contrast in the 2D image that is captured.
- diffuse lighting, as typically used in imaging applications, lights the whole arena evenly, which does not improve contrast.
- a geometry of the container 108 is configured to complement the field of view of the optics of the image capture device 114. Additionally or alternatively, the geometry of the container 108 complements an optical dynamic range of the image capture device 114. It is to be understood that the optical dynamic range refers to the ratio between the largest and the smallest value in terms of optical measurement.
- the geometry of the container 108 includes a pyramid.
- the geometry of the container 108 serves several purposes. The walls of the container 108 constrain the movement of the subject 118(a), 118(b) within the field of view of the image capture device 114.
- the pyramid shape of the container 108 optimally matches the field of view at all depths.
- the floor and lid of the container 108 constrain the subject 118(a), 118(b) within the optimal depth of field and dynamic range of the image capture device 114.
- the depth of field is chosen to provide the optimal compromise of resolution and useable depth of field for the scale of the given subject 118(a), 118(b).
- the walls of the container 108, being parallel and in line with the border of the field of view, present no reflections to the image capture device 114. If a standard cuboid container 108 were used, a reflected subject 118(a) would present an unwanted double image, and reflected light or background objects would cause noise. The relationship between working distance and field-of-view width behind this taper is sketched below.
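- for intuition, the taper of such a pyramidal container follows directly from the camera's angular field of view; the sketch below computes the matching cross-sections (the field-of-view angle and working distances are assumed example values, not taken from the patent).

```python
import math

fov_deg = 30.0                   # assumed full field-of-view angle of the optics
near_mm, far_mm = 200.0, 400.0   # assumed working distances from the lens (mm)

def fov_width(distance_mm: float) -> float:
    """Width of the (square) field of view at a given distance."""
    return 2.0 * distance_mm * math.tan(math.radians(fov_deg) / 2.0)

# Container walls that follow these cross-sections coincide with the border
# of the field of view at every depth, so they present no reflections.
print(f"opening at {near_mm} mm: {fov_width(near_mm):.1f} mm")
print(f"base at    {far_mm} mm: {fov_width(far_mm):.1f} mm")
```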
- the bottom of the container 108 is raised above its resting surface 102 (and the background pattern 120) to spatially separate the subject from the background, improving the signal to background data separation.
- the container 108 may be placed in a position that is spaced-apart from the surface. That is, the container 108 is separated from the surface 102 by a space 122.
- the container 108 comprises a lid 110 configured to complement a shape of the container 108.
- the lid 110 is configured to float on water in the container 108 so as to prevent the subject 118(a), 118(b) from causing disturbance to a surface of the water, which would cause optical interference.
- making the lid 110 float allows it to self-position on the surface.
- the floating lid 110 is vertically constrained by the tapered walls and therefore is always positioned at the same height regardless of water depth. It is to be understood that the lid 110 is constrained by the shape of the container 108.
- overfilling the container 108 with water and placing a lid on the water may be performed to remove optical distortion during the step of capturing the two-dimensional image.
- the imaging system 100 described above further comprises a shield surrounding a base of the container and configured to control the illumination in the container 108.
- a horizontal shield surrounding the base of the container is used to shade the background surface from the majority of the light while allowing sufficient light for the image capture device 114 to key off the background pattern. This creates further optical contrast between the subject 118(a), 118(b) and the background 120.
- the imaging system described above further comprises a calibration tool 116 arranged to optically communicate with the image capture device 114 for calibrating the image capture device 114.
- the calibration tool 116 comprises a patterned surface for calibrating the image capture device 114.
- the patterned surface is patterned similarly to the randomized background 120 of the container.
- the calibration tool 116 is configured to calibrate the image capture device 114 at an optimum dynamic range. If the subject's natural surface texture is plain and un-patterned, the image capture device 114 is more likely to resolve its surface with noisy data, making it more challenging to capture the depth information. Projecting a textured pattern over the arena, and therefore over the subject 118(a), 118(b), can achieve a less noisy signal.
- the calibration tool 116 provides a register of differing surface depths that can be placed in front of the image capture device 114 in order to scale or calibrate the data to real-world measurements.
- each of the different surface depths may be patterned similarly to the background pattern 120, which reduces noise during calibration.
- Figure 2 shows a method for estimating three-dimensional shape and/or behaviour of a subject in a system comprising a container and an image capture device.
- the method comprises: capturing, by the image capture device, a two-dimensional image of the subject in the container, the captured image comprising a plurality of micro-images, each of the plurality of micro-images representing a separate portion of a two-dimensional representation of a three-dimensional light-field and including depth information of the subject in the container.
- the method further comprises estimating the three-dimensional shape and/or behaviour of the subject in the container based on the captured two-dimensional image of the subject in the container.
- the method further comprises overfilling the container with water and placing a lid on the water to remove optical distortion during the capturing step.
- the method comprises projecting textured patterns within the container to improve an optical contrast of the subject in the container during the capturing step.
- Figure 3 illustrates an imaging system for estimating three-dimensional shape and/or behaviour of a subject according to an embodiment.
- the three-dimensional pose estimation and tracking algorithm takes into account underwater lighting, reflections and partial/complete occlusions.
- the imaging system as illustrated in Figure 3 is easy to adopt in any laboratory environment.
- a top-mounted image capture device or a depth camera 302 is used to obtain the depth information about the subject (e.g., fish), and the point cloud is used to learn the position and orientation of the fish.
- the image capture device is a Raytrix R5 light-field camera.
- a parametrized mathematical representation of 3D zebrafish may be formulated and a two-step pose estimation algorithm is proposed followed by the tracking module.
- Figure 3B shows how a two-dimensional image of the subject taken by the imaging system shown in Figure 3A may be estimated in the two-step pose estimation.
- Figure 3C shows how a three-dimensional shape and/or behavior is estimated from the two-dimensional image.
- a top-mounted depth camera as shown in Figure 3A.
- a light-field camera uses a micro lens array placed behind a main lens and just in front of a charge-coupled device sensor focal plane to enable capturing of the entire light-field of the subject in the container.
- the alignment of the micro lens array with respect to the main lens is a critical factor in determining imaging quality and hence the need for calibration.
- Raytrix provides a calibration filter, which comes with a calibration grid.
- the calibration step involves aligning the micro lens images to the calibration grid. This ensures that the f-numbers of the main imaging system and the micro lens imaging system are matched, for optimal imaging quality.
- depth images may then be accurately estimated, for example based on correspondence and defocus cues, as well as on variational methods.
- the virtual depth is an important measure in the field of plenoptics.
- each micro image is formed by a micro lens, and the same scene point appears in several neighbouring micro images.
- the depth of image points can be computed as a ratio, by triangulation: in the standard plenoptic formulation, the virtual depth of a point seen in neighbouring micro images is the baseline between the corresponding micro lens centres divided by the measured disparity between those micro images.
- the depth map supplied by the light-field camera is in the space of virtual depths.
- the non-linear scale of virtual depths requires a non-trivial calibration step to convert it to real-world object distances (metric depths).
- a mathematical model may be used to perform metric depth calibration of the Raytrix R5.
- the object distance o is expressed as a linear combination of the virtual depth v and an auxiliary variable u, which is defined as u = o·v:

  o = c_0 + c_1·v + c_2·u ... (1)

- the coefficients c_0, c_1 and c_2 are real-valued numbers, which are defined to express an unknown quantity (the actual depth of an object) in terms of a measurable quantity (the virtual depth). In an example, these coefficients are used to express the actual depth of a subject in terms of a virtual depth. These coefficients may be learned. That is, these coefficients may be obtained by imaging a controlled setting in which the actual depths of the subject are known and a mathematical model is used to fit such training data. This step may be part of the camera calibration process.
- the coefficients can be determined based on a number of calibration points. For training this model, calibrated step-like chips of known heights were used. Twenty calibration points were drawn from 10 different heights spanning the relevant depth range; v is obtained from the camera, o is physically measured, and u is the product of these values. A simple least squares method is used to estimate the coefficients. Substituting u = o·v into Eq. (1) and solving for o, the object distance can now be expressed as a function of v alone:

  o = (c_0 + c_1·v) / (1 − c_2·v) ... (2)
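- a minimal sketch of this least squares calibration is given below (illustrative Python rather than the original implementation; the numeric values are synthetic placeholders, not measured data).

```python
import numpy as np

# Virtual depths v reported by the camera and physically measured object
# distances o, e.g. drawn from calibrated step-like chips of known heights.
# The values below are synthetic placeholders.
v = np.array([2.0, 2.3, 2.7, 3.0, 3.4, 3.8, 4.1, 4.5, 4.9, 5.2])
o = np.array([40.0, 43.8, 48.5, 51.9, 56.1, 60.2, 63.0, 66.9, 70.5, 73.2])

u = o * v  # auxiliary variable of Eq. (1)

# Least squares fit of o = c0 + c1*v + c2*u.
A = np.column_stack([np.ones_like(v), v, u])
(c0, c1, c2), *_ = np.linalg.lstsq(A, o, rcond=None)

def metric_depth(v_new: float) -> float:
    """Eq. (2): metric object distance as a function of virtual depth."""
    return (c0 + c1 * v_new) / (1.0 - c2 * v_new)

print(metric_depth(3.0))
```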
- the first step of the preprocessing phase isolates the depth point cloud of the fish.
- point clouds belonging to each of the fish are isolated.
- basic image processing techniques including smooth filtering, thresholding and connected component analysis are performed on the top-view depth image.
- a mask, obtained by thresholding the corresponding fully focussed color image, is applied. This helps segment the 2D area of the fish from potentially noisy depth images, and produce viable point clouds.
- a dedicated fish localization module is employed to perform heuristic clustering of the data to isolate each fish into a cluster, based on its position in the previous frames.
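- the preprocessing described above could look roughly as follows (a sketch under assumed conventions, e.g. that the subject is nearer to the camera than the background; the trajectory-based clustering of the localization module is omitted).

```python
import numpy as np
from scipy import ndimage

def isolate_point_clouds(depth, color_mask, depth_thresh):
    """Segment per-fish point clouds from a top-view depth image.

    depth       : 2D array of metric depths
    color_mask  : boolean mask from thresholding the focussed colour image
    depth_thresh: foreground cut-off (subject assumed nearer than background)
    """
    smoothed = ndimage.uniform_filter(depth, size=3)   # smooth filtering
    foreground = (smoothed < depth_thresh) & color_mask
    labels, n = ndimage.label(foreground)              # connected components
    clouds = []
    for i in range(1, n + 1):
        ys, xs = np.nonzero(labels == i)
        clouds.append(np.column_stack([xs, ys, depth[ys, xs]]))
    return clouds  # one (N, 3) array of (x, y, z) points per blob
```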
- the head and tail locations within the point cloud are assigned.
- the anatomy of the zebrafish is taken into consideration herein, which provides prior knowledge that the tail would be more tapered than the head.
- Points at the boundaries of the segmented 2D area are taken, and the spherical neighbourhood in the point cloud of the corresponding 3D points is profiled.
- the density of points in this neighbourhood is used in conjunction with the direction of motion, obtained from the trajectory, to determine the head and tail positions. This will prove to be a necessary high-fidelity step as the potential fields, to be discussed below, are constructed based on the fish orientation.
- the line segment connecting these two positions is considered as the axis of the fish.
- the fish point cloud is transformed to a canonical coordinate system by aligning the fish axis to the x-axis and translating its centroid to the origin.
- the fish is shifted back to its original coordinate system by using the corresponding inverse transformation.
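- this canonical alignment is a rigid transform; a minimal sketch follows (assuming the head and tail positions have already been assigned, and rotating about the vertical axis only).

```python
import numpy as np

def to_canonical(points, head, tail):
    """Align the head-tail axis with x and move the centroid to the origin."""
    centroid = points.mean(axis=0)
    axis = (head - tail) / np.linalg.norm(head - tail)
    theta = np.arctan2(axis[1], axis[0])   # heading in the horizontal plane
    c, s = np.cos(-theta), np.sin(-theta)
    R = np.array([[c, -s, 0.0],
                  [s,  c, 0.0],
                  [0.0, 0.0, 1.0]])
    return (points - centroid) @ R.T, R, centroid

def from_canonical(points, R, centroid):
    """Inverse transform back to the original coordinate system."""
    return points @ R + centroid
```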
- Phase I: Quadratic component
- the point cloud is first projected on to the XY plane in a canonical system.
- a common curve fitting approach is then adopted to learn the body curve's projection on the horizontal plane. Modelling the curve's projection as a second order polynomial, the coefficient matrix b is computed.
- a visual of the process is shown in Figure 4A.
- a depth value for every point on the body curve's projection is computed, which enables visualization and presents the pose in 3D.
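- Phase I thus reduces to an ordinary quadratic least squares fit in the canonical frame; a sketch is shown below (illustrative Python; the original implementation was in MATLAB).

```python
import numpy as np

def fit_body_curve(canonical_points, n_samples=50):
    """Fit the body curve's horizontal projection as y = b2*x^2 + b1*x + b0."""
    x, y = canonical_points[:, 0], canonical_points[:, 1]
    b = np.polyfit(x, y, deg=2)                 # coefficient matrix b
    xs = np.linspace(x.min(), x.max(), n_samples)
    return b, np.column_stack([xs, np.polyval(b, xs)])
```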
- Figure 4A shows how an image captured by the imaging system shown in Figure 3A may be processed in Phase I of the pose estimation algorithm.
- Figures 4B and 4C show how the linear component in Phase II of the pose estimation algorithm may be estimated.
- Figure 4B illustrates that points (as represented by the arrows) indicating the body curve of the fish may be projected onto a constructed plane.
- the image plane as shown in Figure 4B represents the horizontal projection of the body curve of the fish.
- Figure 4C shows how a linear relationship between the points that indicate the body curve may be derived at the intersection of the constructed plane and the image plane. That is, a linear relationship is determined based on the depth values and the horizontal projection of the body curve so as to estimate the 3D shape and/or behaviour of the fish.
- Algorithm 1: Tracking at time t + 1:
- the potential field is modelled as a three-dimensional angular Gaussian.
- the principal axis of the Gaussian is along the tangent to the body curve, evaluated at the head.
- the potential field at time instant t for a range of values of θ and ρ is defined as φ_{t,s}; typically θ ∈ [−s/2, s/2] and ρ is a small positive quantity.
- the Gaussian field's mean μ is at the centroid of the fish body, and the field has a 3×3 covariance Σ.
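- such an oriented Gaussian field could be constructed as follows (a sketch; the standard deviations are assumed values, and the construction assumes the head tangent is not vertical).

```python
import numpy as np
from scipy.stats import multivariate_normal

def potential_field(centroid, head_tangent, sigmas=(8.0, 3.0, 3.0)):
    """3D Gaussian with its principal axis along the head tangent.

    sigmas: std devs along the (principal, lateral, vertical) axes (assumed).
    """
    t = head_tangent / np.linalg.norm(head_tangent)
    up = np.array([0.0, 0.0, 1.0])              # assumes t is not vertical
    side = np.cross(up, t)
    side /= np.linalg.norm(side)
    vert = np.cross(t, side)
    R = np.column_stack([t, side, vert])        # rotation into the fish frame
    cov = R @ np.diag(np.square(sigmas)) @ R.T  # 3x3 covariance Sigma
    return multivariate_normal(mean=centroid, cov=cov)

# field = potential_field(np.array([10.0, 5.0, 20.0]), np.array([1.0, 0.2, 0.0]))
# field.pdf(p) then gives the potential at any 3D location p.
```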
- detecting potential occlusions in subsequent frames is an important step. When two fish are nearing each other, their trajectories can potentially cross. Resolving each of these fishes' identities after crossovers involving touching and partial/complete occlusions is generally a challenging problem. The difficulty of identity resolution in such cases increases with the number of entities involved.
- a module is implemented to detect possible occlusions by checking the extent of pairwise overlap of the respective potential fields, governed by a threshold.
- Figure 6, including Figures 6A and 6B, shows a scenario where either of the fish may be occluded in immediately subsequent frames, according to an embodiment.
- the tracking algorithm waits for a specified amount of time w_t to see if identity recovery is possible. Tracking is reset with randomized identities for all the fish if recovery is not possible. This results in one or more fully tracked segments, where consistency of identities is ensured at an intra-segment level while there is no mapping at an inter-segment level. For the identities to be globally consistent, such a mapping must be established. For this purpose, a procedure is implemented wherein discriminative appearance features are calculated for each individual and a nearest-neighbour search is employed to calculate feature similarity and assign the identities; a sketch of this mapping is given below.
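- the inter-segment mapping could be realized with a nearest-neighbour search over appearance features, as in the following sketch (the choice of feature vector is not specified here and is left abstract).

```python
import numpy as np

def map_identities(ref_features, seg_features):
    """Map each individual in a new segment to its nearest reference identity.

    ref_features: (n, d) appearance features of the known identities
    seg_features: (n, d) features computed within the new tracked segment
    """
    mapping = {}
    for i, f in enumerate(seg_features):
        dist = np.linalg.norm(ref_features - f, axis=1)  # feature similarity
        mapping[i] = int(np.argmin(dist))                # nearest neighbour
    return mapping
```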
- the complete data acquisition protocol is detailed below.
- the spatial resolution of the recorded depth images is 1024×1024 pixels.
- MATLAB R2013a may be used for the implementation of the pose estimation and tracking algorithm.
- Video recordings and experiments may be done using a laboratory desktop with standard CPU processing and memory configurations.
- the unit may be equipped with NVIDIA GeForce GTX TITAN graphics processing unit (GPU) with CUDA support and a CameraLink hardware interface for the Raytrix camera.
- the multi-threaded video capturing software may be written in C++ using the Raytrix Lightfield API.
- a Basler color camera (acA1600-60gc) may be synchronized with the Raytrix for a parallel side-view capture, purely for verification purposes.
- the frame rate of a synchronized data capture setup may be 25 FPS (frames per second). The option of dropping the verification view is allowed, which enables the top-view camera to capture at 50 FPS.
- Figures 7 and 8 illustrate experimental results. From top to bottom, the rows show the top-view image, our top-view estimation (color-coded by fish ID), the side-view estimation, the corresponding side-view validation image, and the trajectory until the given time instant, respectively. The 3D locations and poses of multiple adult zebrafish were predicted with a fair degree of accuracy. The number of crossings/occlusions was much larger in the six-fish scenario. The proposed tracking algorithm effectively resolves the identities of fish in most of these scenarios. Both partial and complete occlusions are dealt with through the efficacy of our potential fields.
- the tracking is put on hold if the individual fish poses cannot be determined with a high degree of confidence, and the identities are recovered in subsequent frames where the fish separate out.
- Temporal information in the form of the potential field is used to assign identities back. Cases of reflections from the sides of the tank were dealt with successfully.
- Figure 7 illustrates the effect of Schreckstoff on the subject.
- Figure 7A shows the effect of SS in relation to swimming depths of the fish.
- Figure 7B shows the effect of SS in relation to speeds of the fish.
- Figure 7C shows the effect of SS in relation to the body pitches of the fish.
- Figure 7D shows the effect of SS in relation to angle of turns of the fish.
- Figure 7E and 7F show the effect of SS in relation to the occupancy of the tank before exposing to the SS and after exposing to the SS, respectively.
- Table 1 Empirically determined Raytrix parameters for data recording.
- Figure 9 shows pose predictions spanning the depth range of the container according to an embodiment. Each column corresponds to the scenario mentioned at the top, where the color-coded positions of two fish are shown. Row-wise from top to bottom: top-view color image, top-view prediction overlaid on the input depth image, side-view prediction, and orthogonal view for validation of the prediction.
- Figures 10A-10D illustrate views of different parts of the algorithmic analysis.
- Figure 10A shows the depth image viewed from above by the plenoptic camera overlaid with the pixels fulfilling a threshold and spinal model.
- Figure 10B illustrates the side view by a second camera which is used for manual verification of the model. Fish are represented by 1002 and 1004.
- Figure 10C shows the spinal model of the three fish (blue, red, cyan bold lines) and the historical path of each fish in three dimensions.
- Figure 10D shows a vertical plane (the same side view as the second camera angle) of the model, showing depth data pixels fulfilling a threshold value and the spinal models of each fish.
- Figure 11 shows a comparison of images of plain features against patterned features.
- Figures 11A-11C show images taken using a plenoptic camera: each includes a colour image recorded by the plenoptic camera and the corresponding 3D model calculated by the camera software.
- Figure 11A shows a white, channelled object placed on a white background. The 3D model is noisy and bears no resemblance to the object.
- Figure 11B shows the same object on a patterned background. The 3D model now shows a flat background and the object is clearly visible, though the channel is not rendered.
- Figure 11C shows the same object covered in patterned paper. The 3D model now clearly shows the channel feature of the object.
- Figure 12 shows an example of an image of a pattern sheet and its corresponding three-dimensional model.
- Figure 12A shows a two-dimensional image of an image subject taken by a plenoptic camera, and
- Figure 12B shows the corresponding 3D model calculated by the camera software.
- the image subject is a flat test sheet with squares of different patterns.
- the model in Figure 12B shows that some patterns cause vertical distortion in the model, whereas others give a flat response.
- Figure 13 shows an example of an image taken of a mouse.
- Figure 13A shows a colour image of a mouse restrained within a pyramid container having a patterned background.
- Figure 13B shows the corresponding depth information obtained for the image shown in Figure 13A.
- Figure 13C shows the corresponding heat map of the image shown in Figure 13B.
- Figure 13 shows that the embodiment described above is applicable to larger animals such as mice.
- Figure 14 shows an example of an image taken of a mouse in an environment according to an embodiment.
- the environment is one that is outside the container.
- Figure 14A shows an image of a mouse that is taken in an open field arena with patterned background.
- Figure 14B shows the corresponding heat map of the image shown in Figure 14A.
- Figure 14 shows that the embodiment described above is applicable to larger animals such as mice.
- Figure 15 shows an example of an image of a mouse taken under patterned illumination according to an embodiment.
- Figure 15A shows an image of a mouse that is taken under patterned illumination.
- Figure 15B shows the corresponding heat map of the image shown in Figure 15A.
- Figure 15 shows that a better image is obtained when a pattern is projected onto the mouse, creating a better optical texture.
- Figure 16 depicts an exemplary computer / computing device 1600, hereinafter interchangeably referred to as a computer system 1600, where one or more such computing devices 1600 may be used to facilitate execution of the above-described method for estimating three-dimensional shape and/or behaviour of a subject.
- one or more components of the computer system 1600 may be used to realize the computer 1602.
- the following description of the computing device 1600 is provided by way of example only and is not intended to be limiting.
- the example computing device 1600 includes a processor 1604 for executing software routines. Although a single processor is shown for the sake of clarity, the computing device 1600 may also include a multi-processor system.
- the processor 1604 is connected to a communication infrastructure 1606 for communication with other components of the computing device 1600.
- the communication infrastructure 1606 may include, for example, a communications bus, cross-bar, or network.
- the computing device 1600 further includes a main memory 1608, such as a random access memory (RAM), and a secondary memory 1610.
- the secondary memory 1610 may include, for example, a storage drive 1612, which may be a hard disk drive, a solid state drive or a hybrid drive and/or a removable storage drive 1614, which may include a magnetic tape drive, an optical disk drive, a solid state storage drive (such as a USB flash drive, a flash memory device, a solid state drive or a memory card), or the like.
- the removable storage drive 1614 reads from and/or writes to a removable storage medium 1644 in a well-known manner.
- the removable storage medium 1644 may include magnetic tape, optical disk, non-volatile memory storage medium, or the like, which is read by and written to by removable storage drive 1614.
- the removable storage medium 1644 includes a computer readable storage medium having stored therein computer executable program code instructions and/or data.
- the secondary memory 1610 may additionally or alternatively include other similar means for allowing computer programs or other instructions to be loaded into the computing device 1600.
- Such means can include, for example, a removable storage unit 1622 and an interface 1640.
- a removable storage unit 1622 and interface 1640 include a program cartridge and cartridge interface (such as that found in video game console devices), a removable memory chip (such as an EPROM or PROM) and associated socket, a removable solid state storage drive (such as a USB flash drive, a flash memory device, a solid state drive or a memory card), and other removable storage units 1622 and interfaces 1640 which allow software and data to be transferred from the removable storage unit 1622 to the computer system 1600.
- the computing device 1600 also includes at least one communication interface 1624.
- the communication interface 1624 allows software and data to be transferred between computing device 1600 and external devices via a communication path 1626.
- the communication interface 1624 permits data to be transferred between the computing device 1600 and a data communication network, such as a public data or private data communication network.
- the communication interface 1624 may be used to exchange data between different computing devices 1600 where such computing devices 1600 form part of an interconnected computer network. Examples of a communication interface 1624 can include a modem, a network interface (such as an Ethernet card), a communication port, and the like.
- the communication interface 1624 may be wired or may be wireless.
- Software and data transferred via the communication interface 1624 are in the form of signals which can be electronic, electromagnetic, optical or other signals capable of being received by communication interface 1624. These signals are provided to the communication interface via the communication path 1626.
- the computing device 1600 further includes a display interface 1602 which performs operations for rendering images to an associated display 1630 and an audio interface 1632 for performing operations for playing audio content via associated speaker(s) 1634.
- computer program product may refer, in part, to removable storage medium 1644, removable storage unit 1622, a hard disk installed in storage drive 1612, or a carrier wave carrying software over communication path 1626 (wireless link or cable) to communication interface 1624.
- Computer readable storage media refers to any non-transitory, non-volatile tangible storage medium that provides recorded instructions and/or data to the computing device 1600 for execution and/or processing.
- Examples of such storage media include magnetic tape, CD-ROM, DVD, Blu-rayTM Disc, a hard disk drive, a ROM or integrated circuit, a solid state storage drive (such as a USB flash drive, a flash memory device, a solid state drive or a memory card), a hybrid drive, a magneto-optical disk, or a computer readable card such as a SD card and the like, whether or not such devices are internal or external of the computing device 1600.
- Examples of transitory or non-tangible computer readable transmission media that may also participate in the provision of software, application programs, instructions and/or data to the computing device 1600 include radio or infra-red transmission channels as well as a network connection to another computer or networked device, and the Internet or Intranets including e-mail transmissions and information recorded on Websites and the like.
- the computer programs are stored in main memory 1608 and/or secondary memory 1610. Computer programs can also be received via the communication interface 1624. Such computer programs, when executed, enable the computing device 1600 to perform one or more features of embodiments discussed herein. In various embodiments, the computer programs, when executed, enable the processor 1604 to perform features of the above-described embodiments. Accordingly, such computer programs represent controllers of the computer system 1600.
- Software may be stored in a computer program product and loaded into the computing device 1600 using the removable storage drive 1614, the storage drive 1612, or the interface 1640.
- the computer program product may be downloaded to the computer system 1600 over the communications path 1626.
- the software when executed by the processor 1604, causes the computing device 1600 to perform functions of embodiments described herein.
- it is to be understood that the embodiment of Figure 16 is presented merely by way of example. Therefore, in some embodiments one or more features of the computing device 1600 may be omitted. Also, in some embodiments, one or more features of the computing device 1600 may be combined together. Additionally, in some embodiments, one or more features of the computing device 1600 may be split into one or more component parts.
- a systematic approach is proposed for automated 3D pose estimation and tracking of multiple subjects from a single camera.
- the system that is described applies to a plurality of un-marked subjects.
- the embodiments above are able to continuously acquire and analyze video footage for long durations.
- light-field cameras can be used in tracking small animals like zebrafish.
- the image capturing devices (e.g., cameras) can also be applied for monitoring a plurality of subjects (e.g., zebrafishes) together in a container.
- light-field cameras that are used in some embodiments can acquire 3D imaging inputs, due to their relatively high resolution and frame rate compared to other available 3D sensing options. That is, light-field cameras capture incoming light directions in addition to 2D images of the scene, obtained by micro-lenses with a narrow baseline of sub-pixel shifts.
- Recently light-field cameras have been shown as a promising paradigm for depth estimation and surface inspection in industrial and medical applications.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Multimedia (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Psychiatry (AREA)
- Social Psychology (AREA)
- Human Computer Interaction (AREA)
- Length Measuring Devices By Optical Means (AREA)
Abstract
Disclosed is an imaging system for estimating three-dimensional shape and/or behaviour of a subject, comprising a container to receive the subject; and an image capture device optically coupled to the container and configured to capture a two-dimensional image of the subject, the captured image comprising a plurality of micro-images, each micro-image representing a separate portion of a two-dimensional representation of a three-dimensional light-field in the container and including depth information of the subject in the container, wherein the three-dimensional shape and/or behaviour of the subject is estimated based on the captured two-dimensional image of the subject. The image capture device is preferably a plenoptic depth camera (also known as a light-field camera), and may be used to track the movement of a living organism in a tank, such as a zebrafish. The geometry of the container is preferably configured to complement the field of view of the optics of the image capture device, e.g. in the shape of a pyramid.
Description
An Imaging System and Method for Estimating Three-Dimensional Shape and/or Behaviour of a Subject
FIELD OF INVENTION
[001] The present invention relates broadly, but not exclusively, to methods and systems for estimating three-dimensional (or 3D) shape and/or behaviour of a subject.
BACKGROUND
[002] Pose estimation and tracking are in general well studied areas for various subjects. A pose can be regarded as any parameterized form of the subject's three-dimensional appearance (or shape) and/or behaviour. This can include simple shapes, more sophisticated geometric models or articulated skeletons. For years, the zebrafish, Danio rerio, has been a widely used model organism in studies of genetics, developmental biology, biomechanics as well as neurosciences. The architecture of the central nervous system and neural circuitry of zebrafish is very similar to other vertebrates including mammals, with a high degree of conservation with respect to the principal neurotransmitters. Additionally, zebrafish are easier to house and care for than other animals and have a much larger number of offspring in each generation. Capturing the behaviour and detailed locomotory kinematics of model organisms is a key component in a wide range of research disciplines. Depending on the research area, the desired phenotypic observations can vary from social behaviour in cohorts to speed, type of swimming movements and response to non-verbal cues. Given this context, there has been a natural increase in the use of video processing technology to automate the analysis of zebrafish kinematic behaviour. Video processing allows for automation, increased accuracy and high research throughput. Furthermore, it allows definition of new measures based on features that would not have been possible to detect or analyse by manual methods.
[003] Some of the conventional techniques treat the subject (e.g. fish) as only a single point, failing to sufficiently describe posture and body movements. For example, one technique uses a semi- automated method wherein the user is required to mark the snout and tail locations of the
fish for the first time. A collection of B-splines is used to geometrically model the fish. However, this method assumes that the fish maintains a constant length and width profile and only works for zebrafish in shallow water, where motion is largely planar. A fast and accurate algorithm regarding the fish pose as a three-part model was proposed conventionally. However, this method is still restricted to two dimensions (2D). One conventional technique introduces a 3D reconstruction of a detailed geometric model, though using 2D silhouettes from multiple cameras. Additionally, methods developed for larval zebrafish are either not readily adaptable or extensible to adult zebrafish.
[004] Tracking systems for marked animals can work for extended durations and large groups, but marking is usually invasive and might modify natural behaviour. The challenge of tracking unmarked animals lies in resolving identities after animals touch, cross or occlude each other. Current multi-tracking systems attempt to deal with this problem by calculating most likely assignments that take into account the movement of animals before and after overlap. There are also conventional techniques that employ image processing techniques to separate objects, when overlaps are small. Shape models to resolve more complex crossings are presented in existing conventional techniques. Multiple cameras or multiple views are used in several works to enable tracking in 3D. Due to increased occlusions inherent in multi-view approaches, they are more subject to the risk of assigning incorrect identities after a crossing, and as a result, assignments of the following frames are significantly affected. One approach that has been used is identity tracking, which presents a tracking system that seems to be capable of solving most of the crossing scenarios. The main drawback of this method is that the system is restricted to a two-dimensional view. Also, since it is a pure tracking method, the fish is just treated as a mass without a parametrized pose. Commercial tools like ZebraLab3D, VideoMot2 and DanioVision offer video tracking solutions. While DanioVision is dedicated to zebrafish larvae and other small organisms, VideoMot2 offers 2D tracking and ZebraLab3D utilizes two cameras placed at the top and on the side of the tank.
[005] A need therefore exists to provide methods for estimating three-dimensional shape and/or behaviour of a subject that addresses one or more of the above problems.
[006] Furthermore, other desirable features and characteristics will become apparent from the subsequent detailed description and the appended claims, taken in conjunction with the accompanying drawings and this background of the disclosure.
SUMMARY
[007] According to a first aspect, an imaging system for estimating three-dimensional shape and/or behaviour of a subject is provided, comprising:
a container configured to receive the subject; and
an image capture device optically coupled to the container and configured to capture a two-dimensional image of the subject in the container, the captured image comprising a plurality of micro-images, each of the plurality of micro-images representing a separate portion of a two-dimensional representation of a three-dimensional light-field in the container and including depth information of the subject in the container,
wherein the three-dimensional shape and/or behaviour of the subject in the container is estimated based on the captured two-dimensional image of the subject in the container.
[008] In an embodiment, the container has a patterned background, wherein the image capture device is configured to capture the image of the subject in the container against the patterned background of the container.
[009] In an embodiment, the patterned background is a randomized background comprising a high contrast two-dimensional random or pseudorandom sequence.
[0010] In an embodiment, a resolution of the randomized background of the container matches a field of view of optics of the image capture device.
[0011] In an embodiment, the image capture device is further configured to derive a three-dimensional contour of an exterior of the subject so as to estimate the three-dimensional shape and/or behaviour of the subject in the container.
[0012] In an embodiment, the imaging system described above further comprises a plurality of illuminating devices, each of the plurality of illuminating devices being disposed at one side of the container and configured to provide illumination to the three-dimensional light-field in the container.
[0013] In an embodiment, the plurality of illuminating devices are fibre-optic spot lights.
[0014] In an embodiment, a geometry of the container is configured to complement the field of view of the optics of the image capture device.
[0015] In an embodiment, the geometry of the container includes a pyramid.
[0016] In an embodiment, the container comprises a lid configured to complement a shape of the container.
[0017] In an embodiment, the lid is configured to float on water in the container so as to prevent the subject from causing disturbance to a surface of the water.
[0018] In an embodiment, the imaging system described above further comprises a shield surrounding a base of the container and configured to control the illumination in the container.
[0019] In an embodiment, the imaging system described above further comprises a calibration tool arranged to optically communicate with the image capture device for calibrating the image capture device.
[0020] In an embodiment, the calibration tool comprises a patterned surface for calibrating the image capture device.
[0021] In an embodiment, the patterned surface is patterned similarly to the randomized background of the container.
[0022] In an embodiment, the image capture device is a plenoptic depth camera.
[0023] In an embodiment, the imaging system described above further comprises one or more illuminating devices configured to project textured patterns onto the subject in the container to improve an optical contrast of the subject in the container.
[0024] In one aspect, a method for estimating three-dimensional shape and/or behaviour of a subject in a system is provided, the system comprising a container and an image capture device, the method comprising: capturing, by the image capture device, a two-dimensional image of the subject in the container, the captured image comprising a plurality of micro-images, each of the plurality of micro-images representing a separate portion of a two-dimensional representation of a three-dimensional light-field and including depth information of the subject in the container; and
estimating the three-dimensional shape and/or behaviour of the subject in the container based on the captured two-dimensional image of the subject in the container.
[0025] In an embodiment, the method further comprises overfilling the container with water and placing a lid on the water to remove optical distortion during the capturing step.
[0026] In an embodiment, the method further comprises projecting textured patterns within the container to improve an optical contrast of the subject in the container during the capturing step.
[0027] In an embodiment, the method further comprises retrofitting a pyramidal-shaped partition into the container.
[0028] Unless context dictates otherwise, the following terms will be given the meaning provided here:
[0029] The term "depth" refers to a perpendicular measurement of a subject with respect to an image capture device.
[0030] The term "image capture device" refers to a device that performs an imaging function. Imaging functions comprise, among other things, capturing, processing and sending an image. Examples of an image capture device comprise, among other things, a plenoptic depth camera.
[0031] The term "three-dimensional light-field" refers to a three-dimensional space of light capturable by the image capture device.
[0032] The term "resolution of a randomized background" refers to a frequency of a two-dimensional random sequence on a background.
BRIEF DESCRIPTION OF THE DRAWINGS
[0033] Embodiments of the invention will be better understood and readily apparent to one of ordinary skill in the art from the following written description, by way of example only, and in conjunction with the drawings, in which:
[0034] Figure 1 shows an imaging system according to one embodiment.
[0035] Figure 2 shows a method for estimating three-dimensional shape and/or behaviour of a subject in a system according to one embodiment.
[0036] Figure 3, including Figures 3A-3C, illustrates an embodiment for estimating three-dimensional shape and/or behaviour of a subject in a system.
[0037] Figure 4, including Figures 4A-4C, illustrates how the three-dimensional shape and/or behaviour of a subject may be estimated in a system.
[0038] Figure 5, including Figures 5A-5B, shows an example of a potential field constructed for a subject.
[0039] Figure 6, including Figures 6A-6B, shows an example where either of two fish is occluded in downstream frames according to an embodiment.
[0040] Figure 7, including Figures 7A-7F, illustrates the effect of Schreckstoff on the subject.
[0041] Figure 8 shows a graphical representation of normalization of inter-fish distance against time.
[0042] Figure 9 shows pose predictions spanning the depth range of the container according to an embodiment.
[0043] Figure 10, including Figures 10A-10D, illustrates views of different parts of the algorithmic analysis.
[0044] Figure 11, including Figures 11A-11C, shows a comparison of images of plain features against patterned features.
[0045] Figure 12, including Figures 12A-12B, shows an example of an image of a pattern sheet and its corresponding three-dimensional model, respectively.
[0046] Figure 13, including Figures 13A-13C, shows an example of an image taken of a mouse.
[0047] Figure 14, including Figures 14A-14B, shows an example of an image taken of a mouse in an environment according to an embodiment.
[0048] Figure 15, including Figures 15A-15C, shows an example of an image of a mouse taken under patterned illumination according to an embodiment.
[0049] Figure 16 shows a schematic diagram of a computer system suitable for use in executing the method depicted in Figure 2.
DETAILED DESCRIPTION
[0050] Embodiments of the present invention will be described, by way of example only, with reference to the drawings. Like reference numerals and characters in the drawings refer to like elements or equivalents.
[0051] Some portions of the description which follows are explicitly or implicitly presented in terms of algorithms and functional or symbolic representations of operations on data within a computer memory. These algorithmic descriptions and functional or symbolic representations are the means used by those skilled in the data processing arts to convey most effectively the substance of their work to others skilled in the art. An algorithm is here, and generally, conceived to be a self-consistent sequence of steps leading to a desired result. The steps are those requiring physical manipulations of physical quantities, such as electrical, magnetic or optical signals capable of being stored, transferred, combined, compared, and otherwise manipulated.
[0052] Unless specifically stated otherwise, and as apparent from the following, it will be appreciated that throughout the present specification, discussions utilizing terms such as "capturing", "estimating", "overfilling", "projecting", "retrofitting", "scanning", "calculating", "determining", "replacing", "generating", "initializing", "outputting", or the like, refer to the action and processes of a computer system, or similar electronic device, that manipulates and transforms data represented as physical quantities within the computer system into other data similarly represented as physical quantities within the computer system or other information storage, transmission or display devices.
[0053] The present specification also discloses apparatus for performing the operations of the methods. Such apparatus may be specially constructed for the required purposes, or may comprise a computer or other device selectively activated or reconfigured by a computer program stored in the computer. The algorithms and displays presented herein are not inherently related to any particular computer or other apparatus. Various machines may be used with programs in accordance with the teachings herein. Alternatively, the construction of more specialized apparatus to perform the required method steps may be appropriate. The structure of a computer will appear from the description below.
[0054] In addition, the present specification also implicitly discloses a computer program, in that it would be apparent to the person skilled in the art that the individual steps of the method described herein may be put into effect by computer code. The computer program is not intended to be limited to any particular programming language and implementation thereof. It will be appreciated that a variety of programming languages and coding thereof may be used to implement the teachings of the disclosure contained herein. Moreover, the computer program is not intended to be limited to any particular control flow. There are many other variants of the computer program, which can use different control flows without departing from the spirit or scope of the invention.
[0055] Furthermore, one or more of the steps of the computer program may be performed in parallel rather than sequentially. Such a computer program may be stored on any computer readable medium. The computer readable medium may include storage devices such as magnetic or optical disks, memory chips, or other storage devices suitable for interfacing with a computer. The computer readable medium may also include a hard-wired medium such as exemplified in the Internet system, or wireless medium such as exemplified in the GSM mobile telephone system. The computer program when loaded and executed on such a computer effectively results in an apparatus that implements the steps of the preferred method.
[0056] Figure 1 shows an imaging system according to one embodiment. As shown, there is an imaging system 100 for estimating three-dimensional shape and/or behaviour of a subject 118(a), 118(b). The imaging system 100 comprises a container 108 configured to receive the subject 118(a), 118(b); and an image capture device 114 optically coupled to the container and configured to capture a two-dimensional image of the subject 118(a), 118(b) in the container 108. The captured image comprises a plurality of micro-images, each of which represents a separate portion of a two-dimensional representation of a three-dimensional light-field in the container 108 and includes depth information of the subject 118(a), 118(b) in the container 108. The three-dimensional shape and/or behaviour of the subject 118(a), 118(b) in the container 108 is estimated based on the captured two-dimensional image of the subject 118(a), 118(b) in the container 108.
[0057] In a preferred embodiment, the container 108 has a patterned background 120, wherein the image capture device 114 is configured to capture the image of the subject 118(a), 118(b) in the container against the patterned background 120 of the container 108. The subject may be a fish (e.g., zebrafish) or a mammal (e.g., a rat).
[0058] The patterned background 120 is a randomized background comprising a high contrast two-dimensional random or pseudorandom sequence. In an embodiment, the patterned background 120 comprises a two-dimensional non-repeating pattern of digital squares (e.g., an arbitrary set of lattice points) at a specific resolution. Additionally or alternatively, a resolution of the randomized background of the container 108 matches a field of view of optics of the image capture device 114.
[0059] In an embodiment, the image capture device 114 is a plenoptic depth camera. A person skilled in the art will understand that the plenoptic camera works by analyzing features in its field of view to estimate depth. If part of the image contains insufficient contrast, then the depth estimation is dominated by noise. The addition of a patterned surface behind the subject reduces background noise for an increased signal-to-noise ratio. A randomized sequence is chosen because the camera system operates by comparing matching images on neighboring lens regions; a repeating pattern may therefore confuse the camera system. The resolution (frequency of the sequence) is important and may be matched to the micro-lens resolution as well as the external optics; too small a pattern may not be resolved and too large a pattern may present plain regions.
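By way of illustration, such a background can be generated as a grid of binary pseudorandom squares. The following is a minimal sketch under assumed parameters: the image dimensions and the cell size (the stand-in for a resolution matched to the micro-lens array and external optics) are illustrative values, not figures from this disclosure.

```python
import numpy as np

# Minimal sketch: high-contrast binary pseudorandom background pattern.
# `cell_px` stands in for the square size matched to the camera optics.
def make_background(width_px, height_px, cell_px, seed=0):
    rng = np.random.default_rng(seed)
    cells = rng.integers(0, 2, size=(height_px // cell_px, width_px // cell_px))
    # Upscale each random cell to a cell_px x cell_px square; 0 -> black, 1 -> white.
    big = np.kron(cells, np.ones((cell_px, cell_px), dtype=np.uint8)) * 255
    return big.astype(np.uint8)

pattern = make_background(1024, 1024, cell_px=16)
```

Because the sequence does not repeat at the scale of the matching windows, neighbouring micro-lens views remain unambiguous, which is the property the depth estimation relies on.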
[0060] The image capture device 114 may be further configured to derive a three-dimensional contour of an exterior of the subject so as to estimate the three-dimensional shape and/or behaviour of the subject 118(a), 118(b) in the container 108.
[0061] The imaging system 100 may further comprise a plurality of illuminating devices 106, each of the plurality of illuminating devices 106 being disposed at one side of the container 108 and configured to provide illumination to the three-dimensional light-field in the container 108. In an embodiment, the imaging system described further comprises one or more illuminating devices configured to project textured patterns onto the subject in the container to improve an optical contrast of the subject 118(a), 118(b) in the container 108. The plurality of illuminating devices 106 may be fibre-optic spot lights. In one embodiment, the optical illumination of the subject 118(a), 118(b) is provided by four high intensity fibre-optic spot lights on each side of the container 108. Many lighting configurations can provide an adequate signal-to-noise ratio. A strong direct light (e.g., microscope-quality fibre-optic light sources) illuminates the internal organs of the subject, e.g., fish 118(a), to present a non-uniform view to the image capture device 114, thereby decreasing noise. Advantageously, the directional nature of the light source allows the subject to be illuminated at a different level than the background, improving contrast in the 2D image that is captured. Conventionally used diffuse lighting, as is typical in imaging applications, lights the whole arena evenly, which does not improve contrast.
A geometry of the container 108 is configured to complement the field of view of the optics of the image capture device 114. Additionally or alternatively, the geometry of the container 108 complements an optical dynamic range of the image capture device 114. It is to be understood that the optical dynamic range refers to the ratio between the largest and the smallest value in terms of optical measurement. For example, the geometry of the container 108 includes a pyramid. The geometry of the container 108 serves several purposes. The walls of the container 108 constrain the movement of the subject 118(a), 118(b) within the field of view of the image capture device 114. Advantageously, the pyramid shape of the container 108 optimally matches the field of view at all depths.
[0063] The floor and lid of the container 108 constrain the subject 118(a), 118(b) within the optimal depth of field and dynamic range of the image capture device 114. In an embodiment,
the depth of field is chosen to provide the optimal compromise of resolution and useable depth of field for the scale of the given subject 118(a), 118(b).
[0064] The walls of the container 108, being parallel and in line with the border of the field of view, present no reflections to the image capture device 114. If a standard cuboid container 108 were used, a reflected subject 118(a) would present an unwanted double image, and reflected light or background objects would cause noise. The bottom of the container 108 is raised above its resting surface 102 (and the background pattern 120) to spatially separate the subject from the background, improving the signal-to-background data separation. As shown in Figure 1, the container 108 may be placed in a position that is spaced apart from the surface. That is, the container 108 is separated from the surface 102 by a space 122.
[0065] In an embodiment, the container 108 comprises a lid 110 configured to complement a shape of the container 108. In an embodiment, the lid 110 is configured to float on water in the container 108 so as to prevent the subject 118(a), 118(b) from causing disturbance to a surface of the water, which would cause optical interference. Advantageously, a floating lid 110 self-positions on the water surface. When used with a container 108 having a pyramid geometry, the floating lid 110 is vertically constrained by the tapered walls and therefore is always positioned at the same height regardless of water depth. It is to be understood that the lid 110 is constrained by the shape of the container 108. Alternatively or additionally, overfilling the container 108 with water and placing a lid on the water may be performed to remove optical distortion during the step of capturing the two-dimensional image.
[0066] In an embodiment, the imaging system 100 described above further comprises a shield surrounding a base of the container and configured to control the illumination in the container 108. In an embodiment, a horizontal shield surrounding the base of the container is used to shade the background surface from the majority of the light while allowing sufficient light for the image capture device 114 to key off the background pattern. This creates further optical contrast between the subject 118(a), 118(b) and the background 120.
[0067] In an embodiment, the imaging system described above further comprises a calibration tool 116 arranged to optically communicate with the image capture device 114 for calibrating the image capture device 114. In an embodiment, the calibration tool 116 comprises a patterned surface for calibrating the image capture device 114. In an embodiment, the patterned surface is patterned similarly to the randomized background 120 of the container. Advantageously, the calibration tool 116 is configured to calibrate the image capture device 114 at an optimum dynamic range. If the subject's natural surface texture is plain and un-patterned, the image capture device 114 is more likely to resolve its surface with noisy data, making it more challenging to capture the depth information. Projecting a textured pattern over the arena, and therefore over the subject 118(a), 118(b), can achieve a less noisy signal. This can be an alternative or a complement to the above lighting. In an embodiment, the calibration tool 116 provides a register of differing surface depths that can be placed in front of the image capture device 114 in order to scale or calibrate the data to real-world measurements. Each of the differing surface depths may be patterned similarly to the background pattern 120, which reduces noise during calibration.
[0068] Figure 2 shows a method for estimating three-dimensional shape and/or behaviour of a subject in a system comprising a container and an image capture device. In step 202, the method comprises: capturing, by the image capture device, a two-dimensional image of the subject in the container, the captured image comprising a plurality of micro-images, each of the plurality of micro-images representing a separate portion of a two-dimensional representation of a three-dimensional light-field and including depth information of the subject in the container. In step 204, the method further comprises estimating the three-dimensional shape and/or behaviour of the subject in the container based on the captured two-dimensional image of the subject in the container.
[0069] In an embodiment, the method further comprises overfilling the container with water and placing a lid on the water to remove optical distortion during the capturing step.
[0070] Additionally or alternatively, the method comprises projecting textured patterns within the container to improve an optical contrast of the subject in the container during the capturing step.
[0071 ] In an embodiment, the method further comprises retrofitting a pyramidal-shaped partition into the container.
[0072] Figure 3 illustrates an imaging system for estimating three-dimensional shape and/or behaviour of a subject in a system in an embodiment. The three-dimensional pose estimation and tracking algorithm takes into account underwater lighting, reflections and partial/complete occlusions. The imaging system as illustrated in Figure 3 is easy to adopt in any laboratory environment. As shown in Figure 3A, a top-mounted image capture device or depth camera 302 is used to obtain the depth information about the subject (e.g., fish), and the point cloud is used to learn the position and orientation of the fish. An example of the image capture device is a Raytrix R5 light-field camera. The larger field of view of the camera in the reconstructed depth image offers a significant advantage in this embodiment. For example, a parametrized mathematical representation of a 3D zebrafish may be formulated and a two-step pose estimation algorithm is proposed, followed by the tracking module. Figure 3B shows how a two-dimensional image of the subject taken by the imaging system shown in Figure 3A may be estimated in the two-step pose estimation. Figure 3C shows how a three-dimensional shape and/or behaviour is estimated from the two-dimensional image.
[0073] Advantageously, only one source of input is required to estimate the three-dimensional behaviour, for example, a top-mounted depth camera as shown in Figure 3A. In an example, a Raytrix R5 (f = 25 mm) light-field camera is used for this purpose. A light-field camera uses a micro-lens array placed behind a main lens and just in front of a charge-coupled device sensor focal plane to enable capturing of the entire light-field of the subject in the container. The alignment of the micro-lens array with respect to the main lens is a critical factor in determining imaging quality, hence the need for calibration. Raytrix provides a calibration filter, which comes with a calibration grid. The calibration step involves aligning the micro-lens images to the calibration grid. This ensures that the f-numbers of the main imaging system and the micro-lens imaging system are matched, for optimal imaging quality. Accurate depth images may then be estimated, for example, based on correspondence and defocus cues, as well as on variational methods.
[0074] Additionally or alternatively, depth information may be used. The virtual depth is an important measure in the field of plenoptics. Each micro image (image formed by a micro lens) shows the virtual main lens image from a different perspective. Based on the focused image of a point in two or more micro images, the depth of image points can be computed as a ratio, by triangulation. Thus, the depth map supplied by the light-field camera is in the space of virtual
depths. The non-linear scale of virtual depths requires a non-trivial calibration step to convert it to real world object distances (metric depths).
[0075] A model may be used to perform metric depth calibration of the Raytrix R5. In this non-explicit estimation approach, the object distance o is expressed as a linear combination of the virtual depth v and of u, which is defined as u = ov.
[0076] o = u·c0 + v·c1 + c2 (1)

where c0, c1 and c2 are real-valued coefficients, defined to express an unknown quantity (the actual depth of an object) in terms of a measurable quantity (the virtual depth). In an example, these coefficients are used to express the actual depth of a subject in terms of a virtual depth. These coefficients may be learned; that is, they may be obtained by imaging a controlled setting in which the actual depths of the subject are known and fitting a mathematical model to such training data. This step may be part of the camera calibration process.
[0077] Since in Eq. (1) o, u and v are measurable dimensions, the coefficients can be determined from a number of calibration points. For training this model, calibrated step-like chips of known heights were used. Twenty calibration points were drawn from 10 different heights spanning the relevant depth range; v is obtained from the camera, o is physically measured and u is the product of these values. A simple least squares method is used to estimate the coefficients. Substituting u = ov into Eq. (1) and rearranging, o can now be expressed as a function of v as in Eq. (2):

o = (v·c1 + c2) / (1 − v·c0) (2)
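For illustration only, the coefficient fit and the virtual-to-metric conversion of Eqs. (1) and (2) might be implemented as follows; the calibration arrays below are synthetic stand-ins for the measured points, and the coefficient values are illustrative, not from this disclosure.

```python
import numpy as np

# Synthetic calibration data standing in for the measured points:
# v comes from the camera, o is physically measured, and u = o*v by definition.
true_c = np.array([0.01, -5.0, 40.0])                      # illustrative c0, c1, c2
v = np.linspace(2.0, 4.0, 20)                              # virtual depths
o = (true_c[1] * v + true_c[2]) / (1.0 - true_c[0] * v)    # ground-truth distances
u = o * v

# Least squares fit of Eq. (1): o = u*c0 + v*c1 + c2.
A = np.column_stack([u, v, np.ones_like(v)])
c0, c1, c2 = np.linalg.lstsq(A, o, rcond=None)[0]

# Eq. (2): metric object distance as a function of virtual depth alone.
def metric_depth(v):
    return (c1 * v + c2) / (1.0 - c0 * v)
```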
[0078] For the purpose of pose estimation as shown in Figure 3B, a two-phase strategy may be employed. A fish's natural tendency is to change its direction of propagation along the horizontal plane. On the other hand, when diving or surfacing, the body is inclined at an angle to the horizontal, while remaining straight. Other locomotory behaviours are seldom observed. To model the body-curve of the fish identically to natural behaviour, the polynomial body-curve is
taken to be quadratic in the horizontal plane, while constraining it to being linear in the vertical direction. The two-phase approach first determines the quadratic relationship, followed by the linear relationship to get the entire body-curve in 3D. The proposed method is detailed in the following.
[0079] Preprocessing
[0080] The first step of the preprocessing phase isolates the depth point cloud of the fish. In case of multiple fish in the scene, point clouds belonging to each of the fish are isolated. For this purpose, basic image processing techniques including smooth filtering, thresholding and connected component analysis are performed on the top-view depth image. Additionally a mask, obtained by thresholding the corresponding fully focussed color image, is applied. This helps segment the 2D area of the fish from potentially noisy depth images, and produce viable point clouds. In the case of fish not being separable from the top-view, a dedicated fish localization module is employed to perform heuristic clustering of the data to isolate each fish into a cluster, based on its position in the previous frames.
[0081] The head and tail locations within the point cloud are assigned. The anatomy of the zebrafish is taken into consideration herein, which provides prior knowledge that the tail would be more tapered than the head. Points at the boundaries of the segmented 2D area are taken, and the spherical neighbourhood in the point cloud of the corresponding 3D points is profiled. The density of points in this neighbourhood is used in conjunction with the direction of motion, obtained from the trajectory, to determine the head and tail positions. This will prove to be a necessary high-fidelity step as the potential fields, to be discussed below, are constructed based on the fish orientation. The line segment connecting these two positions is considered as the axis of the fish. To proceed without loss of generality, the fish point cloud is transformed to a canonical coordinate system by aligning the fish axis to the x-axis and translating its centroid to the origin. At the end of the pose estimation procedure, the fish is shifted back to its original coordinate system by using the corresponding inverse transformation.
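A minimal sketch of this preprocessing step is given below, assuming `depth` is the top-view depth image and `color` the fully focussed colour image; the array names, thresholds and the dark-subject assumption are illustrative, not details fixed by this disclosure.

```python
import numpy as np
from scipy import ndimage

def isolate_fish_point_clouds(depth, color, depth_thresh, color_thresh):
    smoothed = ndimage.gaussian_filter(depth, sigma=2)   # smooth filtering
    mask = smoothed > depth_thresh                       # thresholding
    mask &= color.mean(axis=2) < color_thresh            # mask from the colour image
    labels, n = ndimage.label(mask)                      # connected component analysis
    clouds = []
    for i in range(1, n + 1):
        ys, xs = np.nonzero(labels == i)
        zs = depth[ys, xs]
        clouds.append(np.column_stack([xs, ys, zs]))     # one point cloud per fish
    return clouds
```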
[0082] Phase I: Quadratic component
[0083] The point cloud is first projected onto the XY plane in the canonical system. A common curve fitting approach is then adopted to learn the body curve's projection on the horizontal plane. Modelling the curve's projection as a second order polynomial,

[0084] y(x) = a2·x^2 + a1·x + a0 . (3)

[0085] This system can be represented as a·X = y, where

[0086] a = (a2 a1 a0) , (4)

[0087] X = [ x1^2 x2^2 ... xn^2 ; x1 x2 ... xn ; 1 1 ... 1 ] , (5)

[0088] y = (y1 y2 ... yn) . (6)

[0089] As X is not a square matrix, its inverse cannot be computed to solve the system directly to determine a. To circumvent this, QR factorization of X^T is performed such that X^T = QR, where Q is orthogonal and R is an upper triangular matrix.

[0090] X^T·a^T = y^T , (7)

[0091] Q·R·a^T = y^T . (8)

[0092] Since Q is orthogonal, Q^T·Q = I. Multiplying both sides of Eq. (8) by Q^T,

[0093] R·a^T = Q^T·y^T . (9)

[0094] Solving for a,

a^T = R^+ (Q^T·y^T) . (10)
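As a concrete sketch of Phase I under Eqs. (3)-(10), the QR-based solve can be written directly in NumPy; `pts` is an assumed n-by-3 array of fish points already transformed to the canonical coordinate system.

```python
import numpy as np

def fit_quadratic_xy(pts):
    x, y = pts[:, 0], pts[:, 1]
    X = np.vstack([x**2, x, np.ones_like(x)])   # 3 x n design matrix, Eq. (5)
    Q, R = np.linalg.qr(X.T)                    # X^T = QR, Eq. (8)
    a = np.linalg.pinv(R) @ (Q.T @ y)           # a^T = R^+ (Q^T y^T), Eq. (10)
    return a                                    # (a2, a1, a0)
```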
[0095] Phase II: Linear component
[0096] After obtaining the horizontal projection of the body curve, its orientation along the depth axis is estimated. The plane containing the secant connecting the end points of the fish's body curve, and perpendicular to the image plane, is constructed. The entire point cloud is then projected on to this newly constructed plane. The approach to determine the linear coefficients is very similar to Phase I, except that this projection of the body curve is modelled as a first order polynomial.
[0097] z(x) = b1·x + b0 . (11)

[0098] This system can be represented as b·X = z, where

[0099] b = (b1 b0) , (12)

[00100] X = [ x1 x2 ... xn ; 1 1 ... 1 ] , (13)

[00101] z = (z1 z2 ... zn) . (14)
[00102] As described previously, using QR decomposition and a direct solver, the coefficient matrix b is computed. A visual of the process is shown in Figure 4A. A depth value for every point on the body curve's projection is computed, which enables visualization and presentation of the pose in 3D.
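A sketch of Phase II in the same style follows; `head` and `tail` are illustrative names for the end points of the body curve obtained earlier, and the vertical plane through their secant is represented by its horizontal direction vector.

```python
import numpy as np

def fit_linear_depth(pts, head, tail):
    d = tail[:2] - head[:2]
    d = d / np.linalg.norm(d)             # horizontal direction of the secant
    s = (pts[:, :2] - head[:2]) @ d       # coordinate within the vertical plane
    z = pts[:, 2]
    X = np.vstack([s, np.ones_like(s)])   # 2 x n design matrix, Eq. (13)
    Q, R = np.linalg.qr(X.T)
    b = np.linalg.pinv(R) @ (Q.T @ z)     # same QR solve as in Phase I
    return b                              # (b1, b0)
```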
[00103] Figure 4A shows how an image captured by the imaging system shown in Figure
3 may be represented. In an example, salient features of commonly used fish representations are adopted and extended to formulate the model. Figures 4B and 4C show how the linear component in Phase II of the pose estimation algorithm may be estimated. Figure 4B illustrates that points (as represented by the arrows) indicating the body curve of the fish may be projected onto a constructed plane. As mentioned above, the body curve's projection of the fish may be obtained from the depth values. The image plane, as shown in Figure 4B, represents the horizontal projection of the body curve of the fish. Figure 4C shows how a linear relationship between the points that indicate the body curve may be derived at the intersection of the constructed plane and the image plane. That is, a linear relationship is determined based on the depth values and the horizontal projection of the body curve so as to estimate the 3D shape and/or behaviour of the fish.
[00104] Two points, for the head and tip of the tail respectively, were determined to be essential not only for estimation of the pose, but also while describing its motion (e.g. estimating behaviour). The body of the fish is modelled as an elastic polynomial curve in R^3.
[00105] As mentioned previously, a marker-free tracking algorithm is proposed which works well even with multiple size-matched fish in the arena. The following presents a procedure for the tracking task.
[00106] Algorithm 1: Tracking at time t + 1:

[00107] Input: N_f ← number of fish in the arena

[00108] φ_t^i, i = 1..N_f ← potential fields calculated at time t

[00109] π_t^i, π_{t+1}^i, i = 1..N_f ← point clouds at t and t + 1 respectively; θ_{t+1}^i, i = 1..N_f ← current computed poses without IDs; Ω_t ← N_f × N_f occlusion label matrix at time t; α_t ← 1 × N_f fish IDs at time t. Output: α_{t+1}

[00110] If Ω_t == 0, then

[00111] for n_f = 1 : N_f do

[00112] α_{t+1}^{n_f} ← α_t^k such that overlap(π_t^k, π_{t+1}^{n_f}) is maximum

[00113] End for

[00114] Else, first resolve IDs for all those not involved in any occlusion using the overlap method.

[00115] For unresolved entities i do

[00116] α_{t+1}^i ← α_t^k such that φ_t^k(θ_{t+1}^i) is maximized. End for

[00117] End if

[00118] Compute Ω_{t+1}
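A compact sketch of this assignment logic is given below; the voxel-based `overlap` score and the `potential` callback are illustrative stand-ins for the overlap method and the potential-field evaluation described herein, not implementations taken from this disclosure.

```python
import numpy as np

def overlap(cloud_a, cloud_b, voxel=1.0):
    # Count shared occupied voxels between two point clouds (illustrative score).
    a = {tuple(p) for p in np.floor(cloud_a / voxel).astype(int)}
    b = {tuple(p) for p in np.floor(cloud_b / voxel).astype(int)}
    return len(a & b)

def track_step(ids_t, clouds_t, clouds_t1, fields_t, poses_t1, occluded, potential):
    n = len(ids_t)
    ids_t1 = [None] * n
    for j in range(n):              # fish not involved in occlusion: overlap method
        if not occluded[j]:
            k = int(np.argmax([overlap(clouds_t[i], clouds_t1[j]) for i in range(n)]))
            ids_t1[j] = ids_t[k]
    for j in range(n):              # remaining fish: maximize the potential field
        if ids_t1[j] is None:
            k = int(np.argmax([potential(fields_t[i], poses_t1[j]) for i in range(n)]))
            ids_t1[j] = ids_t[k]
    return ids_t1
```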
[00119] Potential fields
[00120] The idea of the potential field is to quantitatively capture the transition of the fish from time t to t + 1. Since locomotion is governed by laws of physics, all transitions are not equally likely. Some happen readily while few others involve crossing a larger energy barrier to make the transition. Modern video capturing techniques ensure sufficiently fast recording, and hence the problem of discontinuous motion, which makes tracking almost impossible, can be ignored.
[00121] The two important physical factors dictating a transition are the distance δ the fish has moved and the angle ρ through which it has turned. When ρ = 0, it is the case of motion along the same straight line. Making a sharp turn, i.e. with higher values of ρ, requires the fish to expend more energy to overcome inertia. Hence, these transitions can be considered less likely when compared to the ρ = 0 case. Similarly, for the fish to jump a greater distance, it needs more energy. Hence transitions with smaller δ are more likely. A prior assumption is that fish do not swim backward, and hence a partial (one-sided) field is needed to model their movement.
[00122] In an embodiment, the potential field is modelled as a three-dimensional angular Gaussian. In case of bends or turns, the principal axis of the Gaussian is along the tangent to the body curve, evaluated at the head. Without loss of generality, the potential field at time instant t, for a range of values of δ and ρ, is defined as φ_{t,δ}. Typically ρ ∈ [−s/2, s/2] and δ is a small positive quantity. The Gaussian field's mean, μ, is at the centroid of the fish body, and the field has a covariance Σ of size 3 × 3. With reference to the global coordinate system, let the centroid at this instant be C(t) = (x, y, z). Then φ_{t,δ}(x + Δx, y + Δy, z + Δz) gives the potential for a transition such that C(t + 1) = (x + Δx, y + Δy, z + Δz). The higher the potential, the more likely the transition is to occur. Figure 5 shows an example of the potential field constructed for a random pose. It can be noted from Figure 5B that the distribution is only on one side of the mean, based on the direction of propagation. Also, the height of the cross section decays with distance.
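A minimal sketch of such a one-sided field is shown below, under the assumption that the anisotropy along the direction of propagation is encoded in the covariance matrix; all parameter names are illustrative.

```python
import numpy as np

# `mu` is the body centroid, `heading` the unit tangent at the head, and
# `cov` a 3x3 covariance whose principal axis is aligned with `heading`.
def potential_field(pos, mu, heading, cov):
    d = np.asarray(pos, dtype=float) - np.asarray(mu, dtype=float)
    if d @ heading <= 0:          # fish are assumed not to swim backward
        return 0.0
    return float(np.exp(-0.5 * d @ np.linalg.inv(cov) @ d))
```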
[00123] Flagging occlusion
[00124] Detecting potential occlusions in subsequent frames is an important step. When two fish are nearing each other, their trajectories can potentially cross. Resolving each of these fishes' identities after crossovers involving touching or partial/complete occlusions is generally a challenging problem. The difficulty of identity resolution in such cases increases with the number of entities involved. A module is implemented to detect possible occlusions by checking the extent of pairwise overlap of the respective potential fields, governed by a threshold. Figure 6, including Figures 6A and 6B, shows a scenario where either of the fish may be occluded in immediate downstream frames according to an embodiment.
[00125] Connecting tracked segments
[00126] It is possible that after an occlusion, the identities of the fish cannot be assigned with high confidence. In this scenario, the tracking algorithm waits for a specified amount of time w_t to see if identity recovery is possible. Tracking is reset with randomized identities for all the fish if recovery is not possible. This results in one or more fully tracked segments, where consistency of identities is ensured at an intra-segment level while there is no mapping at an inter-segment level. For the identities to be globally consistent, such a mapping must be established. For this purpose, a procedure is implemented wherein discriminative appearance features are calculated for each individual and a nearest-neighbour search is employed to calculate feature similarity to assign the identities. A person skilled in the art will appreciate that, conventionally, this procedure is employed as a semi-automatic workflow that uses this module as the sole modality for tracking, which makes it computationally very expensive. In contrast, the proposed model uses it only as a complementary modality to handle conflicts, and in an automatic fashion. For the experiments in various embodiments, w_t = 300 ms.
[00127] Implementation details
[00128] The complete data acquisition protocol is detailed below. The spatial resolution of the recorded depth images is 1024 × 1024 pixels. MATLAB R2013a may be used for the implementation of the pose estimation and tracking algorithm. Video recordings and experiments may be done using a laboratory desktop with standard CPU processing and memory configurations. The unit may be equipped with an NVIDIA GeForce GTX TITAN graphics processing unit (GPU) with CUDA support and a CameraLink hardware interface for the Raytrix camera. The multi-threaded video capturing software may be written in C++ using the Raytrix Lightfield API. A Basler color camera (acA1600-60gc) may be synchronized with the Raytrix for a parallel side-view capture, purely for verification purposes. On average, the frame rate of a synchronized data capture setup may be 25 FPS (frames per second). The option of dropping the verification view is allowed, which will enable the top-view camera to capture at 50 FPS.
[00129] Experiments and discussion
[00130] The proposed pose estimation and tracking algorithm has been applied to a number of real datasets. Experiments with the number of fish N = 2,3,4 and 6 were performed. Both albino and normal fish were used for the set of experiments. To capture natural motion of the fish, they were left in the tank for 15 minutes for acclimatisation before the recordings were
started. A pyramidal tank, emulating the field of view of the light-field camera, was used for the experiments. The top of the tank measured 9.5 cm × 9.5 cm and the bottom 12.7 cm × 12.7 cm, with the height being 9 cm. The floor of the fish tank was about 30 cm away from the top-mounted depth camera. The side view from the Basler color camera was software-synchronized with the depth camera to test the effectiveness of the estimation method. Figures 7 and 8 illustrate experimental results. From top to bottom, the rows show the top-view image, the top-view estimation (color-coded by fish ID), the side-view estimation, the corresponding side-view validation image and the trajectory until this time instant, respectively. With a fair degree of accuracy, the 3D locations and poses of multiple adult zebrafish were predicted. The number of crossings/occlusions was much larger in the six-fish scenario. The proposed tracking algorithm effectively resolves the identities of fish in most of these scenarios. Both partial and complete occlusions are dealt with through the efficacy of the potential fields. In the event of occlusions, the tracking is put on hold if the individual fish poses cannot be determined with a high degree of confidence, and the identities are recovered in subsequent frames where the fish separate out. Temporal information in the form of the potential field is used to assign identities back. Cases of reflections from the sides of the tank were dealt with successfully.
[00131 ] Schreckstoff Experiments
[00132] To demonstrate the applicability and utility of the system, a behavioural study was performed with the aim of parametrising and observing the fish's response to a known alarm substance: Schreckstoff. Multiple repeats of the protocol detailed below were performed for statistical relevance, with both one and two fish. Parameters to be quantified included speed, turn angles, body curvature, body pitch, swimming depth and spatial occupancy, among others. Exemplary results from such a study are shown in Figure 7. The body pose and kinematics inferred by the system are able to differentiate the behaviour of the fish on exposure to Schreckstoff, based on these pre-defined parameters. A person skilled in the art will appreciate that, in the high-anxiety state, the fish show a strong preference for lower regions in the tank. Classic escape-mechanistic signs are also observed, such as high speeds, sharper turns etc. This provides fertile ground for potentially mining subtler behavioural patterns that may escape the human eye. Given the capability of the system to track multiple individuals, with high fidelity, over long periods, another interesting parameter to look at is the inter-fish distance, as shown in Figure 8. It is observed that there are oscillations in the proximity of the fish in the low-anxiety state. The frequency of such oscillations increases in the high-anxiety state, owing to the darting exhibited by the fish. The period of time with constant inter-fish distance is when both fish freeze, indicating a strong reaction to Schreckstoff.
[00133] Figure 7 illustrates the effect of Schreckstoff on the subject. Figure 7A shows the effect of SS in relation to the swimming depths of the fish. Figure 7B shows the effect of SS in relation to the speeds of the fish. Figure 7C shows the effect of SS in relation to the body pitches of the fish. Figure 7D shows the effect of SS in relation to the turn angles of the fish. Figures 7E and 7F show the effect of SS in relation to the occupancy of the tank before and after exposure to the SS, respectively.
[00134] Data capture protocol
[00135] The conditions under which the data is recorded play an important role in the case of a light-field camera. One of the most critical factors is the lighting condition. Four fibre-optic light sources were thus placed at different heights covering the depth range of the tank, to lessen bias towards lighting at only certain depths. Raytrix adopts a stereo-vision-like mechanism for the estimation of depths, in the sense that texture is essential for determining the depths at a location. For this purpose, a black and white texture strip with QR-code-like printed patterns is placed on the floor above which the tank is placed. This ensures that the background is fully textured and accurate estimation of depths of foreground objects is possible. One other factor to be addressed is the meniscus of the water. The surface shape creates an undesired lens effect which needs to be removed. For this purpose, water is filled up to the brim until there is a full upward meniscus and the tank is then capped with a lid. The Raytrix camera was placed at a height of 30.1 cm from the floor. The focus distance was set to 0.3 m and the exposure time to 2.5 ms. A Basler acA1600-60gc color camera is side-mounted to perform synchronized capture for verification purposes. For depth map reconstruction from raw light-field data, the default software support (RxLive API) from Raytrix is used, which offers a host of tunable parameters, each of whose values will have an effect on the reconstructed depth profiles. For the given capture setting, a relatively robust set of parameter values that were identified is presented in Table 1. The complete list of parameters can be found in the online documentation of the Raytrix Lightfield SDK. Those parameters not specified herein can be assumed to take their default values.
Parameter | Value
---|---
Bilateral filter radius | 10
Bilateral edge smoothing factor | 0.05
Bilateral noise reduction factor | 5

Table 1: Empirically determined Raytrix parameters for data recording.
[00136] Schreckstoff experimental protocol
[00137] To control the number of variables in the experimental procedure, the protocol was strictly adhered to. Fish were moved from their tanks only after the arena was set up as in Appendix 6, while ensuring minimal net stress. The subjects were given a ten-minute acclimatization time, over the course of which the arena lighting was gradually increased to maximum brightness. The data acquisition was then started and was not interrupted until the end. After 2.5 minutes, Schreckstoff (SS) was injected slowly into the tank by means of a syringe fitted with a long capillary tube. Post-injection of the SS, the recording continued for another 2.5 minutes. At the end of a round of acquisition, all apparatus including the tank, liner, lid, syringe etc. were thoroughly alcohol-washed, and soaked in water overnight. The subject was immediately euthanised as per standard guidelines, and frozen down at −80 °C, to preserve hormone expression levels for downstream biochemical studies.
[00138] Figure 9 shows pose predictions spanning the depth range of the container according to an embodiment. Each column corresponds to the scenario mentioned at the top, with the color-coded positions of two fish indicated. Row-wise from top to bottom: top-view color image, top-view prediction overlaid on the input depth image, side-view prediction, and orthogonal view for validation of the prediction.
[00139] Figure 10, including Figures 10A-10D, illustrates views of different parts of the algorithmic analysis. Figure 10A shows the depth image viewed from above by the plenoptic camera, overlaid with the pixels fulfilling a threshold and the spinal model. Figure 10B illustrates the side view from a second camera, which is used for manual verification of the model. Fish are represented by 1002 and 1004. Figure 10C shows the spinal model of the three fish (blue, red, cyan bold lines) and the historical path of each fish in three dimensions. Figure 10D shows a vertical plane (the same side view as the second camera angle) of the model, showing depth data pixels fulfilling a threshold value and the spinal models of each fish.
[00140] Figure 11, including Figures 11A-11C, shows a comparison of images of plain features against patterned features. Figures 11A-11C show images taken using a plenoptic camera. On the left of Figures 11A-11C are three colour images recorded by the plenoptic camera. On the right of Figures 11A-11C are three corresponding 3D models calculated by the camera software. Figure 11A shows a white, channelled object placed on a white background. The 3D model is noisy and bears no resemblance to the object. Figure 11B shows the same object on a patterned background. The 3D model now shows a flat background and the object is clearly visible, though the channel is not rendered. Figure 11C shows the same object covered in patterned paper. The 3D model now clearly shows the channel feature of the object.
[00141] Figure 12, including Figures 12A-12B, shows an example of an image of a pattern sheet and its corresponding three-dimensional model, respectively. Figure 12A shows a two-dimensional image of an image subject taken by a plenoptic camera and Figure 12B shows the corresponding 3D model calculated by the camera software. The image subject is a flat test sheet with squares of different patterns. The model in Figure 12B shows that some patterns cause vertical distortion in the model, whereas others give a flat response.
[00142] Figure 13, including Figures 13A-13C, shows an example of an image taken of a mouse. Figure 13A shows a colour image of a mouse restrained within a pyramid container having a patterned background. Figure 13B shows the corresponding depth information obtained for the image shown in Figure 13A. Figure 13C shows the corresponding heat map of the image shown in Figure 13B. Figure 13 shows that the embodiment described above is applicable to larger animals such as mice.
[00143] Figure 14, including Figures 14A-14B, shows an example of an image taken of a mouse in an environment according to an embodiment. In this example, the environment is one that is outside the container. Figure 14A shows an image of a mouse that is taken in an open field arena with patterned background. Figure 14B shows the corresponding heat map of the image shown in Figure 14A. Figure 14 shows that the embodiment described above is applicable to larger animals such as mice.
[00144] Figure 15, including Figures 15A-15C, shows an example of an image of a mouse taken under patterned illumination according to an embodiment. Figure 15A shows an image of a mouse taken under patterned illumination. Figure 15B shows the corresponding heat map of the image shown in Figure 15A. Figure 15 shows that a better image is obtained when a pattern is projected onto the mouse, creating better optical texture.
[00145] Figure 16 depicts an exemplary computer / computing device 1600, hereinafter interchangeably referred to as a computer system 1600, where one or more such computing devices 1600 may be used to facilitate execution of the above-described method for estimating three-dimensional shape and/or behaviour of a subject. In addition, one or more components of the computer system 1600 may be used to realize the computer 1602. The following description of the computing device 1600 is provided by way of example only and is not intended to be limiting.
[00146] As shown in Figure 16, the example computing device 1600 includes a processor 1604 for executing software routines. Although a single processor is shown for the sake of clarity, the computing device 1600 may also include a multi-processor system. The processor 1604 is connected to a communication infrastructure 1606 for communication with other components of the computing device 1600. The communication infrastructure 1606 may include, for example, a communications bus, cross-bar, or network.
[00147] The computing device 1600 further includes a main memory 1608, such as a random access memory (RAM), and a secondary memory 1610. The secondary memory 1610 may include, for example, a storage drive 1612, which may be a hard disk drive, a solid state drive or a hybrid drive and/or a removable storage drive 1614, which may include a magnetic tape drive, an optical disk drive, a solid state storage drive (such as a USB flash drive, a flash memory device, a solid state drive or a memory card), or the like. The removable storage drive 1614 reads from and/or writes to a removable storage medium 1644 in a well-known manner. The removable storage medium 1644 may include magnetic tape, optical disk, non-volatile memory storage medium, or the like, which is read by and written to by removable storage drive 1614. As will be appreciated by persons skilled in the relevant art(s), the removable storage medium 1644 includes a computer readable storage medium having stored therein computer executable program code instructions and/or data.
[00148] In an alternative implementation, the secondary memory 1610 may additionally or alternatively include other similar means for allowing computer programs or other instructions to be loaded into the computing device 1600. Such means can include, for example, a removable storage unit 1622 and an interface 1640. Examples of a removable storage unit 1622 and interface 1640 include a program cartridge and cartridge interface (such as that found in video game console devices), a removable memory chip (such as an EPROM or PROM) and associated socket, a removable solid state storage drive (such as a USB flash drive, a flash memory device, a solid state drive or a memory card), and other removable storage units 1622 and interfaces 1640 which allow software and data to be transferred from the removable storage unit 1622 to the computer system 1600.
[00149] The computing device 1600 also includes at least one communication interface 1624. The communication interface 1624 allows software and data to be transferred between the computing device 1600 and external devices via a communication path 1626. In various embodiments of the invention, the communication interface 1624 permits data to be transferred between the computing device 1600 and a data communication network, such as a public data or private data communication network. The communication interface 1624 may be used to exchange data between different computing devices 1600, where such computing devices 1600 form part of an interconnected computer network. Examples of a communication interface 1624 can include a modem, a network interface (such as an Ethernet card), a communication port
(such as a serial, parallel, printer, GPIB, IEEE 1394, RJ25, USB), an antenna with associated
circuitry and the like. The communication interface 1624 may be wired or may be wireless. Software and data transferred via the communication interface 1624 are in the form of signals which can be electronic, electromagnetic, optical or other signals capable of being received by communication interface 1624. These signals are provided to the communication interface via the communication path 1626.
[00150] As shown in Figure 16, the computing device 1600 further includes a display interface 1602 which performs operations for rendering images to an associated display 1630 and an audio interface 1632 for performing operations for playing audio content via associated speaker(s) 1634.
[00151] As used herein, the term "computer program product" may refer, in part, to removable storage medium 1644, removable storage unit 1622, a hard disk installed in storage drive 1612, or a carrier wave carrying software over communication path 1626 (wireless link or cable) to communication interface 1624. Computer readable storage media refers to any non-transitory, non-volatile tangible storage medium that provides recorded instructions and/or data to the computing device 1600 for execution and/or processing. Examples of such storage media include magnetic tape, CD-ROM, DVD, Blu-ray™ Disc, a hard disk drive, a ROM or integrated circuit, a solid state storage drive (such as a USB flash drive, a flash memory device, a solid state drive or a memory card), a hybrid drive, a magneto-optical disk, or a computer readable card such as an SD card and the like, whether or not such devices are internal or external to the computing device 1600. Examples of transitory or non-tangible computer readable transmission media that may also participate in the provision of software, application programs, instructions and/or data to the computing device 1600 include radio or infra-red transmission channels as well as a network connection to another computer or networked device, and the Internet or Intranets including e-mail transmissions and information recorded on Websites and the like.
[00152] The computer programs (also called computer program code) are stored in main memory 1608 and/or secondary memory 1610. Computer programs can also be received via the communication interface 1624. Such computer programs, when executed, enable the computing device 1600 to perform one or more features of embodiments discussed herein. In various embodiments, the computer programs, when executed, enable the processor 1604 to
perform features of the above-described embodiments. Accordingly, such computer programs represent controllers of the computer system 1600.
[00153] Software may be stored in a computer program product and loaded into the computing device 1600 using the removable storage drive 1614, the storage drive 1612, or the interface 1640. Alternatively, the computer program product may be downloaded to the computer system 1600 over the communications path 1626. The software, when executed by the processor 1604, causes the computing device 1600 to perform functions of embodiments described herein.
[00154] It is to be understood that the embodiment of Figure 16 is presented merely by way of example. Therefore, in some embodiments one or more features of the computing device 1600 may be omitted. Also, in some embodiments, one or more features of the computing device 1600 may be combined together. Additionally, in some embodiments, one or more features of the computing device 1600 may be split into one or more component parts.
[00155] Advantageously, a systematic approach is proposed for automated 3D pose estimation and tracking of multiple subjects from a single camera. As shown in the above embodiments, the system that is described applies to a plurality of un-marked subjects. Additionally, the embodiments above are able to continuously acquire and analyze video footage for long durations. Light-field cameras can be used in tracking small animals like zebrafish. Embodiments above also demonstrate that the image capturing devices (e.g., cameras) can be applied for monitoring a plurality of subjects (e.g., zebrafish) together in a container. The light-field cameras that are used in some embodiments can acquire 3D imaging inputs, owing to their relatively high resolution and frame rate compared to other available 3D sensing options. That is, light-field cameras capture incoming light directions in addition to 2D images of the scene, obtained by micro-lenses with a narrow baseline of sub-pixel shifts. Recently, light-field cameras have been shown to be a promising paradigm for depth estimation and surface inspection in industrial and medical applications.
[00156] It will be appreciated by a person skilled in the art that numerous variations and/or modifications may be made to the present invention as shown in the specific
embodiments without departing from the spirit or scope of the invention as broadly described. The present embodiments are, therefore, to be considered in all respects to be illustrative and not restrictive.
Claims
1. An imaging system for estimating three-dimensional shape and/or behaviour of a subject, comprising:
a container configured to receive the subject; and
an image capture device optically coupled to the container and configured to capture a two-dimensional image of the subject in the container, the captured image comprising a plurality of micro-images, each of the plurality of micro-images representing a separate portion of a two-dimensional representation of a three-dimensional light-field in the container and including depth information of the subject in the container,
wherein the three-dimensional shape and/or behaviour of the subject in the container is estimated based on the captured two-dimensional image of the subject in the container.
2. The imaging system according to claim 1, wherein the container has a patterned background, wherein the image capture device is configured to capture the image of the subject in the container against the patterned background of the container.
3. The imaging system according to claim 2, wherein the patterned background is a randomized background comprising a high contrast two-dimensional random or pseudorandom sequence.
4. The imaging system according to claim 3, wherein a resolution of the randomized background of the container matches a field of view of optics of the image capture device.
5. The imaging system according to claim 1, wherein the image capture device is further configured to derive a three-dimensional contour of an exterior of the subject so as to estimate the three-dimensional shape and/or behaviour of the subject in the container.

6. The imaging system according to claim 1, further comprising a plurality of illuminating devices, each of the plurality of illuminating devices being disposed at one side of the container and configured to provide illumination to the three-dimensional light-field in the container.

7. The imaging system according to claim 6, wherein the plurality of illuminating devices are fibre-optic spot lights.

8. The imaging system according to claim 1, wherein a geometry of the container is configured to complement the field of view of the optics of the image capture device.

9. The imaging system according to claim 8, wherein the geometry of the container includes a pyramid.

10. The imaging system according to claim 1, wherein the container comprises a lid configured to complement a shape of the container.

11. The imaging system according to claim 10, wherein the lid is configured to float on water in the container so as to prevent the subject from causing disturbance to a surface of the water.
12. The imaging system according to claim 6 further comprising a shield surrounding a base of the container and configured to control the illumination in the container.
13. The imaging system according to claim 2, further comprising a calibration tool arranged to optically communicate with the image capture device for calibrating the image capture device.
14. The imaging system according to claim 13, wherein the calibration tool comprises a patterned surface for calibrating the image capture device.
15. The imaging system according to claim 14, wherein the patterned surface is patterned similarly to the randomized background of the container.
16. The imaging system according to claim 1, wherein the image capture device is a plenoptic depth camera.

17. The imaging system according to claim 1, further comprising one or more illuminating devices configured to project textured patterns onto the subject in the container to improve an optical contrast of the subject in the container.
18. A method for estimating three-dimensional shape and/or behaviour of a subject in a system comprising a container and an image capture device, the method comprising: capturing, by the image capture device, a two-dimensional image of the subject in the container, the captured image comprising a plurality of micro-images, each of the plurality of micro-images representing a separate portion of a two-dimensional representation of a three-dimensional light-field and including depth information of the subject in the container; and
estimating the three-dimensional shape and/or behaviour of the subject in the container based on the captured two-dimensional image of the subject in the container.
19. The method according to claim 18, further comprising overfilling the container with water and placing a lid on the water to remove optical distortion during the capturing step.
20. The method according to claim 18, further comprising projecting textured patterns within the container to improve an optical contrast of the subject in the container during the capturing step.
21. The method according to claim 18, further comprising retrofitting a pyramidal-shaped partition into the container.
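The hedged sketches below are editorial illustrations of several of the claimed mechanisms (claims 3-4, 8-9, 13-15 and 18); none of them is the patented implementation. Claims 3 and 4 call for a high-contrast two-dimensional random or pseudorandom background whose resolution matches the capture optics. A minimal Python sketch of generating such a pattern, with an illustrative resolution, cell size and seed:

```python
import numpy as np
from PIL import Image

def make_random_background(width_px: int, height_px: int,
                           cell_px: int = 8, seed: int = 42) -> Image.Image:
    """Black-and-white pseudorandom block pattern for the container walls.

    cell_px sets the feature size; it would be chosen so that one cell
    spans a few sensor pixels in the capture device's field of view
    (claim 4). Both values here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)  # fixed seed gives a reproducible pseudorandom sequence
    cells = rng.integers(0, 2, size=(height_px // cell_px, width_px // cell_px))
    # Upsample each random cell to a cell_px x cell_px block of black or white.
    pattern = np.kron(cells, np.ones((cell_px, cell_px), dtype=np.uint8)) * 255
    return Image.fromarray(pattern.astype(np.uint8), mode="L")

# Illustrative resolution only; the real size depends on the optics.
make_random_background(1920, 1080).save("container_background.png")
```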
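Claims 8 and 9 state that the container geometry, for example a pyramid, complements the field of view of the optics. Assuming the container walls follow the camera's view frustum (an assumption, not a statement from the patent), the matching cross-section at a given working distance is straightforward to compute:

```python
import math

def frustum_width(fov_deg: float, distance: float) -> float:
    """Width of the view frustum at the given distance from the camera.

    A pyramidal container whose walls track the frustum would use this
    as its cross-section so the whole water volume stays in view.
    """
    return 2.0 * distance * math.tan(math.radians(fov_deg) / 2.0)

# Hypothetical numbers: a 60-degree horizontal FOV at 300 mm working distance.
print(f"container width at 300 mm: {frustum_width(60.0, 300.0):.1f} mm")  # ~346.4 mm
```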
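Claims 13 to 15 describe a calibration tool whose patterned surface resembles the randomized background. One plausible ingredient of such a calibration, sketched here with OpenCV, is locating a known patch of the pattern in a captured frame by normalized cross-correlation; the file names are placeholders and the patent does not prescribe this algorithm:

```python
import cv2

# Known pattern (e.g. the printed background) and a captured calibration frame.
pattern = cv2.imread("container_background.png", cv2.IMREAD_GRAYSCALE)
frame = cv2.imread("calibration_frame.png", cv2.IMREAD_GRAYSCALE)

# Use a small patch as the template; it must be smaller than the frame.
patch = pattern[:128, :128]

# TM_CCOEFF_NORMED is insensitive to global brightness changes, which helps
# under the spot lighting of claims 6 and 7.
scores = cv2.matchTemplate(frame, patch, cv2.TM_CCOEFF_NORMED)
_, best_score, _, best_xy = cv2.minMaxLoc(scores)
print(f"patch located at {best_xy} with correlation {best_score:.3f}")
```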
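Claim 18 rests on the plenoptic principle that each micro-image captures a separate portion of the light-field, so the same scene point appears shifted between neighbouring micro-images and that shift (disparity) encodes depth. A toy sketch of this cue, assuming a square micro-image grid and a simple sum-of-absolute-differences matcher, neither of which is specified by the patent:

```python
import numpy as np

def micro_image(raw: np.ndarray, row: int, col: int, size: int) -> np.ndarray:
    """Cut one micro-image out of the raw plenoptic sensor image."""
    return raw[row * size:(row + 1) * size, col * size:(col + 1) * size]

def disparity(a: np.ndarray, b: np.ndarray, max_shift: int = 4) -> int:
    """Horizontal shift (pixels) that best aligns micro-image b to a,
    scored by mean absolute difference over the overlapping columns."""
    best, best_err = 0, np.inf
    for s in range(-max_shift, max_shift + 1):
        if s >= 0:
            err = np.abs(a[:, s:] - b[:, :b.shape[1] - s]).mean()
        else:
            err = np.abs(a[:, :s] - b[:, -s:]).mean()
        if err < best_err:
            best, best_err = s, err
    return best

# Toy usage over a stand-in raw frame split into 16x16-pixel micro-images.
raw = np.random.rand(256, 256)
size = 16
n = 256 // size
depth_proxy = np.zeros((n, n - 1))
for r in range(n):
    for c in range(n - 1):
        left = micro_image(raw, r, c, size)
        right = micro_image(raw, r, c + 1, size)
        depth_proxy[r, c] = disparity(left, right)  # larger shift ~ nearer point
```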
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
SG10201600799U | 2016-02-02 | ||
SG10201600799U | 2016-02-02 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2017135896A1 (en) | 2017-08-10 |
Family
ID=59499953
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/SG2017/050048 WO2017135896A1 (en) | 2016-02-02 | 2017-02-02 | An imaging system and method for estimating three-dimensional shape and/ or behaviour of a subject |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2017135896A1 (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140209035A1 (en) * | 2012-11-14 | 2014-07-31 | Solvic Limited | Aquarium lighting system |
WO2014116120A1 (en) * | 2013-01-28 | 2014-07-31 | Sinvent As | System and method for counting zooplankton |
WO2015074718A1 (en) * | 2013-11-22 | 2015-05-28 | Vidinoti Sa | A light field processing method |
US20150347833A1 (en) * | 2014-06-03 | 2015-12-03 | Mark Ries Robinson | Noncontact Biometrics with Small Footprint |
Non-Patent Citations (4)
Title |
---|
LAZEBNIK, S. ET AL.: "Beyond bags of features: spatial pyramid matching for recognizing natural scene categories", IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, 22 June 2006 (2006-06-22), pages 2169 - 2178, XP010923120, [retrieved on 20170417] * |
MURGIA, F. ET AL.: "3D Reconstruction from Plenoptic Image", 23RD TELECOMMUNICATIONS FORUM TELFOR (TELFOR), 26 November 2015 (2015-11-26), pages 448 - 451, XP032842216, [retrieved on 20170417] * |
"Recent Development of Volumetric PIV with a Plenoptic Camera", 2 July 2013 (2013-07-02), Retrieved from the Internet <URL:http://resolver.tudelft.nl/uuid:087ebcbc-b6fd-4db5-9e66-684eca0db20f> [retrieved on 20170417] * |
XU, C. ET AL.: "Depth Image Based Articulated Object Pose Estimation, Tracking, and Action Recognition on Lie Groups", INTERNATIONAL JOURNAL OF COMPUTER VISION, 23 February 2017 (2017-02-23), pages 1 - 25, XP036253513, [retrieved on 20170417] * |
Similar Documents
Publication | Title |
---|---|
JP6855587B2 (en) | Devices and methods for acquiring distance information from a viewpoint | |
US9952422B2 (en) | Enhancing the resolution of three dimensional video images formed using a light field microscope | |
US9087241B2 (en) | Intelligent part identification for use with scene characterization or motion capture | |
US20160093101A1 (en) | Method And System For Generating A Three-Dimensional Model | |
US20120313937A1 (en) | Coupled reconstruction of hair and skin | |
JP2009500042A (en) | System for 3D monitoring and analysis of target motor behavior | |
Qian et al. | Feature point based 3D tracking of multiple fish from multi-view images | |
Ye et al. | 3D reconstruction in the presence of glasses by acoustic and stereo fusion | |
McKinnon et al. | Towards automated and in-situ, near-real time 3-D reconstruction of coral reef environments | |
EP4278322A1 (en) | Methods and apparatuses for generating anatomical models using diagnostic images | |
Grover et al. | O fly, where art thou? | |
Kriegel et al. | Cell shape characterization and classification with discrete Fourier transforms and self‐organizing maps | |
Yekutieli et al. | Analyzing octopus movements using three-dimensional reconstruction | |
Martineau et al. | Tracking zebrafish larvae in group–status and perspectives | |
RU2004123248A (en) | SYSTEM AND METHOD OF TRACKING OBJECT | |
Yu et al. | Visual Perception and Control of Underwater Robots | |
WO2017135896A1 (en) | An imaging system and method for estimating three-dimensional shape and/ or behaviour of a subject | |
Jiang et al. | Automatic video tracking of Chinese mitten crabs based on the particle filter algorithm using a biologically constrained probe and resampling | |
CN109544622A (en) | A kind of binocular vision solid matching method and system based on MSER | |
Zováthi et al. | ST-DepthNet: A spatio-temporal deep network for depth completion using a single non-repetitive circular scanning Lidar | |
Wu et al. | DeepShapeKit: accurate 4D shape reconstruction of swimming fish | |
Ravan et al. | Rapid automated 3-D pose estimation of larval zebrafish using a physical model-trained neural network | |
Foo | Design and develop of an automated tennis ball collector and launcher robot for both able-bodied and wheelchair tennis players-ball recognition systems. | |
Fan et al. | Depth estimation of semi-submerged objects using a light-field camera | |
Jogeshwar | Look at the Bigger Picture: Analyzing Eye Tracking Data With Multi-Dimensional Visualization |
Legal Events
Code | Title | Description |
---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 17747884; Country of ref document: EP; Kind code of ref document: A1 |
NENP | Non-entry into the national phase | Ref country code: DE |
122 | Ep: pct application non-entry in european phase | Ref document number: 17747884; Country of ref document: EP; Kind code of ref document: A1 |