EP4689727A1 - Object tracking circuitry and object tracking method - Google Patents

Object tracking circuitry and object tracking method

Info

Publication number
EP4689727A1
Authority
EP
European Patent Office
Prior art keywords
time
data
flight
object tracking
image data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
EP24715173.1A
Other languages
English (en)
French (fr)
Inventor
Marc Osswald
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sony Advanced Visual Sensing AG
Sony Semiconductor Solutions Corp
Original Assignee
Sony Advanced Visual Sensing AG
Sony Semiconductor Solutions Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sony Advanced Visual Sensing AG, Sony Semiconductor Solutions Corp filed Critical Sony Advanced Visual Sensing AG
Publication of EP4689727A1

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S 17/00 Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S 17/88 Lidar systems specially adapted for specific applications
    • G01S 17/89 Lidar systems specially adapted for specific applications for mapping or imaging
    • G01S 17/894 3D imaging with simultaneous measurement of time-of-flight at a 2D array of receiver pixels, e.g. time-of-flight cameras or flash lidar
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S 17/00 Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S 17/86 Combinations of lidar systems with systems other than lidar, radar or sonar, e.g. with direction finders
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality

Definitions

  • the present disclosure generally pertains to object tracking circuitry and to an object tracking method.
  • a global shutter (GS) camera may obtain multiple consecutive images and a movement of an object may be determined.
  • Time-of-flight sensing is generally known.
  • Time-of-flight cameras may determine a distance to an object based on a roundtrip delay of emitted light or based on a deterioration of emitted light.
  • event sensing may be generally known. In event sensing, a change of light conditions may be detected, thereby triggering an event.
  • the disclosure provides object tracking circuitry configured to: obtain time-of-flight data, based on a time-of-flight measurement, the time-of-flight data indicating the object; obtain image data indicating the object and at least one light spot, the light spot originating from light emitted for the time-of-flight measurement; and generate object data based on the time-of-flight data, on the image data, and on an association between the time-of-flight data and the image data, the association being based on the at least one light spot detected in the image data.
  • the disclosure provides an object tracking method comprising: obtaining time-of-flight data, based on a time-of-flight measurement, the time-of-flight data indicating the object; obtaining image data indicating the object and at least one light spot, the light spot originating from light emitted for the time-of-flight measurement; and generating object data based on the time-of-flight data, on the image data, and on an association between the time-of-flight data and the image data, the association being based on the at least one light spot detected in the image data.
  • Fig. 1 depicts a camera according to the present disclosure
  • Fig. 2 depicts an object tracking method, as it is generally known
  • Fig. 3 depicts a method for object tracking according to the present disclosure, wherein a constant frame rate is applied for ToF measurements;
  • Fig. 4 depicts how the present disclosure avoids a 2D-3D data correspondence problem which is present in the prior art
  • Fig. 5 depicts an embodiment for implicitly generating object data according to the present disclosure
  • Fig. 6 depicts a further embodiment of an object tracking method according to the present disclosure, wherein a varying frame rate is applied for ToF measurements;
  • Fig. 7 depicts, on the left, a conventional ToF illuminator, and on the right, a ToF illuminator according to the present disclosure
  • Fig. 8 depicts an AR/VR system according to the present disclosure
  • Fig. 9a depicts an embodiment in which an HMD tracks whether a hand of a user interacts with a virtual object by illuminating the hand;
  • Fig. 9b depicts an embodiment in which an HMD tracks whether a hand of a user interacts with a virtual object by illuminating the virtual object;
  • Fig. 10 depicts an embodiment of an object tracking method according to the present disclosure in a block diagram
  • Fig. 11 depicts a further embodiment of an object tracking method according to the present disclosure in a block diagram, in which a correspondence between ToF data and image data is determined; and
  • Fig. 12 depicts a further embodiment of an object tracking method according to the present disclosure in a block diagram in which it is decided, based on image data, whether ToF data should be obtained.
  • object tracking is generally known.
  • existing methods lead to imprecise object tracking, particularly when high-speed tracking is desired.
  • 3D data e.g., depth data
  • some embodiments pertain to object tracking circuitry configured to: obtain time-of- flight data, based on a time-of-flight measurement, the time-of-flight data indicating the object; obtain image data indicating the object and at least one light spot, the light spot originating from light emitted for the time-of-flight measurement; and generate object data based on the time-of- flight data, on the image data, and on an association between the time-of-flight data and the image data, the association being based on the at least one light spot detected in the image data.
  • Circuitry may include any entity or multitude of entities configured to carry out functions and/or methods, as described herein, such as a processor (e.g., CPU (Central Processing Unit), GPU (graphics processing unit), or the like), an FPGA (field-programmable gate array), any type of microprocessor, microcomputer, microarray, or the like. Also, different elements of the elements mentioned above may be combined for realizing circuitry according to the present disclosure.
  • although the circuitry is an “object tracking circuitry” in some embodiments, embodiments of the present disclosure also pertain to object detection circuitry, or the like.
  • some embodiments may pertain to a system, a camera, a computer, or the like, which includes or is connectable (or connected) to circuitry according to the present disclosure.
  • the object tracking circuitry is configured to obtain time-of-flight data.
  • any type of time-of-flight (ToF) technology may be used in order to carry out the embodiments described herein, such as direct time-of-flight (dToF), indirect time-of-flight (iToF), spot time-of-flight, or the like.
  • the ToF data may be obtained, whereby the object may be indicated in the ToF data.
  • the object may be indicated based on an object tracking algorithm, based on a measured depth, based on a shape, or the like.
  • the object tracking circuitry is configured to obtain image data.
  • the image data may be generated based on any imaging technology, e.g., based on CCD (charge coupled device) technology, CMOS (complementary metal oxide semiconductor) technology, or the like.
  • the image data may indicate the object based on any imaging method, as well, e.g., based on color imaging, black and white imaging, based on dynamic vision imaging, event sensing, or the like.
  • Dynamic vision or event sensing imaging may refer to an embodiment in which events may be generated in response to generated charges and the events may be determined (instead of e.g., colors). For example, an event may be determined when there is an intensity change (e.g., determined based on a change in a photocurrent) detected, which is above a predetermined threshold.
  • a pixel of an event sensor (EVS) or dynamic vision sensor (DVS) may not trigger as long as the photocurrent remains within a predetermined delta, and may only trigger when this delta is exceeded.
  • such a pixel may asynchronously generate its events.
  • signals may be divided into frames (without limiting the present disclosure in that regard).
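The event-triggering behavior described above can be sketched as follows. This is a minimal, synchronous model under stated assumptions: the contrast threshold value, the log-intensity formulation, and the name event_pixel are illustrative, since a real EVS pixel operates asynchronously in analog circuitry.

```python
import math

def event_pixel(photocurrents, threshold=0.2):
    """Emit +1/-1 events when the log-intensity change since the last
    event exceeds a fixed contrast threshold (simplified EVS pixel model)."""
    events = []
    ref = math.log(photocurrents[0])  # reference level at the last event
    for t, current in enumerate(photocurrents[1:], start=1):
        delta = math.log(current) - ref
        if abs(delta) > threshold:
            events.append((t, 1 if delta > 0 else -1))
            ref = math.log(current)  # reset the reference after an event
    return events
```

Intensity fluctuations below the threshold produce no events, which is what keeps the data rate low for a static scene.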
  • the image data may indicate, apart from the object, at least one light spot originating from light emitted for the ToF measurement.
  • the “spot” may be realized differently and the present disclosure should not be understood as limiting in that way.
  • a change of the light spot(s) may be detected, thereby detecting the object and/or movement of the object.
  • object data may be generated based on an association of the ToF data and the image data, wherein the light spot may be taken into account, as well, thereby generating object data.
  • the image data include event sensing data, as discussed herein.
  • the association is carried out for each time-of-flight measurement of a plurality of time-of-flight measurements, or for a subset (e.g., every second measurement, or a random number) of ToF measurements.
  • a frame rate of the image data may be higher than a frame rate of the ToF data and thus, the “correct” image acquisition may be determined for the respective ToF measurement.
  • the association may be carried out based on the at least one light spot such that the assignment of the correct image data to the respective ToF data may be carried out based on the at least one light spot, as well.
  • the association is an association in time, without limiting the present disclosure in that regard since a spatial association may be envisaged alternatively or additionally.
  • the time-of-flight data have a first capture rate (e.g., frame rate), wherein the image data have a second capture rate (e.g., frame rate), and wherein the second capture rate is greater than the first capture rate, and the association is further based on the first and the second capture rate.
  • a capture rate corresponds to a frame rate. Since in some embodiments, asynchronous EVS data may be obtained, the capture rate may correspond to the rate the events are generated, which may be different for each pixel in an EVS. On the other hand, the ToF data may have a fixed (or varying) frame rate.
  • the association may be carried out.
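As a minimal sketch of such an association in time, each ToF acquisition could be paired with the image acquisition whose timestamp is nearest. The helper below is hypothetical (timestamps in milliseconds are assumed), and a real system would additionally verify the pairing via the detected light spots.

```python
def associate(tof_timestamps, img_timestamps):
    """Pair each ToF frame with the image frame closest in time.
    Works for a fixed or varying image capture rate, as long as the
    image rate is higher than the ToF rate."""
    pairs = []
    for t_tof in tof_timestamps:
        nearest = min(img_timestamps, key=lambda t_img: abs(t_img - t_tof))
        pairs.append((t_tof, nearest))
    return pairs
```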
  • the object tracking circuitry is further configured to: determine a correspondence of a first coordinate, indicated by the time-of-flight data, to a second coordinate, indicated by the image data.
  • the determination of the correspondence may correspond to a “live calibration” of a corresponding (camera) system and may be based on the at least one light spot. Since a position of the at least one light spot may be known based on the ToF data (or based on a position of an illuminator with respect to a ToF sensor), a corresponding spot coordinate (for the at least one spot) may be determined for the image data.
  • a coordinate transformation between the two data types may be determined.
  • the association is further based on an emission frequency of the light emitted for the time-of-flight measurement.
  • the emission frequency of a light source may be connected to the frame rate or sampling rate of the ToF data, as commonly known.
  • the emission frequency may also be taken into account for the image data for determining the association, in some embodiments.
  • the association is (further) based on a neural network, as will be discussed below.
  • the object data are used for hand pose estimation.
  • the object tracking circuitry is further configured to: carry out a time-of- flight measurement when it is indicated, in the image data, that a predetermined condition is fulfilled.
  • coarse object data may be determined based on the image data. However, when it is determined, based on the coarse object data, that a finer object detection (or tracking) is necessary, the ToF measurement(s) may start.
  • Some embodiments pertain to an object tracking method including: obtaining time-of-flight data, based on a time-of-flight measurement, the time-of-flight data indicating the object; obtaining image data indicating the object and at least one light spot, the light spot originating from light emitted for the time-of-flight measurement; and generating object data based on the time-of- flight data, on the image data, and on an association between the time-of-flight data and the image data, the association being based on the at least one light spot detected in the image data.
  • the object tracking method may be carried out with object tracking circuitry according to the present disclosure or with an object tracking system (e.g., camera).
  • some embodiments pertain to an object tracking system, a camera, a computer, or the like, configured to carry out the methods described herein and/or including circuitry, as described herein.
  • the image data include event sensing data, as discussed herein.
  • the association is carried out for each time-of-flight measurement of a plurality of time-of-flight measurements, as discussed herein.
  • the association is an association in time, as discussed herein.
  • the time-of-flight data have a first capture rate, wherein the image data have a second capture rate, and wherein the second capture rate is greater than the first capture rate, and wherein the association is further based on the first and the second capture rate, as discussed herein.
  • the method further includes: determining a correspondence of a first coordinate, indicated by the time-of- flight data, to a second coordinate, indicated by the image data, as discussed herein.
  • the association is further based on an emission frequency of the light emitted for the time-of-flight measurement, as discussed herein.
  • the association is further based on a neural network, as discussed herein.
  • the object data are used for hand pose estimation, as discussed herein.
  • the method further includes: carrying out a time-of-flight measurement when it is indicated, in the image data, that a predetermined condition is fulfilled, as discussed herein.
  • the methods as described herein are also implemented in some embodiments as a computer program causing a computer and/or a processor to perform the method, when being carried out on the computer and/or processor.
  • a non-transitory computer- readable recording medium is provided that stores therein a computer program product, which, when executed by a processor, such as the processor described above, causes the methods described herein to be performed.
  • Fig. 1 depicts a camera 1 including a dToF unit 2 and an event sensing unit (EVS) 3.
  • the dToF unit 2 includes a transmitter Tx and a receiver Rx.
  • the transmitter Tx emits light spots (which are incident on a hand, as object 4) and which are reflected by the object 4 and received by the receiver Rx. In response to the reception of the reflection of the spots, ToF data are generated.
  • the transmitter Tx uses infrared (IR) light and the receiver Rx includes a bandpass filter to keep interference from ambient light as small as possible.
  • the EVS unit 3 does not include such a filter, such that the ambient light and the IR light are observed by the EVS unit 3.
  • the ambient light, as well as the reflection of the light spots are captured by the event sensing unit 3, which triggers an event in response to a change in the scene, thereby generating event sensing data.
  • the ToF data and the event sensing data are obtained by object tracking circuitry 5, which, in this embodiment, is a hand pose estimation (HPE) circuitry (a processor, in this embodiment), further configured to generate object data, as discussed herein.
  • the illuminator (transmitter Tx) is controlled.
  • a hand pose is estimated with a high update rate and low latency.
  • the dToF 2 has a sparse output and hence, a sparse projector (transmitter Tx) is used, but the present disclosure is not limited in that regard.
  • the projector is dynamically configurable, i.e., the projected dots may be changed according to the circumstances, e.g., based on the feedback loop. Multiple exposures are used (spatially and temporally separated) to provide a full (or partial, in some embodiments) scan of the field of view.
  • the EVS 3 is sensitive to passive light and active light from the transmitter Tx (which, in this embodiment, is in an infrared range).
  • a distance b (referring to a baseline between the projector Tx and the EVS) is very small. Thus, it can be estimated where to expect the projected dots in the EVS frame. If there is a significant baseline (b>0 or b»0), the projection of the dots in the EVS frame also depends on the depth.
  • the depth is known from the ToF measurement, such that it is possible to determine where the dot is expected to appear in the EVS frame even in the case of an existing baseline (b>0 or b»0).
  • “calibration” and “structured-light measurement” are carried out at the same time assuming partial calibration and depth information are known.
  • the epipolar line of the laser beam may be known, i.e., the dot projection in the EVS frame lies somewhere on this line.
  • the exact position of the dot in the EVS frame may depend on the actual depth of the scene at this point. If there are multiple dots, there might be many candidates on that line (known as the stereo correspondence problem). With additional information, the right dot correspondence may be found. This additional information is the original depth measurement of the dot (known from the ToF measurement), in some embodiments. Thereby, a good initial estimate can be determined of where the projected dot in the EVS frame should be. If the ToF depth measurement is perfect, the prediction is perfect, and thus, the predicted spot corresponds to the real spot in the EVS frame.
  • otherwise, the prediction will be at a different point, but likely close, such that the real spot can still be found.
  • the detected position may be used to compute depth from structured-light and correct the ToF measurement.
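The disambiguation step above can be sketched as follows: among several dot candidates detected on the same epipolar line, the one closest to the position predicted from the ToF depth measurement is selected. The helper is hypothetical; coordinates are assumed to be in EVS image pixels.

```python
def resolve_correspondence(predicted_xy, candidates):
    """Pick the candidate dot closest to the ToF-predicted position,
    resolving the stereo correspondence ambiguity on the epipolar line."""
    px, py = predicted_xy
    return min(candidates, key=lambda c: (c[0] - px) ** 2 + (c[1] - py) ** 2)
```

If the ToF depth were perfect, the predicted position would coincide with the chosen candidate; otherwise the nearest candidate still identifies the correct dot in most cases.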
  • This information may be used, for example, in probabilistic filtering, such as Kalman filters; alternatively, this could be formulated as a typical maximum likelihood estimation (MLE) problem where some parameters, such as calibration parameters but also depth, are jointly estimated based on observed values and known priors, without limiting the present disclosure in that regard.
  • the additional information used herein corresponds to the projected dots in the EVS frame, in some embodiments, which can be either used to refine the calibration (live calibration), for ToF depth measurements, or both.
  • the case b ≈ 0 is a simplified version of the general case. With no (or a negligibly small) baseline, there is no need for triangulation or the use of structured light, in some embodiments.
  • the projected dots in the EVS frame do not (or only negligibly) depend on the depth, such that in this case, “only” live-calibration may be carried out since the depth may not be determinable based on the dots in the EVS frame.
  • Sub-depth maps (obtained based on sub-exposures) of the dToF 2 can be directly processed by the HPE 5 without waiting for the full depth scan to be completed. This is possible since high speed data are provided by the EVS 3 which provides a two-dimensional projection from the sub-exposure locations and can be used to put the depth measurements into the right context of the hand 4.
  • live auto-calibration can be achieved according to the embodiments of the present disclosure.
  • the EVS data and the dToF data may be automatically and continuously registered at highest possible precision without the need of an extrinsic calibration model (since they are oriented based on the dots).
  • power efficient light projection may be achieved. If the projector pattern is dynamically controlled, the output of the HPE 5 and/or the EVS data may be used in the (fast) feedback loop 6 to determine which region should be illuminated.
  • Fig. 2 depicts a method 10 for object tracking, as it is generally known.
  • motion blur may deteriorate the GS acquisition, such that the issue described above may be even more severe.
  • Fig. 3 depicts a method 20 for object tracking according to the present disclosure.
  • the ToF acquisitions are associated with the correct EVS acquisitions, thereby determining a correspondence.
  • no 2D-3D data correspondence problem may be present, in this embodiment, as discussed under reference of Fig. 4.
  • In Fig. 4, it is shown that, if a hand moves down quickly, each ToF sub-acquisition determines a different location of the hand, but in the EVS acquisitions, the movement of the hand can be determined in more detail, such that the hand can be tracked, as shown at the bottom of Fig. 4 depicting the light spots along which the hand moved.
  • Fig. 5 depicts an embodiment for implicitly generating the object data according to the present disclosure.
  • a neural network 30 is configured to estimate a hand pose, i.e., to function as an HPE, as discussed under reference of Fig. 1.
  • the neural network 30 includes a task head 31 which takes a state as an input and predicts an output, based on a state output by a respective predictor of the predictors 35 to 38 (as discussed in the following). Moreover, the neural network 30 includes a depth and event predictor 35, two event predictors 36 and 37, as well as a further depth and event predictor 38. The depth and event predictors 35 and 38 may be the same, and the two event predictors 36 and 37 may also be the same, in some embodiments; only their inputs may be different since each input may correspond to an output of a preceding predictor.
  • the predictors 35 and 38 are configured to obtain sparse depth data and event data (sparse depth + events), whereas the predictors 36 and 37 are configured to obtain (only) event data.
  • a current state z_k is fed into the predictor 35 and into the task head 31.
  • a state z_k+1 is fed into the predictor 36 and the task head 31, and so on.
  • the predictors take as input a current state as well as measurement input, and output an updated state (e.g., in view of z_k, z_k+1 is an updated state).
  • the task head takes the respective state as input and computes a high-frequency output (a 3D hand pose).
  • a state is a collection (at least one) of features or feature embeddings. These features can be predicted by both predictor types, but they may be more accurate when predicted by the predictors 35 or 38.
  • a feature may, for example, include a “finger edge”, a “hand corner”, a “thumb”, or the like, if a hand is supposed to be tracked. More generally, a feature may correspond to information of the object to be tracked or detected and the feature may be learned by the neural network.
  • the neural network is a convolutional neural network, but the present disclosure is not limited to any type of neural network.
  • the inputs to the predictors and to the task heads may be formatted in any way.
  • they are formatted as fixed size tensors, such that events are drawn into a fixed size space-time histogram for concatenation with other data, thereby being suitable for the used CNN.
  • inputs may be partitioned as tokens in case of a spatio-temporal transformer network.
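The alternating predictor/task-head structure of Fig. 5 can be sketched as a plain update loop. All callables below are hypothetical stand-ins for the learned networks; a measurement with a depth entry stands for a sparse-depth-plus-events input, one without for an events-only input.

```python
def run_tracking(state, measurements, depth_event_predictor, event_predictor, task_head):
    """Update the state with every measurement and read out one output
    per state, mirroring the high-frequency output of the task head."""
    outputs = []
    for m in measurements:
        if m.get("depth") is not None:
            state = depth_event_predictor(state, m)  # sparse depth + events
        else:
            state = event_predictor(state, m)        # events only
        outputs.append(task_head(state))
    return outputs
```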
  • a neural network may also be used for (implicitly) carrying out a live auto-calibration, as described herein.
  • a coordinate transformation may be used for projecting ToF coordinates into the EVS coordinate system, or vice versa.
  • the following formula may be used: λ · (x’, y’, 1)^T = K_EVS · (R · z · K_ToF^(-1) · (u, v, 1)^T + t).
  • x’ and y’ are predictions in EVS image coordinates of the ToF measurement (u, v, z), based on which the “real” x and y image coordinates are determined. u and v are coordinates of the ToF dot in the ToF image.
  • λ is a scale factor which corresponds to the measured z-coordinate in the ToF system. x and y are observed image coordinates of the active dot (corresponding to (u, v)) in the EVS image.
  • the EVS has its optical center at the Cartesian coordinates (0, 0, 0).
  • the ToF system has a translation t and rotation R (R and t are the extrinsics) with respect to the EVS camera.
  • the matrix K_EVS corresponds to a 3x3 intrinsic calibration matrix. It projects EVS camera coordinates into EVS image coordinates.
  • the matrix K_ToF^(-1) is the inverse of K_ToF, which is a 3x3 intrinsic calibration matrix that projects ToF system coordinates into ToF image coordinates.
  • the dToF dot (u, v) is projected into an EVS frame (x’, y’), the real image location (x, y) (in the EVS frame) is determined, and all events generated by the dot itself are filtered.
  • the EVS image is augmented with a depth value providing a depth map (x, y, z), which is used in addition to filtered EVS data for further processing.
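The projection of a dToF dot into the EVS frame can be sketched with plain 3x3 matrix arithmetic. This is a hedged illustration of the standard pinhole relation under the term definitions above; lens distortion is ignored and the function name is an assumption.

```python
def project_tof_dot(u, v, z, K_evs, R, t, K_tof_inv):
    """Project a ToF dot (u, v) with measured depth z into predicted EVS
    image coordinates (x', y'). Matrices are 3x3 nested lists; t is a
    3-vector (the ToF-to-EVS extrinsics)."""
    def matvec(M, p):
        return [sum(M[i][j] * p[j] for j in range(3)) for i in range(3)]
    p_tof = [z * c for c in matvec(K_tof_inv, [u, v, 1.0])]  # 3D point, ToF frame
    p_evs = [a + b for a, b in zip(matvec(R, p_tof), t)]     # into the EVS frame
    q = matvec(K_evs, p_evs)                                 # homogeneous image coords
    return q[0] / q[2], q[1] / q[2]                          # divide out the scale
```

With identity intrinsics and rotation and zero translation, the prediction reduces to the normalized ToF coordinates, which matches the b ≈ 0 case discussed above.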
  • depth may be further refined. If a good calibration between dToF Tx and EVS is already present, depth may be triangulated based on structured light (STL) triangulation by finding temporal correspondences.
  • Backprojection (with a known, but not necessarily precise, calibration) of (x’, y’) (as described above) may be used to resolve the STL ambiguity and find the real (x, y) image coordinates in case there are multiple correspondences on a same epipolar line. Thereby, unique correspondences may be found which may not be determinable based on temporal information alone.
  • an MLE (maximum likelihood estimation) method may be used to estimate both calibration parameters and depth by maximizing a likelihood of the calibration parameters and the depth based on an estimate/observation of the real location and based on a probabilistic model of the ToF depth measurement and the calibration parameters (e.g., based on the original factory calibration).
  • Fig. 6 depicts a further embodiment of an object tracking method 40 according to the present disclosure. In contrast to the embodiment of Fig. 3, in this embodiment, the frame rate is not constant, but varies.
  • when it is determined that the user interacts, a higher frame rate is used, whereas, when it is determined that the user does not interact, a lower frame rate, in a “low precision, low power mode”, is used.
  • Fig. 7 depicts, on the left, a conventional ToF illuminator 50 in which a scene is illuminated in a predetermined pattern (i.e., each cell of a grid is illuminated one after another). Hence, energy is allocated statically in time and space.
  • Fig. 7 depicts, on the right, a ToF illuminator 55 according to the present disclosure in which the cells of the grid are illuminated based on the events detected in an EVS. Thereby, energy is allocated dynamically according to scene conditions.
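The dynamic energy allocation on the right of Fig. 7 can be sketched as follows: given a per-cell count of recent EVS events, the illumination budget is spent on the most active cells. The helper is hypothetical; the cell keys and the budget parameter are assumptions.

```python
def cells_to_illuminate(event_counts, budget):
    """Return the grid cells with the most recent EVS activity, up to the
    given dot budget, instead of sweeping all cells statically."""
    ranked = sorted(event_counts, key=event_counts.get, reverse=True)
    return set(ranked[:budget])
```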
  • Fig. 8 depicts an embodiment of an AR/VR (augmented reality/virtual reality) system 60 used as a head-mounted device.
  • the system 60 includes object tracking circuitry according to the present disclosure, a ToF unit (shown as Tx and Rx, as discussed above), and an EVS unit.
  • the EVS unit includes two input regions 61 (two EVS sensors), which each have a distance “b” with respect to the ToF Tx (wherein, in some embodiments, the distances of the two input regions may differ).
  • two monocular tracking regions (right monocular 2.5D tracking region 62 and left monocular 2.5D tracking region 63) are defined.
  • with the left and right tracking regions 62 and 63, improved tracking of a left hand and a right hand is possible.
  • a center 3D tracking region 64 is defined due to the ToF unit, thereby providing for precise and fast 3D (hand) tracking, which may be required for immersive virtual object interaction, such as virtual keyboard, or the like.
  • stereo acquisition may be envisaged.
  • Fig. 9a depicts an embodiment in which a head-mounted device (HMD) 70 tracks and illuminates a hand 71 of a user wearing the HMD 70.
  • ToF acquisition is omitted, as shown on the left of Fig. 9a.
  • Fig. 9b depicts the case that, when it is recognized, based on the EVS data, that the hand 71 comes close to the virtual object 72, the area of the virtual object 72 is actively illuminated.
  • both areas may be illuminated, i.e., the embodiments of Figs. 9a and 9b may be combined.
  • Fig. 10 depicts an object tracking method 80 according to the present disclosure in a block diagram.
  • ToF data is obtained, as discussed herein, i.e., a scene is illuminated with light spots/dots and a ToF measurement is carried out.
  • image data are obtained.
  • the image data are event sensing data, as discussed herein.
  • object data are generated with a neural network, as discussed herein.
  • Fig. 11 depicts, in a block diagram, a further embodiment of an object tracking method 90 according to the present disclosure, which differs from the object tracking method 80 of Fig. 10 in that, after the obtaining of image data, at 93, a correspondence between the ToF data and the image data is determined, based on a coordinate transformation, as discussed herein.
  • Fig. 12 depicts a further embodiment of an object tracking method according to the present disclosure in a block diagram. In this embodiment, it is decided, based on the image data, whether ToF data should be obtained.
  • image data are obtained, as discussed herein, as event sensing data.
  • based on the image data it is determined whether there is an indication for a ToF measurement, i.e., it is determined whether a tracked object is within a predetermined distance to a virtual object.
  • object data are generated, as discussed herein.
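The decision step of Fig. 12 can be sketched as a simple distance check. This is a minimal illustration; the 2D coordinates and the threshold value are assumptions, and a real system would work on the tracked object data.

```python
def should_start_tof(hand_xy, virtual_obj_xy, max_dist=50.0):
    """Trigger a ToF measurement when the coarsely tracked object comes
    within a predetermined distance of the virtual object."""
    dx = hand_xy[0] - virtual_obj_xy[0]
    dy = hand_xy[1] - virtual_obj_xy[1]
    return (dx * dx + dy * dy) ** 0.5 <= max_dist
```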
  • the embodiments describe methods with an exemplary ordering of method steps.
  • the specific ordering of method steps is however given for illustrative purposes only and should not be construed as binding.
  • the ordering of 81 and 82 in the embodiment of Fig. 10 may be exchanged.
  • the ordering of 91 and 92, as well as 93 and 94 in the embodiment of Fig. 11 may be exchanged.
  • the methods described herein can also be implemented as a computer program causing a computer and/or a processor to perform the method, when being carried out on the computer and/or processor.
  • a non-transitory computer-readable recording medium is provided that stores therein a computer program product, which, when executed by a processor, such as the processor described above, causes the method described to be performed.
  • Object tracking circuitry configured to: obtain time-of-flight data, based on a time-of-flight measurement, the time-of-flight data indicating the object; obtain image data indicating the object and at least one light spot, the light spot originating from light emitted for the time-of-flight measurement; and generate object data based on the time-of-flight data, on the image data, and on an association between the time-of-flight data and the image data, the association being based on the at least one light spot detected in the image data.
  • the object tracking circuitry of any one of (1) to (5), further configured to: determine a correspondence of a first coordinate, indicated by the time-of-flight data, to a second coordinate, indicated by the image data.
  • An object tracking method comprising: obtaining time-of-flight data, based on a time-of-flight measurement, the time-of-flight data indicating the object; obtaining image data indicating the object and at least one light spot, the light spot originating from light emitted for the time-of-flight measurement; and generating object data based on the time-of-flight data, on the image data, and on an association between the time-of-flight data and the image data, the association being based on the at least one light spot detected in the image data.
  • a computer program comprising program code causing a computer to perform the method according to any one of (11) to (20), when being carried out on a computer.
  • (22) A non-transitory computer-readable recording medium that stores therein a computer program product, which, when executed by a processor, causes the method according to any one of (11) to (20) to be performed.
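Several of the items above rely on a ToF measurement being carried out on the emitted light spots. As general indirect-ToF background (standard iToF practice, not a specific of this disclosure), the measured phase shift of the modulated illumination maps to distance as d = c·φ / (4π·f_mod). A minimal sketch of this conversion; the 20 MHz modulation frequency used below is purely illustrative:

```python
import math

C = 299_792_458.0  # speed of light in m/s


def phase_to_depth(phase_rad, f_mod_hz):
    """Convert a measured phase shift of the modulated illumination into a
    distance for an indirect ToF measurement: d = c * phi / (4 * pi * f_mod)."""
    return C * phase_rad / (4.0 * math.pi * f_mod_hz)


# Illustrative call with an assumed 20 MHz modulation frequency.
d = phase_to_depth(math.pi, 20e6)
```

Note that the unambiguous range of such a measurement is c / (2·f_mod), which is why the modulation frequency is a design trade-off between range and depth resolution.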
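Item (1) generates the object data based on an association between the time-of-flight data and the image data, the association being based on the at least one light spot detected in the image data. A minimal, non-authoritative sketch of one way such an association could be formed, namely nearest-neighbour matching in pixel space; the function name, array layout, and pixel threshold are assumptions for illustration only:

```python
import numpy as np


def associate_spots(tof_points, spot_pixels, max_dist=3.0):
    """Associate each ToF depth point with the nearest light spot detected
    in the image data (nearest-neighbour matching in pixel space).

    tof_points:  (N, 3) array of (u, v, depth), with (u, v) already in the
                 image sensor frame.
    spot_pixels: (M, 2) array of light-spot centroids detected in the image.
    Returns a list of (tof_index, spot_index) pairs.
    """
    pairs = []
    for i, (u, v, _) in enumerate(tof_points):
        # Pixel distance from this ToF point to every detected spot.
        d = np.hypot(spot_pixels[:, 0] - u, spot_pixels[:, 1] - v)
        j = int(np.argmin(d))
        if d[j] <= max_dist:  # only accept sufficiently close matches
            pairs.append((i, j))
    return pairs
```

A real system would likely add mutual-exclusion of matches and use the spot geometry from the known illumination pattern, which this sketch omits.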
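The Fig. 11 embodiment determines, at 93, a correspondence between the ToF data and the image data based on a coordinate transformation. One common way to realize such a correspondence is to back-project a ToF pixel with its measured depth into a 3D point, apply the rigid transform between the two sensors, and re-project with the image sensor intrinsics. In the sketch below the calibration matrices K_tof and K_img and the extrinsics R, t are assumed known from calibration; they are illustrative, not taken from the disclosure:

```python
import numpy as np


def tof_to_image_coordinate(uv_tof, depth, K_tof, K_img, R, t):
    """Map a ToF pixel coordinate plus measured depth to the corresponding
    image-sensor pixel via a rigid coordinate transformation.

    K_tof, K_img: 3x3 intrinsic matrices (assumed known from calibration).
    R, t: rotation and translation from the ToF frame to the image frame.
    """
    # Back-project the ToF pixel into a 3D point in the ToF camera frame.
    p_tof = depth * np.linalg.inv(K_tof) @ np.array([uv_tof[0], uv_tof[1], 1.0])
    # Transform into the image sensor frame and project with its intrinsics.
    p_img = R @ p_tof + t
    uvw = K_img @ p_img
    return uvw[:2] / uvw[2]
```

When both sensors share the same optics (a single-lens hybrid sensor), R reduces to the identity and t to zero, and the correspondence is purely an intrinsic re-mapping.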
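In the Fig. 12 embodiment it is determined, based on the image data, whether there is an indication for a ToF measurement, i.e., whether a tracked object is within a predetermined distance to a virtual object. A hypothetical sketch of such a gating decision; the threshold value and metric units are assumptions for illustration:

```python
import numpy as np


def tof_measurement_indicated(tracked_pos, virtual_obj_pos, threshold=0.2):
    """Decide, based on the object position tracked from the image data,
    whether a ToF measurement should be triggered: only when the tracked
    object is within `threshold` (here assumed to be metres) of the
    virtual object."""
    gap = np.linalg.norm(np.asarray(tracked_pos) - np.asarray(virtual_obj_pos))
    return bool(gap <= threshold)
```

Gating the active illumination this way is what allows the ToF acquisition to be omitted (as in Fig. 9a) while the event-based tracking runs continuously, reducing power consumption.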

EP24715173.1A 2023-03-28 2024-03-27 Object tracking circuitry and object tracking method Pending EP4689727A1 (de)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP23164676 2023-03-28
PCT/EP2024/058367 WO2024200575A1 (en) 2023-03-28 2024-03-27 Object tracking circuitry and object tracking method

Publications (1)

Publication Number Publication Date
EP4689727A1 true EP4689727A1 (de) 2026-02-11

Family

ID=85778963

Family Applications (1)

Application Number Title Priority Date Filing Date
EP24715173.1A Pending EP4689727A1 (de) Object tracking circuitry and object tracking method

Country Status (2)

Country Link
EP (1) EP4689727A1 (de)
WO (1) WO2024200575A1 (de)

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2017191024A (ja) * 2016-04-14 2017-10-19 株式会社インザライフ Reference light spot irradiation device for images
US20240061090A1 (en) * 2021-01-11 2024-02-22 Sony Semiconductor Solutions Corporation Time-of-flight demodulation circuitry, time-of-flight demodulation method, time-of-flight imaging apparatus, time-of-flight imaging apparatus control method
CN116888504A (zh) * 2021-03-03 2023-10-13 Sony Semiconductor Solutions Corporation Time-of-flight data generation circuit and time-of-flight data generation method

Also Published As

Publication number Publication date
WO2024200575A1 (en) 2024-10-03


Legal Events

Date Code Title Description
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: UNKNOWN

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE

PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE

17P Request for examination filed

Effective date: 20251021

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC ME MK MT NL NO PL PT RO RS SE SI SK SM TR