EP3673464A1 - Visual communications methods, systems and software - Google Patents
- Publication number
- EP3673464A1 (application number EP18862646.9A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- receiving device
- data
- scene
- user
- sensor
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T17/00—Three dimensional [3D] modelling, e.g. data description of 3D objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T15/00—3D [Three Dimensional] image rendering
- G06T15/10—Geometric effects
- G06T15/20—Perspective computation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/50—Depth or shape recovery
- G06T7/55—Depth or shape recovery from multiple images
- G06T7/593—Depth or shape recovery from multiple images from stereo images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2200/00—Indexing scheme for image data processing or generation, in general
- G06T2200/24—Indexing scheme for image data processing or generation, in general involving graphical user interfaces [GUIs]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2210/00—Indexing scheme for image generation or computer graphics
- G06T2210/61—Scene description
Definitions
- the present invention provides methods, systems, devices and computer software/program code products that enable the foregoing aspects and others.
- V3D Virtual 3-D
- the present invention provides methods, systems, devices, and computer software/program code products suitable for a wide range of applications, including, but not limited to: facilitating video communications and presentation of image and video content in telecommunications applications; and facilitating video communications and presentation of image and video content for virtual reality (VR), augmented reality (AR) and head-mounted display (HMD) systems.
- VR virtual reality
- AR augmented reality
- HMD head-mounted display
- Methods, systems, devices, and computer software/program code products in accordance with the invention are suitable for implementation or execution in, or in conjunction with, commercially available computer graphics processor configurations and systems including one or more display screens for displaying images, cameras for capturing images, and graphics processors for rendering images for storage or for display, such as on a display screen, and for processing data values for pixels in an image representation.
- the cameras, graphics processors and display screens can be of a form provided in commercially available smartphones, tablets and other mobile telecommunications devices, as well as in commercially available laptop and desktop computers, which may communicate using commercially available network architectures including client/server and client/network/cloud architectures.
- digital processors which can include graphics processor units, including GPGPUs such as those commercially available on cellphones, smartphones, tablets and other commercially available telecommunications and computing devices, as well as in digital display devices and digital cameras.
- GPGPUs graphics processor units
- Those skilled in the art to which this invention pertains will understand the structure and operation of digital processors, GPGPUs and similar digital graphics processor units.
- One aspect of the present invention relates to methods, systems and computer software/program code products that enable the generating of rich scene information representative of a scene.
- This aspect comprises: in a digital processing resource comprising at least one digital processor, (1) receiving data from at least one sensor, the data being at least in part representative of the scene; (2) detecting reliability of the sensor data; (3) remedying unreliable sensor data to generate remedied data; and (4) generating rich scene information from (A) the sensor data, including remedied data, and (B) the reliability information.
- Another aspect of the invention comprises reconstructing the scene as viewed from a virtual viewpoint, based on the rich scene information.
- the sensors comprise at least one stereo pair of cameras.
- the rich scene information comprises depth information.
- the depth information is obtained by stereo disparity analysis.
- detecting reliability of the sensor data comprises utilizing a heuristic.
- detecting reliability of the sensor data comprises: comparing the output of a sensor to the output from one or more additional sensors.
- the comparing comprises comparing sub-sections of data independently.
- the comparing utilizes at least one histogram.
- the histograms pertain to depth data.
- the histograms pertain to stereo disparity data.
- the comparing comprises generating an average.
- the comparing comprises comparing luminance data from one or more cameras.
- the comparing comprises comparing color data from one or more cameras.
- Another aspect of the invention comprises determining whether a sensor is occluded.
- Another aspect of the invention comprises identifying invalid patterns in the received data.
- the remedying comprises excluding unreliable data.
- the remedying comprises reducing the contribution from unreliable sensor data into the rich scene information.
- the remedying comprises notifying a user of unreliable data.
- the remedying comprises notifying a user, via a display, of unreliable data.
- the at least one sensor is associated with a device containing the at least one sensor and a display, and the remedying comprises notifying the user, via the display, of unreliable data.
- the remedying comprises presenting, to the user, intuitive visual cues via the display, the intuitive visual cues being configured so as to tend to direct the user to act in a manner to resolve a condition causing unreliable data.
- the intuitive visual cues are applied via the display, to a region of an image of the scene, the region being associated with the unreliable data.
- the intuitive visual cues comprise a visual effect
- the visual effect is applied more strongly in response to greater unreliability.
- the visual effect comprises a blur effect.
- Another aspect of the invention comprises transmitting the rich scene information to a remote device, the remote device being a device remote from the scene and operable to receive transmitted rich scene information.
- the at least one sensor is associated with a capturing device, the capturing device being operable to transmit any of sensor data and rich scene information, and the remote device notifies the capturing device of unreliable transmitted data representative of the scene.
- the capturing device presents an indication of unreliable transmitted data.
- the remote device presents an indication of unreliable received data.
- the indication of unreliable data presented by the capturing device correlates with an indication of unreliable data presented by the remote device.
- the indication of unreliable data presented by the capturing device is configured so as to tend to direct a user of the capturing device to remedy an occluded sensor.
- One aspect of the present invention relates to visual communications methods, systems and computer software/program code products, comprising: (A) configuring at least one transmitting device to be operable to: (1) capture first scene information, representative of a scene, generated by at least one sensor associated with the transmitting device; (2) capture originating environmental parameters; (3) process the first scene information to generate rich scene information; and (4) transmit the rich scene information to at least one receiving device; and (B) configuring the at least one receiving device to be operable to: (1) capture destination environmental parameters; (2) receive the rich scene information from the at least one transmitting device; (3) interpret the rich scene information; and (4) present the scene, based at least in part on the rich scene information.
- presenting the scene comprises displaying at least one image of the scene, via a display element operable to communicate with the receiving device, based at least in part on the rich scene information.
- the originating environmental parameters comprise parameters associated with the scene.
- the originating environmental parameters comprise parameters associated with the transmitting device.
- the destination environmental parameters comprise parameters associated with the environment proximate the receiving device.
- the destination environmental parameters comprise parameters associated with the receiving device.
- the transmitting device transmits the originating environmental parameters to the receiving device.
- the receiving device utilizes the originating environmental parameters in presenting the scene.
- the receiving device transmits the destination environmental parameters to the transmitting device, and the transmitting device utilizes the destination environmental parameters in processing the first scene information to generate rich scene information.
- the processing of the first scene information comprises data compression.
- the interpreting comprises data decompression.
- the environmental parameters comprise an orientation vector.
- an orientation vector is measured utilizing any of an accelerometer, gyroscope, compass, GPS (global positioning system), other spatial sensor, or combination of spatial sensors.
- an orientation vector is substantially constrained with respect to a given device, but can be altered in response to a substantial change in data from a spatial sensor.
- the spatial sensor comprises any of an accelerometer, gyroscope, compass, GPS, other spatial sensor, or combination of spatial sensors.
- an orientation vector is permitted to move to align with the orientation of an associated device in a gravity field.
- Another aspect of the invention comprises applying a selected smoothing process to smooth high frequency changes to an orientation vector.
- a further aspect of the invention comprises configuring control logic to be operable to apply the selected smoothing process.
- an orientation vector can be at least in part controlled by a user through a user interface.
- an orientation vector is derived from the rich scene information.
- the processing comprises rotating or transforming the rich scene information with respect to an orientation vector.
- the interpreting comprises rotating or transforming the rich scene information with respect to an orientation vector.
- the interpreting or the processing utilizes orientation vectors from more than one device.
- the interpreting or the processing utilizes the difference between orientation vectors from more than one device.
- at least one device rotates or transforms the scene information, and the receiving device presents the scene with a consistent, defined downward orientation that is substantially aligned with a selected axis of the transmitting device or devices, irrespective of the rotation of the devices.
- the receiving device presents at least one image of the scene via a display element.
- the display element is a Head-Mounted Display (HMD).
- HMD Head-Mounted Display
- the display element comprises a display screen on a hand-held device.
- the display element comprises any of a desktop display screen, freestanding display screen, wall mounted display screen, surface mounted display screen or outdoor display screen.
- the transmitting device is operable to generate a feedback view that presents feedback to a user of the transmitting device.
- the feedback comprises an image of the scene.
- the receiving device presents a different portion of the scene from the portion presented by the feedback view of the transmitting device.
- a further aspect of the invention comprises enabling a user to select a gaze direction by utilizing a touch screen interface associated with the receiving device.
- a gaze direction can be controlled at least in part by the output of an accelerometer, gyroscope, compass, GPS, other spatial sensor, or combination of spatial sensors.
- Another aspect of the invention comprises enabling a user of the receiving device to control gaze direction by executing a user gesture observable by a non-contact sensor associated with the receiving device.
- gaze direction can be changed by the physical position of a user relative to a physical position of a receiving device.
- a user of a receiving device can change the focus of a virtual camera that defines a perspective of a displayed image of the scene.
- the focus can be changed by the user selecting a region of a displayed image to bring into sharp focus.
- Another aspect of the invention comprises enabling a user of the receiving device to change focus by executing a user gesture observable by a non-contact sensor associated with the receiving device.
- a further aspect comprises enabling a user of the receiving device to change a field of view of a displayed image.
- Yet another aspect of the invention comprises enabling a user of the receiving device to change field of view by executing a gesture on a touch screen associated with the receiving device.
- the field of view can be changed by motion of a device, the motion being detected by an accelerometer, gyroscope, compass, GPS, other spatial sensor, or combination of spatial sensors.
- a further aspect of the invention comprises enabling the user to change the field of view by executing a gesture observable by a non-contact sensor associated with the receiving device.
- the field of view can be changed by the physical position of a user, relative to the physical position of a receiving device.
- Yet another aspect comprises enabling a user of a receiving device to change an image zoom parameter.
- a related aspect of the invention comprises enabling a user of a receiving device to change a zoom parameter by executing a gesture on a touch screen associated with the receiving device.
- the zoom can be changed by motion of a device, the motion being detected by an accelerometer, gyroscope, compass, GPS (global positioning system), other spatial sensor, or combination of spatial sensors.
- Another aspect of the invention comprises enabling a user of a receiving device to change a zoom parameter by executing a gesture observable by non-contact sensors associated with the receiving device.
- the zoom is controllable by the physical position of a user, relative to the physical position of a receiving device.
- Another aspect of the invention comprises configuring the receiving device to be operable to attempt to preserve the spatial topology of the scene captured by the transmitting device.
- a device is operable to apply a scale factor to the rich scene information.
- a further aspect of the invention comprises enabling a user to modify the scale factor via an interface.
- at least one receiving device is operable to additionally function as a transmitting device, and at least one transmitting device is operable to additionally function as a receiving device.
- some of the devices do not comprise the same sensors or capabilities as the other device or devices.
- One aspect of the present invention relates to a system for generating rich scene information representative of a scene, the system comprising: (A) a digital processing resource comprising at least one digital processor; and (B) at least one sensor operable to generate sensor data in response to sensed conditions and to communicate the sensor data to the digital processing resource; the digital processing resource being configured to: (1) receive sensor data from the at least one sensor, the data being at least in part representative of the scene; (2) detect reliability of the received sensor data; (3) remedy unreliable data to generate remedied data; and (4) generate rich scene information from (A) the sensor data, including remedied data, and (B) the reliability information.
- a visual communications system comprising: (A) a transmitting device; and (B) a receiving device operable to communicate with the transmitting device; the transmitting device being configured to be operable to: (1) capture first scene information, representative of a scene, generated by at least one sensor associated with the transmitting device; (2) capture originating environmental parameters; (3) process the first scene information to generate rich scene information; and (4) transmit the rich scene information to the receiving device; and the receiving device being configured to be operable to: (1) capture destination environmental parameters; (2) receive rich scene information transmitted by the transmitting device; (3) interpret the rich scene information; and (4) present the scene, based at least in part on the rich scene information.
- the scene is presented via a display element operable to communicate with the receiving device.
- One aspect of the present invention relates to a program product for use with a digital processing system, the digital processing system comprising a digital processing resource comprising at least one digital processor, the digital processing resource being operable to communicate with at least one sensor operable to (i) generate sensor data in response to sensed conditions and (ii) communicate the sensor data to the digital processing resource, the program product comprising digital processor-executable program instructions stored on a non-transitory digital processor-readable medium, which when executed in the digital processing resource cause the digital processing resource to: (1) receive sensor data from the at least one sensor; (2) detect reliability of the received sensor data; (3) remedy the unreliable data to generate remedied data; and (4) generate rich scene information from (A) the sensor data, including remedied data, and (B) the reliability information.
- Another aspect of the invention relates to a program product for use with a digital processing system, the digital processing system comprising a digital processing resource, the digital processing resource comprising at least one digital processor in any of a digital transmitting device or a digital receiving device operable to communicate with the digital transmitting device, the program product comprising digital processor-executable program instructions stored on a non-transitory digital processor-readable medium, which when executed in the digital processing resource cause the digital processing resource to (A) configure the transmitting device to be operable to: (1) capture first scene information, representative of a scene, through at least one sensor associated with the transmitting device; (2) capture originating environmental parameters; (3) process the first scene information to generate rich scene information; and (4) transmit the rich scene information to the receiving device; and (B) configure the receiving device to be operable to: (1) capture destination environmental parameters; (2) receive the rich scene information transmitted by the transmitting device; (3) interpret the rich scene information; and (4) present the scene, based at least in part on the rich scene information.
- the scene is presented via a display element operable to communicate with the receiving device.
- the present invention enables the features described herein to be provided at reasonable computational cost, and in a manner easily accommodated within the digital processing capabilities and form factors of modern mobile devices such as tablets and smartphones, as well as the form factors of laptops, PCs, computer-driven televisions, computer-driven projector devices, and the like; it does not dramatically alter the economics of building such devices, and it is viable within current or near-current communications network/connectivity architectures.
- FIGS. 1 - 7 are schematic diagrams depicting exemplary embodiments and practices of the invention.
- FIGS. 8A - 12 are flowcharts depicting exemplary practices of the invention.
- FIGS. 13 - 22 are schematic block diagrams depicting exemplary embodiments and practices of the invention.
- V3D devices and systems that enable such devices and systems to detect sub-optimal operating conditions (such as, for example, a blocked/occluded camera or other sensor), compensate for those conditions, and encourage user behaviors that allow optimal V3D functioning and discourage user behaviors that inhibit it.
- V3D devices and systems described herein and in commonly owned PCT application PCT/US16/23433 function as visual portals into other spaces.
- V3D devices and systems can be configured in accordance with the present invention to capture and transmit information about the environment in which they are operating.
- information can include, but is not limited to, the orientation of the captured data relative to the device, as well as real-world scale information about the devices being used.
- V3D systems and devices, and any similar systems and devices that rely on two or more cameras with overlapping views, or on depth-sensing sensors, to produce rich information about the scene, can be negatively affected by unreliable sensor information.
- FIG. 1 depicts an example of different respective spatial regions, associated with a two-camera configuration, including a region for which it is possible to correlate objects (i.e., calculate correspondence) between the camera pair, and other regions where it is not possible to calculate correspondence between the camera pair.
- FIG. 1 shows a configuration 100 with cameras (or other sensors) 102 and 104 having a view of a scene.
- Camera 102 includes a sensor 106 and has a field of view 110.
- Camera 104 has a sensor 108 and a field of view 112.
- the fields of view of the respective cameras overlap, yielding a region of overlap "C", in which correlation is possible, and regions "A" and "B", in which correlation is not possible.
- An even more extreme example is that of an object, such as a user's hand, substantially filling the view of one or more cameras, and occluding the view of the objects and features that could otherwise be correlated with the view from the corresponding cameras in the respective pairs.
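- The onset of region "C" follows from standard stereo geometry. As a minimal illustrative sketch (not part of the patent text), assuming two identical, parallel cameras, the depth at which their fields of view begin to overlap, and hence the nearest distance at which correspondence is possible at all, can be computed as follows:

```python
import math

def overlap_onset_depth(baseline_m: float, hfov_deg: float) -> float:
    """Depth at which the fields of view of two parallel, identical
    cameras begin to overlap (region "C" in FIG. 1).
    Standard stereo geometry; parameter names are illustrative."""
    return baseline_m / (2.0 * math.tan(math.radians(hfov_deg) / 2.0))

# Two cameras 6 cm apart with a 70-degree horizontal field of view:
# objects nearer than roughly 4.3 cm fall entirely in regions "A"/"B".
print(overlap_onset_depth(0.06, 70.0))
```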
- An exemplary practice of this aspect of the present invention comprises two overall components: (1) detection of unreliable sensor data; and (2) remedy.
- the term "unreliable data" is used herein to subsume a number of possible conditions, including sensor-generated data that is erroneous, invalid, missing, unsuitable or otherwise deemed or determined to be unreliable.
- Detection is a process, which may be a continuous process, of monitoring the data from the cameras or sensors and making a determination about the reliability and usability of the incoming data. If sub-optimal or erroneous conditions are detected, then Remedy is the process of taking action to correct for the unreliable data, either by adjusting the processing to compensate, by providing feedback to the user so that they may take action, or a combination of both.
- Reliability information may comprise a reliability indicator for each camera, a reliability indicator for each camera pair, or a more fine-grained structure to represent reliability.
- a reliability map may apply to a camera image, a disparity map, or another data set. It may contain a single bit for each data element or group of elements expressing whether the associated elements represent reliable data. It may also contain a value indicating a degree of reliability ranging from highly reliable to unreliable.
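- By way of a minimal sketch (an assumed representation, not one mandated by the patent), such a reliability map can be held as a per-element array of scores, from which the single-bit form mentioned above falls out as a simple threshold:

```python
import numpy as np

class ReliabilityMap:
    """Per-element reliability scores in [0.0, 1.0] (1.0 = fully
    reliable) for a camera image, disparity map, or other data set."""
    def __init__(self, height: int, width: int):
        # Start fully reliable; detection stages lower the scores.
        self.scores = np.ones((height, width), dtype=np.float32)

    def mark(self, region, score: float):
        """Lower the reliability of a region (any numpy index/slice)."""
        self.scores[region] = np.minimum(self.scores[region], score)

    def as_bits(self, threshold: float = 0.5) -> np.ndarray:
        """Single-bit form: True where the data is deemed reliable."""
        return self.scores >= threshold
```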
- FIG. 2 is a schematic diagram depicting an example, in accordance with a practice of the invention, of a processing pipeline for detecting and remedying unreliable data generated, in this case, by four cameras/sensors.
- FIG. 2 shows a configuration 200 having four cameras (or other sensors) 2010, 2011, 2012, 2013, which capture or generate data 2014, 2015, 2016, 2017, respectively, and apply the data into a pipeline of stages 201, 202, 203, 204, as follows: (201) determine reliability of camera data; (202) compensate for unreliable data; (203) provide user interface cues to increase reliability, and (204) transmit reliability information to aid reconstruction on a remote device.
- Luminance may be an easy clue to measure. Measuring the luminance from each camera and comparing it with the luminance from the other cameras can provide a clue about the reliability of the camera data. If one camera on a device or one camera in a set of cameras produces luminance data that is an outlier to the data produced by the other cameras, the outlier camera data may be unreliable.
- Excessive luminance variations could be caused by unintentional field-of-view obstruction over one or more cameras, or by high intensity light directed into one or more cameras from a light source or reflection.
- a simple process for comparing luminance between cameras is to use the average luminance across all camera pixels.
- a more sophisticated measurement of luminance is to build a histogram of luminance values across the pixels of the incoming camera image. Using a histogram instead of an average would be robust to false negatives caused by objects in the camera image being substantially different, but sharing a similar image-wide mean luminance value.
- One histogram could be created for each camera, but it may be beneficial to create multiple histograms, one for each sub-region of the camera output. In some cases these sub-regions may overlap each other.
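- A minimal sketch of such a histogram-based comparison follows (an illustration only: the bin count and the outlier threshold are assumed tuning values, and frames are taken to be grayscale NumPy arrays):

```python
import numpy as np

def luminance_outliers(frames, bins=32, threshold=0.5):
    """Flag cameras whose luminance distribution is an outlier.

    frames: one 2-D uint8 luminance image per camera.
    Each normalized histogram is compared with the others by
    histogram intersection; a camera whose mean intersection
    falls below `threshold` is flagged as potentially unreliable.
    """
    hists = [np.histogram(f, bins=bins, range=(0, 255))[0] / f.size
             for f in frames]
    flags = []
    for i, h in enumerate(hists):
        others = [np.minimum(h, g).sum()
                  for j, g in enumerate(hists) if j != i]
        flags.append(float(np.mean(others)) < threshold)
    return flags
```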
- comparison of low-resolution versions of the camera outputs can provide suitability or reliability information.
- a coarse image comparison method may be superior to an average luminance method for detecting objects that move too close to a camera pair, such that they have exceeded the maximum range of the stereo disparity solution method used by a multi-camera system.
- a feature of a coarse image comparison method is that suitability information for regions of the image can be produced, as opposed to rejecting the entire camera's information. This is desirable for some adaptation behaviors and unnecessary for others.
- Stereo correspondence data may be processed into reconstructed depth data, or it may represent raw disparity values.
- One method is to compare the distribution of disparity data among multiple camera pairs.
- histogram data can be generated representing the distribution of disparity values or the distribution of depth values calculated for each stereo pair. Cameras belonging to pairs for which the distributions are found to be substantially different from other pairs may be considered to include erroneous data. The amount of difference considered tolerable may be based on the amount of overlap between the respective fields of view for different camera pairs.
- a similar method can also be used to determine the reliability of depth data from any source where multiple sensors are used. For example, depth data from one or more active depth sensing sensors can be compared with reconstructed depth data from one or more camera pairs.
- histograms can be created from depth data sourced from more than one depth-sensing sensor, or from one or more camera pairs, and compared with other histograms to determine reliability of the sensor data.
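- As an illustrative sketch of such a comparison (with an assumed L1-distance tolerance standing in for the overlap-dependent threshold described above):

```python
import numpy as np

def pair_disparity_outliers(disparity_maps, bins=64, max_d=128.0, tol=0.4):
    """Flag stereo pairs whose disparity distribution diverges.

    disparity_maps: one 2-D disparity map per camera pair.
    Each pair's normalized histogram is compared with the element-wise
    median histogram; pairs farther than `tol` (L1 distance) are
    suspect. In practice the tolerance would depend on how much the
    pairs' fields of view overlap.
    """
    hists = np.stack([np.histogram(d, bins=bins, range=(0.0, max_d))[0] / d.size
                      for d in disparity_maps])
    median = np.median(hists, axis=0)
    return [float(np.abs(h - median).sum()) > tol for h in hists]
```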
- Another method involves executing pattern analysis on the data.
- the camera data or reconstructed depth data may contain a pattern that would be impossible to observe in valid data given the optics of the cameras and their position relative to each other.
- the particular implementation details of a given example will pertain to disparity-based depth information reconstructed from a given stereo pair.
- This aspect of the invention is also applicable in other sensor configurations, although the patterns may be different for different categories of sensors.
- FIG. 3 is a schematic diagram depicting a sensor (e.g., camera pair) where disparity deltas beyond some limiting bounds are impossible to observe; in particular, an example of geometry for which stereo correlation is impossible with a given camera configuration.
- FIG. 3 shows a configuration 300 having two cameras 301 and 302, having a view of a scene containing an object 310.
- the object 310 has elongate "legs" 311 and 312 that form a "well" 313.
- given rays 303, 304 (shown as dashed lines) for camera 301 and rays 305, 306 for camera 302, and assuming ray 304 forms an extreme of the field of view of camera 301 and ray 305 forms an extreme of the field of view of camera 302, the region or "box" 320 (shown with hatching) within the well 313 of object 310 forms an "invisible region" in the configuration 300.
- data indicating that a given geometric pattern has been observed may be considered unreliable.
- depth information derived from the disparity information allows a maximum change in disparity information to be computed across the entire camera image.
- This calculated data can be expressed as a disparity bounds function.
- areas of the image where the disparity information changes more rapidly than the bounds function allows, i.e., at a higher spatial frequency than the sensor can observe, are indicative of unreliable data.
- FIG. 3 is only one example of practices in accordance with the invention, and unreliable data is not always associated with the regions of lower disparity, i.e., at further distance or depth from the camera (as in the example of FIG. 3). It is also possible that regions of high disparity, as may be found at lesser depth from the camera, are providing invalid data. In either case, it can be concluded, using a disparity analysis, that some of the data is unreliable.
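- A minimal sketch of such a disparity-bounds check follows (simplified to a scalar bound on the change between horizontally adjacent pixels; a fuller implementation could derive a per-pixel bounds function from the camera geometry):

```python
import numpy as np

def violates_disparity_bounds(disparity, max_step):
    """Mask image elements whose disparity changes faster than the
    sensor geometry permits (i.e., at a spatial frequency the camera
    pair cannot actually observe).

    disparity: 2-D disparity map from one stereo pair.
    max_step:  maximum physically observable |disparity change|
               between horizontally adjacent pixels (assumed scalar).
    """
    dx = np.abs(np.diff(disparity, axis=1))
    mask = np.zeros(disparity.shape, dtype=bool)
    mask[:, 1:] |= dx > max_step
    mask[:, :-1] |= dx > max_step   # flag both sides of a violating step
    return mask
```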
- Virtual viewpoint reconstruction, as taught in one or more of the commonly-owned V3D patent applications listed above and incorporated herein by reference, can make use of multiple disparity maps from multiple pairs of cameras.
- multiple disparity values may be considered for each pixel or each image element, and the system selects suitable disparity values from among the values in the input disparity maps or by blending values appropriately.
- the generation of final disparity values involves using the output of a stereo comparison error function.
- an associated reliability indicator can also be used in addition to or instead of the error function, to determine a best final disparity value for an image element.
- reliable correspondence for a portion of the image may be unavailable for one camera pair.
- the disparity information must be determined using the correspondence from the remaining two camera pairs.
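- One possible per-pixel selection scheme is sketched below (an assumption for illustration: unreliable candidates are penalized before taking the lowest matching error; the patent does not prescribe this particular penalty):

```python
import numpy as np

def fuse_disparities(disparities, errors, reliabilities):
    """Select a final disparity per pixel from several pairs' maps.

    disparities:   list of 2-D disparity maps, one per stereo pair.
    errors:        stereo comparison error maps (lower is better).
    reliabilities: per-pixel reliability scores in [0, 1].
    """
    BIG = 1e6   # effectively excludes unreliable candidates
    penalized = np.stack([e + (1.0 - r) * BIG
                          for e, r in zip(errors, reliabilities)])
    best = np.argmin(penalized, axis=0)          # winning pair per pixel
    stacked = np.stack(disparities)
    return np.take_along_axis(stacked, best[None], axis=0)[0]
```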
- Some reconstruction methods or algorithms involve one camera in a set of cameras being designated as the master camera for the set. Often this designation is arbitrary, and is useful as an optimization or as a matter of implementation convention. In a system wherein a master camera is designated, the designation may be changed in response to changes in the reliability information associated with the master camera and/or the reliability information associated with one or more other cameras in the set of cameras.
- Some of the conditions leading to unreliable data can be easily remedied by a human user of a V3D device or system. (Examples of such conditions include a circumstance in which a camera or other sensor is occluded by the user's finger or hand.) In these situations, it would be useful to communicate to the user in a manner that is unobtrusive, but which intuitively prompts the user to take corrective action.
- the V3D device or system (for example, a commercially available smartphone configured to execute a V3D method in accordance with the invention) comprises a display element in addition to the camera pairs and/or sensors.
- This display may provide an efficient mechanism for the device or system to communicate feedback to the user.
- Some embodiments, such as a content-capture device akin to a personal video camera, may provide a monitor or viewfinder through which the user is able to observe a sample reconstruction of the scene being captured by the cameras and/or other sensors on the device.
- Another example embodiment is a mirroring device, through which the user is able to observe him or herself, possibly from different perspectives than would be available from a conventional mirror.
- a similar mirroring feature may also be present in a communications system, which may include a display that may show a virtual viewport to make the user aware of the visual data that they are transmitting to other users.
- a simpler feedback mechanism can be used.
- a series of LED lights, or an LED light that varies in color to indicate the degree of reliability of the data, could enable correct positioning in front of a device.
- Another embodiment could make use of audible feedback, such as a warning sound, or a voice message.
- a problematic condition can arise when an object is too close to the cameras, such that the disparity for that object, with respect to a given camera pair, is too great.
- other depth-sensing sensors may have range limits that create similar issues.
- This situation is both common, and especially jarring, when the object in question is the user themselves, or a part of the user's body, such as a hand or the face.
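- The "too close" limit can be made concrete with the standard pinhole-stereo relation (an illustration; the patent does not fix camera parameters): depth z = f·b/d, so a matcher that searches at most d_max pixels of disparity cannot correlate anything nearer than z_min = f·b/d_max:

```python
def min_reliable_depth(focal_px: float, baseline_m: float,
                       d_max_px: float) -> float:
    """Nearest depth a stereo matcher with a bounded disparity search
    can resolve; closer objects yield unreliable data."""
    return focal_px * baseline_m / d_max_px

# e.g. f = 800 px, b = 6 cm, 128-pixel search range -> z_min = 0.375 m
print(min_reliable_depth(800.0, 0.06, 128.0))
```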
- an exemplary embodiment of the system can identify which parts of the reconstructed image incorporate the unreliable data. In addition, it may also be possible to determine the degree of unreliability for sub-regions of the image. When the unreliability in a given region of the image increases beyond a selected threshold, an exemplary embodiment of the system can alter the processing of data pertaining to that "unreliable region" in response to the increased unreliability.
- visual cues can then be provided in the regions of the display reflecting the unreliable data, such cues being configured to tend to cause a human user to take corrective action to remedy the operating condition causing the unreliable sensor data.
- These visual cues can be overt, such as drawing an indication, drawing an icon, drawing a sprite, or drawing a shape on the user's view of the scene. They can also involve changing the shading, outlining, tinting, blending in a color or a pattern, applying a blur, applying a filter, or applying an image effect to the image regions containing unreliable data or to objects that are contained or partially contained within those image regions.
- some visual cues can be applied more or less strongly depending on the degree of unreliability of the data. For example, a face moving closer to the cameras may begin to blur slightly when the depth or disparity data begins to approach a threshold for unreliability. As the face moves closer still, the blur intensifies.
- a blur and similar effects can serve a dual purpose, both to notify the user of a sub-optimal situation, and also to hide or distract from any artifacts caused by the unreliable data.
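- A minimal sketch of such a proportional blur follows (the linear ramp and its tuning constants are assumptions, not the patent's prescription):

```python
from scipy.ndimage import gaussian_filter

def apply_reliability_blur(image, unreliability,
                           threshold=0.2, max_sigma=8.0):
    """Blur a displayed 2-D image in proportion to data unreliability.

    unreliability: scalar in [0, 1], e.g. the worst score in the frame.
    Below `threshold` no blur is applied; above it, the Gaussian sigma
    ramps linearly, so the blur intensifies smoothly as, for example,
    a face moves closer to the cameras.
    """
    if unreliability <= threshold:
        return image
    sigma = max_sigma * (unreliability - threshold) / (1.0 - threshold)
    return gaussian_filter(image, sigma=sigma)
```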
- An example of such a practice, in accordance with the present invention, is shown in FIG. 4.
- the schematic diagram of FIG. 4 shows an example of blur being applied in proportion to data unreliability, in accordance with an exemplary practice of the present invention.
- as a face, depicted schematically in FIG. 4, moves closer to the cameras, the system applies increasingly more blur, to compensate for the increased unreliability of the depth or disparity data and, in an exemplary practice of the invention, to alert the user to the data unreliability.
- FIG. 4, consisting of FIGS. 4A, 4B and 4C, shows an example of a display element 410.
- the display element presents an image 420 of a face.
- the displayed image 420 of the face occupies a greater portion of the display 410 (compare FIGS. 4A, 4B, and 4C), while the system applies increasingly more blur.
- the increasing blur shown in FIGS. 4A - 4C is generated by the system, in accordance with the invention, in response to the increase in data unreliability, while the increased portion of the screen occupied is merely incidental, sharing the same root cause: the object (e.g., the face) being closer to the camera pairs, thereby increasing the difficulty of detecting correspondence on that object.
- objects may be blurred (i.e., the system can apply a selected amount of blur) as they approach the edge of the field of view.
- the affected regions of the screen are blurred, while in other embodiments, a better user experience is delivered by blurring the entire screen.
- the degree of blur may be dictated by the portion of the image with the highest detected unreliability. This strategy may be particularly useful in a video-chat embodiment.
Occluded Cameras or Outlier Sensors
- the system can continuously monitor each camera and/or each sensor and make a determination whether a given camera or sensor is occluded or otherwise impeded. In situations where it can be determined that a camera or sensor is occluded, this information can be communicated to a user so that the user may take action. A message may be shown to the user on the device's display, or an icon may also be drawn to alert the user to the situation. A useful indicator may be an arrow, to show the user which of the cameras or sensors is responsible for the unreliable data.
- Another method of indicating to the user that a camera is occluded is to create a blurry region, a shaded region, a darkened region, a colored region, or a region in which a pattern is displayed in the area of the screen near where the affected camera or sensor is mounted on the device.
- FIG. 5 depicts an exemplary practice of the invention, in which a display 410 and camera (411, 412, 413, 414) device shows a reconstructed image 420 of a face, with a shaded region 430 in the lower right-hand portion of the display screen, the shaded region being generated by the system to indicate to the human user of the display/camera device that the lower right camera 413 (as seen by the user of the display/camera device) is occluded.
- the shaded region 430 of FIG. 5 thus draws the user's attention to an occluded camera.
- the visual indication may communicate the relative or abstract position or positions of the occluded camera, cameras, sensor, or sensors.
- a "camcorder-like" icon can be displayed to draw the user's attention to the lower right corner of the viewfindcr display if the lower right camera is occluded— even if, for example, the entire sensor array is positioned to the left of the viewfinder display.
- reconstruction logic will take steps to minimize the impact of the unreliable camera data and perform the reconstruction with the available data, but the user interface software may additionally highlight the unreliable camera to the user by creating a visually prominent indication on the image displayed to the user.
- FIG. 5 depicts an example of this practice of the invention.
- depth or disparity data are used in a communications system, such as a video-chat system or any other system with a sender and a receiver.
- the user or device that collects the data is often not the user or device that ultimately displays the data. Therefore, the system should communicate any erroneous conditions to the user operating the device collecting the data.
- some embodiments may detect erroneous conditions on the transmitting device or system, while some embodiments may detect erroneous conditions on the receiving side.
- the receiving device can respond with a message to the transmitting device, which can then prompt the transmitting device to take appropriate corrective action, including displaying a visual indication to the user operating or otherwise interacting with the transmitting device.
- both the transmitting and receiving device may be operable to detect errors.
- a transmitting device with a plurality of cameras allows a 360 degree field of view to be reconstructed.
- a receiving device then reconstructs a 120 degree subset of the view field.
- sensors contributing to regions outside the receiver's reconstructed view do not affect the receiver's experience.
- a user operating the receiving device may change the subset of the field of view in order to reconstruct a different portion of the view or observe the scene from a different virtual viewpoint.
- the receiving device could then send information or a notification to the transmitting device to indicate that the data being transmitted is unreliable in a way that is now affecting the desired operation.
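- The patent does not define a wire format for such a notification; as a purely illustrative sketch, the receiver-to-transmitter message might carry which sensor is affected, the affected region of the view field, and a severity (all field names hypothetical):

```python
import json

def make_reliability_notice(pair_id: int, region, severity: float) -> str:
    """Notify the transmitting device that data it is sending for the
    currently reconstructed view is unreliable."""
    return json.dumps({
        "type": "unreliable_data_notice",
        "camera_pair": pair_id,     # which camera pair / sensor
        "region": region,           # e.g. [x, y, w, h] in the view field
        "severity": severity,       # 0.0 (mild) .. 1.0 (unusable)
    })
```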
- this self-monitoring feature may be virtual, such as a "picture in picture" feature or feedback view that shows the user a sample of the data they are transmitting, in addition to the data that they are receiving.
- simple techniques or strategies for alerting the user include reporting errors through icons, messages, or sounds communicated to the user by the transmitting device being employed by the user.
- Another exemplary practice of the invention goes further, and involves "hijacking" the display on the transmitting device when an error condition arises.
- a display on the transmitting device may switch to display information notifying the user of the error in the transmitted data upon detection of the error conditions. This could manifest in the transmitting user seeing an image of themselves on the display, possibly along with an error message, instead of seeing the person on the other end of a video chat.
- Another strategy for a video-chat system is to display indications to the user, mixed into the data that the user is receiving from the other party in the chat. Such indications are referred to herein as "false indications."
- a user who moves his face too close to a camera pair may see the image blur on their display.
- the system can cause this to happen, as an alert to the user, in spite of the fact that the image on the user's display is reconstructed from "clean" data received from the other party in the chat session.
- the blur is imposed by the system primarily to prompt the user to back away from the camera pair and thus improve the quality of the data he or she is transmitting.
- the display on the receiver's device could also impose a full or partial blur to mask the unreliability in the transmitted data.
- a hand moving too close to the sensors on a transmitting device may cause the system to blur the region of the transmitting device's display corresponding to where the receiving device may see the hand reconstructed from unreliable data.
- the indications may mirror one another.
- a receiving device that performs a blur or presents an indication to hide artifacts during reconstruction on the receiving side may instruct the transmitting device to perform the same or a related blur, or present the same or a related indication.
- FIG. 6 Another aspect of the invention, depicted schematically in FIG. 6, relates to capturing a scene on a transmitting device, and then presenting that scene on a receiving device such that the orientation of the presented scene is consistent with respect to a world orientation vector, regardless of the orientation(s) of the respective devices.
- FIG. 6 shows a transmitting device 610 and a receiving device 620 (such as smartphones, tablet computing devices, or the like), each provisioned with cameras or other sensors, as indicated by the circle or dot elements 613, 614, 615, 616, and 623, 624, 625, 626, at or near the corners of the display screens 611, 621 of the respective devices 610, 620 in the example of FIG. 6.
- the reconstructed image 622 of the person (indicated schematically by a stick figure) displayed on the receiving device 620 remains "vertical", wherein the term "vertical" is defined with relation to a "world downward orientation" indicated by the arrow 650 at the left-hand side of FIG. 6.
- the image 612 shown on display 611 of transmitting device 610 can be, for example, a mirroring of the image 622 shown on the display 621 of receiving device 620.
- image 612 on display 611 of device 610 could be an image of the user of the "opposite" device 620.
- the world downward orientation would be a conventional downward vector defined by gravity or a gravity field.
- the downward orientation could be defined by a dominant perceived direction of consistent acceleration.
- This manner of orienting the presented image, determined not by the angle at which the receiving device is held, but by the relationship between the scene captured by the transmitting device and a defined world downward orientation, is referred to herein as synchronization to a common space.
- FIG. 6 thus shows a scene captured on a transmitting device that is presented on a receiving device, such that the orientation is consistent with respect to a world orientation vector, regardless of the orientations of the transmitting and receiving devices.
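- A simplified 2-D sketch of this synchronization follows (assuming each device reports the direction of gravity in its own screen plane, e.g. from its accelerometer; sign conventions are illustrative):

```python
import math
from scipy.ndimage import rotate

def present_synchronized(image, tx_gravity_xy, rx_gravity_xy):
    """Rotate the received image so its "down" stays aligned with the
    world downward orientation, however either device is held.

    tx_gravity_xy / rx_gravity_xy: gravity direction in each device's
    screen plane (x right, y down).
    """
    def roll(g):
        # angle between screen-down and gravity
        return math.atan2(g[0], g[1])
    delta_deg = math.degrees(roll(rx_gravity_xy) - roll(tx_gravity_xy))
    return rotate(image, angle=delta_deg, reshape=False, mode="nearest")
```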
- FIG. 7 is a schematic diagram illustrating a system in accordance with the invention, comprising, among other elements, a wall-mounted display device 710 (mounted on wall or other planar surface 740) and a hand-held device 730 with a display 731, with which a user (indicated by stick figure 738) is interacting.
- Device 710 has cameras or other sensors 712, 713, 714, 715, and displays an image 711.
- Device 730 may also have cameras or other sensors (not shown), and displays an image 732.
- a selected scene scale and orientation can be preserved across different device configurations.
- exemplary practices and embodiments of the invention provide visual communications methods, systems and computer program code products (software) that include or operate in association with (A) at least one transmitting device operable to: (1) capture first scene information through cameras and/or other forms of sensors; (2) capture originating environmental parameters (such as scene orientation, luminosity and other parameters); (3) digitally process the first scene information to generate rich scene information; and (4) transmit the rich scene information to (B) at least one receiving device, in which the at least one receiving device is operable to: (1) capture destination environmental parameters; (2) receive the rich scene information transmitted by the transmitting device; (3) interpret the received rich scene information; and (4) present the scene, such as by presenting the scene on a display screen, at least in part based on the rich scene information.
- the scene can be presented, for example, by displaying an image of the scene on a device used by a human user.
- the scene could also be displayed on different forms of display elements, which could include public display elements, such as billboards or other public displays.
- the transmitting device or devices transmit the originating environmental parameters to the receiving device or devices, and the receiving device or devices utilize the originating environmental parameters in presenting the scene.
- the receiving device or devices transmit the destination environmental parameters to the transmitting device or devices, and the transmitting device or devices make use of the destination environmental parameters to process the rich scene information.
- the digital processing of first scene information to generate rich scene information comprises data compression.
- the interpreting of the rich scene information can comprise data decompression.
- the environmental parameters comprise an orientation vector.
- the orientation vector can be measured using an accelerometer, gyroscope, compass, GPS (global positioning system), other spatial sensor, or any combination of spatial sensors.
- GPS global positioning system
- known forms of commercially available mobile devices use a plurality of sensors, in concert, in a manner referred to as "sensor fusion," to provide, for use by application developers and others, a "best guess" 3D orientation of the device at any given moment of operation.
- an orientation vector is constrained with respect to a device, but may be changed by a substantial change in data from an accelerometer.
- an orientation vector is allowed to move smoothly to align with the orientation of a device in a gravity field, such as may be found on the Earth.
- the invention additionally comprises control logic, operable to smooth high frequency changes to an orientation vector.
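- Such smoothing control logic might look like the following sketch (an exponential low-pass filter with a "snap" for substantial re-orientations; `alpha` and the snap angle are assumed tunables, not values given in the patent):

```python
import numpy as np

class SmoothedOrientation:
    """Smooths high-frequency changes to an orientation vector while
    still following a substantial, sustained change (e.g. the device
    being turned on its side)."""
    def __init__(self, alpha: float = 0.05, snap_angle_deg: float = 45.0):
        self.alpha = alpha
        self.snap_cos = float(np.cos(np.radians(snap_angle_deg)))
        self.vec = None

    def update(self, sample) -> np.ndarray:
        """Feed one orientation sample (e.g. the accelerometer's
        gravity estimate); returns the smoothed orientation vector."""
        s = np.asarray(sample, dtype=np.float64)
        s /= np.linalg.norm(s)
        if self.vec is None or float(np.dot(self.vec, s)) < self.snap_cos:
            self.vec = s                      # first sample or large jump
        else:
            v = (1.0 - self.alpha) * self.vec + self.alpha * s
            self.vec = v / np.linalg.norm(v)  # damp high-frequency noise
        return self.vec
```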
- an orientation vector may be influenced by a user through a user interface.
- an orientation vector is derived from the rich scene information.
- processing of rich scene information comprises rotating or transforming the rich scene information with respect to an orientation vector.
- interpreting the rich scene information comprises rotating or transforming the rich scene information with respect to an orientation vector.
- Orientation vectors from more than one device may be considered in the same operation or logically connected operations.
- the interpretation or the processing utilizes the difference between orientation vectors from more than one device.
- a device can rotate or transform the scene information, and the receiving device can present the scene with a consistent downward orientation that is aligned with a selected axis of the transmitting device or devices, irrespective of the respective rotational or angular orientations of the respective devices.
- the receiving devices present the scene on a display screen.
- the transmitting device(s) provide a feedback view to allow user(s) of the transmitting device(s) to observe the scene captured by a respective transmitting device on the same device on which it was captured.
- the receiving device(s) present a different portion of the scene from the portion presented through the feedback view on the transmitting device or devices.
- a user of a receiving device can control or affect the portion of the scene that is presented on the receiving device.
- a user of a receiving device can adjust a selected gaze direction to change a virtual viewpoint.
- a gaze direction can be controlled or changed by the use of a touch screen interface manipulated by the user; or by an accelerometer, gyroscope, compass, GPS, other spatial detector or combination of spatial detectors; or by a detected or directly applied gesture by a user, as observed by non-contact sensors on a receiving device or by a touchscreen on a receiving device; or by a physical position of a user, relative to the physical position of a receiving device.
- the focus of a virtual camera can be controlled or changed by a user of a receiving device; or by the user selecting a region on a screen to bring into sharp focus; or by a detected or applied gesture by a user, as observed by non-contact sensors on a receiving device or by a touchscreen on a receiving device.
- the field of view can be controlled or changed by a user of a receiving device; or the field of view can be changed by a user's gesture on a touch screen or a user's gesture detected by non-contact sensors; or by motion of a device, as detected by an accelerometer, gyroscope, compass, GPS, or other spatial detector or combination of spatial detectors; or by physical position of a user, relative to the physical position of a receiving device.
- a user of a receiving device can control or change a zoom parameter, such as by a gesture applied to a touchscreen or a gesture detected by a non-contact sensor on a receiving device, or by motion of a device, as detected by an accelerometer, gyroscope, compass, GPS, other spatial sensor, or any combination of spatial sensors, or by physical position of a user, relative to the physical position of a receiving device.
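As a rough illustration of how touch input might drive these viewing parameters, the sketch below maps a drag gesture to gaze direction and a pinch gesture to zoom. The function names, sensitivity, and clamping limits are assumptions for illustration, not the patent's interface.

```python
def apply_drag(yaw, pitch, dx_px, dy_px, sensitivity=0.005):
    """Map a touchscreen drag (in pixels) to a new gaze direction.

    yaw and pitch are in radians; pitch is clamped so the virtual
    viewpoint cannot flip over the pole.
    """
    yaw += dx_px * sensitivity
    pitch = max(-1.2, min(1.2, pitch + dy_px * sensitivity))
    return yaw, pitch

def apply_pinch(zoom, pinch_scale, lo=1.0, hi=8.0):
    """Map a pinch gesture's scale ratio to a clamped zoom parameter."""
    return max(lo, min(hi, zoom * pinch_scale))

# Example: a 120-pixel rightward drag, then a 1.5x pinch-out.
yaw, pitch = apply_drag(0.0, 0.0, 120, 0)
zoom = apply_pinch(1.0, 1.5)
```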
- the receiving device(s) attempt to preserve the spatial topology of the scene that was captured by the transmitting device or devices.
- a device applies a scale factor to the rich scene data.
- the scale factor can be modified by a user of a device through an interface usable by the user to control the device.
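A minimal sketch of this scaling aspect, assuming rich scene data represented as 3D points and a scale value taken from a user control (both representations are assumptions made here for illustration):

```python
def scale_scene(points, scale):
    """Apply a uniform scale factor to rich scene data (here, 3D points)."""
    return [(x * scale, y * scale, z * scale) for (x, y, z) in points]

# Example: a user interface control doubling the presented scale.
scaled = scale_scene([(0.1, 0.2, 1.5), (0.0, -0.3, 2.0)], scale=2.0)
```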
- the receiving device(s) also function as transmitting device(s), and the transmitting device(s) also function as receiving device(s).
- some of the devices are different from other devices in the system, and in particular, some of the devices in the system do not comprise the same sensors or capabilities as the other devices.
- telecommunications devices can include known forms of cellphones, smartphones, and other known forms of mobile devices, tablet computers, desktop and laptop computers, and known forms of digital network components and server/cloud/network/client architectures that enable communications between such devices.
- Computer software can encompass any set of computer-readable program instructions encoded on a non-transitory computer readable medium.
- a computer readable medium can encompass any form of computer readable element, including, but not limited to, a computer hard disk, computer floppy disk, computer-readable flash drive, computer-readable RAM or ROM element or any other known means of encoding, storing or providing digital information, whether local to or remote from the cellphone, smartphone, tablet computer, PC, laptop, computer-driven television, or other digital processing device or system.
- modules and digital processing hardware elements, including memory units and other data storage units, and including commercially available processing units, memory units, computers, servers, smartphones and other computing and telecommunications devices.
- modules include computer program instructions, objects, components, data structures, and the like that can be executed to perform selected tasks or achieve selected outcomes.
- data storage module can refer to any appropriate memory element usable for storing program instructions, machine readable files, databases, and other data structures.
- the various digital processing, memory and storage elements described herein can be implemented to operate on a single computing device or system, such as a server or collection of servers, or they can be implemented and inter-operated on various devices across a network, whether in a server-client arrangement, server-cloud-client arrangement, or other configuration in which client devices can communicate with allocated resources, functions or applications programs, or with a server, via a communications network.
- One implementation comprises a complete device, including four cameras, capable of encoding content and receiving (full-duplex communication).
- Another is an Apple iPhone-based implementation that can receive and present immersive content (receive-only).
- the Applicants used the following hardware and software structures and tools, among others, to create the two noted implementations, collectively:
- An Intel Core i7-6770HQ processor which includes on-chip the following:
- OpenCL API using Intel Media SDK running on Linux operating system to implement, among other aspects: Image Rectification, Fast Dense Disparity Estimate(s) (FDDE) and Multi-level Disparity Histogram aspects.
- DLIB Face Detection library to locate presence of viewer's face.
- the Apple iOS SDK was used to access accelerometer, gyroscope and compass for device orientation and to access video decode hardware; and the OpenGL ES API to implement multiple native disparity map voting and image reconstruction to enable an iPhone-based prototype of a receiving device. It is noted that the above-listed hardware and software elements are merely tools or building blocks that can be used in a development process, and not themselves instantiations, embodiments or practices of the invention.
- FIGS. 8A - 12 are flowcharts illustrating method aspects and exemplary practices of the invention.
- the methods depicted in these flowcharts are examples only; the organization, order and number of operations in the exemplary practices can be varied; and the exemplary practices and methods can be arranged or ordered differently, and include different or additional functions, whether singly or in combination, while still being within the spirit and scope of the present invention.
- FIG. 8 shows a method 800 according to an exemplary practice of the invention, including the following operations:
- (801.1 Sensors can comprise at least one stereo pair of cameras.)
- (802.2 Detecting can include comparing output of a sensor to output of other sensor(s).)
- the comparing can utilize at least one histogram.
- the histograms can pertain to depth data, stereo disparity data, and/or other data.
- the comparing can involve generating an average.
- the comparing can involve comparing luminance data from one or more cameras.
- the comparing can involve comparing color data from one or more cameras.
- the remedying can comprise: excluding unreliable data.
- the remedying can include notifying a user of unreliable data.
- the display can be part of a device containing the display and at least one sensor.
- (804.6 Remedying can include presenting intuitive visual cues on a display, that tend to direct the user to act in a manner to resolve a condition causing unreliable data.)
- the visual effect can be applied more strongly in response to greater unreliability.
- the visual effect can include a blur effect.
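One plausible reading of the detection-and-remedy aspects above is sketched below: compare normalized luminance histograms from two cameras, treat a large histogram distance as unreliability (for example, an occluded lens), and scale a blur cue by that score. The total-variation distance and the mapping to a blur radius are illustrative choices, not the patent's prescribed method.

```python
def luminance_histogram(pixels, bins=16):
    """Normalized histogram of 0-255 luminance samples from one camera."""
    hist = [0] * bins
    for v in pixels:
        hist[min(v * bins // 256, bins - 1)] += 1
    total = len(pixels) or 1
    return [h / total for h in hist]

def unreliability(pixels_a, pixels_b, bins=16):
    """0.0 when two cameras' histograms agree, rising to 1.0 when disjoint."""
    ha = luminance_histogram(pixels_a, bins)
    hb = luminance_histogram(pixels_b, bins)
    return 0.5 * sum(abs(a - b) for a, b in zip(ha, hb))

def blur_radius(score, max_radius=12):
    """Apply the visual effect more strongly in response to greater unreliability."""
    return int(round(score * max_radius))
```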
- 805. Generate rich scene information from (A) the sensor data, including remedied data, and (B) reliability information.
- At least one sensor is associated with the capturing device, and the capturing device is operable to transmit any of sensor data and rich scene information.
- the remote device can notify the capturing device of unreliable transmitted data representative of the scene.
- the capturing device can present an indication of unreliable transmitted data.
- the remote device can present an indication of unreliable received data.
- An indication of unreliable data, presented by capturing device, can correlate with indication of unreliable data presented by remote device.
- An indication of unreliable data presented by capturing device is configured so as to tend to direct a user of the capturing device to remedy an occluded sensor.
- FIG. 9 shows a method 900 according to an exemplary practice of the invention, comprising the following operations:
- (903.1 Transmitting device can use destination environmental parameters, transmitted by the receiving device to the transmitting device, in processing the first scene information to generate rich scene information.)
- the processing can include data compression.
- the processing can include rotating or transforming the rich scene information with respect to an orientation vector.
- the transmitting device transmits the rich scene information to at least one receiving device.
- the receiving device captures destination environmental parameters.
- the receiving device receives the rich scene information from the at least one transmitting device.
- the interpreting can include rotating or transforming the rich scene information with respect to an orientation vector.
- the presenting can include displaying at least one image of scene, via display element operable to communicate with receiving device, based at least in part on rich scene information.
- Receiving device utilizes the originating environmental parameters, transmitted to the receiving device by the transmitting device, in presenting the scene.
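The compression mentioned in the processing step above, together with the matching decompression during interpreting on the receiving side, might look like the following round-trip sketch. The JSON-plus-zlib encoding and the field names are assumptions for illustration, not the claimed wire format.

```python
import json
import zlib

def pack_rich_scene(depth_map, orientation):
    """Compress rich scene information for transmission (transmitting side)."""
    payload = json.dumps({"depth": depth_map, "orientation": orientation})
    return zlib.compress(payload.encode("utf-8"))

def unpack_rich_scene(blob):
    """Decompress and decode on the receiving side (the interpreting step)."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))

# Round trip: transmit, then reconstruct the payload on the receiver.
blob = pack_rich_scene([1.5, 1.6, 1.7], (0.0, -1.0, 0.0))
scene = unpack_rich_scene(blob)
```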
- At least one device rotates or transforms the scene information.
- the receiving device can present the scene with a consistent downward orientation that is substantially aligned with a selected axis of the transmitting device or devices, irrespective of the rotation of the devices.
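One way to realize that consistent downward orientation is to rotate the scene so the transmitter's gravity vector maps onto a fixed 'down' axis before presentation. The Rodrigues-rotation sketch below is one illustrative implementation, not the transform prescribed by the patent; it assumes rich scene data as 3D points and a unit gravity vector from the transmitting device.

```python
import math

def rotate_to_gravity(points, down):
    """Rotate 3D points so the transmitter's 'down' vector maps to (0, -1, 0)."""
    target = (0.0, -1.0, 0.0)
    # Rotation axis = down x target; sin/cos of the angle between them.
    ax = down[1] * target[2] - down[2] * target[1]
    ay = down[2] * target[0] - down[0] * target[2]
    az = down[0] * target[1] - down[1] * target[0]
    s = math.sqrt(ax * ax + ay * ay + az * az)
    c = sum(d * t for d, t in zip(down, target))
    if s < 1e-9:  # already aligned, or exactly opposite
        return list(points) if c > 0 else [(-x, -y, z) for x, y, z in points]
    ax, ay, az = ax / s, ay / s, az / s
    out = []
    for x, y, z in points:
        # Rodrigues' formula: v' = v*c + (k x v)*s + k*(k.v)*(1 - c)
        kx = ay * z - az * y
        ky = az * x - ax * z
        kz = ax * y - ay * x
        kv = ax * x + ay * y + az * z
        out.append((x * c + kx * s + ax * kv * (1 - c),
                    y * c + ky * s + ay * kv * (1 - c),
                    z * c + kz * s + az * kv * (1 - c)))
    return out
```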
- the receiving device can present at least one image of scene to user, by presenting it on a display element (a Head Mounted Display (HMD), display element on hand-held device, desktop display screen, wall-mounted display, freestanding display, surface mounted display, outdoor display screen or other display element).
- the transmitting device can further comprise a feedback view that presents feedback to a user of transmitting device.
- the feedback can comprise an image of the scene.
- the receiving device can present a different portion of the scene from the portion presented by the feedback view of the transmitting device.
- FIG. 10 shows aspects (collectively 1000) of environmental parameter and orientation vector processing in exemplary practices of the invention, including the following:
- Originating environmental parameters can comprise parameters associated with the scene.
- Originating environmental parameters can comprise parameters associated with the transmitting device.
- Destination environmental parameters can comprise parameters associated with the environment proximate the receiving device.
- (1000.4 Destination environmental parameters can comprise parameters associated with the receiving device.)
- Originating/destination environmental parameters can include an orientation vector.
- a spatial sensor can be any of an accelerometer, gyroscope, compass, GPS (Global Positioning System), other spatial sensor, or combination of spatial sensors; and orientation vector can be determined/measured using an accelerometer, gyroscope, compass, GPS, other spatial sensor, or combination of spatial sensors.
- (1000.11 Orientation vector can be at least in part controlled by a user through a user interface.)
- (1000.12 Orientation vector can be derived from the rich scene information.)
- FIG. 11 shows aspects (collectively 1100) of user control processes and processing, and aspects of a control interface, in exemplary practices of the invention, including the following:
- Gaze direction can be controlled at least in part by output of an accelerometer, gyroscope, compass, GPS, other spatial sensor, or combination of spatial sensors.
- Gaze direction can be changed by the physical position of a user, relative to a physical position of a receiving device.
- User of receiving device can change the focus of a virtual camera that defines a perspective of a displayed image of the scene.
- (1100.17 Zoom can be changed by motion of a device, the motion being detected by an accelerometer, gyroscope, compass, GPS, other spatial sensor, or combination of spatial sensors.)
- FIG. 12 shows aspects (collectively 1200) relating to spatial topology, scaling, and other transmitting/receiving device operations in exemplary practices of the invention, including the following:
- At least one receiving or transmitting device is operable to apply a scale factor to the rich scene information.
- At least one receiving device is operable to additionally function as a transmitting device, and at least one transmitting device is operable to additionally function as a receiving device.
- FIGS. 13 - 22 are schematic block diagrams showing aspects of exemplary embodiments of the invention.
- FIG. 13 is a schematic block diagram showing an embodiment of the invention comprising a capturing device 1302 having cameras and/or other sensors 1304, a display element 1305, and digital processing resource 1306 (containing at least one digital processor), and a touchscreen, non-contact sensors and spatial sensors collectively denominated 1307.
- the cameras or other sensors 1304 are operable to obtain scene information about a scene 1301, which operation may be under the control of a user 1311, who may control or change operations of the device 1302 via the touchscreen, non-contact sensors and spatial sensors 1307.
- the capturing device and its digital processing resource(s) are operable, in accordance with exemplary practices of the present invention, to capture scene information and process the scene information to generate rich scene information 1303 (shown at the right-hand side of FIG. 13).
- the rich scene information 1303 can be used by the capturing device itself (for example, to present one or more images on display 1305, as schematically depicted by the arrow running from digital processing resource 1306 to display 1305); and/or the rich scene information 1303 can be transmitted for use elsewhere, as schematically depicted by the arrow from processing resource 1306 and emerging out of the right-hand side of device 1302 as depicted in FIG. 13.
- FIG. 14 is a schematic block diagram showing an embodiment of the invention comprising a capturing/transmitting device 1402 and a receiving device 1403 in communication with each other via a network 1404.
- the capturing/transmitting device 1402 comprises cameras/sensors 1406, a display 1407, a digital processing resource 1408 (containing at least one digital processor), and a touchscreen, non-contact sensors and spatial sensors collectively denominated 1409.
- the receiving device 1403 comprises cameras/sensors 1410, display 1411, digital processing resource 1412, and touchscreen, non-contact sensors and spatial sensors collectively denominated 1413.
- the exemplary embodiment of FIG. 14 also comprises an external display 1414, which may be a head-mounted display (HMD), display screen on a handheld device, wall-mounted display, outdoor display, or other display element, which can be controlled, for example, by the digital processing resource 1412 of receiving device 1403.
- the cameras/sensors 1406 of capturing/transmitting device 1402 are operable to capture scene information representative of scene 1401, which operation may be under the control of a user 1431, who may control or change operations of the device via the touchscreen, non-contact sensors and spatial sensors 1409. (A user is not necessarily required, however, as the device may be configured to operate autonomously.) As described in greater detail elsewhere herein, the capturing/transmitting device and its digital processing resource(s) are operable, in accordance with exemplary practices of the present invention, to capture scene information and process the scene information to generate rich scene information.
- the capturing/transmitting and receiving device(s) 1402, 1403, can send to each other various data and information (collectively shown as 1451, 1452, 1453, 1454) via network 1404 (which may be, for example, the Internet), and this data can comprise scene information, rich scene information, and environmental parameters in accordance with exemplary practices of the present invention, as described in greater detail elsewhere herein.
- the capturing/transmitting device 1402 can transmit rich scene information, environmental parameters, and other information, collectively shown as 1451, and transmit them to the receiving device 1403 via network 1404 (see, for example, arrow 1454 running from network 1404 to receiving device 1403).
- the receiving device can also obtain environmental parameters and other information, collectively shown as 1453, and transmit them to the capturing/transmitting device 1402 via the network 1404 (see, for example, arrow 1452 running from network 1404 to capturing/transmitting device 1402).
- the receiving device 1403 can receive the rich scene information, environmental parameters and other information, and process such data, using its digital processing resource 1412, and present one or more images, based on the rich scene information, environmental parameters and other information, on display 1411 and/or external display 1414, to a user or viewer.
- the image(s) may include reconstructed images of scene 1401.
- a user of receiving device 1403 can control or change operations of receiving device 1403 by touchscreen, non-contact sensors and/or spatial sensors 1413 of the receiving device. In some settings, such as an outdoor display embodiment, there may not be a particular "user", per se, but instead one or more human "viewers.”
- the two-device configuration shown in FIG. 14 can be replicated or expanded to cover a plurality of transmitting devices and a plurality of receiving devices, operable to communicate with each other via a network. Also, in accordance with known network and telecommunications architectures, a given device can act as a transmitting device and a receiving device at different times, or as both at the same or substantially the same time, in a full duplex configuration.
- FIG. 15 is a schematic block diagram showing aspects of a digital processing resource 1500 in accordance with exemplary embodiments and practices of the invention.
- Digital processing resource 1500 can be the same as, or of the same design and configuration as, the digital processing resource 1306 shown in FIG. 13, and digital processing resources 1408, 1412 shown in FIG. 14.
- As shown in FIG. 15, digital processing resource 1500:
- Digital processing resource 1500 is operable to:
- the remedying can include: notifying user (via a display) of unreliable data.
- the remedying can include: presenting to user (via a display) intuitive visual cues, configured so as to tend to direct user to act in a manner to resolve a condition causing unreliable data.
- 1502.4 Generate rich scene information from (A) the sensor data, including remedied data, and (B) the reliability information.
- (1502.5 Reconstruct scene as viewed from virtual viewpoint, based on rich scene information.)
- a digital processing resource associated with the capturing device is operable to: transmit sensor data and/or rich scene information.
- a digital processing resource associated with a remote device is operable to notify the capturing device of unreliable transmitted data representative of the scene.
- the capturing device presents an indication of unreliable transmitted data.
- the remote device presents an indication of unreliable received data.
- FIG. 16 is a schematic block diagram showing aspects, in a digital processing resource 1500, of detecting reliability of sensor data, and related processing elements. As indicated in FIG. 16, in a digital processing resource 1500:
- Detecting reliability can include utilizing a heuristic.
- Detecting reliability can include comparing the output of a sensor to the output from one or more additional sensors.
- the comparing can include comparing subsections of data independently.
- the comparing can utilize at least one histogram.
- the comparing can include generating an average.
- the comparing can include comparing luminance data from one or more cameras.
- the comparing can include comparing color data from one or more cameras.
- Sensors are operable to generate sensor data in response to sensed conditions and communicate sensor data to digital processing resource.
- (1604. Sensors can include at least one stereo pair of cameras operable to capture scene information representative of a scene.)
- the rich scene information can include depth information.
- the depth information can be obtained by stereo disparity analysis.
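The familiar pinhole-stereo relation behind such disparity-derived depth is shown below; the parameter names and units are illustrative.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth from stereo disparity: Z = f * B / d.

    disparity_px: pixel disparity from stereo matching; focal_px: focal
    length in pixels; baseline_m: separation of the stereo cameras in
    meters. Returns None where disparity is zero or invalid.
    """
    if disparity_px <= 0:
        return None
    return focal_px * baseline_m / disparity_px

# Example: 32 px disparity, 1000 px focal length, 65 mm baseline -> ~2.03 m.
z = depth_from_disparity(32, 1000.0, 0.065)
```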
- FIG. 17 is a schematic block diagram showing aspects of a transmitting device 1700 in accordance with exemplary embodiments and practices of the invention. As indicated in FIG. 17, a transmitting device in accordance with the invention is operable to:
- the processing can include data compression.
- FIG. 18 is a schematic block diagram showing aspects of a receiving device 1800 in accordance with exemplary embodiments and practices of the invention.
- a receiving device in accordance with the invention is operable to communicate with one or more transmitting devices, and is further operable to:
- the interpreting can include data decompression.
- Presenting the scene can include displaying at least one image of the scene, via a display element operable to communicate with the receiving device, based at least in part on the rich scene information.
- FIG. 19 is a schematic block diagram showing aspects of processing environmental parameters and/or orientation vector(s) in a digital processing resource or other elements of receiving or transmitting device(s) in accordance with the invention. As indicated in FIG. 19, in connection with a digital processing resource or other elements of receiving or transmitting device(s):
- (1901. Originating environmental parameters can include parameters associated with the scene.)
- Originating environmental parameters can include parameters associated with the transmitting device.
- Destination environmental parameters can include parameters associated with the environment proximate the receiving device.
- Destination environmental parameters can include parameters associated with the receiving device.
- Transmitting device can transmit originating environmental parameters to the receiving device, and the receiving device can utilize the originating environmental parameters in presenting the scene.
- Receiving device can transmit the destination environmental parameters to the transmitting device, and the transmitting device can utilize the destination environmental parameters in processing the first scene information to generate rich scene information.
- the environmental parameters can include an orientation vector.
- Orientation vector can be measured utilizing any of an accelerometer, gyroscope, compass, GPS (global positioning system), other spatial sensor, or combination of spatial sensors.
- An orientation vector can be substantially constrained with respect to a given device, but can be altered in response to a substantial change in data from a spatial sensor.
- Spatial sensor can include any of an accelerometer, gyroscope, compass, GPS, other spatial sensor, or combination of spatial sensors.
- the digital processing resource is operable to apply a selected smoothing process to smooth high frequency changes to an orientation vector.
- An orientation vector can be at least in part controlled by a user through a user interface.
- (1915. The orientation vector can be derived from the rich scene information.)
- processing of scene information can include rotating or transforming the rich scene information with respect to an orientation vector.
- interpreting of scene information can include rotating or transforming the rich scene information with respect to an orientation vector.
- the interpreting or the processing can utilize the difference between orientation vectors from more than one device.
- (1920. At least one transmitting or receiving device rotates or transforms the scene information, and the receiving device presents the scene with a consistent, defined downward orientation that is substantially aligned with a selected axis of the transmitting device or devices, irrespective of the rotation of the devices.)
- FIG. 20 is a schematic block diagram showing aspects of display elements in accordance with exemplary embodiments and practices of the invention. As indicated in FIG. 20:
- a receiving device is operable to present at least one image of the scene via a display element.
- a transmitting device is operable to generate a feedback view that presents feedback to a user of the transmitting device.
- Display element can be a component of the transmitting device or the receiving device.
- Display element can be external to the transmitting device or the receiving device.
- Display element can include a head-mounted display (HMD).
- Display element can include a display screen on a hand-held device.
- FIG. 21 is a schematic block diagram showing aspects of user controls and other controls and control interfaces enabled by or practiced in connection with the transmitting devices, receiving devices and processing resources in accordance with the present invention. As indicated in FIG. 21:
- a receiving device operable to present a different portion of the scene from the portion presented by the feedback view of the transmitting device.
- a receiving device is operable to enable a user of the receiving device to select the portion of the scene presented by receiving device.
- Receiving device operable to enable user of receiving device to select a gaze direction to change a virtual viewpoint, thereby to control the viewpoint of the scene presented by the receiving device.
- Receiving device operable to enable user of receiving device to select a gaze direction by utilizing a touch screen interface associated with the receiving device.
- Gaze direction can be controlled at least in part by the output of an accelerometer, gyroscope, compass, GPS, other spatial sensor, or combination of spatial sensors.
- Receiving device operable to enable user of receiving device to control gaze direction by executing a user gesture observable by a non-contact sensor associated with the receiving device.
- Receiving device operable to enable gaze direction to be changed by the physical position of a user relative to a physical position of a receiving device.
- Receiving device operable to enable user of receiving device to change focus of a virtual camera that defines a perspective of a displayed image of the scene.
- Focus can be changed by the user selecting a region of a displayed image to bring into sharp focus.
- a receiving device is operable to enable a user of the receiving device to change focus by executing a user gesture observable by a non-contact sensor associated with receiving device.
- Receiving device operable to enable user of receiving device to change field of view of displayed image.
- Receiving device operable to enable user of receiving device to change field of view by executing a gesture on a touch screen associated with the receiving device.
- Field of view can be changed by motion of a device, the motion being detected by an accelerometer, gyroscope, compass, GPS, other spatial sensor, or combination of spatial sensors.
- Receiving device operable to enable user to change field of view by executing a gesture observable by a non-contact sensor associated with receiving device.
- Receiving device is operable to enable user of receiving device to change an image zoom parameter.
- a zoom parameter can be changed by motion of device, the motion being detected by an accelerometer, gyroscope, compass, GPS, other spatial sensor, or combination of spatial sensors.
- Receiving device is operable to enable user of receiving device to change zoom parameter by executing a gesture observable by non-contact sensor associated with receiving device.
- FIG. 22 is a schematic block diagram showing other aspects of receiving device(s) and transmitting device(s) in accordance with exemplary embodiments and practices of the present invention. As indicated in FIG. 22:
- Receiving device is operable to attempt to preserve the spatial topology of the scene captured by the transmitting device.
- a transmitting or receiving device is operable to apply a scale factor to the rich scene information.
- a transmitting or receiving device is operable to enable a user to modify the scale factor via a control interface.
- At least one receiving device is operable to additionally function as a transmitting device, and at least one transmitting device is operable to additionally function as a receiving device.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computer Graphics (AREA)
- Geometry (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Software Systems (AREA)
- Computing Systems (AREA)
- User Interface Of Digital Computer (AREA)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201762550685P | 2017-08-27 | 2017-08-27 | |
PCT/US2018/048197 WO2019067134A1 (en) | 2017-08-27 | 2018-08-27 | Visual communications methods, systems and software |
Publications (2)
Publication Number | Publication Date |
---|---|
EP3673464A1 true EP3673464A1 (en) | 2020-07-01 |
EP3673464A4 EP3673464A4 (en) | 2021-05-26 |
Family
ID=65902263
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP18862646.9A Withdrawn EP3673464A4 (en) | 2017-08-27 | 2018-08-27 | Visual communications methods, systems and software |
Country Status (2)
Country | Link |
---|---|
EP (1) | EP3673464A4 (en) |
WO (1) | WO2019067134A1 (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10853625B2 (en) | 2015-03-21 | 2020-12-01 | Mine One Gmbh | Facial signature methods, systems and software |
EP3274986A4 (en) | 2015-03-21 | 2019-04-17 | Mine One GmbH | Virtual 3d methods, systems and software |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070254640A1 (en) * | 2006-04-27 | 2007-11-01 | Bliss Stephen J | Remote control and viewfinder for mobile camera phone |
US8547374B1 (en) * | 2009-07-24 | 2013-10-01 | Lockheed Martin Corporation | Detection and reconstruction of 3D objects with passive imaging sensors |
US9030530B2 (en) * | 2009-12-15 | 2015-05-12 | Thomson Licensing | Stereo-image quality and disparity/depth indications |
US8259161B1 (en) * | 2012-02-06 | 2012-09-04 | Google Inc. | Method and system for automatic 3-D image creation |
JP2013172190A (en) * | 2012-02-17 | 2013-09-02 | Sony Corp | Image processing device and image processing method and program |
US9519972B2 (en) * | 2013-03-13 | 2016-12-13 | Kip Peli P1 Lp | Systems and methods for synthesizing images from image data captured by an array camera using restricted depth of field depth maps in which depth estimation precision varies |
US9196084B2 (en) * | 2013-03-15 | 2015-11-24 | Urc Ventures Inc. | Determining object volume from mobile device images |
WO2015173173A1 (en) | 2014-05-12 | 2015-11-19 | Dacuda Ag | Method and apparatus for scanning and printing a 3d object |
US9412169B2 (en) * | 2014-11-21 | 2016-08-09 | iProov | Real-time visual feedback for user positioning with respect to a camera and a display |
-
2018
- 2018-08-27 WO PCT/US2018/048197 patent/WO2019067134A1/en unknown
- 2018-08-27 EP EP18862646.9A patent/EP3673464A4/en not_active Withdrawn
Also Published As
Publication number | Publication date |
---|---|
WO2019067134A1 (en) | 2019-04-04 |
EP3673464A4 (en) | 2021-05-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11747893B2 (en) | Visual communications methods, systems and software | |
US11709545B2 (en) | Gaze detection method and apparatus | |
CN106662930B (en) | Techniques for adjusting a perspective of a captured image for display | |
JP6456593B2 (en) | Method and apparatus for generating haptic feedback based on analysis of video content | |
WO2021067044A1 (en) | Systems and methods for video communication using a virtual camera | |
US11809617B2 (en) | Systems and methods for generating dynamic obstacle collision warnings based on detecting poses of users | |
US9325935B2 (en) | Preview window for video communications | |
US20170316582A1 (en) | Robust Head Pose Estimation with a Depth Camera | |
WO2016120806A1 (en) | Method and system for providing virtual display of a physical environment | |
CN113487742A (en) | Method and system for generating three-dimensional model | |
IL288336B1 (en) | Techniques to set focus in camera in a mixed-reality environment with hand gesture interaction | |
EP3460745B1 (en) | Spherical content editing method and electronic device supporting same | |
US10957063B2 (en) | Dynamically modifying virtual and augmented reality content to reduce depth conflict between user interface elements and video content | |
KR102450236B1 (en) | Electronic apparatus, method for controlling thereof and the computer readable recording medium | |
EP3435670A1 (en) | Apparatus and method for generating a tiled three-dimensional image representation of a scene | |
EP3673464A1 (en) | Visual communications methods, systems and software | |
US20200211275A1 (en) | Information processing device, information processing method, and recording medium | |
EP3493541B1 (en) | Selecting an omnidirectional image for display | |
CN113228117A (en) | Authoring apparatus, authoring method, and authoring program | |
US20160274765A1 (en) | Providing a context related view with a wearable apparatus | |
CN107147775B (en) | Information processing method and electronic equipment | |
WO2023279163A1 (en) | Method, program, and system for 3d scanning | |
CN117478931A (en) | Information display method, information display device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE |
|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase |
Free format text: ORIGINAL CODE: 0009012 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE |
|
17P | Request for examination filed |
Effective date: 20200326 |
|
AK | Designated contracting states |
Kind code of ref document: A1 Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
|
AX | Request for extension of the european patent |
Extension state: BA ME |
|
RIN1 | Information on inventor provided before grant (corrected) |
Inventor name: BIRKHOLD, CHRISTOPH Inventor name: MCCOMBE, JAMES, A. Inventor name: SMITH, BRIAN, W. |
|
DAV | Request for validation of the european patent (deleted) | ||
DAX | Request for extension of the european patent (deleted) | ||
RIN1 | Information on inventor provided before grant (corrected) |
Inventor name: BIRKHOLD, CHRISTOPH Inventor name: SMITH, BRIAN, W. Inventor name: MCCOMBE, JAMES, A. |
|
REG | Reference to a national code |
Ref country code: DE Ref legal event code: R079 Free format text: PREVIOUS MAIN CLASS: G06T0015000000 Ipc: G06T0007593000 |
|
A4 | Supplementary search report drawn up and despatched |
Effective date: 20210426 |
|
RIC1 | Information provided on ipc code assigned before grant |
Ipc: G06T 7/593 20170101AFI20210420BHEP Ipc: G06T 15/00 20110101ALI20210420BHEP |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: EXAMINATION IS IN PROGRESS |
|
17Q | First examination report despatched |
Effective date: 20230704 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN |
|
18D | Application deemed to be withdrawn |
Effective date: 20231115 |