CN109564748B - Mixed photon VR/AR system - Google Patents

Mixed photon VR/AR system

Info

Publication number
CN109564748B
Authority
CN
China
Prior art keywords
real
world
image
optical
display
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201780030255.4A
Other languages
Chinese (zh)
Other versions
CN109564748A (en)
Inventor
Sutherland Cook Ellwood (萨瑟兰德·库克·埃尔伍德)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sutherland Cook Ellwood
Original Assignee
Sutherland Cook Ellwood
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US15/457,991 external-priority patent/US9986217B2/en
Priority claimed from US15/457,980 external-priority patent/US20180031763A1/en
Priority claimed from US15/457,967 external-priority patent/US20180035090A1/en
Priority claimed from US15/458,009 external-priority patent/US20180122143A1/en
Application filed by Sutherland Cook Ellwood
Priority to CN202211274817.9A priority Critical patent/CN115547275A/en
Priority claimed from PCT/US2017/022459 external-priority patent/WO2017209829A2/en
Publication of CN109564748A publication Critical patent/CN109564748A/en
Application granted granted Critical
Publication of CN109564748B publication Critical patent/CN109564748B/en

Classifications

    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G5/00 Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
    • G PHYSICS
    • G02 OPTICS
    • G02B OPTICAL ELEMENTS, SYSTEMS OR APPARATUS
    • G02B27/00 Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
    • G02B27/01 Head-up displays
    • G02B27/0101 Head-up displays characterised by optical features
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N9/00 Details of colour television systems
    • H04N9/12 Picture reproducers
    • H04N9/31 Projection devices for colour picture display, e.g. using electronic spatial light modulators [ESLM]
    • H04N9/3179 Video signal processing therefor
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N9/00 Details of colour television systems
    • H04N9/64 Circuits for processing colour signals

Abstract

A VR/AR system, method, and architecture including an augmentor that simultaneously receives and processes real-world image component signals while generating synthetic-world image component signals, and then interleaves/augments these signals for further processing. In some embodiments, real-world signals (with the option of pass-through processing by the augmentor) are converted to IR (using, for example, false-color maps) and interleaved with synthetic-world signals (produced in IR) for continued processing, including visualization (conversion to the visible spectrum), amplitude/bandwidth processing, and output shaping, to generate a set of display image precursors intended for the HVS (human visual system).
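By way of illustration only, the flow described above can be sketched as a short pipeline. The function names, LUT-based spectral mappings, and array conventions below are assumptions made for this example and are not taken from the claims or the specification.

import numpy as np

# Assumed conventions for the sketch: frames are arrays of small integer code values
# (e.g., uint8), and the false-color / visible-spectrum mappings are 1-D look-up
# tables of integer codes indexed by pixel value.

def to_ir(real_frames: np.ndarray, false_color_lut: np.ndarray) -> np.ndarray:
    """Real-world branch: remap captured values into the IR working representation
    (pass-through processing would simply return real_frames unchanged)."""
    return false_color_lut[real_frames]

def interleave(real_ir: np.ndarray, synthetic_ir: np.ndarray) -> np.ndarray:
    """Alternate real and synthetic IR frames into a single stream for joint processing."""
    return np.stack([real_ir, synthetic_ir], axis=1).reshape(-1, *real_ir.shape[1:])

def visualize(ir_stream: np.ndarray, visible_lut: np.ndarray) -> np.ndarray:
    """Convert the interleaved IR stream back into visible-spectrum values."""
    return visible_lut[ir_stream]

def shape_output(visible_stream: np.ndarray, gain: float = 1.0) -> np.ndarray:
    """Amplitude/bandwidth processing and output shaping into display image precursors."""
    return np.clip(visible_stream * gain, 0, 255).astype(np.uint8)

# Display image precursors intended for the HVS:
# precursors = shape_output(visualize(interleave(to_ir(real, fc_lut), synth_ir), vis_lut))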

Description

Mixed photon VR/AR system
Cross Reference to Related Applications
This application claims the benefit of U.S. patent application Nos. 15/457,967, 15/457,980, 15/457,991, and 15/458,009, filed on March 13, 2017, and claims the benefit of U.S. provisional patent application Nos. 62/308,825, 62/308,361, 62/308,585, and 62/308,687, filed on March 15, 2016, and is related to U.S. patent application Nos. 12/371,461, 62/181,143, and 62/234,942, the entire contents of which are expressly incorporated herein by reference for all purposes.
Technical Field
The present invention relates generally to video and digital image and data processing and to the devices and networks that generate, transmit, convert, distribute, store and display such data, as well as to non-video and non-pixel data processing in arrays, such as sensing arrays and spatial light modulators, and to data applications and uses therefor; more particularly, but not exclusively, it relates to digital video image displays, whether flat-screen, flexible-screen, 2D or 3D, or projected-image; to non-display data processing by device arrays, and to the spatial form and positioning of such processing for its organization, including in compact devices such as flat panel televisions and consumer mobile devices; and to data networks that provide capture, transmission, distribution, segmentation, organization, storage, delivery, display and projection of pixel signals or data signals, or aggregations or sets thereof.
Background
The subject matter discussed in the background section should not be assumed to be prior art merely as a result of its mention in the background section. Similarly, it should not be assumed that the problems mentioned in the background section, or associated with its subject matter, have previously been recognized in the prior art. The subject matter of the background section merely represents different approaches, which may themselves also be inventions.
The field of the invention is not a single field; it combines two related fields, augmented reality and virtual reality, and it proposes and provides an integrated mobile-device solution to the key problems and limitations of the prior art in both. A brief review of the background of these related arts will clarify the problems and limitations to be solved and lay the foundation for the solution proposed in the present disclosure.
Two standard dictionary definitions of these terms are as follows:
Virtual reality: "A realistic simulation of an environment, including three-dimensional graphics, by a computer system using interactive software and hardware. Abbreviation: VR."
Augmented reality: "An enhanced image or environment as viewed on a screen or other display, produced by overlaying computer-generated images, sounds, or other data on a real-world environment." And: "A system or technology used to produce such an enhanced environment. Abbreviation: AR."
It is clear from these definitions, although they are non-technical, and to those skilled in the relevant arts, that the essential difference lies in whether the simulation is complete and immersive, excluding even a partial direct view of reality, or whether the simulated elements are superimposed on an otherwise clear, unobstructed view of reality.
Slightly more technical definitions are provided under the Wikipedia entries on these subjects, which, given the depth and breadth of contributions to their editing, may be considered fairly representative of the field.
Virtual Reality (VR), sometimes referred to as immersive multimedia, is a computer-simulated environment that can simulate physical presence in places in the real world or in imagined worlds. Virtual reality can recreate sensory experiences, including virtual taste, sight, smell, sound, touch, and the like.
Augmented Reality (AR) is a real-time direct or indirect view of a physical, real-world environment, whose elements are augmented (or supplemented) by computer-generated sensory inputs such as sound, video, graphics, or GPS data.
Inherent, but only implicit, in these definitions is the basic property of a moving viewpoint. Virtual or augmented reality differs from the more general category of computer simulation in that simulated or mixed (augmented or "mediated") reality imaging is "true in time", whether or not it is combined, fused, synthesized or integrated with "real-time", "direct" imaging of reality (local or remote); that is, the point of view of the simulation moves with the viewer as the viewer moves through the real world.
The present disclosure proposes that such a more precise definition is needed to distinguish between stationary navigation of an immersive, experiential simulated world (a simulator) and mobile navigation of a simulated world (virtual reality). A sub-category of simulator would then be the "personal simulator", or at most "partial virtual reality", in which a stationary user is equipped with an immersive HMD (head-mounted display) and a haptic interface (e.g., motion-tracking gloves), enabling "virtual-reality-like" navigation of parts of the simulated world.
A CAVE system, on the other hand, would arguably be defined as a limited virtual reality system, since ambulatory navigation is possible only within the dimensions of the CAVE itself, and it becomes another form of "partial virtual reality" once the limits of the CAVE are reached.
Note the difference between a "moving" viewpoint and a "movable" viewpoint. Computer simulations, such as video games, are simulated worlds or "realities", but unless the viewer of the simulated world moves himself, or directs another person or a robot that moves, the most that can be said is that the simulated world is "navigable" (although one of the primary achievements of computer graphics over the past forty years has been simply "building" simulated environments that can be explored in software).
For a simulation to be virtual or mixed (the author's term of choice) reality, the essential and most characteristic feature is a mapping of the simulation (whether fully synthetic or mixed) to real space. Such real space may be as basic as a laboratory or a room on a sound stage, simply gridded, mapped, and calibrated to the simulated world at some scale.
This distinction is not evaluative: a stationary VR (whether natural, artificial, or mixed) that provides real-time natural interfaces (head tracking, haptics, audio, etc.), simulating physical interaction and providing sensory immersion without motion over, or mapping to, actual real terrain, is not somehow fundamentally less worthwhile. But such a VR system is by definition "stationary" unless there is a feedback system for the feet or, more generally, a whole-body, range-of-motion feedback system, and/or a dynamically deformable mechanical interface/interactive surface that supports the user's simulated but (in that sense real) whole-body motion over any terrain, from any resting posture (whether standing, sitting, or reclining).
However, absent such an ideal whole-body physical interface/feedback system, restricting VR to its "full", fully mobile form would limit the topography of the VR world to what can be found, modified, or built from scratch in the real world. Such a restriction would typically severely limit the scope and capability of the virtual reality experience.
However, as will become apparent in the instant disclosure, this is a distinction that makes a difference: it frames how existing VR and AR systems differ and where their limitations lie, and it provides background that informs the teachings of the present disclosure.
Having established the missing but essential feature required for a simulation to qualify as full "virtual reality", the next step is to determine how the implicit requirement of a "moving viewpoint" is to be implemented. Providing a moving view of the simulation requires two components (each itself realized by a combination of hardware and software): a mobile image display device through which the simulation can be viewed, and a motion-tracking capability that can track the motion of the device containing that display in three axes of motion. This means measuring the position of the three-dimensional viewing device over time from a minimum of three tracking points (two, if the device is mapped and measured such that a third position on a third axis can be inferred), relative to a three-axis frame of reference. That frame may be any arbitrary 3D frame of reference mapped to real space, although for the practical purpose of mechanically navigating the space, two axes will form a plane that is the gravity-level ground plane, with the third axis, Z, perpendicular to that ground plane.
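As a minimal illustration of this tracking requirement (with assumed names and conventions; nothing below is taken from the disclosure), the pose of a viewing device can be recovered from three non-collinear tracked points expressed in a gravity-levelled ground-plane frame:

import numpy as np

def device_pose(p0, p1, p2):
    """Pose of a rigid viewing device from three non-collinear tracked points,
    given in world coordinates (X/Y in the ground plane, Z up)."""
    p0, p1, p2 = (np.asarray(p, dtype=float) for p in (p0, p1, p2))
    origin = (p0 + p1 + p2) / 3.0            # device position: centroid of the markers
    x = p1 - p0
    x /= np.linalg.norm(x)                   # first device axis
    n = np.cross(p1 - p0, p2 - p0)
    n /= np.linalg.norm(n)                   # axis normal to the marker plane
    y = np.cross(n, x)                       # completes a right-handed basis
    R = np.column_stack((x, y, n))           # world-from-device rotation
    return origin, R

# Sampling this pose over time yields the moving viewpoint that drives the
# simulated (or mixed) view.
position, orientation = device_pose([0.0, 0.0, 1.6], [0.1, 0.0, 1.6], [0.0, 0.1, 1.7])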
Solutions that achieve such position and orientation determination as a function of time, accurately and at a practical update rate, require a combination of sensors and software, and advances in these solutions have been a major vector of development in the field of mobile VR and AR hardware/software viewing devices and systems.
These fields are recent enough, in terms of the time elapsed between the earliest experiments and current practical technologies and products, that it is feasible to document both the origins and the subsequent and current state of the art of both types of mobile visual simulation system. Beyond the specific innovations of the prior art, that history matters to the development of the present disclosure, whether because of differences or similarities that better explain the existing problems in the field, or because of the ways the solutions of the present disclosure depart from the prior art.
The period of greatest innovation in the related simulation/simulator, VR, and AR fields spans from the end of 1968 through the nineties, during which many of the key problems in achieving practical VR and AR found initial or partial solutions.
The pioneering experiments and experimental head-mounted display systems of Ivan Sutherland and his assistant Bob Sproull, beginning in 1968, are generally regarded as the originating milestone of these related fields, although earlier work (essentially conceptual development) preceded that first experimental implementation of any form of immersive, navigable AR/VR.
The birth of the stationary simulator system dates back to the addition of computer-generated imagery to the flight simulator, generally held to have begun in the mid-to-late 1960s. This was limited to the use of CRTs, displaying fully focused images at the user's distance from the CRT, until 1972, when Singer-Link introduced a collimated projection system that projected an image focused at infinity through a beam-splitter system, improving the field of view to about 25-35 degrees per unit (up to 100 degrees using three units in a single flight simulator).
This benchmark was not improved upon until 1982, when Rediffusion introduced a wide-field-of-view system, the wide-angle infinity display system, which achieved a 150-degree and ultimately a 240-degree FOV by using multiple projectors and a large, curved collimating screen. It is at this stage that the stationary simulator can be said to have finally achieved a significant degree of the true immersion of virtual reality, without requiring an HMD to isolate the viewer and eliminate competing peripheral visual cues.
But while Singer-Link was advancing collimated screen systems for simulators, the first, very limited, practical helmet-mounted displays were developed for military use as an advance toward the VR-type experience, with reticle-based electronic aiming systems integrated with motion tracking of the helmet itself. These initial developments are generally credited, in preliminary form, to the South African Air Force in the 1970s (followed by the Israeli Air Force by the mid-seventies), and can be said to mark the beginning of rudimentary AR or mediated/mixed-reality systems.
These early helmet-mounted systems, minimal in their graphics but pioneering nonetheless, implemented a limited synthesis of position-coordinated target information and user-actuated, motion-tracked targeting superimposed on a reticle. They were followed by Steve Mann's invention of the first "mediated reality" mobile see-through system, the first-generation "EyeTap", with graphics superimposed on eyeglasses.
Later versions by Mann employed an optical compositing system, based on splitter/combiner optics, that merged real and processed images. This work preceded that of Chunyu Gao and Augmented Vision Inc., which essentially proposed a doubled Mann system that optically combines a processed real image with a generated image, completing both the processed real image and the electronically generated image. The true see-through image is preserved in the Mann system, but in the Gao system all see-through imagery is processed, eliminating any direct see-through image, even as an option. (Chunyu Gao, U.S. patent application publication 20140177023, filed April 13, 2013.) The "folded optical path" structures and methods specified by the Gao system can be found in other optical HMD systems.
By 1985, Jaron Lanier and VPL Research had been established to develop HMDs and the "data glove", so that by the end of the 1980s Mann, Lanier, and Rediffusion represented three major development paths for VR and AR simulation in a very active field, having made critical advances and established basic solution types that, in most cases, still define the state of the art.
Growing sophistication of computer-generated imagery (CGI), continued improvement of gaming platforms (hardware and software) with real-time interactive CG technology, greater integration among multiple systems, and expanding mobility for AR and, in more limited form, VR were among the major trends of the 1990s.
The CAVE system, developed by the Electronic Visualization Laboratory at the University of Illinois at Chicago and first presented in 1992, proposed a limited form of mobile VR and a new kind of simulator (Carolina Cruz-Neira, Daniel J. Sandin, Thomas A. DeFanti, Robert V. Kenyon and John C. Hart, "The CAVE: Audio Visual Experience Automatic Virtual Environment", Communications of the ACM, vol. 35(6), 1992, pp. 64-72). In addition to Lanier's HMD/data-glove combination, the CAVE combined a WFOV simulator "stage" with a haptic interface.
At about the same time, Louis Rosenberg developed a stationary, local AR system, "Virtual Fixtures" (1992), at the U.S. Air Force's Armstrong research laboratory, and Jonathan Waldern's stationary "Virtuality" VR systems, in initial development from roughly 1985 to 1990, saw their first commercial launch in 1992.
Integration of mobile AR into a multi-unit mobile-vehicle "wargame simulation" system, combining real and virtual vehicles in "augmented simulation" ("AUGSIM"), saw its next major advance in the work of Loral WDL, demonstrated to the industry in 1993. Project participant Jon Barrilleaux of Peculiar Technologies subsequently wrote "Experiences and Observations in Applying Augmented Reality to Live Training" in 1999, reviewing the findings of the 1995 SBIR final report and pointing out the persistent problems that have faced mobile VR and (mobile) AR to this day:
Tracking in AR vs. VR
In general, commercial products developed for VR have good resolution but lack the absolute accuracy and wide-area coverage needed for AR, much less for use in AUGSIM.
VR applications, with users immersed in synthetic environments, are more concerned with relative tracking than with absolute accuracy. Since the user's world is completely synthetic and self-consistent, the fact that his/her head just turned 0.1 degrees is far more important than knowing that it is now pointing within 10 degrees of north.
AR systems such as AUGSIM have no such latitude. AR tracking must have good resolution, so that virtual elements appear to move smoothly in the real world as the user's head turns or the vehicle moves, and it must have good accuracy, so that virtual elements are properly overlaid on, and occluded by, objects in the real world.
As computing and network speeds continued to improve through the nineties, new projects for outdoor AR systems were initiated, including the U.S. Naval Research Laboratory's BARS system ("BARS: Battlefield Augmented Reality System", Simon Julier, Yohan Baillot, Marco Lanzagorta, Dennis Brown, Lawrence Rosenblum; NATO Symposium on Information Processing Techniques for Military Systems, 2000). Abstract: "The system consists of a wearable computer, a wireless network system and a tracked see-through Head Mounted Display (HMD). The user's perception of the environment is enhanced by superimposing graphics onto the user's field of view. The graphics are registered (aligned) with the actual environment."
Non-military development was also under way, including the work of Hirokazu Kato of the Nara Institute of Science and Technology on ARToolKit, later released and further developed at HITLab, which introduced a software development kit and protocols for viewpoint tracking and virtual-object tracking.
These milestones are generally considered the most important of the period, although other researchers and companies were also active in the area.
While military funding for the large-scale development and testing of AR for training simulation is well documented, other notable system-level design and demonstration efforts proceeded in parallel with the military-funded research.
The most important of these non-military efforts was ARQuake, an AR version of the video game Quake, initiated and led by Bruce Thomas at the Wearable Computer Lab of the University of South Australia and published as "ARQuake: An Outdoor/Indoor Augmented Reality First Person Application", 4th International Symposium on Wearable Computers, pp. 139-146, Atlanta, GA, Oct. 2000 (Thomas, B., Close, B., Donoghue, J., Squires, J., De Bondi, P., Morris, M., and Piekarski, W.). Abstract: "We propose an architecture for a low-cost, moderate-accuracy, six-degrees-of-freedom tracking system based on GPS, a digital compass, and vision-based fiducial tracking."
Another system whose design and development began in 1995 is the one developed by the authors of this disclosure. The original objective was a hybrid of outdoor AR and television programming, known as "Everquest Live", further developed in the late nineties; its basic elements were completed in 1999, when a commercial effort was initiated to fund the original video game/TV hybrid, with another version later included for high-end theme-resort development. By 2001 it had been disclosed, on a confidential basis, to companies including those of Ridley and Tony Scott, and in particular to the executives overseeing the business of their joint venture (whose other partners included Renny Harlin, Jean Giraud and the European Heavy Metal), bringing to them the then-current "Otherworld" and "Otherworld Industries" projects and their inaugural investments, as an investment and proposed joint venture in collaboration with ATP.
The following is a summary of the system design and components ultimately determined in 1999/2000:
Selected from "Otherworld Industries business planning documents" (archived file version, 2003):
Technical background: proprietary integration of prior art "open field" simulation and mobile virtual reality: tools, facilities and techniques.
This is only a partial listing and overview of the relevant technologies, which together form the backbone of a proprietary system. Some technical components are proprietary and some come from outside vendors, but the unique system incorporating the proven components would be absolutely proprietary, and revolutionary:
Interacting with the VR-ALTERED WORLD:
1) Mobile, military-grade VR equipment for immersing guests/participants and actors in the VR-enhanced landscape of the Otherworld. While their "adventure" (i.e., their every movement as they explore the Otherworld resort) is captured in real time by mobile motion-capture sensors and digital cameras (with automatic masking techniques), guests/players and employees/actors see each other, and the superimposed computer-simulated imagery, through their visors. The visor is either a binocular semi-transparent flat panel display or a binocular opaque flat panel display with binocular cameras attached to the front.
These "synthetic elements" superimposed on the field of view by the flat panel display may comprise altered portions of the landscape (or the entire landscape, digitally modified). In fact, the "synthetic" landscape segments that replace the real ones are generated from an original 3D photographic "capture" of the corresponding segments of the resort (see #7 below). As accurate, photographically based, geometrically correct "virtual spaces" in the computer, they can be digitally modified in any way while maintaining the photo-realistic quality and geometric/spatial precision of the original capture. This allows an accurate combination of live digital photography and modified digital portions of the same space.
Other "synthetic elements" superimposed by the flat panel display include computer-generated or computer-modified humans, creatures, atmospheric FX, and "magic". These appear as real elements of the field of view seen through the display (transparent or opaque).
By using positioning data, the motion-capture data for the guests/players and employees/actors, and real-time masking of them by multiple digital cameras, all calibrated to the previously "captured" version of each area of the resort (see #4 and #5 below), the synthetic elements can be matched absolutely precisely, in real time, to the real elements presented by the display.
Thus, a photo-realistic computer-generated dragon would appear to pass behind a real tree, come back around it, then fly up and land on top of the resort's real castle; the dragon could then "breathe" computer-generated flame. In the flat panel display (translucent or opaque), the flame appears to "scorch" the upper portion of the castle. This works because the upper part of the castle has been "masked" in the visor by a computer-modified version of the 3D "capture" of the castle held in the system's files.
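A minimal sketch of this kind of mask-driven compositing, under assumed array shapes and function names (none of which appear in the planning documents), might look like the following:

import numpy as np

def composite(live_frame, synthetic_layer, synthetic_alpha, occluder_mask):
    """
    live_frame:      HxWx3 float array, the real camera view.
    synthetic_layer: HxWx3 float array, the rendered synthetic element (e.g., the dragon).
    synthetic_alpha: HxW float array in [0, 1], coverage of the synthetic element.
    occluder_mask:   HxW bool array, True where a real object (e.g., the castle top,
                     known from the calibrated 3D capture) should occlude the element.
    """
    alpha = synthetic_alpha * (~occluder_mask)    # drop synthetic pixels behind real occluders
    alpha = alpha[..., None]                      # broadcast over the color channels
    return synthetic_layer * alpha + live_frame * (1.0 - alpha)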
2) Physical electro-optical-mechanical equipment for combat between real people and virtual people, creatures, and FX. The "haptic" interface provides motion-sensor and other data, as well as vibration and resistance feedback, allowing real humans to interact with virtual humans, creatures, and magic in real time. For example, a haptic device in the form of a "prop" sword hilt provides data when a guest/player swings it, and physical feedback when the guest/player parries a "blow" from a virtual attacking creature, to achieve the illusion of combat. All of this is combined in real time and displayed via the binocular flat panel display.
3) Open-field motion capture. Mobile and fixed motion-capture equipment (similar to the equipment used for The Matrix films) is deployed throughout the resort. Data points on the "gear" worn by guests/players and employees/actors are tracked by cameras and/or sensors to provide motion data for interaction with the virtual elements in the field of view displayed on the binocular flat panels in the VR visor.
The motion-capture data output enables (given sufficient computational rendering capacity and the use of motion editing and motion libraries) CGI-modified versions of the guests/players and employees/actors, on the principle of the Gollum character in the second and third "The Lord of the Rings" films.
4) Augmentation of motion capture data with LAAS and GPS data, live laser ranging data, and triangulation techniques (including from Moller Aerobot UAV). The additional "positioning data" allows for more efficient (and error-corrected) integration of live and synthetic elements.
Press release from drone manufacturer:
July 17. One week ago, Honeywell was contracted for the initial network of Local Area Augmentation System (LAAS) stations, some of which are already running. The system can guide aircraft (and helicopters) to land at airports with accuracy down to inches. LAAS systems are expected to enter service in 2006.
5) Automatic real-time masking for open-field "play". In conjunction with the motion-capture data that allows interaction with the simulated elements, digital imaging of the resort guests/participants with P24 (or equivalent) digital cameras, running proprietary automation software, automatically isolates (masks) the appropriate elements of the field of view for integration with the synthetic elements. This technique becomes part of the suite for ensuring proper foreground/background separation when superimposing digital elements.
6) Military-grade simulation hardware and technology combined with state-of-the-art game engine software. Haptic devices for interacting with "synthetic" elements, such as the prop sword, together with data from the motion-capture systems and the synthetic and live elements themselves (masked or whole), are integrated through military simulation software and game engine software.
These software components provide the AI code to animate synthetic people and creatures (AI, or artificial intelligence, software such as the Massive software used to animate armies in "The Lord of the Rings" films), generate realistic water, clouds, fire, etc., and integrate and combine all the elements, just as computer games and military simulation software do.
7) Photography-based capture of real locations to create photo-realistic digital virtual sets using the image-based techniques pioneered by Paul Debevec (the basis for the "bullet time" FX in the movie The Matrix).
The "base" virtual locations (interiors and exteriors of the resort) are indistinguishable from the real thing, since they derive from photographs of the real location under its real illumination at the time of "capture". A small set of high-quality digital images, combined with light-probe data and laser range-finding data and appropriate "image-based" graphics software, is all that is required to reconstruct in the computer a photo-realistic virtual 3D space that exactly matches the original.
Although the "virtual sets" are captured from real locations inside the castle and outside in the surrounding village, once these "base" or default versions, with the lighting parameters and all other data from the exact time of the initial capture, are digitized, they can be modified: the lighting can be changed, elements not present in the real world can be added, and existing elements can be modified and "dressed" to create a fantasy version of our world.
A calibration procedure occurs when a guest/player or employee/actor passes through a "portal" at various points in the resort (a "portal" being, in effect, a "crossing point" from our world into the "Otherworld"). At that moment, the positioning data from the guest/player or employee/actor at the "portal" is used to "lock" the virtual space in the computer to the coordinates of the "portal". The computer "knows" the coordinates of the portal points within its virtual version of the entire resort, obtained through the image-based "capture" process described above.
The computer can therefore "line up" its virtual resort with what the guest/player or employee/actor would see before putting on the VR visor. Thus, with the semi-transparent version of the binocular flat panel display, if the virtual version is overlaid on the real resort, one world matches the other very precisely.
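The "locking" step described above can be illustrated with a small sketch; the function and variable names, and the rigid-transform formulation, are assumptions made for this example rather than anything specified in the plan:

import numpy as np

def lock_worlds(R_portal_virtual, t_portal_virtual, R_portal_real, t_portal_real):
    """Return (R, t) such that x_real = R @ x_virtual + t, from the portal's pose in
    the captured virtual model and its measured pose in the real resort."""
    R = R_portal_real @ R_portal_virtual.T
    t = t_portal_real - R @ t_portal_virtual
    return R, t

# Example: portal at the virtual model's origin, measured in the real world at (10, 5, 0).
R_v, t_v = np.eye(3), np.zeros(3)
R_r, t_r = np.eye(3), np.array([10.0, 5.0, 0.0])
R, t = lock_worlds(R_v, t_v, R_r, t_r)
castle_tower_virtual = np.array([2.0, 0.0, 12.0])
castle_tower_real = R @ castle_tower_virtual + t   # where the overlay must be drawn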
Alternatively, using the "opaque" binocular flat panel display visor or helmet, the wearer can walk confidently while wearing the helmet, seeing only the virtual version of the resort in front of him, since the view of the virtual world exactly matches the terrain he is actually walking.
Of course, what can be shown to him through the visor may be a sky altered to red, a boiling storm cloud that does not actually exist, and a dragon perched atop the castle parapet, setting the castle "on fire".
And 1,000 troops of man-eating creatures charging down a distant mountain!
8) A supercomputer rendering and simulation facility for the resort. A key resource enabling very high quality, near-feature-film-quality simulation would be supercomputer rendering and simulation at each resort complex site.
The improvements in graphics and gameplay for computer games played on stand-alone consoles (PlayStation 2, Xbox, GameCube) and desktop computers are well known.
Those improvements in the gaming experience, however, are premised on the processor and support systems of a single console or personal computer. Now envision a gaming experience supported by the capabilities of a supercomputing center. That alone is a huge leap in the quality of graphics and gameplay, and it is just one aspect of the mobile VR adventure that the Otherworld experience will be.
As will be apparent from a review of the foregoing, and as will be evident to those skilled in VR, AR, and the broader simulation arts, any individual hardware or software system proposed as an improvement to the prior art must take the broader system parameters into account, and must specify its assumptions about those parameters, if it is to be properly assessed.
The essence of the present proposal, therefore, while its emphasis is on hardware technology systems belonging to the portable AR and VR categories, and indeed a fusion of the two, most preferably wearable and, in the preferred wearable version, HMD technology, is that a superior solution is only complete when it considers, or reconsiders, the whole system to which it belongs. Hence the need to present the larger history of VR, AR, and simulation systems: proposals for new HMD technology, and the trend of commercial products, have too often been framed too narrowly, neither taking into account nor re-examining the assumptions, requirements, and new possibilities at the system level.
A similarly detailed historical review of the major milestones in HMD technology development is not necessary; the broader history needed review at the system level in order to provide a framework from which the limitations of the prior art, and the current state of the art in HMDs, can be explained, along with the rationale for the solution proposed to address the problems identified.
A treatment sufficient for understanding and identifying the limitations of the prior art in HMDs begins with the following.
In the category of head-mounted displays (which, for the purposes of this disclosure, includes helmet-mounted displays), two main sub-types have been identified so far, the VR HMD and the AR HMD, following the definitions already provided herein; and within the AR HMD category, two further classes distinguish whether the type is "video see-through" or "optical see-through" (the latter more commonly referred to simply as an "optical HMD").
In a VR HMD, the user views a single panel or two separate displays. The typical form of such an HMD is that of a visor or face mask, although many VR HMDs have the appearance of a welding helmet with a bulky, closed visor. To ensure optimal video quality, immersion, and freedom from interference, such systems are completely enclosed, with light-absorbing material around the perimeter of the display.
The authors of the present disclosure previously proposed two types of VR HMD in the incorporated U.S. provisional application "SYSTEM, METHOD AND COMPUTER PROGRAM PRODUCT FOR MAGNETO-OPTICAL DEVICE DISPLAY". One simply proposes replacing the conventional direct-view LCD with the wafer-type embodiment of the main subject of that application, the first practical magneto-optical display, whose superior performance characteristics include an extremely high frame rate, among other advantages that improve display technology generally and, in that embodiment, yield an improved VR HMD.
The second version contemplates a new remotely generated image display, according to the teachings of that disclosure: an image generated, for example, in a vehicle cockpit, transmitted over a fiber-optic bundle, and then distributed through a special fiber-optic array structure (structures and methods disclosed in that application), drawing on experience with the new methods and structures for fiber-optic faceplates for remote image transmission through optical fibers.
Although the core MO technology was originally commercialized not for HMDs but for projection systems, these developments are relevant to certain aspects of the present proposal and, moreover, are not generally known in the art. In particular, the second version discloses, earlier than other more recent proposals, a method of using optical fibers to transmit video images from image engines that are not integrated into, or located near, the HMD optics.
Outside of a tightly controlled stage environment with a flat floor, a key consideration for the utility of a fully enclosed VR HMD for mobile use is that, for safety of movement, the virtual world being navigated must be mapped 1:1, within the tolerances of safe human movement, to the real surface topography or path of motion.
However, as observed and summarized by Barrilleaux of Loral WDL, by the BARS developers, and by other researchers in this field over the last quarter century, a practical AR system must obtain a very close correspondence between the virtual (synthetic, CG-generated imagery) and real-world terrain and built environments, including (unsurprisingly, since the military was developing systems for urban warfare) the geometry of moving vehicles.
Thus, more generally, for VR or AR to be realized in mobile form, there must be a 1:1 mapping between the virtual and the real, in the same manner as above.
Within the category of AR HMDs, the distinction between "video see-through" and "optical see-through" is the distinction between, on the one hand, the user viewing directly through a transparent or translucent pixel array and display disposed immediately in front of the viewer as part of the eyeglass optics themselves, and, on the other hand, viewing through a translucent projected image, also disposed immediately in front of the viewer on an optical element, typically directly adjacent, that is generated by a microdisplay and relayed optically to the facing optic.
The major, and perhaps only partially practical, type of direct-view transparent or translucent display system has (historically) been the LCD without an illuminating backplane; thus, in particular, AR video see-through glasses possess one or more viewing optics, including a transparent optical substrate, on which an array of LCD light-modulating pixels has been fabricated.
For applications like the original Mann "EyeTap", where text/data is displayed or projected directly on the facing optic, there is no need to calibrate to real-world terrain and objects, although some degree of positional correlation facilitates contextual "tagging" of items in the field of view with informational text. This is the primary purpose demonstrated for the Google Glass product but, as of the drafting of the present disclosure, many developers have focused on developing AR-type applications that go beyond text superimposed on a live scene.
Beyond coarse, approximate positional correlation in an approximate 2D plane or a rough viewing cone, the main problem with such "calibration" to terrain or objects in the user's field of view, for either a video or an optical see-through system, is determining the relative positions of objects in the viewer's surroundings. Without reference data and/or substantially real-time spatial location data and a 3D mapping of the local environment, perspective and relative size cannot be computed without significant inconsistencies.
In addition to relative size, a key aspect of perspective from any viewpoint is realistic lighting/shading, including cast shadows, which depend on the lighting direction. Finally, occlusion of objects from any given viewing position is a key optical property for the perception of perspective and of relative distance and position.
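A minimal sketch of how occlusion and relative size follow from such 3D data, using assumed names and a simple pinhole model (not anything specified in this disclosure), is:

import numpy as np

def occlusion_mask(virtual_depth, real_depth):
    """
    virtual_depth: HxW float array, distance from the eye to the virtual surface
                   (np.inf where no virtual content exists).
    real_depth:    HxW float array, distance to the real surface along the same rays,
                   from the 3D map or a real-time depth source.
    Returns an HxW bool array: True where the virtual content should be visible.
    """
    return virtual_depth < real_depth

# Relative size follows from the same geometry: under a pinhole projection, the
# image-plane size of an object of physical height h at distance d scales as h / d.
def apparent_height(physical_height_m, distance_m, focal_length_px):
    return focal_length_px * physical_height_m / distance_m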
The question of how such data is provided cannot be set aside when designing a video see-through or optical see-through HMD, whether the data serves AR proper or indeed mobile VR, spatial awareness of the wearer's surroundings, safe movement, or wayfinding. Are these data provided externally, locally, or from multiple sources? If they are partly local to the HMD and partly external, what impact does that have on the design and performance of the HMD system as a whole? How, if at all, does this question affect the choice between video and optical see-through (considering weight, balance, volume, data-processing requirements, lag between components, and the other parameters that influence and are influenced by it), and the choice of display and of specific optical components?
Among the technical parameters and problems addressed over the evolution of VR HMDs are increased field of view; reduced latency (the lag between the motion-tracking sensor and the change in the virtual view); improved resolution, frame rate, dynamic range/contrast, and other general display-quality characteristics; and weight, balance, volume, and general ergonomics. Image collimation and other display-optics details have improved, effectively addressing the problem of "simulator sickness", a major problem in the early days.
As these general classes of technology have improved, the weight and volume of the displays, optics, and other electronics have tended to decrease, improving weight, size/volume, and balance.
Fixed VR-type devices are commonly used in night-vision systems in vehicles, including aircraft; mobile night-vision goggles, however, can be considered an intermediate form of viewing similar to mobile VR, in that the wearer views a substantially real-time real scene (IR imaging), but through a video screen rather than "see-through".
This subtype is similar to what Barrilleaux, in the 1999 review already cited, defined as an "indirect-view display". He offers that definition for a proposed AR HMD in which there is no actual "see-through", only a merged/processed real/virtual image on a display, presumably in an enclosure of the kind used by any VR-type or night-vision system.
However, a night-vision system is not a fusion or mixing of a virtual, synthetic landscape with reality; it is a directly relayed video image of IR sensor data, interpreted by video signal processing as a monochrome image whose intensities follow the intensity of the IR signal. As a video image, it does lend itself to real-time text/graphics overlays, in the same way as the simple form of the EyeTap as originally conceived, and as Google has stated is the primary purpose of its eyewear product.
The question of how, and from what, data is derived live or provided from reference sources (or both) for a mobile VR or mobile AR system, or now for this hybrid "indirect-view display" with its live, processed video feed and its similarities to both categories, so that a combined view can effectively integrate the virtual and real landscapes with consistent cues, is a design parameter and issue that must be considered in the design of any new and improved mobile HMD system, regardless of its type.
Software and data processing for AR have evolved to address these issues, building on the early work of the system developers already cited. An example is the work of Matsui and Suzuki of Canon Corporation, as disclosed in their U.S. patent application "Mixed reality space image generation method and mixed reality system" (U.S. patent application Ser. No. 10/951,684, filed September 29, 2004; U.S. publication No. 20050179617, now U.S. Pat. No. 7,589,747). The abstract reads as follows:
"A mixed reality space image generating apparatus for generating a mixed reality space image formed by superimposing a virtual space image on a real space image obtained by capturing a real space includes an image synthesizing unit (109) that superimposes the virtual space image to be displayed on the real space image while taking into account occlusion of the virtual space image by objects in the real space, and an annotation generating unit (108) that further superimposes an image to be displayed without taking any occlusion of the virtual space image into account. In this way, a mixed reality space image that achieves both natural display and convenient display can be generated."
The purpose of this system is to enable a fully rendered industrial product (e.g., a camera) to be superimposed on a physical mock-up (a stand-in prop); a pair of optical see-through HMD glasses and the mock-up are each equipped with a position sensor. A real-time, pixel-by-pixel look-up comparison process is used to mask out pixels belonging to the physical mock-up so that the CG-generated virtual model can be superimposed on the composited video feed (with a buffering delay to achieve the slightly delayed layering). The system also adds annotation graphics and computer imagery. The basic sources of data for determining the mask, and thus ensuring correct, error-free occlusion in the composite, are the motion sensors on the mock-up and a predetermined look-up table against which pixels are compared to pull the hand mask and the mock-up mask.
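Purely as an illustration of the kind of look-up-table pixel comparison described, and not of Canon's actual implementation, a mask might be pulled as follows (quantization, names, and data layout are assumptions):

import numpy as np

BINS = 32  # assumed quantization per color channel; the LUT has BINS**3 entries

def build_lut(sample_pixels):
    """sample_pixels: Nx3 uint8 array of colors known to belong to the masked class
    (e.g., skin tones for the hand mask, or the mock-up's surface color)."""
    lut = np.zeros((BINS, BINS, BINS), dtype=bool)
    idx = (sample_pixels.astype(np.uint16) * BINS) // 256
    lut[idx[:, 0], idx[:, 1], idx[:, 2]] = True
    return lut

def pull_mask(frame, lut):
    """frame: HxWx3 uint8 image. Returns an HxW bool mask of pixels in the LUT class."""
    idx = (frame.astype(np.uint16) * BINS) // 256
    return lut[idx[..., 0], idx[..., 1], idx[..., 2]]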
While this system does not generalize to mobile AR, VR, or any hybrid of the two, it is an example of an attempt to provide a simple, though not fully automated, system for correctly analyzing real 3D space and placing virtual objects in proper perspective.
In the field of video or optical see-through HMDs, little progress has been made in designing displays, or optics and display systems, that enable satisfactory, realistic, and accurate merged perspective views, including handling of the proper see-through ordering and the proper occlusion of merged elements from any given observer position in real space, even assuming that an ideally computed mixed-reality perspective is delivered to the HMD.
A system has already been cited that is claimed to be the most effective, if still partial, solution to this problem, and it is perhaps the only integrated HMD system so directed (as opposed to software/photogrammetry/data-processing and transmission systems that aim to solve these problems in some general way, independent of the HMD): the system proposed by Chunyu Gao in U.S. patent application No. 13/857,656 (U.S. publication No. 20140177023), "Apparatus for optical see-through head mounted display with mutual occlusion and opacity control capability".
Gao begins his survey of the field of see-through HMDs for AR with the following observations:
"There are two types of ST-HMDs: optical and video (J. Rolland and H. Fuchs, 'Optical versus video see-through head-mounted displays,' Fundamentals of Wearable Computers and Augmented Reality, pp. 113-157, 2001). The major drawbacks of the video see-through approach include: degradation of the image quality of the see-through view; image lag due to processing of the incoming video stream; potential loss of the see-through view due to hardware/software failure. In contrast, an optical see-through HMD (OST-HMD) provides a direct view of the real world through a beam splitter and thus has minimal effect on the view of the real world. It is highly preferred for demanding applications where a user's awareness of the live environment is paramount."
Gao, however, does not qualify his account of the problems with video see-through: he begins by equating prior-art video see-through with a particular dedicated-LCD approach, and he does not substantiate the assertion that an LCD must degrade the see-through image (relative to what, and by what criteria, he also leaves unstated). Those skilled in the art will recognize that the impression of low image quality in such see-through views derives from results obtained with early see-through LCD systems, before the recent acceleration of advances in that field. It is by no means true or obvious that, considering the many optical elements and other display technologies through which an optical see-through system of the kind proposed by Gao and others must re-process or modulate the "real" see-through image, the end result is not itself relatively degraded, or inferior, when set against the most advanced LCD or other video see-through display technologies.
Another problem with this offhand summary is the assumption of lag in this type of see-through relative to other systems that must likewise process an incoming live image. Any such comparison of speeds should properly be the result of a detailed analysis of the components of the competing systems and their performance. Finally, the conjecture that the see-through view might be lost to hardware/software failure is essentially open-ended and arbitrary, and has not been validated by any rigorous analysis comparing system robustness or stability, whether between video and optical see-through schemes in general or between particular versions of either, with their component technologies and system designs.
Beyond these initial problems of erroneous and biased characterization of the field for purposes of comparison, there are qualitative problems with the proposed solution itself, including the omission of, and failure to consider, the proposed HMD as a complete HMD system and as a component of a broader AR system, with the data acquisition, analysis, and distribution problems that have previously been referenced and addressed. These are, on their own, significant problems and dilemmas that the HMD itself and its design can help or hinder; they cannot simply be posed as givens, with the HMD assumed to be "given" a certain level and quality of data or processing power for generating altered or blended imagery.
Furthermore, the full dimensions of the problem of visually integrating the real and the virtual on a mobile platform are omitted from the statement of the problem the solution addresses.
Described in the terms adopted by the present disclosure and its teachings, the system is specifically as follows:
as already described in the background section above, gao suggests the use of two display-type devices, since the specifications of the spatial light modulator that is operable to selectively reflect or transmit live images are essentially those of an SLM for the same purpose as it is in any display application.
The output images from the two devices are then combined in a beam splitter combiner, while being arranged on a pixel-by-pixel basis, assuming no specific explanation other than statements regarding the accuracy of such devices.
However, to achieve this merging of the two pixelated arrays, gao specifies what he calls a replica of "folding optics", but basically nothing but a dual version of the Mann Eyetap scheme, requiring a total of two "folding optics" elements (e.g. a plane grating/HOE or other compact prism or "flat" optical element, one for each light source, plus two objective lenses (one for the wavefront from the real view and the other for the combined image and the focal point of the beam splitter combiner).
Multiple optical elements (for which he offers a number of conventional optics variations) are therefore required. Light from the real scene is collected via a first reflecting/folding optic (planar grating/mirror, HOE, TIR prism, or other "flat" optic) and passed to an objective lens, and from there to another planar grating/mirror, HOE, TIR prism, or other "flat" optic that "folds" the light path again, all to keep the whole optical system relatively compact and contained within a schematic pair of rectangular optical relay zones. From the folded optics, the beam passes through a beam splitter/combiner to the SLM; the now pixelated (sampled) real image is then returned, reflected or transmitted, to the beam splitter/combiner, variably modulated along the way (modifying gray levels from the real image's contrast and intensity values, and so on). In synchrony, the display generates the virtual or synthetic/CG image, likewise calibrated to ensure smooth integration with the modified, pixelated/sampled real wavefront, and this is passed through the beam splitter to merge, pixel for pixel, with the multi-step-modified, pixelated sample of the real scene; from there the combined beam passes through an eyepiece objective and then to another "folding optic", which reflects it out of the optical system to the viewer's eye.
In total, then, the modified, pixelated sample portion of the real-image wavefront passes through seven optical elements, excluding the SLM, before reaching the viewer's eye; the synthetic image generated by the display passes through only two.
Precise alignment of optical image combiners down to the pixel level, whether of reflected light collected from a laser-interrogated image sample or of combined imagery produced by small SLM/display devices, and especially the maintenance of that alignment under mechanical vibration and thermal stress, is recognized in the art as a significant concern.
Digital-projection free-space beam-combining systems that combine the outputs of high-resolution (2k or 4k) red, green, and blue image engines (the images typically generated by DMD or LCoS SLMs) are expensive, and maintaining their alignment is a serious matter, even though some of those designs are simpler than the seven-element case of the Gao scheme.
In addition, these complex multi-engine, multi-element optical combiner systems are nowhere near as compact as an HMD requires.
Monolithic prisms (such as the T-rhomboid combiner developed and sold by Agilent for the life-sciences market) have been developed specifically to address the problems exhibited by free-space combiners in existing applications.
And while companies such as Microvision have successfully adapted their SLM-based technology, originally developed for micro-projection, to HMD platforms, those optical arrangements are generally substantially less complex than the Gao proposal.
Furthermore, it is difficult to determine the rationale for the dual image-processing steps and computational iterations on the two platforms, and what they accomplish toward the smoothing and integration of the real and virtual wavefront inputs needed to implement correct occlusion/opacity of the combined scene elements. The problem that appears to interest and trouble Gao most is that the synthetic image struggles to compete with the brightness of the real image, so that the main task of the SLM seems to be to selectively reduce the brightness of part, or all, of the real scene. Although Gao does not specify, or detail, how the SLM performs its image-modification functions, it can be inferred that occluded pixels can in general simply be discarded, while the intensity of occluded real-scene elements is reduced, for example by minimizing the time the DMD mirrors spend in the reflecting position in a time-division-multiplexed system.
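That inference about time-division dimming can be illustrated with a small sketch; the frame period, function name, and parameters are assumptions for the example, not details from the Gao application:

def mirror_on_time_us(attenuation, occluded, frame_period_us=8333):
    """
    attenuation: desired relative brightness of the real-scene pixel, 0.0..1.0.
    occluded:    True if a synthetic element fully occludes this pixel.
    Returns the per-frame duration (microseconds) the mirror should reflect scene light;
    brightness scales with the fraction of the frame spent in the "reflect" state.
    """
    if occluded:
        return 0                       # discard the real pixel entirely
    a = min(max(attenuation, 0.0), 1.0)
    return int(round(a * frame_period_us))

# e.g., dim a real-scene region to 30% so a rendered element reads as brighter:
on_time = mirror_on_time_us(0.3, occluded=False)   # ~2500 us of an 8.33 ms (120 Hz) frame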
Among the many parameters that must be accounted for in computation, calibration, and alignment is determining exactly which pixels of the real field register to which synthetic pixels. Without an exact match, ghosting, misregistration, and occlusion errors multiply, especially in moving scenes. The reflective optical element that passes the real-scene wavefront portion to the objective lens has a true perspective position relative to the scene that differs, to begin with, from the perspective position of the viewer in that scene; it is not flat or centered, and what it collects is only a wavefront sample, not a viewpoint. Moreover, when the system is in motion, that motion is happening at the same time and is not known in advance to the synthetic-image processing unit. From these facts alone, the number of variables in the system is very large.
If those variables are accounted for, and the goals of the solution are stated more specifically, it may become clear that there is a simpler way to achieve them than adding a second display-type device (in a binocular system, a total of two added devices, the ones designated as SLMs).
Second, it is apparent to the examination of the scheme that if any method, due to the durability of such complex systems with multiple cumulative alignment tolerances, the accumulation of imperfections and wear over time of the original components in the multi-element path, misalignment of the combined beams creates cumulative thermal and mechanical vibration effects, and other complications arising from the complexity of the seven-element plus optical system, it is this system that may inherently create degradation of the external live image wavefront, especially over time.
In addition, as has been noted in detail previously, the problem of computing the spatial relationship between real and virtual elements is non-trivial. Designing a system that must drive two (or, in a binocular system, four) display-type devices, most likely of different types (and therefore with different color gamuts, frame rates, etc.), from these calculations adds complexity to already demanding system design parameters.
Furthermore, a high frame rate is essential in order to provide high-performance images without ghosting or lag and without causing eye and visual-system fatigue. For the Gao system, the design is somewhat simplified only if a transmissive rather than a reflective SLM is used; but even with faster ferroelectric-LC microdisplays, the frame rate and image speed are still much lower than those of MEMS devices such as the TI DLP (DMD).
However, since higher HMD resolutions are also required, at the very least to achieve wider FOVs, resorting to high-resolution 2k or 4k DMDs such as TI's means resorting to a very expensive solution: DMDs of that feature size and mirror count are known to be low-yielding, with defect rates higher than what is generally tolerable for mass consumer or enterprise production and cost targets, and they are very expensive in the systems that use them today, such as the digital cinema projectors sold by the TI OEMs Barco, Christie and NEC.
While starting from the planar-optics projection technology of optical see-through HMDs (e.g., Lumus, BAE, etc.) is an intuitively easy step, occlusion is neither a design goal of those methods nor possible within their scope and capabilities; essentially replicating that approach, adjusting the real image, and then combining the two images using conventional optical arrangements such as those proposed by Gao still relies on a large number of planar optical elements for the combination, while attempting to do so in a relatively compact space.
To conclude the background review, and returning to the current leaders in the two major HMD categories, optical see-through HMDs and classic VR HMDs, the prior art can be summarized as follows, noting that other variant optical see-through HMDs and VR HMDs are both commercially available and the subject of extensive research and development, with a large amount of commercial and academic work, including product announcements, publications and patent applications, having accelerated since the breakthroughs made by Google Glass and the Oculus VR HMD (Rift):
At the time of this writing, Google's Glass, the commercially leading mobile AR optical HMD, has established breakthrough public visibility and a dominant market position for the optical see-through HMD category.
However, Google entered the market alongside others who had already developed and deployed products in major defense/industrial areas, including Lumus and BAE (Q-Sight holographic waveguide technology). Enterprises such as TruLife Optics, which commercializes UK National Physical Laboratory research, are also active in the holographic waveguide field as it enters further markets and research phases, and they claim advantages of their own.
For many military helmet-mounted display applications, and for the primary use case of Google's Glass, again as previously described, the superimposition of text and symbolic graphical elements on the view space requires only a rough positional association, which may be sufficient for many initial, simple mobile AR applications.
However, even for information-display applications, it is apparent that the greater the density of label information for items and terrain in the view space facing (and ultimately surrounding) the viewer, the greater the need for spatial ordering/layering of the labels to match the perspective/relative positions of the labeled elements.
Overlap, i.e. partial occlusion of a label by real elements in the field of view rather than merely by other labels, thus necessarily becomes a requirement of optical see-through systems even for "basic" information-display purposes, in order to manage visual clutter.
In addition, the labels must reflect not only the relative positions of the labeled elements in the perspective of real space, but also automated (pre-determined or software-calculated) priority and, in real time, user-specified priority, expressed through label size and transparency, to name just two of the primary visual cues that a graphics system uses to convey an information hierarchy.
The problem then immediately arises of how to handle the relative brightness of the live elements seen through the optics of these basic optical see-through HMDs (whether monocular or binocular) and the superimposed elements generated by the video display, especially in bright outdoor lighting conditions and in very dim outdoor conditions, taking into account the translucency and overlap/occlusion of labels and superimposed graphic elements. Nighttime use is obviously the extreme low-light case for fully extending the utility of these display types.
Thus, when we work through even the most limited use-case conditions for the passive optical see-through HMD type, as information density increases (to be expected with the commercial success of such systems and with the often dense urban or suburban areas where label information is supplied by commercial enterprises) and as use under bright and dim conditions adds constraints, it is clear that the "passive" optical see-through HMD can neither escape nor cope with the problems and needs of any realistic practical implementation of a mobile AR HMD.
Furthermore, passive optical see-through HMDs must be considered an incomplete model for implementing mobile AR HMDs and, in retrospect, will be regarded as merely a transitional stepping stone to active systems.
Oculus Rift VR (Facebook) HMD: somewhat similar in impact to the Google Glass product marketing campaign, but different in that Oculus actually led the field in solving, or beginning substantially to solve, some of the significant threshold obstacles of a practical VR HMD (rather than following others, as Google followed Lumus and BAE), the Oculus Rift VR HMD was at this writing the leading pre-mass-release VR HMD product, entering and creating a market for widely accepted consumer and commercial/industrial VR.
The basic threshold improvements of the Oculus Rift VR HMD can be summarized in the following list of product features:
A significantly broadened field of view, achieved by using a single (currently 7-inch diagonal) 1080p display located a few inches from the user's eyes and divided into binocular viewing regions on the single panel. As of this writing, the FOV is 100 degrees (improving on the original 90 degrees), compared with the roughly 45 degrees total of typical prior HMD specifications. Independent binocular optics provide the stereoscopic effect.
Significantly improved head tracking, resulting in low lag; this is an advance in motion sensors and software, leveraging the micro motion-sensor technology (accelerometers, MEMS gyroscopes, etc.) for 3D position tracking that was matured in the Nintendo Wii, by Apple and other fast followers in cell-phone sensor technology, in the PlayStation PSP and current Vita, the Nintendo DS and current 3DS, and the Xbox Kinect, as well as other handheld devices with built-in motion sensors. Current head tracking adds a multi-point infrared optical system working in concert with an external sensor.
Low latency, the combined result of improved head tracking and fast software/processor updates to the interactive gaming software, although still limited by the inherent response time of the display technology employed; the original LCD was replaced by a faster OLED.
Low persistence, a form of buffering, to help keep the video stream smooth, works in conjunction with higher switching speed OLED displays.
Lighter weight, smaller volume, better balance, and overall improved ergonomics are achieved by employing ski goggles form factors/materials and mechanical platforms.
To summarize the net benefit of combining these improvements: while such a system may contain nothing new in structure or mode of operation, the net effect of the improved components, a particularly effective design (design patent US D701,206), and proprietary software has produced a breakthrough level of performance and validation for mass-market VR HMDs.
Following Oculus's lead and, in many cases, adopting its approach, some contemporaneous product programs changed their designs based on the success of the Oculus VR Rift configuration, and many VR HMD product developers, both established brands and startups, made product announcements after the first demonstration at E3 2012 and Oculus VR's Kickstarter financing campaign.
Among these fast followers, and the other enterprises that changed their strategy significantly to follow the Oculus VR template, are Samsung (whose development pattern closely resembles the Oculus VR Rift design described herein) and Sony (Project Morpheus). Startups gaining attention in this area include Vrvana (formerly True Player Gear), GameFace, InfinitEye and Avegant.
None of these system configurations is exactly the same as the Oculus VR design; some use two panels and others four, with InfinitEye using a four-panel system to extend the FOV to a purported 200+ degrees. Some use LCDs and some use OLEDs. Optical sensors are used to improve the accuracy and update speed of the head-tracking systems.
All of these systems are implemented for essentially local or highly constrained mobility, employing on-board and active optical-marker-based motion tracking designed for use in enclosed spaces such as a living room, an operating room or a simulator stage.
The systems that differ most from the Oculus VR approach are Avegant's Glyph and the Vrvana Totem.
The Glyph actually implements a display solution that follows the previously established optical see-through HMD approach and architecture, employing a Texas Instruments DLP DMD to generate a projected micro-image on reflective planar optical elements whose configuration and operation are identical to the planar optics of existing optical see-through HMDs, except that a high-contrast, absorbing backplane is employed to implement a reflective/indirect micro-projector display type whose video images belong to the general class of opaque, non-transparent display images.
However, as established earlier in the discussion of the Gao disclosure, the limitations on increasing display resolution and other system performance beyond 1080p/2k when using DLP DMDs or other MEMS components are those of cost, manufacturing yield and defect rate, durability, and reliability in such systems.
Furthermore, the limited expansion/magnification factor of the planar optical elements (grating structure, HOE or other) enlarges the SLM image size, but it also enlarges the interaction with, and burden on, the Human Visual System (HVS), especially the focusing system, and the limitation on image size/FOV that follows from this limited expansion/magnification factor places limits on safety and viewer comfort. User responses to similarly sized but lower-resolution images in the Google Glass trial indicate that taxing the HVS further, with higher-resolution, brighter but equally small image areas, poses a challenge to the HVS. Google's official consultant, Dr. Eli Peli, issued a warning to Google Glass users when interviewed by the online site BetaBeat (5/19/2014), anticipating some eye strain and discomfort, and then modified the warning (5/29/2014) in an attempt to limit the cases and scope of potential use. The revised statement attributed the strain, approximately, to the way the eye muscles must be used to find the location of the small displayed image, a pattern of use for which they are not designed or intended over long periods.
However, the specific combination of eye-muscle usage required for focusing on a small portion of the real FOV cannot be assumed to be the same as that required for eye movement over the entire real FOV. In fact, the small fine adjustments of the local muscles are more constrained and limited than the range of motion involved in scanning a natural FOV. Thus, as is known in the art for repetitive motion within a cramped range of motion, the issue is not confined to the focusing action alone; given the nature of the HVS, excessive load beyond the normal range of use is to be expected, compounded by the limits of the range of motion and the requirement to make very small, controlled fine adjustments.
A further complication is that, as resolution increases, the level of detail in the constrained eye-movement region may quickly begin to exceed the eye fatigue associated with precision tool work in scenes with complex, fine motion. No developer of optical see-through systems has rigorously addressed this problem; these issues, together with the eye fatigue, headaches and dizziness reported by Steve Mann over many years of using his EyeTap system (reported to be partially improved by moving the image to the center of the field of view in the recent Digital EyeTap update, but likewise not systematically studied), have received only limited comment, focused on partial problems and on the eye fatigue that can develop from near work and "computer vision syndrome".
However, the limited public guidance provided by Google repeatedly states that, in general, Glass, as an optical see-through system, should be used cautiously for occasional viewing rather than prolonged or high-frequency viewing.
Another way to understand the Glyph scheme is that, at the highest level, it follows the Mann Digital EyeTap system and structural arrangement, with variations to implement optically isolated VR operation and the lateral-projection, plane-deflection optics arrangement employed by current optical see-through systems.
In the Vrvana Totem, the Oculus VR Rift objective is varied by adopting the "indirect view display" scheme of Jon Barrilleaux, adding binocular conventional video cameras to allow switching between forward image capture for video pass-through and simulation generated on the same optically shaded OLED display panel. Vrvana states in its marketing materials that it can implement this very basic "indirect view display", following exactly the AR schematic and model set out by Barrilleaux. Obviously, virtually any other VR HMD of the Oculus VR generation could be fitted with such conventional cameras, though not without some effect on the weight and balance of the HMD.
From the above it is evident that in the "video see-through HMD" category, or more generally in the field of "indirect view displays", little substantial progress has been made; the one subtype that has developed well is night-vision goggles, which, however, lack any AR features apart from the provision, known in the art, of adding text or other simple graphics to the live image in the video processor.
In addition, regarding the existing limitations of VR HMDs, all such systems employing OLED and LCD panels suffer from relatively low frame rates, which results in motion lag and delay as well as negative physiological effects on some users, falling under the broad category of "simulator sickness". It is also noted that in cinema digital stereoscopic projection systems employing commercial stereo systems, such as the RealD system implemented on Texas Instruments DLP DMD based projectors or Sony LCoS based projectors, insufficiently high frame rates have been reported to cause a small percentage of the audience (up to 10% in some studies) to experience headaches and related symptoms. Some of this is particular to those individuals, but a large part can be traced back to frame-rate limitations.
Moreover, as noted, Oculus VR has partly implemented a "low persistence" buffering system to compensate for the still insufficient pixel-switching/frame rate of the OLED display used at the time of writing.
Yet another impact on the performance of existing VR HMDs comes from the resolution limitations of existing OLED and LCD panel displays, which partly drive the requirement to use 5-7 inch diagonal displays mounted at a distance from the viewing optics (and the viewer's eyes) to achieve sufficient effective resolution, contributing to existing and planned products being significantly larger, bulkier and heavier than most other optical head-worn products.
Potential partial improvements are expected from the use of curved OLED displays, which could further improve the FOV without increasing volume. However, the expense of bringing sufficient quantities to market, which requires large additional investment in plant capacity to reach acceptable production volumes, makes this prospect impractical in the short term, and it would only partially address the volume and size problems.
For the sake of completeness, reference must also be made to video HMDs intended for viewing video content but lacking interaction or any motion-sensing capability, and thus lacking the ability to navigate a virtual or mixed-reality/AR world. Over the last fifteen years such video HMDs have improved substantially, increasing effective FOV, resolution and viewing comfort/ergonomics, and providing a path of development and progress that current VR HMDs have been able to utilize and build on. However, these too are limited by the core performance of the display technologies employed, in a pattern that follows the limitations observed for OLED, LCD and DMD based reflective/deflection optical systems.
Other important variations of the transparent-glasses, projected-image optics paradigm include those from the Osterhout Design Group, Magic Leap, and Microsoft (HoloLens).
Although these variations have relative advantages and disadvantages, both with respect to each other and to the other prior art reviewed in detail earlier, they all retain the limitations of the basic approach.
More fundamentally and more broadly, they are also limited by the basic type of display/pixel technology employed: because of the frame-rate/refresh limits of existing core display technologies, whether fast LC, OLED or MEMS, and regardless of whether mechanically scanned fiber-optic inputs or other disclosed optical systems are used to deliver the display image to the viewing optics, all remain insufficient to meet the requirements of high quality, ease of viewing (for the HVS), low power, high resolution, high dynamic range and the other display performance parameters that individually and collectively make possible mass-market, high-quality, pleasing AR and VR.
To summarize the state of the art, in view of the details described previously:
"high visual acuity" VR has improved substantially in many respects from FOV, latency, head/motion tracking, lighter weight, size and volume.
But frame rate/latency and resolution, and by reasonable inference weight, size and volume, are all constrained by the limits of the available core display technologies.
Modern VR is limited to stationary use in small controlled spaces, or to highly constrained and limited mobile use.
VR based on a closed version of the optical see-through approach, configured as a lateral projection-deflection system in which the SLM projects an image into the eye through a series of three optical elements, is limited in the size of the reflected image, which is enlarged but not much larger than the output of the SLM (DLP DMD, other MEMS, or FLC/LCoS) compared with the total area of a standard spectacle lens. The risk of eye strain from what amounts to an extreme version of "close-up work", with its demands on the eye muscles, is yet another limitation on practical acceptance. The SLM type and display size also limit the practical ways to improve resolution and overall performance, given the scaling cost of higher-resolution SLMs of the cited technologies.
Optical see-through systems generally carry the same potential for eye fatigue; because they restrict eye-muscle use to a relatively small area, they require relatively small and frequent eye-tracking adjustments within those constraints, and they are mostly suited to short-term use. The design of Google Glass attempts to address the expectation of limited-duration use by positioning the optical element up and beyond the direct resting position of the eye, out of direct vision. Users still reported eye strain, as has been extensively documented in media reports drawing on texts and interviews from Google Glass Explorers.
Optical see-through systems are limited in the density of overlapping translucent information because labels must be organized in perspective with real-world objects. Even for graphic information-display applications, the requirements of mobility and information density make passive optical viewing limiting.
The aspect of "indirect view display" has been implemented in the form of night vision goggles, and Vrvana, a competitor of Oculus VR, has only proposed a recommendation to adapt its tolem-equipped binocular video camera to AR.
The Gao proposal, while presented as an optical see-through display, is in fact more of an "indirect view display" with quasi-see-through aspects, acting as an improved projection display by using an SLM device to sample a portion of the real wavefront and digitally alter that portion.
The number of optical elements interposed in the optical path of the initial wavefront portion (which, it should be added, is much smaller than the optical area of a conventional lens in a pair of conventional spectacles) is seven or close to it, introducing opportunities for image aberrations, artifacts and losses; even in fields where complex optical alignment systems are required, such complex free-space alignment of many elements is uncommon and, when required, expensive, difficult to maintain and not robust. The methods by which the SLM is expected to manage the wavefront changes of the real scene are also neither specified nor validated against specific requirements. In an environment where performing the calculations needed to establish the proper relationship between real and synthetic elements in perspective is already very demanding, especially as the individual moves through information-dense, topographically complex environments, coordinating signal processing across two to four display-type devices (depending on a monocular or binocular system), including determining exactly which pixels from the real field are the calibration pixels for the proper synthetic pixels, is no small problem. Mounting on a vehicle only further exacerbates it.
There are countless additional problems in developing a complete system, beyond the task of constructing the optical device proposed by Gao or even reducing it to a relatively compact form factor. Size, balance and weight are only a few of the many consequences of the number and necessary locations of the various processing and optical-array units, and they are relatively minor compared to the other problems and limitations cited, although they matter for practical deployment of such systems in the field for military, ruggedized industrial, or consumer use.
Apart from the details of the number and alignment of display-type elements, optical systems, pixel-system matching, and perspective issues, a 100% "indirect view display" has similar requirements in key respects to the Gao proposal, which raises the question of the extent to which all key parameters of such a system should require "brute force" computation of a stored synthetic CG 3D mapping space in coordination with a real-time, single-perspective, live see-through image. The problem grows to the extent that the calculations must all be performed on video images captured by the forward-looking camera and forwarded to a processor that is not local (to the HMD and/or the wearer) for compositing with the synthetic elements, as is currently the case in the basic Barrilleaux and Vrvana designs.
What is needed for a truly mobile system, whether VR or AR, to achieve immersion and registration with the real environment is the following:
Ergonomic optical and viewing systems that minimize any abnormal demands on the human visual system. This is what enables the more extended use that mobile use implies.
A wide field of view, ideally including a peripheral view of 120-150 degrees.
High frame rate, ideally 60 fps/eye to minimize latency and other artifacts typically caused by the display.
Effective resolution at a comfortable distance of the unit from the face. The benchmark against which maximum effective resolution can be measured is an effective 8k or a "retinal display". The distance should be similar to that of conventional spectacles, typically using the bridge of the nose as the balance point. Collimation and optical-path optics are necessary to establish a suitable virtual focal plane that achieves this effective display resolution at the actual distance of the optical elements from the eye.
High dynamic range, matching as closely as possible the dynamic range of the live, real view.
On-board motion tracking that determines the orientation of both head and body within a known topography, whether known in advance or acquired in real time within the wearer's field of view. This can be supplemented by an external system in a hybrid solution.
Display optics capable of fast compositing, within the constraints of the human visual system, between the real-scene wavefront and any synthetic elements. Passive devices should be used as much as possible to minimize the burden on the on-board (HMD and wearer) and/or external processing systems.
Display optics that are relatively simple and robust: few optical elements, few active device elements, simple active-device design, low weight and thickness, and robustness under mechanical and thermal stress.
Light weight, small volume, a balanced center of gravity, and a form factor suitable for design configurations known to be acceptable to professional users (e.g., military and ruggedized-environment industrial users), rugged sports applications, and general consumer and commercial use. Acceptable form factors range from those of eyewear makers such as Oakley, Wiley, Nike and Adidas to those of somewhat more specialized sports-goggle makers such as Oakley, Adidas, Smith, Zeal and others.
A system that can variably switch between a VR experience and a see-through, integrated mixed-view AR system while maintaining full mobility and variable occlusion.
A system that can both manage the wavelengths incident on the HVS and obtain valid information, via sensors, from the wavelengths of interest and mixtures thereof. IR, visible and UV are typical wavelengths of interest.
Disclosure of Invention
The present invention discloses a system and method for re-conceiving the processes that capture, distribute, organize, transmit, store and present to the human visual system (or to non-display data-array output functions), by freeing device and system design from the compromised functioning of the non-optimal operating stages of those processes: the photonic and array signal-processing stages are instead decomposed into operating stages that allow the optimal functioning of the devices best suited to each stage. In practice this means designing and operating devices at the frequencies at which those devices and processes operate most efficiently, and then performing efficient frequency/wavelength modulation/shift stages to move between those "convenient frequencies", with the net effect of further enabling more efficient all-optical signal processing, whether local or long-distance.
The following summary is provided to facilitate an understanding of some features of the technology related to signal processing and is not intended to be a complete description of the invention. A full appreciation of the various aspects of the invention can be gained by taking the entire specification, claims, drawings, and abstract as a whole.
Embodiments of the present invention may involve the decomposition of the components of an integrated pixel-signal "modulator" into discrete signal-processing stages, and thus into a telecommunications-type network, which may be compact or spatially distributed. The most basic operational version proposes a three-stage "pixel signal processing" sequence: pixel logic "state" encoding, typically done in an integrated pixel modulator, is separated from the color modulation stage, which in turn is separated from the intensity modulation stage. A more detailed pixel-signal-processing system, including sub-stages and options and tailored for efficient implementation of a magneto-optic subsystem, comprises: 1) an efficient illumination source stage, in which most of the light (preferably non-visible near-IR) is converted into an appropriate pattern and emitted into a channelized array; 2) a pixel logic processing and encoding stage; 3) an optional non-visible energy filtering and recovery stage; 4) an optional signal modification stage to improve/modify properties such as signal splitting and mode modification; 5) frequency/wavelength modulation/shift and additional bandwidth and peak-intensity management; 6) optional signal amplification/gain; 7) an optional analyzer for performing an MO-type light-valve conversion; and 8) an optional configuration for a wireless stage of pixel signal processing and distribution. In addition, a DWDM-type configuration of the system is proposed, which provides a version of, and path to, an all-optical network, with the major attendant costs and efficiencies thereby obtained, in particular motivating and making more efficient the processing of live and recorded image information. Finally, new hybrid magneto-optic devices and structures are proposed, together with other devices and structures not previously practical that the disclosed systems enable, to make maximum use of the pixel-signal-processing system and to configure such systems optimally around it, including new and/or improved versions of devices based on mixtures of magneto-optic and non-magneto-optic effects (such as slow-light and inverse magneto-optic effects), implementing new fundamental transformations, and new hybrid 2D and 3D photonic-crystal structure types that improve many, if not most, MPC-type devices for all applications.
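To make the staged decomposition concrete, the sketch below models a pixel signal passing through separate state-encoding, wavelength-shift and intensity stages; the stage names, record type and numeric values are hypothetical illustrations of the architecture described, not the disclosed devices.

```python
# Minimal sketch of a staged pixel-signal pipeline: state encoding, wavelength
# shift and intensity modulation are separate, composable operations.
# All names and numbers are hypothetical illustrations, not the disclosed devices.
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class PixelSignal:
    on: bool                 # logic "state" of the pixel signal
    wavelength_nm: float     # carrier wavelength (e.g., ~850 nm near-IR source)
    intensity: float         # relative amplitude, 0.0-1.0

def encode_state(sig: PixelSignal, on: bool) -> PixelSignal:
    """Stage 2: pixel logic processing/encoding (e.g., an MPC or Mach-Zehnder gate)."""
    return replace(sig, on=on)

def shift_wavelength(sig: PixelSignal, target_nm: float) -> PixelSignal:
    """Stage 5: frequency/wavelength modulation/shift (near-IR to a visible primary)."""
    return replace(sig, wavelength_nm=target_nm)

def modulate_intensity(sig: PixelSignal, gain: float) -> PixelSignal:
    """Stage 6: amplitude/gain management."""
    return replace(sig, intensity=max(0.0, min(1.0, sig.intensity * gain)))

# A near-IR source signal is encoded "on", shifted to green, then attenuated.
source = PixelSignal(on=False, wavelength_nm=850.0, intensity=1.0)
out = modulate_intensity(shift_wavelength(encode_state(source, True), 532.0), 0.6)
print(out)  # PixelSignal(on=True, wavelength_nm=532.0, intensity=0.6)
```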
In our co-pending application, the inventor of the present disclosure proposes a new class of display systems that decomposes the components of the typically integrated pixel-signal "modulator" into discrete signal-processing stages. Thus, the basic logic "state" encoding typically implemented in an integrated pixel modulator is separated from the color modulation stage, which in turn is separated from the intensity modulation stage. This can be regarded as a telecommunications signal-processing architecture applied to the problem of visible-image pixel modulation. Typically, three signal-processing stages and three separate device components and operations are proposed, but additional signal-affecting operations may be added and considered, including polarization characteristics, conversion from conventional signals to other forms such as polaritons and surface plasmons, signal superposition (e.g., superposition of basic pixel on/off states on other signal data), and the like. One primary result is a highly distributed video signal-processing architecture in a broadband network, serving relatively "dumb" display devices consisting essentially of the subsequent stages in passive materials; another is compact photonic integrated-circuit devices that implement the discrete signal-processing steps in series, whether on one or more devices in intimate contact, between separate devices, or in large arrays.
In the improved and detailed hybrid telecom-type version of the present disclosure, the pixel-signal-processing display system employs magneto-optic sub-stages/devices in combination with other pixel-signal-processing stages/devices, in particular including frequency/wavelength modulation/shift stages and devices, which can be implemented within the robust scope of the embodiments. These include improved and novel hybrid magneto-optic/photonic devices, not limited to classical or non-linear Faraday-effect MO effects, but more broadly encompassing non-reciprocal MO effects and phenomena and combinations thereof, including hybrid Faraday/slow-light effects and devices based on the Kerr effect and on mixing the Faraday and MO Kerr effects with other MO effects; improved "baffle" structures, in which the path of the modulated signal is folded in-plane with the surface of the device to reduce the feature size of the overall device; quasi-2D and 3D photonic-crystal structures and mixtures of multilayer thin-film PCs and surface grating/polarization PCs; and hybrids of MO and Mach-Zehnder interferometer devices.
Thus, encompassing both the earlier MO-based devices and the improved devices disclosed herein, the present disclosure proposes a pixel-signal-processing system of the telecommunications type, or telecommunications architecture, with the following processing flow for the pixel-signal-processing (or, likewise, PIC, sensor or telecommunications signal-processing) stages and architecture (and variants thereof) featured in the system of the present disclosure:
Any of the embodiments described herein can be used alone or in any combination with one another. The invention contained in this specification may also include embodiments that are only partially mentioned or implied, or not mentioned or implied at all, in the summary or abstract. Although various embodiments of the invention may have been motivated by various deficiencies in the prior art, which may be discussed or alluded to in one or more places in the specification, embodiments of the invention do not necessarily address any of these deficiencies. In other words, different embodiments of the invention may address different deficiencies that may be discussed in the specification. Some embodiments may only partially address some deficiencies, or just one deficiency, that may be discussed in the specification, and some embodiments may not address any of these deficiencies.
Other features, benefits, and advantages of the invention will be apparent from a reading of the present disclosure, including the description, drawings, and claims.
Drawings
The accompanying figures, in which like reference numerals refer to identical or functionally-similar elements throughout the separate views and which are incorporated in and form a part of the specification, further illustrate the present invention and, together with the detailed description of the invention, serve to explain the principles of the present invention.
FIG. 1 illustrates an imaging architecture that may be used to implement embodiments of the present invention;
FIG. 2 illustrates an embodiment of a photonic converter implementing a version of the imaging architecture of FIG. 1 using the photonic converter as a signal processor;
FIG. 3 shows the general structure of the photonic converter of FIG. 2;
FIG. 4 illustrates a particular embodiment of a photonic converter;
FIG. 5 shows a generic architecture for a hybrid photonic VR/AR system; and
FIG. 6 illustrates an implementation architecture for a hybrid photonic VR/AR system.
Detailed Description
Embodiments of the present invention provide a system and method for re-conceiving the processes of capture, distribution, organization, transmission, storage and presentation to the human visual system (or to non-display data-array output functions), by freeing device and system design from the compromised functioning of the non-optimal operating stages of those processes: the photonic and array signal-processing stages are instead decomposed into operating stages that allow the optimal functioning of the devices best suited to each stage. In practice this means designing and operating devices at the frequencies at which those devices and processes operate most efficiently, and then performing efficient frequency/wavelength modulation/shift stages to move between those "convenient frequencies", with the net effect of further enabling more efficient all-optical signal processing, whether local or long-distance. The following description is presented to enable one of ordinary skill in the art to make and use the invention and is provided in the context of a patent application and its requirements.
Various modifications to the preferred embodiment and the generic principles and features described herein will be readily apparent to those skilled in the art. Thus, the present invention is not intended to be limited to the embodiments shown, but is to be accorded the widest scope consistent with the principles and features described herein.
Definitions
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this general inventive concept belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and the present disclosure, and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
The following definitions apply to some aspects described in relation to some embodiments of the invention. These definitions may also be extended herein.
As used herein, the term "or" includes "and/or" and the term "and/or" includes any and all combinations of one or more of the associated listed items. Expressions such as "at least one," when preceding an element list, modify the entire element list without modifying individual elements of the list.
As used herein, the singular terms "a", "an" and "the" include plural referents unless the context clearly dictates otherwise. Thus, for example, a reference to an object can include multiple objects unless the context clearly dictates otherwise.
Furthermore, as used in the description herein and throughout the claims that follow, the meaning of "in" includes "in" and "on" unless the context clearly dictates otherwise. It will be understood that when an element is referred to as being "on" another element, it can be directly on the other element or intervening elements may be present. In contrast, when an element is referred to as being "directly on" another element, there are no intervening elements present.
As used herein, the term "set" refers to a collection of one or more objects. Thus, for example, a collection of objects can include a single object or multiple objects. Objects of a collection can also be referred to as members of the collection. The objects of the collection can be the same or different. In some cases, objects of a collection can share one or more common attributes.
As used herein, the term "adjacent" refers to nearby or contiguous. Adjacent objects can be spaced apart from each other or can be in actual or direct contact with each other. In some cases, adjacent objects can be coupled to each other or can be integrally formed with each other.
As used herein, the term "connected" refers to a direct attachment or link. As the context shows, connected objects have no or no substantial intermediate objects or sets of objects.
As used herein, the term "coupled" refers to an operative connection or link. The coupling objects can be directly connected to each other or can be indirectly connected to each other, for example through an intermediate set of objects.
As used herein, the terms "substantially" and "substantially" refer to a substantial degree or range. When used with an event or circumstance, these terms can refer to the exact instance of the event or circumstance occurring, as well as the approximate instance of the event or circumstance occurring, such as taking into account typical tolerance levels or variability of the embodiments described herein.
As used herein, the terms "optional" and "optionally" mean that the subsequently described event or circumstance may or may not occur, and that the description includes instances where the event or circumstance occurs and instances where it does not.
As used herein, the term "functional device" broadly refers to an energy dissipating structure that receives energy from an energy providing structure. The term functional device includes both unidirectional and bidirectional structures. In some implementations, the functional device may be a component or element of a display.
As used herein, the term "display" broadly refers to a structure or method for producing a display composition. The display component is a collection of display image components produced from processed image component signals generated from display image element precursors. Image element precursors are sometimes referred to in other contexts as pixels or subpixels. Unfortunately, the term "pixel" has evolved into many different meanings, including the output from a pixel/subpixel, as well as the components of the displayed image. Some embodiments of the invention include implementations that separate these elements and form additional intermediate structures and elements, some for independent processing, which may be further obfuscated by referring to all of these elements/structures as pixels, and thus various terms are used herein to explicitly refer to particular components/elements. The display image element precursor emits an image component signal that can be received by an intermediate processing system to produce a set of display image elements from the image component signal. Displaying the set of image primitives produces an image when the image is either directly viewed by a display or reflected by a projection system toward the human visual system under expected viewing conditions. A signal in this context means the output of a signal generator which is or is equivalent to a display image element precursor. Importantly, these signals are preserved as signals in various propagation channels that hold the signals, as long as processing is required, and are not transmitted into free space where the signals create extended wavefronts that combine with other extended wavefronts from other sources that also propagate in free space. The signal has no chirality and no mirror image (i.e. no inverted, inverted or flipped signal, whereas the image and image portions have different mirror images). In addition, image portions are not directly added (it is difficult, if not impossible, to predict the result of overlapping one image portion on another) and it can be very difficult to process image portions. There are many different techniques available for use as signal generators, with different techniques being used to provide signals having different characteristics or benefits and to distinguish between disadvantages. Some embodiments of the present invention allow for hybrid components/systems that can leverage the advantages of technology combinations while minimizing the disadvantages of any particular technology. The incorporated U.S. patent application Ser. No. 12/371,461 describes systems and methods that can advantageously combine these techniques, so the term display image element precursor encompasses both pixel structures for pixel technology and sub-pixel structures for sub-pixel technology.
As used herein, the term "signal" refers to an output from a signal generator, such as a display image primitive precursor, that conveys information about the state of the signal generator when the signal is generated. In an imaging system, each signal is a portion of a displayed image primitive that, when perceived by the human visual system under expected conditions, produces an image or image portion. In this sense, the signal is a coded message, i.e. a sequence of states of displaying image primitive precursors in a communication channel of the coded message. A set of synchronization signals from a set of display image primitive precursors may define a frame (or a portion of a frame) of an image. Each signal may have a characteristic (color, frequency, amplitude, timing, but not chirality) that may be combined with one or more characteristics from one or more other signals.
As used herein, the term "human visual system" (HVS) refers to the biological and psychological process that accompanies the perception and visualization of images from multiple discrete display image primitives (direct views or projections). As such, the HVS implies the human eye, the optic nerve, and the human brain when receiving the composition of the propagated display image primitives and formulating a concept of an image based on those primitives received and processed. The HVS is not exactly the same for every person, but there is a general similarity for a significant percentage of the population.
FIG. 1 illustrates an imaging architecture 100 that may be used to implement embodiments of the present invention. Some embodiments of the invention contemplate that the Human Visual System (HVS) forms the perceived image from a large collection of signal-generating structures comprising the architecture 100. The architecture 100 includes an image engine 105 including multiple Display Image Primitive Precursors (DIPPs) 110_i, i = 1 to N (N may be any integer from one to tens, hundreds, or thousands of DIPPs). Each DIPP 110_i is operated and modulated appropriately to generate a corresponding image component signal 115_i, i = 1 to N (a single image component signal 115_i from each DIPP 110_i). These image component signals 115_i are processed to form a plurality of Display Image Primitives (DIPs) 120_j, j = 1 to M, M being an integer less than, equal to, or greater than N. When perceived by the HVS, the DIPs 120_j (for example, one or more image component signals 115_i occupying the same spatial and cross-sectional area) form a display image 125 (or a series of display images for animation/motion effects). When presented in the appropriate format, the HVS reconstructs the display image 125 from the DIPs 120_j, such as an array on a display or a projected image on a screen, wall, or other surface. This is the familiar phenomenon of the HVS perceiving an image from an array of small shapes (such as "dots") of different colors or shades of gray that are small enough relative to the distance to the viewer (and HVS). Thus, a display image primitive precursor 110_i corresponds to the structure commonly referred to as a pixel when referring to a device that generates image component signals in a non-composite color system, and corresponds to the structure commonly referred to as a sub-pixel when referring to a device that generates image component signals in a composite color system. Many familiar systems employ a composite color system, such as RGB image component signals, one from each RGB element (e.g., LCD cell, etc.). Unfortunately, the terms pixel and sub-pixel are used in imaging systems to refer to many different concepts, such as the hardware LCD cell (sub-pixel), the light emitted from the cell (sub-pixel), and the signals as perceived by the HVS (typically these outputs have been mixed together and are configured to be individually imperceptible to a user under the intended viewing conditions). Architecture 100 distinguishes these different "pixels or sub-pixels", and thus different terms are used herein to refer to these different constituent elements.
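The flow just described, N precursors emitting N image component signals that are processed into M display image primitives perceived as a single image, can be sketched as follows; the 1:1 pass-through mapping is a placeholder assumption, since the matrix described below may aggregate or de-aggregate signals so that M differs from N.

```python
# Sketch of the FIG. 1 flow: DIPP_i -> image component signal 115_i -> DIP_j.
# The pass-through mapping here is a placeholder; real channels may aggregate
# or de-aggregate signals, so M need not equal N.

def generate_component_signals(n: int) -> list[float]:
    """Each of the N precursors (DIPPs) emits one image component signal (amplitude only here)."""
    return [1.0 for _ in range(n)]

def to_display_primitives(signals: list[float]) -> list[float]:
    """Trivial 1:1 processing into display image primitives (M == N in this placeholder)."""
    return list(signals)

signals = generate_component_signals(8)      # N = 8 precursors
primitives = to_display_primitives(signals)  # M = 8 primitives
print(f"{len(signals)} component signals -> {len(primitives)} display image primitives")
```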
Architecture 100 may include a hybrid structure in which image engine 105 uses different technologies for one or more subsets of the DIPPs 110. That is, a first subset of the DIPPs may use a first color technology (e.g., a composite color technology) to produce a first subset of image component signals, and a second subset of the DIPPs may use a second color technology (e.g., a different composite color technology or a non-composite color technology) to produce a second subset of image component signals. This allows the set of display image primitives to be generated using a combination of technologies, which can be superior to a display image 125 generated from any single technology.
Architecture 100 also includes a signal processing matrix 130 that receives the image component signals 115_i as inputs and produces the display image primitives 120_j at its outputs. There are many possible arrangements of the matrix 130 (some embodiments may include a single-dimensional array), depending on the suitability and purpose of any particular implementation of an embodiment of the invention. In general, matrix 130 includes a plurality of signal channels, such as channels 135-160. There are many different possible arrangements for each channel of the matrix 130. Each channel is sufficiently isolated from the other channels (such as the optical isolation created by discrete fiber channels) so that, for a given implementation, a signal in one channel does not interfere with other signals beyond a crosstalk threshold. Each channel includes one or more inputs and one or more outputs. Each input receives an image component signal 115 from a DIPP 110. Each output produces a display image primitive 120. From input to output, each channel carries pure signal information, and the pure signal information at any point in the channel may include the raw image component signal 115, a de-aggregation of a set of one or more processed raw image component signals, and/or an aggregation of a set of one or more processed raw image component signals, where each "processing" may itself have included one or more aggregations or de-aggregations of one or more signals.
In this context, aggregation refers to combining signals (which themselves may be raw image component signals, processed signals, or a combination thereof) from S_A channels (S_A > 1) into T_A channels (1 <= T_A < S_A), and de-aggregation refers to dividing a signal (which itself may be a raw image component signal, a processed signal, or a combination) from S_D channels (S_D >= 1) into T_D channels (T_D > S_D). For example, because of earlier de-aggregation, S_A may exceed N even without any aggregation, and S_D may exceed M because of subsequent aggregation. Some embodiments have S_A = 2, S_D = 1 and T_D = 2. However, the architecture 100 allows aggregation of many signals, which can produce a signal strong enough that it can be de-aggregated into many channels, each with sufficient strength for the implementation. Aggregation of signals follows from the aggregation of channels (e.g., joining, combining, etc.) or another arrangement of adjacent channels that allows joining, combining, etc. of the signals propagated by those adjacent channels, and de-aggregation of signals follows from the de-aggregation of channels (e.g., splitting, separating, etc.) or another channel configuration that allows splitting or separating of the signals propagated by a channel. In some implementations, there may be a particular structure or element of a channel that aggregates two or more signals in multiple channels (or de-aggregates a signal in one channel into multiple signals in multiple channels) while maintaining the signal state of the content propagated through the matrix 130.
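A minimal sketch of the channel-count bookkeeping just defined: aggregation merges S_A input channels into fewer outputs by summing amplitudes, while de-aggregation splits one channel into T_D outputs; the equal-split rule is an assumption made purely for illustration.

```python
# Aggregation: S_A input channels (S_A > 1) merged into T_A outputs (here T_A = 1),
# summing amplitudes. De-aggregation: one channel split into T_D outputs (T_D > S_D);
# an equal split is assumed here purely for illustration.

def aggregate(channel_amplitudes: list[float]) -> float:
    assert len(channel_amplitudes) > 1, "aggregation needs S_A > 1 channels"
    return sum(channel_amplitudes)

def de_aggregate(amplitude: float, t_d: int) -> list[float]:
    assert t_d > 1, "de-aggregation needs T_D > S_D = 1"
    return [amplitude / t_d] * t_d

merged = aggregate([0.4, 0.4, 0.4])   # S_A = 3 -> T_A = 1, amplitude 1.2
split = de_aggregate(merged, 2)       # S_D = 1 -> T_D = 2, amplitude 0.6 each
print(merged, split)
```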
A number of representative channels are depicted in FIG. 1. Channel 135 is a channel having a single input and a single output. Channel 135 receives a single raw image component signal 115_k and generates a single display image primitive 120_k. This is not to say that channel 135 performs no processing. For example, the processing may include a transformation of physical characteristics. The physical dimensions of the input of channel 135 are designed to match/complement the effective area of the DIPP 110 producing its associated image component signal 115_k. The physical dimensions of the output need not match those of the input; that is, the output may be relatively narrow or wide, or a circular-perimeter input may become a straight-perimeter output. Other transformations include repositioning of the signal: although image component signal 115_1 may originate near image component signal 115_2, the display image primitive 120_1 produced by channel 135 may be located beside a display image primitive 120_x generated by a previously "remote" image component signal 115_x. This gives great flexibility in interleaving signals/primitives, independent of the technology used to produce them. This possibility of individual or collective physical transformation is an option for each channel of the matrix 130.
Channel 140 illustrates a channel having a pair of inputs and a single output (aggregating the pair of inputs). For example, channel 140 receives two raw image component signals, signal 115_3 and signal 115_4, and generates a single display image primitive 120_2. Channel 140 allows the two amplitudes to be added, so that primitive 120_2 has a greater amplitude than either component signal. Channel 140 also allows improved timing by interleaving/multiplexing the constituent signals; for example, each constituent signal may operate at 30 Hz, but the resulting primitive may operate at 60 Hz.
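The two benefits attributed to channel 140, amplitude addition and improved timing, can be illustrated with the interleaving sketch below, where two assumed 30 Hz constituent streams yield one 60 Hz output stream.

```python
# Channel-140 style aggregation: two constituent signals, each updating at 30 Hz,
# are interleaved in time so the combined primitive updates at 60 Hz.
# Frame contents are hypothetical placeholders.

signal_a_30hz = ["a0", "a1", "a2"]   # frames from one constituent signal
signal_b_30hz = ["b0", "b1", "b2"]   # frames from the other constituent signal

def interleave(a: list[str], b: list[str]) -> list[str]:
    """Alternate frames from the two 30 Hz inputs -> one 60 Hz output stream."""
    out: list[str] = []
    for fa, fb in zip(a, b):
        out.extend([fa, fb])
    return out

print(interleave(signal_a_30hz, signal_b_30hz))  # ['a0', 'b0', 'a1', 'b1', 'a2', 'b2']
```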
Channel 145 illustrates a channel having a single input and a pair of outputs (de-aggregating the input). For example, channel 145 receives a single raw image component signal, signal 115_5, and generates a pair of display image primitives, primitive 120_3 and primitive 120_4. Channel 145 allows a single signal to be reproduced, for example split into two parallel channels that preserve most properties of the de-multiplexed signals, except perhaps amplitude. When loss of amplitude is not desired, the amplitude can, as described above, first be increased by aggregation, after which de-aggregation can still produce a sufficiently strong signal, as shown by the other representative channels in FIG. 1.
Channel 150 shows a channel with three inputs and a single output. Channel 150 is included to emphasize that virtually any number of independent inputs can be aggregated into a processed signal in a single channel to produce, for example, a single primitive 120_5.
Channel 155 shows a channel with a single input and three outputs. Channel 155 is included to emphasize that an individual channel (and the signal in it) can be de-aggregated into almost any number of separate but correlated outputs and primitives. In another respect channel 155 differs from channel 145: the amplitudes of the primitives 120 produced at the outputs. In channel 145, the amplitude may be divided into equal amplitudes (although some de-aggregation structures may allow variable amplitude splitting). In channel 155, the amplitude of primitive 120_6 may not equal the amplitudes of primitives 120_7 and 120_8 (for example, primitive 120_6 may have about twice the amplitude of each of primitives 120_7 and 120_8, since all signals need not be de-multiplexed at the same node). A first division may direct half of the signal to primitive 120_6, and the remaining half may be further divided in half for each of primitives 120_7 and 120_8.
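The unequal split described for channel 155 amounts to successive halvings; the short check below, assuming a unit input amplitude, confirms the resulting 1/2 : 1/4 : 1/4 distribution.

```python
# Channel-155 style unequal de-aggregation: split once for primitive 120_6,
# then split the remainder again for primitives 120_7 and 120_8.
input_amplitude = 1.0                     # assumed unit amplitude for illustration
p6 = input_amplitude / 2                  # first division: half to primitive 120_6
p7 = p8 = (input_amplitude - p6) / 2      # remaining half divided again
print(p6, p7, p8)                         # 0.5 0.25 0.25 -> 120_6 has twice the amplitude of 120_7/120_8
assert abs(p6 + p7 + p8 - input_amplitude) < 1e-9
```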
Channel 160 illustrates a channel that includes aggregation of three inputs and de-aggregation into a pair of outputs. Channel 160 is included to emphasize that a single channel may include both aggregation and de-aggregation of signals. Thus, a channel may have multiple aggregation regions and multiple de-aggregation regions where necessary or desired.
Thus, the matrix 130 becomes a signal processor by virtue of the manipulation of physical and signal properties of the processing stages 170, including aggregation and disaggregation.
In some embodiments, the matrix 130 may be created by a precision weaving process that defines the physical structure of the channels, such as a jacquard weaving process for a set of optical fibers that collectively define thousands to millions of channels.
In general, embodiments of the invention may include an image generation stage (e.g., image engine 105) coupled to a primitive generation system (e.g., matrix 130). The image generation stage includes N display image primitive precursors 110. Each display image primitive precursor 110ᵢ generates a corresponding image component signal 115ᵢ. These image component signals 115ᵢ are input into the primitive generation system. The primitive generation system includes an input stage 165 with M input channels (M may equal N but need not match - in fig. 1, for example, some signals are not input into the matrix 130). Each input channel receives an image component signal 115ₓ from a single display image primitive precursor 110ₓ. In FIG. 1, each input channel has an input and an output, each input channel directs its single original image component signal from its input to its output, and input stage 165 has M inputs and M outputs. The primitive generation system also includes an allocation stage 170 having P allocation channels, each including an input and an output. Typically M = N, and P may vary depending on implementation. For some embodiments, P is less than N, e.g., P = N/2. In those embodiments, each input of an allocation channel is coupled to a unique pair of outputs from the input channels. For some embodiments, P is greater than N, e.g., P = N × 2. In those embodiments, each output of an input channel is coupled to a unique pair of inputs of the allocation channels. Thus, the primitive generation system scales the image component signals from the display image primitive precursors - in some cases multiple image component signals are combined into one signal in an allocation channel, and at other times a single image component signal is split and presented to multiple allocation channels. There are many possible variations of the matrix 130, the input stage 165, and the allocation stage 170.
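A minimal sketch of the two coupling cases described above (P = N/2 and P = N × 2), assuming M = N and simple consecutive pairing; the function names and the pairing order are hypothetical, not taken from the patent.

```python
# Hypothetical sketch: how an input stage with M outputs might be coupled to an
# allocation stage with P channels, for the two cases described above.

def couple_pairs_to_allocation(m: int):
    """P = M/2: each allocation-channel input is fed by a unique pair of
    input-stage outputs (aggregation)."""
    assert m % 2 == 0
    return {p: (2 * p, 2 * p + 1) for p in range(m // 2)}

def couple_outputs_to_pairs(m: int):
    """P = M*2: each input-stage output feeds a unique pair of
    allocation-channel inputs (de-aggregation)."""
    return {i: (2 * i, 2 * i + 1) for i in range(m)}

print(couple_pairs_to_allocation(8))   # {0: (0, 1), 1: (2, 3), 2: (4, 5), 3: (6, 7)}
print(couple_outputs_to_pairs(4))      # {0: (0, 1), 1: (2, 3), 2: (4, 5), 3: (6, 7)}
```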
Fig. 2 illustrates an embodiment of an imaging system 200 implementing a version of the imaging architecture of fig. 1. System 200 includes a set of encoded signals 205, for example a plurality of image component signals (at IR/near-IR frequencies), provided to a photon signal converter 215, which produces a set 220 of display image elements 225, preferably at visible frequencies, particularly real-world visible imaging frequencies.
Fig. 3 shows the general structure of the photonic signal converter 215 of fig. 2. The converter 215 receives one or more input photon signals and produces one or more output photon signals. The converter 215 adjusts various characteristics of the input photonic signal, such as signal logic state (e.g., on/off), signal color state (IR to visible), and/or signal intensity state.
Fig. 4 shows a particular embodiment of a photonic converter 400. The converter 400 comprises an active light source 405. The source 405 may, for example, include an IR and/or near-IR source chosen for optimal modulator performance in subsequent stages (e.g., an array of LEDs emitting in IR and/or near-IR). Converter 400 includes an optional bulk optical energy source homogenizer 410. Homogenizer 410 provides a structure that homogenizes the polarization of light from source 405 as necessary or desired. The homogenizer 410 may be arranged for active and/or passive homogenization.
Next in the order of propagation of light from source 405, converter 400 includes an encoder 415. Encoder 415 provides logical encoding of the (optionally homogenized) light from source 405 to produce encoded signals. The encoder 415 may include hybrid magneto-photonic crystals (MPC), Mach-Zehnder interferometers, transmission valves, and the like. The encoder 415 may comprise an array or matrix of modulators to set the states of the set of image component signals. In this regard, the individual encoder structures may be equivalent to display image primitive precursors (e.g., pixels and/or sub-pixels, and/or other display light energy signal generators).
The converter 400 includes an optional filter 420, such as a polarizing filter/analyzer (e.g., photonic crystal dielectric mirror) in combination with a planar deflection mechanism (e.g., prism array/grating structure).
Converter 400 includes an optional energy recapture 425 that recaptures energy (e.g., IR-near IR deflection energy) from source 405, which is deflected by elements of filter 420.
Converter 400 includes a modulator 430 that modulates/shifts the wavelength or frequency of the encoded signals generated by encoder 415 (which may have been filtered by optical filter 420). The modulator 430 may include phosphors, periodically poled materials, vibrating crystals, and the like. The modulator 430 takes the generated/converted IR/near-IR frequencies and converts them to one or more desired frequencies (e.g., visible frequencies). The modulator 430 need not shift/modulate all input frequencies to the same frequency, and may shift/modulate different IR/near-IR input frequencies to the same output frequency. Other adjustments are possible.
The converter 400 may optionally include a second optical filter 435, for example for IR/near-IR energy, which may in turn be followed by an optional second energy recapture 440. The filter 435 may include a photonic crystal dielectric mirror in combination with a planar deflection structure (e.g., a prism array/grating structure).
Converter 400 may also include an optional amplifier/gain adjustment 445 for adjusting one or more parameters (e.g., increasing the encoded, optionally filtered, and frequency shifted signal amplitude). Other or additional signal parameters may be adjusted by adjustment 445.
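The converter 400 stages described above can be viewed as an ordered pipeline applied to each encoded photonic signal. The sketch below is a simplified, hypothetical model for illustration only; the stage functions, default values, and the PhotonicSignal fields are assumptions, and physical effects such as energy recapture are not modeled.

```python
# Hypothetical sketch: the converter-400 stages viewed as a pipeline of optional
# transformations applied to an encoded photonic signal. Stage names follow the
# figure description; the numeric model is illustrative only.

from dataclasses import dataclass, replace

@dataclass
class PhotonicSignal:
    wavelength_nm: float
    amplitude: float
    polarization_deg: float

def homogenize(s):            # 410: normalize polarization
    return replace(s, polarization_deg=0.0)

def encode(s, on=True):       # 415: logical encoding (on/off) of the signal
    return replace(s, amplitude=s.amplitude if on else 0.0)

def filter_and_recapture(s):  # 420/425: analyzer; deflected energy recaptured (not modeled)
    return s

def frequency_shift(s, target_nm=532.0):  # 430: IR/near-IR -> visible
    return replace(s, wavelength_nm=target_nm)

def gain(s, g=1.5):           # 445: amplitude/gain adjustment
    return replace(s, amplitude=s.amplitude * g)

signal = PhotonicSignal(wavelength_nm=850.0, amplitude=1.0, polarization_deg=37.0)
for stage in (homogenize, encode, filter_and_recapture, frequency_shift, gain):
    signal = stage(signal)
print(signal)  # PhotonicSignal(wavelength_nm=532.0, amplitude=1.5, polarization_deg=0.0)
```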
FIG. 5 shows a general architecture 500 for a hybrid photonic VR/AR system 505. Architecture 500 exposes system 505 to the surrounding real-world complex electromagnetic wavefront and produces a set 510 of display image primitives for the Human Visual System (HVS). The set of display image primitives 510 may include or use information from the real world (AR mode), or it may include information generated entirely from the synthetic world (VR mode). The system 505 may be configured to be selectively operable in either or both modes. Further, the system 505 may be configured such that the amount of real-world information used in the AR mode may be selectively changed. The system 505 is robust and versatile.
The system 505 may be implemented in many different ways. One embodiment generates an image component signal from the synthetic world and interleaves the synthetic signal with an image component signal generated from the real world ("real world signal") in the AR mode. These signals may be channelized, processed, and distributed using a signal processing matrix that isolates the optical channels, as described in the incorporated patent application 12/371,461. The system 505 includes a signal processing matrix that may contain various passive and active signal manipulation structures in addition to any allocation, aggregation, disaggregation, and/or physical property shaping.
These signal manipulation structures may also vary based on the particular arrangement and design goals of the system 505. For example, the manipulation structures may include a real-world interface 515, an augmentor 520, a visualizer 525, and/or an output constructor 530.
Interface 515 includes functionality similar to display image primitive precursors, converting the real-world complex electromagnetic wavefront into a set of real-world image component signals 535, which are channelized, distributed, and presented to augmentor 520.
As described herein, the system 505 is very versatile and many different implementations exist. The nature and function of the manipulation structures may be influenced by a variety of different considerations and design goals. Not all of these are explicitly detailed herein; rather, some representative embodiments are set forth. As described in the incorporated patent applications and herein, architecture 500 is able to employ a combination (e.g., hybrid) of technologies, each of which may be particularly advantageous for producing a portion of the set of DIPs 510, to yield superior overall results, rather than relying on a single technology for producing all portions.
For example, a real-world complex electromagnetic wavefront includes visible and non-visible wavelengths. Since the set of DIPs 510 also includes visible wavelengths, it might be assumed that signals 535 must also be visible. As explained herein, however, not all embodiments achieve the best results when signals 535 are in the visible spectrum.
System 505 may be configured with visible signals 535. An advantage of some embodiments, however, is that signals 535 are provided using wavelengths that are not visible to the HVS. As used herein, the following ranges of the electromagnetic spectrum are relevant:
a) Visible radiation (light) is electromagnetic radiation with a wavelength between 380nm and 760nm (400-790 terahertz) that will be detected by the HVS and perceived as visible light;
b) Infrared (IR) radiation is (for HVS) invisible electromagnetic radiation having a wavelength between 1mm and 760nm (300 GHz-400 THz) and comprising far infrared rays (1 mm-10 μm), mid infrared rays (10-2.5 μm) and near infrared rays (2.5 μm-750 nm).
c) Ultraviolet (UV) radiation is electromagnetic radiation (for HVS) that is invisible, with wavelengths between 380nm and 10nm (790 THz-30 PHz).
In the invisible real-world-signal embodiments, interface 515 produces signals 535 in the infrared/near-infrared spectrum. For some embodiments, it may be desirable to generate the invisible signals 535 using a spectral mapping that maps specific wavelengths or wavelength bands of the visible spectrum to predetermined specific wavelengths or wavelength bands in the infrared spectrum. This has the advantage of allowing signals 535 to be efficiently processed at infrared wavelengths within system 505, and includes the advantage of allowing system 505 to restore signals 535 to real-world colors.
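One way to picture such a spectral mapping is as a band-to-band remapping with an inverse used later for color restoration. The Python sketch below is illustrative only; the band edges and target IR bands are arbitrary assumptions, not values from the disclosure.

```python
# Hypothetical sketch: a false-color spectral mapping that assigns visible
# wavelength bands to predetermined near-IR bands, and its inverse for restoring
# real-world color after processing. Band edges are illustrative, not from the patent.

VISIBLE_TO_IR = {
    (380, 500): (760, 850),   # "blue"  band -> near-IR band 1
    (500, 600): (850, 940),   # "green" band -> near-IR band 2
    (600, 760): (940, 1064),  # "red"   band -> near-IR band 3
}

def map_band(wavelength_nm, mapping=VISIBLE_TO_IR):
    """Linearly remap a wavelength from a source band to its paired target band."""
    for (lo, hi), (tlo, thi) in mapping.items():
        if lo <= wavelength_nm < hi:
            frac = (wavelength_nm - lo) / (hi - lo)
            return tlo + frac * (thi - tlo)
    raise ValueError("wavelength outside mapped bands")

def unmap_band(wavelength_nm):
    """Inverse mapping: restore the visible wavelength from the IR carrier."""
    inverse = {v: k for k, v in VISIBLE_TO_IR.items()}
    return map_band(wavelength_nm, inverse)

ir = map_band(550.0)        # green -> 895 nm carrier
print(ir, unmap_band(ir))   # round-trips back to 550 nm
```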
The interface 515 may include other functional and/or structural elements, such as filters, to remove IR and/or UV components from the received real-world radiation. In some applications, such as a night vision mode using IR radiation, the interface 515 will either omit the IR filter or will have an IR filter that allows some of the received real-world IR radiation to be sampled and processed.
The interface 515 will also include a real-world sampling structure to convert the filtered received real-world radiation into a matrix of processed real-world image component signals (similar to the matrix of display image primitive precursors) that are channelized into a signal distribution and processing matrix.
The signal distribution and processing matrix may also include frequency/wavelength conversion structures to provide processed real-world image component signals in the IR spectrum (when needed). The interface 515 may also pre-process selected features of the filtered real-world image component signal, such as including polarization filtering functions (e.g., polarization filtering of the real-world image component signal to filter IR/UV, or polarization filtering, sorting, and polarization averaging, etc.), depending on what additional signal operations are later performed in the system 505 and what encoding/conversion techniques are implemented.
For example, where system 505 includes a structure or process for modifying signal amplitude based on polarization, interface 515 may prepare signals 535 appropriately. In some implementations, a default signal amplitude at a maximum value may be desired (e.g., default "on"); in other implementations, a default signal amplitude at a minimum value may be desired (e.g., default "off"); and other implementations may have some channels that provide default values under different conditions, rather than all channels defaulting to on or off. Setting the polarization state of signals 535, whether visible or not, is a function of interface 515. Other signal properties may also be set by the interface 515, for all of the signals 535 or for a selected subset of the signals 535, as predetermined by design goals, technologies, and implementation details.
The real-world channelized image component signals 535 are input to the augmentor 520. The augmentor 520 is a special structure in the system 505 for further signal processing. This signal processing may include multiple functions operating on signals 535, some or all of which may be considered "pass-through" signals depending on how augmentor 520 operates on them. These multiple functions may include: a) manipulating signals 535, e.g., independent amplitude control, setting/modifying frequency/wavelength and/or logic state, etc., of each individual real-world image component signal; b) generating a set of independent synthetic-world image component signals having desired characteristics; and c) interleaving some or all of the "pass-through" real-world image component signals with the generated set of synthetic-world image component signals at a desired ratio to generate a set of interleaved image component signals 540.
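A minimal sketch of these three augmentor functions, treating the channelized signals as simple lists; the function names, the gain parameter, and the interleaving policy are illustrative assumptions rather than the actual optical implementation.

```python
# Hypothetical sketch: the three augmentor functions described above, modeled over
# lists of channelized signal amplitudes. Names and the interleave policy are illustrative.

def manipulate(real_signals, gain=1.0):
    """(a) Per-channel manipulation of real-world pass-through signals."""
    return [s * gain for s in real_signals]

def generate_synthetic(n, level=1.0):
    """(b) Generate n synthetic-world image component signals."""
    return [level] * n

def interleave(real_signals, synthetic_signals, real_ratio=0.5):
    """(c) Combine pass-through and synthetic signals at a desired ratio.
    real_ratio=0.0 corresponds to a pure VR mode (no real-world content)."""
    n_real = round(len(real_signals) * real_ratio)
    return real_signals[:n_real] + synthetic_signals

interleaved_540 = interleave(manipulate([0.2, 0.4, 0.6, 0.8], gain=1.2),
                             generate_synthetic(2), real_ratio=0.5)
```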
Augmentor 520 is thus a generator of the set of synthetic-world image component signals in addition to being a processor of the received (e.g., real-world) image component signals. System 505 is configured such that all signals may be processed by augmentor 520. There are many different ways to implement augmentor 520, such as when augmentor 520 is a multi-layer optical composite defining multiple radiation valves (each gate associated with a signal), with some gates configured to individually receive real-world signals for controlled pass-through, and some gates configured to receive background radiation, isolated from the pass-through signals, for generating synthetic-world image component signals. Thus, in this embodiment, the gates used for the synthetic world generate synthetic-world signals from background radiation.
As shown, architecture 500 includes multiple (e.g., two) independent sets of display image primitive precursors that are selectively and controllably processed and merged. The interface 515 serves as one set of display image primitive precursors and the augmentor 520 serves as a second set of display image primitive precursors. The first set produces image component signals from the real world and the second set produces image component signals from the synthetic world. In principle, architecture 500 allows additional sets of display image primitive precursors (one or more, for a total of three or more sets) to be available in system 505, which can make additional channelized sets of image component signals available to augmentor 520 for processing.
In one way of considering architecture 500, augmentor 520 defines a master set of display image primitive precursors that generates the interleaved signals 540, some of which are initially generated by one or more preliminary sets of display image primitive precursors (e.g., interface 515 generating the real-world image component signals), and some of which are generated directly by augmentor 520. Architecture 500 does not require that all display image primitive precursors employ the same or complementary technologies. By providing all of the constituent component signals in an organized and predetermined format (e.g., in separate channels and in a common frequency range compatible with the signal manipulation, such as signal amplitude modulation, performed by augmentor 520), architecture 500 may provide a robust and universal solution to one or more of the shortcomings, limitations, and disadvantages of current AR/VR systems.
As described herein, the channelized signal processing and distribution arrangement may aggregate, de-aggregate, and/or otherwise process the individual image component signals as the signals propagate through the system 505. As a result, the number of signal channels in signals 540 may differ from the sum of the number of pass-through signals and the number of generated signals. The augmentor 520 interleaves a first number of real-world pass-through signals with a second number of synthesized signals (the first number being zero in a pure VR mode of the system 505). Interleaving in this case means the presence of both types of signals throughout, and is not meant to require that each real-world pass-through signal be present in a channel physically adjacent to another channel that includes a synthetic-world signal. Routing can be controlled independently via the channel assignment properties of system 505.
The visualizer 525 receives the interleaved signals 540 and outputs a set of visible signals 545. In system 505, the synthetic-world image component signals of signals 540 are generated in the invisible range of the electromagnetic spectrum (e.g., IR or near-IR). In some implementations, some or all of the real-world signals 535 passed by the augmentor 520 have been converted to the invisible range of the electromagnetic spectrum (which may also overlap with, or be wholly or partially included within, the range of the synthetic-world signals). The visualizer 525 performs frequency/wavelength modulation and/or conversion of the non-visible signals. When an invisible false-color map is used to define and generate the synthetic and real-world signals, the appropriate color is restored to the frequency-shifted real-world signals, and the synthetic world can be visualized in real-world colors.
The output constructor 530 generates the set of display image primitives 510 from the visible signals 545, whether through direct view or projection, for example, for perception by the HVS. The output constructor 530 may include merging, aggregation, de-aggregation, channel re-arrangement/relocation, physical characteristic definition, fiber shaping, and other possible functions. Constructor 530 may also include amplification of some or all of the visible signals 545, bandwidth modification (e.g., aggregation and time-division multiplexing of multiple channels of signals with preconfigured timing relationships - that is, they may be generated out of phase and combined into one signal to produce a stream at a multiple of the frequency of any single stream), and other image component signal manipulations. Two streams in a 180-degree phase relationship may double the frequency of each stream; three streams in a 120-degree phase relationship may triple the frequency; and so on for N multiplexed streams. Streams combined in phase with each other may increase the signal amplitude (e.g., two in-phase streams may double the signal amplitude, etc.).
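The bandwidth and amplitude arithmetic described above can be checked with a small sketch: phase-staggered streams interleave to a higher-rate stream, while in-phase streams add in amplitude. The sample values and function names below are hypothetical.

```python
# Hypothetical sketch of the bandwidth trick described above: N streams at the
# same base rate, generated 360/N degrees out of phase, interleave into one
# stream at N times the base rate, while in-phase streams add in amplitude.

def multiplex_out_of_phase(streams):
    """Interleave samples from N equal-rate, phase-staggered streams.
    Two streams 180 deg apart double the rate; three at 120 deg triple it."""
    out = []
    for samples in zip(*streams):       # one sample per stream, in phase order
        out.extend(samples)
    return out                          # len = N * len(stream), i.e., N x the rate

def combine_in_phase(streams):
    """Sum N in-phase streams sample by sample: amplitude scales with N."""
    return [sum(samples) for samples in zip(*streams)]

a = [1, 0, 1, 0]                        # 30 "Hz" stream
b = [0, 1, 0, 1]                        # same rate, 180 deg out of phase
print(multiplex_out_of_phase([a, b]))   # 8 samples in the same window -> 60 "Hz"
print(combine_in_phase([a, a]))         # [2, 0, 2, 0] -> doubled amplitude
```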
FIG. 6 shows a hybrid photonic VR/AR system 600 that implements an embodiment of architecture 500. FIG. 6 includes dashed boxes that map corresponding structures between the system 600 and the system 505 of fig. 5.
The system 600 includes an optional optical filter 605, an "annunciator" 610, a real-world signal processor 615, a radiation diffuser 620 (e.g., for IR radiation) powered by a radiation source 625, a magneto-photonic encoder 630, a frequency/wavelength converter 635, a signal processor 640, a signal combiner 645, and output shaper optics 650. As described herein, there are many different implementations and embodiments, some of which use different technologies with different requirements. For example, some embodiments may use radiation in the visible spectrum and do not require elements for wavelength/frequency conversion. For a pure VR implementation, the real-world signal processing architecture is not required. In some cases, minimal post-visualization merging and re-shaping is needed or desired. The architecture 500 is very flexible and can accommodate a preferred set of technologies.
Filter 605 removes unwanted wavelengths from the ambient real-world illumination incident on interface 515. Which wavelengths are unwanted depends on the application and design goals (e.g., night vision goggles may require some or all of the IR radiation, while other AR systems may wish to remove UV/IR radiation).
The annunciator 610 serves as a display image primitive precursor to convert the filtered incident real world radiation into real world image component signals and insert the respective signals into optically isolated channels of a signal splitter stage. These signals may be based on complex or non-complex imaging models.
Processor 615 may include polarization structures to filter polarization and/or to filter, sort, and homogenize polarization, as well as wavelength/frequency converters when some or all of the real-world image component signals are to be converted to a different frequency (e.g., IR).
Diffuser 620 takes radiation from the radiation source and creates a background radiation environment from which encoder 630 generates the synthetic-world image component signals. The diffuser 620 keeps the background radiation isolated from the real-world pass-through channels.
Encoder 630 simultaneously receives and processes real-world pass-through signals (e.g., it can additionally modulate these signals) and generates synthetic-world signals. Encoder 630 interleaves/alternates the signals from the real world and from the synthetic world and keeps them in optically isolated channels. In fig. 6, real-world signals are depicted as filled arrows and synthetic-world signals as unfilled arrows to show the interleaving/alternation. Fig. 6 is not meant to imply that encoder 630 is required to reject significant portions of the real-world signals. Encoder 630 may include a matrix of many display image primitive precursor type structures to process all real-world signals and all synthetic-world signals.
When present, the converter 635 converts invisible signals to visible signals. The converter 635 may thus process synthetic-world signals, real-world signals, or both; in other words, conversion may be enabled on individual signal distribution channels.
When present, the signal processor 640 may modify signal amplitude/gain, bandwidth, or other signal modification/modulation.
When present, signal combiner 645 may organize (e.g., aggregate, disaggregate, route, group, cluster, replicate, etc.) the signals from visualizer 525.
When present, output shaper optics 650 performs any necessary or desired signal shaping or other signal manipulation to produce the desired display image elements perceptible to the HVS. This may include direct view, projection, reflection, combination, and the like. The routing/grouping may enable 3D imaging or other visual effects.
System 600 may be implemented as a stack of functional photonic components (sometimes integrated) that receive, process, and transmit signals in discrete optically isolated channels from the time they are generated until they are included in a display image primitive for propagation to the HVS, together with the other signals in the other display image primitives.
The field of the invention is not a single field, but combines two related fields, augmented reality and virtual reality, and the integrated mobile device solution proposed and provided herein addresses key problems and limitations of the prior art in both fields. A brief review of the background of these related arts will clarify the problems and limitations to be solved and lay the foundation for the solution proposed in the present disclosure.
Two standard dictionary definitions (sources) for these terms are as follows:
Virtual reality: "a realistic simulation of an environment, including three-dimensional graphics, by a computer system using interactive software and hardware. Abbreviation: VR."
Augmented reality: "an enhanced image or environment as viewed on a screen or other display, produced by overlaying computer-generated images, sounds, or other data on a real-world environment." And: "a system or technology used to produce such an enhanced environment. Abbreviation: AR."
It is apparent from these definitions, although they are non-technical, and to those skilled in the relevant art, that the essential difference is whether the simulation is a complete and immersive simulation, excluding even a partial direct view of reality, or whether simulated elements are superimposed on an otherwise clear, unobstructed view of reality.
Slightly more technical definitions are provided under the Wikipedia entries for these subjects, which may be considered reasonably representative of the field in view of the depth and breadth of the contributions to the editing of those pages.
Virtual Reality (VR), sometimes referred to as immersive multimedia, is a computer-simulated environment that can simulate physical presence in places in the real world or in imagined worlds. Virtual reality can recreate sensory experiences, including virtual taste, sight, smell, hearing, touch, and the like.
Augmented Reality (AR) is a real-time direct or indirect view of a physical, real-world environment, the elements of which are augmented (or supplemented) by computer-generated sensory input (e.g., sound, video, graphics, or GPS data).
Inherent but only implicit in these definitions is the basic property of a moving viewpoint. Virtual or augmented reality differs from the more general category of computer simulation in that, whether or not there is any combination, fusion, synthesis, or integration of the simulated or mixed (augmented or "mediated") reality with "real-time," "direct" imaging of reality (local or remote), the point of view of the viewer moves with the viewer as the viewer moves in the real world.
The present disclosure proposes that such a more precise definition is needed to distinguish between static navigation of an immersive display and an empirical simulated world (simulator) and mobile navigation of a simulated world (virtual reality). A sub-category of simulators would then be "personal simulators", or at most "partial virtual reality", where the stationary user is equipped with an immersive HMD (head mounted display) and a haptic interface (e.g. motion tracking gloves), which enables navigation of parts of the simulated world "as virtual reality".
On the other hand, the CAVE system would under this scheme be defined as a limited virtual reality system, since navigation is possible only within the dimensions of the CAVE (aided by a movable floor), and once the limits of the CAVE itself are reached it becomes another form of "partial virtual reality."
Note the difference between a "moving" viewpoint and a "movable" viewpoint. Computer simulations, such as video games, are simulated worlds or "realities," but unless the explorer of the simulated world moves himself, or directs another person or robot to move, all that can be said is that the simulated world is "navigable" (although one of the primary achievements of computer graphics over the past forty years has simply been "building" simulated environments that can be explored in software).
For simulations that are virtual or mixed (the author's preferred term) reality, an important and most typical feature is a mapping of the simulation, whether fully synthetic or mixed, to real space. Such a real space may be as basic as a laboratory or a room on a sound stage, simply gridded, mapped, and calibrated to the simulated world at some scale.
This distinction is not evaluative: a local VR (whether natural, artificial, or mixed) that provides real-time natural interfaces (head tracking, haptics, audio, etc.) without movement over, or mapping to, actual real terrain is not fundamentally less valuable than a VR system that simulates physical interaction and provides sensory immersion while moving. However, a VR system is by definition "local" if there is no feedback system for the feet or, more generally, no full-body, full-range-of-motion feedback system and/or dynamically deformable mechanical interface-interaction surface that supports the user's simulated but (in that sense) full-body motion over any terrain from any static posture (whether standing, sitting, or reclining).
However, without such an ideal whole-body physical interface/feedback system, limiting VR to "full" and fully mobile versions would limit the VR world's terrain to where it can be found, modified, or built from scratch in the real world. Such limitations will typically severely limit the scope and capabilities of the virtual reality experience.
However, as will become apparent in the present disclosure, this distinction makes a difference in that it sets the dividing line between how existing VR and AR systems differ and where their limitations lie, and provides background that informs the teachings of the present disclosure.
Having established the missing but essential features and requirements for a simulation to qualify as complete "virtual reality," the next step is to address the implicit question of how a "moving viewpoint" is implemented. The answer is that providing a moving simulated view requires two components (each itself implemented by a combination of hardware and software): a moving image display device through which the simulation can be viewed, and a motion tracking device that can track the motion of the device containing the display in 3 axes of motion. This means that the position of the three-dimensional viewing device over time is measured from a minimum of three tracking points (two, if the device is mapped and measured such that a third position on a third axis can be inferred), relative to a 3-axis frame of reference that can be any arbitrary 3D frame of reference mapped to real space, although for the practical purpose of physically navigating the space, two of the axes will form a plane that is the gravity-level ground plane and the third axis, Z, is perpendicular to the ground plane.
Solutions that achieve such position and orientation as a function of time accurately, frequently, and practically require a combination of sensors and software, and advances in these solutions have been a major vehicle for progress in the field of VR and AR hardware/software mobile viewing devices and systems.
These are relatively new fields. Given the time frame between the earliest experiments and today's practical techniques and products, it is sufficient to document the original and the current state of the art in both types of mobile visual simulation systems, covering the specific prior-art innovations that matter to the development of the present disclosure, or whose differences and similarities better explain the existing problems in the field and distinguish the solutions of the present disclosure from the prior art.
Much of the innovation in the related simulation and simulator, VR, and AR fields spans the period from 1968 to the end of the nineties, during which many of the key problems in realizing practical VR and AR found initial or partial solutions.
The pioneering experiments and experimental head-mounted display systems of Ivan Sutherland and his assistant Bob Sproull, beginning in 1968, are generally considered the hallmark of the origin of these related fields, although earlier work (essentially concept development) preceded this first experimental implementation of any form of AR/VR enabling immersion and navigation.
The birth of the fixed simulator system dates back to the addition of computer-generated imagery to the flight simulator, generally considered to have begun in the mid-to-late 1960s. This was limited to the use of CRTs, displaying fully focused images at the user-to-CRT distance, until 1972, when Singer-Link introduced a collimated projection system that presented the image at optical infinity through a beam-splitter arrangement, improving the field of view to about 25-35 degrees per unit (with three units yielding up to 100 degrees in a single flight crew simulator).
This benchmark was improved only in 1982, when Rediffusion introduced a wide-field-of-view system, a wide-angle infinity display system, which achieved a 150-degree and ultimately a 240-degree FOV by using multiple projectors and a large, curved collimating screen. It is at this stage that the fixed simulator may be described as finally achieving a significant degree of true immersion in a virtual reality, without using an HMD to isolate the observer and eliminate interference from peripheral visual cues.
But while Singer-Link was advancing collimated screen systems for simulators, the first, very limited, commercial helmet-mounted displays were being developed for military use as a forerunner of the VR-type experience, with a reticle-based electronic aiming system integrated with motion tracking of the helmet itself. These initial developments are generally credited, in a preliminary form, to the South African Air Force in the 1970s (followed by the Israeli Air Force between then and the mid-seventies), and can be said to be the beginning of preliminary AR or mediated/mixed reality systems.
These early helmet-mounted systems, minimal in graphics but still pioneering, implemented a limited synthesis of position-coordinated target information and user-actuated, motion-tracked targeting superimposed on the reticle. Afterward, Steve Mann invented the first "mediated reality" mobile see-through system, the first-generation "EyeTap," with graphics superimposed on eyeglasses.
Later versions by Mann employed an optical compositing system that merged real and processed images based on splitter/combiner optics. This work preceded that of Chunyu Gao and Augmented Vision Inc., which essentially proposed a dual Mann system that optically combines a processed real image with a generated image, processing both the real image and the electronically generated image. The true see-through image is preserved in Mann's system, but in Gao's system all see-through imagery is processed, eliminating any direct see-through image, even as an option. (Chunyu Gao, U.S. patent application 20140177023, filed on 13/4/2013.) The "optical path folding optics" structures and methods specified by the Gao system can be found in other optical HMD systems.
By 1985, Jaron Lanier had established VPL Research to develop HMDs and the "data glove," and thus by the end of the 1980s Mann, Lanier, and Rediffusion represented three major development paths for VR and AR simulation in a very active field, achieving some critical advances and establishing some basic solution types that in most cases remain the basis of the state of the art today.
The growing sophistication of computer-generated imagery (CGI), continued improvements in gaming machines (hardware and software) with real-time interactive CG technology, greater system integration among multiple systems, and the expanding mobility of AR and of more limited VR were among the major trends of the 1990s.
The CAVE system, developed by the Electronic Visualization Laboratory at the University of Illinois at Chicago and first shown in 1992, introduced a limited form of mobile VR and a new kind of simulator. (Carolina Cruz-Neira, Daniel J. Sandin, Thomas A. DeFanti, Robert V. Kenyon and John C. Hart, "The CAVE: Audio Visual Experience Automatic Virtual Environment," Communications of the ACM, vol. 35(6), 1992, pp. 64-72.) In addition to Lanier's HMD/data glove combination, CAVE combined a WFOV multi-wall simulator "stage" with a haptic interface.
At the same time, Louis Rosenberg developed a stationary local AR, the "Virtual Fixtures" system (1992), at the U.S. Air Force's Armstrong research laboratory, while the stationary "Virtuality" VR system of Jonathan Waldern, in initial development as early as 1985 to 1990, saw its first commercial light in 1992.
Integration of mobile AR into a multi-unit mobile vehicle "wargame simulation" system, combining real and virtual vehicles in "augmented simulation" ("AUGSIM"), saw its next major advance in the work of Loral WDL, shown to the industry in 1993. Project participant Jon Barrilleaux of Peculiar Technologies subsequently wrote "Experiences and innovations in Applying Augmented Reality to Live Training" in 1999, reviewing the findings of the 1995 SBIR Final report and pointing out the continuing problems faced by mobile VR and (mobile) AR to date:
Tracking in AR vs. VR
Generally, commercial products developed for VR have good resolution, but lack the absolute accuracy and wide area coverage required for AR, let alone their use in AUGSIM.
VR applications - users immersed in synthetic environments - are more concerned with relative tracking than with absolute accuracy. Since the user's world is completely synthetic and self-consistent, the fact that his/her head has just turned 0.1 degrees is much more important than knowing that it is now pointing within 10 degrees of north.
AR systems such as AUGSIM do not have such treatment. AR tracking must have good resolution so that the virtual element appears to move smoothly in the real world as the user's head rotates or the vehicle moves, and it must have good accuracy so that the virtual element is properly overlaid and occluded by objects in the real world.
As computing and network speeds continued to improve in the nineties, new projects for outdoor AR systems were initiated, including the U.S. Naval Research Laboratory's BARS system: "BARS: Battlefield Augmented Reality System," Simon Julier, Yohan Baillot, Marco Lanzagorta, Dennis Brown, Lawrence Rosenblum; NATO Symposium on Information Processing Technologies for Military Systems, 2000. Abstract: "The system consists of a wearable computer, a wireless network system, and a tracked see-through Head Mounted Display (HMD). The user's perception of the environment is enhanced by superimposing graphics onto the user's field of view. The graphics are registered (aligned) with the actual environment."
Development outside the military was also underway, including the ARToolKit work of Hirokazu Kato of the Nara Institute of Science and Technology, later released and further developed at HITLab, which introduced a software development kit and protocols for viewpoint tracking and virtual object tracking.
These milestones are often considered to be of the greatest importance during this period, although other researchers and companies are also active in this area.
While military funding for large-scale development and testing of AR for training simulations is well documented, less visible non-military system-level design and system demonstration work was proceeding simultaneously with the military-funded research efforts.
The most important of these non-military experiments was the AR version of the video game Quake, ARQuake, initiated and led by Bruce Thomas at the University of South Australia Wearable Computer Lab and published in "ARQuake: An Outdoor/Indoor Augmented Reality First Person Application," 4th International Symposium on Wearable Computers, pp. 139-146, Atlanta, Ga., Oct 2000 (Thomas, B., Close, B., Donoghue, J., Squires, J., De Bondi, P., Morris, M., and Piekarski, W.). Abstract: "We propose an architecture for a low-cost, medium-precision, six-degree-of-freedom tracking system based on GPS, a digital compass, and vision-based fiducial tracking."
Another system whose design and development began in 1995 was that of the author of this disclosure. The original objective was a mix of outdoor AR and television programming, known as "Everquest Live," which was further developed in the late nineties; its basic elements were completed in 1999, when a commercial effort was initiated to fund the original video game/television hybrid, later including another version for a high-end theme-resort development. By 2001 it had been disclosed on a confidential basis to companies and individuals including Ridley and Tony Scott and, in particular, their joint venture (other partners including Renny Harlin, Jean Giraud, and European Heavy Metal), as executives supervising the business, and brought to them as the then-current "Otherworld" and "Otherworld Industries" project, as an investment and proposed joint venture in collaboration with ATP.
The following is a summary of the system design and components ultimately determined in 1999/2000:
Selected from the "Otherworld Industries Business Plan" document (archived file version, 2003):
technical background: the prior art 'open field' simulation and proprietary integration of mobile virtual reality: tools, facilities and techniques.
This is only a partial listing and overview of the related art, which together form the backbone of a proprietary system. Some technical components are proprietary and some come from external vendors. But the unique system incorporating the validated components would be absolutely proprietary-and revolutionary:
interact with VR-ALTERED WORLD:
1) Mobile military-grade VR equipment immerses guests/participants and actors in the VR-enhanced landscape of Otherworld. As their "adventure" (i.e., their every movement while exploring the Otherworld around the resort) is captured in real time by mobile motion-capture sensors and digital cameras (with automatic masking techniques), guests/players and employees/actors see each other, and the superimposed computer-simulated imagery, through their visors. The visor is either a binocular semi-transparent flat panel display or a binocular opaque flat panel display with binocular cameras attached at the front.
These "synthetic elements" superimposed in the field of view by the flat panel display may comprise altered portions of the landscape (or the entire landscape, digitally modified). In fact, those "synthetic" landscape segments that replace the real existence are generated based on the original 3D photographic "capture" of the respective segments of the resort. (see #7 below). As an accurate photo-based geometric "virtual space" in a computer, they can be digitally modified in any way while maintaining the photo-realistic quality and geometric/spatial precision of the original capture. This allows an accurate combination of live digital photography and modified digital parts of the same space.
Other "synthetic elements" superimposed by the flat panel display include human, biological, atmospheric FX and "magic" generated or modified by a computer. These appear as real elements of the field of view through the display (transparent or opaque).
By using the positioning data, the motion capture data of the guest/player and the staff/actors, and real-time masking them by a plurality of digital cameras, all calibrated to the previous "captured" version of each area of the vacation area (see #4 and 5 below), the composite elements can be absolutely accurately matched in real-time to the real elements presented by the display.
Thus, a photo-realistic computer-generated dragon could appear to fly behind a real tree, come back around, then fly up and land on top of the resort's real castle - and the dragon could then "breathe" computer-generated flames. In the flat panel displays (translucent or opaque), the flames would appear to "scorch" the upper portion of the castle. This is achieved because the upper part of the castle has been "masked" in the visor by a computer-modified version of the castle's 3D "capture" in the system files.
2) Physical electro-optical-mechanical equipment for combat between real people and virtual people, creatures, and FX. The "haptic" interface provides motion sensing and other data, as well as vibration and resistance feedback, allowing real humans to interact in real time with virtual humans, creatures, and magic. For example, a haptic device in the form of a "prop" sword hilt provides data as the guest/player swings it, and provides physical feedback as the guest/player "strikes" a virtual attacking creature or spell, to achieve the illusion of combat. All of this is combined in real time and displayed via the binocular flat panel display.
3) Open-field motion capture equipment. Mobile and stationary motion-capture equipment (similar to that used for The Matrix films) is deployed throughout the resort. Data points on the body "gear" worn by guests/players and employees/actors are tracked by cameras and/or sensors to provide motion data for interaction with the virtual elements in the field of view displayed on the binocular flat panels in the VR visor.
The motion-capture data output makes it possible (given sufficient computational rendering capability and the use of motion editing and motion libraries) to drive CGI-modified versions of the guests/players and employees/actors, along the lines of the Gollum character in the second and third "Lord of the Rings" films.
4) Augmentation of motion capture data with LAAS and GPS data, live laser ranging data, and triangulation techniques (including from Moller Aerobot UAV). The additional "localization data" allows for more efficient (and error-corrected) integration of live and synthetic elements.
Press release from drone manufacturer:
July 17. One week ago, Honeywell was awarded a contract for the initial network of Local Area Augmentation System (LAAS) stations, some of which are already running. The system can guide aircraft (and helicopters) to precision landings at airports with accuracy down to inches. The LAAS system is expected to enter service in 2006.
5) Automatic real-time matting for open-field "play." In conjunction with the motion-capture data that allows interaction with the simulated elements, resort guests/participants will be digitally imaged by P24 (or equivalent) digital cameras using proprietary automation software to automatically isolate (matte) the appropriate elements from the field of view for integration with the synthetic elements. This technique will be one of the suite used to ensure proper foreground/background separation when superimposing digital elements.
6) Military-grade simulation hardware and techniques combined with state-of-the-art game engine software. Haptic devices for interacting with "synthetic" elements, such as the prop sword, together with data from the motion-capture systems, the synthetic elements, and the live elements (matted or whole), are integrated through military simulation software and game engine software.
These software components provide the AI code to animate synthetic people and creatures (AI - or artificial intelligence - software, such as the Massive software used to animate the armies in the "Lord of the Rings" films), generate realistic water, clouds, fire, etc., and integrate and combine all the elements, just as computer games and military simulation software do.
7) Photography-based capture of real locations to create photo-realistic digital virtual sets using image-based techniques pioneered by Paul Debevec (the basis for the "bullet time" FX in the movie The Matrix).
The "base" virtual locations (inside and outside of the resort) are indistinguishable from the real world, as they come from the real illumination of the photo and the location at the time of "capture". A small set of high quality digital images, combined with data from the optical probe and laser range finding data, and appropriate "image-based" graphics software, are all that is required to reconstruct in a computer a photo-realistic virtual 3D space that exactly matches the original version.
Although "virtual collections" are captured from locations inside the real castle and outside of surrounding villages, once these "base" or default versions are digitized, all other data with lighting parameters and from the exact time when originally captured can be modified, including lighting, added elements are not present in the real world, and the present elements are modified and "dressed up" to create a fantasy version of our world.
A calibration procedure occurs when guests/players and employees/actors pass through a "portal" at various points in the resort (a "portal" being, in effect, the crossing from "our world" into the "Otherworld"). At that point, the positioning data from the guest/player or employee/actor at the "portal" is used to "lock" the virtual space in the computer to the coordinates of the "portal." The computer "knows" the coordinates of the portal points in its virtual version of the entire resort obtained through the image-based "capture" process described above.
Thus, the computer can "line up" its virtual resort with what the guest/player or employee/actor sees just before putting on the VR visor. With the semi-transparent version of the binocular flat panel display, if the virtual version is overlaid on the real resort, one world will therefore match the other very precisely.
Alternatively, using an "opaque" binocular flat panel display visor or helmet, the wearer can walk confidently while wearing the helmet, seeing only the virtual version of the resort in front of him, since the view of the virtual world will exactly match the terrain he is actually walking on.
Of course, what can be shown to him through the visor may be a sky altered to red, boiling storm clouds that do not actually exist, and a dragon perched on top of the castle battlements, setting the castle "on fire."
Or 1,000 ogre troops charging down a distant mountain!
8) A supercomputer rendering and simulation facility for the resort. A key resource enabling very high quality, near-feature-film-quality simulation would be supercomputer rendering and simulation at each resort complex site.
The improvements in graphics and gameplay of computer games played on stand-alone game consoles (PlayStation 2, Xbox, GameCube) and desktop computers are well known.
Consider, however, that the improvements in the gaming experience to date have been based on improvements in the processor and support systems of a single console or personal computer. Then envision the capabilities of a supercomputing center supporting the gaming experience. That alone is a great leap in the quality of graphics and gameplay, and it is just one aspect of the mobile VR venture that will be the Otherworld experience.
As will be apparent from a review of the foregoing, and as will be evident to those skilled in the relevant VR, AR, and broader simulation arts, any individual hardware or software system proposed to improve on the prior art must take into account the broader system parameters, and make explicit assumptions about those parameters, in order to be properly evaluated.
The essence of the present proposal, therefore, although its emphasis is on hardware technology belonging to the portable AR and VR classes (indeed a fusion of the two), in its most preferred version wearable, and in the preferred wearable version HMD technology, is that an excellent solution is complete only when the whole system to which it belongs is considered or reconsidered. It is therefore necessary to present this larger history of VR, AR, and simulation systems, because proposals for new HMD technology, and the trend of commercial products, have been too narrow, neither taking into account nor re-examining the assumptions, requirements, and new possibilities at the system level.
A similarly detailed historical review of the major milestones in HMD technology development is not necessary here, since the broader history needed to be reviewed at the system level to provide a framework from which to explain the limitations of the prior art and its current state in HMDs, as well as the reasons for the solution proposed to address the identified problems.
Content sufficient to understand and identify the limitations of the prior art in HMDs begins with the following.
In the category of head mounted displays (which for the purposes of this disclosure includes helmet-mounted displays), two main sub-types have been identified so far: the VR HMD and the AR HMD, following the definitions already provided herein; and within the category of AR HMDs, two classes have been used to distinguish whether these are "video see-through" or "optical see-through" (the latter more commonly referred to simply as an "optical HMD").
In a VR HMD, the user views a single panel or two separate displays. The typical shape of such an HMD is that of a visor or face mask, although many VR HMDs have the appearance of a welding helmet with a bulky closed visor. To ensure optimal video quality, immersion, and freedom from interference, such systems are completely enclosed, with light-absorbing materials around the perimeter of the display.
The author of the present disclosure previously proposed two types of VR HMD in the incorporated U.S. provisional application "SYSTEM, METHOD AND COMPUTER PROGRAM PRODUCT FOR MAGNETO-OPTICAL DEVICE DISPLAY." One of them simply proposes replacing the conventional direct-view LCD with the wafer-type embodiment of the main subject of that application, the first practical magneto-optical display, whose superior performance characteristics include an extremely high frame rate, among other advantages that improve display technology as a whole and, in this embodiment, yield an improved VR HMD.
The second version contemplates a new remotely-generated-image display, according to the teachings of that disclosure, in which the image would be generated, for example, in a vehicle cockpit, then transmitted via a fiber optic bundle, and then distributed through a special fiber optic array structure (structures and methods disclosed in that application), based on experience with fiber optic panels and employing the new methods and structures for remote image transmission through optical fibers.
Although the core MO technology was originally commercialized not for HMDs but for projection systems, these developments are relevant to certain aspects of the present proposal and, moreover, are not generally known in the art. In particular, the second version discloses a method that predates other, newer proposals that use optical fibers to transmit video images from image engines that are not integrated into or located near the HMD optics.
Outside of a tightly controlled stage environment with flat floors, a key consideration for the utility of a fully enclosed VR HMD for mobile use is that, for safety of movement, the virtual world being navigated must be mapped 1:1 to the real surface topography or path of movement, within tolerances that are safe for human locomotion.
However, as observed and summarized by Barrilleaux of Loral WDL, by the BARS developers, and by other researchers in this field over the last quarter century, a practical AR system must obtain a very close correspondence between the virtual (synthetic, CG-generated) imagery and the real-world terrain and architectural environment, including (not unexpectedly, since the military develops systems for urban warfare) the geometry of moving vehicles.
Thus, more generally, for VR or AR to be enabled in mobile form, there must, in the same manner, be a 1:1 mapping between the virtual and the real.
In the category of AR HMDs, the distinction between "video see-through" and "optical see-through" is the distinction between a user viewing directly through a transparent or translucent pixel-array display disposed directly in front of the viewer as part of the eyeglass optics itself, and viewing through a translucent projected image, also disposed on an optical element directly in front of (and typically directly adjacent to) the viewer, generated by a microdisplay and transmitted to the facing optics by an optical relay.
The major, and perhaps only even partially practical, type of direct-view transparent or translucent display system has (historically) been the LCD without an illuminating backlight - thus, specifically, AR video see-through glasses possess one or more viewing optics, including a transparent optical substrate, on which an array of LCD light-modulator pixels has been mounted.
For applications like the original Mann "EyeTap," where text/data is displayed or projected directly on the facing optics, there is no need to calibrate to real-world terrain and objects, although some degree of positional correlation facilitates contextual "tagging" of items in the field of view with informational text. This is the primary purpose of the Google Glass product offering; but as of the drafting of the present disclosure, many developers have focused on developing AR-type applications that go beyond text superimposed on a live scene.
Beyond coarse, approximate positional correlation in an approximate 2D plane or a rough viewing cone, the main problem with such "calibration" to terrain or objects in the user's field of view in a video or optical see-through system is determining the relative position of objects in the viewer's environment. Without reference and/or substantially real-time spatial location data and 3D mapping of the local environment, perspective and relative size cannot be computed without significant inconsistencies.
In addition to relative size, one key aspect from any viewpoint perspective is realistic lighting/shading, including projection, depending on the lighting direction. Finally, occluding objects from any given viewing position is a key optical property for perceiving perspective and relative distance and positioning.
There is no or no problem of designing a video see-through or optical see-through HMD independently of how such data is provided to implement or indeed for moving VRs, spatial viewing of the wearer's surroundings, necessary safety movements or wayfinding in a video or optical see-through type system. Is these data provided externally, locally, or from multiple sources? What impact this has on the design and performance of the entire HMD system if it is a partial local and partial HMD? What does this issue affect, if any, the choice between video and optical perspective (considering weight, balance, volume, data processing requirements, lag between components, and other influencing and influenced parameters) and the choice of display and specific optical components?
Among the technical parameters and problems addressed during the evolution of VR HMDs are enlarging the field of view; reducing latency (the lag between the motion-tracking sensors and the change of virtual viewpoint); and improving resolution, frame rate, dynamic range/contrast and other general display-quality characteristics, as well as weight, balance, volume and general ergonomics. Image collimation and other display-optics details have improved, largely addressing the problem of "simulator sickness" that was a major issue in the early days.
As these general technology classes have improved, the weight and volume of the displays, optics and other electronics have tended to decrease, along with improvements in size, balance and overall form.
Fixed VR-type devices are commonly used in night vision systems in vehicles, including aircraft; mobile night vision goggles, however, can be considered an intermediate form of viewing similar to mobile VR, in that the wearer views a substantially real-time real scene (IR imaging), but through a video screen rather than "see-through."
This subtype is similar to what Barrilleaux, in the 1999 review also referenced above, defined as an "indirect view display." He offers that definition for a proposed AR HMD in which there is no actual "see-through," only real and virtual imagery merged/processed on a display — which presumably encompasses any VR-type or night vision system.
A night vision system, however, is not a fusion or mix of a virtual synthetic landscape with reality, but a directly relayed video image of IR sensor data, interpreted by video signal processing as a monochrome image whose intensity follows the intensity of the IR signal. As a video image, it does lend itself to real-time text/graphics overlay — the same simple form the EyeTap originally conceived and, as Google has stated, the primary purpose of its eyewear product.
The question of how and what data is to be extracted live, or provided from reference sources (or both), to a mobile VR or mobile AR system — or now to this hybrid, live-processed video-feed "indirect view display" that shares features of both categories — so that virtual and real scenery can be effectively integrated into a combined view with consistent cues, is a design parameter that must be considered in any new and improved mobile HMD system, regardless of its type.
Software and data processing for AR has evolved to address these issues, building on the early work of the system developers already cited. An example is the work of Matsui and Suzuki of Canon, as disclosed in their U.S. patent application "Mixed reality space image generation method and mixed reality system" (U.S. patent application Ser. No. 10/951,684, filed 9/29/2004; U.S. publication No. 20050179617, now U.S. Pat. No. 7,589,747). The abstract reads:
"A mixed reality space image generating apparatus for generating a mixed reality space image formed by superimposing a virtual space image on a real space image obtained by capturing a real space includes an image synthesizing unit (109), which superimposes the virtual space image to be displayed on the real space image in consideration of occlusion of the virtual space image by objects in the real space, and an annotation generating unit (108), which further applies an image to be displayed without taking any occlusion of the virtual space image into consideration. In this way, a mixed reality space image that achieves both natural display and convenient display can be generated."
The purpose of the system is to enable a fully rendered industrial product (e.g., a camera) to be superimposed on a physical mock-up (a stand-in prop); a pair of optical see-through HMD glasses and the mock-up are both equipped with position sensors. A real-time, pixel-by-pixel look-up comparison process is used to mask out the pixels belonging to the mock-up so that the CG-generated virtual model can be superimposed on the composited video feed (with a buffering delay to achieve the slightly delayed layering). The system also adds annotation graphics as a computer-generated layer. The basic sources of data for determining the masks, and thus ensuring correct occlusion in the composite, are the motion sensors on the mock-up and a predetermined look-up table against which pixels are compared to pull the hand mask and the mock-up mask.
While this system does not generalize to mobile AR, VR or any hybrid, it is an example of an attempt to provide a simple, though not fully automated, system for correctly analyzing real 3D space and locating virtual objects in proper perspective.
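For illustration only, the following minimal sketch shows the kind of per-pixel mask compositing the cited abstract describes. The array names and the simple boolean-mask logic are assumptions made for the example, not details taken from the Canon disclosure.

```python
import numpy as np

def composite_mixed_reality(real_rgb, cg_rgb, cg_mask, occluder_mask,
                            annotation_rgb, annotation_mask):
    """Per-pixel mixed-reality compositing sketch (illustrative only).

    real_rgb        HxWx3 captured real-space frame
    cg_rgb          HxWx3 rendered virtual frame
    cg_mask         HxW   True where a virtual pixel exists
    occluder_mask   HxW   True where a real foreground object (e.g. the user's
                          hand or the physical mock-up) must occlude the CG model
    annotation_rgb  HxWx3 annotation layer, drawn without regard to occlusion
    annotation_mask HxW   True where annotation pixels exist
    """
    out = real_rgb.copy()

    # Virtual pixels replace real pixels only where no real occluder is present.
    draw_cg = cg_mask & ~occluder_mask
    out[draw_cg] = cg_rgb[draw_cg]

    # Annotations are overlaid last, ignoring occlusion, as in the cited abstract.
    out[annotation_mask] = annotation_rgb[annotation_mask]
    return out
```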
In the field of video and optical see-through HMDs, little progress has been made in designing displays, optics and display systems that can deliver a satisfactory, realistic and accurate merged perspective view — including proper depth ordering and proper occlusion of the merged elements from any given observer position in real space — even assuming an ideally computed mixed-reality perspective is delivered to the HMD.
One system has been cited which claims to be the most effective, if partial, solution to this problem, and which is perhaps the only integrated HMD system proposed for it (as opposed to software/photogrammetry/data-processing and transmission systems intended to solve these problems in a general way, independent of the HMD): that proposed by Chunyu Gao in U.S. patent application No. 13/857,656 (U.S. publication No. 20140177023), "Apparatus for optical see-through head mounted display with mutual occlusion and opaqueness control capability."
Gao begins his survey of the field of see-through HMDs for AR with the following observations:
"There are two types of ST-HMD: optical and video (J. Rolland and H. Fuchs, "Optical versus video see-through head-mounted displays," in Fundamentals of Wearable Computers and Augmented Reality, pp. 113-157, 2001). The main drawbacks of the video see-through approach include: degradation of the image quality of the see-through view; image lag due to processing of the input video stream; potential loss of the see-through view due to hardware/software failure. In contrast, an optical see-through HMD (OST-HMD) provides a direct view of the real world through a beam splitter and thus has minimal effect on the view of the real world. It is highly preferred in demanding applications where user awareness of the live environment is critical."
Gao, however, does not qualify his view of the problems with video see-through. He first equates prior-art video see-through with a particular dedicated LCD implementation, and he does not substantiate the assertion that an LCD must degrade the see-through image (degraded relative to what, and by what criteria, is also left unstated). Those skilled in the art will recognize that this view of low image quality derives from results obtained in early full-view LCD systems, before recent advances in the field accelerated. It is neither true nor obvious that, comparing the most advanced LCD or other video see-through display technologies against optical see-through systems that reprocess or adjust the "real" see-through image by means of many optical devices and other display technologies, the end result of the former is relatively poor or inferior to proposals such as Gao's.
Another problem with this tacit summary is the assumption of lag in this type of see-through relative to other systems that must also process an input live image. Such speed comparisons are, in general, properly the result of a detailed analysis of the components of the competing systems and their performance. Finally, the conjecture that the see-through view may be lost to hardware/software failure is essentially open-ended and arbitrary, and has not been validated by any rigorous analysis comparing system robustness or stability between video and optical see-through schemes generally, or between particular versions of either and their component technologies and system designs.
Beyond these initial problems of erroneous and biased characterization of the field for purposes of comparison, there are qualitative problems with the proposed solution itself, including omissions and a failure to consider the proposed HMD as a complete HMD system and as a component in a broader AR system, with the data acquisition, analysis and distribution problems previously referenced. The level and quality of data and of processing power available for generating altered or blended images cannot simply be taken by the HMD as a "given"; these are significant problems and dilemmas that the HMD itself and its design can either help or hinder.
Furthermore, the full dimension of the problem of visually integrating the real and the virtual on a mobile platform is omitted from the description of the problem the solution addresses.
The contrast with a system adopting the present disclosure and its teachings is, specifically, as follows:
as already described in the background section above, gao suggests the use of two display-type devices, since the specifications of the spatial light modulator that is operable to selectively reflect or transmit live images are essentially those of an SLM for the same purpose as it is in any display application.
The output images from the two devices are then combined in a beam splitter combiner, while being arranged on a pixel-by-pixel basis, assuming no specific explanation other than a statement regarding the accuracy of such devices.
However, to achieve this merging of the two pixelated arrays, gao specifies what he calls a replica of "folding optics", but basically nothing but a dual version of the Mann Eyetap scheme, requiring a total of two "folding optics" elements (e.g. a plane grating/HOE or other compact prism or "flat" optical element, one for each light source, plus two objective lenses (one for the wavefront from the real view and the other for the combined image and the focal point of the beam splitter combiner).
Therefore, multiple optical elements (for which he provides a number of conventional optics variations) are required to: 1) Light of a real scene via a first reflecting/folding optic (planar grating/mirror, HOE, TIR prism or other "flat" optic) and from there to the objective lens is collected and passed to the next planar grating/mirror, HOE, TIR prism or other "flat" optic to "fold" the light path again, all to ensure that the whole optical system is relatively compact and contained in a schematic set of two rectangular optical relay zones; from the folded optical system, the beam passes through a beam splitter/combiner to the SLM; the now pixellated real image is then returned to the beam splitter/combiner on a pixellated (sampled) basis, either reflected or transmitted, thereby variably (from real image contrast and intensity variations to modify grey levels, etc.) modulated. While the display generates virtual or composite/CG images in synchrony, it may also be calibrated to ensure easy integration with the modified pixelized/sampled real wavefront, and passed through a beam splitter to integrate the pixels of the real scene to pixels, using a multi-step, modified and pixelized sample, from there through an eyepiece objective, and then back to another "folded optical" element to be reflected from the optical system to the viewer's eye.
In general, for the modified pixelated sample portion of the real image wavefront, before reaching the viewer's eye, it passes through seven optical elements, excluding the SLM; the display generates a composite image that passes through only two.
Precise alignment of optical image combiners down to the pixel level — whether of reflected light collected from a sampled image or of a combined image produced by a small SLM/display device — and the maintenance of that alignment, particularly under mechanical vibration and thermal stress, is recognized in the art as a significant concern.
Free-space beam-combining systems for digital projection, which combine the outputs of high-resolution (2k or 4k) red, green and blue image engines (images typically generated by DMD or LCoS SLMs), are expensive, and maintaining their alignment is critical — and those designs are simpler than the seven-element Gao scheme.
In addition, such complex multi-engine, multi-element optical combiner systems are far from as compact as an HMD requires.
Monolithic prisms (such as the T-rhomboid combiner developed and sold by Agilent for the life-science market) have been developed specifically to address the problems exhibited by free-space combiners in existing applications.
And while companies such as Microvision have successfully migrated their SLM-based technology, originally developed for micro-projection, onto HMD platforms, those optical arrangements are generally substantially less complex than Gao's proposal.
Furthermore, it is difficult to determine the rationale for the two image-processing steps and computational iterations on the two platforms, and why they are needed to achieve smooth integration of the real and virtual wavefront inputs and correct occlusion/opacity of the combined scene elements. The problem that appears to interest and trouble Gao most is that the synthetic image competes with, and has difficulty matching, the brightness of the real image; the main task of the SLM therefore appears to be selectively reducing the brightness of part or all of the real scene. Although Gao does not specify, nor detail, how the SLM performs its image-modification functions, it may be inferred that occluded pixels can simply be discarded while the intensity of occluded real-scene elements is reduced, for example by minimizing the duration for which DMD mirrors sit in the reflecting position in a time-division-multiplexed system.
Among the many parameters that must be accounted for in computation, calibration and alignment is determining exactly which pixels from the real field correspond to which synthetic pixels. Without an exact match, ghosting, misregistration and erroneous occlusion multiply, especially in moving scenes. The reflective optical element that passes the wavefront portion of the real scene to the objective lens has a perspective position relative to the scene that differs from the viewer's own perspective position in the scene; it is not flat or centered on the eye, and it captures only a wavefront sample, not the viewer's viewpoint. Moreover, when the wearer moves, it moves as well, in ways not known in advance to the synthetic-image processing unit. For these reasons alone, the number of variables in the system is very large.
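As an illustration of why per-pixel correspondence is not trivial, the following sketch reprojects a sampled real-view pixel into a synthetic render camera using an assumed depth value and an assumed relative pose. None of these quantities, names or values come from Gao's disclosure; a real system would additionally need per-pixel depth and continuously updated pose.

```python
import numpy as np

def real_to_synthetic_pixel(u, v, depth, K_real, K_synth, R, t):
    """Map a pixel (u, v) with known depth in the real (pass-through) camera
    frame to the corresponding pixel in the synthetic render camera.

    K_real, K_synth : 3x3 intrinsic matrices (illustrative values below)
    R, t            : rotation and translation from real-camera to render-camera frame
    """
    # Back-project the real pixel to a 3D point in the real camera frame.
    p = depth * np.linalg.inv(K_real) @ np.array([u, v, 1.0])
    # Transform into the render camera frame and project.
    q = K_synth @ (R @ p + t)
    return q[0] / q[2], q[1] / q[2]

# Illustrative use with assumed values: identical intrinsics, 1 cm lateral offset.
K = np.array([[1200.0, 0, 960], [0, 1200.0, 540], [0, 0, 1.0]])
print(real_to_synthetic_pixel(1000, 600, depth=2.0, K_real=K, K_synth=K,
                              R=np.eye(3), t=np.array([0.01, 0, 0])))
```

Even in this idealized case, the mapping depends on depth and on the relative pose of the sampling optic, both of which change continuously as the wearer moves.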
If those variables are enumerated, and the goal of the solution is stated more specifically, it may become clear that there are simpler ways to achieve it than adding a second display-type device (in a binocular system, two added devices in total, specified as SLMs).
Second, inspection of the scheme suggests that — given the durability of such a complex system with multiple cumulative alignment tolerances, the accumulation over time of imperfections and wear in the original components of the multi-element path, misalignment of the combined beams under cumulative thermal and mechanical-vibration effects, and the other complications arising from the complexity of a seven-plus-element optical system — it is this system that may inherently degrade the external live-image wavefront, particularly over time.
In addition, as noted in detail previously, the problem of computing the spatial relationship between real and virtual elements is of central importance. Designing a system that must drive two (or, in a binocular system, four) display-type devices, most likely of different types (and therefore with different color gamuts, frame rates, etc.), from those computations adds complexity to already critical system design parameters.
Furthermore, a high frame rate is essential to deliver high-performance imagery without ghosting or lag and without causing eye and visual-system fatigue. For the Gao system, the design is somewhat simplified only if a transmissive rather than a reflective SLM is used; but even with faster ferroelectric LCoS microdisplays, the frame rate and image speed are still much lower than those of MEMS devices such as the TI DLP (DMD).
However, since higher HMD resolutions are also required, at minimum to achieve wider FOVs, resorting to high-resolution 2k or 4k DMD devices such as TI's means resorting to a very expensive solution: DMDs of the required feature size and count are known to be low-yielding, with defect rates higher than is generally tolerable for mass consumer or enterprise production and cost targets, and they are very expensive in the systems that use them today, such as the digital cinema projectors sold by TI's OEMs Barco, Christie and NEC.
Starting from the planar-optics projection technology of optical see-through HMDs (e.g., Lumus, BAE, etc.), in which occlusion is neither a design goal nor possible within the scope and capabilities of those methods, it is an intuitive and easy step to essentially replicate that method, add adjustment of the real image, and then combine the two images using conventional optical arrangements such as Gao proposes, relying on a large number of planar optical elements to perform the combination in a relatively compact space.
To conclude this background review and return to the current leaders in the two major HMD categories, the optical see-through HMD and the classic VR HMD, the prior art can be summarized as follows — noting that other variant optical see-through and VR HMDs are both commercially available and the subject of extensive research and development, with a large volume of commercial and academic work, including product announcements, publications and patent applications, accelerating since the breakthroughs of Google Glass and the Oculus VR HMD (Rift):
At this writing, Google's Glass, the commercially leading mobile AR optical HMD, has established breakthrough public visibility and a dominant market position for the optical see-through HMD category.
Google has, however, entered a market alongside others who developed and deployed products in major defense/industrial sectors, including Lumus and BAE (Q-Sight holographic waveguide technology). Companies such as TruLife Optics, commercializing UK National Physical Laboratory research, are also active in the holographic waveguide field, entering other recent markets and research phases and claiming advantages of their own.
For many military helmet-mounted display applications, and for the primary Glass use case as described previously, the superimposition of text and symbolic graphical elements on the view space requires only a rough positional association, which may be sufficient for many initial, simple mobile AR applications.
However, even for information-display applications it is apparent that the greater the density of tagged information for items and terrain in the view space facing (and ultimately surrounding) the viewer, the greater the need for spatial ordering/layering of tags to match the perspective and relative positions of the tagged elements.
Overlap — that is, partial occlusion of a tag by real elements in the field of view, and not merely by other tags — therefore necessarily becomes something even a "basic" information-display optical see-through system must manage in order to control visual clutter.
In addition, tags must reflect not only the relative positions of the tagged elements in the perspective view of real space, but also automated priority (predetermined or software-computed) and real-time, user-specified priority, with tag size and transparency being, to name two, the primary visual cues a graphics system uses to convey that information hierarchy.
The problem then immediately arises of how to handle the relative brightness of the live scene, as seen through the optical elements of these basic optical see-through HMDs (whether monocular or full binocular type), against the superimposed, display-generated graphical elements — especially in bright outdoor lighting and in very dim outdoor conditions, and in view of the translucency and overlap/occlusion issues of tags and superimposed graphics just discussed. Night-time use is obviously the extreme case of the low-light problem if the utility of these display types is to be fully extended.
Thus, when we work through even the most limited use-case conditions of the passive optical see-through HMD type, as information density increases — to be expected with the commercial success of such systems and with the typically dense urban or suburban areas in which tagged information from commercial enterprises will be served — and as use under bright and dim conditions adds constraints, it becomes clear that the "passive" optical see-through HMD can neither escape nor cope with the problems and needs of any realistic, practical implementation of a mobile AR HMD.
Passive optical see-through HMDs must therefore be considered an incomplete model for implementing a mobile AR HMD and, in retrospect, will be seen as merely a transitional stepping stone to an active system.
Oculus Rift VR (Facebook) HMD: somewhat similar in impact to the Google Glass marketing campaign — but with the difference that Oculus actually led the field in solving, or beginning substantially to solve, some of the significant threshold obstacles of a practical VR HMD (rather than following Lumus and BAE, as Google did) — the Oculus Rift was, at this writing, the leading pre-mass-release VR HMD product, entering and creating a market for widely accepted consumer and commercial/industrial VR.
The basic threshold advances of the Oculus Rift VR HMD can be summarized in the following list of product features:
the significantly broadened field of view, achieved by using a single current 7 inch diagonal display of 1080p resolution, is located a few inches from the user's eyes and is divided into binocular see-through regions on the single display. As written herein, the current FOV is 100 degrees (improving its original 90 degrees) compared to the total 45 degrees of the current HMD specification. The independent binocular optical device realizes the stereoscopic vision effect.
Significantly improved head tracking, resulting in low lag. This advance combines improved motion sensors and software, leveraging micro motion-sensor technology (accelerometers, MEMS gyroscopes, etc., for 3D position tracking) carried over from the Nintendo Wii, from Apple and the other fast followers in cell-phone sensor technology, from the PlayStation PSP and current Vita, the Nintendo DS and current 3DS, and the Xbox Kinect, as well as from other hand-held devices with built-in motion sensors. Current head tracking adds a multi-point infrared optical system working in concert with an external sensor.
Low latency, the combined result of the improved head tracking and fast software/processor updates to the interactive gaming software, though still limited by the inherent response time of the display technology employed; the original LCD was replaced by a faster OLED.
Low persistence, a form of buffering, to help keep the video stream smooth, works in conjunction with higher switching speed OLED displays.
Lighter weight, smaller volume, better balance, and overall improved ergonomics are achieved by employing ski goggles form factors/materials and mechanical platforms.
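Referring to the field-of-view feature noted above, the following minimal geometric sketch shows how a single shared panel viewed at close range yields a wide FOV. The panel width and effective viewing distance are assumed, illustrative values, not Oculus specifications, and the simple magnifier model ignores lens distortion.

```python
import math

def binocular_fov_deg(panel_width_m, effective_eye_distance_m):
    """Approximate horizontal FOV of one eye viewing half of a single shared panel
    through a simple magnifier (illustrative model only)."""
    half_panel = panel_width_m / 2.0   # each eye sees roughly half the panel
    return 2.0 * math.degrees(math.atan((half_panel / 2.0) / effective_eye_distance_m))

# Assumed numbers: a 7-inch 16:9 panel is ~0.155 m wide; ~0.04 m apparent eye
# distance after the magnifying optics.
print(round(binocular_fov_deg(0.155, 0.04), 1), "degrees per eye (approx.)")
```

With these assumed values the result is on the order of 90 degrees per eye, consistent with the broad FOV figures cited above; the actual value depends on the optics' magnification and distortion profile.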
To summarize the net benefit of combining these improvements: while such a system may contain nothing structurally or operationally new, the net effect of the improved components, a particularly effective industrial design (design patent US D701,206) and the proprietary software has been a breakthrough level of performance and a validation of the mass-market VR HMD.
Following Oculus's lead and, in many cases, its approach, several contemporaneous product programs were redesigned on the strength of the Oculus Rift configuration's success, and many VR HMD developers — established brands and new ventures alike — announced product plans after the first 2012 trade-show appearance and Oculus VR's Kickstarter financing campaign.
Among these fast followers, and other companies that evidently changed strategy to follow the Oculus VR template, are Samsung (which has demonstrated a development pattern very similar to the Oculus VR Rift design described herein) and Sony's Morpheus. Newer companies gaining attention in this area include Vrvana (formerly True Player Gear), GameFace, InfinitEye and Avegant.
None of these system configurations is exactly the same as the Oculus VR: some use two panels, others four — InfinitEye uses a four-panel system to extend the FOV to a purported 200+ degrees. Some use LCDs and some OLEDs. Optical sensors are used to improve the accuracy and update speed of the head-tracking systems.
All of these systems are implemented for essentially stationary or highly constrained mobile use, employing on-board and active optical-marker-based motion tracking designed for enclosed spaces such as a living room, an operating room or a simulator stage.
The systems differing most from the Oculus VR template are Avegant's Glyph and the Vrvana Totem.
The Glyph in fact implements a display solution that follows the previously established optical see-through HMD approach and architecture, employing a Texas Instruments DLP DMD to generate a projected micro-image on a reflective planar optical element, configured and operated identically to the planar optics of existing optical see-through HMDs, except that a high-contrast, absorbing backplane architecture is used to realize a reflective/indirect micro-projector display type whose video images belong to the general class of opaque, non-transparent display images.
However, as established above in the discussion of Gao's disclosure, the limitations on pushing display resolution and other system performance beyond 1080p/2k when using a DLP DMD or other MEMS component are those of cost, manufacturing yield and defect rate, durability and reliability.
Furthermore, the limited expansion/magnification factor of the planar optical elements (grating structure, HOE or other) enlarges the SLM image only so far, while increasing the interaction with, and burden on, the Human Visual System (HVS), especially the focusing system; the limit on image size/FOV that follows from this limited magnification places limits on safety and viewer comfort. User responses to a similarly sized but lower-resolution image in the Google Glass trials indicate that taxing the HVS further, with a higher-resolution, brighter, but equally small image area, is a challenge for the HVS. Google's official advisor, Dr. Eli Peli, cautioned Google Glass users, in an interview with the online site BetaBeat (5/19/2014), to expect some eye strain and discomfort, and then revised the caution (5/29/2014) in an attempt to limit the cases and scope of potential issues to the way the eye muscles are used in being forced to find the location of the small displayed image — a use for which they are not designed or intended over long durations.
However, the particular combination of eye-muscle use required for focal work over a small portion of the real FOV cannot be assumed to be the same as that required for eye movement over the entire real FOV. In fact, the small fine adjustments of the local muscles are more constrained and limited than the range of motion involved in scanning a natural FOV. Thus, as is known from repetitive-motion problems generally, the stress is not confined to the focusing action alone: given the nature of the HVS, excess load is to be expected not only from use beyond the normal range but also from the restricted range of motion and the requirement for very small, controlled fine adjustments.
A further complication is that, as resolution increases, the level of detail presented within this constrained eye-movement region may quickly begin to exceed the eye fatigue known from fine tool work on scenes with complex, fine motion. No developer of these optical systems has rigorously addressed this problem; these issues — together with the eyestrain, headaches and dizziness reported by Steve Mann over many years of using his EyeTap system (reported to be partially improved by moving the image to the center of the field of view in the recent Digital EyeTap update, though not systematically studied) — have received only limited comment, focused on partial issues and on the eye strain that can develop from near work and "computer vision syndrome."
The limited public guidance provided by Google does, however, repeatedly state that Glass, as an optical see-through system, should in general be used cautiously, for occasional viewing rather than prolonged or high-frequency viewing.
Another way to understand the Glyph scheme is that, at the highest level, it follows the Mann Digital EyeTap system and architectural arrangement, with variations: it is implemented for optically isolated VR operation and adopts the lateral-projection, plane-deflection optics arrangement of current optical see-through systems.
The Vrvana Totem departs from the Oculus Rift model by adopting Jon Barrilleaux's "indirect view display" scheme: a pair of conventional video cameras is added to allow switching between forward video capture and the simulation, both presented on the same optically shrouded OLED display panel. Vrvana represents in its marketing materials that it can implement this very basic "indirect view display," following exactly the AR schematic and model set out by Barrilleaux. Obviously, virtually any other VR HMD of the Oculus generation could be fitted with such conventional cameras, albeit with some impact on the weight and balance of the HMD.
As is evident from the above, little substantial progress has been made in the "video see-through HMD" category, or more generally in the field of the "indirect view display"; the one well-developed subtype, night vision goggles, lacks any AR capability beyond the addition of text or other simple graphics to the live image by video-processing methods known in the art.
In addition, regarding the existing limitations of VR HMDs: all such systems employing OLED and LCD panels suffer from relatively low frame rates, which produce motion lag and latency as well as negative physiological effects on some users, falling within the broad category of "simulator sickness." It has also been reported that, in cinema digital stereoscopic projection employing commercial stereo systems such as RealD — implemented in Texas Instruments DLP DMD-based or Sony LCoS-based projectors — insufficiently high frame rates cause a small percentage of viewers (up to 10% in some studies) to experience headaches and related symptoms. Some of this is idiosyncratic to those individuals, but a large part can be traced to frame-rate limitations.
Moreover, as noted, Oculus VR has partially implemented a "low persistence" buffering system to compensate for the still insufficiently high pixel-switching/frame rate of the OLED display in use at this writing.
Yet another impact on the performance of existing VR HMDs comes from the resolution limits of existing OLED and LCD panels, which partly drive the requirement to use 5-7 inch diagonal displays mounted at a distance from the viewing optics (and the viewer's eyes) to achieve sufficient effective resolution, contributing to existing and planned products being significantly larger, bulkier and heavier than most other head-worn optical products.
A potential partial improvement is expected from curved OLED displays, which may further improve the FOV without increasing volume. However, the expense of bringing such displays to market in sufficient quantity — requiring large additional investment in plant capacity at acceptable yields — makes this prospect less practical in the short term, and it only partially addresses the volume and size problems.
For completeness, reference must also be made to video HMDs intended for viewing video content but without interaction or any motion-sensing capability, and thus without the ability to navigate a virtual or mixed-reality/AR world. Over the last fifteen years such video HMDs have improved substantially, increasing effective FOV, resolution and viewing comfort/ergonomics, and providing a path of development that current VR HMDs have been able to draw on and build upon. They too, however, are limited by the core performance of the display technologies employed, in a pattern that follows the limitations observed for OLED, LCD and DMD-based reflective/deflection optical systems.
Other important variations on the paradigm of an image projected onto transparent eyeglass optics include those from the Osterhout Design Group, Magic Leap, and Microsoft (HoloLens).
Although these variations have some relative advantages or disadvantages-relative to each other and to other prior art reviewed in detail earlier-they all retain the limitations of the basic approach.
More fundamentally and generally, they are also limited by the underlying display/pixel technology employed: the frame rate/refresh of existing core display technologies — whether fast LC, OLED or MEMS, and whether mechanically scanned fiber-optic input or other disclosed optical systems are used to deliver the display image to the viewing optics — remains insufficient for the high-quality, easy-to-view (for the HVS), low-power, high-resolution, high-dynamic-range display performance whose parameters individually and collectively make possible mass-market, high-quality, pleasing AR and VR.
To summarize the state of the art, for the details described previously:
"high visual acuity" VR has improved substantially in many respects from FOV, latency, head/motion tracking, lighter weight, size and volume.
But frame rate/latency and resolution — and, by significant implication, weight, size and volume — are all constrained by the limits of the available core display technologies.
Current VR is limited to stationary use in small controlled spaces, or to highly constrained and limited mobile use.
The closed (VR) variant of the optical see-through approach, configured as a lateral projection-deflection system in which the SLM projects an image to the eye through a series of three optical elements, is limited in the size of the reflected image it can present: the image is magnified, but not to much more than the output of the SLM (DLP DMD, other MEMS, or FLCoS/LCoS) relative to the total area of a standard spectacle lens. The resulting extreme form of "close work" viewing extends the eye-strain risks that tax the eye muscles, a further limit on practical acceptance. SLM type and size also constrain the practical routes to improved resolution and overall performance, through the scaling cost of higher-resolution SLMs of the cited technologies.
Optical see-through systems typically have the same potential for eye fatigue, require relatively small and frequent eye tracking adjustments due to restricting eye muscle use to relatively small areas and within these constraints, and are mostly used for short term use. The design of Google Glass is intended to address the desire for limited duration use by positioning the optical elements up and beyond the direct resting position of the eye in front of direct vision. The user still reported eye strain as it has been extensively documented in media reports by means of texts and interviews from Google Glass Explorers.
Optical see-through systems are limited in the overlapping translucent information density due to the need to organize tags with real world objects in perspective. Even for graphic information display applications, the requirements of mobility and information density make passive optical viewing limited.
The "indirect view display" has been realized in the form of night vision goggles, and Vrvana, an Oculus VR competitor, has so far only proposed adapting its camera-equipped binocular Totem to AR.
The Gao proposal, while presented as an optical see-through display, is in fact closer to an "indirect view display" with quasi-see-through aspects, acting like a modified projection display in that an SLM samples a portion of the real wavefront and digitally alters it.
The number of optical elements interposed in the optical path of that initial wavefront portion (which, it should be added, is much smaller than the optical zone of a conventional lens in a pair of ordinary spectacles) is seven or thereabouts, introducing opportunities for image aberration, artifacts and loss; free-space alignment of that many elements is uncommon in fields where complex optical alignment systems are required, and where it is required it is expensive, difficult to maintain and not robust. The methods by which the SLM is expected to manage the wavefront modifications of the real scene are likewise neither specified nor validated against specific requirements. In an environment where the computation needed to establish the proper relationship between real and synthetic elements in perspective view is already very demanding — especially when the individual is moving through an information-dense, topologically complex environment — coordinating signal processing among two to four display-type devices (depending on monocular or binocular configuration), including determining exactly which pixels from the real field are the registration pixels for the proper synthetic pixels, is no small problem either. Use on a moving vehicle only exacerbates it further.
Developing a complete system raises numerous additional problems beyond the task of constructing the optical device Gao proposes, or even of reducing it to a relatively compact form factor. Size, balance and weight — affected by the number and necessary locations of the various processing and optical-array units — are relatively minor issues compared with the other problems and limitations cited, though they matter for practical field deployment of such systems in military, ruggedized industrial or consumer use.
Setting aside the details of the number and alignment of display-type elements, optics, pixel-system matching and perspective, a 100% "indirect view display" has requirements similar in key respects to Gao's proposal, which raises the question of the extent to which all key parameters of such a system should require "brute force" computation against a stored synthetic CG 3D-mapped space coordinated with a real-time, single-perspective see-through image. The problem grows to the extent that those computations must all be performed on video captured by the forward-looking camera and forwarded to a processor that is not local (to the HMD and/or the wearer) for compositing with the synthetic elements, as is possible in the basic Barrilleaux and Vrvana designs.
What is needed for a truly mobile system, whether VR or AR, to achieve immersion and registration with the real environment is the following:
An ergonomic optical and viewing system that minimizes any abnormal demands on the human visual system, so as to permit the extended use that mobile use implies.
A wide field of view, ideally including a 120-150 degree peripheral view.
High frame rate, ideally 60 fps/eye to minimize latency and other artifacts typically caused by the display.
Effective resolution at a comfortable distance of the unit from the face. The criterion against which effective resolution can be measured at the upper end is either an effective 8k or a "retinal display." The distance should be similar to that of conventional spectacles, typically balanced on the bridge of the nose. Collimation and optical-path optics are necessary to establish a suitable virtual focal plane that achieves this effective display resolution at the actual distance of the optical elements from the eye. (An illustrative angular-resolution calculation follows this list.)
High dynamic range, matching as closely as possible the dynamic range of the live, real view.
On-board motion tracking that determines the orientation of both head and body within a known topography — whether known in advance or mapped in real time within the wearer's field of view. This can be supplemented by an external system in a hybrid scheme.
Display optics capable of fast compositing, within the perceptual limits of the human visual system, between the real-scene wavefront and any synthetic elements. Passive devices should be used as far as possible to minimize the burden on the on-board (HMD and wearer) and/or external processing systems.
Display optics, relatively simple and robust, few optical elements, few active device elements, simple active device design, low weight and thickness, and robust under mechanical and thermal stress.
Light weight, small volume, a balanced center of gravity, and a form factor suited to design configurations known to be acceptable to professional users (e.g., military and ruggedized industrial users), to rugged sports applications, and to general consumer and commercial use. Acceptable form factors range from those of eyewear makers such as Oakley, Wiley X, Nike and Adidas to the somewhat more specialized makers of sports goggles such as Oakley, Adidas, Smith and Zeal.
A system that can switch variably between a VR experience and a see-through, integrated hybrid AR view, while maintaining full mobility and variable occlusion.
A system that can both manage the wavelengths incident on the HVS and obtain useful information from wavelengths of interest via sensors and combinations thereof; IR, visible and UV are the typical wavelengths of interest.
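As flagged in the effective-resolution requirement above, the following short calculation illustrates why an effective 8k-class display is a reasonable upper benchmark: it assumes the standard figure of roughly one arc-minute of visual angle per pixel for normal (20/20) acuity, which is an assumption for this example rather than a value taken from the disclosure.

```python
import math

def pixels_for_acuity(fov_deg, acuity_arcmin=1.0):
    """Horizontal pixel count needed so one pixel subtends `acuity_arcmin`
    of visual angle across a field of view of `fov_deg` degrees."""
    return int(math.ceil(fov_deg * 60.0 / acuity_arcmin))

for fov in (100, 120, 150):
    print(fov, "deg FOV ->", pixels_for_acuity(fov), "px across for ~1 arcmin/pixel")
# 100 deg -> 6000 px, 120 deg -> 7200 px, 150 deg -> 9000 px: hence the
# "effective 8k" figure cited above as a practical upper benchmark.
```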
The system proposed by the present disclosure addresses these problems and meets these goals (tasks and criteria), which the prior art — with its fundamentally limited and inadequate augmentation and virtual reality functionality — does not.
The present disclosure incorporates and implements features of telecommunications architectures and pixel signal processing systems and hybrid magnetic photonics (co-pending U.S. patent application [2008] and photonic encoders of the same inventor), as well as preferred pixel signal processing sub-types of hybrid MPC pixel signal processing, display and networking (co-pending U.S. patent application of the same inventor). The addressing and powering of the devices (particularly the arrays) is preferably wireless addressing and powering of the arrays of the pending U.S. patent application, and preferred embodiments of the hybrid MPC type system are also found in the 3D plants and systems of the pending U.S. patent application.
These pending applications are incorporated by reference in their entirety.
However, in establishing the preferred versions and implementations of these classes of system types, critical subsystems and sub-subsystems, it is not implied that all of the details of the present proposal are contained in the referenced applications, nor that the present application is merely a combination of those systems, structures and methods.
Rather, the present proposal sets out new and improved systems and subsystems — in most cases belonging to the cited (and often new) classes and levels — detailing components, systems, subsystems, structures, processes and methods which, through the unique combination of these and other classes of constituent elements, also enable unique new mobile AR and VR systems, with wearable systems as the preferred embodiments and, among wearable systems, head-mounted systems most preferred.
The specification of the proposed system is preferably developed by organizing the overall structure and operational structure in groups (lists) of major subsystems, and then providing the details of these subsystems in a hierarchical fashion.
The main subsystems are as follows:
I. a telecommunications system type architecture for displays with pixel signal processing platforms, and preferably hybrid MPC pixel signal processing, including photonic coding systems and devices.
II. Sensor systems for mobile AR and VR
III. Structural and substrate systems
Together, these major subsystems implement a novel integrated hybrid direct-view display system combining "generated" imagery with variable pass-through (direct transmission) viewing:
I. a telecommunications system type architecture for a display having a pixel signal processing platform, and preferably hybrid MPC pixel signal processing, includes a photonic coding system and apparatus:
it is an object of the present disclosure to employ passive optical systems and components as much as possible to help minimize the need for active device systems for processing sensor data, especially in real-time, as well as for 3D perspective integration of computer generated imagery and computing real and synthesized/digital or stored digital image information.
The structure/operation of the image processing and pixel image display generation system-the following subdivision of the structural levels, subsystems, components and elements will include specifications of how this goal is achieved. The structure, components and operational stages of the system are sequentially truncated from the outer image wavefront to the transfer of the final intermediate image to the HVS (the order is arbitrarily set from left to right for simplicity) (see fig. 1)):
A. General case-main elements of the system:
1. IR/near-IR and UV filtering stages and structures (IR and near-IR filtering omitted in the system version implemented for night vision systems).
2. Polarization filtering to reduce the input through-illumination intensity, an option for which there are some benefits and advantages, or polarization filtering/sorting into channels, polarization rotation, and channel recombination to maintain a maximum input or through-illumination level, an option for which there are other benefits and advantages.
3. Pixelation or sub-pixelation of the real-world pass-through illumination, and the channels that implement it.
4. The through channels are integrated with an internally generated array of sub-pixels, combined in a unified array, to achieve optimal augmented/mixed reality or virtual reality image display presentation.
i. Two preferred overall schemes and structures/architectures for handling and processing pass-through (real-world) illumination: while other arrangements and versions are enabled by the general features of the present disclosure, the principal difference between the two preferred embodiments lies in how incoming natural light is processed as the channels of the structured optical system convey it through the subsequent processing stages to the inward, viewer-facing output surface of the compound optic. In one case, all real-world pass-through illumination is down-converted to an IR and/or near-IR "pseudo-color" representation for efficient processing; in the other, the real-world visible-frequency illumination is processed/controlled directly, without frequency/wavelength shifting.
ii. Generated/"artificial" sub-pixels in the unified array: preferably a hybrid magneto-photonic, pixel-signal-processing and photonic-encoding system. The same overall methods, sequences and procedures apply to the pass-through optical channels in those versions and scenarios in which all pass-through light is down-converted to IR and/or near IR.
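Purely as an organizational illustration of the unified array just described — the actual disclosure realizes this in passive and active photonic hardware, not software — the following sketch shows pass-through and generated sub-pixel channels interleaved into a single output array; all names and values are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class SubPixelChannel:
    kind: str     # "pass_through" or "generated"
    value: float  # normalized channel intensity (0..1)

def render_unified_array(channels: List[SubPixelChannel],
                         modulate_pass_through: Callable[[float], float],
                         generate: Callable[[int], float]) -> List[float]:
    """Produce the final per-sub-pixel intensities of the merged array.

    Pass-through sub-pixels carry (optionally modulated) real-world light;
    generated sub-pixels take their value from the synthetic image source.
    Both end up interleaved in the single output array presented to the HVS.
    """
    out = []
    for i, ch in enumerate(channels):
        if ch.kind == "pass_through":
            out.append(modulate_pass_through(ch.value))
        else:
            out.append(generate(i))
    return out

# Illustrative use: attenuate real light to 80%, drive generated sub-pixels from a
# placeholder synthetic frame buffer.
frame_buffer = {1: 0.6, 3: 0.9}
array = [SubPixelChannel("pass_through", 0.7), SubPixelChannel("generated", 0.0),
         SubPixelChannel("pass_through", 0.4), SubPixelChannel("generated", 0.0)]
print(render_unified_array(array, lambda v: 0.8 * v, lambda i: frame_buffer.get(i, 0.0)))
```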
B. Detailed description of the invention
1. IR/near-IR and UV filtering stages and structures: the wearable HMD "glasses" or "goggles" has a first optical element — preferably in binocular form, either as separate left and right elements or as a single goggle-like connected element — which intercepts the see-through real-world wavefront of light arriving from the outside world generally forward of the viewer/wearer.
This first element is composite or structured (e.g., a base/structural optical element on which a multilayer material/film is deposited, or which is itself a periodic or aperiodic but complex 2D or 3D structured material, or a mixture of composite and direct structuring), implementing IR and/or near-IR filtering and UV filtering. More specifically, these may be gratings/structures (photonic crystal structures) and/or bulk thin films whose chemical composition effects reflection and/or absorption of the unwanted frequencies. Such options for material structuring are well known in the relevant art, and many are available commercially.
In some embodiments, particularly for night vision applications, IR filtering is eliminated, and some elements of the functional sequence are reordered, eliminated or modified, following the pattern and structure of the present disclosure. Details of this class and version of embodiments are given below.
2. Polarization filtering (to suppress the incoming pass-through illumination intensity), or polarization filtering/sorting into channels, polarization rotation and channel recombination (to preserve the maximum input or pass-through illumination level): a similar filter, best placed after the first filter in the optical sequence, is the next element toward the right of the figure, the polarization filtering or polarization sorting stage. This may in turn be a bulk polarizer film or deposited material, and/or a polarization grating structure, or any other polarization-filtering structure and/or material that offers the best combination of practical features and benefits for a given implementation in terms of efficiency, manufacturing cost, weight, durability, and the other parameters whose trade-offs must be optimized.
3. Polarization filtering option — result: after this series of optical elements arranged across the full aperture of the optical/optical-structural element, the incident wavefront has been frequency-filtered and has been polarization-mode filtered, or sorted/separated by mode. For visible frequencies, the net brightness of each mode channel has been reduced by the polarization filtering means; taking, for simplicity, the efficiency of current periodic grating structure materials as approaching 100%, filtering means that each channel rejects roughly 50% of the (unpolarized) incident light.
4. Polarization filtering, sorting, rotation of one channel, and recombination — result: by rotating one of the two separated/sorted mode channels into alignment with the other and recombining them, the combined intensity approaches, though does not exactly equal, the intensity of the original incident light before filtering/separation/sorting.
5. Benefits and importance: since these filters can be implemented in the same layer/material structure or sequentially in separate layer/material structures, what reaches the HVS 1) is protected from harmful UV, 2) is reduced in brightness, and 3) has IR and near IR removed (except in night vision applications, where the visible spectrum is minimal and no filtering of visible light is needed). Benefits/features 2 and 3 matter for the next stage and for the overall system, and are further elaborated below.
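To make the intensity bookkeeping of options 3 and 4 above concrete, the following sketch compares the two choices under assumed, illustrative component efficiencies; the 98% grating efficiency and 2% rotator loss are placeholders, not values from this disclosure.

```python
def polarization_budget(input_intensity=1.0, grating_efficiency=0.98, rotator_loss=0.02):
    """Intensity bookkeeping for the two pass-through options described above,
    assuming unpolarized input and illustrative component efficiencies."""
    per_mode = 0.5 * input_intensity * grating_efficiency        # each sorted mode channel

    option_filter_only = per_mode                                # keep one mode, ~50% loss
    option_sort_rotate_recombine = (per_mode                     # mode kept as-is
                                    + per_mode * (1.0 - rotator_loss))  # mode rotated 90 deg

    return option_filter_only, option_sort_rotate_recombine

single, recombined = polarization_budget()
print(f"filter-only pass-through: {single:.2f} of input")
print(f"sort/rotate/recombine:    {recombined:.2f} of input (close to, but not exactly, 1.0)")
```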
6. Pixelation or sub-pixelation of the real-world pass-through illumination and of the channels that implement it: sub-pixel subdivision of the incident wavefront — an optically passive or active structure or stage of operation — is implemented either before or, preferably, after the preceding stages, as the latter tends to reduce manufacturing cost. This subdivision can be achieved by a variety of methods known in the art, and by others not yet devised, including deposition of bulk materials of differing refractive index; material fabrication using a photochemical resist-mask-etch process, or using nanoparticles in colloidal solution by electrostatic/van der Waals and other self-assembly methods; focused ion beam etching or embossing, and in particular etching, cutting and embossing methods; fabrication of capillary micro-hole waveguide arrays; or fabrication of other periodic structures of the photonic crystal Bragg grating type, or of other periodic gratings or structures formed in bulk material by locally modifying its refractive index. Alternatively, or in combination with the cited or known methods or methods yet to be devised, the sub-pixel subdividing/guiding material structures forming the array over the area of the macro optical/structural element may be fabricated by assembling building blocks such as optical fibers and other optical-element precursors, including by methods disclosed elsewhere by the author of the present disclosure, by the preform-assembly methods for optical fiber device structures proposed by Fink and Bayindir, or by fused-glass or composite assembly methods.
Certain specific details and requirements of the various embodiments and versions of the present system that are applicable to this structural/operational level of the system will be covered in appropriate later stages of the following structural/operational subdivision of the system.
7. Integrating the pass-through channels with the internally generated sub-pixels in a merged array: beyond providing means for subdividing the incident wavefront from the forward field of view into portions suitable for controlled optical-path handling and for subsequent passive and/or active filtering and/or modification, it is very important to note that the array making up the total field of view presented to the viewer of the proposed system has two types of pixel/sub-pixel components, with two different "branch" processing sequences and operational structures en route to the final pixel rendering for the viewer. This is one of the first levels and requirements of the present composite structure and sequence of operations, at the appropriate level of which pixel-by-pixel and sub-pixel-by-sub-pixel light-path control is implemented.
8. Two pixel-signal component types — pass-through, and generated or artificial: at the pixel-signal-processing, pixel-logic-state-encoding stage, as disclosed in the referenced applications, we now deal with two pixel types or, more precisely, two pixel-signal component types.
9. Two preferred overall schemes and structures/architectures for handling and processing through (real world) lighting: while other arrangements and versions are enabled by the general features of the present disclosure, the main difference of the two preferred embodiments is that the processing of incoming natural light is essentially different, with the channels in the structured optical system conveying the light through the subsequent processing stages to the output surface of the compound optics surface inward/facing the viewer — in one case, all real world through illumination is down converted to "pseudo-color" of IR and/or near IR for efficient processing; in another case, the real world direct through visible frequency illumination is directly processed/controlled without frequency/wavelength shifting.
a. In a preferred version, the visible-light channels, which have already been UV and IR filtered and polarization-mode sorted (and optionally filtered to reduce the total intensity of the pass-through illumination), are frequency shifted to IR or near IR, in either case to invisible frequencies, implementing a "pseudo-color" range that preserves the relative band positions, widths and intensities on the new scale. After this frequency/wavelength down-conversion, the HVS will detect and see nothing at this stage of the photonic pixel signal processing. The subsequent photonic pixel signal processing of these channels is then essentially the same as that proposed for the generated pixel signal channels, as disclosed in the following section. (A schematic sketch of this pseudo-color band mapping follows item c. below.)
b. In another preferred embodiment, the pass-through channel is not frequency/wavelength modulated or down-converted to invisible IR and/or near IR. In this configuration, the preferred default configuration and pixel logic state of the pass-through channel is "on": for example, in a conventional linear Faraday rotation scheme for pixel state encoding/modulation, including input and output polarization filtering means, the analyzer (output polarization means) is oriented essentially the same as the input polarization means for any given polarization-mode-sorted sub-channel, so that when the linear Faraday effect pixel logic state encoder is driven, its operation is to reduce the intensity of the pass-through channel. Some features and required details of this embodiment are disclosed in the following sections, after the operational function and structure of the generated channels have been detailed.
If polarization filtering is combined with this preferred embodiment and variation, rather than mode sorting into separate mode channels that are then combined into merged channels by polarization rotation means (passive components such as half-wave plates and/or active magneto-optical or other mode/polarization-angle modulation means) so as to preserve as much of the original pixelated pass-through illumination as possible, then the overall brightness of the pass-through illumination will typically be reduced by about 50%. This is itself a preferred level and method, and in some cases will be the more preferred choice in view of the relative visible-range performance of the magneto-optical materials discussed herein.
Accordingly, the maximum background pass-through luminance is proportionally reduced, and it becomes correspondingly easier for the subsystem providing the "generated" (artificial, non-pass-through) sub-pixel channels, and its associated methods and apparatus, to match, integrate and reconcile the generated image elements within an overall illumination range that keeps the "augmented reality" imagery and views comfortable and realistic.
Alternatively, the pass-through channel may be configured in a default "off" configuration, such that, if a typical linear Faraday rotator scheme is employed, the input polarization means (polarizer) and output means (analyzer) are reversed or "crossed." It may become advantageous to employ such a default configuration as frequency-dependent MO materials (or other photonic modulation schemes employing frequency-dependent, performance-determining materials) continue to improve; in that case the pass-through illumination intensity is raised and managed from a default "off," near-zero or effectively zero intensity ground state by the subsequent photonic pixel signal processing steps and methods.
c. While down-conversion to IR is the proposed preference, UV is also an included option and may in some cases be used in the future to shift the input visible illumination into a convenient non-visible spectral domain for intermediate processing prior to final output, given the common material-system dependence of IR and near-IR performance optimization in photonic modulation apparatus and methods.
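As a minimal illustration of the "pseudo-color" down-conversion described in item a. above, the following Python sketch maps a visible wavelength to a proportionally positioned band in an assumed near-IR processing window while carrying the intensity through unchanged. The 400 to 700 nm visible range and the 900 to 1600 nm IR window are illustrative assumptions, not values taken from this disclosure.

```python
# Illustrative sketch only: maps visible wavelengths to a "pseudo-color"
# position inside an assumed near-IR processing window, preserving the
# relative band position and the per-channel intensity, as described for
# the down-converted pass-through scheme (item a. above).

VISIBLE_NM = (400.0, 700.0)      # assumed visible range handled by the through channel
PSEUDO_IR_NM = (900.0, 1600.0)   # assumed near-IR processing window (illustrative)

def to_pseudo_color_ir(wavelength_nm: float, intensity: float) -> tuple[float, float]:
    """Shift a visible sub-channel to its proportional position in the IR window.

    Relative band position is preserved so that the later up-conversion
    (phosphor/quantum-dot stage) can restore the original color ordering.
    """
    lo_v, hi_v = VISIBLE_NM
    lo_ir, hi_ir = PSEUDO_IR_NM
    if not lo_v <= wavelength_nm <= hi_v:
        raise ValueError("wavelength outside the assumed visible window")
    fraction = (wavelength_nm - lo_v) / (hi_v - lo_v)
    ir_wavelength = lo_ir + fraction * (hi_ir - lo_ir)
    return ir_wavelength, intensity   # intensity carried through unchanged

if __name__ == "__main__":
    for vis in (450.0, 550.0, 650.0):          # blue-ish, green-ish, red-ish bands
        ir, i = to_pseudo_color_ir(vis, 1.0)
        print(f"{vis:.0f} nm -> pseudo-color {ir:.0f} nm (intensity {i})")
```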
10. Merging the generated/"artificial" sub-pixels in the array: first, we consider the image-generating pixel signal component, in other words the pixel signal processing structure and sequence of operations, which is preferably a hybrid magneto-optical sub-pixel signal processing and photonic encoding system.
a. In the most common configuration of the image collection/processing/display subsystem of the overall system proposed for fully mobile AR in daylight conditions, the next structure, process and element in the sequence is the optical IR and/or near IR planar illumination spreading structure and pixel signal processing stage.
b. For such structures and procedures, the optical surfaces and structures (films deposited or mechanically laminated onto the structure/substrate, or materials patterned or deposited directly on the substrate, or combinations of methods known in the art) distribute IR and/or near-IR illumination uniformly across the entire optical area of a 100°+ FOV binocular lens or continuous goggle-type form factor. The IR and/or near-IR illumination is uniformly distributed by: 1) a combination of light-leaking fibers disposed in the X-Y plane of the structure, either all in the X or Y direction or in a grid (such leaky fibers, developed and commercially available from companies such as Physical Optics, leak the illumination carried by the fiber core transversely and in a substantially uniform manner over a specified design distance), together with a diffuser layer, such as the non-periodic 3D embossed structured films (embossed non-periodic micro-surfaces) commercially available from Luminit, Inc.; or 2) side illumination from IR and/or near-IR LED edge arrays or IR and/or near-IR edge laser arrays, such as VCSEL arrays, projected into planar sequential beam expander/spreader optics that intercept it as volumetric illumination, such as planar periodic grating structures, including holographic optical element (HOE) structures, commercially available from Lumus, BAE and other commercial suppliers referenced herein and in the previously cited co-pending applications, as well as other backplane diffusion structures, materials and devices; and, in general, other display backplane lighting methods, devices and structures known in the art or that may be developed in the future.
c. The purpose of this stage/structure in the sequence of operation and pixel signal processing is to emit IR and/or near-IR backplane illumination that is confined to the interior of the composite optical/material structure proposed so far, opposite the viewer-facing side, with the IR and/or near-IR filters reflecting the injected IR and/or near-IR illumination back toward the illuminating layer/structure.
d. It is important to note, even if it is a readily apparent fact, that the IR and/or near IR are not visible to the HVS.
e. The illumination source for the IR and/or near IR may be an LED, a laser (such as a VCSEL array), a mixture of both, or other devices known in the art or that may be developed in the future.
f. The injected IR and/or near IR illumination also has a single polarization mode, preferably plane polarized light.
g. This can be achieved by a polarization tuning arrangement: the output of the IR and/or near-IR LEDs and/or lasers and/or other illumination sources is separated using a polarization splitter or a sequence of filters/reflectors (such as fiber-optic splitters), and one plane-polarization component is passed through a passive and/or active polarization rotation device, such as a bulk magneto-optical or magneto-photonic rotator, a series of passive devices such as half-wave plates, or a mixture of these. A polarization filter, such as a high-efficiency grating or a 2D or 3D periodic photonic-crystal-type structure, placed at an angle to the incident light, can reflect the rejected light into the polarization-rotating optical train and channel, which is then recombined with the unaltered portion of the original illumination. In a waveguide, planar or fiber arrangement in which the polarization modes (plane polarizations) are separated, one branch is passed through a polarization tuning device and then recombined with the other branch. (A schematic intensity bookkeeping sketch of this recovery scheme follows item i. below.)
h. The source illumination may also be confined in its own configuration to produce only light that is plane polarized within a given angle or range.
i. The light may be generated and/or tuned locally in the HMD, or remotely from the HMD (such as on a wearable vest with power storage) and transmitted to the HMD via optical fiber. Within the HMD, the illumination and/or tuning stages and structures/devices may be in close proximity to the composite optical structure, or located elsewhere in the HMD and optically relayed, through optical fibers if farther away and/or through planar waveguides if closer.
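The following is a simple intensity bookkeeping sketch of the polarization recovery described in item g. above: unpolarized illumination is split into its two plane-polarized modes, one branch is rotated by 90 degrees, and the branches are recombined so that nearly all of the source power ends up in a single polarization mode, instead of roughly half being discarded at a simple polarizer. The component loss figures are illustrative assumptions only.

```python
# Schematic power bookkeeping for single-polarization illumination recovery:
# split unpolarized light into two plane-polarized branches, rotate one branch
# by 90 degrees, then recombine, instead of discarding half the light at a
# simple polarizer.  Loss figures are illustrative assumptions.

def single_polarizer(power_in: float, polarizer_loss: float = 0.0) -> float:
    """Unpolarized light through one polarizer keeps roughly half the power."""
    return 0.5 * power_in * (1.0 - polarizer_loss)

def split_rotate_recombine(power_in: float,
                           splitter_loss: float = 0.02,
                           rotator_loss: float = 0.03,
                           combiner_loss: float = 0.02) -> float:
    """Both plane-polarized modes are kept; one is rotated to match the other."""
    branch_a = 0.5 * power_in * (1.0 - splitter_loss)                          # already in the target mode
    branch_b = 0.5 * power_in * (1.0 - splitter_loss) * (1.0 - rotator_loss)   # rotated 90 degrees
    return (branch_a + branch_b) * (1.0 - combiner_loss)

if __name__ == "__main__":
    p = 1.0  # arbitrary source power
    print(f"single polarizer:       {single_polarizer(p):.2f}")
    print(f"split/rotate/recombine: {split_rotate_recombine(p):.2f}")
```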
j. The foregoing structures, and the structures of operation and process that follow, are examples of pixel signal processing as disclosed in the referenced applications, characterized by decomposing the generation of pixel signal characteristics into optimal stages, each handed to the next and generally operated at a wavelength optimized for that type of process, particularly with respect to the pixel state logic encoding stage and process. Many MO, EO and other optical interaction phenomena are, for most material systems, most efficient in the IR or near-IR band region. The overall system, method, structure, operation and process structures, as well as the details of each, including the necessary and optional elements, are disclosed in the referenced applications.
k. Pixel signal processing, pixel logic state encoding stage (modulator array):
l. After the illumination and harmonization stages, the IR and/or near-IR illumination passes through a pixel signal logic state encoding process, operation, structure and means; preferably, for the present disclosure, the modulation means falls within the category of magneto-optical modulation methods. One preferred method is based on the Faraday effect. Details of the apparatus and method are disclosed in the cited U.S. patent application "hybrid MPC pixel signal processing."
m. In a binary pixel signal logic state system, the "on" state is encoded by rotating the polarization angle of the incident plane-polarized light so that, when the light passes through the subsequent stage of the pixel signal processing system, it will pass a subsequent, oppositely oriented polarization filtering means (called an "analyzer").
n. In this type of MO (or, as a subtype, MPC) pixel signal logic state encoding system, the light passes through a medium, a bulk or structured photonic crystal or metamaterial, usually solid (although it may also be an enclosure containing a gas, pure vapor or liquid), that is subjected to a magnetic field, with an effective figure of merit measuring the efficiency with which the medium or material/structure rotates the polarization angle. (A minimal numerical sketch of this encoding follows item o. below.)
o. Details of preferred types and options for pixel signal processing logic state encoding stages and devices of this preferred type can be found in the referenced co-pending application; other variations can be found in the prior art or may be developed in the future.
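The sketch below models, in the simplest possible terms, the binary pixel logic state encoding described above: a linear Faraday rotator either leaves the plane of polarization unchanged or rotates it by 90 degrees, and the analyzer transmits according to Malus's law. Both the parallel ("default on") and crossed ("default off") analyzer configurations mentioned in this disclosure appear. The full 90-degree switched rotation and the ideal lossless components are simplifying assumptions for illustration only.

```python
import math

# Minimal model of a binary Faraday-rotation pixel logic state encoder.
# Transmission through the analyzer follows Malus's law:
#   I_out = I_in * cos^2(theta_light - theta_analyzer)
# Ideal, lossless components and a full 90-degree switched rotation are
# simplifying assumptions.

def malus(intensity_in: float, light_angle_deg: float, analyzer_angle_deg: float) -> float:
    delta = math.radians(light_angle_deg - analyzer_angle_deg)
    return intensity_in * math.cos(delta) ** 2

def pixel_output(intensity_in: float, logic_on: bool, analyzer_angle_deg: float) -> float:
    """Polarizer fixes the input at 0 deg; the rotator adds 90 deg when driven."""
    light_angle = 90.0 if logic_on else 0.0
    return malus(intensity_in, light_angle, analyzer_angle_deg)

if __name__ == "__main__":
    I0 = 1.0
    # Parallel polarizer/analyzer: "default on" configuration (driving reduces intensity).
    print("parallel analyzer:", [round(pixel_output(I0, on, 0.0), 3) for on in (False, True)])
    # Crossed polarizer/analyzer: "default off" configuration (driving passes light).
    print("crossed analyzer :", [round(pixel_output(I0, on, 90.0), 3) for on in (False, True)])
```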
p. Other aspects of the preferred and referenced class of hybrid MPC pixel signal processing systems whose details require highlighting include:
q. The hybrid MPC pixel signal processing system implements a memory or "latch" that requires no power until the pixel logic state needs to change. This is achieved by adapting and implementing magnetic "remanence" methods known in the art, in which the magnetic material is fabricated in a bulk process (e.g., the latching LPE thick MO Bi-YIG films commercially available from Integrated Photonics [see our other disclosures]); and/or by implementing the permanent-domain latching periodic 1D grating of Levy et al. [see our other disclosures]; or by a composite magnetic material combining a relatively "harder" magnetic material with an optimized MO material in juxtaposition/mixture, such that the applied field latches a low-force, straight-line-hysteresis-curve material which, as an intermediate, holds the magnetization (latching) of the MO/MPC material. The intermediate material may surround the MO/MPC material, or it may be mixed or structured with it as a periodic structure transparent to the transmission frequency [IR or near IR]. This third, composite method was first proposed by the authors of this disclosure in a 2004 U.S. provisional application; related multi-layer composite coupling methods, in which a relatively higher-coercivity layer latches the magnetization of the magneto-optical crystal layer, were developed later and appear in U.S. patents/applications (Belotelov et al.).
r. Combinations of these methods are also possible design options.
s. The benefits of such "memory pixels" in a hybrid MPC scheme are the same as those of bistable pixel switching in, for example, electrophoretic or "E-Ink" monochrome displays. As a non-volatile memory (relatively so, depending at least on the design of the hysteresis curve and the choice of materials), the image will remain formed as long as an IR or near-IR illumination source is "transmitting" into, and being "processed" by, the pixel signal processing channels and systems. (A minimal behavioral sketch of such a latching pixel follows.)
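The following is a minimal state-machine sketch of the "memory pixel" behavior described above: a short field pulse above the coercive threshold sets the latched magnetization, and hence the encoded pixel logic state, which then persists with no applied power until the next pulse of opposite sign. The coercive threshold and pulse values are illustrative assumptions.

```python
# Minimal behavioral sketch of a latching ("remanent") MO/MPC memory pixel:
# a brief field pulse above the coercive threshold flips and latches the
# magnetization; between pulses the state is held with zero applied power.
# Threshold and pulse values are illustrative assumptions.

class LatchingPixel:
    COERCIVE_FIELD = 1.0  # assumed normalized coercive threshold

    def __init__(self) -> None:
        self.magnetization = -1   # -1 -> "off", +1 -> "on" (latched states)

    def apply_field_pulse(self, field: float) -> None:
        """A short pulse stronger than the coercive field latches a new state."""
        if field >= self.COERCIVE_FIELD:
            self.magnetization = +1
        elif field <= -self.COERCIVE_FIELD:
            self.magnetization = -1
        # sub-threshold pulses leave the latched state unchanged

    @property
    def logic_on(self) -> bool:
        return self.magnetization > 0

if __name__ == "__main__":
    px = LatchingPixel()
    for pulse in (+1.5, 0.0, +0.3, -1.2, 0.0):
        px.apply_field_pulse(pulse)
        print(f"pulse {pulse:+.1f} -> pixel {'ON ' if px.logic_on else 'OFF'} (no holding power)")
```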
t. A second essential aspect and element of the preferred pixel signal processing, pixel logic encoding stage and method is the efficient generation of the magnetic field that switches the magnetic state of the sub-pixels (the term "sub-pixel" is used here in the sense of the common components of a final color pixel in a color system such as RGB; the naming convention is retained broadly for convenience and distinguished where needed). To ensure that there is no magnetic crosstalk, it is preferred that the field-generating structure (e.g., a "coil") be arranged in the path of the pixel transmission axis rather than at the side. This reduces the required field strength, and the flux lines are managed, by (magnetically) impermeable materials in the surrounding material/matrix or by implementation of periodic structures without field-generating means at the edges, as in the domain continuation method of Levy et al., so as to confine the flux lines to the modulation region. Transparent materials may include available materials such as ITO, as well as other new and upcoming conductive materials transparent at the relevant frequencies, and/or materials such as metals that are not necessarily transparent in bulk but may be deposited or formed in the modulation region/sub-pixel transmission path in suitable periodic element sizes, geometries and periodic structures.
u. This method was first proposed by the authors of the present disclosure in a 2004 internal design document of the same company, designated in the 2004 U.S. provisional application, and later disclosed in a U.S. patent application. Subsequently, in the 2010s, researchers at NHK adopted this approach, generally applicable to MO and MPC devices, for Kerr rotators, using ITO in the path of the pixels [reference search].
v. A third essential element of the preferred hybrid MPC pixel signal processing solution for the pixel signal processing subsystem is the method of addressing the sub-pixel array. As previously mentioned, the preferred method can be found in the co-pending U.S. patent applications on wireless addressing and powering of device arrays. For the present application, given the low power requirements, wireless addressing may be sufficient to also power the wireless array (sub-pixel) elements, eliminating the need for a separate wireless powering scheme via low-frequency magnetic resonance; micro-ring resonators may be more efficient than powering through miniature antennas, depending on material selection and design details. However, wireless power to the entire HMD or wearable device is the preferred method of powering the whole unit while reducing the weight and volume of the head-worn portion, especially when combined with local high-power-density dielectric capacitor systems or other capacitive storage technologies capable of being charged through wireless low-frequency transfer. A basic low-frequency magnetic resonance solution is available from WiTricity, Inc.; for more complex systems, reference is made to the U.S. patent application on wireless power relays.
w. Other preferred methods of addressing and powering the array/matrix include voltage-based spin wave addressing, a variation not identified in the referenced application and thus novel to the present proposal, although applicable to the originally referenced hybrid MPC pixel signal processing applications and their other form factors and use cases. Current-based high-speed backplane/active-matrix solutions developed for other display technologies (e.g., OLED) are also an option.
x. Other, less preferred pixel signal processing, pixel logic encoding techniques and methods will also benefit from the wireless addressing and powering methods, as well as from the voltage-based spin wave methods, according to other specific design choices.
y. Although less preferred, other pixel signal processing, pixel logic encoding devices, including modulators based on Mach-Zehnder interferometers, whose efficiency is likewise typically frequency- and material-dependent and greatest in the IR and/or near IR, may also be used, as may any number of other pixel signal logic encoding devices, designed in configurations and/or material systems optimized for the frequencies at which such devices are most efficient, in accordance with the teachings of the referenced application.
The preferred embodiments of the proposed system also necessarily identify a dual sub-pixel array system, with this particular variation and optimized version disclosed herein for the present application, and for other non-HMD and non-wearable display system applications with similar operational requirements or desired benefits, in accordance with the cited [2008] U.S. patent application on telecommunications-structured pixel signal processing methods.
Following the pixel logic state encoding stage of the operating structure and process is an optional pixel signal gain stage. The circumstances in which this option is relevant will be pointed out explicitly in the presentation that follows.
Wavelength/frequency shift stage: in the current particular version of the preferred hybrid MPC pixel signal processing system, there follows a frequency up-conversion stage, a phosphor color system enhanced with the preferred nano-phosphors and/or quantum dots (e.g., QD Vision), although periodically poled device/material systems are also specified as options in the cited disclosure. The basic technologies are commercially available from suppliers including GE, Cree, and various others known in commercial practice.
It will now be apparent to those skilled in the art that what is being done here is to separate, or decouple, the up-conversion process that typically occurs at the illumination stage and to delay it until after several other stages are completed, in order to optimize operation at IR and/or near-IR frequencies, among other reasons.
Thus, the color system is fully implemented by an optimized nano-phosphor/quantum-dot-enhanced phosphor material/structure formulation tuned to the chosen color system, such as an RGB sub-pixel color system. Again, these reconsiderations of the concept and operation of the display system are found, in greater detail, in the published referenced application.
The advantage of using the hybrid MPC pixel signal processing method is the high speed of the local MPC modulation, which has for a considerable period been demonstrated below 10 ns, with sub-ns now the relevant benchmark. The phosphor excitation-emission response is relatively fast, if not as fast; but in both gross and net terms the overall full-color modulation speed is below 15 ns, with theoretical optimization toward lower net durations.
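To put the quoted switching figures in context, the following back-of-the-envelope sketch counts how many full-color modulation events fit into one frame. The sub-10 ns and sub-15 ns figures are those quoted above; the 120 Hz refresh rate and the 256 grey levels assumed for a time-division grey scale are illustrative assumptions, not values from this disclosure.

```python
# Back-of-the-envelope timing budget for the hybrid MPC modulation speeds
# quoted above (<10 ns MO switching, <15 ns overall full-color modulation).
# The 120 Hz refresh rate and 256 grey levels are illustrative assumptions.

FRAME_RATE_HZ = 120
FRAME_TIME_S = 1.0 / FRAME_RATE_HZ

MO_SWITCH_S = 10e-9          # quoted upper bound on local MPC switching
FULL_COLOR_MOD_S = 15e-9     # quoted upper bound on full-color modulation
GREY_LEVELS = 256            # assumed levels for a time-division grey scale

events_per_frame = FRAME_TIME_S / FULL_COLOR_MOD_S
time_for_grey_scale = GREY_LEVELS * FULL_COLOR_MOD_S

print(f"frame time at {FRAME_RATE_HZ} Hz:         {FRAME_TIME_S * 1e3:.3f} ms")
print(f"full-color events per frame:    {events_per_frame:,.0f}")
print(f"{GREY_LEVELS}-step grey scale per pixel:  {time_for_grey_scale * 1e6:.2f} us")
print(f"headroom factor within a frame: {FRAME_TIME_S / time_for_grey_scale:,.0f}x")
```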
A variation of the proposed structure adds a band filter on each IR and/or near-IR sub-pixel channel, which will be switched "on" or "off" and up-converted to R, G or B at the end of the processing sequence. While adding complexity to the filtering elements, this variant may be preferred if: 1) the hybrid MPC stage itself uses bulk custom material compositions that respond more efficiently to different sub-bands in the IR and/or near-IR domain, which is considered unlikely given the nearly 100% transmission efficiency and very low-power polarization rotation of even bulk LPE MO thin films commercially available in this wavelength domain; or, more likely, 2) the efficiency of the different nano-phosphor and/or quantum-dot-enhanced nano-phosphor/phosphor material formulations is sufficiently higher when each final R, G and B sub-pixel composition is fed by its own, more precisely matched IR and/or near-IR band. The design trade-off comes down to a cost/benefit analysis of the added complexity of additional band-filtered layers/structures/deposition channels versus the efficiency gained from frequency/wavelength shifting materials that are more closely "tuned" to different parts of the invisible input illumination spectrum.
After this color processing stage, the groups of sub-pixels realized from the initial IR and/or near-IR illumination sources continue through the merged optical pixel channel. Without adding any other constituent final pixel component, the output pixel will depend on the design choices of modulation-stage and color-stage component size; as may be required, optional pixel expansion, preferably by diffusion means, including those cited and disclosed in the cited applications, may be necessary (the probability of needing pixel dot-size reduction, which requires optical focusing or other methods known in the relevant art and disclosed in certain referenced applications, especially [2008], is much lower).
To achieve a virtual focal plane at a suitable distance from the observer's eye, collimating optics may be employed, including a lenslet array; an array of optical fibers embedded in a textile composite, with the fibers arranged parallel to the optical transmission axis; "flat" or planar inverted-index (negative-index) metamaterial structures; and other optical methods known in the art. Preferably, all such elements are fabricated or implemented as composite layers on the macro optical elements/structures, without the need for additional bulk optical eyepiece elements/structures. Other trade-offs between fiber-based approaches and laminated composite or deposition-fabricated multilayer structures, or combinations/hybrids of these, are addressed in the following section under the structural/mechanical system.
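As a simple numerical illustration of placing the virtual focal plane at a comfortable distance with a collimating lenslet (or an equivalent flat metamaterial element), the thin-lens relation below computes where the virtual image lands when the pixel plane sits just inside the lenslet focal length. All numbers are illustrative assumptions rather than design values from this disclosure.

```python
# Thin-lens sketch: when the pixel/output plane sits just inside the focal
# length of a collimating lenslet, the viewer sees a virtual image at a
# comfortable distance instead of at the physical optic a few cm from the eye.
# Gaussian thin-lens equation: 1/s_o + 1/s_i = 1/f  (virtual image => s_i < 0).
# All numbers below are illustrative assumptions.

def virtual_image_distance_mm(focal_mm: float, object_mm: float) -> float:
    """Return the signed image distance; negative means a virtual image."""
    if object_mm == focal_mm:
        return float("inf")   # exact collimation: image at infinity
    return 1.0 / (1.0 / focal_mm - 1.0 / object_mm)

if __name__ == "__main__":
    f = 20.0  # assumed lenslet focal length, mm
    for s_o in (20.0, 19.8, 19.0, 18.0):   # pixel plane at or slightly inside focus, mm
        s_i = virtual_image_distance_mm(f, s_o)
        where = "infinity" if s_i == float("inf") else f"{abs(s_i) / 1000:.2f} m (virtual)"
        print(f"pixel plane at {s_o:4.1f} mm -> apparent image at {where}")
```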
As previously mentioned, the pixel signal processing, pixel logic array function/optics/structural elements implement the disclosed pixel signal processing, pixel logic structure and operational stages, including the preferred hybrid MO/MPC method and operational structure, not as a bulk device operating on the entire field of the previously filtered incident wavefront, but as a pixelated array (as will be appreciated by those skilled in the art).
Each final pixel may comprise at least two pixel components (beyond the color-system RGB sub-pixels described above): one component is arranged in an array that genuinely generates a video image from scratch, which may be as simple as text and basic graphics, but which for the full purposes of the present system is capable of producing high-resolution imagery from CGI, from relatively remote live or archived digital imagery, or from composites and blends thereof. This is as described previously.
11. Direct real-world illumination and the pixelated array, with detailed specification for the case of visible-frequency direct pass-through (i.e., not down-converted to IR/near IR): the return transmission and processing of real-world, non-generated light from the field of view through the structured and operable optics and photonic structures and stages.
a. Co-located with these IR and/or near IR driven sub-pixel clusters on the addressing array is another set of pixels or other sub-pixel components, which are actually the final pixel channel components originating from the field of view forward of the viewer and wearer of the HMD. These are the "through" fully addressable components of the final pixel.
b. These channels originate from the previous compound optical element/structure, which is illustratively subdivided into pixels.
c. These optical channels transmit wavefront portions with low wavefront loss by employing available efficient splitting methods. A surface lenslet array or a specular funnel array may be used in conjunction with the proposed subdivision method to achieve very close edge-to-edge light capture efficiency, such that the captured wavefront portions are then efficiently coupled to the relative "cores" of the subdivided/pixelated guided optical/array structure. Thus, whether using conventional step index coupling methods, or using an array of MTIR micro holes, or a true photonic crystal structure, or a hybrid of more than one method, the area formed by the pixelated array for the coupling device will receive a wavefront with a minimized percentage of losses.
d. For certain versions and modes of operation of the present system, efficient wavefront capture, routing and guiding/pixelated segmentation requires broadband optics that focus and/or reflect visible as well as IR and/or near-IR frequencies, even though, as can be seen, IR and/or near-IR filters are otherwise suggested as the initial, first optical filtering structures in the optical line-up and sequence.
a. In most configurations, with the IR and near-IR illumination stage spreading illumination across that level, the guiding structure used for the captured "pass-through" illumination is transparent to IR and/or near IR while providing visible-frequency light guiding/path confinement, enabling the IR and/or near IR to be distributed uniformly without interfering with the channelized "pass-through" pixel components.
b. Once the guided incident wavefront portion channels reach the pixel signal processing, pixel state encoding stage, then, whether a single formed bulk MO film, a multi-layer MPC film, another "bulk" film, or a periodically structured grating (or 2D or 3D periodic structure) is present, and if the efficiency of that material or structured material is optimized for the IR and/or near-IR lines, a parallel pixel signal processing, pixel logic state structure will be achieved in exactly the same way, but much less efficiently.
c. However, as broadband MO materials, whether bulk formulations or structured photonic crystal materials, are manufactured in various ways, their efficiency will continue to improve, although it is currently still not comparable to the IR and near-IR efficiency of optimized MO/MPC materials/structured materials. In early work led by the authors of the present disclosure, new MO and MPC materials were modeled and manufactured in 2005 that not only demonstrated for the first time a significantly improved transmission/Faraday-rotation pairing in the green band, but also demonstrated performance in the blue band that was far from negligible, indeed significant, acceptable and competitive for display applications.
However, such materials tend to be more expensive to manufacture, and if different materials are deposited as "thin films," this adds complexity and expense to the manufacturing process for the "generated" pixel component and the pass-through pixel component alike. Such a configuration would, equally, improve the efficiency of pixel logic state encoding under all conditions for the "pass-through" component of the final merged pixel.
d. Without depositing or forming "custom" MO-type materials (this logic also applies to less preferred modulation systems beyond MO/MPC in which maximum efficiency is frequency-dependent), a single formulation means that the intensity of the pass-through final pixel component will simply be lower, to the extent that the modulation device is less efficient at those frequencies.
e. In general, for a pass-through system, it will be assumed that no phosphor-type or other wavelength/frequency shifting device is employed. However, to the extent that the native MO/MPC materials may be less efficient, different formulations of band-optimized materials may be employed in such cases to compensate, to some extent, for material performance deficiencies at the pixel logic state encoding stage.
f. In addition, and as proposed for low-light or night-vision operation, an optional "gain" stage, proposed as an option for some applications in the referenced applications (U.S. patent application "pixel signal processing" and U.S. patent application "hybrid MPC pixel signal processing"), in which an energized gain material is pumped optically, electrically, acoustically, mechanically or magnetically to achieve energy gain in the gain medium, as detailed in the referenced applications and by other methods known in the art or designed in the future, enhances the intensity of the transmitted "pass-through" component of the final pixel as it traverses the gain medium. If this design option is selected, it is preferred that it not be a variable, addressable stage, but rather an overall gain-increase setting.
g. Furthermore, once the guided incoming wavefront portion channels reach the pixel signal processing, pixel state encoding stage, as indicated, there is a configuration of the entire pixel signal processing and optical channel management system that is optional but valuable for low-light and night-vision applications.
h. In this variant, the IR filter is removable, the goal being to pass IR and/or near-IR light from the incident real-world wavefront to the active modulation array sequence, so that the incoming "real" IR passes through the pixel signal processing modulator and directly generates a similar-color (monochrome or pseudo-color IR-gradient) image for the viewer, to the extent that IR is present in the field of view, without the intermediary of a sensor array.
i. Likewise, as indicated, a gain stage may be implemented to enhance the intensity of the pass-through IR (plus near IR, if beneficial) delivered to the wavelength/frequency shift stage.
j. In addition, the base IR and/or near-IR background illumination may be turned on, as in the normal full-color mode of operation, with its intensity modulated to set an appropriate reference level, to the extent that the input IR radiation alone does not reach the threshold needed to activate the wavelength/frequency shifting stage and medium.
k. Removal/deactivation of the IR filtering means can be implemented mechanically, if the passive optical element is deployed in an articulated or cantilevered hinged apparatus capable of being "flipped" out of the optical path; or as an active component, such as an electrophoretically activated bulk encapsulation layer in which a plurality of relatively flat filtering micro-elements are electrostatically (mechanically) rotated (as proposed herein) so that they present a minimal cross-section to the incident light and, once rotated, no longer filter the IR. Other passive or active activation/removal methods may be employed.
l. Both the IR and the polarization filters can be removed for low-light or night-vision operation, depending on whether the generating system is being used "actively," not merely to provide the threshold level, to superimpose data on some portion of the incident real IR wavefront portions in the pixelated array. If it is used actively, then to maximize the efficiency of the generating source, the preferred digital pixel signal processing system requires the initial polarization filter in order to implement the optical switch/modulator that encodes the pixel logic state in the signal.
m. A disadvantage of the pass-through system is that it reduces the intensity of the incoming IR and/or near IR.
n. An alternative embodiment of the present system, which aims to address this problem, sets a gain stage prior to the pixel signal processing, pixel logic state encoding stage to enhance the incoming signal.
o. The efficiency of a gain medium driven by incoherent, non-collimated "natural" light must be taken into account in the design parameters of this and any system that employs an energized gain medium with "natural" incident light input.
p. In a second alternative, a three-component system is implemented that includes component sub-channels for the generating devices, an incident visible-light component, and an incident IR component that is not polarization filtered. To realize this variant, a pixelated polarization filter array that leaves this third sub-channel/component without a polarization filter element must be implemented.
q. for the more basic integrated two-component alternative system type with this low-light night vision mode of operation requirement, additional optical elements are required at the initial incoming wavefront input and channelization/pixelation stage.
r. While the incoming IR (and near IR, if desired) may be divided between a sub-channel directed to the normal "generating" source component of the final viewable pixel and a pass-through channel that directs the entire visible portion of the incident incoming wavefront to its source component of the final viewable pixel, there is no particular efficiency gain in transmitting any IR and/or near IR to the visible sub-channel and its source component of the final viewable pixel.
s. Rather, a frequency splitter follows in sequence after, or is integrated with, the lenslet or alternative optical capture means that maximizes capture of the incoming real wavefront. One approach is to implement opposing filters: one band filter rejecting visible light (passing only IR and/or near-IR light), and an adjacent filter rejecting IR and/or near-IR light. Various geometric arrangements of such opposing filters provide different advantages, including both being planar, or both being disposed at a relative 45-degree angle from the central focal point of the incident-wavefront optical capture structure, so that the focused composite visible/IR/near-IR beam (from lenslets or other optical elements or devices, including inverted-index metamaterial "flat" lenses) first has one band range separated out while the rest is reflected to the opposing filter surface, for the portion of the focused beam that first strikes the filter structure away from the central focal point, and vice versa. Grating structures are the preferred method of implementing such a dual filter-splitter arrangement, but other methods are also known in the art, implementing two filtering surfaces in successive stages based on bulk material formulations that can be deposited by various methods known or to be developed in the art. (Note that UV is filtered before this stage, though preferably after the IR.)
12. Combination of the pass-through and generated/artificial pixel/sub-pixel arrays:
As already noted, the two sets of component optical channels are co-located and preferably output together to a pixel tuning device (diffusion and/or other blending methods, or other methods known in the art or designed in the future), such that the generated source is combined with the pass-through source, just as the RGB sub-pixels of a conventional color-vision artificial additive-color display system are combined, to form the final composite pixel. Then, as already indicated and as described in detail in the cited applications, further pixel beam shaping, and in particular collimation and additional optical direction, forms the image at a virtual focal plane that is most efficient and easiest for the HVS; given the ergonomic design goal of optics close to the face, this is also part of the object of the present disclosure.
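The following is a minimal data-flow sketch of how a final composite pixel could be assembled from its two co-located components as described above: a generated RGB triplet (after up-conversion) and a variable-intensity pass-through sample whose contribution is scaled by its pixel logic state. The simple additive mixing and the clamping to a normalized display range are simplifying assumptions for illustration, not the blending method mandated by this disclosure.

```python
# Schematic composition of one final pixel from its two co-located components:
# a "generated" RGB sub-pixel triplet (post up-conversion) and a "pass-through"
# real-world sample whose intensity is scaled by its pixel logic state encoder.
# Simple additive mixing and [0, 1] clamping are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class GeneratedComponent:
    r: float
    g: float
    b: float

@dataclass
class PassThroughComponent:
    r: float
    g: float
    b: float
    transmission: float   # 0.0 (logic "off") .. 1.0 (logic fully "on")

def compose_pixel(gen: GeneratedComponent, thru: PassThroughComponent) -> tuple:
    mix = (gen.r + thru.r * thru.transmission,
           gen.g + thru.g * thru.transmission,
           gen.b + thru.b * thru.transmission)
    return tuple(min(1.0, max(0.0, c)) for c in mix)

if __name__ == "__main__":
    sky = PassThroughComponent(0.4, 0.6, 0.9, transmission=0.5)   # dimmed real-world sample
    label = GeneratedComponent(0.3, 0.3, 0.0)                      # synthetic overlay element
    print("composite pixel:", compose_pixel(label, sky))
```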
a. Operation of the basic integrated two-component system, with a "generate" component (itself composed of RGB sub-pixels) and a variable "pass-through" component — firstly, in its primary mode of operation, and secondly, configured as an optional low-light night vision mode:
Outdoors on a bright, sunny day, the wearer of the proposed HMD, in integrated binocular form (two separate lens form-factor device structures) or connected goggles, is presented with an image formed by an integrated pixel array, itself formed by the integration of two input components: high-performance generated pixels and a pass-through, variable-intensity wavefront section facing the viewer's "window on the world":
b. The composite color component of the final integrated pixel is formed by the "generated" pixel component, which begins with invisible IR and/or near-IR internally injected "back-lighting" that is switched on or off in less than 10 ns (currently less than 1 ns) for each sub-pixel. The IR and/or near-IR sub-pixels then activate the composite phosphor material/structure, employing the best currently available materials and systems to produce the widest possible color gamut.
c. Once the state of the sub-pixel is set, with this very short pulse, the "memory" switch maintains its on state until its state changes, without applying constant power to the switch.
d. Thus, the generation component is a high frame rate, high dynamic range, low power, wide color gamut pixel conversion technique.
e. The second component of the composite pixel is the pass-through component, which begins as an effectively high percentage of the subdivided portions of the entire wavefront impinging on the front optical surface of the present HMD, entering from the direction the wearer is facing. These wavefront portions are UV and IR filtered in normal mode, and polarization sorted or filtered (the choice depending on the chosen design strategy: a reduced real-world lighting baseline or a maximized one). With the reduced baseline, i.e., polarization filtering, this results in a substantial reduction in the overall brightness of the visible field (approximately 1/3 to 1/2, depending on the composition of the incident polarization modes and the efficiency of the polarizer).
f. The reduction in pass-through intensity makes it easier for the generating system to "compete" with, and match or exceed, the illumination level of the incoming wavefront portions, especially in bright daylight but in general under all lighting conditions except very low light to no light. This is achieved by a passive optical device, realized by a system component that performs a dual task and yields a dual benefit: it is an essential component of the preferred modulation system (based on polarization modulation) for encoding the logic states of the pass-through pixels, and it also reduces power requirements and simplifies the process of calibrating, coordinating and compositing the values of the generating system with those of the pass-through system.
g. This system design feature takes advantage of the fact that, for most people, outdoor bright lighting conditions are already managed through the use of polarized sunglasses. It is known that an excessively bright ambient or transmissive display can cause eye fatigue; reducing the overall pass-through lighting level therefore leaves the simpler problem of raising the generated lighting level relatively little, rather than having to create yet another "competing light environment" in the field of view. The combination of reduced natural pass-through illumination (which can optionally be boosted by the optional gain stage, although less efficiently than using LEDs, and certainly less efficiently than using lasers) with a generating system that can add graphical or synthetic elements to selected parts of the scene results in a more harmonious, lower-intensity baseline than would otherwise be possible. (The generating system, that part of the integrated array, does not necessarily produce the entire FOV in AR mode, although it may in full VR mode.)
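As a rough luminance bookkeeping sketch of the point made in items e. through g.: polarization filtering removes roughly a third to a half of the pass-through illumination, so the generated component only has to reach the reduced pass-through level rather than full outdoor brightness. The outdoor luminance figure and the matching margin below are illustrative assumptions; the 1/3-to-1/2 reduction range is the one stated above.

```python
# Rough luminance bookkeeping for matching generated imagery against the
# pass-through background.  Outdoor luminance and margin are assumptions;
# the 1/3-to-1/2 polarization-filtering reduction is the range stated above.

OUTDOOR_LUMINANCE_NITS = 8000.0            # assumed bright-day scene luminance
REDUCTION_RANGE = (1.0 / 3.0, 1.0 / 2.0)   # fraction removed by polarization filtering
MATCH_MARGIN = 1.2                         # assumed margin for generated elements to read clearly

for removed in REDUCTION_RANGE:
    through = OUTDOOR_LUMINANCE_NITS * (1.0 - removed)
    needed = through * MATCH_MARGIN
    print(f"removed {removed:.0%}: through background ~{through:,.0f} nits, "
          f"generated component target ~{needed:,.0f} nits")
```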
h. Provided that the coordination and compositing of synthetic and real elements is computed in the user's perspective, an aspect to be addressed next under the sensing and computing system, the mix of generated and pass-through sources can easily and quickly produce a blended, mobile AR/mixed-reality view without visible lag or perceptible delay at the display stage.
i. Using pass-through pixel component sub-channels designed with a default "off" scheme (i.e., the preferred polarization-modulation form with polarizer and analyzer "crossed" rather than parallel) and not transmitting the pass-through wavefront portions, a mobile HMD can operate in a mobile VR mode while still allowing calibration against the real-world view and motion tracking. It can be seen that, in conjunction with the proposed sensors and associated processing system, the HMD can act as an "indirect view display" in which the direct pass-through is turned off.
j. Conversely, with the generating system turned off (especially if the cost and complexity of visible-frequency-optimized MO/MPC material structures is accepted), a variable pass-through system can also be implemented without generation/augmentation channels adding pixel illumination/image-primitive information.
In the reverse of the "indirect view display" configuration, as will be seen in the detailed description of the proposed sensors and associated processing system, the variable transmission means of the pass-through system can be enhanced into a direct-view system if another variant of the present system is employed, with the "pass-through" channel subdivided by filters (following the pattern of the IR/near-IR and visible spectrum filter-splitters) into RGB sub-pixel channels, each with its own pixel signal logic state encoding modulator. Its disadvantages are the dynamic range, with no generating means to supplement it, and the correspondingly lower low-light limit; furthermore, this variant (simply eliminating the generating mode or structure) would forgo the benefit of a dual array addressable by a parallel processing system, which eases the bottleneck of performing scene-integration compositing and perspective computation. In addition, such a system, based on differently tuned, visible-spectrum-optimized MO/MPC materials/structures, would be more expensive and less efficient to implement than an IR/near-IR based generating system.
k. An optimized system combines an efficient generation component and a pass-through component of variable intensity but generally lower brightness.
The preferred wireless addressing and powering further reduces the power, heat, weight and volume of the functional device portion of the intelligent structural system.
In the very-low-light or night-vision mode, for systems in which the IR filter can be removed or switched off, the IR (and near IR, if needed) passes through the pixel state system without loss; an optional gain stage increases the IR signal intensity, and/or the internally injected IR/near-IR illumination component raises the threshold/base intensity onto which the incoming pixelated IR intensity is added/superimposed; the IR/near IR then passes through the wavelength/frequency shifting device (preferably a phosphor system), and a direct-view low-light or night-vision system is achieved, whether the system is set to monochrome or pseudo-color. With proper use of the polarization filter, the generating system can operate and add graphics and full imagery, compensate for the reduced intensity of the incoming IR using signals from the auxiliary sensor system (see below), or simply add a reference level, as proposed in other configurations, to ensure that the energy input to the wavelength/frequency shift stage is sufficient to produce adequate output.
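The low-light/night-vision signal path just described can be summarized, per pixel, as an optional gain applied to the incoming IR, an optional injected IR base level added on top, and a wavelength/frequency shift to the visible only once the combined IR drive clears the shifting medium's activation threshold. The threshold, gain, base-level and conversion-efficiency values in the sketch are illustrative assumptions.

```python
# Per-pixel sketch of the optional low-light / night-vision path described
# above: incoming real-world IR (IR filter removed), optional gain stage,
# optional injected IR base level, then up-conversion to a visible output only
# when the combined drive clears the phosphor/QD activation threshold.
# Threshold, gain, base-level and efficiency values are illustrative assumptions.

ACTIVATION_THRESHOLD = 0.10   # assumed minimum IR drive for the shifting medium
SHIFT_EFFICIENCY = 0.6        # assumed IR -> visible conversion efficiency

def night_vision_pixel(incoming_ir: float,
                       gain: float = 1.0,
                       injected_base: float = 0.0) -> float:
    """Return the visible output intensity for one pass-through pixel."""
    drive = incoming_ir * gain + injected_base
    if drive < ACTIVATION_THRESHOLD:
        return 0.0                        # below threshold: nothing re-emitted
    return drive * SHIFT_EFFICIENCY       # monochrome / pseudo-color visible output

if __name__ == "__main__":
    weak_ir = 0.05
    print("no gain, no base :", night_vision_pixel(weak_ir))
    print("gain stage x3    :", night_vision_pixel(weak_ir, gain=3.0))
    print("base level added :", night_vision_pixel(weak_ir, injected_base=0.08))
```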
Sensor system for mobile AR and VR:
Following the general thrust of this proposal, the structure that displays the image does not do so without the sensor system, which, consistent with the various cases of the cited disclosures, optimizes and reconciles the synthetic imagery with the overall ambient lighting conditions (and, in some cases, with the external lighting conditions that pass-through operation may make desirable or necessary to account for, for efficiency reasons); nor does it do so without taking into account the user's position, viewing direction and general motion tracking.
1. In a preferred version of the system, at least some of the equipment components serve a dual function as structural elements; but in those cases where sensing cannot, to any appreciable extent, be combined with elements serving other functional purposes, the devices are nonetheless specified as part of an integrated overall system.
2. The system of the present disclosure integrally implements, in optimal form, motion tracking sensors known in the art, including accelerometers, digital gyroscopic sensors, and optical tracking and other systems in the form of small monolithic macro camera systems. More precisely, multiple distributed sensor arrays are the preferred implementation, realizing the benefits of distributed, native and local processing, as well as the additional specific benefits of image/photogrammetry-based methods: capturing "global" lighting conditions in real time and extracting geometry data in real time, enabling local updates of stored position/geodetic/topographic data to accelerate the calibration of synthetic image elements and their efficient perspective rendering, integration and combination into the mixed-view scene.
3. As disclosed in the referenced application, and briefly expanded upon here, among the "image-based" and photogrammetric methods whose real-time information-gathering value is used and proven are light-field methods, such as the commercially available Lytro system, which, from a multi-sampled space (optimally, a distributed sensor array), can image-sample the space in real time and then, after sufficient initial data has been input/captured, generate a view-transformable 3D space. A virtual camera can then be positioned in real time, at a given resolution, at varying locations in the 3D space extracted from the photogrammetric data.
4. Other image-based methods can be used with the Lytro light-field method, in combination with additional local geometry/terrain data, to achieve calibrated perspective image synthesis, including occlusion and opacity (using the integrated dual generated and pass-through components of the preferred proposed display subsystem). Such an approach provides real-time sampling of the entire FOV to obtain shadows/lighting, or even simple graphical/text elements with lighting parameters matched to the CGI, and real-time updating of the navigated real-world 3D terrain space, rather than simply performing individual calculations on disconnected, unrelated pixel points from files, GPS and traditional motion sensors. General corrections can be applied to illumination and relative position/geometry by means of parameter sampling, significantly reducing the computational burden.
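As a minimal illustration of "positioning a virtual camera at varying locations in the 3D space extracted from photogrammetric data," the sketch below reprojects a handful of reconstructed 3D points through a simple pinhole model at an arbitrary virtual viewpoint. The point set, camera poses and focal length are all illustrative assumptions, and a real light-field pipeline would resample captured radiance rather than bare points; this only shows the geometric view-transformation step.

```python
import numpy as np

# Minimal pinhole reprojection of photogrammetrically reconstructed 3D points
# from an arbitrary "virtual camera" pose, illustrating the view-synthesis step
# described above.  Points, poses, and focal length are illustrative assumptions.

def look_at_rotation(eye: np.ndarray, target: np.ndarray) -> np.ndarray:
    """Rows are the camera's right, down, and forward axes in world coordinates."""
    forward = target - eye
    forward = forward / np.linalg.norm(forward)
    world_up = np.array([0.0, 1.0, 0.0])
    right = np.cross(forward, world_up)
    right = right / np.linalg.norm(right)
    down = np.cross(forward, right)
    return np.stack([right, down, forward])

def project(points_world: np.ndarray, eye: np.ndarray, target: np.ndarray,
            focal_px: float = 800.0) -> np.ndarray:
    R = look_at_rotation(eye, target)
    cam = (points_world - eye) @ R.T             # world -> camera coordinates
    return focal_px * cam[:, :2] / cam[:, 2:3]   # perspective divide -> image plane (px)

if __name__ == "__main__":
    # A few reconstructed scene points (assumed output of the photogrammetry stage).
    points = np.array([[0.0, 0.0, 5.0], [1.0, 0.5, 6.0], [-1.5, -0.2, 4.0]])
    for eye in (np.array([0.0, 0.0, 0.0]), np.array([0.5, 0.2, 0.0])):
        uv = project(points, eye, target=np.array([0.0, 0.0, 5.0]))
        print(f"virtual camera at {eye}:\n{np.round(uv, 1)}")
```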
5. This is combined with "absolute" positioning of the user by means of GPS and other mobile-network signal-triangulation methods, motion-sensor tracking from the HMD and any haptic interface, and image-based mapping of the user's body, including from field-updated image-based photogrammetry systems, which then employ a plurality of small sensors and cameras and rely on relative position and topographic parameters obtained from fast, real-time image-based methods.
6. In this regard, the Bayindir/Fink "optical fabric" camera developed at MIT is an example of a specific physical method validating the implementation of a distributed array. Whether following the fiber-optic device and smart-textile composite approach proposed by the inventors of the present disclosure, the simpler MIT fiber-optic device fabrication method and optical-fabric implementation, or another fiber-optic device smart/active/photonic textile approach, a distributed textile composite camera array disposed in the structure of the HMD mechanical frame, performing a dual task by also contributing to the structural system solution (described below) rather than riding as a non-participating load on the system, is a preferred version of an advantageous multi-device array system providing parallel, distributed data capture.
7. A multi-point micro sensor array, which may include a plurality of micro camera optical sensor array devices, is another preferred embodiment of the multi-view system.
8. A more basic integrated commercial Lytro system, combined with some of the various other camera/sensor combinations in a small array, is a less preferred but still superior combination, allowing for a variety of image-based approaches.
9. The auxiliary IR sensors, again preferably arranged in a plurality of lower-resolution device arrays, can, as already noted, provide an overriding low-light/night-vision feed to the display system, or provide correction and supplemental data so that the generating system works in harmony and coordination with the pass-through real IR.
10. Lytro-type light-field systems based on the same arrangement can be used, in a generally parallel fashion, with sensors in frequency bands other than the visible spectrum, not only for low-light/night vision but also for field analysis in other applications and use cases (such as UV or microwave), depending on the application. Given the resolution limitations at longer wavelengths, it is still possible to generate spatial reconstructions from non-visible bands, or from non-visible bands supplemented by GPS/LIDAR reference data, and to obtain other dimensional data-collection correlations when performing sensor scans of complex environments. As miniaturization progresses, compact mass spectrometers, now implemented in smaller and smaller form factors, can also be considered for integration into the HMD.
11. Finally, within the image-based approach, which is advantageous for fast data sampling of lighting parameters and of what they tell us about the materials, geometry and atmospheric conditions of the local environment, one or more miniature "light probes" may be used: reflective spheres whose surfaces can be imaged to extract a compact global reflection map. Located, for example, at key vertices of the HMD (left and right corners, or a single center position), and paired with multiple imagers to capture the entire reflective surface (optionally also using partially hemispherical concave reflective "wells," alone or preferably in combination with the spheres, held in place by a magnetic field, mounted on a rigid axis, or otherwise generally hidden), they allow lighting data to be extracted from a compact, compressed reflective surface. This provides a highly accelerated approach to the parameterization of spatial lighting, materials and geometry, not only accelerating the fast graphical integration of live and generated CGI/digital imagery (shadows, lighting, perspective, including occlusion, etc.) in combination with the other related photogrammetric methods, but also enabling fast analysis of possible risk factors in sensitive operations in complex, rapidly changing environments.
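The following is a compact sketch of the light-probe idea: for a mirrored sphere viewed (approximately orthographically) by an imager, each image pixel corresponds to a surface normal on the sphere, and the reflection of the view direction about that normal gives the world direction whose illumination that pixel records; assembling these per-pixel directions yields the global reflection/environment map. The orthographic viewing assumption and the example coordinates are simplifications.

```python
import math

# Sketch of extracting environment-sampling directions from a mirrored-sphere
# "light probe" image, as described above.  Assumes an orthographic view along
# -Z of a unit sphere; each in-circle pixel (x, y) has surface normal
# N = (x, y, sqrt(1 - x^2 - y^2)) and records light arriving from the
# reflection of the view direction V about N:  R = 2 (N . V) N - V.

def probe_direction(x: float, y: float):
    """Map normalized probe-image coordinates (|x|, |y| <= 1) to a world direction."""
    rr = x * x + y * y
    if rr > 1.0:
        return None                       # outside the sphere silhouette
    nz = math.sqrt(1.0 - rr)
    n = (x, y, nz)                        # sphere surface normal at this pixel
    v = (0.0, 0.0, 1.0)                   # direction from surface toward the camera
    ndotv = n[2]                          # N . V with V = +Z
    return tuple(2.0 * ndotv * ni - vi for ni, vi in zip(n, v))

if __name__ == "__main__":
    for x, y in ((0.0, 0.0), (0.7, 0.0), (0.0, 0.99)):
        print((x, y), "->", tuple(round(c, 3) for c in probe_direction(x, y)))
```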
Mechanical and substrate system:
From the foregoing, it should be apparent that the image display subsystem and image-based distributed sensing and auxiliary imaging system that have been proposed, with attention to the preferred embodiments, have provided substantial benefit and value to the structural and mechanical and ergonomic goals of the present disclosure.
1. Structure-function integration: one preferred embodiment, beneficial in weight, volume, size, balance, ergonomics and cost, is the implementation of a combination of a tensioned-film textile composite structure and a flexible optical structural substrate. Particularly preferred is an HMD frame formed of Corning Willow glass, folded (and preferably sealed) around all the processing and functional electronics that must be integrated into the HMD, fabricated on the folded glass frame; in a less preferred version not using wireless power, this can include the power supply. To protect the glass and the wearer, and for comfort and ergonomics, a protective coating or wrap is applied or otherwise added to the functional optical structural member, such as a shock-responsive material like D30, which is soft and elastic in the absence of shock but stiffens on impact, providing a protective barrier for the less durable (albeit fairly durable) Willow glass structural/functional system. The inner surface of the folded Willow glass is the location of the system-on-glass electronics, shaped as a cylinder or semi-cylinder to increase strength and better protect the electronics from impact, thereby also enabling thinner substrates.
Fiber-optic data and illumination are carried from the illumination, power (preferably wireless) and data processing unit, located in a pocket or on a smart-textile composite wearable article integrated onto the user's body, via a flexible, textile-wrapped and protected cable (preferably with D30 as an external composite layer, or another shock-resistant composite component), so as to even out, distribute and balance the weight.
2. Once the fiber (data, light and optionally power) cable is integrated with the composite Willow glass frame, the fibers are bonded (rather than the more expensive and unnecessary thermal fusion) as a composite material to the data entry points of the E-O data transmission and to the illumination insertion points on the display surface.
3. In this version, the framing structural elements are likewise Willow glass or a Willow-glass-type material system, with optional additional composite elements; but instead of solid glass or polymer lenses (binocular pairs or continuous goggles) forming the optical form-factor elements, these are thin-film composite layers, pre-formed in lens-like shapes to help establish the desired surface geometry; compression ribs may also be used to achieve the appropriate curvature.
4. Since the sequence of functional optical elements comprises light guide/confinement channels after the initial filters, and at its most complex stages, a preferred option for both the proposed structure and the substrate system is to implement the optical channel elements, such as optical fibers, as part of an aerogel tensioned-membrane matrix. Alternatively, a rigid hollow shell can be used, with solid (or semi-flexible) optical channels for the IR pass-through, the IR generated channels and the visible pass-through channels, and with aerogel infiltrated into the hollow and intervening spaces, including positive-pressure aerogel, to achieve an extremely strong, low-density, lightweight reinforced structural system. Aerogel filament composites have been developed commercially, and such composite aerogel systems continue to progress, providing a wide selection of silica and other aerogel materials now produced in low-cost manufacturing processes (Cabot, Aspen Aerogels, etc.).
5. Other alternatives, usable alone or in hybrid with Willow glass, are graphene-CNT (carbon nanotube) functional structural systems, used alone or, preferably, composited with aerogels.
6. As graphene and its functional electronic and photonic features develop further, graphene layers or multilayers formed on thin Willow glass substrates, or in sandwich systems with aerogels, would be a preferred structural implementation: a mixture of graphene and CNTs for electronic interconnects; optical fibers and planar waveguides on glass for optical interconnects, in combination with other SOG (system-on-glass) system elements; and more heterogeneous material systems beyond SOG (as would be the case for heterogeneous CMOS+ systems, and later "pure" CMOS).
7. More recently, graphene, CNT, and preferably graphene-CNT combinations as compressive elements, alone or in combination with rolled Willow glass and optional aerogel-cell interlayers, provide a preferred lightweight integrated structural system with superior substrate quality. Thus, semi-flexible Willow glass (or similar glass products as might be developed by Asahi, Schott, etc., though currently less preferred), polymers or polymer-glass blends can also serve as the deposition substrate for on-board processors, sensor deployments and the dense pixel signal processing array layers.
Other mobile or semi-wearable form factors (such as tablet computers) may also implement many mobile AR and VR solutions that find full application in the preferred HMD form factor.
Although specific embodiments have been disclosed herein, they should not be construed as limiting the application and scope of the proposed novel image display and projection approach, which is based on decomposing pixel modulation into its required operations and stages and optimizing each individually.
The above-described systems and methods have been generally described to aid in understanding the details of preferred embodiments of the invention. In the description herein, numerous specific details are provided, such as examples of components and/or methods, to provide a thorough understanding of embodiments of the invention. Some of the features and benefits of the present invention are realized in this manner, and are not required in every instance. One skilled in the relevant art will recognize, however, that an embodiment of the invention can be practiced without one or more of the specific details, or with other apparatus, systems, assemblies, methods, components, materials, parts, and/or the like. In other instances, well-known structures, materials, or operations are not shown or described in detail to avoid obscuring aspects of embodiments of the invention.
Reference in the specification to "one embodiment," "an embodiment," or "a particular embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the invention, and not necessarily in all embodiments. Thus, the appearances of the phrases "in one embodiment," "in an embodiment," or "in a particular embodiment" in various places throughout this specification are not necessarily referring to the same embodiment. Furthermore, the particular features, structures, or characteristics of any specific embodiment of the present invention may be combined in any suitable manner with one or more other embodiments. It is to be understood that other variations and modifications of the embodiments of the present invention described and illustrated herein are possible in light of the teachings herein and are to be considered as part of the spirit and scope of the present invention.
It will also be appreciated that one or more of the elements depicted in the drawings/figures can also be implemented in a more separated or integrated manner, or even removed or rendered as inoperable in certain cases, as is useful in accordance with a particular application.
Additionally, any signal arrows in the drawings/figures should be considered only as exemplary, and not limiting, unless otherwise specifically noted. Further, as used herein, the term "or" is generally intended to mean "and/or" unless otherwise indicated. Combinations of components or steps will also be considered as noted where terminology is foreseen as rendering the ability to separate or combine unclear.
As used in the description herein and the appended claims, "a", "an", and "the" include plural references unless the context clearly dictates otherwise. Furthermore, as used in the description herein and the appended claims, the meaning of "in" includes "in" and "on" unless the context clearly dictates otherwise.
The foregoing description of illustrated embodiments of the present invention, including what is described in the abstract, is not intended to be exhaustive or to limit the invention to the precise forms disclosed herein. While specific embodiments of, and examples for, the invention are described herein for illustrative purposes only, various equivalent modifications are possible within the spirit and scope of the present invention, as those skilled in the relevant art will recognize and appreciate. As indicated, these modifications may be made to the present invention in light of the foregoing description of illustrated embodiments of the present invention and are to be included within the spirit and scope of the present invention.
Thus, while the invention has been described herein with reference to specific embodiments thereof, a latitude of modification, various changes, and substitutions are intended in the foregoing disclosure, and it will be understood that in some instances some features of embodiments of the invention will be employed without a corresponding use of other features, without departing from the scope and spirit of the invention as set forth. Many modifications may be made to adapt a particular situation or material to the essential scope and spirit of the present invention. It is intended that the invention not be limited to the particular terms used in the following claims and/or to the particular embodiment disclosed as the best mode contemplated for carrying out this invention, but that the invention will include any and all embodiments and equivalents falling within the scope of the appended claims. The scope of the invention is, therefore, indicated by the appended claims.

Claims (6)

1. A photonic system for visualizing an operating world, the operating world including a synthetic world in a virtual reality mode, comprising:
an enhancer for generating a set of channelized synthetic-world image component signals from the synthetic world, each signal of the set of channelized synthetic-world image component signals having a set of enhancer-required properties, wherein the enhancer includes the set of channelized synthetic-world image component signals in an output set of channelized enhancer image component signals;
a visualizer, coupled to the enhancer, that processes the output set of channelized enhancer image component signals, applying a frequency/wavelength modulation or frequency/wavelength conversion to the enhancer-required properties of each channelized enhancer image component signal to generate an output set of channelized visualizer image component signals, each signal of the output set of channelized visualizer image component signals having a set of visualizer-required properties; and
an output constructor, coupled to the visualizer, that generates a set of display image primitives from the output set of channelized visualizer image component signals.
2. The photonic system of claim 1, wherein each said set of enhancer-required properties comprises frequency/wavelength properties of each said channelized synthetic-world image component signal, wherein the frequency/wavelength properties of the enhancer-required properties lie in a portion of the electromagnetic spectrum that is not visible to a reference human visual system, and wherein the frequency/wavelength modulation or the frequency/wavelength conversion generates the set of visualizer-required properties, the frequency/wavelength properties of which lie in a portion of the electromagnetic spectrum that is visible to the reference human visual system.
3. The photonic system of claim 1, wherein the operating world further comprises a real world in an augmented reality mode, further comprising:
a real-world interface for generating a set of channelized real-world image component signals from the real world, each signal of the set of channelized real-world image component signals having a set of real-world required properties; and
wherein the enhancer receives the set of channelized real-world image component signals and selectively includes the set of channelized real-world image component signals in the output set of channelized enhancer image component signals.
4. The photonic system of claim 3, wherein each said set of real-world required properties comprises frequency/wavelength properties of each said channelized real-world image component signal, wherein each said set of enhancer-required properties comprises frequency/wavelength properties of each said channelized synthetic-world image component signal, wherein the frequency/wavelength properties of the enhancer-required properties lie in a portion of the electromagnetic spectrum that is not visible to a reference human visual system, and wherein the frequency/wavelength modulation or the frequency/wavelength conversion generates the set of visualizer-required properties, which lie in a portion of the electromagnetic spectrum that is visible to the reference human visual system.
5. The photonic system of claim 4, wherein the real-world interface converts a set of complex composite electromagnetic wave arrays of the real world into the set of channelized real-world image component signals, wherein the set of complex composite electromagnetic wave arrays comprises wavefronts having frequencies/wavelengths in the visible portion of the electromagnetic spectrum and in the non-visible portion of the electromagnetic spectrum, and wherein the real-world interface includes an input structure that suppresses the wavefronts having frequencies/wavelengths in the visible portion of the electromagnetic spectrum from contributing to the set of channelized real-world image component signals.
6. The photonic system of claim 4, wherein the real-world interface converts a set of complex composite electromagnetic wave arrays of the real world into the set of channelized real-world image component signals, wherein the set of complex composite electromagnetic wave arrays comprises wavefronts having frequencies/wavelengths in the visible portion of the electromagnetic spectrum and in the non-visible portion of the electromagnetic spectrum, wherein the real-world interface includes an input structure that suppresses the wavefronts having frequencies/wavelengths in the non-visible portion of the electromagnetic spectrum from contributing to the set of channelized real-world image component signals, and wherein the real-world interface converts and maps wavefronts in the visible portion of the electromagnetic spectrum to signals in the non-visible portion of the electromagnetic spectrum.
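As an aid to tracing the signal flow recited in claims 1-6, the following minimal Python sketch gives one possible, non-authoritative reading of the claimed pipeline: a real-world interface that suppresses non-visible wavefronts and false-color maps the remaining visible wavefronts onto non-visible (IR) carrier channels (claim 6), an enhancer that selectively interleaves those channels with IR synthetic-world channels (claims 1 and 3), a visualizer that converts the interleaved channels back into the visible band (claim 2), and an output constructor that emits display image primitives (claim 1). All names, the fixed 500 nm wavelength shift, and the data structures are hypothetical illustrations and form no part of the claims or the specification.

from dataclasses import dataclass
from typing import List, Tuple

VISIBLE_NM = (380.0, 700.0)  # assumed visible band for a reference human visual system

@dataclass
class ChannelSignal:
    channel_id: int        # display channel this signal drives
    wavelength_nm: float   # carrier wavelength of the channelized signal
    amplitude: float       # per-channel amplitude value

def real_world_interface(wavefronts: List[ChannelSignal]) -> List[ChannelSignal]:
    # Claim 6 reading: keep only visible wavefronts (suppressing non-visible ones)
    # and false-color map each onto a non-visible (near-IR) carrier channel.
    visible = [w for w in wavefronts
               if VISIBLE_NM[0] <= w.wavelength_nm <= VISIBLE_NM[1]]
    return [ChannelSignal(w.channel_id, w.wavelength_nm + 500.0, w.amplitude)
            for w in visible]

def enhancer(real_ir: List[ChannelSignal],
             synthetic_ir: List[ChannelSignal]) -> List[ChannelSignal]:
    # Claims 1 and 3 reading: synthetic-world signals are generated directly in IR
    # and selectively interleaved with the channelized real-world signals.
    return sorted(real_ir + synthetic_ir, key=lambda s: s.channel_id)

def visualizer(enhanced: List[ChannelSignal]) -> List[ChannelSignal]:
    # Claim 2 reading: frequency/wavelength conversion back into the visible band.
    return [ChannelSignal(s.channel_id, s.wavelength_nm - 500.0, s.amplitude)
            for s in enhanced]

def output_constructor(visualized: List[ChannelSignal]) -> List[Tuple[int, float, float]]:
    # Claim 1 reading: shape the visualized channel signals into display image primitives.
    return [(s.channel_id, s.wavelength_nm, s.amplitude) for s in visualized]

# Per-frame usage (all inputs hypothetical):
# primitives = output_constructor(visualizer(enhancer(
#     real_world_interface(captured_wavefronts), synthetic_channels)))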
CN201780030255.4A 2016-03-15 2017-03-15 Mixed photon VR/AR system Active CN109564748B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211274817.9A CN115547275A (en) 2016-03-15 2017-03-15 Mixed photon VR/AR system

Applications Claiming Priority (17)

Application Number Priority Date Filing Date Title
US201662308585P 2016-03-15 2016-03-15
US201662308825P 2016-03-15 2016-03-15
US201662308687P 2016-03-15 2016-03-15
US201662308361P 2016-03-15 2016-03-15
US62/308,361 2016-03-15
US62/308,585 2016-03-15
US62/308,825 2016-03-15
US62/308,687 2016-03-15
US15/457,991 US9986217B2 (en) 2016-03-15 2017-03-13 Magneto photonic encoder
US15/457,980 US20180031763A1 (en) 2016-03-15 2017-03-13 Multi-tiered photonic structures
US15/457,967 2017-03-13
US15/457,967 US20180035090A1 (en) 2016-03-15 2017-03-13 Photonic signal converter
US15/458,009 2017-03-13
US15/457,991 2017-03-13
US15/457,980 2017-03-13
US15/458,009 US20180122143A1 (en) 2016-03-15 2017-03-13 Hybrid photonic vr/ar systems
PCT/US2017/022459 WO2017209829A2 (en) 2016-03-15 2017-03-15 Hybrid photonic vr/ar systems

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CN202211274817.9A Division CN115547275A (en) 2016-03-15 2017-03-15 Mixed photon VR/AR system

Publications (2)

Publication Number Publication Date
CN109564748A CN109564748A (en) 2019-04-02
CN109564748B true CN109564748B (en) 2022-11-04

Family

ID=65863567

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202211274817.9A Pending CN115547275A (en) 2016-03-15 2017-03-15 Mixed photon VR/AR system
CN201780030255.4A Active CN109564748B (en) 2016-03-15 2017-03-15 Mixed photon VR/AR system

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN202211274817.9A Pending CN115547275A (en) 2016-03-15 2017-03-15 Mixed photon VR/AR system

Country Status (2)

Country Link
JP (2) JP2019521387A (en)
CN (2) CN115547275A (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110518976A (en) * 2019-07-31 2019-11-29 同济大学 A kind of communication device based on distributed optical resonance system
CN110706322B (en) * 2019-10-17 2023-08-11 网易(杭州)网络有限公司 Image display method, device, electronic equipment and readable storage medium
CN112891946A (en) * 2021-03-15 2021-06-04 网易(杭州)网络有限公司 Game scene generation method and device, readable storage medium and electronic equipment
CN116027270B (en) * 2023-03-30 2023-06-23 烟台恒研光电有限公司 Positioning method and positioning system based on averaging processing technology

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4341397B2 (en) * 2003-12-16 2009-10-07 セイコーエプソン株式会社 Light propagation characteristic control apparatus, optical display apparatus, light propagation characteristic control program, optical display apparatus control program, light propagation characteristic control method, optical display apparatus control method, and projector
US9584778B2 (en) * 2008-02-14 2017-02-28 Sutherland C. Ellwood, Jr. Hybrid telecom network-structured architecture and system for digital image distribution, display and projection
US20130335682A1 (en) * 2011-03-09 2013-12-19 Dolby Laboratories Licensing Corporation High Contrast Grayscale and Color Displays
KR101310941B1 (en) * 2012-08-03 2013-09-23 삼성전자주식회사 Display apparatus for displaying a plurality of content views, shutter glasses device for syncronizing with one of the content views and methods thereof
KR101984915B1 (en) * 2012-12-03 2019-09-03 삼성전자주식회사 Supporting Portable Device for operating an Augmented reality contents and system, and Operating Method thereof
US10203762B2 (en) * 2014-03-11 2019-02-12 Magic Leap, Inc. Methods and systems for creating virtual and augmented reality

Also Published As

Publication number Publication date
CN115547275A (en) 2022-12-30
CN109564748A (en) 2019-04-02
JP2022081556A (en) 2022-05-31
JP2019521387A (en) 2019-07-25

Similar Documents

Publication Publication Date Title
US20180122143A1 (en) Hybrid photonic vr/ar systems
Aukstakalnis Practical augmented reality: A guide to the technologies, applications, and human factors for AR and VR
US11935206B2 (en) Systems and methods for mixed reality
Hainich et al. Displays: fundamentals & applications
CA2953335C (en) Methods and systems for creating virtual and augmented reality
US10297071B2 (en) 3D light field displays and methods with improved viewing angle, depth and resolution
US10274731B2 (en) Optical see-through near-eye display using point light source backlight
CN106662731B (en) Wearable 3D augmented reality display
Kim Designing virtual reality systems
WO2018076661A1 (en) Three-dimensional display apparatus
CN109564748B (en) Mixed photon VR/AR system
US8570372B2 (en) Three-dimensional imager and projection device
KR102071077B1 (en) A collimated stereo display system
CN110088663A (en) By providing the system and method that picture material is presented in parallax views in multiple depth planes in multiple pupils
KR20130097014A (en) Expanded 3d stereoscopic display system
Osmanis et al. Advanced multiplanar volumetric 3D display
Cheng Metaverse and immersive interaction technology
Hua Past and future of wearable augmented reality displays and their applications
US20220163816A1 (en) Display apparatus for rendering three-dimensional image and method therefor
WO2017209829A2 (en) Hybrid photonic vr/ar systems
US20220175326A1 (en) Non-Invasive Experimental Integrated Reality System
Hua et al. Head-mounted projection display technology and applications
CN113875230B (en) Mixed mode three-dimensional display method
Pastoor et al. Mixed reality displays
Crespel et al. Autostereoscopic transparent display using a wedge light guide and a holographic optical element: implementation and results

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant