WO2022130414A1 - Virtual presence device which uses trained humans to represent their hosts using Man Machine Interface - Google Patents

Virtual presence device which uses trained humans to represent their hosts using Man Machine Interface

Info

Publication number
WO2022130414A1
WO2022130414A1 (PCT/IN2021/051186)
Authority
WO
WIPO (PCT)
Prior art keywords
host
virtual
represent
hosts
machine interface
Prior art date
Application number
PCT/IN2021/051186
Other languages
English (en)
Inventor
Lokesh PATEL
Reena Patel
Original Assignee
Patel Lokesh
Reena Patel
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Patel Lokesh, Reena Patel
Publication of WO2022130414A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/141 Systems for two-way working between two video terminals, e.g. videophone
    • H04N7/142 Constructional details of the terminal equipment, e.g. arrangements of the camera and the display
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/141 Systems for two-way working between two video terminals, e.g. videophone
    • H04N7/147 Communication arrangements, e.g. identifying the communication as a video-communication, intermediate storage of the signals
    • G PHYSICS
    • G02 OPTICS
    • G02B OPTICAL ELEMENTS, SYSTEMS OR APPARATUS
    • G02B27/00 Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
    • G02B27/01 Head-up displays
    • G02B27/0101 Head-up displays characterised by optical features
    • G02B2027/0138 Head-up displays characterised by optical features comprising image capture systems, e.g. camera
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2203/00 Indexing scheme relating to G06F3/00 - G06F3/048
    • G06F2203/01 Indexing scheme relating to G06F3/01
    • G06F2203/011 Emotion or mood input determined on the basis of sensed human body parameters such as pulse, heart rate or beat, temperature of skin, facial expressions, iris, voice pitch, brain activity patterns

Definitions

  • Virtual presence device which uses trained humans to represent their hosts using Man Machine Interface.
  • The present invention relates to a virtual presence device which uses trained human beings (virtual hosts) to represent their hosts (real hosts) using Man Machine Interface (MMI) technologies, so that virtual presence is experienced in real time.
  • Virtual presence is the ability of a user to feel that they are actually present in a virtual location using technologies like virtual reality (VR) or augmented reality (AR). The user may also be given the ability to affect the remote location: the user's position, movements, actions, voice, etc. may be transmitted and duplicated in the remote location.
  • Video conferencing is the simplest and most commonly used application of virtual presence. Videoconferences are generally held on computers, smart televisions, smartphones, etc., on which the users at both ends can see each other while communicating. Apart from this, robots can be used for virtual presence, enabling the presence and communication of an individual at the desired location.
  • Using virtual presence technology, a user can be present at a live real-world location remote from his own physical location without actually travelling there. This reduces travel expenses, carbon footprint and environmental impact, while saving time and improving productivity.
  • A number of technologies are available in the prior art to provide virtual presence facilities to users.
  • US 10834365 describes an audio-visual monitoring system using a virtual assistant wherein a function of a user-controlled virtual assistant (UCVA) device, such as a smart speaker, can be augmented using video or image information about an environment.
  • A system for augmenting a UCVA device includes an image sensor configured to monitor an environment and a processor circuit configured to receive image information from the image sensor and use artificial intelligence to discern the presence of one or more known individuals in the environment from one or more other features in the environment.
  • The system can include an interface coupled to the processor circuit and configured to provide identification information to the UCVA device about the one or more known human beings in the environment.
  • The UCVA device can be configured by the identification information to update its operating mode.
  • US 10827150 discusses a system and methods for facilitating virtual presence which include a display having a structural matrix configured to arrange a plurality of spaced pixel elements. The spaced pixel elements collectively form an active visual area on which an image is displayable. At least one image capture device is disposed within the active visual area for capturing an image.
  • The system is able to sense the environment in front of the display and, in response to what is sensed, change one or more attributes of a displayed image, or change the displayed image or a portion of it.
  • US 10722800 discloses co-presence handling in virtual reality, wherein a method for controlling a co-presence virtual environment for a first user and a second user includes: determining a first avatar's restricted space in the co-presence virtual environment, the first avatar corresponding to the first user of the co-presence virtual environment; receiving user position data from a first computing device associated with the first user and determining the first avatar's location within the co-presence virtual environment; when the first avatar's location is within the first avatar's restricted space, communicating first co-presence virtual environment modification data to the first computing device; and communicating second co-presence virtual environment modification data to a second computing device associated with the second user.
  • US Patent Application 201816203287 relates to cognitive enhancement of communication with tactile stimulation wherein the methods include, for instance: determining a relationship between participants in an electronic communication. An emotion implicating a tactile stimulation is identified and a sender and a receiver of the tactile stimulation are specified. A contact point to which the tactile stimulation is applied on the body of the receiver is determined based on the relationship, according to a mapping between the relationship and the contact point as stored in a tactile stimulation knowledgebase. The tactile stimulation is delivered by use of a virtual presence user device on the side of the receiver.
  • GB 2567731 discusses low-power virtual reality presence monitoring and notification, wherein the method alerts a virtual reality (VR) headset user to the presence of a human or animal by: using thermal sensors (211, Fig. 3) to detect activity near the user 403; determining whether the thermal activity is moving 405; analysing the thermal activity to determine whether it is characteristic of an animate being 407; and generating a visual alert (505, Fig. 5) via the headset 409.
  • The visual alert to the user may be one of: an outline image; a thermal image; a photographic image; or a VR character.
  • The pose and location of the source of the thermal activity may be continuously monitored and the visual alert updated accordingly.
  • The VR headset may also generate an audio alert.
  • The sensors may be thermopile sensors. The system ensures that the VR headset user is alerted to, but not startled by, the presence of a real-world animate being.
  • WO 2017007179 discloses a method for expressing the social presence of a virtual avatar by using facial temperature changes according to heartbeats, and a system employing the same, which comprises the steps of: detecting ECG data of an actual user in real time; detecting, from the ECG data, a facial temperature change according to heartbeats; and changing the face of the user's avatar in response to the facial temperature change.
  • AU 2018427118 provides multi-location virtual collaboration, monitoring and control, in which a virtual presence may be established to provide engagement with equipment and operators at hydrocarbon recovery, exploration, operation, or services environments, in order to reduce the expense associated with having knowledge experts, or other personnel, travel to and work at such remote environments.
  • A wearable device may allow a user, such as a subject matter expert, at a virtual real-time operation center to view data from equipment located at the site.
  • JP 2020095517 gives an image processing system, image processing method, imaging device and program which allow a communication band to be allocated to an image dynamically and appropriately, improving image quality and position detection accuracy in an MR system.
  • The MR system includes an HMD that photographs a real space and acquires a captured image, and a PC (image processing device) that generates a composite image of the captured image and an image in a virtual space.
  • The HMD includes: an index detection unit that detects the presence or absence, in the acquired captured image, of an index used to obtain the composite image; and a transmission control unit that selects the captured image to be transmitted to the PC, or sets a compression ratio for the captured image, based on the result of the index detection unit.
  • The PC includes: an index detection unit that detects the index in the captured image received from the imaging device; and an image combining unit that combines the captured image with the image of the virtual space based on the result of the index detection unit.
  • CN 111325124 comprises a real-time man-machine interaction system in a virtual scene, wherein the system comprises a visual attention area prediction module used for mutation detection and a behavior prediction module based on visual attention area characteristics.
  • The visual attention area prediction module receives an input video frame sequence, carries out target information detection, smooth motion detection and mutation information detection in sequence to obtain a visual saliency map, and extracts visual attention areas from the saliency map to obtain an attention area map; the behavior prediction module predicts user behaviors using features extracted from the user's visual area and the video content.
  • The feedback behavior of the user after observing the video is predicted from the video content observed by the user, so that the method operates well on smoothly changing scenes as well as scenes containing sudden changes.
  • Although a number of systems like videoconferencing are available in the prior art for providing some kind of virtual presence to a host located at another location, such systems make no provision for the physical presence of the user at the desired location, which is essential for conducting various meetings, inspections, presentations, etc.
  • A virtual presence device which uses trained persons as virtual hosts, wearing one part of the device, to represent their hosts, who are remotely located and wearing the other part of the device, is therefore the need of the day.
  • The main object of the invention is to provide a virtual presence device which uses trained humans to represent their hosts using a Man Machine Interface, facilitating meetings, conferences, inspections, etc. conducted remotely by a user (host) in any part of the world in real time, with the help of a trained person using the virtual presence device.
  • Another object of the invention is to provide such a device in which the host has a device through which he can control the device of the trained person (virtual host) representing him at a remote location.
  • Still another object of the invention is to provide such a device in which the virtual host device casts/streams the host data to multiple streams of the device and conveys the physical movements and responses received from the host device.
  • Yet another object of the invention is to provide such a device in which the virtual host in the remote location discusses, presents, moves, examines, repairs, trains and/or behaves according to the host, owing to the continuous communication between the host device and the virtual host device.
  • A further object of the invention is to provide such a device which saves both the time and the money of the user (host).
  • The present invention provides a virtual presence device which uses one trained person (virtual host) to represent another person (real host) at a different location, using a Man Machine Interface (MMI).
  • The host, through his device, guides the virtual host, on the virtual host's device, through actions like walking, running, turning, bowing, sitting, standing, speaking, translating, broadcasting, presenting, repairing, inspecting, training, etc., thereby creating the host's virtual presence in any part of the world in real time.
  • The devices used for creating the virtual presence can be divided into the host device and the virtual host device.
  • The host device comprises a 3D camera, monitor, pointing device, mic array, speakers and position sensor, while the virtual host device comprises a curved display on the exterior face of the helmet, a display on the interior face of the helmet, an inside speaker, an inside mic, a mic array, speakers, a 180+ degree camera, a position sensor, an IR sensor, a battery pack, in-use charging, dual-band Wi-Fi and an active or passive cooling system.
  • Both the host and virtual host devices are worn on the head, leaving the wearers' hands free for any kind of work or gestures.
  • Fig. 1 displays the two parts of the device, one worn by the real host and the other worn by the virtual host.
  • Fig. 2 lists the hardware of both the host device and the virtual host device.
  • Fig. 3 gives the image of the 3D sensing camera.
  • Fig. 4 gives the image of the transparent and flexible OLED display.
  • Fig. 5 gives the image of the hand-gesture device.
  • Fig. 6 gives the flowchart of the gesture recognition framework.
  • Fig. 7 gives an image of the gyroscopic sensor.
  • The present invention describes a virtual presence device which uses trained humans to represent their hosts using a Man Machine Interface, providing a real-time virtual presence experience.
  • This device helps a human host create his/her virtual presence in any part of the world in real time with the help of a trained person, who acts as a virtual host. People surrounding the virtual host can interact with the actual host in real time, and the host too can interact with them through the virtual host. With the help of this device, the host can attend meetings in real time, carry out audits of any place remotely, offer remote service support to industries, address an audience remotely, operate equipment/machinery remotely, carry out repairs remotely, explore remote locations/places, and perform many such activities with the aid of the virtual host.
  • This device comprises two parts, as shown in Fig. 1. One part is worn as headgear (a wearable hat/helmet type of device) by the host, who wishes to project his virtual presence at a remote location; the facial expressions, voice and video are captured through the headgear, while the body movements are captured by gyroscopic sensors and devices worn on the body by the user.
  • This part of the device has overriding control of the meeting and is able to control the hardware of the virtual host as and when required.
  • The other part of the device is worn as headgear (a wearable helmet type of device) by the virtual host, as shown in Fig. 1, who represents the host at the remote location; the face of the host is displayed on the front screen of the headgear, along with his voice and video.
  • The virtual host also wears gyroscopic sensors and devices on his body to capture the gestures of the host and perform the actions instructed by the host.
  • This device also has external as well as internal speakers and microphones for both private and public communication. It casts/streams the host data to multiple streams of devices and also conveys the physical movements and responses received.
  • The virtual host will also be able to act as a translator and interpreter in the language required.
  • Both parts of the device are connected to cloud servers for data storage and internet connectivity, using available communication technologies like Wi-Fi and 2G/3G/4G/5G.
  • A number of hardware components are provided in both headgears; these work with the related software to transfer the activities performed by the host to the virtual host, so that the virtual host performs these activities at the remote location as a proxy of the host.
  • The host guides the virtual host through actions like walking, running, turning, bowing, sitting, standing, speaking, talking, broadcasting, presenting, inspecting, repairing, healing, training, etc., which are replicated by the virtual host present at the remote location.
  • The host device comprises hardware such as a 3D sensing camera, a transparent and flexible OLED display, gesture control devices, a microphone array, a speaker array, gyroscopic sensors, GPS sensors, motion sensors, proximity sensors, infrared sensors, force sensors, temperature/humidity sensors, a communication module (4G, 5G, Wi-Fi), a battery pack, and an active or passive cooling system.
  • The virtual host device comprises hardware such as a 3D sensing 180+ degree camera, a transparent and flexible OLED display, an inside display device, gesture control devices, a microphone array (including inside and outside microphones), a speaker array (including inside and outside speakers), gyroscopic sensors, GPS sensors, motion sensors, proximity sensors, infrared sensors, force sensors, temperature/humidity sensors, a communication module (4G, 5G, Wi-Fi), a battery pack, and an active or passive cooling system.
  • The hardware of both the host device and the virtual host device is listed in Fig. 2.
  • The software features of the host device and the virtual host device comprise cloud-based authentication, End-To-End Encryption (E2EE), video & audio streaming and manipulation, gesture recording from devices, movement control and positioning, live feedback to the operator, playing of pre-recorded content, a one-to-one communication feature, saving of meeting hours on the cloud, and a privacy mode.
  • 3D sensing camera (Fig. 3) - Three-dimensional (3D) sensing is a momentous scientific breakthrough: a depth-sensing technology that augments camera capabilities for facial and object recognition, capturing a real-world object's length, width, and height with more clarity and in-depth detail than can be achieved with other technologies. 3D technology delivers unique advancements in the way day-to-day activities are perceived and approached.
  • 3D is a real game-changer as manufacturers scramble to incorporate these new advancements into consumer products such as mobile phones.
  • 3D sensing technology mimics the human visual system using optical technology, which facilitates the emergence and integration of augmented reality, AI (Artificial Intelligence), and the Internet of Things (IoT). This creates unique opportunities in consumer applications.
  • VCSELs, the light source technology for 3D sensing, can replace LEDs or edge-emitting laser diodes, as they are simple, have a narrow spectrum, and are temperature-stable.
  • Stereoscopic vision, structured light pattern, and time of flight are three technologies used for 3D sensing.
  • Stereoscopic vision technology derives its structure from the way human eyes capture an image. Two cameras are placed at slightly offset positions (just like human eyes), and the two captured images are then merged into one picture by software. Small variances resulting from the different camera positions create the stereoscopic, i.e. 3D, picture.
  • A laser projection module may be deployed to project dots onto the object or scene, helping the camera focus more easily; the captured image is processed to bring out a depth effect. For instance, this technology is used in bullet cameras installed to monitor people's movement at door entrances and other places.
  • FLIR Systems (U.S.) manufactures stereo vision camera systems based on stereoscopic vision technology.
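  • As an editorial illustration (not part of the patent) of the disparity-to-depth relation used in stereoscopic vision, the following minimal Python sketch recovers depth from the pixel offset between the two camera views; the focal length and baseline values are assumed for the example.

```python
# Stereo depth from disparity: Z = f * B / d
# f: focal length in pixels, B: camera baseline in metres,
# d: disparity in pixels between the left and right images.

def depth_from_disparity(disparity_px: float,
                         focal_px: float = 700.0,    # assumed focal length
                         baseline_m: float = 0.06):  # assumed 6 cm baseline
    """Distance (metres) of a point observed with the given disparity."""
    if disparity_px <= 0:
        raise ValueError("point must be visible in both views (positive disparity)")
    return focal_px * baseline_m / disparity_px

# A feature shifted 21 px between the two views lies about 2 m away:
print(round(depth_from_disparity(21.0), 2))  # 2.0
```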
  • In the structured light approach, a light pattern made of lines, squares (periodic structures), or dots is projected onto an object or a scene by a laser projection module.
  • The reflected light creates a distorted pattern.
  • The reflected light from the target is captured by a camera mounted in a triangular arrangement with the projection module.
  • The pattern distortion, resolved by triangulation between the projection module and the camera, yields the 3D coordinates of the object or scene.
  • The most common example is the TrueDepth camera used in the iPhone X.
  • The front camera with this technology adds an infrared emitter that projects over 30,000 dots in a known pattern onto the user's face. Those dots are then photographed by a dedicated infrared camera for analysis, and the analyzed image is used for unlocking the phone.
  • Time-of-Flight camera sensors can be used for object scanning, measuring distance, indoor navigation, obstacle avoidance, gesture recognition, tracking objects, measuring volumes, reactive altimeters, 3D photography, and augmented reality games, among others.
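  • Time-of-flight ranging reduces to simple arithmetic on the round-trip time of a light pulse; a minimal editorial sketch with assumed values:

```python
# Time-of-flight ranging: distance = (speed of light * round-trip time) / 2
C_M_PER_S = 299_792_458.0  # speed of light

def tof_distance_m(round_trip_s: float) -> float:
    """Distance to the target given the measured round-trip time of a light pulse."""
    return C_M_PER_S * round_trip_s / 2.0

# A pulse returning after ~13.3 nanoseconds indicates a target about 2 m away:
print(round(tof_distance_m(13.34e-9), 2))  # 2.0
```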
  • Transparent and flexible OLED display (Fig. 4) - Many transparent and flexible displays are based on the new Organic Light Emitting Diode (OLED) display technology, and it will often be used in future displays.
  • Both LCD and OLED displays remain bright in a sunny environment and provide excellent picture quality.
  • OLEDs are overall slightly brighter than LCDs because of their dynamic range and lack of a backlight. Contrast ratio is another important aspect of picture quality: a high-contrast display looks more realistic than a low-contrast one. The contrast ratio is the difference between the brightest and darkest pixels of a display, and OLED displays have a clear advantage in contrast because they have true black pixels.
  • Both LCD and OLED displays can provide Ultra HD 4K resolution, so there is no difference when it comes to display resolution. Some old LCDs are still available in 1080p, but modern displays are all 4K.
  • Transparent OLED displays consist only of transparent components; when turned off they are up to 85% transparent, and when turned on they allow light to pass in both directions.
  • Gesture Control Devices - Gesture control allows a human to interact with a device without touch or audio. Instead, the device can detect and decipher movements or actions and translate them into functions.
  • Gesture control is the ability to recognize and interpret movements of the human body in order to interact with and control a computer system without direct physical contact.
  • the term “natural user interface” is becoming commonly used to describe these interface systems, reflecting the general lack of any intermediate devices between the user and the system.
  • Gesture control or gesture recognition, is both a topic of computer science and language technology, where the primary goal is the interpretation of human gestures via algorithms.
  • Gesture control devices have the ability to recognize and interpret movements of the human body, allowing users to interact with and control a system without direct physical contact. Gestures can originate from any bodily motion or state, but normally originate from the hand.
  • Several touchless gesture control technologies are used today to enable devices to recognize and respond to gestures and movements. These range from cameras to radar, each with pros and cons depending on the application.
  • The Myo armband is a wearable device provided with eight equally spaced non-invasive electromyography (EMG) electrodes and a Bluetooth transmission module.
  • The EMG electrodes detect signals from forearm muscle activity, and the acquired data is then sent to an external electronic device.
  • The sampling rate for Myo data is fixed at 200 Hz, and the data is returned as a unitless 8-bit unsigned integer for each sensor representing "activation"; it does not translate to millivolts (mV).
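  • As a minimal editorial sketch of handling such data (the function name and window size are assumptions, not the patent's software), the snippet below normalizes the unitless 8-bit Myo samples and applies a short moving-average envelope, a common first step before gesture classification:

```python
import numpy as np

FS_HZ = 200      # Myo EMG sampling rate (per the text above)
N_CHANNELS = 8   # eight electrodes on the armband

def emg_envelope(raw: np.ndarray, window_s: float = 0.1) -> np.ndarray:
    """Normalize unitless uint8 EMG samples to [0, 1] and smooth each channel
    with a moving-average window (default 100 ms). raw: (n_samples, 8)."""
    x = raw.astype(np.float32) / 255.0          # 8-bit "activation" -> [0, 1]
    win = max(1, int(window_s * FS_HZ))
    kernel = np.ones(win, dtype=np.float32) / win
    return np.stack([np.convolve(x[:, c], kernel, mode="same")
                     for c in range(x.shape[1])], axis=1)

# One second of synthetic data (200 samples x 8 channels):
demo = np.random.randint(0, 256, size=(FS_HZ, N_CHANNELS), dtype=np.uint8)
print(emg_envelope(demo).shape)  # (200, 8)
```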
  • Recognition of human gestures comes within the more general framework of pattern recognition.
  • Such systems consist of two processes: the representation process and the decision process.
  • The representation process converts the raw numerical data into a form adapted to the decision process, which then classifies the data. This recognition process is displayed in Fig. 6.
  • Gesture recognition systems inherit this structure and add two more processes: the acquisition process, which converts the physical gesture to numerical data, and the interpretation process, which gives the meaning of the symbol series coming from the decision process.
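  • The four-stage pipeline just described (acquisition, representation, decision, interpretation) can be sketched as follows; this is an editorial illustration of the framework in Fig. 6 with hypothetical stand-in functions, not code from the patent:

```python
from typing import Callable, List

# acquisition -> representation -> decision -> interpretation
def run_gesture_pipeline(acquire: Callable[[], List[float]],
                         represent: Callable[[List[float]], List[float]],
                         decide: Callable[[List[float]], str],
                         interpret: Callable[[str], str]) -> str:
    raw = acquire()            # physical gesture -> numerical data
    features = represent(raw)  # raw data -> form adapted to the classifier
    symbol = decide(features)  # classify features into a gesture symbol
    return interpret(symbol)   # symbol series -> meaning / command

# Toy example: classify a wrist-roll reading into a command.
meaning = run_gesture_pipeline(
    acquire=lambda: [0.9, 0.1, 0.05],
    represent=lambda raw: [v / max(raw) for v in raw],   # simple normalization
    decide=lambda f: "roll" if f[0] == 1.0 else "idle",
    interpret=lambda s: {"roll": "turn left", "idle": "hold"}[s],
)
print(meaning)  # turn left
```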
  • Microphone Array - A Microphone Array (or array microphone) is a microphone device that functions just like a regular microphone, but instead of having only one microphone to record sound input, it has multiple microphones (2 or more) to record sound.
  • the microphones in the array device work together to record sound simultaneously.
  • Microphone arrays can be designed with as many microphones as needed or wanted to record sound.
  • A common microphone array is a 2-microphone array device, with one microphone placed on the left side of the device and the other on the right side. With one microphone on each side, sounds can be recorded from both the left and right sides of the room, making for a dynamic stereo recording that mimics surround sound. When played back on a stereo headset, the separate left- and right-channel recordings are distinctly different and clearly audible.
  • The most important characteristic of microphone array devices is microphone matching. All of the microphones in an array must be similar, closely matched and, in some aspects, exactly the same in order for the array to produce a good recording.
  • Three aspects to consider for microphone matching in microphone arrays are directionality, sensitivity, and phase.
  • Directionality - The directionality of a microphone is the direction from which it can pick up sounds. Some microphones are made to pick up sounds from only one direction (unidirectional microphones); others can pick up sounds from all directions (omnidirectional microphones). When building an array microphone, all the microphones must have the same directionality: having one microphone pick up sounds from only a certain direction and another pick up sounds from all directions would make for an imbalanced sound recording, which, barring some unique situation, is undesirable.
  • Sensitivity - Sensitivity is the gain that a microphone applies when recording a signal. Sensitivity must be closely matched across a microphone array, or else one microphone will be louder than another, producing imbalanced recordings. This is why the maximum sensitivity deviation usually allowed in array microphones is ±1.5 dB, so that there is no more than a 3 dB sensitivity difference between microphones.
  • Phase - Phase is the reference for the time at which a microphone begins recording; it determines when all microphones in an array start and stop recording. If microphones have drastically different phases, they will record signals at different times, leading to unsynchronized recording, which again is undesirable. Just as for sensitivity, there must be a maximum allowable tolerance for phase difference between microphones; this is usually ±1.5 degrees, ensuring that signals are recorded at the same time for a harmonized recording.
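  • The matching tolerances above (±1.5 dB in sensitivity, ±1.5° in phase) lend themselves to a simple acceptance check; the sketch below is an editorial illustration with thresholds taken from the text:

```python
def mics_matched(sensitivities_db, phases_deg,
                 max_sens_spread_db=3.0,     # ±1.5 dB about the mean
                 max_phase_spread_deg=3.0):  # ±1.5 degrees about the mean
    """True if every microphone in the array is within the allowed sensitivity
    and phase spread of every other microphone."""
    sens_spread = max(sensitivities_db) - min(sensitivities_db)
    phase_spread = max(phases_deg) - min(phases_deg)
    return sens_spread <= max_sens_spread_db and phase_spread <= max_phase_spread_deg

# A 2-mic array whose capsules differ by 1 dB and 0.8 degrees passes:
print(mics_matched([-42.0, -41.0], [0.1, 0.9]))  # True
```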
  • The microphone array consists of multiple mics placed at different angles throughout the attendee device; using multiple mics at different angles provides audio-direction feedback to the system, enabling it to locate the source of a recorded voice.
  • Speakers Array - Vertical line arrays and column speakers tend to provide good control of the vertical coverage, but a predetermined horizontal coverage. Depending on the aspect ratio of the room and the horizontal reverberation character of the space, point-source speakers can perform better than a line array in this regard. In air, sound is transmitted by pressure variations from its source to the surroundings.
  • The sound level decreases further and further away from the source. While absorption by air is one of the factors contributing to the weakening of a sound during transmission, distance plays a more important role in noise reduction during transmission. The reduction of a sound is called attenuation.
  • The effect of distance attenuation depends on the type of sound source. Most sounds or noises we encounter in daily life come from sources that can be characterized as point or line sources. If a sound source produces spherical spreading of sound in all directions, it is a point source.
  • For a point source, the noise level decreases by 6 dB per doubling of distance. If the sound source produces cylindrical spreading of sound, it may be considered a line source; for a line source, the noise level decreases by 3 dB per doubling of distance.
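  • These 6 dB and 3 dB per-doubling rules correspond to the standard distance-attenuation formulas; the sketch below (editorial, not from the patent) applies them:

```python
import math

def attenuated_level_db(level_at_ref_db: float, ref_m: float, dist_m: float,
                        source: str = "point") -> float:
    """Sound level at dist_m given the level at ref_m.
    Point source (spherical spreading): -20*log10(d/d0) -> 6 dB per doubling.
    Line source (cylindrical spreading): -10*log10(d/d0) -> 3 dB per doubling."""
    factor = 20.0 if source == "point" else 10.0
    return level_at_ref_db - factor * math.log10(dist_m / ref_m)

print(round(attenuated_level_db(80.0, 1.0, 2.0, "point"), 1))  # 74.0 (6 dB drop)
print(round(attenuated_level_db(80.0, 1.0, 2.0, "line"), 1))   # 77.0 (3 dB drop)
```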
  • Gyroscope Sensors - A gyroscope sensor is a device that can measure and maintain the orientation and angular velocity of an object. Gyroscopes are more advanced than accelerometers: they can measure the tilt and lateral orientation of an object, whereas an accelerometer can only measure its linear motion.
  • Gyroscope sensors are also called Angular Rate Sensors or Angular Velocity Sensors. They are installed in applications where the orientation of the object is difficult for humans to sense. Angular velocity, measured in degrees per second, is the change in the rotational angle of the object per unit of time.
  • Gyroscope sensors can also measure the motion of the object.
  • In consumer electronics, gyroscope sensors are combined with accelerometer sensors.
  • Gyroscope sensors come in a range of sizes. From large to small, the hierarchy can be listed as ring laser gyroscope, fiber-optic gyroscope, fluid gyroscope and vibration gyroscope. Being small and easy to use, the vibration gyroscope, displayed in Fig. 7, is the most popular; its accuracy depends on the stationary-element material used in the sensor and on structural differences.
  • The main functions of the gyroscope sensor across all applications are angular velocity sensing, angle sensing, and control mechanisms.
  • Using gyroscope sensors, we will be able to accurately transmit angular velocity data from the host device to the attendee device and vice versa, which in turn will supply processed data to the gesture control devices to replicate accurate movements for the attendee human.
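  • Since the devices exchange angular velocity in degrees per second, the receiving side can recover an orientation angle by integrating over time; a minimal editorial sketch, assuming a fixed sample interval:

```python
def integrate_orientation(angular_rates_dps, dt_s=0.01, start_deg=0.0):
    """Recover an orientation angle (degrees) from a stream of angular-velocity
    samples (degrees per second) taken every dt_s seconds, as a receiving
    device might do to replay the sender's head rotation."""
    angle, trajectory = start_deg, []
    for rate in angular_rates_dps:
        angle += rate * dt_s  # simple rectangular (Euler) integration
        trajectory.append(angle)
    return trajectory

# 100 samples at 90 deg/s over one second: the head has turned 90 degrees.
print(round(integrate_orientation([90.0] * 100)[-1], 1))  # 90.0
```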
  • GPS Sensor - GPS stands for Global Positioning System, a satellite navigation system whose satellites orbit about 20,000 km above the Earth. It can provide location and time information and works 24 hours a day under any conditions. A complete GPS constellation requires at least 24 satellites; as technology develops, more than 33 satellites now work together in the GPS system.
  • A GPS tracker is a terminal device based on GPS positioning technology.
  • With a GPS sensor in the host as well as the attendee device, we will be able to capture the geographical usage of both devices.
  • In future versions of the present invention, we may develop one-to-many communication/broadcasting devices, in which advanced use of GPS data will play a vital role.
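  • One simple use of the captured GPS fixes is computing the separation between the host and attendee devices; the haversine great-circle formula below is a standard editorial sketch, not a method claimed in the patent:

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two GPS fixes given in degrees."""
    r = 6371.0  # mean Earth radius, km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Separation between two device fixes (e.g., Mumbai and Delhi):
print(round(haversine_km(19.076, 72.8777, 28.7041, 77.1025)))  # about 1150 km
```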
  • Motion detection, proximity sensing for absence/presence, infrared for heat, force for movement intensity, temperature & humidity sensors, etc. will be added from time to time.
  • Our motto is to create virtual + augmented reality using humans, for humans, in such a way that as much environment/surrounding data as possible is exchanged at both ends.
  • Battery Pack - Battery technology has already improved enormously over the nickel-based cells used in the 80s.
  • The following decade's switch to lithium-ion/polymer batteries allowed more power to be crammed into smaller spaces, helping kick off the smartphone revolution.
  • Today, manufacturers are already using innovative solutions to provide more power, and hardly a day goes by without news of a potentially revolutionary new piece of battery technology.
  • The user helmet will have one battery pack installed in the device, with the capacity to run independently of the docking station. Another battery pack with snap-in functionality will enable the user to carry an extra battery pack while mobile, and to swap batteries without interrupting the device's operation.
  • Active or passive cooling system - Having multiple displays and processors on the device will generate heat, which may decrease the life and efficiency of the device.
  • A passive cooling system along with an active cooling system will carry heat away from the device as silently as possible, reducing noise interruption.
  • End-to-end encryption - End-to-end encryption (E2EE) is a method of secure communication that prevents third parties from accessing data while it is transferred from one end system or device to another.
  • With E2EE, the data is encrypted on the sender's system or device, and only the intended recipient can decrypt it. As it travels to its destination, the message cannot be read or tampered with by an internet service provider (ISP), application service provider, hacker or any other entity or service. Many popular messaging service providers use end-to-end encryption, including Facebook, WhatsApp and Zoom. In the present invention, E2EE is used to secure communication.
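  • As a minimal editorial sketch of the E2EE idea (assuming Python's cryptography package; this is not the patent's implementation), the two devices can each derive the same session key via an X25519 key exchange, so that only the peers can read traffic crossing the network:

```python
# pip install cryptography
import base64
from cryptography.fernet import Fernet
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

# Each side generates a key pair; only public keys cross the network.
host_priv = X25519PrivateKey.generate()
attendee_priv = X25519PrivateKey.generate()

def session_key(my_priv, their_pub) -> bytes:
    shared = my_priv.exchange(their_pub)  # X25519 Diffie-Hellman
    raw = HKDF(algorithm=hashes.SHA256(), length=32,
               salt=None, info=b"vp-device-session").derive(shared)
    return base64.urlsafe_b64encode(raw)  # Fernet expects a base64 key

k_host = session_key(host_priv, attendee_priv.public_key())
k_attendee = session_key(attendee_priv, host_priv.public_key())
assert k_host == k_attendee  # both sides derive the same secret

token = Fernet(k_host).encrypt(b"walk forward and greet the panel")
print(Fernet(k_attendee).decrypt(token))  # only the peer can read it
```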
  • Video & audio streaming and manipulation - Video streaming software allows multiple camera sources to be mixed to create a professional-looking HD broadcast.
  • Encoding is another major function of streaming software; typically, such software serves two main purposes: encoding and mixing/production.
  • Live broadcasting software uses video encoding technology to convert your video feed into a suitable format for live streaming.
  • The audio & video stream from the host device will be broadcast live on the front transparent OLED screen to give the audience the live presence of the host.
  • The software will take its input from the 3D camera hardware connected to the host device. The 3D camera will provide depth and distance data along with the video stream; the software will then perform facial recognition and capture the depth information of the facial area.
  • The software will generate an instant 3D model from the captured data and transmit it to the attendee side, to be played on the convex flexible display so as to mimic the depth of the human face on the curved display.
  • The system will use the input from the multiple-camera array of the attendee device to create a panoramic video stream, and using live stitching and stretching operations will create video with the width of the human eye's viewport.
  • The software will also use facial recognition to verify the attendance of the attendee, and will store and display it in real time on the screen along with other statistics related to the attendee and the meeting.
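  • The facial-region step described above can be sketched with off-the-shelf tools; the snippet below (an editorial illustration assuming OpenCV, not the patent's software) locates the largest face in a colour frame so that the matching region of an aligned depth map can be cropped and transmitted:

```python
import cv2
import numpy as np

# Haar-cascade face detector shipped with OpenCV (pip install opencv-python).
detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def face_depth_crop(frame_bgr: np.ndarray, depth_map: np.ndarray):
    """Find the largest face in the colour frame and return the matching
    region of the aligned depth map, or None if no face is found."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    x, y, w, h = max(faces, key=lambda f: f[2] * f[3])  # largest detection
    return depth_map[y:y + h, x:x + w]                  # depth of the facial area

# Smoke test on a blank synthetic frame and depth map:
crop = face_depth_crop(np.zeros((480, 640, 3), np.uint8),
                       np.zeros((480, 640), np.float32))
print(crop)  # None (no face in a blank frame)
```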
  • Gesture recording from devices - Gesturing is a natural and intuitive way to interact with people and the environment. So it makes perfect sense to use hand gestures as a method of human-computer interaction (HCI).
  • Hand tracking and gesture recognition are not the same things. Both technologies are supposed to use hands for human-machine interaction (HMI) without touching, switching, or employing controllers. Sometimes, systems for hand tracking and gesture recognition require the use of markers, gloves, or sensors, but the ideal system requires nothing but a human hand.
  • The software will track and record the hand gestures of the narrator and convert them into an instruction set for the attendee device operator.
  • Intel has recently released a suite of depth and tracking technologies called RealSense, providing the developer community with open-source tools for a variety of languages and platforms.
  • The Intel RealSense Depth Camera D455, with Lidar, stereo depth, tracking, and coded light capabilities, provides a high level of gesture recognition and a longer range for HMI.
  • Dynamic hand gesture recognition systems can be applied to various use cases, from robotics and drones to 3D scanning and people tracking.
  • The host will have the ability to point at an object in the video feed streamed from the attendee device, and the attendee can act according to the instruction and verbal communication. The host can also navigate the attendee and send visual signs and notifications, along with external data such as images or video, as needed.
  • The software will transmit data bidirectionally between the host and attendee devices to create a real-time virtual presence experience. Such data will also add to the augmented reality experience.
  • The software will take input from the installed gyroscopic and other positioning sensors to record the movement of the operator and compare it with the given instructions, creating a feedback loop for gesture and movement.
  • The software will process the compared data and give feedback to both the host and attendee devices for improvement and confirmation of the movement.
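  • A minimal editorial sketch of such a movement feedback loop (hypothetical names, not the patent's software): compare the instructed rotation with the rotation measured by the attendee's gyroscope and emit a correction or a confirmation.

```python
def movement_feedback(target_deg: float, measured_deg: float,
                      tolerance_deg: float = 2.0) -> str:
    """Compare the host's instructed rotation with the attendee's measured
    rotation and return a correction or a confirmation."""
    error = target_deg - measured_deg
    if abs(error) <= tolerance_deg:
        return "movement confirmed"
    direction = "further" if error > 0 else "back"
    return f"adjust {direction} by {abs(error):.1f} degrees"

print(movement_feedback(90.0, 84.5))  # adjust further by 5.5 degrees
print(movement_feedback(90.0, 89.0))  # movement confirmed
```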
  • Playing pre-recorded content - In the case of a conference or a repeated session, the host can play pre-recorded video along with the recorded gestures and dynamic voice audio; the other functions of the system, such as attendee recording and logging from the attendee device, will continue to operate.
  • One-to-one communication feature - This feature will be used when the host side enables the enhanced privacy mode. The operator will be replaced by the attendee, and the outside display and speaker system will be disabled.
  • Enabling privacy mode will disable the software functions related to storage, data processing and feedback. In this mode the device will not store any video or audio data and will not log the communication, other than the connection and disconnection times.
  • The user will not be able to access any features related to data storage and manipulation while in one-to-one communication mode.
  • The system should integrate with MS Outlook and provide powerful Agenda, Task and Meeting Minutes modules, as well as a comprehensive Meeting Analytics system for analyzing meeting data, making it a most effective meeting management tool.
  • No data from the meeting will be saved on the local disk of any of the devices.
  • The user will have the option to save the up-link and down-link streams on the cloud on demand, and the saved streams can be accessed at any time using the cloud service.
  • Privacy mode - This mode enables the host user to have a private conversation with the virtual host (by replacing the operator with the virtual host); the software and service will neither store nor process the conversation with any of the software's tools, nor will it be stored on the cloud.
  • The virtual presence device of the present invention, which uses trained persons to represent their hosts using a Man Machine Interface (MMI), is highly advantageous as it solves a major problem faced in virtual interactions: providing personal presence at a remote location. It provides a virtual host as a proxy for the host (user), who walks, talks, moves, enacts, discusses, trains, diagnoses, inspects, etc. at the remote location as per the guidance of the host. This saves the time as well as the money of the host that would otherwise have been spent travelling personally to the remote location to attend the event.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The present invention concerns a virtual presence device which uses a virtual host to represent the real host by means of man-machine interface technologies, giving a feeling of virtual presence in real time. One part of the device is worn by the host in the form of a helmet built from a multiplicity of hardware loaded with the required software, through which he guides the attendee in actions such as walking, running, turning, bowing, sitting, standing, speaking, translating, broadcasting, presenting, training, repairing, diagnosing, etc. The attendee receives the guidance from the host through the other part of the device, which he wears in the form of a helmet built from various hardware loaded with the required software. The host and the attendee also both wear gyroscopic sensors on the body parts that must perform the required movements.
PCT/IN2021/051186 2020-12-17 2021-12-17 Virtual presence device which uses trained humans to represent their hosts using Man Machine Interface WO2022130414A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IN202021054896 2020-12-17
IN202021054896 2020-12-17

Publications (1)

Publication Number Publication Date
WO2022130414A1 (fr) 2022-06-23

Family

ID=82058587

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IN2021/051186 WO2022130414A1 (fr) 2020-12-17 2021-12-17 Virtual presence device which uses trained humans to represent their hosts using Man Machine Interface

Country Status (1)

Country Link
WO (1) WO2022130414A1 (fr)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180033203A1 (en) * 2016-08-01 2018-02-01 Dell Products, Lp System and method for representing remote participants to a meeting
US10469546B2 (en) * 2011-10-28 2019-11-05 Magic Leap, Inc. System and method for augmented and virtual reality

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10469546B2 (en) * 2011-10-28 2019-11-05 Magic Leap, Inc. System and method for augmented and virtual reality
US20180033203A1 (en) * 2016-08-01 2018-02-01 Dell Products, Lp System and method for representing remote participants to a meeting

Similar Documents

Publication Publication Date Title
US20220337693A1 (en) Audio/Video Wearable Computer System with Integrated Projector
US7725547B2 (en) Informing a user of gestures made by others out of the user's line of sight
US20100060713A1 (en) System and Method for Enhancing Noverbal Aspects of Communication
US11176358B2 (en) Methods and apparatus for sharing of music or other information
US10521013B2 (en) High-speed staggered binocular eye tracking systems
US20220217495A1 (en) Method and network storage device for providing security
US20180278995A1 (en) Information processing apparatus, information processing method, and program
WO2018216355A1 (fr) Information processing apparatus, information processing method, and program
US10440103B2 (en) Method and apparatus for digital media control rooms
EP3568992A1 (fr) Use of chimes for ROI identification in 360-degree video
JP2012175136A (ja) Camera system and control method thereof
US9332580B2 (en) Methods and apparatus for forming ad-hoc networks among headset computers sharing an identifier
JPWO2019155735A1 (ja) Information processing device, information processing method, and program
WO2018075523A1 (fr) Audio/video wearable computer system with integrated projector
KR101784095B1 (ko) Head-mounted display device using a plurality of image data and system for transmitting and receiving the plurality of image data
US11810219B2 (en) Multi-user and multi-surrogate virtual encounters
JP6969577B2 (ja) Information processing device, information processing method, and program
WO2008066705A1 (fr) Image sensor apparatus with indicator
JP2023531849A (ja) Augmented reality device performing audio recognition and control method therefor
US11622083B1 (en) Methods, systems, and devices for presenting obscured subject compensation content in a videoconference
US11212485B2 (en) Transparency system for commonplace camera
WO2022130414A1 (fr) Virtual presence device which uses trained humans to represent their hosts using Man Machine Interface
US12088781B2 (en) Hyper-connected and synchronized AR glasses
EP4325842A1 (fr) Video display system, information processing device, information processing method, and program
CN117311506A (zh) Augmented reality interaction method and apparatus, and augmented reality device

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 21906015

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: PCT application non-entry in European phase

Ref document number: 21906015

Country of ref document: EP

Kind code of ref document: A1