CN115756170A - Multi-device interaction system and method based on augmented reality device

Multi-device interaction system and method based on augmented reality device

Info

Publication number
CN115756170A
Authority
CN
China
Prior art keywords
screen
augmented reality
devices
reality device
resources
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211490160.XA
Other languages
Chinese (zh)
Inventor
张腾翔 (Zhang Tengxiang)
曾馨 (Zeng Xin)
陈益强 (Chen Yiqiang)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Computing Technology of CAS
Original Assignee
Institute of Computing Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Computing Technology of CAS
Priority to CN202211490160.XA
Publication of CN115756170A

Abstract

The system comprises an augmented reality device and a plurality of screen devices. The augmented reality device is used for obtaining the spatial coordinates of the screen devices and modeling them; obtaining the spatial coordinates of resources within the screens of the screen devices based on the relative positions of those resources within the screens and the spatial coordinates of the corresponding screen devices; and, when interacting with the screen devices, rendering virtual models of the screen devices and of the resources within their screens, so that the virtual models of in-screen resources can be manipulated among the screen devices through the augmented reality device to realize interaction among the screen devices.

Description

Multi-device interaction system and method based on augmented reality device
Technical Field
The invention relates to the field of human-computer interaction, and in particular to a multi-device interaction system and method based on an augmented reality device.
Background
In modern life, digital devices operated through a screen, such as computers and mobile phones, are the devices users interact with most. When interacting with such devices, the primary input medium is the interactive screen itself (e.g., gestures such as pointing and swiping directly on the screen) or peripherals attached to it (e.g., mouse, keyboard, stylus, etc.), while output relies mainly on feedback mapped through the screen and the input (e.g., gesture movement or mouse movement matched to the screen's coordinate system). Although such methods are widely used, several problems remain:
(1) For gesture-based screen interaction, the user may be unable to complete an interaction task smoothly because one-handed operation is difficult or both hands are occupied, which reduces interaction efficiency; moreover, touch-based gesture interaction cannot accomplish remote interaction;
(2) For peripheral-based screen interaction, the user must coordinate one or even several peripherals to complete an interaction task and cannot work independently of the fixed external devices, which reduces interaction flexibility;
(3) In an existing multi-device working environment, a device-centered interaction mode cannot switch work among various devices in a timely, flexible and unified manner.
To address these problems, researchers have begun to improve the corresponding interaction modes; for example, some work detects a finger approaching the screen through a self-capacitance touchscreen to enhance gesture-based screen interaction. There are also multi-function external devices on the market, such as peripherals that bind to multiple devices simultaneously via Bluetooth and switch quickly among them with a shortcut key. However, existing screen interaction methods based on touch gestures or external devices still have the following limitations:
(1) The traditional mode that takes a specific device as the interaction center can hardly meet users' needs in a multi-device working environment;
(2) Screen-type devices vary widely in size (e.g., a large-screen display, a roughly 13-inch tablet computer, a roughly 6-inch mobile phone), and the electronic resources within a screen are of many types and relatively small in size (e.g., pictures, icons, text, videos, etc.), so screen-type devices and their internal electronic resources are difficult to acquire accurately;
(3) Existing device interaction modes are relatively limited, which restricts users' interaction possibilities, and introducing a new interaction mode often requires adding relatively complex or numerous sensing devices that cannot be reused directly across multiple devices.
Disclosure of Invention
In view of the above problems in the prior art, the invention provides a multi-device interaction system based on an augmented reality device. The system includes an augmented reality device and a plurality of screen devices. The augmented reality device is configured to obtain the spatial coordinates of the plurality of screen devices and model them; obtain the spatial coordinates of resources within the screens of the screen devices based on the relative positions of those resources within the screens and the spatial coordinates of the corresponding screen devices; and, when interacting with the screen devices, render virtual models of the screen devices and of the resources within their screens, so that the virtual models of in-screen resources can be manipulated among the screen devices through the augmented reality device to realize interaction among the plurality of screen devices.
In one embodiment, the augmented reality device is further configured to: obtain the spatial coordinates of key points of the plurality of screen devices; and model the plurality of screen devices based on the key points.
In one embodiment, the augmented reality device further includes a depth camera and an RGB camera, configured to obtain depth and position information of the key points of the plurality of screen devices so as to calculate the spatial coordinates of the corresponding key points.
In one embodiment, the key points of the plurality of screen-type devices are provided with stickers for transmitting wireless signals, the augmented reality device is provided with an antenna array for detecting the wireless signals, and the augmented reality device detects and calculates the spatial coordinates of the key points of the corresponding screen-type devices through the antenna array.
In one embodiment, the in-screen resources of the plurality of screen devices are divided into one or more minimum resources, the relative positions of the one or more minimum resources within the corresponding screens are obtained, and the type, source address and in-screen relative position of the one or more minimum resources are sent to the augmented reality device.
In one embodiment, when the resources in the screen of a screen device are updated, the relative position of each minimum resource within the screen is recalculated, and the updated type, source address and in-screen relative position of the minimum resources are then sent to the augmented reality device.
In one embodiment, the augmented reality device and the plurality of screen devices are connected through a Socket for resource transmission, the plurality of screen devices are server sides, and the augmented reality device is a client side.
In one embodiment, the augmented reality device controls the plurality of screen-like devices by detecting hand movements and/or eye movements of the user.
In one embodiment, the augmented reality device is a head mounted display.
The invention also provides a multi-device interaction method for the multi-device interaction system based on the augmented reality device, which comprises the following steps:
the augmented reality device obtains the spatial coordinates of a plurality of screen devices in the environment and models the screen devices;
obtaining space coordinates of resources in the screens of the screen devices based on the relative positions of the resources in the screens of the screen devices and the space coordinates of the corresponding screen devices; and
when the augmented reality device interacts with the plurality of screen devices, rendering virtual models of the screen devices and of the resources within their screens, and manipulating the virtual models of in-screen resources among the plurality of screen devices through the augmented reality device to realize interaction among the plurality of screen devices.
According to the multi-device interaction system and method based on the augmented reality device, the environment is modeled with the augmented reality device to obtain a unified spatial coordinate system; the spatial coordinates of the multiple screen devices are calculated in this coordinate system, establishing a unified spatial frame for the devices and facilitating background computation during interaction among them. The method can also acquire resource information of different types and sizes across screen devices, including resource types, data, coordinates and the like, and map it to coordinate points in the unified spatial coordinate system, thereby supporting interaction with fine-grained resources.
Drawings
Fig. 1 shows a schematic diagram of a multi-device interaction system based on an augmented reality device according to an embodiment of the invention.
Fig. 2 shows a flowchart of a multi-device interaction method based on an augmented reality device according to an embodiment of the present invention.
Fig. 3 shows the result of modeling a screen device according to one embodiment of the invention.
Fig. 4 shows a schematic view of grabbing picture 1 from computer A to computer B.
Fig. 5 shows a schematic view of grabbing pictures 1-3 from computer A to computer B.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described in detail by way of specific embodiments with reference to the accompanying drawings. It should be noted that the examples given herein are for illustration only and do not limit the scope of the invention.
In the past, users rarely interacted with several devices at the same time, so each device could work well with its own device-centered interaction system. Today, 3-5 kinds of smart screen devices may coexist in an environment, and users urgently need to switch among multiple devices. However, establishing a unified interaction-enhancement system across multiple devices of different types, different sizes, and even different underlying operating logic is a challenging problem. On this basis, the invention provides a multi-device interaction system and method based on an augmented reality device, i.e., a user-centered multi-device interaction mode that links multiple devices through mixed reality.
First, the terms and concepts used in the present invention are explained. A screen-type device is a digital device that interacts through a screen, for example a personal computer (PC), smart television, tablet computer, or mobile phone. An augmented reality device is a device with augmented reality functionality, for example a head-mounted display (HMD), mobile phone, or smart glasses. The spatial coordinates of a screen device are its coordinates relative to the head-mounted display (i.e., the coordinate origin). An in-screen resource is content displayed on the screen of a screen device; its type may be, for example, text (text), picture (img), or file (file). The relative position of a resource within the screen is its position relative to the screen of the screen device. The spatial coordinates of an in-screen resource are its coordinates relative to the head-mounted display (i.e., the coordinate origin).
FIG. 1 shows a schematic diagram of an augmented reality device-based multi-device interaction system according to one embodiment of the invention. The system in fig. 1 includes an augmented reality device 101 worn by a user and a plurality of screen devices 102-105, namely a personal computer 102, a smart television 103, a tablet computer 104 and a mobile phone 105. The augmented reality device 101 detects and calculates the spatial coordinates of the screen devices in the environment and models them; obtains the spatial coordinates of in-screen resources based on their relative positions within the screens and the spatial coordinates of the corresponding screen devices; and, when interacting with the screen devices, renders virtual models of the screen devices and of the in-screen resources using mixed reality, so that the virtual models of in-screen resources can be manipulated among the screen devices through the augmented reality device to realize interaction among the screen devices.
Fig. 2 shows a flowchart of a multi-device interaction method based on an augmented reality device according to an embodiment of the present invention. The method comprises the following steps:
step S1: and detecting and calculating the space coordinates of the plurality of screen type devices by using the augmented reality device, and modeling the plurality of screen type devices.
According to one embodiment of the invention, step S1 comprises the following sub-steps:
step S11: and detecting a plurality of screen equipment in the environment by using the augmented reality equipment to obtain the space coordinates of key points of the plurality of screen equipment.
The key points of a screen device are points that can locate the position of its screen. For example, for a rectangular screen, the key points may be its four corners or three of its corners. Hereinafter, the four corners of a rectangular screen are taken as the example key points.
Preferably, the initial detection position of the augmented reality device serves as a fixed anchor point (i.e., the coordinate origin), and the spatial coordinates of the screen devices' key points are calculated relative to this fixed anchor point. When the augmented reality device moves with the user, the fixed anchor point and the key points' spatial coordinates are unaffected, so the spatial coordinates of the screen devices in the environment need not be detected repeatedly. When a screen device moves, its spatial coordinates can be re-detected and recalculated. Preferably, the spatial coordinates of the screen devices in the environment are also re-detected and recalculated when the augmented reality device is put on again.
In one embodiment, screen devices in the environment may be detected and identified using visual algorithms. In another embodiment, a two-dimensional code may be pasted on the screen device in advance or displayed at a corner of its screen, and screen devices in the environment are detected and identified by scanning the code with the augmented reality device.
In one embodiment, the augmented reality device further includes a depth camera and an RGB camera configured to obtain depth and position information of the screen devices' key points and calculate their spatial coordinates. For example, with a Microsoft HoloLens 2 headset as the augmented reality device, after the user puts on the headset, its RGB camera and depth camera detect the screen devices in the environment, obtain depth and position information of the key points, and the spatial coordinates of the corresponding key points are calculated.
According to one embodiment of the present invention, locating the spatial coordinates of the key points can be divided into the following three steps:
(1) Acquire RGB photo streams of the screen devices in real time: screen devices in the environment are detected using the RGB and depth cameras mounted on the HoloLens 2 headset, and real-time photo streams of the screen devices are acquired through the PhotoCapture interface of Microsoft's open-source Mixed Reality Toolkit for HoloLens.
(2) Locate the two-dimensional pixel points (X, Y) of the key points (e.g., the four corners of the screen's rectangular frame) in the RGB photo: the RGB photos are processed with the OpenCV algorithm package. Detecting the screen rectangle is mainly an edge-extraction task, and since the display area of a screen device is usually brighter than its surroundings, the picture can be thresholded. The RGB picture is converted to a gray-scale image with cvtColor, median-filtered with medianBlur, converted to a binary picture with threshold, and edge-detected with Canny; the rectangular contours are extracted with findContours, the largest-area contour is selected with contourArea, approximated by a polygon with approxPolyDP, and its convex hull is found with convexHull. After this processing, the two-dimensional pixel coordinates of the screen device's key points in the RGB photo are obtained accurately.
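For illustration, this pipeline can be sketched with standard OpenCV calls as follows; the concrete parameter values (blur kernel size, threshold levels, polygon-approximation tolerance) are illustrative assumptions rather than values prescribed by the invention.

    import cv2

    def locate_screen_corners(rgb_photo):
        """Return the 2D pixel coordinates of a bright screen's outline, or None."""
        gray = cv2.cvtColor(rgb_photo, cv2.COLOR_BGR2GRAY)               # gray-scale image
        blurred = cv2.medianBlur(gray, 5)                                # median filtering
        _, binary = cv2.threshold(blurred, 200, 255, cv2.THRESH_BINARY)  # bright display region
        edges = cv2.Canny(binary, 50, 150)                               # edge detection
        contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        if not contours:
            return None
        largest = max(contours, key=cv2.contourArea)                     # largest-area contour
        poly = cv2.approxPolyDP(largest, 0.02 * cv2.arcLength(largest, True), True)
        hull = cv2.convexHull(poly)                                      # surrounding convex hull
        return hull.reshape(-1, 2) if len(hull) >= 4 else None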
(3) Acquire the three-dimensional spatial coordinates corresponding to the key point's two-dimensional pixel coordinates in the HoloLens 2 headset: the HoloLens 2 packages depth information in the Mixed Reality Toolkit; the X and Y components of the three-dimensional spatial coordinates are obtained by calling ConvertPixelCoordsToScaledCoords of PhotoCaptureFrame, the Z value is obtained from the collision-point information of Spatial Awareness, and the three are combined into the three-dimensional spatial coordinates (X, Y, Z) of the corresponding key point.
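Independently of the MRTK-specific calls, the underlying geometry is a standard pinhole back-projection. The following sketch assumes known camera intrinsics (fx, fy, cx, cy), a depth value for the pixel, and a camera-to-world matrix; it illustrates the computation, not the MRTK API itself.

    import numpy as np

    def pixel_to_camera_point(u, v, depth, fx, fy, cx, cy):
        """Back-project pixel (u, v) with a depth measurement to a 3D point
        in the camera frame, assuming a pinhole camera model."""
        x = (u - cx) * depth / fx
        y = (v - cy) * depth / fy
        return np.array([x, y, depth])

    def camera_to_world(point_cam, cam_to_world):
        """Transform the camera-frame point into the unified world frame
        using a 4x4 camera-to-world matrix."""
        p = np.append(point_cam, 1.0)
        return (cam_to_world @ p)[:3]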
In another embodiment, screen devices in the environment may be detected, and the spatial coordinates of their key points calculated, by the augmented reality device using an antenna array. For example, a sticker attached to a screen device transmits a wireless signal, the augmented reality device receives the signal, and the spatial coordinates of the corresponding key points are calculated. In this embodiment, a sticker is attached at each key point of the screen device and transmits a wireless signal, and the augmented reality device carries an antenna array that detects the wireless signal. Preferably, locating the three-dimensional spatial coordinates of the key points can be divided into the following two steps:
(1) The antenna array on the augmented reality device acquires the direction angle R and distance L of the sticker attached at a key point: the sticker transmits a wireless signal; the antenna array receives and decodes the data packet and reads the IQ (in-phase and quadrature component) data values received by the different antennas; from these values the phase of the wireless signal from transmitter to receiver is calculated, yielding the phase differences between antennas. The phase-difference data is processed and fed into a super-resolution algorithm, for example the multiple signal classification (MUSIC) algorithm; the maximum of the spectrum function is found in the spatial spectrum domain, and the angle corresponding to the spectral peak is the estimate of the sticker's direction angle (azimuth and pitch angle can be estimated simultaneously). The distance to the screen device can be represented by the received signal strength decoded from the data packet, i.e., by RSSI data.
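As a concrete illustration of the MUSIC step, the following is a minimal sketch for a uniform linear array with half-wavelength spacing and a single source; this array geometry and these simplifying assumptions are illustrative and not prescribed by the invention.

    import numpy as np

    def music_direction(iq_samples, n_sources=1, spacing=0.5):
        """Estimate the direction angle of a transmitter with MUSIC.
        iq_samples: complex array (n_antennas, n_snapshots) of received IQ data;
        spacing: antenna spacing in wavelengths."""
        n_ant = iq_samples.shape[0]
        cov = iq_samples @ iq_samples.conj().T / iq_samples.shape[1]  # spatial covariance
        _, eigvecs = np.linalg.eigh(cov)                              # eigenvalues ascend
        noise = eigvecs[:, : n_ant - n_sources]                       # noise subspace
        angles = np.linspace(-90.0, 90.0, 361)
        spectrum = np.empty_like(angles)
        for i, theta in enumerate(angles):
            steer = np.exp(-2j * np.pi * spacing * np.arange(n_ant)
                           * np.sin(np.deg2rad(theta)))               # ULA steering vector
            spectrum[i] = 1.0 / np.linalg.norm(noise.conj().T @ steer) ** 2
        return angles[int(np.argmax(spectrum))]                       # angle at the spectral peak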
(2) Calculate the three-dimensional spatial coordinates of the key point from the direction angle and distance: with a HoloLens headset as the augmented reality device, the current coordinates and rotation angle of the headset are obtained through its main camera. Taking the headset as the circle center, the rotation angle as the angle R calculated in the first step, and the radius as the distance L calculated in the first step, the three-dimensional spatial coordinates of the sticker can be calculated.
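This second step is plain spherical-to-Cartesian geometry. The sketch below assumes the estimated azimuth and pitch are expressed relative to the headset's heading and that the RSSI-derived distance is already in meters; all names are illustrative.

    import numpy as np

    def sticker_world_position(head_pos, head_yaw_deg, azimuth_deg, pitch_deg, distance):
        """Place a detected sticker in the unified world frame: the headset is the
        circle center, the estimated direction angle the rotation, and the
        RSSI-derived distance the radius."""
        az = np.deg2rad(head_yaw_deg + azimuth_deg)
        el = np.deg2rad(pitch_deg)
        offset = distance * np.array([np.cos(el) * np.sin(az),   # x: lateral
                                      np.sin(el),                # y: vertical
                                      np.cos(el) * np.cos(az)])  # z: forward
        return np.asarray(head_pos, dtype=float) + offset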
Step S12: model the screen devices based on the key points, and store the related information of each screen device in the spatial model.
FIG. 3 shows the result of modeling a screen device according to one embodiment of the invention. As shown in fig. 3, the screen 310 of the screen device is drawn as a solid box; the screen is modeled from its key points (i.e., the four corners), and the modeling result (i.e., the virtual model) 320 is drawn as a dashed box. In fig. 3 the dashed box is offset from the screen 310 for clarity, but in actual operation the virtual model 320 preferably coincides with the screen 310.
The screen device's related information includes the positions of its key points, its modeling information, and its identity information (e.g., the device's network communication address), and tells the augmented reality device how to connect to the screen device. For example, the identity information can be read from a two-dimensional code pasted on the device, from a two-dimensional code displayed at a corner of the screen, or from the signal transmitted by the smart sticker.
When the augmented reality device interacts with an already-modeled screen device, a virtual model of that device (e.g., the dashed box in fig. 3) can be rendered using mixed reality. When the screen device is observed through an augmented reality device (such as an HMD), a virtual model coinciding with the screen is visible, providing visual feedback for user interaction and improving interaction accuracy.
Thus, the user wears the augmented reality device (e.g., an HMD), and a unified coordinate system is established in real space based on it, so that each screen device has spatial coordinate information within the augmented reality device. The user does not need to add extra sensing channels to the environment or to specific devices; simply wearing the augmented reality device is enough to detect screen devices of various sizes in different environments and better assist the user in multi-device collaborative work.
Preferably, all screen-like devices are connected in the same local area network.
Step S2: obtain accurate spatial coordinates of the in-screen resources based on their relative positions within the screen and the spatial coordinates of the screen devices.
Obtaining accurate spatial coordinates of in-screen resources helps users complete more diverse interactive tasks. Traditional multi-device interaction systems support only file-level positioning; precisely locating finer-grained resources (e.g., a picture, a passage of text, or a video within a file) is challenging because such resources are numerous, small in area, and tightly spaced.
The invention provides a method for obtaining accurate spatial coordinates of in-screen resources from their relative positions within the screen and the spatial coordinates of the screen devices, realizing fine-grained resource acquisition and management. Specifically, the spatial coordinates of the screen devices are obtained in step S1; each screen device itself calculates the relative position of each resource within its screen, obtains the corresponding resource type, source file address and other information, and sends these to the augmented reality device together with its device-related information, ensuring the accuracy of the acquired relative positions. The augmented reality device combines the screen devices' spatial coordinates with the in-screen relative positions of the resources and, after calculation, obtains the spatial coordinates of the in-screen resources in the unified spatial coordinate system, laying the foundation for subsequent interaction.
According to one embodiment of the invention, the screen device divides the resources within its screen into minimum resources, forms a resource list from the type and source address of each minimum resource, and sends the list and the resources' relative positions within the screen, together with its device-related information, to the augmented reality device. In the present invention, a minimum resource within the screen is the smallest operable content within the screen. Minimum resources vary with the resource browser used; in general, a minimum resource is the smallest resource unit visible to the human eye that the resource browser currently displays. The type of a minimum resource may be a resource type identifier such as text (text), picture (img) or file (file).
For example, on a desktop, the minimum resources are the application and file icons on the desktop. In word-processing software, the minimum resources are each word, each picture, and so on in the current page.
In one embodiment, taking a web browser as the resource browser, the browser may be treated as occupying the full screen, so a resource's position in the browser represents its relative position within the screen. The web page source code is acquired through the get or post methods of the requests library (e.g., the content of the response), decoded with decode, and parsed with Beautiful Soup, XPath or requests-html to obtain the type and source address of each minimum resource in the browser, forming the resource list. The resource list is traversed, and the horizontal and vertical displacement of each minimum resource relative to the screen is obtained from the element's position information.
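As one possible realization of this embodiment (an illustrative assumption: a Selenium-driven browser supplies rendered element positions, since a static parser alone cannot report them), the resource list could be built as follows.

    from selenium import webdriver
    from selenium.webdriver.common.by import By

    def build_resource_list(url):
        """Collect each minimum resource's type, source address and in-screen position."""
        driver = webdriver.Chrome()
        driver.get(url)
        resources = []
        for tag, rtype in (("img", "img"), ("a", "file"), ("p", "text")):
            for el in driver.find_elements(By.TAG_NAME, tag):
                resources.append({
                    "type": rtype,                                    # resource type identifier
                    "src": el.get_attribute("src") or el.get_attribute("href") or "",
                    "rel_pos": (el.location["x"], el.location["y"]),  # displacement w.r.t. the page
                    "size": (el.size["width"], el.size["height"]),
                })
        driver.quit()
        return resources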
In another embodiment, the type and source address of each minimum resource on the screen, and its relative position within the screen, may be obtained from a full-screen screenshot analyzed by a computer vision algorithm.
The screen device sends the resource list, the resources' relative positions within the screen, and its identity information to the augmented reality device. In one embodiment, resource transmission is performed over a Socket connection established between the augmented reality device and the screen device, with the screen device as the server side and the augmented reality device as the client side. One client can actively connect to several servers, so the augmented reality device can communicate with multiple screen devices simultaneously. After the connection is established, the augmented reality device actively sends messages to the screen device to keep the connection alive. Whenever the screen device's resource list is updated, the list and the resources' in-screen relative positions are actively sent to the augmented reality device.
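A minimal sketch of this topology follows, assuming an illustrative JSON-over-TCP message format and port number (neither is specified by the invention); each screen device runs the server role and the augmented reality device connects as the single client.

    import json
    import socket

    PORT = 50007  # illustrative port

    def screen_device_server(resource_list):
        """Runs on a screen device: accepts the client and pushes the resource list."""
        srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        srv.bind(("0.0.0.0", PORT))
        srv.listen()
        conn, _ = srv.accept()
        conn.sendall(json.dumps({"resources": resource_list}).encode() + b"\n")

    def ar_client(server_ips):
        """Runs on the augmented reality device: one client, many servers."""
        conns = []
        for ip in server_ips:
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            s.connect((ip, PORT))
            s.sendall(b'{"type": "keepalive"}\n')  # actively message to maintain the connection
            conns.append(s)
        return conns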
Based on the spatial coordinates of the screen devices obtained in step S1 and the relative position of each minimum resource within the screen obtained in step S2, the accurate spatial coordinates of the in-screen resources are obtained through a simple coordinate calculation.
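The coordinate calculation can be sketched as follows, assuming the screen's world-space key points from step S1 are its top-left, top-right and bottom-left corners and the resource's in-screen position from step S2 is in pixels; the names are illustrative.

    import numpy as np

    def resource_world_coords(corners, rel_pos, screen_px):
        """Combine a screen's world-space key points with a resource's in-screen
        pixel position to obtain the resource's coordinates in the unified frame."""
        tl, tr, bl = (np.asarray(c, dtype=float) for c in corners)
        u = rel_pos[0] / screen_px[0]               # normalized horizontal position
        v = rel_pos[1] / screen_px[1]               # normalized vertical position
        return tl + u * (tr - tl) + v * (bl - tl)   # point on the screen plane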
Step S3: when the augmented reality device interacts with the screen devices, render virtual models of the screen devices and of the resources within their screens using mixed reality, and manipulate the virtual models of in-screen resources among the screen devices through the augmented reality device to realize interaction among the screen devices.
For clarity, a head-mounted display is described below as the example augmented reality device. The head-mounted display is a good sensing platform: many commercial head-mounted displays provide sensing functions such as gesture recognition and eye tracking, and the form factor and wearing position of a head-mounted display give it good expansion potential.
The head-mounted display's existing eye- and hand-tracking capabilities provide the corresponding raw data. In the unified spatial coordinate system, eye movement and hand movement can each be expressed as vectors. Taking eye movement as an example, the eyeball is driven by muscles and has a corresponding range of motion, so its limit values can be obtained. Scaling these limit values proportionally to the size of a screen device matches the eye's range of motion to that device. When the user moves the eyeball, a spatial vector with length and direction is computed from the start point to the end point, and based on this vector the movement of a manipulation pointer on the device (similar to a mouse cursor) can be driven by eye movements.
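The proportional matching described above can be sketched as follows; the eye-rotation limits and screen size are illustrative parameters.

    def eye_to_pointer(gaze_delta, eye_limits, screen_px):
        """Map an eye-movement vector to pointer movement on a given screen.
        gaze_delta: (dx, dy) eye rotation from the start point;
        eye_limits: (max_dx, max_dy) eye-rotation limits; screen_px: (w, h)."""
        scale_x = screen_px[0] / (2.0 * eye_limits[0])  # full eye range spans the width
        scale_y = screen_px[1] / (2.0 * eye_limits[1])  # and the height
        return gaze_delta[0] * scale_x, gaze_delta[1] * scale_y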
Hand motion and eye motion can serve as inputs in another dimension that expand the user's interaction space with screen devices. On the basis of a unified coordinate space centered on the head-mounted display, the interaction between the user and screen-type digital devices such as personal computers (PCs) and mobile phones is extended by combining hand motions (e.g., hovering, grabbing, clicking) and eye motions. The invention computes the user's eye and hand motions in the same coordinate space, maps them to manipulation pointers in the corresponding screen devices, and completes the corresponding interaction tasks with specific eye motions (e.g., a deliberate blink) and hand motions (e.g., grabbing), enabling seamless switching among multiple devices.
According to the invention, based on the unified spatial coordinate system of the head-mounted display, when the user interacts with a screen device the system calculates the size of that device's interaction plane together with the eye and hand movement ranges, and adaptively matches eyeball movement to movement within the device's interaction plane. As a result, a screen device can be operated as soon as the user looks at or points to it, and when the user rotates the eyeball or moves the hand over interaction planes of different sizes, the pointer on the corresponding plane follows the eye or hand movement.
Fig. 4 shows a schematic view of grabbing picture 1 from computer A to computer B. Interaction among multiple screen devices is explained with this example in conjunction with fig. 4.
Computer A has pictures 1-6 on its screen. Through steps S1 and S2, the head-mounted display has obtained the spatial coordinates of pictures 1-6, and from these coordinates it renders a virtual model X that the user can touch and interact with. In fig. 4 the virtual model X is a dashed frame around the picture; in practical applications it may also be a three-dimensional frame enclosing the picture, and users can freely design the virtual model as needed.
The user can grab the virtual model of picture 1 directly by hand, or select it from a distance by eye movement and then grab it by hand. At this point the back end of the head-mounted display records "picture 1 on computer A selected" and transfers picture 1 from computer A to the head-mounted display.
The user then drags picture 1 to a position Y on computer B, or selects a position Y with the eyes and releases the hand. The spatial coordinates of position Y are obtained directly from the head-mounted display's software package (e.g., the hand position or the eye-gaze position). The relative position of Y within computer B's screen is calculated, picture 1 is transmitted from the head-mounted display to computer B, and picture 1 is displayed at position Y. The head-mounted display may then delete its copy of picture 1. In one embodiment, the head-mounted display may also notify computer A to delete picture 1.
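The coordinate step in this flow, mapping the release position Y back to a position within computer B's screen, is the inverse of the earlier screen-plane mapping and can be sketched as follows (same assumed corner convention).

    import numpy as np

    def world_to_screen_rel(point, corners, screen_px):
        """Project a world-space release position onto a screen plane and return
        its pixel position within that screen (the relative position of Y)."""
        tl, tr, bl = (np.asarray(c, dtype=float) for c in corners)
        right, down = tr - tl, bl - tl                 # screen axes in world space
        p = np.asarray(point, dtype=float) - tl
        u = np.dot(p, right) / np.dot(right, right)    # normalized horizontal coordinate
        v = np.dot(p, down) / np.dot(down, down)       # normalized vertical coordinate
        return u * screen_px[0], v * screen_px[1]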
In another embodiment, the head-mounted display may select several minimum resources by eye movement and form a single virtual model. Fig. 5 shows a schematic view of grabbing pictures 1-3 from computer A to computer B. The user selects pictures 1-3 by eye movement, the head-mounted display renders a touchable, interactive virtual model X according to the selection, and pictures 1-3 are transmitted from computer A to computer B; the other steps are the same as in the embodiment of fig. 4 and are not repeated here.
In one embodiment, based on eye movement and gestures and on two types of digital devices (personal computers and mobile phones), the invention provides the following four application scenarios and the corresponding interactive applications to show its application potential. These scenarios are only examples; those skilled in the art can implement any other application scenario as needed.
Scene 1: eye movement and personal computer based application scenarios:
a) Move the pointer with eye movements and confirm a selection with a deliberate blink;
b) The screen locks automatically when the gaze leaves the personal computer and unlocks automatically when the gaze moves back to it;
c) Gaze entering a personal computer automatically pairs it with external devices (e.g., keyboard, mouse, etc.): the external devices assist operation of the personal computer, can connect to several personal computers via wireless or Bluetooth, and provide input to them; when a personal computer is selected by eye movement, it automatically connects to the external devices and receives their input. For example, when the gaze moves from personal computer A to personal computer B, personal computer B automatically connects to the external devices, so one set of external devices serves multiple screen devices;
d) Text editing: in a text-editing task, the gaze controls the text-editing point and page scrolling, so the user need not constantly use the mouse or switch between keyboard and mouse.
Scene 2: eye movement and mobile phone based application scenarios:
a) Move the pointer with eye movements and confirm a selection with a deliberate blink;
b) The phone locks automatically when the gaze leaves it and unlocks automatically when the gaze moves back to it;
c) Assisted one-handed operation: when the user operates the phone with one hand, some areas at the corners of the screen are hard to reach; gaze movement can relocate the corresponding resources, selecting a resource with the gaze and then moving the gaze to another position on the screen.
Scene 3: gesture and personal computer based application scenarios:
a) Move and copy resources among multiple personal computers with a grab-and-release gesture; for example, facing personal computer A, the user grabs a picture from it, then, facing personal computer B, releases the picture into B, copying the resource from A to B. In another embodiment, since the spatial locations of all resources are known, the grab and release operations can also be performed without facing the personal computer;
b) Long pages are scrolled by sliding gestures up and down, and resources beyond the screen edge can be rendered transiently through the mixed reality function.
Scene 4: gesture and mobile phone based application scenarios:
a) Resources are moved and copied among a plurality of mobile phones through a grabbing-releasing gesture.
The examples above, based on the unified spatial coordinate system of the head-mounted display and combining the two input modes of eye movement and gesture, give four concrete interactive application scenarios that can improve the user's interaction efficiency and experience.
In one embodiment, the screen devices are centered on the head-mounted display: a resource is copied from one screen device into the head-mounted display and then from the head-mounted display into another screen device, realizing interaction among the screen devices. In another embodiment, the screen devices and the head-mounted display are connected in the same local area network, so resources can be exchanged directly between screen devices under the control of the head-mounted display.
According to the multi-device interaction system and method based on the augmented reality device, the environment is modeled with the augmented reality device to obtain a unified spatial coordinate system; the spatial coordinates of the multiple screen devices are calculated in this coordinate system, establishing a unified spatial frame for the devices and facilitating background computation during interaction among them. The method can also acquire resource information of different types and sizes across screen devices, including resource types, data, spatial coordinates and the like, and map it to coordinate points in the unified spatial coordinate system, thereby supporting interaction with fine-grained resources.
Although the present invention has been described by way of preferred embodiments, the present invention is not limited to the embodiments described herein, and various changes and modifications may be made without departing from the scope of the present invention.

Claims (10)

1. A multi-device interaction system based on an augmented reality device, comprising the augmented reality device and a plurality of screen devices, wherein the augmented reality device is configured to obtain spatial coordinates of the plurality of screen devices and model the plurality of screen devices; obtain spatial coordinates of resources within the screens of the screen devices based on the relative positions of those resources within the screens and the spatial coordinates of the corresponding screen devices; and, when interacting with the screen devices, render virtual models of the screen devices and of the resources within their screens, the virtual models of in-screen resources being manipulated among the screen devices through the augmented reality device to realize interaction among the screen devices.
2. The augmented reality device-based multi-device interaction system of claim 1, wherein the augmented reality device is further configured to:
obtain spatial coordinates of key points of the plurality of screen devices; and
model the plurality of screen devices based on the key points.
3. The augmented reality device-based multi-device interaction system of claim 2, wherein the augmented reality device further comprises a depth camera and an RGB camera for obtaining depth and position information of the plurality of screen-class device key points to calculate spatial coordinates of corresponding screen-class device key points.
4. The augmented reality device-based multi-device interaction system according to claim 2, wherein the key points of the plurality of screen devices are provided with stickers for transmitting wireless signals, the augmented reality device is provided with an antenna array for detecting the wireless signals, and the augmented reality device detects and calculates spatial coordinates of the key points of the corresponding screen devices through the antenna array.
5. The augmented reality device-based multi-device interaction system of claim 1,
the resources within the screens of the screen devices are divided into one or more minimum resources, the relative positions of the one or more minimum resources within the corresponding screens are obtained, and the type, source address and in-screen relative position of the one or more minimum resources are sent to the augmented reality device.
6. The augmented reality device-based multi-device interaction system of claim 5, wherein, when the resources within the screen of a screen device are updated, the relative position of each minimum resource within the screen is recalculated, and the updated type, source address and in-screen relative position of the minimum resources are then sent to the augmented reality device.
7. The augmented reality device-based multi-device interaction system according to claim 1, wherein resource transmission is performed between the augmented reality device and the plurality of screen devices through Socket connection, the plurality of screen devices are server sides, and the augmented reality device is a client side.
8. The augmented reality device-based multi-device interaction system of claim 1,
the augmented reality device controls the plurality of screen-type devices by detecting hand movements and/or eye movements of a user.
9. An augmented reality device-based multi-device interaction system as claimed in any one of claims 1 to 8, wherein the augmented reality device is a head mounted display.
10. A multi-device interaction method for an augmented reality device-based multi-device interaction system of any one of claims 1-9, the method comprising:
the augmented reality device obtains spatial coordinates of a plurality of screen devices in the environment and models the screen devices;
obtaining space coordinates of resources in the screens of the screen devices based on the relative positions of the resources in the screens of the screen devices and the space coordinates of the corresponding screen devices; and
when the augmented reality device interacts with the screen devices, rendering virtual models of the screen devices and of the resources within their screens, and manipulating the virtual models of in-screen resources among the screen devices through the augmented reality device to realize interaction among the screen devices.
CN202211490160.XA 2022-11-25 2022-11-25 Multi-device interaction system and method based on augmented reality device Pending CN115756170A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211490160.XA CN115756170A (en) 2022-11-25 2022-11-25 Multi-device interaction system and method based on augmented reality device


Publications (1)

Publication Number Publication Date
CN115756170A true CN115756170A (en) 2023-03-07

Family

ID=85337966

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211490160.XA Pending CN115756170A (en) 2022-11-25 2022-11-25 Multi-device interaction system and method based on augmented reality device

Country Status (1)

Country Link
CN (1) CN115756170A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination