WO2019227905A1

WO2019227905A1 - Method and equipment for performing remote assistance on the basis of augmented reality

Info

Publication number: WO2019227905A1
Application number: PCT/CN2018/121729
Authority: WO
Inventors: 廖春元; 唐荣兴
Original assignee: 亮风台（上海）信息科技有限公司
Priority date: 2018-05-29
Filing date: 2018-12-18
Publication date: 2019-12-05
Also published as: CN108769517A; CN108769517B

Abstract

The purpose of the present application is to provide a method for performing remote assistance on the basis of augmented reality, the method specifically comprising: capturing in real time video information related to a target object by means of a camera device in a first user equipment; by means of executing a target tracking operation on the target object in the video information, determining corresponding transfer matrix information of the target object in each video frame of the video information; according to the transfer matrix information, superimposing and displaying corresponding tag information on the target object, wherein the tag information comprises operation instruction information which is of a second user to the target object and which is sent by a corresponding second user equipment. In the present application, on the basis of augmented reality technology, a first user equipment superimposes tag information and the like sent by a second user equipment to display in current video information, thus achieving the remote real-time command of a second user to a first user, which may be used in a wide range of fields such as family supervision and guidance in everyday life, as well as in industry, medical treatment, education and so on.

Description

Method and equipment for remote assistance based on augmented reality

This application claims priority to CN201810533512.2

Technical field

The present application relates to the field of computers, and in particular, to a technology for remote assistance based on augmented reality.

Background

Augmented reality (AR) technology is a new type of human-computer interaction technology. It uses cameras, gyroscopes, acceleration sensors, and similar components to match three-dimensional points in space with two-dimensional points in images in real time, uses the matched point pairs to track and calculate the position and orientation of the camera in space, and then uses this information to superimpose the real environment and virtual objects on the same screen or in the same space in real time, so that the virtual and the real coexist. Through an augmented reality system, users can perceive augmented information that does not exist in the objective physical world, such as virtual navigation arrows and virtual game characters, and can break through constraints of time, space, and other objective conditions. The use of virtual information greatly enhances users' understanding of, and interaction with, the real world.

Summary of the Invention

An object of the present application is to provide a method and device for remote assistance based on augmented reality.

According to an aspect of the present application, a method for remote assistance based on augmented reality on a first user equipment side is provided, where the method includes:

Shooting video information about a target object in real time through a camera device in the first user equipment;

Determining transfer matrix information corresponding to the target object in each video frame of the video information by performing a target tracking operation on the target object in the video information;

Superimposing and displaying corresponding marker information on the target object according to the transfer matrix information, where the marker information includes operation instruction information of a second user on the target object sent by a corresponding second user equipment.
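The overlay step above can be illustrated with a minimal sketch. The source does not state the form of the transfer matrix; the sketch assumes it is a 3x3 planar homography that maps points from the frame in which the marker was drawn into the current frame. The names `project_point`, `place_marker`, `H_FRAME`, and `MARKER` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Matrix3 = Sequence[Sequence[float]]  # ASSUMPTION: transfer matrix is a 3x3 homography


def project_point(h: Matrix3, p: Point) -> Point:
    """Map a point from the annotated reference frame into the current frame."""
    x, y = p
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this matrix")
    return (u / w, v / w)  # homogeneous divide


def place_marker(h: Matrix3, outline: List[Point]) -> List[Point]:
    """Return the marker outline as it should be drawn on the current frame."""
    return [project_point(h, p) for p in outline]


# Hypothetical frame: the target appears twice as large and shifted by (+20, -5) pixels.
H_FRAME = [[2.0, 0.0, 20.0],
           [0.0, 2.0, -5.0],
           [0.0, 0.0, 1.0]]
# Hypothetical rectangular marker drawn by the second user on the reference frame.
MARKER = [(10.0, 10.0), (30.0, 10.0), (30.0, 20.0), (10.0, 20.0)]

print(place_marker(H_FRAME, MARKER))
# [(40.0, 15.0), (80.0, 15.0), (80.0, 35.0), (40.0, 35.0)]
```

Recomputing the marker position from the per-frame matrix, rather than storing fixed screen coordinates, is what would keep the marker attached to the target object as the camera moves.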

According to another aspect of the present application, a method for remote assistance based on augmented reality on a second user equipment side is provided, where the method includes:

Receiving video information about a target object sent by a corresponding first user equipment, the video information being captured in real time through a camera device in the first user equipment;

Presenting the video information, and keeping corresponding tag information superimposed and displayed on the target object in each video frame of the video information, wherein the tag information includes operation instruction information of a second user on the target object through the second user equipment.

According to another aspect of the present application, a method for remote assistance based on augmented reality at a first user equipment side is provided, where the method includes:

Shooting video information about a first target object in real time through a camera device in the first user equipment;

Sending the video information to a corresponding network device;

Receiving first transfer matrix information corresponding to the first target object in each video frame of the video information sent by the network device;

Superimposing and displaying corresponding first marker information on the first target object according to the first transfer matrix information, wherein the first marker information includes operation instruction information of a second user on the first target object sent by a corresponding second user equipment.

According to yet another aspect of the present application, a method for remote assistance based on augmented reality on a network device side is provided, where the method includes:

Receiving video information about a first target object sent by a first user equipment, where the video information is captured in real time by a camera device in the first user equipment;

Determining first transfer matrix information corresponding to the first target object in each video frame of the video information by performing a target tracking operation on the first target object in the video information;

Sending the first transfer matrix information to the first user equipment;

Sending the video information and the first transfer matrix information to a second user equipment that belongs to the same remote assistance task as the first user equipment.
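The network-device steps above can be sketched as a small dispatch routine. This is only an illustration under stated assumptions: the tracker is passed in as a callable because the source does not specify a tracking algorithm, and `Outbox`, `handle_frame`, and the device identifiers are hypothetical stand-ins for a real transport layer.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Matrix = List[List[float]]
Tracker = Callable[[bytes], Matrix]  # hypothetical: frame bytes -> transfer matrix


@dataclass
class Outbox:
    """Records what would be sent to each device; stands in for a real transport."""
    sent: Dict[str, List[Tuple[str, object]]] = field(default_factory=dict)

    def send(self, device_id: str, kind: str, payload: object) -> None:
        self.sent.setdefault(device_id, []).append((kind, payload))


def handle_frame(frame: bytes, track: Tracker, first_ue: str,
                 second_ue: str, outbox: Outbox) -> Matrix:
    """Process one received video frame on the network device side."""
    matrix = track(frame)                     # target tracking on the frame
    outbox.send(first_ue, "matrix", matrix)   # first UE already holds the video
    outbox.send(second_ue, "video", frame)    # second UE needs video and matrix
    outbox.send(second_ue, "matrix", matrix)
    return matrix
```

A usage example with an identity tracker: `handle_frame(b"frame-0", lambda f: [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]], "ue1", "ue2", Outbox())`. In this arrangement the first user equipment receives only the matrix, since it captured the video itself.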

According to yet another aspect of the present application, a method for remote assistance based on augmented reality on a third user equipment side is provided, where the method includes:

Receiving video information about a third target object and third transfer matrix information corresponding to the third target object in each video frame of the video information sent by a corresponding network device;

Presenting the video information, and superimposing and displaying corresponding third marker information on the third target object in each video frame of the video information according to the third transfer matrix information, wherein the third marker information includes operation instruction information of the second user on the third target object through the second user equipment;

The video information is captured in real time by a camera device in a first user equipment, and the first user equipment, the third user equipment, and the second user equipment belong to the same remote assistance task, in which the first user equipment and the third user equipment receive the remote assistance of the second user equipment.

According to another aspect of the present application, a method for performing remote assistance based on augmented reality on a second user equipment side is provided, where the method includes:

Receiving video information about a first target object and first transfer matrix information corresponding to the first target object in each video frame of the video information sent by a corresponding network device;

Presenting the video information, and superimposing and displaying corresponding first marker information on the first target object in each video frame of the video information according to the first transfer matrix information, wherein the first marker information includes operation instruction information of the second user on the first target object through the second user equipment;

The video information is captured in real time by a camera device in the first user equipment that belongs to the same remote assistance task as the second user equipment, or is reconstructed based on real-time video information about the first target object captured by the camera device and other video information of the first target object.

Receiving video information about a target object sent by a first user equipment, where the video information includes a picture taken by an imaging device in the first user equipment;

Adding corresponding tag information to each video frame in the video information according to the transfer matrix information, wherein the tag information remains superimposed on the target object in each video frame of the video information, and the tag information includes operation instruction information of a second user on the target object sent by a corresponding second user equipment;

Sending the edited video information to a first user equipment and a second user equipment that belongs to the same remote assistance task as the first user equipment.
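In this variant the tag is written into the video frames before they are sent out, so receiving devices only need to play the edited video. A minimal sketch, assuming a frame is a grid of grayscale pixel values and that the tag has already been mapped into frame coordinates via the transfer matrix; `burn_in_marker` and the pixel representation are hypothetical.

```python
from typing import List, Tuple

Frame = List[List[int]]  # ASSUMPTION: grayscale frame as rows of pixel values


def burn_in_marker(frame: Frame, points: List[Tuple[float, float]],
                   value: int = 255) -> Frame:
    """Return a copy of the frame with the marker points drawn into it."""
    edited = [row[:] for row in frame]  # leave the received frame untouched
    height = len(edited)
    width = len(edited[0]) if edited else 0
    for x, y in points:
        col, row = round(x), round(y)
        # Points that fall outside the picture are skipped.
        if 0 <= row < height and 0 <= col < width:
            edited[row][col] = value
    return edited


blank = [[0] * 4 for _ in range(3)]
edited = burn_in_marker(blank, [(1.2, 0.9), (10.0, 10.0)])
print(edited[1])  # [0, 255, 0, 0]
```

Editing on the network device trades extra server-side work for simpler clients, since neither user equipment has to apply the transfer matrix itself.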

According to an aspect of the present application, a method for remote assistance based on augmented reality is provided, wherein the method includes:

The first user equipment captures video information about a target object in real time through a camera device in the first user equipment, determines transfer matrix information corresponding to the target object in each video frame of the video information by performing a target tracking operation on the target object in the video information, and superimposes and displays corresponding marker information on the target object according to the transfer matrix information, wherein the marker information includes operation instruction information of a second user on the target object;

Sending, by the first user equipment, the video information to the second user equipment;

The second user equipment receives and presents the video information, and keeps corresponding tag information superimposed and displayed on the target object in each video frame of the video information, wherein the tag information includes operation instruction information of the second user on the target object through the second user equipment.

According to another aspect of the present application, a method for remote assistance based on augmented reality is provided, wherein the method includes:

The first user equipment captures video information about the first target object in real time through a camera device in the first user equipment, and sends the video information to a corresponding network device;

The network device receives the video information, determines first transfer matrix information corresponding to the first target object in each video frame of the video information by performing a target tracking operation on the first target object in the video information, sends the first transfer matrix information to the first user equipment, and sends the video information and the first transfer matrix information to a second user equipment that belongs to the same remote assistance task as the first user equipment;

The first user equipment receives the first transfer matrix information, and superimposes and displays corresponding first marker information on the first target object according to the first transfer matrix information, where the first marker information includes operation instruction information of the second user on the first target object sent by the corresponding second user equipment;

The second user equipment receives the video information and the first transfer matrix information, presents the video information, and superimposes and displays the corresponding first marker information on the first target object in each video frame of the video information according to the first transfer matrix information, wherein the video information is captured in real time by the camera device in the first user equipment that belongs to the same remote assistance task as the second user equipment, or is reconstructed based on real-time video information about the first target object captured by the camera device and other video information of the first target object.

The network device receives the video information, determines first transfer matrix information corresponding to the first target object in each video frame of the video information by performing a target tracking operation on the first target object in the video information, and sends the first transfer matrix information to the first user equipment;

The network device determines third transfer matrix information corresponding to a third target object in each video frame of the video information by performing a target tracking operation on the third target object in the video information, where the third target object belongs to the same remote assistance task as the first target object;

The network device sends the video information and the third transfer matrix information to a third user equipment corresponding to the third target object in the remote assistance task, and sends the video information, the first transfer matrix information, and the third transfer matrix information to a second user equipment that belongs to the same remote assistance task as the first user equipment;

Receiving, by the third user equipment, the video information and the third transfer matrix information;

The third user equipment presents the video information, and superimposes and displays corresponding third marker information on the third target object in each video frame of the video information according to the third transfer matrix information;

The second user equipment receives the video information, the first transfer matrix information, and the third transfer matrix information, and, while presenting the video information, superimposes and displays corresponding first marker information on the first target object in each video frame of the video information according to the first transfer matrix information, and superimposes and displays the corresponding third marker information on the third target object in each video frame of the video information according to the third transfer matrix information.
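The multi-target distribution described above (each assisted device receives the matrix for its own target object, while the second user equipment receives all of them) can be sketched as a delivery plan. The function name, the target labels, and the device identifiers are hypothetical; the matrices are placeholder values.

```python
from typing import Dict, List

Matrix = List[List[float]]


def plan_matrix_delivery(matrices: Dict[str, Matrix],
                         owners: Dict[str, str],
                         assisting_ue: str) -> Dict[str, Dict[str, Matrix]]:
    """Decide which transfer matrices each device in the task should receive.

    matrices:     target label -> transfer matrix for the current frame
    owners:       target label -> user equipment being assisted on that target
    assisting_ue: the second user equipment, which annotates every target
    """
    plan: Dict[str, Dict[str, Matrix]] = {assisting_ue: dict(matrices)}
    for target, ue in owners.items():
        plan.setdefault(ue, {})[target] = matrices[target]
    return plan


# Placeholder matrices for one frame containing two tracked targets.
M_FIRST = [[1.0, 0.0, 5.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
M_THIRD = [[1.0, 0.0, 0.0], [0.0, 1.0, 7.0], [0.0, 0.0, 1.0]]

plan = plan_matrix_delivery(
    {"first": M_FIRST, "third": M_THIRD},
    {"first": "ue1", "third": "ue3"},
    assisting_ue="ue2",
)
print(sorted(plan["ue2"]))  # ['first', 'third']
```

Video delivery is omitted here; per the steps above, the video would accompany the matrices sent to the third and second user equipment.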

According to an aspect of the present application, a first user equipment for remote assistance based on augmented reality is provided, wherein the device includes:

A real-time shooting module, configured to shoot video information about a target object in real time through a camera device in the first user equipment;

A target tracking module, configured to determine transfer matrix information corresponding to the target object in each video frame of the video information by performing a target tracking operation on the target object in the video information;

An overlay display module, configured to superimpose and display corresponding marker information on the target object according to the transfer matrix information, where the marker information includes operation instruction information of a second user on the target object sent by a corresponding second user equipment.

According to another aspect of the present application, a second user equipment for remote assistance based on augmented reality is provided, wherein the device includes:

A video receiving module, configured to receive video information about a target object sent by the first user equipment, the video information being captured in real time through a camera device in the first user equipment;

A video presentation module, configured to present the video information and keep corresponding tag information superimposed and displayed on the target object in each video frame of the video information, wherein the tag information includes operation instruction information of a second user on the target object through the second user equipment.

According to another aspect of the present application, a first user equipment for remote assistance based on augmented reality is provided, where the device includes:

A real-time shooting module, configured to shoot video information about a first target object in real time through a camera device in the first user equipment;

A video sending module, configured to send the video information to a corresponding network device;

A transfer matrix receiving module, configured to receive first transfer matrix information sent by the network device and corresponding to the first target object in each video frame of the video information;

An overlay display module, configured to superimpose and display corresponding first marker information on the first target object according to the first transfer matrix information, where the first marker information includes operation instruction information of a second user on the first target object.

According to yet another aspect of the present application, a network device for remote assistance based on augmented reality is provided, where the device includes:

A video receiving module, configured to receive video information about a first target object sent by a first user equipment, where the video information is captured in real time by a camera device in the first user equipment;

A target tracking module, configured to determine first transfer matrix information corresponding to the first target object in each video frame of the video information by performing a target tracking operation on the first target object in the video information;

A first sending module, configured to send the first transfer matrix information to the first user equipment;

A second sending module is configured to send the video information and the first transfer matrix information to a second user equipment that belongs to the same remote assistance task as the first user equipment.

According to another aspect of the present application, a third user equipment for remote assistance based on augmented reality is provided, where the equipment includes:

A receiving module, configured to receive video information about a third target object sent by a corresponding network device and third transfer matrix information corresponding to the third target object in each video frame of the video information;

A presentation module, configured to present the video information and superimpose and display corresponding third marker information on the third target object in each video frame of the video information according to the third transfer matrix information, wherein the third marker information includes operation instruction information of the second user on the third target object through the second user equipment;

According to another aspect of the present application, a second user equipment for remote assistance based on augmented reality is provided, where the device includes:

A receiving module, configured to receive video information about a first target object and first transfer matrix information corresponding to the first target object in each video frame of the video information sent by a corresponding network device;

A presentation module, configured to present the video information and superimpose and display corresponding first marker information on the first target object in each video frame of the video information according to the first transfer matrix information, wherein the first marker information includes operation instruction information of a second user on the first target object through the second user equipment;

A video receiving module, configured to receive video information about a target object sent by a first user equipment, where the video information includes a picture taken by a camera device in the first user equipment;

A tag adding module, configured to add corresponding tag information to each video frame in the video information according to the transfer matrix information, wherein the tag information remains superimposed on the target object in each video frame of the video information, and the tag information includes operation instruction information of a second user on the target object sent by a corresponding second user equipment;

A video sending module is configured to send the edited video information to a first user equipment and a second user equipment that belongs to the same remote assistance task as the first user equipment.

According to an aspect of the present application, there is provided a system for remote assistance based on augmented reality, wherein the system includes the first user equipment described above, which includes the real-time shooting module, the target tracking module, and the overlay display module, and the second user equipment described above, which includes the video receiving module and the video presentation module.

According to an aspect of the present application, there is provided a system for remote assistance based on augmented reality, wherein the system includes the first user equipment described above, which includes the real-time shooting module, the video sending module, the transfer matrix receiving module, and the overlay display module; the second user equipment described above, which includes the receiving module and the presentation module; and the network device described above, which includes the video receiving module, the target tracking module, the first sending module, and the second sending module.

According to an aspect of the present application, there is provided a system for remote assistance based on augmented reality, wherein the system includes the first user equipment described above, which includes the real-time shooting module, the video sending module, the transfer matrix receiving module, and the overlay display module; the second user equipment described above, which includes the receiving module and the presentation module; the third user equipment described above, which includes the receiving module and the presentation module; and the network device described above, which includes the video receiving module, the target tracking module, the first sending module, and the second sending module.

Processor; and

A memory arranged to store computer-executable instructions that, when executed, cause the processor to execute:

According to the transfer matrix information, corresponding mark information is superimposed and displayed on the target object, wherein the mark information includes corresponding instruction information of the second user on the target object sent by the second user equipment.

According to another aspect of the present application, a second user equipment for remote assistance based on augmented reality is provided, where the equipment includes:

Processor; and

Sending the video information to a corresponding network device;

Superimposing and displaying corresponding first marker information on the first target object according to the first transfer matrix information, wherein the first marker information includes operation instruction information of a second user on the first target object sent by a corresponding second user equipment.

Processor; and

Sending the first transfer matrix information to the first user equipment;

Processor; and

Presenting the video information, and superimposing and displaying corresponding first marker information on the first target object in each video frame of the video information according to the first transfer matrix information, wherein the first marker information includes operation instruction information of the second user on the first target object through the second user equipment;

Processor; and

According to one aspect of the present application, a computer-readable medium is provided that includes instructions that, when executed, cause a system to:

And superimposing and displaying corresponding marker information on the target object according to the transfer matrix information, wherein the marker information includes corresponding instruction information of the second user on the target object sent by the second user equipment.

According to another aspect of the present application, a computer-readable medium is provided that includes instructions that, when executed, cause a system to:

According to yet another aspect of the present application, a computer-readable medium is provided that includes instructions that, when executed, cause a system to:

Sending the video information to a corresponding network device;

Sending the first transfer matrix information to the first user equipment;

Compared with the prior art, the present application is based on augmented reality technology: on the basis of a communication connection established between the first user equipment and the second user equipment, the first user equipment superimposes and displays marker information and the like sent by the second user equipment in the current video information, thereby enabling remote real-time guidance of the first user by the second user. This can be widely used in everyday supervision and guidance as well as in industrial, medical, educational, and other fields; it improves the efficiency of communication between people and greatly improves the user experience.

BRIEF DESCRIPTION OF THE DRAWINGS

Other features, objects, and advantages of the present application will become more apparent by reading the detailed description of the non-limiting embodiments with reference to the following drawings:

FIG. 1 illustrates a system topology diagram of remote assistance based on augmented reality according to one aspect of the present application;

FIG. 2 shows a flowchart of a method for remote assistance based on augmented reality on a first user equipment side according to an embodiment of the present application;

FIG. 3 shows an example diagram of camera control when performing remote assistance based on augmented reality according to an embodiment of the present application;

FIG. 4 shows a flowchart of a method for performing remote assistance based on augmented reality on a second user equipment side according to another embodiment of the present application;

FIG. 5 shows a flowchart of a method for remote assistance based on augmented reality on a first user equipment side according to yet another embodiment of the present application;

FIG. 6 shows a flowchart of a method for remote assistance based on augmented reality on a network device side according to yet another embodiment of the present application;

FIG. 7 shows a flowchart of a method for remote assistance based on augmented reality on a third user equipment side according to yet another embodiment of the present application;

FIG. 8 shows a flowchart of a method for performing remote assistance based on augmented reality on a second user equipment side according to another embodiment of the present application;

FIG. 9 shows a flowchart of a method for remote assistance based on augmented reality on a network device side according to yet another embodiment of the present application;

FIG. 10 illustrates a system method diagram of remote assistance based on augmented reality according to one aspect of the present application;

FIG. 11 illustrates a system method diagram of remote assistance based on augmented reality according to another aspect of the present application;

FIG. 12 illustrates a system method diagram of remote assistance based on augmented reality according to yet another aspect of the present application;

FIG. 13 illustrates a first user equipment for remote assistance based on augmented reality according to an embodiment of the present application;

FIG. 14 shows a second user equipment for remote assistance based on augmented reality according to another embodiment of the present application;

FIG. 15 illustrates a first user equipment for remote assistance based on augmented reality according to yet another embodiment of the present application;

FIG. 16 illustrates a network device for remote assistance based on augmented reality according to yet another embodiment of the present application;

FIG. 17 illustrates a third user equipment for remote assistance based on augmented reality according to yet another embodiment of the present application;

FIG. 18 illustrates a second user equipment for remote assistance based on augmented reality according to another embodiment of the present application;

FIG. 19 illustrates a network device for remote assistance based on augmented reality according to another embodiment of the present application;

FIG. 20 shows a schematic diagram of a system for remote assistance based on augmented reality according to an aspect of the present application;

FIG. 21 illustrates a schematic diagram of a system for remote assistance based on augmented reality according to another aspect of the present application;

FIG. 22 illustrates a schematic diagram of a system for remote assistance based on augmented reality according to another aspect of the present application;

FIG. 23 illustrates an exemplary system that can be used to implement various embodiments described in this application.

The same or similar reference numerals in the drawings represent the same or similar components.

Detailed description

The present application is described in further detail below with reference to the drawings.

In a typical configuration of this application, the terminal, the device serving the network, and the trusted party each include one or more processors (CPUs), input / output interfaces, network interfaces, and memory.

Memory may include non-persistent memory, random access memory (RAM), and / or non-volatile memory in computer-readable media, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.

Computer-readable media includes permanent and non-persistent, removable and non-removable media. Information storage can be accomplished by any method or technology. Information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), and read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, read-only disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, Magnetic tape cartridges, tape disk storage or other magnetic storage devices or any other non-transmitting medium can be used to store information that can be accessed by computing devices.

The equipment referred to in this application includes, but is not limited to, user equipment, network equipment, or equipment formed by integrating user equipment and network equipment through a network. The user equipment includes, but is not limited to, any mobile electronic product that can interact with the user, such as a smart phone or a tablet computer. The mobile electronic product can use any operating system, such as the android operating system and the iOS operating system. , Windows operating system, etc. Wherein, the network device includes an electronic device capable of automatically performing numerical calculation and information processing according to an instruction set or stored in advance. The hardware includes, but is not limited to, a microprocessor, an application specific integrated circuit (ASIC), and programmable logic. Devices (PLDs), field programmable gate arrays (FPGAs), digital signal processors (DSPs), embedded devices, and more. The network device includes, but is not limited to, a cloud composed of a computer, a network host, a single network server, multiple network server sets, or multiple servers; here, the cloud is composed of a large number of computers or network servers based on Cloud Computing, Among them, cloud computing is a type of distributed computing, a virtual supercomputer composed of a group of loosely coupled computer sets. The network includes, but is not limited to, the Internet, a wide area network, a metropolitan area network, a local area network, a VPN network, a wireless ad hoc network (Ad hoc network), and the like. Preferably, the device may also be a program running on the user device, the network device, or a device formed by integrating the user device and the network device, the network device, the touch terminal, or the network device and the touch terminal through a network.

Of course, those skilled in the art should understand that the above equipment is only an example. Other existing or future equipment, if applicable to this application, should also fall within the protection scope of this application and is hereby incorporated by reference.

FIG. 1 shows a typical scenario of the present application. A first user (such as a worker) holds a first user equipment, and a second user (such as an expert) holds a second user equipment. The first user equipment and the second user equipment have established a communication connection. The first user equipment receives the tag information sent by the second user equipment and superimposes the tag information on the video information captured in real time, to help the first user complete the task more accurately and quickly. The tag information may be position mark information such as a circle, or operation guidance information that is obtained through gesture recognition and matches preset operation information. The first user equipment and the second user equipment may interact one-to-one directly, one-to-one through the cloud, or many-to-many through the cloud.

The first user equipment includes, but is not limited to, augmented reality glasses, a tablet computer, a mobile terminal, a PC terminal, and the like. The following embodiments are described by taking augmented reality glasses as an example. Of course, those skilled in the art should understand that these embodiments are equally applicable to other first user equipment such as tablet computers, mobile terminals, and PC terminals. The second user equipment includes, but is not limited to, augmented reality glasses, a tablet computer, a mobile terminal, a PC terminal, and the like. The following embodiments are described by taking a tablet computer as an example. Of course, those skilled in the art should understand that these embodiments are equally applicable to other second user equipment such as augmented reality glasses, mobile terminals, and PC terminals.

FIG. 2 illustrates a method for performing remote assistance based on augmented reality at a first user equipment according to an aspect of the present application, where the method includes steps S11, S12, and S13. In step S11, the first user equipment captures video information about a target object in real time through a camera device in the first user equipment. In step S12, the first user equipment performs a target tracking operation on the target object in the video information to determine the transfer matrix information corresponding to the target object in each video frame of the video information. In step S13, the first user equipment superimposes and displays corresponding tag information on the target object according to the transfer matrix information, wherein the tag information includes operation instruction information of the second user on the target object sent by the corresponding second user equipment.

Specifically, in step S11, the first user equipment captures video information about the target object in real time through the camera device in the first user equipment. For example, the target object may be an object corresponding to image information marked by the first user in a video frame, an object corresponding to image information marked by the second user in a video frame and received by the first user equipment, or an object determined by the first user equipment according to image information input by the first user. The first user equipment includes a camera device, through which it captures video information about the target object in real time.

In step S12, the first user equipment determines the transfer matrix information corresponding to the target object in each video frame of the video information by performing a target tracking operation on the target object in the video information. The transfer matrix information includes the correspondence, obtained by the first user equipment according to a target tracking algorithm, between the target object in the current video frame and the target object in the previous video frame. The target tracking algorithm includes, but is not limited to, the kernelized correlation filter (KCF) tracking algorithm, dense optical flow tracking algorithms, sparse optical flow tracking algorithms, Kalman filtering tracking algorithms, multiple instance learning tracking algorithms, and the like. Here, the kernelized correlation filter (KCF) tracking algorithm is taken as an example. The KCF algorithm solves the tracking problem by learning a kernelized regularized least squares (KRLS) linear classifier. The movement of the target in the scene can be regarded as the vector sum of its movements in the horizontal and vertical directions. The KCF algorithm introduces the concept of dense sampling and regards all samples as cyclic shifts of a reference sample. In this case, the Gaussian kernel function is highly structured, that is, the kernel matrix is a circulant matrix. According to the principle of circular convolution, every product with the circulant matrix can be converted into a circular convolution (or correlation) with the first row vector of the matrix. Using the discrete Fourier transform (DFT), this spatial-domain convolution can then be computed as an element-wise product in the frequency domain, which achieves fast calculation.
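The circulant-matrix property that the KCF description relies on can be checked numerically. The following is a minimal sketch, assuming NumPy and a one-dimensional real-valued signal; the function names are hypothetical, and it illustrates only the DFT identity, not a complete KCF tracker. With rows built as successive cyclic shifts of the first row, the matrix-vector product equals a circular correlation, which the DFT turns into an element-wise product.

```python
import numpy as np

def circulant_from_first_row(x):
    """Build the circulant matrix whose i-th row is x cyclically shifted by i."""
    x = np.asarray(x, dtype=float)
    return np.array([np.roll(x, i) for i in range(len(x))])

def circulant_matvec_fft(x, y):
    """Compute C(x) @ y without forming C(x), using the DFT.

    For real x, C(x) @ y is the circular correlation of x with y, whose DFT
    is conj(DFT(x)) * DFT(y) taken element by element.
    """
    spectrum = np.conj(np.fft.fft(x)) * np.fft.fft(y)
    return np.real(np.fft.ifft(spectrum))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal(8)
    y = rng.standard_normal(8)
    direct = circulant_from_first_row(x) @ y   # O(n^2) dense product
    fast = circulant_matvec_fft(x, y)          # O(n log n) via the FFT
    print(np.allclose(direct, fast))
```

The dense product costs on the order of n^2 operations, while the FFT route costs on the order of n log n, which is the source of the fast calculation mentioned above.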

Of course, those skilled in the art should understand that the above tracking algorithms are only examples. Other existing or future tracking algorithms, if applicable to this application, should also fall within the protection scope of this application and are hereby incorporated by reference.

In step S13, the first user equipment superimposes and displays the corresponding tag information on the target object according to the transfer matrix information, where the tag information includes operation instruction information of the second user on the target object sent by the corresponding second user equipment. The tag information includes operation instruction information about the target object, such as virtual operation information on the target object, that is sent by the second user equipment and received by the first user equipment. For example, when the first user equipment receives the operation instruction information about the target object sent by the second user equipment, the first user equipment performs target tracking according to the transfer matrix information and superimposes and displays the tag information at the position corresponding to the target object. For augmented reality glasses, the tag information is superimposed and displayed at the corresponding position on the lens of the glasses, where the position information is calculated by the augmented reality glasses or the network device according to the target tracking algorithm; for a PC terminal, tablet computer, or mobile terminal, the tag information is superimposed and displayed at the position corresponding to the target object in the current video frame. The first user equipment and the second user equipment may establish a communication connection directly or through a network device. The following embodiments are described by taking a direct communication connection between the first user equipment and the second user equipment as an example; those skilled in the art should understand that these embodiments are also applicable to other connection modes, such as establishing the communication connection through a network device.

For example, a first user holds augmented reality glasses, a second user holds a tablet computer, and the augmented reality glasses have established a communication connection with the tablet computer. The augmented reality glasses and the tablet computer have already exchanged a video stream or image of the target object, and the augmented reality glasses have received the tablet computer's operation instruction information about the target object in a previous video frame. The target object is, for instance, a part on a console. The target object may be determined by the first user equipment based on a selection operation of the first user (such as drawing a circle), may be determined by the second user equipment based on a selection operation of the second user and then received by the first user equipment, or may be determined by the first user equipment by recognizing initial image information of the target object. The corresponding operation instruction information includes, for example, virtual operation information obtained by the second user equipment by recognizing the second user's gestures for operating the part. The augmented reality glasses capture the current video information about the target object in real time through the camera, and then use the target tracking algorithm to calculate the transfer matrix information of the target object in the current video frame relative to the target object in the previous video frame. Subsequently, the augmented reality glasses determine the position information of the target object in the current video frame according to the transfer matrix information and superimpose and display the corresponding tag information at that position; for example, the operation instruction information corresponding to the second user's gesture is superimposed and displayed at the position of the part on the console in the current video frame.
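The per-frame position update in this example can be sketched as follows. This is a minimal sketch under a stated assumption: the transfer matrix information is represented as a 3x3 homogeneous matrix that maps pixel coordinates in the previous frame to the current frame. The application does not fix this representation, and the function names are hypothetical.

```python
import numpy as np

def translation(dx, dy):
    """3x3 homogeneous matrix for a pure image-plane shift (illustrative only)."""
    return np.array([[1.0, 0.0, dx],
                     [0.0, 1.0, dy],
                     [0.0, 0.0, 1.0]])

def apply_transfer(matrix, points):
    """Map N x 2 pixel coordinates through a 3x3 homogeneous transfer matrix."""
    pts = np.asarray(points, dtype=float)
    homogeneous = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homogeneous @ np.asarray(matrix, dtype=float).T
    return mapped[:, :2] / mapped[:, 2:3]

def track_tag(initial_points, frame_to_frame):
    """Yield the tag position in each new frame by chaining per-frame matrices."""
    accumulated = np.eye(3)
    for step in frame_to_frame:
        # Each step maps the previous frame to the current one, so the
        # product maps the frame in which the tag was drawn to the current frame.
        accumulated = np.asarray(step, dtype=float) @ accumulated
        yield apply_transfer(accumulated, initial_points)

if __name__ == "__main__":
    tag = [[100.0, 50.0]]                      # where the second user drew the mark
    steps = [translation(2, 0), translation(3, -1)]
    for position in track_tag(tag, steps):
        print(position)
```

Because each matrix is relative to the previous frame, the sketch accumulates the product so that the tag stays attached to the target object even after many frames.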

Of course, those skilled in the art should understand that the above tag information and/or operation instruction information is only an example. Other existing or future tag information and/or operation instruction information, if applicable to this application, should also fall within the protection scope of this application and is hereby incorporated by reference.

In some embodiments, the method further includes step S14 (not shown). In step S14, the first user equipment sends the video information to the second user equipment. For example, the first user equipment captures the current video information about the target object in real time and sends the video information to the second user equipment directly, or sends it to the second user equipment through the network device. The video information includes image information collected by the first user equipment through the camera device and may also include audio information collected by the first user equipment through a microphone device; the audio and image information are encoded by a compression algorithm and multiplexed into a video/audio stream. The first user equipment transmits the compressed video/audio stream to the second user equipment through a network transmission protocol such as the User Datagram Protocol (UDP), the Transmission Control Protocol (TCP), or the Real-time Transport Protocol (RTP).

For example, the augmented reality glasses capture video information about the current target object in real time and send the video information directly to the tablet computer, or send it to the cloud, which forwards it to the tablet computer. The tablet computer receives and presents the video information, which helps the second user continue to instruct the first user in operations such as processing the part on the console.

In some embodiments, in step S14, the first user equipment sends the video information and the transfer matrix information to the second user equipment. For example, when the first user equipment sends the video information to the second user equipment, it also sends the transfer matrix information obtained from the target tracking operation, so that the second user equipment can track the target object while presenting the video information.

For example, the augmented reality glasses capture video information about the current target object in real time and, in combination with the previous video frame, perform a target tracking operation on the target object in the video information to determine the transfer matrix information of the target object in each video frame relative to the previous video frame. Subsequently, the augmented reality glasses send the video information and the transfer matrix information corresponding to each video frame to the tablet computer, either directly or through the cloud.
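One way to keep each video frame paired with its transfer matrix information during transmission is to place both in a single message. The following is a minimal sketch, not the format used by the application: the header layout (frame index, nine matrix entries, payload length), the 3x3 matrix shape, and the function names are all assumptions made for illustration.

```python
import struct

# Hypothetical header: frame index (uint32), 9 matrix entries (float64),
# payload length (uint32), all in network byte order without padding.
HEADER = struct.Struct("!I9dI")

def pack_frame(frame_index, transfer_matrix, encoded_frame):
    """Serialize one encoded frame together with its 3x3 transfer matrix."""
    flat = [float(value) for row in transfer_matrix for value in row]
    if len(flat) != 9:
        raise ValueError("transfer_matrix must be 3x3")
    return HEADER.pack(frame_index, *flat, len(encoded_frame)) + encoded_frame

def unpack_frame(message):
    """Recover the frame index, the 3x3 matrix, and the encoded frame bytes."""
    fields = HEADER.unpack_from(message)
    frame_index, payload_length = fields[0], fields[-1]
    flat = fields[1:10]
    matrix = [list(flat[i:i + 3]) for i in range(0, 9, 3)]
    payload = message[HEADER.size:HEADER.size + payload_length]
    return frame_index, matrix, payload

if __name__ == "__main__":
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    message = pack_frame(7, identity, b"compressed-frame-bytes")
    print(unpack_frame(message))
```

The resulting bytes could be carried over any of the transport options the application lists; over a stream transport such as TCP, the payload length field lets the receiver find message boundaries.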

In some embodiments, the method further includes step S15 (not shown). In step S15, the first user equipment receives continuing operation instruction information of the second user on the target object, which is sent by the second user equipment based on the video information. For example, the second user equipment generates the corresponding continuing operation instruction information according to the second user's continuing operations on the target object (such as drawing a line segment or a circle), or by recognizing the second user's gesture operations through gesture recognition. Subsequently, the second user equipment sends the continuing operation instruction information to the first user equipment to assist the first user in continuing to operate on the target object.

For example, the augmented reality glasses send real-time video information about the target object to the tablet computer, and the tablet computer receives and presents the video information. Subsequently, the tablet computer performs target tracking in each video frame of the received video stream to obtain the position of the target object in the video frame. In some embodiments, the tablet computer highlights the target object in the video frame by means of line segments, circles, or locally increased brightness. The second user instructs the first user to process the part by making a mark on the tablet computer or by making gestures within the shooting range of the tablet computer's camera. The tablet computer takes the second user's mark as the operation instruction information, or performs gesture recognition on the captured gestures and takes the recognized gesture as the operation instruction information. The tablet computer then sends this continuing operation instruction information to the augmented reality glasses. The augmented reality glasses receive the continuing operation instruction information and superimpose and display it at the corresponding position.

For another example, the augmented reality glasses send video information about the target object captured in real time to the tablet computer, and also send the transfer matrix information corresponding to each video frame in the video information. The tablet computer receives and presents the video information. Subsequently, the tablet computer determines the position of the target object in the video frame according to the received transfer matrix information. In some embodiments, the tablet computer highlights the target object in the video frame by means of line segments, circles, or locally increased brightness. The second user instructs the first user to process the part by making a mark on the tablet computer or by making gestures within the shooting range of the tablet computer's camera. The tablet computer takes the second user's mark as the operation instruction information, or performs gesture recognition on the captured gestures and takes the recognized gesture as the operation instruction information. The tablet computer then sends this continuing operation instruction information to the augmented reality glasses. The augmented reality glasses receive the continuing operation instruction information and superimpose and display it at the corresponding position.
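Highlighting the target object by locally increasing brightness, as mentioned in the two examples above, can be sketched as follows. This is a minimal sketch, assuming an 8-bit image stored as a NumPy array and a rectangular target region given as (x, y, width, height); the function name and the gain value are hypothetical.

```python
import numpy as np

def highlight_region(frame, box, gain=1.5):
    """Return a copy of an 8-bit frame with the box (x, y, w, h) brightened.

    Pixel values inside the box are multiplied by `gain` and clipped to the
    valid 0..255 range; pixels outside the box are left unchanged.
    """
    x, y, w, h = box
    out = frame.astype(np.float32)          # astype copies, so `frame` is untouched
    out[y:y + h, x:x + w] *= gain
    return np.clip(out, 0, 255).astype(np.uint8)

if __name__ == "__main__":
    frame = np.full((4, 4, 3), 100, dtype=np.uint8)   # uniform grey test image
    result = highlight_region(frame, (1, 1, 2, 2))
    print(result[:, :, 0])
```

The box would come from the tracked position of the target object in the current frame, whether the tablet computer computes it itself or derives it from received transfer matrix information.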

Of course, those skilled in the art should understand that the above continuing operation instruction information is only an example. Other existing or future continuing operation instruction information, if applicable to this application, should also fall within the protection scope of this application and is hereby incorporated by reference.

In some embodiments, the method further includes step S16 (not shown). In step S16, the first user equipment receives camera control instruction information of the second user for the camera device, sent by the second user equipment, adjusts the camera parameter information of the camera device according to the camera control instruction information, captures video information about the target object in real time through the adjusted camera device, and sends the video information captured by the adjusted camera device to the second user equipment. For example, the camera control instruction information includes instruction information for adjusting hardware parameters of the camera device of the first user equipment. The camera parameter information includes, but is not limited to, resolution, pixel depth, maximum frame rate, exposure mode and shutter speed, pixel size, and spectral response characteristics. For example, the first user equipment receives the camera control instruction information that the second user equipment sends for the second user to adjust the camera device of the first user equipment, adjusts the camera parameter information of the camera device accordingly, captures video information about the current target object in real time through the adjusted camera device, and sends the video information to the second user equipment.

For example, as shown in FIG. 3, panel A shows the real-time video information received by the second user, where the target object is the mouse pad on the table in the picture. To observe the target object in more detail, the second user operates the settings icon in the upper right corner or zooms in directly by spreading two fingers on the screen. Based on the second user's operation, the tablet computer generates corresponding camera control instruction information for focusing on the target object and sends it to the augmented reality glasses. The augmented reality glasses receive the camera control instruction information, adjust the relevant camera parameters of the camera device, such as resolution and focal length, capture the adjusted video information about the target object, and send the video information to the tablet computer. Panel B shows the enlarged video information about the target object received and presented by the tablet computer.
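The handling of a camera control instruction on the first user equipment can be sketched as follows. This is a minimal sketch under stated assumptions: the instruction is modeled as a dictionary with a hypothetical "type" field, the parameter set and default values are illustrative, and the zoom limits are invented for the example. A real device would pass the resulting values to its own camera driver.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class CameraParams:
    """Illustrative subset of adjustable camera parameters (assumed names)."""
    width: int = 1280
    height: int = 720
    frame_rate: int = 30
    zoom: float = 1.0

MAX_ZOOM = 4.0  # assumed hardware limit, for illustration only

def apply_control(params, instruction):
    """Return new camera parameters after applying one control instruction."""
    kind = instruction["type"]
    if kind == "zoom":
        # A two-finger spread on the second user equipment could map to factor > 1.
        zoom = min(max(params.zoom * instruction["factor"], 1.0), MAX_ZOOM)
        return replace(params, zoom=zoom)
    if kind == "resolution":
        return replace(params, width=instruction["width"], height=instruction["height"])
    raise ValueError(f"unsupported control instruction: {kind}")

if __name__ == "__main__":
    params = CameraParams()
    params = apply_control(params, {"type": "zoom", "factor": 2.0})
    params = apply_control(params, {"type": "resolution", "width": 1920, "height": 1080})
    print(params)
```

Keeping the parameters immutable and returning a new object makes it easy to compare the settings before and after an instruction, for example to decide whether the camera device needs to be reconfigured at all.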

Of course, those skilled in the art should understand that the above camera control instruction information and/or camera parameter information is only an example. Other existing or future camera control instruction information and/or camera parameter information, if applicable to this application, should also fall within the protection scope of this application and is hereby incorporated by reference.

In some embodiments, the tag information further includes auxiliary tag information marked by the first user on the target object through the first user equipment. The auxiliary tag information includes marks made on the target object (such as line segments or circles) that the first user equipment collects from the first user's operations, or feedback on the tag information sent by the second user equipment, such as a circle drawn around a questionable part of the tag information or circled text. For example, the first user equipment generates corresponding auxiliary tag information about the target object according to the first user's operation and sends the auxiliary tag information to the second user equipment for further remote interaction.

For example, while the first user equipment captures video information about the target object, the first user circles a specific position on the target object, and the first user equipment generates corresponding auxiliary tag information according to the first user's operation. In one case, the first user equipment sends the auxiliary tag information to the second user equipment together with the video information. The second user equipment receives both, calculates the position of the auxiliary tag information in each video frame from its initial position in the video frame and the target tracking algorithm, and displays the auxiliary tag information at the corresponding position of each video frame while presenting the video information. In another case, the first user equipment calculates, according to the target tracking algorithm, the transfer matrix information of the auxiliary tag information in each video frame of the video information, and sends the video information, the auxiliary tag information, and the corresponding transfer matrix information to the second user equipment. After receiving them, the second user equipment superimposes and displays the auxiliary tag information at the corresponding position according to the transfer matrix information while presenting the video information.

For another example, after the augmented reality glasses display the second user's operation instruction information on the target object at the corresponding position, the first user may have a question about the operation instruction information and draw a circle around the questionable location in it. Alternatively, the first user may have completed the instructed operation and, wanting further instructions, tap a next-step prompt at the position of the target object. Based on the first user's operation, the augmented reality glasses generate question information about the operation instruction information, or a request for the next operation instruction, as auxiliary tag information and send it to the tablet computer. The tablet computer receives the auxiliary tag information and displays it at the corresponding position, and corresponding operation instruction information is produced based on it, such as an answer to the question or the next operation instruction. The tablet computer sends this continuing operation instruction information to the augmented reality glasses, which superimpose and display it in the video information. Here, the continuing operation instruction information includes the auxiliary tag information, such as what the preceding question was or a prompt for the next step.

Of course, those skilled in the art should understand that the above auxiliary tag information is only an example. Other existing or future auxiliary tag information, if applicable to this application, should also fall within the protection scope of this application and is hereby incorporated by reference.

In some embodiments, the target object includes a paper document under discussion, and the operation instruction information of the second user on the target object includes one or more pieces of annotation position information of the second user in a video frame of the paper document under discussion. For example, the target object may be a paper document under discussion, and the corresponding operation instruction information includes one or more positions annotated by the second user in the video frame of the document, such as an underline or a circle at a position in the document, or an annotation attached to text (such as the pinyin, an explanation, or related content of the text).

For example, a first user wears augmented reality glasses and reads a paper document through them, while a second user holds a tablet computer that has established a communication connection with the augmented reality glasses. The augmented reality glasses capture video information of the paper document under discussion through the camera device and send the video information to the tablet computer. The tablet computer receives the video information and generates corresponding operation instruction information based on one or more annotation operations of the second user on the document, for example operation instruction information containing an error prompt position that indicates an error at the corresponding position of the document. The tablet computer sends the operation instruction information to the augmented reality glasses. The augmented reality glasses use the target tracking algorithm to calculate the position of the paper document under discussion in the video frames of the current video information, such as its corresponding transfer matrix information, and, according to the transfer matrix information and the error prompt position in the operation instruction information, superimpose the corresponding annotation or annotations in real time at the corresponding position in the paper document, prompting the first user that there is an error at that position in the current document.

Of course, those skilled in the art should understand that the above target objects and/or operation instruction information are only examples. Other existing or future target objects and/or operation instruction information, if applicable to this application, should also fall within the protection scope of this application and are hereby incorporated by reference.

In some embodiments, in step S13, the first user equipment generates rendering mark information according to the one or more pieces of annotation position information, and superimposes and displays the rendering mark information on the target object according to the transfer matrix information. The rendering mark information includes, for example, a highlight projected onto the one or more annotated positions, or marks such as a line or a circle. For example, the first user equipment generates corresponding rendering mark information according to the one or more annotation positions in the paper document under discussion carried in the operation instruction information, determines the position of the paper document in each video frame of the video information based on the transfer matrix information, thereby determines the position of the rendering mark in each video frame, and superimposes and displays the rendering mark information at the corresponding position.

For example, the augmented reality glasses receive operation instruction information that includes annotation information for the fifth word in the second row of the currently read page of the paper document under discussion. Based on the operation instruction information, the augmented reality glasses generate rendering mark information consisting of an underline beneath the fifth word in the second row of that page. The augmented reality glasses calculate the position of the paper document in each video frame of the current video information according to the target tracking algorithm and, according to the position of the rendering mark relative to the paper document, superimpose the underline beneath the fifth word in the second row of the page in each video frame.
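Placing an underline relative to the document and then mapping it into the current video frame can be sketched as follows. This is a minimal sketch under stated assumptions: the annotated word is given as a bounding box (x, y, width, height) in document-page coordinates, the document-to-frame relationship is a 3x3 homogeneous matrix, and the function names and the offset value are hypothetical.

```python
import numpy as np

def underline_for_box(box, offset=2.0):
    """Endpoints of an underline just below a word box (x, y, w, h), in page coordinates."""
    x, y, w, h = box
    return np.array([[x, y + h + offset],
                     [x + w, y + h + offset]], dtype=float)

def to_frame(matrix, points):
    """Map N x 2 page coordinates into the video frame via a 3x3 homogeneous matrix."""
    pts = np.hstack([np.asarray(points, dtype=float), np.ones((len(points), 1))])
    mapped = pts @ np.asarray(matrix, dtype=float).T
    return mapped[:, :2] / mapped[:, 2:3]

if __name__ == "__main__":
    word_box = (10.0, 20.0, 30.0, 8.0)          # assumed box of the annotated word
    page_to_frame = np.array([[2.0, 0.0, 5.0],  # assumed transform: scale by 2, then shift
                              [0.0, 2.0, 7.0],
                              [0.0, 0.0, 1.0]])
    segment = to_frame(page_to_frame, underline_for_box(word_box))
    print(segment)  # the two endpoints to draw in the current video frame
```

Defining the rendering mark once in page coordinates and re-projecting it for every frame keeps the underline fixed to the word even as the document moves in the camera view.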

Of course, those skilled in the art should understand that the above rendering mark information is only an example. Other existing or future rendering mark information, if applicable to this application, should also fall within the protection scope of this application and is hereby incorporated by reference.

In some embodiments, the method further includes step S17 (not shown). In step S17, the first user equipment captures image information about the target object through the camera device in the first user equipment, sends the image information to the corresponding second user equipment, receives tag information about the target object sent by the second user equipment, wherein the tag information includes operation instruction information of the second user on the target object in the image information, and superimposes and displays the tag information on the target object. In this case, in step S11, the first user equipment captures video information about the target object in real time through the camera device. For example, the first user equipment captures image information about the target object through the camera device and sends the image information to the second user equipment. The second user equipment receives and presents the image information so that the second user can give operation instructions on the target object. Based on the second user's operation, the second user equipment generates tag information corresponding to the operation instruction information and sends the tag information to the first user equipment. The first user equipment receives the tag information and superimposes and displays it at the position corresponding to the target object in the image. Subsequently, the first user equipment captures a video stream about the target object through the camera device and uses a target tracking algorithm to display the tag information in each video frame of the video stream.

For example, the augmented reality glasses capture image information of the current target object and send the image information to the tablet computer, and the tablet computer receives and presents the image information. The second user gives operation instructions on the target object based on the presented image information. The tablet computer collects the second user's operation instruction information, generates corresponding tag information, and sends the tag information to the augmented reality glasses. The augmented reality glasses receive the tag information and superimpose and display it in the captured image information according to the target tracking algorithm. Subsequently, the augmented reality glasses continue to capture video information of the target object and superimpose the tag information at the corresponding position in real time according to the target tracking algorithm.

FIG. 4 illustrates a method for performing remote assistance based on augmented reality at a second user equipment according to another aspect of the present application, where the method includes steps S21 and S22. In step S21, the second user equipment receives video information about the target object that the first user equipment captures in real time through the camera device in the first user equipment. In step S22, the second user equipment presents the video information and keeps the corresponding tag information superimposed on the target object displayed in each video frame of the video information, wherein the tag information includes operation instruction information of the second user on the target object. For example, the second user equipment receives and presents image information or video information about the target object sent by the first user equipment, and collects the second user's operations to generate corresponding tag information. Subsequently, the second user equipment continues to receive and present video information about the target object sent by the first user equipment, and superimposes and displays the previously determined tag information in the presented video.

For example, a first user holds augmented reality glasses, a second user holds a tablet computer, and the augmented reality glasses have established a communication connection with the tablet computer. The augmented reality glasses and the tablet computer have already exchanged a video stream or an image of the target object, and the augmented reality glasses have received the tablet computer's operation instruction information about the target object in a previous video frame. In this example the target object is a part on an operating platform. The target object may be determined by the first user equipment based on the first user's selection operation (such as drawing a circle); it may be determined by the second user equipment based on the second user's selection operation and then received by the first user equipment; or it may be determined by the first user equipment by recognizing initial image information of the target object. The corresponding operation instruction information includes, for example, virtual operation information obtained by the second user equipment by recognizing the second user's gesture for operating the part. The augmented reality glasses collect the current video information about the target object in real time through the camera, and then use the target tracking algorithm to calculate the transfer matrix information of the target object in the current video frame relative to the target object in the previous video frame. Subsequently, the augmented reality glasses determine the position information of the target object in the current video frame according to the transfer matrix information, and superimpose and display the corresponding tag information at that position; for example, the operation instruction information corresponding to the second user's gesture is superimposed and displayed at the position of the part on the operating platform in the current video frame. At the same time, the augmented reality glasses send the video information to the tablet computer, which receives and presents it and, while presenting it, displays the previous tag information at the corresponding position in the video information. In other embodiments, the augmented reality glasses also send auxiliary tag information to the tablet computer, where the auxiliary tag information includes marks on the target object (such as line segments, circles, etc.) collected by the augmented reality glasses based on the first user's operation, or feedback on the tag information sent by the tablet computer, such as questions raised about the tag information, circled text, etc.; the tablet computer receives the auxiliary tag information and, while presenting the video information, superimposes and displays the auxiliary tag information at the corresponding position of the target object.
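As an illustration of how transfer matrix information can drive the overlay, the following minimal sketch assumes that each transfer matrix is a 3x3 homogeneous transform mapping the previous frame's pixel coordinates to the current frame's. That representation and the names `apply_transfer` and `track_tag` are assumptions made for illustration; this application does not specify them.

```python
from typing import List, Tuple

# Assumed representation: a 3x3 homogeneous transform stored as nested lists.
Matrix = List[List[float]]
Point = Tuple[float, float]


def apply_transfer(m: Matrix, point: Point) -> Point:
    """Map a 2D pixel position through a 3x3 homogeneous transform."""
    x, y = point
    xh = m[0][0] * x + m[0][1] * y + m[0][2]
    yh = m[1][0] * x + m[1][1] * y + m[1][2]
    w = m[2][0] * x + m[2][1] * y + m[2][2]
    return (xh / w, yh / w)


def track_tag(anchor: Point, frame_to_frame: List[Matrix]) -> List[Point]:
    """Return the tag position in each successive frame.

    `anchor` is where the tag was placed in the starting frame; each matrix
    maps coordinates of the previous frame to the current frame.
    """
    positions = []
    current = anchor
    for m in frame_to_frame:
        current = apply_transfer(m, current)
        positions.append(current)
    return positions
```

With a pure translation of (5, -2) pixels per frame, a tag placed at (100, 50) would be drawn at (105, 48) in the next frame and at (110, 46) in the one after.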

Of course, those skilled in the art should understand that the above tag information is only an example. Other existing or future tag information, if applicable to this application, should also be included in the protection scope of this application and is hereby incorporated by reference.

In some embodiments, the method further includes step S23 (not shown). In step S23, the second user equipment performs a target tracking operation on the target object in the video information; wherein, in step S22, the second user equipment presents the video information and, according to the result information of the target tracking operation, superimposes and displays corresponding tag information on the target object in each video frame of the video information, wherein the tag information includes operation instruction information of the second user on the target object through the second user equipment. For example, the second user equipment receives video information about the target object sent by the first user equipment, and performs a target tracking operation on the target object in the video information according to template information of the target object, to determine the position information of the target object in each video frame of the video information. The template information may be sent by the first user equipment to the second user equipment, or may be obtained by the second user equipment from a selection made by the second user in the initial video frame, or by importing the template information. Subsequently, when the second user equipment presents the video information, the tag information is superimposed and displayed at the corresponding position of the target object according to the result information of the target tracking, where the tag information may have been generated by the second user equipment from the second user's operation guidance on the target object in the initial video frame or image information, or may be generated from the second user's operation guidance on the target object in video information sent later.

For example, the tablet computer receives video information sent by the augmented reality glasses, performs target tracking on the part in each video frame of the video information according to the part template information of the operating platform, and obtains position information of the part in each video frame. The template of the part may be imported by the second user, selected in the initialization frame, or sent by the augmented reality glasses. The tablet computer receives and presents the video information, and generates corresponding tag information according to the second user's installation guidance for the part (for example, a circle or an arrow pointing to the installation position, or a preset installation operation matched through gesture recognition). While presenting the video information, the tablet computer superimposes and displays the tag information in real time in subsequent video frames according to the position information of the part in each video frame.
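The application does not name a specific tracking algorithm. As one simple stand-in, the sketch below locates a part template in a grayscale frame by brute-force minimum sum of absolute differences (SAD). The function name `locate_template` and the list-of-rows image format are assumptions for illustration; a practical tracker would be considerably more robust to scale, rotation, and lighting changes.

```python
from typing import List, Optional, Tuple

# Assumed format: a grayscale image as a list of rows of pixel intensities.
Image = List[List[int]]


def locate_template(frame: Image, template: Image) -> Tuple[int, int]:
    """Return (row, col) of the top-left corner where the template best
    matches the frame, using the minimum sum of absolute differences."""
    th, tw = len(template), len(template[0])
    fh, fw = len(frame), len(frame[0])
    best_pos: Tuple[int, int] = (0, 0)
    best_cost: Optional[int] = None
    # Slide the template over every position where it fits entirely.
    for r in range(fh - th + 1):
        for c in range(fw - tw + 1):
            cost = sum(
                abs(frame[r + i][c + j] - template[i][j])
                for i in range(th)
                for j in range(tw)
            )
            if best_cost is None or cost < best_cost:
                best_pos, best_cost = (r, c), cost
    return best_pos
```

Running this on each incoming frame yields a per-frame position at which the tag information can be drawn.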

In some embodiments, in step S21, the second user equipment receives the video information about the target object, captured in real time through the camera device in the first user equipment and sent by the corresponding first user equipment, together with the transfer matrix information corresponding to the target object in each video frame of the video information; wherein, in step S22, the second user equipment presents the video information and, according to the transfer matrix information corresponding to the target object in each video frame of the video information, superimposes and displays the corresponding tag information on the target object in each video frame, wherein the tag information includes operation instruction information of the second user on the target object through the second user equipment. For example, when the first user equipment sends video information to the second user equipment, it simultaneously sends the transfer matrix information obtained from its target tracking operation, so that the second user equipment can track the target object while presenting the video information. The second user equipment receives the video information and the transfer matrix information and, while presenting the video information, superimposes and displays the tag information at the corresponding position in the video information according to the transfer matrix information.

For example, the augmented reality glasses capture video information about the current target object in real time, and perform a target tracking operation on the target object in the video information in combination with the previous video frame, to determine the transfer matrix information of the target object in each video frame relative to the previous video frame. Subsequently, the augmented reality glasses send the video information and the transfer matrix information corresponding to each video frame to a tablet computer, either directly or through a cloud. The tablet computer receives the video information and the corresponding transfer matrix information, presents the video information, and superimposes and displays the tag information at the corresponding position in the video information according to the transfer matrix information.
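Sending transfer matrix information alongside the video requires some way to associate each matrix with its video frame. The sketch below shows one possible encoding, assuming a JSON message keyed by a frame index. The `FrameTransfer` type and its field names are hypothetical and are not defined in this application.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class FrameTransfer:
    # Hypothetical fields: which frame the matrix belongs to, and the
    # 3x3 transfer matrix relative to the previous frame.
    frame_index: int
    matrix: List[List[float]]


def encode(item: FrameTransfer) -> str:
    """Serialize one per-frame transfer matrix for sending to the peer."""
    return json.dumps(asdict(item))


def decode(payload: str) -> FrameTransfer:
    """Rebuild the per-frame transfer matrix on the receiving device."""
    data = json.loads(payload)
    return FrameTransfer(frame_index=data["frame_index"], matrix=data["matrix"])
```

The receiving device can then look up the matrix for the frame it is about to present before drawing the tag information.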

In some embodiments, the method further includes step S24 (not shown). In step S24, the second user equipment obtains the operation instruction information of the second user on the target object based on the video information, and sends the operation instruction information to the first user equipment. For example, the second user equipment generates corresponding continued operation instruction information according to the second user's continued operation on the target object (such as drawing line segments or circles), or obtains it by recognizing the second user's gesture operation through gesture recognition. Subsequently, the second user equipment sends the continued operation instruction information to the first user equipment to assist the first user in continuing to operate on the target object.

For example, the augmented reality glasses send video information about the target object captured in real time to the tablet computer, which receives and presents it. Subsequently, the tablet computer performs target tracking in each video frame of the received video stream to obtain the position of the target object in the video frame. In some embodiments, the tablet computer highlights the target object in the video frame by means of line segments, circles, locally increased brightness, and the like. The second user guides the first user in handling the part by making marks on the tablet computer or by making gestures within the shooting range of the tablet computer's camera. The tablet computer uses the second user's marks as the operation instruction information, or performs gesture recognition on the captured gestures and uses the recognized gestures as the operation instruction information. The tablet computer then sends the continued operation instruction information to the augmented reality glasses, which receive it and superimpose and display it at the corresponding position.

For another example, the augmented reality glasses send real-time video information about the target object to the tablet computer, together with the transfer matrix information corresponding to each video frame in the video information, and the tablet computer receives and presents the video information. Subsequently, the tablet computer determines the position of the target object in the video frame according to the received transfer matrix information. In some embodiments, the tablet computer highlights the target object in the video frame by means of line segments, circles, locally increased brightness, and the like. The second user guides the first user in handling the part by making marks on the tablet computer or by making gestures within the shooting range of the tablet computer's camera. The tablet computer uses the second user's marks as the operation instruction information, or performs gesture recognition on the captured gestures and uses the recognized gestures as the operation instruction information. The tablet computer then sends the continued operation instruction information to the augmented reality glasses, which receive it and superimpose and display it at the corresponding position.

In some embodiments, the method further includes step S25 (not shown). In step S25, the second user equipment generates imaging control instruction information of the second user for the imaging device according to an imaging control operation performed by the second user through the second user equipment, where the imaging control instruction information is used to adjust imaging parameter information of the imaging device; sends the imaging control instruction information to the first user equipment; and receives the video information sent by the first user equipment and captured by the adjusted imaging device. For example, the second user equipment receives the video information and adjusts it, such as enlarging the area near the target object. The second user equipment determines the corresponding imaging control instruction information based on the second user's operation, where the imaging control instruction information includes imaging parameter information for adjusting the imaging device of the first user equipment, and then sends the imaging control instruction information to the first user equipment. The imaging control instruction information includes instruction information for adjusting hardware parameters of the imaging device of the first user equipment. The imaging parameter information includes, but is not limited to, resolution, pixel depth, maximum frame rate, exposure mode and shutter speed, pixel size, and spectral response characteristics.
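A minimal sketch of how imaging control instruction information might be represented and applied on the first user equipment follows. The `ImagingControlInstruction` type, its field names, and the dictionary of current camera parameters are all hypothetical; the application lists the kinds of parameters but does not define a message format.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class ImagingControlInstruction:
    # All fields are hypothetical; None means "leave this parameter unchanged".
    resolution: Optional[Tuple[int, int]] = None
    max_frame_rate: Optional[int] = None
    exposure_mode: Optional[str] = None
    shutter_speed_s: Optional[float] = None


def apply_instruction(
    params: Dict[str, object], instruction: ImagingControlInstruction
) -> Dict[str, object]:
    """Return a copy of the camera parameters with the requested changes."""
    updated = dict(params)
    for name, value in vars(instruction).items():
        if value is not None:
            updated[name] = value
    return updated
```

Returning a new dictionary rather than mutating the input keeps the previous settings available, for example if the first user equipment needs to revert an adjustment.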

In some embodiments, the method further includes step S26 (not shown). In step S26, the second user equipment receives and presents image information about the target object that is captured in real time by the first user equipment through the camera device in the first user equipment, acquires the second user's operation instruction information on the target object in the image information, sends the operation instruction information to the first user equipment, and superimposes and displays the operation instruction information on the target object in the image information; wherein, in step S21, the second user equipment receives video information about the target object captured in real time by the first user equipment through the camera device. For example, the first user equipment captures image information about the target object through the camera device and sends the image information to the second user equipment. The second user equipment receives and presents the image information so that the second user can give operation guidance on the target object. Based on the second user's operation, the second user equipment generates tag information corresponding to the operation instruction information and sends the tag information to the first user equipment. The first user equipment receives the tag information, and superimposes and displays it at the position corresponding to the target object in the image. Subsequently, the first user equipment collects a video stream about the target object through the camera device, and displays the tag information in each video frame of the video stream by using a target tracking algorithm.

FIG. 5 illustrates a method for remote assistance based on augmented reality at the first user equipment according to another aspect of the present application, where the method includes steps S31, S32, S33, and S34. In step S31, the first user equipment captures video information about the first target object in real time through the camera device in the first user equipment; in step S32, the first user equipment sends the video information to the corresponding network device; in step S33, the first user equipment receives first transfer matrix information, sent by the network device, corresponding to the first target object in each video frame of the video information; in step S34, the first user equipment superimposes and displays the corresponding first tag information on the first target object according to the first transfer matrix information, wherein the first tag information includes operation instruction information of the second user on the first target object, sent by the corresponding second user equipment. For example, the first user equipment and the second user equipment establish a communication connection through a network device, and the first user equipment sends the captured video information about the first target object to the network device. The network device performs target tracking on the first target object according to the video information, determines the transfer matrix information of the first target object in each video frame of the video information, and sends the transfer matrix information to the first user equipment and the second user equipment. Subsequently, the first user equipment and the second user equipment superimpose and display the first tag information based on the transfer matrix information sent by the network device, where the first tag information includes operation instruction information of the second user on the first target object through the second user equipment.

For example, a first user holds augmented reality glasses and a second user holds a tablet computer, and the augmented reality glasses and the tablet computer establish a communication connection through a network device (cloud). The first user shoots the first target object (such as part A on the operating platform) in real time, obtains video information related to part A, and sends the video information to the network device. The network device receives the video information related to part A, and determines the transfer matrix information of part A in each video frame of the video information according to the target tracking algorithm. The network device then returns the transfer matrix information to the augmented reality glasses. The augmented reality glasses receive the transfer matrix information and, while presenting the video information, superimpose and display the corresponding tag information in real time at the corresponding position in the video according to the transfer matrix information, wherein the tag information includes operation instruction information such as the second user's installation guidance for part A; the operation instruction information may be generated on the tablet computer, or may be generated by the network device according to the second user's operation uploaded by the tablet computer.

Of course, those skilled in the art should understand that the above operation instruction information is only an example. Other existing or future operation instruction information, if applicable to this application, should also be included in the protection scope of this application and is hereby incorporated by reference.

FIG. 6 illustrates a method for performing remote assistance based on augmented reality at a network device according to another aspect of the present application, where the method includes steps S41, S42, S43, and S44. In step S41, the network device receives video information about the first target object sent by the first user equipment, where the video information is captured in real time by a camera device in the first user equipment; in step S42, the network device determines first transfer matrix information corresponding to the first target object in each video frame of the video information by performing a target tracking operation on the first target object in the video information; in step S43, the network device sends the first transfer matrix information to the first user equipment; in step S44, the network device sends the video information and the first transfer matrix information to a second user equipment that belongs to the same remote assistance task as the first user equipment. Here, the network device is a server with sufficient computing power, mainly responsible for forwarding video, audio, and tag information data. The network device also runs some computer vision and image processing algorithms. For example, when video/audio information reaches the network device, the network device tracks the target object (such as the first target object) using a tracking algorithm and then returns the tracking result information to the user equipment.

For example, a first user holds augmented reality glasses and a second user holds a tablet computer, and the augmented reality glasses and the tablet computer establish a communication connection through a network device (cloud). The augmented reality glasses shoot the first target object (such as part A on the operating platform) in real time, obtain video information related to part A, and send the video information to the network device. The network device receives the video information related to part A and determines the transfer matrix information of part A in each video frame according to the target tracking algorithm. The network device then returns the transfer matrix information to the augmented reality glasses, and sends the transfer matrix information and the video information to the tablet computer, where the augmented reality glasses and the tablet computer communicate through the network device to perform the same remote assistance task (e.g., installation guidance for part A). The augmented reality glasses receive the transfer matrix information and, while presenting the video information, superimpose and display the corresponding tag information in real time at the corresponding position in the video according to the transfer matrix information, wherein the tag information includes operation instruction information such as the second user's installation guidance for part A; the operation instruction information may be generated on the tablet computer, or may be generated by the network device according to the second user's operation uploaded by the tablet computer. The tablet computer receives the transfer matrix information and video information sent by the network device. When presenting the video information, it determines the position information of part A in each video frame according to the transfer matrix information, and superimposes and displays the tag information about part A at that position, such as operation instruction information like installation guidance for part A.
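The forwarding pattern of steps S41 to S44 can be sketched as follows. This is a simplified model under stated assumptions: the tracker is passed in as a callable, outgoing messages are collected in in-memory lists instead of being sent over a network, and the names `NetworkDevice`, `on_frame`, `first_ue`, and `second_ue` are hypothetical.

```python
from typing import Callable, Dict, List

Matrix = List[List[float]]


class NetworkDevice:
    """Toy model of the relay: track the target, then fan out the results."""

    def __init__(self, tracker: Callable[[bytes], Matrix]) -> None:
        # `tracker` stands in for the target tracking algorithm (assumption).
        self.tracker = tracker
        # Outgoing messages per device; a real system would use network sends.
        self.outbox: Dict[str, List[dict]] = {"first_ue": [], "second_ue": []}

    def on_frame(self, frame: bytes) -> None:
        """Handle one video frame received from the first user equipment (S41)."""
        matrix = self.tracker(frame)  # S42: determine the transfer matrix
        # S43: the first user equipment already has the frame, so it only
        # needs the transfer matrix.
        self.outbox["first_ue"].append({"matrix": matrix})
        # S44: the second user equipment needs both the frame and the matrix.
        self.outbox["second_ue"].append({"frame": frame, "matrix": matrix})
```

The asymmetry between the two messages reflects the description: the device that captured the video receives only the tracking result, while the remote device receives the video together with the result.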

In some embodiments, in step S42, the network device reconstructs video information of the first target object from the video information and other video information of the first target object, and performs a target tracking operation on the first target object in the reconstructed video information to determine first transfer matrix information corresponding to the first target object in each video frame of the video information. Here, the network device is mainly responsible for forwarding data such as video, audio, and tag information, and also has some computer vision and image processing capabilities. When video/audio information is sent to the network device, the network device processes the video information using target tracking, target recognition, reconstruction, pose estimation, and computer graphics algorithms (such as virtual object rendering and point cloud processing, including stitching, downsampling/oversampling, matching, and meshing), and returns the processed result information to the user equipment. For example, the network device reconstructs the video information uploaded by the first user and videos uploaded by other users to generate overall video information for the first target object, and then performs target tracking on the first target object in the reconstructed video information.

For example, a first user holds augmented reality glasses, a second user holds a tablet computer, and another user (such as a third user) holds a third user equipment (such as augmented reality glasses, a tablet computer, etc.). The augmented reality glasses, the third user equipment, and the tablet computer have established a communication connection through the network device (cloud) and are performing the same remote assistance task (such as installation guidance for part A). The augmented reality glasses and the third user equipment are used to capture video information related to part A: the augmented reality glasses mainly capture the left half of part A, and the third user equipment mainly captures the right half, with a certain degree of overlap between the two. The augmented reality glasses shoot the first target object (such as part A on the operating platform) in real time, obtain first video information related to the left half of part A, and send the first video information to the network device; the third user equipment shoots part A in real time, obtains third video information related to the right half of part A, and sends the third video information to the network device. The network device receives the first video information and the third video information related to part A, obtains reconstructed video information covering the whole of part A from the first video information and the third video information through a computer vision algorithm, and determines the transfer matrix information of part A in each video frame of the reconstructed video information according to the target tracking algorithm. Subsequently, the network device returns the transfer matrix information and the reconstructed video information to the augmented reality glasses, the third user equipment, and the tablet computer. The third user equipment receives the transfer matrix information and the reconstructed video information and, while presenting the reconstructed video information, superimposes and displays the corresponding tag information in real time at the corresponding position in the video according to the transfer matrix information, where the tag information includes operation instruction information such as the second user's installation guidance for part A; the operation instruction information may be generated on the tablet computer, or may be generated by the network device according to the second user's operation uploaded by the tablet computer. In other embodiments, the third user equipment calculates, according to a computer vision algorithm, the transfer matrix information of the position of the right half of part A in the reconstructed video information relative to the third video information; subsequently, while presenting the third video information, the third user equipment superimposes and displays the corresponding tag information at the corresponding position.
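The reconstruction step is described only as using a computer vision algorithm. As a deliberately simplified illustration of combining a left-half view and a right-half view that overlap, the sketch below assumes the two frames are already aligned, have the same height, and share a known number of overlapping columns. A real system would have to estimate the alignment, so `stitch_horizontal` is a toy stand-in and not the method of this application.

```python
from typing import List

# Assumed format: a grayscale image as a list of rows of pixel values.
Image = List[List[float]]


def stitch_horizontal(left: Image, right: Image, overlap: int) -> Image:
    """Join two equally tall images whose last/first `overlap` columns show
    the same region, averaging pixel values in the shared columns."""
    if len(left) != len(right):
        raise ValueError("images must have the same height")
    stitched = []
    for left_row, right_row in zip(left, right):
        keep_left = len(left_row) - overlap
        # Blend the shared columns so neither view dominates the seam.
        shared = [
            (a + b) / 2
            for a, b in zip(left_row[keep_left:], right_row[:overlap])
        ]
        stitched.append(left_row[:keep_left] + shared + right_row[overlap:])
    return stitched
```

Target tracking would then run on the stitched frames, so that the resulting transfer matrix information refers to the whole of part A.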

In some embodiments, the method further includes step S45 (not shown). In step S45, the network device determines third transfer matrix information corresponding to a third target object in each video frame of the video information by performing a target tracking operation on the third target object in the video information, where the third target object and the first target object belong to the same remote assistance task, and sends the video information and the third transfer matrix information to the third user equipment that corresponds to the third target object in the remote assistance task; wherein, in step S44, the network device sends the video information, the first transfer matrix information, and the third transfer matrix information to the second user equipment that belongs to the same remote assistance task as the first user equipment. The third user holds a third user equipment, which includes, but is not limited to, an augmented reality device, a tablet computer, a PC terminal, and a mobile terminal. A mobile terminal is used as an example in the following embodiments; those skilled in the art should understand that these embodiments are also applicable to other third user equipment such as augmented reality devices, tablet computers, and PC terminals.

For example, the first user holds augmented reality glasses, the second user holds a tablet computer, and the third user holds a mobile terminal. The augmented reality glasses, the tablet computer, and the mobile terminal establish a communication connection through a network device (cloud) and are performing the same remote assistance task (for example, installation guidance for part A and part B on the workbench), and the augmented reality glasses are responsible for shooting video information related to the workbench. The augmented reality glasses shoot the first target object (such as part A on the operating platform) in real time to obtain video information related to part A on the workbench, where the video frames of the video information also contain part B; the augmented reality glasses then send the video information to the network device. The network device receives the video information, obtains the initial positions of part A and part B through image recognition, and calculates, according to the target tracking algorithm, the first transfer matrix information of part A and the third transfer matrix information of part B in each video frame of the video information. The network device then returns the first transfer matrix information to the augmented reality glasses, sends the third transfer matrix information and the video information to the mobile terminal, and sends the first transfer matrix information, the third transfer matrix information, and the video information to the tablet computer. The augmented reality glasses receive the first transfer matrix information and, while presenting the video information, superimpose and display the corresponding tag information in real time at the corresponding position in the video according to the first transfer matrix information, where the tag information includes operation instruction information such as the second user's installation guidance for part A; the operation instruction information may be generated on the tablet computer, or may be generated by the network device according to the second user's operation uploaded by the tablet computer. The mobile terminal receives the third transfer matrix information and the video information sent by the network device. When presenting the video information, it determines the position information of part B in each video frame according to the third transfer matrix information, and superimposes and displays the tag information about part B at that position, such as operation instruction information like installation guidance for part B. The tablet computer receives the first transfer matrix information, the third transfer matrix information, and the video information sent by the network device. When presenting the video information, it determines the position information of part A in each video frame according to the first transfer matrix information and superimposes and displays the tag information about part A at that position, such as installation guidance for part A; it likewise determines the position information of part B in each video frame according to the third transfer matrix information and superimposes and displays the tag information about part B at that position, such as installation guidance for part B. The second user equipment may determine which object its current tag information applies to according to a selection operation of the second user.
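The per-device distribution in this example (first transfer matrix to the glasses, third transfer matrix plus video to the mobile terminal, both matrices plus video to the tablet) can be sketched as a small routing function. The subscription table, its keys, and the function name `route` are hypothetical and serve only to make the fan-out explicit.

```python
from typing import Dict, List

Matrix = List[List[float]]


def route(
    frame: bytes,
    matrices: Dict[str, Matrix],
    subscriptions: Dict[str, dict],
) -> Dict[str, dict]:
    """Build one outgoing message per device for a single video frame.

    `matrices` maps a target name to its transfer matrix for this frame.
    `subscriptions` maps a device name to the targets it follows and
    whether it also needs the video frame (assumed structure).
    """
    messages: Dict[str, dict] = {}
    for device, sub in subscriptions.items():
        message: dict = {
            "matrices": {target: matrices[target] for target in sub["targets"]}
        }
        if sub["needs_video"]:
            message["frame"] = frame
        messages[device] = message
    return messages
```

For the scenario above, the glasses would subscribe to part A without video (they captured it themselves), the mobile terminal to part B with video, and the tablet to both parts with video.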

FIG. 7 illustrates a method for performing remote assistance based on augmented reality at a third user equipment according to another aspect of the present application, where the method includes steps S51 and S52. In step S51, the third user equipment receives, from the corresponding network device, video information about the third target object and third transfer matrix information corresponding to the third target object in each video frame of the video information; in step S52, the third user equipment presents the video information and, according to the third transfer matrix information, superimposes and displays the corresponding third tag information on the third target object in each video frame of the video information, wherein the third tag information includes operation instruction information of the second user on the third target object through the second user equipment, the video information is captured in real time by a camera device in the first user equipment, the first user equipment, the third user equipment, and the second user equipment belong to the same remote assistance task, and the first user equipment and the third user equipment each receive remote assistance from the second user equipment.

For example, the first user uses augmented reality glasses, the second user holds a tablet computer, and the third user holds a mobile terminal. The augmented reality glasses, the tablet computer, and the mobile terminal establish a communication connection through a network device (cloud) and are performing the same remote assistance task (for example, installation guidance for part A and part B on the workbench), with the augmented reality glasses responsible for capturing video information of the workbench. The augmented reality glasses capture the first target object (such as part A on the workbench) in real time to obtain video information related to part A; the video frames of this video information also contain part B. The augmented reality glasses then send the video information to the network device. The network device receives the video information, obtains the initial positions of part A and part B through image recognition, and uses the target tracking algorithm to calculate the first transfer matrix information and the third transfer matrix information, which correspond to part A and part B respectively in each video frame of the video information. The network device then sends the third transfer matrix information and the video information to the mobile terminal. The mobile terminal receives them and, when presenting the video information, determines the position of part B in each video frame according to the third transfer matrix information and superimposes tag information about part B at that position. The tag information includes operation instruction information of the second user, such as installation guidance for part B, where the operation instruction information may be generated on the tablet computer, or may be generated by the network device according to the second user's operations uploaded by the tablet computer. In some other embodiments, the tag information also includes auxiliary tag information based on operations of the third user collected by the mobile terminal, such as marks on the target object (for example, drawn line segments or circles) or feedback on the tag information sent by the tablet computer (for example, questions or circled text); while presenting the video information, the mobile terminal superimposes the auxiliary tag information at the position corresponding to the target object.

FIG. 8 illustrates a method for remote assistance based on augmented reality at the second user equipment according to another aspect of the present application, where the method includes steps S61 and S62. In step S61, the second user equipment receives, from a corresponding network device, video information about a first target object and first transfer matrix information corresponding to the first target object in each video frame of the video information. In step S62, the second user equipment presents the video information and, according to the first transfer matrix information, superimposes corresponding first tag information on the first target object in each video frame of the video information, where the first tag information includes operation instruction information of the second user on the first target object, given through the second user equipment. The video information is either captured in real time by a camera device in a first user equipment that belongs to the same remote assistance task as the second user equipment, or reconstructed from the real-time video information about the first target object captured by that camera device together with other video information about the first target object.

For example, a first user uses augmented reality glasses and a second user holds a tablet computer, and the augmented reality glasses and the tablet computer establish a communication connection through a network device (cloud) to perform the same remote assistance task (for example, installation guidance for part A). The augmented reality glasses capture the first target object (such as part A on the operating platform) in real time, obtain video information related to part A, and send the video information to the network device. The network device receives the video information related to part A and determines the transfer matrix information of part A in each video frame according to the target tracking algorithm. The network device then returns the transfer matrix information to the augmented reality glasses and sends the transfer matrix information and the video information to the tablet computer. The tablet computer receives the transfer matrix information and the video information sent by the network device. When presenting the video information, it determines the position of part A in each video frame according to the transfer matrix information and superimposes tag information about part A at that position, such as operation instruction information (for example, installation guidance) for part A.

In some embodiments, the method further includes step S63 (not shown). In step S63, the second user equipment receives, from the network device, third transfer matrix information corresponding to a third target object in each video frame of the video information and, while presenting the video information, superimposes corresponding third tag information on the third target object in each video frame according to the third transfer matrix information, where the third tag information includes operation instruction information of the second user on the third target object, given through the second user equipment.

For example, the first user uses augmented reality glasses, the second user holds a tablet computer, and the third user holds a mobile terminal. The augmented reality glasses, the tablet computer, and the mobile terminal establish a communication connection through a network device (cloud) and are performing the same remote assistance task (for example, installation guidance for part A and part B on the workbench), with the augmented reality glasses responsible for capturing video information of the workbench. The augmented reality glasses capture the first target object (such as part A on the workbench) in real time to obtain video information related to part A; the video frames of this video information also contain part B. The augmented reality glasses then send the video information to the network device. The network device receives the video information, obtains the initial positions of part A and part B through image recognition, and uses the target tracking algorithm to calculate the first transfer matrix information and the third transfer matrix information, which correspond to part A and part B respectively in each video frame of the video information. The network device then sends the first transfer matrix information, the third transfer matrix information, and the video information to the tablet computer. The tablet computer receives them and, when presenting the video information, determines the position of part A in each video frame according to the first transfer matrix information and superimposes tag information about part A at that position, such as installation guidance for part A; it likewise determines the position of part B in each video frame according to the third transfer matrix information and superimposes tag information about part B at that position, such as installation guidance for part B. The tag information includes operation instruction information of the second user, such as installation guidance for each part, where the operation instruction information may be generated on the tablet computer, or may be generated by the network device according to the second user's operations uploaded by the tablet computer. The second user equipment may determine which object its current tag information applies to according to a selection operation of the second user.
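The distribution rule in the two three-device examples above (the tablet computer receives the transfer matrices for both parts, the mobile terminal only the one for part B) can be sketched as a small routing table. This is a minimal illustration, not the application's implementation: the device names, the `SUBSCRIPTIONS` table, the message layout, and the `route` function are all hypothetical.

```python
# Hypothetical routing table: which tracked targets each device is sent.
SUBSCRIPTIONS = {
    "tablet": ["part_A", "part_B"],   # second user equipment
    "mobile_terminal": ["part_B"],    # third user equipment
}


def route(frame_id, matrices, subscriptions=SUBSCRIPTIONS):
    """Build one outgoing message per device for a single video frame.

    `matrices` maps a target name to its transfer matrix for this frame.
    Each device only receives the matrices of the targets it subscribes to.
    """
    messages = {}
    for device, targets in subscriptions.items():
        messages[device] = {
            "frame_id": frame_id,
            "matrices": {t: matrices[t] for t in targets if t in matrices},
        }
    return messages


# Assumed 3x3 matrix form; the identity stands in for "no motion".
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
out = route(7, {"part_A": identity, "part_B": identity})
print(sorted(out["tablet"]["matrices"]))
print(sorted(out["mobile_terminal"]["matrices"]))
```

Keeping the subscription table separate from the tracking results means a device can be added to the task, or switched to another target, without touching the tracking code.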

FIG. 9 illustrates a method for remote assistance based on augmented reality at a network device according to still another aspect of the present application, where the method includes steps S71, S72, S73, and S74. In step S71, the network device receives video information about a target object sent by the first user equipment, where the video information is captured by the camera device in the first user equipment. In step S72, the network device performs a target tracking operation on the target object in the video information to determine the transfer matrix information corresponding to the target object in each video frame of the video information. In step S73, the network device adds corresponding tag information to each video frame of the video information according to the transfer matrix information, where the tag information remains superimposed on the target object in each video frame and includes operation instruction information of the second user on the target object, sent by the corresponding second user equipment. In step S74, the network device sends the edited video information to the first user equipment and to the second user equipment that belongs to the same remote assistance task as the first user equipment.

For example, a first user uses augmented reality glasses and a second user holds a tablet computer, and the augmented reality glasses and the tablet computer establish a communication connection through a network device (cloud) to perform the same remote assistance task (for example, installation guidance for part A). The augmented reality glasses capture the first target object (such as part A on the operating platform) in real time, obtain video information related to part A, and send the video information to the network device. The network device receives the video information related to part A and determines the transfer matrix information of part A in each video frame of the video information according to the target tracking algorithm. The network device then uses the transfer matrix information to add the tag information corresponding to part A (such as an operation guide for part A) at the corresponding position in each video frame, and sends the edited video frames to the augmented reality glasses and the tablet computer. The augmented reality glasses receive and present the video information, in which the tag information is superimposed in real time at the corresponding position. The tag information includes operation instruction information of the second user, such as installation guidance for part A, where the operation instruction information may be generated on the tablet computer, or may be generated by the network device according to the second user's operations uploaded by the tablet computer. In the same way, the tablet computer receives and presents the video information, with the tag information about part A, such as installation guidance for part A, superimposed at the corresponding position in the video information.

FIG. 10 illustrates a method for remote assistance based on augmented reality according to an aspect of the present application, wherein the method includes:

FIG. 11 illustrates a method for remote assistance based on augmented reality according to another aspect of the present application, wherein the method includes:

FIG. 12 illustrates a method for remote assistance based on augmented reality according to another aspect of the present application, where the method includes:

FIG. 13 shows a first user equipment for remote assistance based on augmented reality according to an aspect of the present application, where the device includes a real-time shooting module 11, a target tracking module 12, and an overlay display module 13. The real-time shooting module 11 is configured to capture video information about a target object in real time through a camera device in the first user equipment; the target tracking module 12 is configured to perform a target tracking operation on the target object in the video information to determine the transfer matrix information corresponding to the target object in each video frame of the video information; the overlay display module 13 is configured to superimpose corresponding tag information on the target object according to the transfer matrix information, where the tag information includes operation instruction information of the second user on the target object, sent by the corresponding second user equipment.

Specifically, the real-time shooting module 11 is configured to capture video information about a target object in real time through a camera device in the first user equipment. For example, the target object may be the object corresponding to image information that the first user marks in a video frame, the object corresponding to image information in a video frame marked by the second user and received by the first user equipment, or an object that the first user equipment determines from image information input by the first user. The first user equipment includes a camera device, through which it captures video information about the target object in real time.

The target tracking module 12 is configured to determine the transfer matrix information corresponding to the target object in each video frame of the video information by performing a target tracking operation on the target object in the video information. The transfer matrix information describes the correspondence between the target object in the current video frame and in the previous video frame, as obtained by the first user equipment according to a target tracking algorithm. The target tracking algorithm includes, but is not limited to, the kernelized correlation filter (KCF) tracking algorithm, dense optical flow tracking, sparse optical flow tracking, Kalman filtering tracking, and multiple instance learning tracking. Here the KCF algorithm is taken as an example. The KCF algorithm solves the tracking problem by learning a kernelized regularized least squares (KRLS) linear classifier. The movement of the target in the scene can be regarded as the vector sum of its movements in the horizontal and vertical directions. The KCF algorithm introduces the concept of dense sampling and regards all samples as cyclic shifts of a reference sample. In this case the Gaussian kernel function is highly structured, that is, the kernel matrix is a circulant matrix. According to the principle of cyclic convolution, every product with the circulant matrix can be converted into a convolution with the first row vector of the matrix. With the help of the discrete Fourier transform (DFT), the spatial convolution can then be computed quickly as an element-wise product in the frequency domain.
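The circulant-matrix property that KCF relies on can be checked numerically. The sketch below is only an illustration of that one identity, not a KCF tracker: it builds a circulant matrix from a real-valued first row, multiplies it by a vector directly, and then obtains the same result as an element-wise product in the frequency domain. With the first-row convention used here, the spectrum of the first row enters as its complex conjugate. The naive `dft` helper and all function names are assumptions made for illustration.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (the inverse includes 1/n)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for f in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * f * t / n)
                for t in range(n))
        out.append(s / n if inverse else s)
    return out


def circulant(first_row):
    """Matrix whose i-th row is the first row cyclically shifted right by i."""
    n = len(first_row)
    return [[first_row[(j - i) % n] for j in range(n)] for i in range(n)]


def matvec(m, v):
    """Plain matrix-vector product, O(n^2)."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def circulant_matvec_dft(first_row, v):
    """Same product via the DFT; assumes a real-valued first_row."""
    c_hat = dft(first_row)
    v_hat = dft(v)
    prod = [a.conjugate() * b for a, b in zip(c_hat, v_hat)]
    return [z.real for z in dft(prod, inverse=True)]


c = [1.0, 2.0, 0.0, -1.0]
v = [3.0, 0.5, -2.0, 4.0]
direct = matvec(circulant(c), v)
fast = circulant_matvec_dft(c, v)
print(direct)
print(fast)
```

A practical tracker would use a fast Fourier transform on two-dimensional image patches rather than this quadratic-time helper; the identity being exploited is the same.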

The overlay display module 13 is configured to superimpose corresponding tag information on the target object according to the transfer matrix information, where the tag information includes operation instruction information of the second user on the target object, sent by the corresponding second user equipment. The tag information includes operation instruction information about the target object, such as virtual operation information on the target object, that is sent by the second user equipment and received by the first user equipment. For example, when the first user equipment receives the operation instruction information about the target object sent by the second user equipment, the first user equipment tracks the target according to the transfer matrix information and superimposes the tag information at the position of the corresponding target object. For augmented reality glasses, the tag information is superimposed at the corresponding position on the lens of the glasses, and the position is calculated by the augmented reality glasses or the network device according to the target tracking algorithm; for a PC, tablet computer, or mobile terminal, the tag information is superimposed at the position corresponding to the target object in the current video frame. The first user equipment and the second user equipment may establish a communication connection directly, or may establish it through a network device. The following embodiments take a direct communication connection between the first user equipment and the second user equipment as an example; those skilled in the art should understand that these embodiments are also applicable to other connection modes, such as establishing the communication connection through a network device.
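How a tag can follow the target from frame to frame may be sketched as follows. The application does not specify the numerical form of the transfer matrix information; this sketch assumes, purely for illustration, that each frame carries a 3x3 homogeneous 2D transform mapping previous-frame pixel coordinates to current-frame pixel coordinates. The function names and the sample matrices are hypothetical.

```python
def matmul3(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply(m, point):
    """Map a 2D point through a 3x3 homogeneous transform."""
    x, y = point
    xh = m[0][0] * x + m[0][1] * y + m[0][2]
    yh = m[1][0] * x + m[1][1] * y + m[1][2]
    w = m[2][0] * x + m[2][1] * y + m[2][2]
    return (xh / w, yh / w)


def track_tag(anchor, per_frame_matrices):
    """Return the tag position in each frame by chaining frame-to-frame matrices.

    `anchor` is where the second user placed the tag in the initial frame.
    """
    total = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    positions = []
    for m in per_frame_matrices:
        total = matmul3(m, total)  # the newest motion is applied last
        positions.append(apply(total, anchor))
    return positions


# Assumed motion: the target shifts 5 px right, then 3 px down.
shift_right = [[1, 0, 5], [0, 1, 0], [0, 0, 1]]
shift_down = [[1, 0, 0], [0, 1, 3], [0, 0, 1]]
positions = track_tag((100.0, 40.0), [shift_right, shift_down])
print(positions)
```

Accumulating the matrices rather than re-detecting the tag position lets the device redraw the tag from a single stored anchor point; in practice, drift from chained estimates would need to be corrected by periodic re-initialization.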

In some embodiments, the device further includes a video sending module 14 (not shown). The video sending module 14 is configured to send the video information to the second user equipment. For example, the first user equipment captures the current video information about the target object in real time and sends it to the second user equipment, either directly or through the network device. The video information includes image information collected by the first user equipment through the camera device, and may also include audio information collected by the first user equipment through a microphone device; the image and audio information are combined into a video/audio stream by a compression algorithm. The first user equipment transmits the compressed video/audio stream to the second user equipment through a network transmission protocol such as the User Datagram Protocol (UDP), the Transmission Control Protocol (TCP), or the Real-time Transport Protocol (RTP).

In some embodiments, the video sending module 14 is configured to send the video information and the transfer matrix information to the second user equipment. For example, when the first user equipment sends the video information to the second user equipment, it also sends the transfer matrix information obtained from the target tracking operation, so that the second user equipment can track the target object while presenting the video information.
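Sending the transfer matrix information together with each video frame could look like the sketch below. The wire format is entirely hypothetical (it is not defined by the application and is not part of UDP, TCP, or RTP): a fixed header with a frame number, the payload length, and an assumed 3x3 matrix in row-major order, followed by the compressed frame bytes.

```python
import struct

# Hypothetical header: frame id (uint32), payload length (uint32),
# then nine float64 matrix entries, all in network byte order.
HEADER = struct.Struct("!II9d")


def pack_frame(frame_id, matrix, payload):
    """Serialize one frame and its transfer matrix into a single packet."""
    flat = [v for row in matrix for v in row]
    return HEADER.pack(frame_id, len(payload), *flat) + payload


def unpack_frame(data):
    """Recover (frame_id, matrix, payload) from a packet."""
    fields = HEADER.unpack_from(data)
    frame_id, length = fields[0], fields[1]
    flat = fields[2:]
    matrix = [list(flat[i:i + 3]) for i in range(0, 9, 3)]
    payload = data[HEADER.size:HEADER.size + length]
    return frame_id, matrix, payload


m = [[1.0, 0.0, 12.5], [0.0, 1.0, -3.0], [0.0, 0.0, 1.0]]
packet = pack_frame(42, m, b"encoded-frame-bytes")
print(unpack_frame(packet))
```

Carrying the matrix in the same packet as the frame keeps the two synchronized on the receiving side, so the second user equipment never applies a matrix to the wrong frame.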

In some embodiments, the device further includes an operation receiving module 15 (not shown). The operation receiving module 15 is configured to receive continuing operation instruction information on the target object, sent by the second user equipment based on the video information. For example, the second user equipment generates the continuing operation instruction information according to the second user's further operations on the target object (such as drawing a line segment or a circle), or by recognizing the second user's gestures through gesture recognition. The second user equipment then sends the continuing operation instruction information to the first user equipment to assist the first user in continuing to operate on the target object.

For another example, the augmented reality glasses send the video information about the target object, captured in real time, to the tablet computer, together with the transfer matrix information corresponding to each video frame of the video information, and the tablet computer receives and presents the video information. The tablet computer then determines the position of the target object in each video frame according to the received transfer matrix information. In some embodiments, the tablet computer highlights the target object in the video frame by means of line segments, circles, or a local increase in brightness. The second user instructs the first user how to handle the part by drawing marks on the tablet computer or by making gestures within the shooting range of the tablet computer's camera. The tablet computer takes the second user's marks as the operation instruction information, or performs gesture recognition on the captured gestures and takes the recognized gestures as the operation instruction information. The tablet computer then sends the continuing operation instruction information to the augmented reality glasses. The augmented reality glasses receive the continuing operation instruction information and superimpose it at the corresponding position.
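One possible way to represent the continuing operation instruction information exchanged between the two devices is a small JSON message. The field names (`frame_id`, `kind`, `points`, `text`) and the set of allowed kinds are assumptions chosen for illustration; the application does not define a message format.

```python
import json

# Hypothetical set of instruction kinds, following the examples in the text.
ALLOWED_KINDS = {"line", "circle", "gesture"}


def make_instruction(frame_id, kind, points, text=""):
    """Serialize a mark or recognized gesture made by the second user."""
    return json.dumps({
        "frame_id": frame_id,   # frame in which the mark was drawn
        "kind": kind,           # one of ALLOWED_KINDS
        "points": points,       # pixel coordinates in that frame
        "text": text,           # optional note for the first user
    })


def parse_instruction(raw):
    """Decode and validate a message on the first user equipment."""
    msg = json.loads(raw)
    if msg["kind"] not in ALLOWED_KINDS:
        raise ValueError(f"unknown instruction kind: {msg['kind']}")
    return msg


raw = make_instruction(12, "circle", [[320, 240]], "tighten this bolt")
msg = parse_instruction(raw)
print(msg["kind"], msg["points"])
```

Recording the frame number with each mark lets the receiving device combine it with the transfer matrix information for later frames, so the mark stays attached to the target object.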

In some embodiments, the device further includes a camera control module 16 (not shown). The camera control module 16 is configured to receive camera control instruction information of the second user for the camera device, sent by the second user equipment, adjust the camera parameter information of the camera device according to the camera control instruction information, capture video information about the target object in real time through the adjusted camera device, and send that video information to the second user equipment. The camera control instruction information includes instruction information for adjusting hardware parameters of the camera device of the first user equipment. The camera parameter information includes, but is not limited to, resolution, pixel depth, maximum frame rate, exposure mode and shutter speed, pixel size, and spectral response characteristics. For example, the first user equipment receives camera control instruction information, sent by the second user equipment, in which the second user adjusts the camera device of the first user equipment; it adjusts the camera parameter information of the camera device accordingly, captures video information of the current target object in real time through the adjusted camera device, and sends the video information to the second user equipment.
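Applying a camera control instruction on the first user equipment can be sketched as a validated update of a parameter record. The `CameraParams` fields, their defaults, and the `apply_control` helper are hypothetical; a real device would forward the accepted values to its camera driver.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CameraParams:
    """Assumed subset of the adjustable camera parameters."""
    width: int = 1280
    height: int = 720
    frame_rate: int = 30
    exposure_mode: str = "auto"


ADJUSTABLE = {"width", "height", "frame_rate", "exposure_mode"}


def apply_control(params, instruction):
    """Return new parameters with the requested fields changed.

    `instruction` is a dict sent by the second user equipment; unknown
    keys are rejected instead of being silently ignored.
    """
    unknown = set(instruction) - ADJUSTABLE
    if unknown:
        raise ValueError(f"unsupported camera parameters: {sorted(unknown)}")
    return replace(params, **instruction)


current = CameraParams()
adjusted = apply_control(current, {"frame_rate": 15, "exposure_mode": "manual"})
print(adjusted)
```

Returning a new immutable record rather than mutating the current one makes it easy to fall back to the previous settings if the camera rejects the change.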

In some embodiments, the tag information further includes auxiliary tag information that the first user marks on the target object through the first user equipment. The auxiliary tag information is based on operations of the first user collected by the first user equipment, and includes marks on the target object (such as drawn line segments or circles) or feedback on the tag information sent by the second user equipment, such as questions or circled text. For example, the first user equipment generates the auxiliary tag information about the target object according to the operation of the first user and sends it to the second user equipment for further remote interaction.

In some embodiments, the overlay display module 13 is configured to generate rendering tag information according to one or more pieces of marked position information in the operation instruction information, and to superimpose the rendering tag information on the target object according to the transfer matrix information. The rendering tag information includes, for the one or more marked positions, highlight projections or marks such as lines or circles. For example, the first user equipment generates the rendering tag information according to one or more positions marked, in the operation instruction information, on a paper document under discussion, determines the position of the paper document in each video frame of the video information based on the transfer matrix information, thereby determines the position of the rendering tag in each video frame, and superimposes the rendering tag information at that position.

In some embodiments, the device further includes a tag acquisition module 17 (not shown). The tag acquisition module 17 is configured to capture image information about the target object through the camera device in the first user equipment, send the image information to the corresponding second user equipment, receive tag information about the target object, where the tag information includes operation instruction information of the second user on the target object in the image information, sent by the second user equipment, and superimpose the tag information on the target object; the real-time shooting module 11 is then configured to capture video information about the target object in real time through the camera device. For example, the first user equipment captures image information about the target object through the camera device and sends it to the second user equipment. The second user equipment receives and presents the image information so that the second user can operate on the target object. Based on the operation of the second user, the second user equipment generates tag information corresponding to the operation instruction information and sends it to the first user equipment. The first user equipment receives the tag information and superimposes it at the position corresponding to the target object in the image. The first user equipment then collects a video stream about the target object through the camera device and uses a target tracking algorithm to display the tag information in each video frame of the video stream.

FIG. 14 illustrates a second user equipment for remote assistance based on augmented reality according to another aspect of the present application, where the device includes a video receiving module 21 and a video presentation module 22. The video receiving module 21 is configured to receive video information about a target object that is captured in real time by a camera device in the first user equipment and sent by the corresponding first user equipment; the video presentation module 22 is configured to present the video information and to superimpose corresponding tag information on the target object in each video frame of the video information, where the tag information includes operation instruction information of the second user on the target object, given through the second user equipment. For example, the second user equipment receives and presents image information or video information about the target object sent by the first user equipment, and collects operations of the second user to generate the corresponding tag information. The second user equipment then continues to receive and present video information about the target object sent by the first user equipment, and superimposes the previously determined tag information in the presented video.

For example, a first user uses augmented reality glasses, a second user holds a tablet computer, and the augmented reality glasses establish a communication connection with the tablet computer. The augmented reality glasses and the tablet computer have already exchanged a video stream or image of the target object, and the augmented reality glasses have received the tablet computer's operation instruction information about the target object in a previous video frame. For example, the target object is a part on the operating platform. The target object may be determined by the first user equipment based on a selection operation of the first user (such as drawing a circle), may be determined from a selection operation of the second user that the first user equipment receives from the second user equipment, or may be determined by the first user equipment by recognizing initial image information of the target object; the corresponding operation instruction information includes virtual operation information obtained by the second user equipment by recognizing the second user's gestures for operating the part. The augmented reality glasses collect the current video information about the target object through the camera in real time, and then use the target tracking algorithm to calculate the transfer matrix information of the target object in the current video frame relative to the target object in the previous video frame. The augmented reality glasses then determine the position of the target object in the current video frame according to the transfer matrix information and superimpose the corresponding tag information at that position; for example, the operation instruction information corresponding to the second user's gesture is superimposed at the position of the part on the operating platform in the current video frame. At the same time, the augmented reality glasses send the video information to the tablet computer, and the tablet computer receives and presents the video information, displaying the previous tag information at the corresponding position in the video information. In other embodiments, the augmented reality glasses also send auxiliary tag information to the tablet computer, where the auxiliary tag information is based on operations of the first user collected by the augmented reality glasses and includes marks on the target object (such as line segments or circles) or feedback on the tag information sent by the tablet computer, such as questions or circled text; the tablet computer receives the auxiliary tag information and, while presenting the video information, superimposes it at the position corresponding to the target object.

In some embodiments, the device further includes a tracking execution module 23 (not shown). The tracking execution module 23 is configured to perform a target tracking operation on the target object in the video information; the video presentation module 22 is configured to present the video information and, according to the result information of the target tracking operation, superimpose the corresponding tag information on the target object in each video frame of the video information, where the tag information includes operation instruction information of the second user on the target object, given through the second user equipment. For example, the second user equipment receives video information about the target object sent by the first user equipment and performs a target tracking operation on the target object in the video information according to template information of the target object, to determine the position of the target object in each video frame of the video information. The template information may be sent by the first user equipment to the second user equipment, or may be obtained by the second user equipment from a selection made by the second user in the initial video frame or by importing the template information. When the second user equipment then presents the video information, it superimposes the tag information at the position of the target object according to the result information of the target tracking, where the tag information may have been generated by the second user equipment from the second user's operation guidance on the target object in the initial video frame or image information, or may be generated from the second user's operation guidance on the target object in video information sent later.

For example, the tablet computer receives the video information sent by the augmented reality glasses, performs target tracking on the part in each video frame of the video information according to the template information of the part on the operating platform, and obtains the position information of the part in each video frame. The template of the part may be imported by the second user, selected in the initialization frame, or sent by the augmented reality glasses. The tablet computer presents the video information and generates corresponding tag information according to the second user's installation guidance for the part (for example, a circle, an arrow pointing to the installation position, or a preset installation operation recognized from a gesture). While presenting the video information, the tablet computer superimposes the tag information in real time on subsequent video frames according to the position information of the part in each video frame.

In some embodiments, the video receiving module 21 is configured to receive video information about a target object that is captured in real time through a camera device in the first user equipment and sent by the corresponding first user equipment, together with the transfer matrix information corresponding to the target object in each video frame of the video information; the video presentation module 22 is configured to present the video information and, according to the transfer matrix information corresponding to the target object in each video frame, superimpose and display the corresponding tag information on the target object in each video frame of the video information, where the tag information includes operation instruction information of the second user on the target object through the second user equipment. For example, when the first user equipment sends video information to the second user equipment, it also sends the transfer matrix information obtained from its target tracking operation, so that the second user equipment can follow the target object while presenting the video information. The second user equipment receives the video information and the transfer matrix information, presents the video information, and superimposes the tag information at the corresponding position in the video information according to the transfer matrix information.
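The use of per-frame transfer matrix information to position a tag can be illustrated with a minimal sketch. It assumes, purely for illustration, that the transfer matrix is a 3x3 homogeneous transform (for example a planar homography) mapping tag coordinates defined on the template or initial frame to pixel coordinates in the current frame; this section of the application does not fix the form of the matrix, and the names `project_point`, `place_tag`, and `frame_transfer` are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Matrix3 = Sequence[Sequence[float]]


def project_point(h: Matrix3, p: Point) -> Point:
    """Map a 2D point through a 3x3 homogeneous transform (assumed matrix form)."""
    x, y = p
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this transform")
    return (u / w, v / w)


def place_tag(h: Matrix3, tag_outline: List[Point]) -> List[Point]:
    """Return the tag outline expressed in the current frame's pixel coordinates."""
    return [project_point(h, p) for p in tag_outline]


# Hypothetical matrix for one frame: the target moved 40 px right and 10 px down.
frame_transfer = [[1.0, 0.0, 40.0], [0.0, 1.0, 10.0], [0.0, 0.0, 1.0]]
tag_outline = [(100.0, 50.0), (120.0, 50.0)]  # tag drawn on the initial frame
placed = place_tag(frame_transfer, tag_outline)
```

Under this assumption, the second user equipment would repeat `place_tag` with the matrix received for each video frame, so the tag stays attached to the target object as it moves in the picture.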

In some embodiments, the device further includes an operation acquisition module 24 (not shown). The operation acquisition module 24 is configured to acquire continued operation instruction information of the second user on the target object based on the video information, and to send the continued operation instruction information to the first user equipment. For example, the second user equipment generates the continued operation instruction information according to the second user's further operations on the target object (such as drawing a line segment or a circle), or recognizes the second user's gesture operations through gesture recognition. Subsequently, the second user equipment sends the continued operation instruction information to the first user equipment to assist the first user in continuing to operate on the target object.

In some embodiments, the device further includes a camera control module 25 (not shown). The camera control module 25 is configured to: generate camera control instruction information of the second user for the camera device according to a camera control operation performed by the second user through the second user equipment, where the camera control instruction information is used to adjust camera parameter information of the camera device; send the camera control instruction information to the first user equipment; and receive the video information that the first user equipment captures through the adjusted camera device. For example, the second user equipment receives the video information, and the second user adjusts its presentation, such as enlarging the area near the target object. The second user equipment determines the corresponding camera control instruction information based on the second user's operation and then sends it to the first user equipment. The camera control instruction information includes instruction information for adjusting hardware parameters of the camera device of the first user equipment. The camera parameter information includes, but is not limited to, resolution, pixel depth, maximum frame rate, exposure mode and shutter speed, pixel size, and spectral response characteristics.
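One possible shape for the camera control instruction information is sketched below. The message layout, the JSON encoding, and the field names (`focal_length_mm`, `frame_rate`, `exposure_mode`) are assumptions made for illustration; the application only states that the instruction carries parameters for adjusting the camera device of the first user equipment.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, Optional


@dataclass
class CameraControlInstruction:
    """Hypothetical message; the field names are illustrative only."""
    focal_length_mm: Optional[float] = None
    frame_rate: Optional[int] = None
    exposure_mode: Optional[str] = None

    def encode(self) -> str:
        # Send only the parameters the second user actually changed.
        changed = {k: v for k, v in asdict(self).items() if v is not None}
        return json.dumps(changed, sort_keys=True)


def apply_instruction(current: Dict[str, object], payload: str) -> Dict[str, object]:
    """On the first user equipment: merge requested changes into the current settings."""
    updated = dict(current)
    updated.update(json.loads(payload))
    return updated


# Second user equipment side: the user zooms in, so a longer focal length is requested.
payload = CameraControlInstruction(focal_length_mm=50.0).encode()
# First user equipment side: the other settings are left untouched.
settings = apply_instruction({"focal_length_mm": 24.0, "frame_rate": 30}, payload)
```

Sending only the changed parameters keeps the instruction small and leaves every other setting of the camera device as it was, which fits the zoom example that follows.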

For example, as shown in FIG. 3, Figure A is the real-time video information received by the second user, where the target object is the mouse pad on the table in the picture. The second user wants to observe the target object in more detail and either operates through the settings icon in the upper right corner or zooms in directly by spreading two fingers on the screen. Based on the operation of the second user, the tablet computer generates corresponding camera control instruction information for focusing on the target object and sends it to the augmented reality glasses. The augmented reality glasses receive the camera control instruction information, adjust the related camera parameters of the camera device (such as resolution and focal length), capture the adjusted video information about the target object, and send the video information to the tablet computer. Figure B shows the enlarged video information about the target object received and presented by the tablet computer.

In some embodiments, the device further includes a tag acquisition module 26 (not shown). The tag acquisition module 26 is configured to: receive and present image information about the target object that the first user equipment captures through the camera device in the first user equipment; acquire the second user's operation instruction information on the target object in the image information; and send the operation instruction information to the first user equipment, the operation instruction information being superimposed and displayed on the target object in the image information. In this case, in step S21, the second user equipment receives video information about the target object that the first user equipment captures in real time through the camera device. For example, the first user equipment captures image information about the target object through the camera device and sends the image information to the second user equipment. The second user equipment receives and presents the image information so that the second user can give operation guidance on the target object. Based on the operation of the second user, the second user equipment generates tag information corresponding to the operation instruction information and sends the tag information to the first user equipment. The first user equipment receives the tag information and superimposes it at the position corresponding to the target object in the image. Subsequently, the first user equipment captures a video stream about the target object through the camera device and, using a target tracking algorithm, displays the tag information in each video frame of the video stream.

FIG. 15 illustrates a first user equipment for remote assistance based on augmented reality according to another aspect of the present application, where the device includes a real-time shooting module 31, a video sending module 32, a transfer matrix receiving module 33, and an overlay display module 34. The real-time shooting module 31 is configured to capture video information about a first target object in real time through a camera device in the first user equipment; the video sending module 32 is configured to send the video information to a corresponding network device; the transfer matrix receiving module 33 is configured to receive, from the network device, first transfer matrix information corresponding to the first target object in each video frame of the video information; and the overlay display module 34 is configured to superimpose and display corresponding first tag information on the first target object according to the first transfer matrix information, where the first tag information includes operation instruction information of a second user on the first target object, sent by the second user equipment. For example, the first user equipment and the second user equipment establish a communication connection through a network device. The first user equipment sends the captured video information about the first target object to the network device. The network device performs target tracking on the first target object according to the video information, determines the transfer matrix information of the first target object in each video frame of the video information, and sends the transfer matrix information to the first user equipment and the second user equipment. Subsequently, the first user equipment and the second user equipment superimpose and display the first tag information based on the transfer matrix information sent by the network device, where the first tag information includes operation instruction information of the second user on the first target object through the second user equipment.

FIG. 16 shows a network device for remote assistance based on augmented reality according to another aspect of the present application, where the device includes a video receiving module 41, a target tracking module 42, a first sending module 43, and a second sending module 44. The video receiving module 41 is configured to receive video information about the first target object sent by the first user equipment, where the video information is captured in real time by a camera device in the first user equipment; the target tracking module 42 is configured to determine first transfer matrix information corresponding to the first target object in each video frame of the video information by performing a target tracking operation on the first target object in the video information; the first sending module 43 is configured to send the first transfer matrix information to the first user equipment; and the second sending module 44 is configured to send the video information and the first transfer matrix information to a second user equipment that belongs to the same remote assistance task as the first user equipment. Here, the network device is a server with sufficient computing power that is mainly responsible for forwarding video, audio, and tag information data. The network device also runs some computer vision and image processing algorithms. For example, when video/audio information reaches the network device, the network device tracks the target object (such as the first target object) using a tracking algorithm and then returns the tracking result information to the user equipment.
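The division of work among modules 41 to 44 can be sketched as a single per-frame relay step. This is a simplified illustration under stated assumptions: the tracker is passed in as a placeholder callable, the outbox dictionary stands in for real network connections, and the names `relay_frame`, `identity_tracker`, and the message kinds are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Matrix3 = List[List[float]]
Message = Tuple[str, int, object]  # (kind, frame_index, payload)


def relay_frame(
    frame: bytes,
    frame_index: int,
    track: Callable[[bytes], Matrix3],
    outbox: Dict[str, List[Message]],
) -> None:
    """Track the target in one received frame and queue the outgoing messages."""
    matrix = track(frame)  # role of the target tracking module
    # First sending module: the first user equipment already holds the video,
    # so it only needs the transfer matrix.
    outbox["first_user_equipment"].append(("transfer_matrix", frame_index, matrix))
    # Second sending module: the second user equipment needs both the video
    # frame and the matching transfer matrix.
    outbox["second_user_equipment"].append(("video_frame", frame_index, frame))
    outbox["second_user_equipment"].append(("transfer_matrix", frame_index, matrix))


def identity_tracker(_frame: bytes) -> Matrix3:
    # Placeholder: a real tracker would estimate this from the image content.
    return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]


outbox: Dict[str, List[Message]] = {
    "first_user_equipment": [],
    "second_user_equipment": [],
}
relay_frame(b"frame-0", 0, identity_tracker, outbox)
```

Tagging each message with a frame index is a design choice of this sketch, not something the application specifies; it lets a receiver pair each transfer matrix with the frame it was computed from.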

In some embodiments, the target tracking module 42 is configured to reconstruct video information of the first target object according to the video information and other video information of the first target object, and to perform a target tracking operation on the first target object in the reconstructed video information to determine first transfer matrix information corresponding to the first target object in each video frame of the video information. Here, the network device is mainly responsible for forwarding data such as video, audio, and tag information, and also has some computer vision and image processing capabilities. When video/audio information is sent to the network device, the network device processes the video information using target tracking, target recognition, reconstruction, pose estimation, and computer graphics algorithms (such as virtual object rendering and point cloud processing, including stitching, down/oversampling, matching, and meshing) and returns the processed result information to the user equipment. For example, the network device reconstructs the video information uploaded by the first user and videos uploaded by other users to generate overall video information for the first target object, and then performs target tracking on the first target object in the reconstructed video information.

For example, a first user holds augmented reality glasses, a second user holds a tablet computer, and another user (such as a third user) holds a third user equipment (such as augmented reality glasses or a tablet computer). The augmented reality glasses, the third user equipment, and the tablet computer have established communication connections through the network device (cloud) and are performing the same remote assistance task (such as installation guidance for part A). The augmented reality glasses and the third user equipment are both used to capture video information related to part A: the augmented reality glasses mainly capture the left half of part A, and the third user equipment mainly captures the right half, with a certain degree of overlap. The augmented reality glasses shoot the first target object (such as part A on the operating platform) in real time, obtain first video information related to the left half of part A, and send the first video information to the network device. The third user equipment shoots part A in real time, obtains third video information related to the right half of part A, and sends the third video information to the network device. The network device receives the first video information and the third video information, obtains reconstructed video information containing the whole of part A from them through a computer vision algorithm, and determines, according to the target tracking algorithm, the transfer matrix information of part A in each video frame of the reconstructed video information. Subsequently, the network device returns the transfer matrix information and the reconstructed video information to the augmented reality glasses, the third user equipment, and the tablet computer. The third user equipment receives the transfer matrix information and the reconstructed video information and, while presenting the reconstructed video information, superimposes the corresponding tag information in real time at the corresponding position in the video according to the transfer matrix information. The tag information includes operation instruction information of the second user, such as installation guidance for part A; the operation instruction information may be generated on the tablet computer, or may be generated by the network device according to the second user's operation uploaded by the tablet computer. In other embodiments, the third user equipment uses a computer vision algorithm to calculate transfer matrix information that relates the position of the right half of part A in the reconstructed video information to the third video information. The third user equipment then presents the third video information and superimposes the corresponding tag information at the corresponding position.
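The last variant, in which the third user equipment relates the reconstructed video to its own third video, amounts to chaining two transforms. The sketch below assumes both are 3x3 homogeneous matrices and uses pure translations for readability; the matrix form, the numeric values, and the names `matmul3`, `tag_to_reconstructed`, and `reconstructed_to_third` are hypothetical.

```python
from typing import List, Tuple

Matrix3 = List[List[float]]


def matmul3(a: Matrix3, b: Matrix3) -> Matrix3:
    """Return a @ b for 3x3 matrices, i.e. apply b first and then a."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply(h: Matrix3, p: Tuple[float, float]) -> Tuple[float, float]:
    """Map a 2D point through a 3x3 homogeneous transform."""
    x, y = p
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


# Transfer matrix from the network device: tag coordinates -> reconstructed frame.
tag_to_reconstructed = [[1.0, 0.0, 250.0], [0.0, 1.0, 30.0], [0.0, 0.0, 1.0]]
# Computed locally (assumed): the third video shows the region of the
# reconstructed frame that starts 200 px from its left edge.
reconstructed_to_third = [[1.0, 0.0, -200.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

tag_to_third = matmul3(reconstructed_to_third, tag_to_reconstructed)
tag_position = apply(tag_to_third, (0.0, 0.0))
```

With the composed matrix, the third user equipment can draw the tag directly on its own third video without displaying the reconstructed video.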

In some embodiments, the device further includes a third sending module 45 (not shown). The third sending module 45 is configured to determine third transfer matrix information corresponding to a third target object in each video frame of the video information by performing a target tracking operation on the third target object in the video information, where the third target object and the first target object belong to the same remote assistance task, and to send the video information and the third transfer matrix information to the third user equipment that corresponds to the third target object in the remote assistance task; the second sending module 44 is configured to send the video information, the first transfer matrix information, and the third transfer matrix information to the second user equipment that belongs to the same remote assistance task as the first user equipment. The third user holds the third user equipment, which includes, but is not limited to, an augmented reality device, a tablet computer, a PC terminal, or a mobile terminal. The following embodiments are described using a mobile terminal as an example; those skilled in the art should understand that these embodiments also apply to other third user equipment such as augmented reality devices, tablet computers, and PC terminals.

FIG. 17 illustrates a third user equipment for remote assistance based on augmented reality according to another aspect of the present application, where the device includes a receiving module 51 and a presentation module 52. The receiving module 51 is configured to receive, from a corresponding network device, video information about a third target object and third transfer matrix information corresponding to the third target object in each video frame of the video information; the presentation module 52 is configured to present the video information and, according to the third transfer matrix information, superimpose and display corresponding third tag information on the third target object in each video frame of the video information, where the third tag information includes operation instruction information of the second user on the third target object through the second user equipment. The video information is captured in real time by a camera device in the first user equipment; the first user equipment, the third user equipment, and the second user equipment belong to the same remote assistance task, and the first user equipment and the third user equipment each receive remote assistance from the second user equipment.

FIG. 18 illustrates a second user equipment for remote assistance based on augmented reality according to another aspect of the present application, where the device includes a receiving module 61 and a presentation module 62. The receiving module 61 is configured to receive, from a corresponding network device, video information about a first target object and first transfer matrix information corresponding to the first target object in each video frame of the video information; the presentation module 62 is configured to present the video information and, according to the first transfer matrix information, superimpose and display corresponding first tag information on the first target object in each video frame of the video information, where the first tag information includes operation instruction information of a second user on the first target object through the second user equipment. The video information is either captured in real time by the camera device in a first user equipment that belongs to the same remote assistance task as the second user equipment, or reconstructed from the video information about the first target object captured in real time by that camera device and other video information of the first target object.

In some embodiments, the device further includes a third tag overlay module 63 (not shown). The third tag overlay module 63 is configured to receive, from the network device, third transfer matrix information corresponding to the third target object in each video frame of the video information and, while the video information is being presented, superimpose and display corresponding third tag information on the third target object in each video frame of the video information according to the third transfer matrix information, where the third tag information includes operation instruction information of the second user on the third target object through the second user equipment.

FIG. 19 illustrates a network device for remote assistance based on augmented reality according to another aspect of the present application, where the device includes a video receiving module 71, a target tracking module 72, a tag adding module 73, and a video sending module 74. The video receiving module 71 is configured to receive video information about a target object sent by a first user equipment, where the video information includes pictures captured by a camera device in the first user equipment; the target tracking module 72 is configured to perform a target tracking operation on the target object in the video information to determine transfer matrix information corresponding to the target object in each video frame of the video information; the tag adding module 73 is configured to add corresponding tag information to each video frame of the video information according to the transfer matrix information, where the tag information remains superimposed on the target object in each video frame of the video information and includes operation instruction information of a second user on the target object, sent by a corresponding second user equipment; and the video sending module 74 is configured to send the edited video information to the first user equipment and to the second user equipment that belongs to the same remote assistance task as the first user equipment.

For example, a first user holds augmented reality glasses and a second user holds a tablet computer. The augmented reality glasses and the tablet computer establish a communication connection through a network device (cloud) to perform the same remote assistance task (such as installation guidance for part A). The augmented reality glasses shoot the first target object (such as part A on the operating platform) in real time, obtain video information related to part A, and send the video information to the network device. The network device receives the video information and determines the transfer matrix information of part A in each video frame of the video information according to the target tracking algorithm. The network device then uses the transfer matrix information to add the tag information corresponding to part A (such as operation guidance for part A) at the corresponding position of each video frame, and sends the edited video frames to the augmented reality glasses and the tablet computer. The augmented reality glasses receive and present the video information, in which the corresponding tag information is superimposed in real time at the corresponding position. The tag information includes operation instruction information of the second user, such as installation guidance for part A; the operation instruction information may be generated on the tablet computer, or may be generated by the network device according to the second user's operation uploaded by the tablet computer. In the same way, the tablet computer receives and presents the video information, in which the tag information about part A, such as the installation guidance for part A, is superimposed at the corresponding position.
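In this arrangement the network device edits the frames itself, so the user equipment only has to play the video back. A minimal sketch of that server-side step follows, assuming a frame is a small grayscale pixel grid, the transfer matrix is a 3x3 homogeneous transform, and the tag is reduced to a single marker pixel; the function name `stamp_tag` and these simplifications are illustrative and not taken from the application.

```python
from typing import List, Sequence, Tuple

Frame = List[List[int]]  # grayscale pixels, indexed as frame[row][col]


def stamp_tag(
    frame: Frame,
    transfer: Sequence[Sequence[float]],
    anchor: Tuple[float, float],
    value: int = 255,
) -> Frame:
    """Return a copy of the frame with one marker pixel drawn at the tag position."""
    x, y = anchor
    w = transfer[2][0] * x + transfer[2][1] * y + transfer[2][2]
    col = round((transfer[0][0] * x + transfer[0][1] * y + transfer[0][2]) / w)
    row = round((transfer[1][0] * x + transfer[1][1] * y + transfer[1][2]) / w)
    edited = [line[:] for line in frame]  # leave the received frame untouched
    # Skip the tag when the target has moved out of the picture.
    if edited and 0 <= row < len(edited) and 0 <= col < len(edited[0]):
        edited[row][col] = value
    return edited


blank = [[0] * 4 for _ in range(3)]  # 3 rows x 4 columns
# Hypothetical matrix: the target moved 2 px right and 1 px down in this frame.
shift = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0], [0.0, 0.0, 1.0]]
edited = stamp_tag(blank, shift, (1.0, 0.0))
```

A real implementation would render the full tag graphics instead of one pixel, but the per-frame flow is the same: project the tag through that frame's transfer matrix, draw it, and send the edited frame to both devices.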

FIG. 20 shows a system for remote assistance based on augmented reality, where the system includes the first user equipment including the real-time shooting module, the target tracking module, and the overlay display module as described above, and the second user equipment including the video receiving module and the video presentation module as described above.

FIG. 21 shows a system for remote assistance based on augmented reality, where the system includes a first user equipment including the real-time shooting module, the video sending module, the transfer matrix receiving module, and the overlay display module as described above; a second user equipment including the receiving module and the presentation module as described above; and a network device including the video receiving module, the target tracking module, the first sending module, and the second sending module as described above.

FIG. 22 shows a system for remote assistance based on augmented reality, where the system includes a first user equipment including the real-time shooting module, the video sending module, the transfer matrix receiving module, and the overlay display module as described above; a second user equipment including the receiving module and the presentation module as described above; a third user equipment including the receiving module and the presentation module as described above; and a network device including the video receiving module, the target tracking module, the first sending module, and the second sending module as described above.

The present application also provides a computer-readable storage medium storing computer code; when the computer code is executed, the method according to any one of the preceding items is executed.

The present application also provides a computer program product. When the computer program product is executed by a computer device, the method according to any one of the foregoing is executed.

This application also provides a computer device, where the computer device includes:

One or more processors;

Memory for storing one or more computer programs;

When the one or more computer programs are executed by the one or more processors, the one or more processors are caused to implement the method according to any one of the preceding items.

FIG. 23 illustrates an exemplary system that can be used to implement various embodiments described in this application.

As shown in FIG. 23, in some embodiments, the system 300 can serve as a device for remote assistance based on augmented reality in any of the embodiments. In some embodiments, the system 300 may include one or more computer-readable media (e.g., system memory 315 or NVM/storage device 320) having instructions, and one or more processors (e.g., processor(s) 305) coupled to the one or more computer-readable media and configured to execute the instructions to implement the modules and thereby perform the actions described in this application.

For one embodiment, the system control module 310 may include any suitable interface controller to provide any suitable interface to at least one of the processor(s) 305 and/or to any suitable device or component in communication with the system control module 310.

The system control module 310 may include a memory controller module 330 to provide an interface to the system memory 315. The memory controller module 330 may be a hardware module, a software module, and/or a firmware module.

The system memory 315 may be used, for example, to load and store data and/or instructions for the system 300. For one embodiment, the system memory 315 may include any suitable volatile memory, such as a suitable DRAM. In some embodiments, the system memory 315 may include double data rate type four synchronous dynamic random-access memory (DDR4 SDRAM).

For one embodiment, the system control module 310 may include one or more input/output (I/O) controllers to provide an interface to the NVM/storage device 320 and the communication interface(s) 325.

For example, the NVM/storage device 320 may be used to store data and/or instructions. The NVM/storage device 320 may include any suitable non-volatile memory (e.g., flash memory) and/or any suitable non-volatile storage device (e.g., one or more hard disk drives (HDD), one or more compact disc (CD) drives, and/or one or more digital versatile disc (DVD) drives).

The NVM/storage device 320 may include storage resources that are physically part of the device on which the system 300 is installed, or it may be accessible by the device without being part of the device. For example, the NVM/storage device 320 may be accessed over a network via the communication interface(s) 325.

The communication interface(s) 325 may provide an interface for the system 300 to communicate over one or more networks and/or with any other suitable device. The system 300 may communicate wirelessly with one or more components of a wireless network in accordance with any of one or more wireless network standards and/or protocols.

For one embodiment, at least one of the processor(s) 305 may be packaged together with the logic of one or more controllers (e.g., the memory controller module 330) of the system control module 310. For one embodiment, at least one of the processor(s) 305 may be packaged together with the logic of one or more controllers of the system control module 310 to form a system-in-package (SiP). For one embodiment, at least one of the processor(s) 305 may be integrated on the same die with the logic of one or more controllers of the system control module 310. For one embodiment, at least one of the processor(s) 305 may be integrated on the same die with the logic of one or more controllers of the system control module 310 to form a system-on-chip (SoC).

In various embodiments, the system 300 may be, but is not limited to, a server, a workstation, a desktop computing device, or a mobile computing device (e.g., a laptop computing device, a handheld computing device, a tablet computer, or a netbook). In various embodiments, the system 300 may have more or fewer components and/or different architectures. For example, in some embodiments, the system 300 includes one or more cameras, keyboards, liquid crystal display (LCD) screens (including touch screen displays), non-volatile memory ports, multiple antennas, graphics chips, application-specific integrated circuits (ASIC), and speakers.

It should be noted that this application may be implemented in software and/or a combination of software and hardware; for example, it may be implemented using an application-specific integrated circuit (ASIC), a general-purpose computer, or any other similar hardware device. In one embodiment, the software program of the present application may be executed by a processor to implement the steps or functions described above. Likewise, the software program of the present application (including related data structures) can be stored in a computer-readable recording medium, such as RAM, a magnetic or optical drive, a floppy disk, or a similar device. In addition, some steps or functions of this application may be implemented in hardware, for example, as a circuit that cooperates with a processor to perform each step or function.

In addition, a part of the application may be applied as a computer program product, such as computer program instructions which, when executed by a computer, may invoke or provide the method and/or technical solution according to the application through the operation of the computer. Those skilled in the art should understand that the forms in which computer program instructions exist in computer-readable media include, but are not limited to, source files, executable files, and installation package files. Accordingly, the ways in which computer program instructions are executed by a computer include, but are not limited to: the computer directly executes the instructions; the computer compiles the instructions and then executes the corresponding compiled program; the computer reads and executes the instructions; or the computer reads and installs the instructions and then executes the corresponding installed program. Here, the computer-readable medium can be any available computer-readable storage medium or communication medium that can be accessed by a computer.

Communication media include media whereby communication signals containing, for example, computer-readable instructions, data structures, program modules, or other data are transmitted from one system to another. Communication media can include conductive transmission media, such as cables and wires (e.g., fiber optics, coaxial, etc.), and wireless (non-conductive transmission) media that can propagate energy waves, such as sound, electromagnetic, RF, microwave, and infrared. Computer-readable instructions, data structures, program modules, or other data may be embodied, for example, as a modulated data signal in a wireless medium, such as a carrier wave or a similar mechanism embodied as part of a spread spectrum technology. The term "modulated data signal" refers to a signal one or more of whose characteristics are altered or set in such a manner as to encode information in the signal. Modulation can use analog, digital, or hybrid modulation techniques.

By way of example, and not limitation, computer-readable storage media may include volatile and non-volatile, removable and non-removable media. For example, computer-readable storage media include, but are not limited to, volatile memory such as random access memory (RAM, DRAM, SRAM); non-volatile memory such as flash memory, various read-only memories (ROM, PROM, EPROM, EEPROM), and magnetic and ferromagnetic/ferroelectric memory (MRAM, FeRAM); magnetic and optical storage devices (hard disk, tape, CD, DVD); or other media, now known or developed in the future, that can store computer-readable information/data for use by a computer system.

It is obvious to a person skilled in the art that the present application is not limited to the details of the above exemplary embodiments, and that the present application can be implemented in other specific forms without departing from the spirit or basic features of the application. Therefore, the embodiments are to be regarded as exemplary and non-limiting in every respect. The scope of the present application is defined by the appended claims rather than by the above description, and all changes that fall within the meaning and scope of equivalents of the claims are therefore intended to be included in the present application. Any reference signs in the claims should not be construed as limiting the claims involved. In addition, it is obvious that the word "comprising" does not exclude other units or steps, and that the singular does not exclude the plural. Words such as "first" and "second" are used to indicate names and do not indicate any particular order.

Claims

A method for remote assistance based on augmented reality on a first user equipment side, wherein the method includes:

Shooting video information about a target object in real time through a camera device in the first user equipment;

Determining transfer matrix information corresponding to the target object in each video frame of the video information by performing a target tracking operation on the target object in the video information;

Superimposing and displaying corresponding mark information on the target object according to the transfer matrix information, wherein the mark information includes operation instruction information of a second user on the target object, sent by a corresponding second user equipment.
The method of claim 1, further comprising:

Sending the video information to the second user equipment.
The method according to claim 2, wherein the sending the video information to the second user equipment comprises:

Sending the video information and the transfer matrix information to the second user equipment.
The method according to claim 2 or 3, wherein the method further comprises:

Receiving, from the second user equipment, continued operation instruction information of the second user on the target object based on the video information.
The method according to any one of claims 2 to 4, further comprising:

Receiving imaging control instruction information of the second user for the camera device, sent by the second user equipment;

Adjusting imaging parameter information of the camera device according to the imaging control instruction information;

Shooting video information about the target object in real time through the adjusted camera device;

Sending the video information shot by the adjusted camera device to the second user equipment.
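As an illustration only, the receive-and-adjust steps above could be sketched as follows. The claim does not say which imaging parameters are adjustable, so the fields `zoom`, `exposure_ev`, and `torch_on`, their ranges, and the dictionary form of the instruction are all hypothetical assumptions, not part of the application.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ImagingParams:
    """Hypothetical imaging parameter information of the camera device."""
    zoom: float = 1.0
    exposure_ev: float = 0.0
    torch_on: bool = False


# Assumed hardware limits; a real camera would report its own ranges.
ZOOM_RANGE = (1.0, 8.0)
EV_RANGE = (-2.0, 2.0)


def clamp(value: float, low: float, high: float) -> float:
    """Keep a requested value inside the supported range."""
    return max(low, min(high, value))


def apply_control(params: ImagingParams, instruction: dict) -> ImagingParams:
    """Return adjusted parameters; keys not present in the instruction are kept."""
    updates = {}
    if "zoom" in instruction:
        updates["zoom"] = clamp(float(instruction["zoom"]), *ZOOM_RANGE)
    if "exposure_ev" in instruction:
        updates["exposure_ev"] = clamp(float(instruction["exposure_ev"]), *EV_RANGE)
    if "torch_on" in instruction:
        updates["torch_on"] = bool(instruction["torch_on"])
    return replace(params, **updates)
```

In this sketch the first user equipment would call `apply_control` on each received instruction and then continue shooting and sending video with the returned parameters.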
The method according to any one of claims 1 to 5, wherein the mark information further includes auxiliary mark information with which the first user marks the target object through the first user equipment.
The method according to any one of claims 1 to 6, wherein the target object includes a paper document under discussion; and the operation instruction information of the second user on the target object includes one or more pieces of marked position information made by the second user in a video frame of the paper document under discussion.
The method according to claim 7, wherein the superimposing and displaying corresponding mark information on the target object according to the transfer matrix information, wherein the mark information includes operation instruction information of the second user on the target object, comprises:

Generating rendering mark information according to the one or more marked position information;

Superimposing and displaying the rendering mark information on the target object according to the transfer matrix information.
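The two steps above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the transfer matrix information for a frame is a 3x3 planar homography mapping positions marked in a reference frame to pixel positions in the current frame; the function names are hypothetical and the application may use a different representation.

```python
from typing import List, Sequence, Tuple

Matrix3 = Sequence[Sequence[float]]
Point = Tuple[float, float]


def project_point(h: Matrix3, p: Point) -> Point:
    """Map one marked position into the current frame using a 3x3 matrix."""
    x, y = p
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this matrix")
    # Divide by the homogeneous coordinate to get pixel coordinates.
    return (u / w, v / w)


def render_marks(h: Matrix3, marked_positions: List[Point]) -> List[Point]:
    """Return where each mark should be drawn in the current frame."""
    return [project_point(h, p) for p in marked_positions]
```

Calling `render_marks` once per frame with that frame's matrix keeps the drawn marks attached to the document as the camera or the paper moves.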
The method according to any one of claims 1 to 8, wherein the method further comprises:

Shooting image information about a target object in real time through a camera device in the first user equipment;

Sending the image information to a corresponding second user equipment;

Receiving tag information about the target object, where the tag information includes operation instruction information of the second user on the target object in the image information sent by the second user equipment;

Superimposing and displaying the mark information on the target object;

Wherein, the real-time shooting of the video information about the target object through the camera device in the first user equipment includes:

Video information about the target object is captured in real time by the camera device.
A method for remote assistance based on augmented reality at a second user equipment end, wherein the method includes:

Receiving video information about a target object, the video information being captured in real time by a camera device in a corresponding first user equipment and sent by the first user equipment;

Presenting the video information, and keeping corresponding mark information superimposed and displayed on the target object in each video frame of the video information, wherein the mark information includes operation instruction information of a second user on the target object through the second user equipment.
The method according to claim 10, wherein the method further comprises:

Performing a target tracking operation on the target object in the video information;

Wherein the presenting the video information and keeping corresponding mark information superimposed and displayed on the target object in each video frame of the video information, wherein the mark information includes operation instruction information of a second user on the target object through the second user equipment, comprises:

Presenting the video information, and superimposing and displaying corresponding mark information on the target object in each video frame of the video information according to result information of the target tracking operation, wherein the mark information includes operation instruction information of the second user on the target object through the second user equipment.
The method according to claim 10, wherein the receiving video information about a target object, captured in real time by a camera device in the first user equipment and sent by the first user equipment, comprises:

Receiving video information about a target object, captured in real time by the camera device in the first user equipment, and transfer matrix information corresponding to the target object in each video frame of the video information, both sent by the corresponding first user equipment;

Wherein the presenting the video information and keeping corresponding mark information superimposed and displayed on the target object in each video frame of the video information, wherein the mark information includes operation instruction information of a second user on the target object through the second user equipment, comprises:

Presenting the video information, and superimposing and displaying corresponding mark information on the target object in each video frame of the video information according to the transfer matrix information corresponding to the target object in each video frame of the video information, wherein the mark information includes operation instruction information of the second user on the target object through the second user equipment.
The method according to any one of claims 10 to 12, wherein the method further comprises:

Acquiring continued operation instruction information of the second user on the target object based on the video information;

Sending the continued operation instruction information to the first user equipment.
The method according to any one of claims 10 to 13, wherein the method further comprises:

Generating imaging control instruction information of the second user for the camera device according to an imaging control operation performed by the second user through the second user equipment, wherein the imaging control instruction information is used to adjust imaging parameter information of the camera device;

Sending the imaging control instruction information to the first user equipment;

Receiving the video information sent by the first user equipment and captured by the adjusted camera device.
The method according to any one of claims 10 to 14, wherein the method further comprises:

Receiving and presenting image information about a target object that is captured in real time by a camera device in the first user equipment and sent by the first user equipment;

Acquiring operation instruction information of the second user on the target object in the image information;

Sending the operation instruction information to the first user equipment;

Superimposing and displaying the operation instruction information on the target object in the image information;

Wherein the receiving video information about the target object, captured in real time by the camera device in the first user equipment and sent by the first user equipment, comprises:

Receiving video information about the target object captured by the first user equipment in real time through the camera device.
A method for remote assistance based on augmented reality on a first user equipment side, wherein the method includes:

Shooting video information about a first target object in real time through a camera device in the first user equipment;

Sending the video information to a corresponding network device;

Receiving first transfer matrix information corresponding to the first target object in each video frame of the video information sent by the network device;

Superimposing and displaying corresponding first marker information on the first target object according to the first transfer matrix information, wherein the first marker information includes operation instruction information of a second user on the first target object, sent by a corresponding second user equipment.
A method for remote assistance based on augmented reality on a network device side, wherein the method includes:

Receiving video information about a first target object sent by a first user equipment, where the video information is captured in real time by a camera device in the first user equipment;

Determining first transfer matrix information corresponding to the first target object in each video frame of the video information by performing a target tracking operation on the first target object in the video information;

Sending the first transfer matrix information to the first user equipment;

Sending the video information and the first transfer matrix information to a second user equipment that belongs to the same remote assistance task as the first user equipment.
The method according to claim 17, wherein the determining first transfer matrix information corresponding to the first target object in each video frame of the video information by performing a target tracking operation on the first target object in the video information comprises:

Reconstructing video information of the first target object according to the video information and other video information of the first target object;

Determining, by performing a target tracking operation on the first target object in the reconstructed video information, first transfer matrix information corresponding to the first target object in each video frame of the video information.
The method according to claim 17, wherein the method further comprises:

Performing a target tracking operation on a third target object in the video information to determine third transfer matrix information corresponding to the third target object in each video frame of the video information, wherein the third target object belongs to the same remote assistance task as the first target object;

Sending the video information and the third transfer matrix information to a third user equipment corresponding to the third target object in the remote assistance task;

Wherein, sending the video information and the first transfer matrix information to a second user equipment belonging to the same remote assistance task as the first user equipment includes:

Sending the video information, the first transfer matrix information, and the third transfer matrix information to a second user equipment that belongs to the same remote assistance task as the first user equipment.
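The sending steps on the network device side can be summarized as a small routing table. The sketch below is only an illustration of which payloads go to which device as recited above; the function name, the device identifiers, and the payload labels are hypothetical.

```python
from typing import Dict, List, Optional


def plan_deliveries(first_ue: str, second_ue: str,
                    third_ue: Optional[str] = None) -> Dict[str, List[str]]:
    """List what the network device sends to each device in one assistance task."""
    plan = {
        # The first user equipment shot the video, so it only needs the matrix.
        first_ue: ["first_transfer_matrix"],
        # The assisting second user equipment needs the video and the matrix.
        second_ue: ["video", "first_transfer_matrix"],
    }
    if third_ue is not None:
        # A third target object is tracked in the same video for the third device.
        plan[third_ue] = ["video", "third_transfer_matrix"]
        plan[second_ue].append("third_transfer_matrix")
    return plan
```

For example, `plan_deliveries("ue1", "ue2", "ue3")` gives the second user equipment the video plus both matrices, so it can overlay marks on both target objects.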
A method for performing remote assistance based on augmented reality on a third user equipment side, wherein the method includes:

Receiving video information about a third target object and third transfer matrix information corresponding to the third target object in each video frame of the video information sent by a corresponding network device;

Presenting the video information, and superimposing and displaying corresponding third marker information on the third target object in each video frame of the video information according to the third transfer matrix information, wherein the third marker information includes operation instruction information of the second user on the third target object through the second user equipment;

Wherein the video information is captured in real time by a camera device in the first user equipment; the first user equipment, the third user equipment, and the second user equipment belong to the same remote assistance task; and the first user equipment and the third user equipment accept the remote assistance of the second user equipment.
A method for remote assistance based on augmented reality on a second user equipment side, wherein the method includes:

Receiving video information about a first target object and first transfer matrix information corresponding to the first target object in each video frame of the video information sent by a corresponding network device;

Presenting the video information, and superimposing and displaying corresponding first marker information on the first target object in each video frame of the video information according to the first transfer matrix information, wherein the first marker information includes operation instruction information of the second user on the first target object through the second user equipment;

Wherein the video information is captured in real time by a camera device in the first user equipment that belongs to the same remote assistance task as the second user equipment, or is reconstructed from real-time video information about the first target object captured by the camera device and other video information of the first target object.
The method according to claim 21, wherein the method further comprises:

Receiving third transfer matrix information corresponding to the third target object in each video frame of the video information sent by the network device;

In the process of presenting the video information, superimposing and displaying corresponding third marker information on the third target object in each video frame of the video information according to the third transfer matrix information, wherein the third marker information includes operation instruction information of the second user on the third target object through the second user equipment.
A method for remote assistance based on augmented reality on a network device side, wherein the method includes:

Receiving video information about a target object sent by a first user equipment, where the video information includes a picture taken by an imaging device in the first user equipment;

Determining transfer matrix information corresponding to the target object in each video frame of the video information by performing a target tracking operation on the target object in the video information;

Adding corresponding mark information to each video frame in the video information according to the transfer matrix information, wherein the mark information remains superimposed on the target object in each video frame of the video information, and the mark information includes operation instruction information of a second user on the target object, sent by a corresponding second user equipment;

Sending the edited video information to a first user equipment and a second user equipment that belongs to the same remote assistance task as the first user equipment.
A method for remote assistance based on augmented reality, wherein the method includes:

The first user equipment captures video information about a target object in real time through a camera device in the first user equipment, determines transfer matrix information corresponding to the target object in each video frame of the video information by performing a target tracking operation on the target object in the video information, and superimposes and displays corresponding mark information on the target object according to the transfer matrix information, wherein the mark information includes operation instruction information of a second user on the target object, sent by a corresponding second user equipment;

The first user equipment sends the video information to the second user equipment;

The second user equipment receives and presents the video information, and keeps corresponding mark information superimposed and displayed on the target object in each video frame of the video information, wherein the mark information includes operation instruction information of the second user on the target object through the second user equipment.
A method for remote assistance based on augmented reality, wherein the method includes:

The first user equipment captures video information about the first target object in real time through a camera device in the first user equipment, and sends the video information to a corresponding network device;

The network device receives the video information, determines first transfer matrix information corresponding to the first target object in each video frame of the video information by performing a target tracking operation on the first target object in the video information, sends the first transfer matrix information to the first user equipment, and sends the video information and the first transfer matrix information to a second user equipment that belongs to the same remote assistance task as the first user equipment;

The first user equipment receives the first transfer matrix information, and superimposes and displays corresponding first marker information on the first target object according to the first transfer matrix information, wherein the first marker information includes operation instruction information of a second user on the first target object, sent by the corresponding second user equipment;

The second user equipment receives the video information and the first transfer matrix information, presents the video information, and superimposes and displays the corresponding first marker information on the first target object in each video frame of the video information according to the first transfer matrix information, wherein the video information is captured in real time by the camera device in the first user equipment belonging to the same remote assistance task as the second user equipment, or is reconstructed from real-time video information about the first target object captured by the camera device and other video information of the first target object.
A method for remote assistance based on augmented reality, wherein the method includes:

The first user equipment captures video information about the first target object in real time through a camera device in the first user equipment, and sends the video information to a corresponding network device;

The network device receives the video information, determines first transfer matrix information corresponding to the first target object in each video frame of the video information by performing a target tracking operation on the first target object in the video information, and sends the first transfer matrix information to the first user equipment;

The first user equipment receives the first transfer matrix information, and superimposes and displays corresponding first marker information on the first target object according to the first transfer matrix information, wherein the first marker information includes operation instruction information of a second user on the first target object, sent by a corresponding second user equipment;

The network device determines third transfer matrix information corresponding to a third target object in each video frame of the video information by performing a target tracking operation on the third target object in the video information, wherein the third target object belongs to the same remote assistance task as the first target object;

The network device sends the video information and the third transfer matrix information to a third user equipment corresponding to the third target object in the remote assistance task, and sends the video information, the first transfer matrix information, and the third transfer matrix information to the second user equipment that belongs to the same remote assistance task as the first user equipment;

The third user equipment receives the video information and the third transfer matrix information;

The third user equipment presents the video information, and superimposes and displays corresponding third marker information on the third target object in each video frame of the video information according to the third transfer matrix information;

The second user equipment receives the video information, the first transfer matrix information, and the third transfer matrix information, and, in the process of presenting the video information, superimposes and displays the corresponding first marker information on the first target object in each video frame of the video information according to the first transfer matrix information, and superimposes and displays the corresponding third marker information on the third target object in each video frame of the video information according to the third transfer matrix information.
A first user device for remote assistance based on augmented reality, wherein the device includes:

A real-time shooting module, configured to shoot video information about a target object in real time through a camera device in the first user equipment;

A target tracking module, configured to determine transfer matrix information corresponding to the target object in each video frame of the video information by performing a target tracking operation on the target object in the video information;

An overlay display module, configured to superimpose and display corresponding mark information on the target object according to the transfer matrix information, wherein the mark information includes operation instruction information of a second user on the target object, sent by a corresponding second user equipment.
The device according to claim 27, wherein the device further comprises a camera control module, the camera control module is configured to:

Receiving imaging control instruction information of the second user for the camera device, sent by the second user equipment;

Adjusting imaging parameter information of the camera device according to the imaging control instruction information;

Shooting video information about the target object in real time through the adjusted camera device;

Sending the video information shot by the adjusted camera device to the second user equipment.
The device according to claim 27 or 28, wherein the device further comprises a mark acquisition module, the mark acquisition module being configured to:

Shooting image information about a target object in real time through a camera device in the first user equipment;

Sending the image information to a corresponding second user equipment;

Receiving tag information about the target object, where the tag information includes operation instruction information of the second user on the target object in the image information sent by the second user equipment;

Superimposing and displaying the mark information on the target object;

The real-time shooting module is used for:

Video information about the target object is captured in real time by the camera device.
A second user equipment for remote assistance based on augmented reality, wherein the equipment includes:

A video receiving module, configured to receive video information about a target object, the video information being captured in real time by a camera device in a corresponding first user equipment and sent by the first user equipment;

A video presentation module, configured to present the video information and keep corresponding mark information superimposed and displayed on the target object in each video frame of the video information, wherein the mark information includes operation instruction information of a second user on the target object through the second user equipment.
A first user device for remote assistance based on augmented reality, wherein the device includes:

A real-time shooting module, configured to shoot video information about a first target object in real time through a camera device in the first user equipment;

A video sending module, configured to send the video information to a corresponding network device;

A transfer matrix receiving module, configured to receive first transfer matrix information sent by the network device and corresponding to the first target object in each video frame of the video information;

An overlay display module, configured to superimpose and display corresponding first marker information on the first target object according to the first transfer matrix information, wherein the first marker information includes operation instruction information of a second user on the first target object, sent by a corresponding second user equipment.
A network device for remote assistance based on augmented reality, wherein the device includes:

A video receiving module, configured to receive video information about a first target object sent by a first user equipment, where the video information is captured in real time by a camera device in the first user equipment;

A target tracking module, configured to determine first transfer matrix information corresponding to the first target object in each video frame of the video information by performing a target tracking operation on the first target object in the video information;

A first sending module, configured to send the first transfer matrix information to the first user equipment;

A second sending module is configured to send the video information and the first transfer matrix information to a second user equipment that belongs to the same remote assistance task as the first user equipment.
The device according to claim 32, wherein the target tracking module is configured to:

Reconstructing video information of the first target object according to the video information and other video information of the first target object;

Determining, by performing a target tracking operation on the first target object in the reconstructed video information, first transfer matrix information corresponding to the first target object in each video frame of the video information.
The device according to claim 32, wherein the device further comprises a third sending module, the third sending module being configured to:

Performing a target tracking operation on a third target object in the video information to determine third transfer matrix information corresponding to the third target object in each video frame of the video information, wherein the third target object belongs to the same remote assistance task as the first target object;

Sending the video information and the third transfer matrix information to a third user equipment corresponding to the third target object in the remote assistance task;

The second sending module is configured to:

Sending the video information, the first transfer matrix information, and the third transfer matrix information to a second user equipment that belongs to the same remote assistance task as the first user equipment.
A third user equipment for remote assistance based on augmented reality, wherein the equipment includes:

A receiving module, configured to receive video information about a third target object sent by a corresponding network device and third transfer matrix information corresponding to the third target object in each video frame of the video information;

A presentation module, configured to present the video information and superimpose and display corresponding third marker information on the third target object in each video frame of the video information according to the third transfer matrix information, wherein the third marker information includes operation instruction information of the second user on the third target object through the second user equipment;

Wherein the video information is captured in real time by a camera device in the first user equipment; the first user equipment, the third user equipment, and the second user equipment belong to the same remote assistance task; and the first user equipment and the third user equipment accept the remote assistance of the second user equipment.
A second user equipment for remote assistance based on augmented reality, wherein the equipment includes:

A receiving module, configured to receive video information about a first target object and first transfer matrix information corresponding to the first target object in each video frame of the video information sent by a corresponding network device;

A presentation module, configured to present the video information and superimpose and display corresponding first marker information on the first target object in each video frame of the video information according to the first transfer matrix information, wherein the first marker information includes operation instruction information of a second user on the first target object through the second user equipment;

Wherein the video information is captured in real time by a camera device in the first user equipment that belongs to the same remote assistance task as the second user equipment, or is reconstructed from real-time video information about the first target object captured by the camera device and other video information of the first target object.
A network device for remote assistance based on augmented reality, wherein the device includes:

A video receiving module, configured to receive video information about a target object sent by a first user equipment, where the video information includes a picture taken by a camera device in the first user equipment;

A target tracking module, configured to determine transfer matrix information corresponding to the target object in each video frame of the video information by performing a target tracking operation on the target object in the video information;

A tag adding module, configured to add corresponding tag information to each video frame in the video information according to the transfer matrix information, wherein the tag information remains superimposed on the target object in the video frames of the video information, and the tag information includes operation instruction information of a second user on the target object, sent by a second user equipment;

A video sending module, configured to send the edited video information to the first user equipment and to the second user equipment that belongs to the same remote assistance task as the first user equipment.
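As a non-limiting illustration, the cooperation of the target tracking, tag adding, and video sending modules recited above may be sketched as follows. The sketch assumes a 3x3 transfer matrix per frame and represents a frame only by its index and the tag positions drawn on it. The tracker is a stand-in callback rather than a real tracking algorithm, and `Frame`, `edit_video`, `send_to_task`, and the device identifiers are hypothetical names introduced here.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Matrix3 = List[List[float]]  # assumption: 3x3 transfer matrix
Point = Tuple[float, float]


@dataclass
class Frame:
    index: int
    tag_positions: List[Point] = field(default_factory=list)


def project(matrix: Matrix3, point: Point) -> Point:
    """Map a point through a 3x3 matrix in homogeneous coordinates."""
    x, y = point
    u = matrix[0][0] * x + matrix[0][1] * y + matrix[0][2]
    v = matrix[1][0] * x + matrix[1][1] * y + matrix[1][2]
    w = matrix[2][0] * x + matrix[2][1] * y + matrix[2][2]
    return (u / w, v / w)


def edit_video(frames: List[Frame],
               tracker: Callable[[Frame], Matrix3],
               tag_anchor: Point) -> List[Frame]:
    """Track the target in each frame, then attach the tag at the tracked position."""
    edited = []
    for frame in frames:
        matrix = tracker(frame)  # target tracking step
        position = project(matrix, tag_anchor)  # tag adding step
        edited.append(Frame(frame.index, frame.tag_positions + [position]))
    return edited


def send_to_task(edited: List[Frame],
                 task_members: List[str],
                 outboxes: Dict[str, List[Frame]]) -> None:
    """Queue the edited video for every device in the same remote assistance task."""
    for device_id in task_members:
        outboxes.setdefault(device_id, []).extend(edited)


def fake_tracker(frame: Frame) -> Matrix3:
    # Stand-in tracker: the target drifts one pixel to the right per frame.
    return [[1.0, 0.0, float(frame.index)], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]


outboxes: Dict[str, List[Frame]] = {}
edited = edit_video([Frame(0), Frame(1), Frame(2)], fake_tracker, (1.0, 1.0))
send_to_task(edited, ["first_user_equipment", "second_user_equipment"], outboxes)
print([f.tag_positions for f in edited])
```

Performing tracking and tag adding on the network device, as in this sketch, means every device in the task receives the same edited frames and needs no tracking capability of its own.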
A system for remote assistance based on augmented reality, wherein the system includes a first user equipment according to any one of claims 27 to 29 and a second user equipment according to claim 30.
A system for remote assistance based on augmented reality, wherein the system includes a first user equipment according to claim 31, a second user equipment according to claim 36, and a network device according to any one of claims 32 to 34.
A system for remote assistance based on augmented reality, wherein the system includes a first user equipment according to claim 31, a second user equipment according to claim 36, a third user equipment according to claim 35, and a network device according to any one of claims 32 to 34.
A first user device for remote assistance based on augmented reality, wherein the device includes:

A processor; and

A memory arranged to store computer-executable instructions which, when executed, cause the processor to perform the operations of the method according to any one of claims 1 to 23.
A computer-readable medium including instructions that, when executed, cause a system to perform the operations of the method of any one of claims 1 to 23.