CN112039937B - Display method, position determination method and device - Google Patents

Display method, position determination method and device

Info

Publication number
CN112039937B
Authority
CN
China
Prior art keywords
image
terminal device
coordinate system
target object
virtual coordinate
Prior art date
Legal status
Active
Application number
CN202010486797.6A
Other languages
Chinese (zh)
Other versions
CN112039937A (en)
Inventor
郭志刚
司马经华
潘以瑶
王有俊
伍朝晖
张海波
Current Assignee
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Publication of CN112039937A
Application granted
Publication of CN112039937B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
      • H04 ELECTRIC COMMUNICATION TECHNIQUE
        • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
          • H04L67/00 Network arrangements or protocols for supporting network services or applications
            • H04L67/01 Protocols
              • H04L67/08 Protocols specially adapted for terminal emulation, e.g. Telnet
              • H04L67/02 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
                • H04L67/025 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP], for remote control or remote monitoring of applications
              • H04L67/131 Protocols for games, networked simulations or virtual reality
        • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
          • H04N7/00 Television systems
            • H04N7/14 Systems for two-way working
              • H04N7/141 Systems for two-way working between two video terminals, e.g. videophone
    • G PHYSICS
      • G06 COMPUTING; CALCULATING OR COUNTING
        • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
          • G06T19/00 Manipulating 3D models or images for computer graphics
            • G06T19/006 Mixed reality


Abstract

The embodiment of the application discloses a display method, which comprises the following steps: a first terminal device sends a first image containing a target object to a second terminal device, and the second terminal device returns first position information corresponding to the position in the first image of an operation performed on the second terminal device for a first position on the target object. According to the first position information, the first terminal device determines a second position in a virtual coordinate system that corresponds to the first position on the target object; the second position corresponds to the actual position of the first position on the target object in real three-dimensional space. After the second position is determined, a marker can be added at the second position in the virtual coordinate system and rendered on a second image containing the target object displayed on the first terminal device, so that the position of the marker on the second image coincides with the position of the first position on the target object in the second image, and the marker rendered on the second image is displayed.

Description

Display method, position determination method and device
Technical Field
The present application relates to the field of computers, and in particular, to a display method, a position determination method, and an apparatus.
Background
With the development of science and technology, remote assistance functions have become commonplace. Remote assistance functions implemented based on Augmented Reality (AR) technology are gaining favor with more and more users because they can provide a user experience close to "hand-holding" remote assistance.
As can be understood in conjunction with fig. 1, fig. 1 is a schematic diagram of an exemplary remote assistance scenario provided by an embodiment of the present application. In the scenario shown in fig. 1, a user A (not shown) and a user B (not shown) are in a video call, and during the video call user A starts the remote assistance function. After the terminal device 110 used by user A sends the captured image 101 to the terminal device 120 used by user B, the terminal device 120 may display the image 101, and user B may perform a corresponding operation on the terminal device 120, for example clicking the screen position on the terminal device 120 that corresponds to a first position on a target object in the image 101, such as the point P on the cylinder shown in fig. 1. The terminal device 120 sends the screen position information corresponding to the point P, e.g. the screen coordinates, to the terminal device 110, and the terminal device 110 renders an auxiliary mark at the screen position described by that screen position information to prompt user A to perform a corresponding operation based on the auxiliary mark.
It can be understood that, in practical applications, there is a certain time difference between the moment when the terminal device 110 sends the image 101 to the terminal device 120 and the moment when the terminal device 110 receives the screen position corresponding to the point P: it takes time for the terminal device 110 to transmit the image 101 to the terminal device 120, time for user B to perform the corresponding operation on the terminal device 120, and time for the terminal device 120 to transmit the screen position information corresponding to the point P back to the terminal device 110. During this time difference, the position of the terminal device 110 may change, because user A may move or shake the device during the video call, so that the image 102 captured by the terminal device 110 when it receives the screen position information differs from the image 101. That is, the screen position described by the received screen position information no longer coincides with the current position of the first position on the target object, i.e. the position of the point P in the image 102. As a result, the position of the auxiliary mark rendered by the terminal device 110 based on that screen position (the position of the point Q in fig. 1) is inconsistent with the position of the point P on the target object in the image 102, so the remote assistance function is ineffective and true "hand-holding" remote assistance cannot be achieved.
Disclosure of Invention
The embodiment of the application provides a display method and a display device, which can solve the problem that a conventional remote assistance function implemented based on AR technology cannot achieve true "hand-holding" remote assistance.
In a first aspect, an embodiment of the present application provides a display method. Specifically, a first terminal device may send a first image including a target object to a second terminal device; the second terminal device may display the first image after receiving it; a user using the second terminal device may trigger, on the second terminal device, an operation for a first position on the target object; and the second terminal device sends to the first terminal device first position information corresponding to the position in the first image of that operation. It should be considered that, even if the position of the first terminal device changes after the first image is sent to the second terminal device, the actual position in real three-dimensional space of the first position on the target object, to which the operation triggered by the user on the second terminal device refers, does not change. Therefore, in the embodiment of the present application, the first terminal device may determine, according to the aforementioned first position information, the second position in a virtual coordinate system that corresponds to the first position on the target object, where the second position in the virtual coordinate system corresponds to the actual position of the first position on the target object in real three-dimensional space. After determining that the first position on the target object corresponds to the second position in the virtual coordinate system, a mark may be added at the second position in the virtual coordinate system and rendered on a second image, including the target object, displayed on the first terminal device, so that the position of the mark on the second image coincides with the position of the first position on the target object in the second image, and the mark rendered on the second image is displayed. After the first terminal device displays the mark rendered on the second image, the user using the first terminal device can perform further operations based on the mark, thereby achieving the effect of "hand-holding" remote assistance.
In one possible implementation, after the first terminal device sends the first image to the second terminal device, the second terminal device may display the first image. A user using the second terminal device may then trigger, on the screen of the second terminal device, an operation for the first position on the target object; the second terminal device may acquire the screen position corresponding to that operation, determine the first position information according to the screen position, and send the first position information to the first terminal device. The first terminal device then determines, based on the first position information, the second position in the virtual coordinate system corresponding to the first position on the target object, adds a mark at the second position in the virtual coordinate system, renders the mark on a second image, including the target object, displayed on the first terminal device, so that the position of the mark on the second image coincides with the position of the first position on the target object in the second image, and displays the mark rendered on the second image. After the first terminal device displays the mark rendered on the second image, the user using the first terminal device can perform further operations based on the mark, thereby achieving the effect of "hand-holding" remote assistance.
In a possible implementation, considering practical applications, if information on the absolute image position of the first position on the target object in the first image is to be determined, it may be determined by combining the image resolution of the first image with the relative image position of the first position on the target object in the first image, where the image resolution of the first image may be sent by the first terminal device to the second terminal device. In order to reduce the amount of data sent by the first terminal device to the second terminal device, the first position information may be the aforementioned information describing the relative image position of the first position on the target object in the first image. In this case, the second terminal device may determine the first position information according to the aforementioned screen position and the position of the display area of the first image on the screen of the second terminal device.
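As an illustration only, a minimal sketch of how the relative image position might be computed from the screen position and the image display area follows. The function and parameter names are hypothetical, and an axis-aligned rectangular display area is assumed; the embodiment only requires that the relative position be derived from the screen position and the position of the display area.

```python
# Hedged sketch: map a touch position on the second terminal device's screen to a
# relative (normalized) position inside the displayed first image. All names are
# illustrative assumptions, not taken from the patent.
def relative_image_position(screen_x, screen_y,
                            area_left, area_bottom, area_width, area_height):
    """Return (u, v) with u, v in [0, 1], or None if the touch is outside the image."""
    u = (screen_x - area_left) / area_width
    v = (screen_y - area_bottom) / area_height
    if 0.0 <= u <= 1.0 and 0.0 <= v <= 1.0:
        return u, v
    return None  # the operation did not land on the displayed first image
```

With a full-screen display area this reduces to dividing the screen coordinates by the screen resolution, as in the worked example given in the detailed description below.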
In a possible implementation, considering that the second position in the virtual coordinate system corresponding to the first position on the target object is related to the image resolution of the first image, in this embodiment of the application the first terminal device may determine that second position according to the first position information and the image resolution of the first image. Specifically, the image position of the first position on the target object in the image coordinate system corresponding to the first image is determined first, and the second position of the first position on the target object in the virtual coordinate system is then determined according to that image position.
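A minimal sketch of this first step is shown below; it assumes the first position information is a normalized (u, v) pair with the same origin convention as the image coordinate system, which is an assumption of this sketch rather than a statement of the patent.

```python
def image_position(u, v, image_width, image_height):
    """Absolute image position (pixels) from relative first position information (u, v).

    image_width, image_height: the image resolution of the first image.
    """
    return u * image_width, v * image_height
```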
In a possible implementation, considering that the second position in the virtual coordinate system corresponding to the first position on the target object is related to the view information corresponding to the first image and to the image position of the first position on the target object in the first image, in this embodiment of the application the first terminal device may determine that second position according to the view information corresponding to the first image and the image position of the first position on the target object in the first image. Specifically, the first terminal device may acquire the view information corresponding to the first image and calculate the position of a target ray in the virtual coordinate system according to the image position and that view information; the target ray is a ray whose end point is the target position and which passes through the second position in the virtual coordinate system. The first terminal device then determines the position of the intersection of the target ray with a virtual object in the virtual coordinate system as the second position in the virtual coordinate system corresponding to the first position on the target object.
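For illustration, the ray construction and intersection described above might look like the following sketch. It assumes a pinhole camera model, view information given as a camera position plus a camera-to-virtual-coordinate-system rotation, and a planar virtual object; these assumptions and all names are the editor's, not the patent's.

```python
import numpy as np

def ray_through_pixel(cam_pos, cam_rot, fov_y_deg, width, height, px, py):
    """Target ray (origin, unit direction) in the virtual coordinate system.

    cam_pos: target position of the camera in the virtual coordinate system.
    cam_rot: 3x3 camera-to-virtual-coordinate-system rotation (the shooting direction).
    fov_y_deg, width, height: vertical field of view and resolution of the first image.
    px, py: image position of the first position (pixels, origin at the top-left corner).
    """
    fy = (height / 2.0) / np.tan(np.radians(fov_y_deg) / 2.0)
    # Camera-space direction: x to the right, y up, the camera looks along -z.
    d_cam = np.array([px - width / 2.0, height / 2.0 - py, -fy])
    d = cam_rot @ d_cam
    return np.asarray(cam_pos, dtype=float), d / np.linalg.norm(d)

def intersect_plane(origin, direction, plane_point, plane_normal):
    """Second position: intersection of the target ray with a planar virtual object."""
    denom = float(np.dot(plane_normal, direction))
    if abs(denom) < 1e-9:
        return None                      # ray parallel to the plane, no intersection
    t = float(np.dot(plane_normal, np.asarray(plane_point) - origin)) / denom
    return origin + t * direction if t >= 0.0 else None
```

In practice an AR service would typically perform this hit test itself; the sketch only makes the geometry behind the target ray explicit.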
In a second aspect, an embodiment of the present application provides a position determination method. Specifically, a second terminal device receives a first image sent by a first terminal device and displays the first image on the screen of the second terminal device; the first image is an image including a target object. The second terminal device receives an operation instruction, triggered by a user on the second terminal device, for a first position on the target object, and determines first position information; the first position information corresponds to the position in the first image of the user's operation on the second terminal device for the first position on the target object. The second terminal device determines, according to the first position information, the second position in a virtual coordinate system corresponding to the first position on the target object, and sends that second position to the first terminal device; the second position in the virtual coordinate system corresponds to the actual position of the first position on the target object in real three-dimensional space. After the second terminal device sends the second position to the first terminal device, the first terminal device may add a mark at the second position in the virtual coordinate system and render the mark on a second image, including the target object, displayed on the first terminal device, so that the position of the mark on the second image coincides with the position of the first position on the target object in the second image, and the first terminal device may further display the mark rendered on the second image. After the first terminal device displays the mark rendered on the second image, the user using the first terminal device can perform further operations based on the mark, thereby achieving the effect of "hand-holding" remote assistance.
In one possible implementation, the first position information describes the relative position of the first position on the target object in the first image; the method further comprises: the second terminal device determines the first position information according to the screen position corresponding to the operation, triggered by the user on the second terminal device, for the first position on the target object and the position of the display area of the first image on the screen of the second terminal device.
In a possible implementation, the determining, by the second terminal device according to the first position information, of the second position in the virtual coordinate system corresponding to the first position on the target object includes: the second terminal device receives the image resolution of the first image sent by the first terminal device; the second terminal device determines, according to the first position information and the image resolution of the first image, the image position of the first position on the target object in the image coordinate system corresponding to the first image; and the second terminal device determines, according to the image position, the second position of the first position on the target object in the virtual coordinate system.
In one possible implementation, the method further includes: the second terminal device receives information related to the virtual coordinate system sent by the first terminal device, where the information related to the virtual coordinate system includes the position, in the virtual coordinate system, of a virtual object corresponding to the target object; the position of the virtual object corresponding to the target object in the virtual coordinate system corresponds to the actual position of the target object in real three-dimensional space.
In a possible implementation, the determining, by the second terminal device according to the image position, of the second position in the virtual coordinate system corresponding to the first position on the target object includes: the second terminal device receives view information corresponding to the first image sent by the first terminal device, where the view information corresponding to the first image describes the target position, in the virtual coordinate system, of the camera that captured the first image and the target shooting direction, in the virtual coordinate system, of that camera when the first image was captured; the second terminal device calculates the position of a target ray in the virtual coordinate system according to the view information and the image position, the target ray being a ray whose end point is the target position and which passes through the second position in the virtual coordinate system; and the second terminal device determines the position of the intersection of the target ray with the virtual object in the virtual coordinate system as the second position in the virtual coordinate system corresponding to the first position on the target object.
In a third aspect, an embodiment of the present application provides a position determination method. Specifically, a second terminal device sends a screen capture request to a first terminal device; the second terminal device receives a first image sent by the first terminal device and displays the first image on the screen of the second terminal device, the first image being an image including a target object; the second terminal device receives an operation instruction, triggered by a user on the second terminal device, for a first position on the target object, and determines first position information; and the second terminal device sends the first position information to the first terminal device. It can be seen that while the user triggers the mark on the second terminal device side, the second terminal device statically displays the first image on its main interface, and when the user marks the first position of the target object, it is the target object displayed in the first image that is referred to. This not only ensures that the user accurately identifies the position to mark, but also ensures that the user-triggered position can be accurately related to the first position of the target object in the first image. Accordingly, when the first terminal device renders the mark on a second image, including the target object, displayed on the first terminal device, the position at which the mark is displayed on the second image can accurately correspond to the object that the user wants to mark. This can improve the performance of "hand-holding" remote assistance and optimize its effect.
In one possible implementation, the method further includes: the second terminal device receives the view information corresponding to the first image; and the receiving, by the second terminal device, of an operation instruction triggered by the user on the second terminal device for a first position on the target object and the determining of the first position information include: the second terminal device determines third position information according to the operation instruction, and the second terminal device determines the first position information according to the third position information and the view information corresponding to the first image. The content displayed by the second terminal device comes from the first terminal device; the position of the target object in the first image is affected by the posture of the first terminal device, and the posture of the first terminal device may change frequently, so the view information corresponding to the first image may change. Moreover, the first image is an image obtained by converting objects in a three-dimensional space into a two-dimensional space. Based on this, the second terminal device may obtain partial information as the first position information according to the view information corresponding to the first image and the position information triggered by the user, so that the amount of data in subsequent calculation can be reduced and resources can be saved.
In a fourth aspect, an embodiment of the present application provides a terminal device, where the terminal device includes: a first sending unit, configured to send a first image to a second terminal device, where the first image is an image including a target object; a first receiving unit, configured to receive first position information from the second terminal device, where the first position information corresponds to the position in the first image of the operation of a user on the second terminal device for a first position on the target object; a first determining unit, configured to determine, according to the first position information, the second position in a virtual coordinate system corresponding to the first position on the target object, where the second position in the virtual coordinate system corresponds to the actual position of the first position on the target object in real three-dimensional space; an adding unit, configured to add a mark at the second position in the virtual coordinate system; a rendering unit, configured to render the mark on a second image so that the position of the mark on the second image coincides with the position of the first position on the target object in the second image, where the second image is an image, including the target object, displayed on the terminal device; and a display unit, configured to display the mark rendered on the second image.
In a possible implementation manner, the first position information is determined according to a screen position corresponding to an operation, triggered by the user on the second terminal device, for the first position on the target object; the first image is displayed on a screen of the second terminal device.
In one possible implementation, the first location information describes a relative location of a first location on the target object in the first image; and the first position information is determined according to the screen position and the position of the display area of the first image on the screen of the second terminal equipment.
In a possible implementation, the first determining unit is specifically configured to: determine, according to the first position information and the image resolution of the first image, the image position of the first position on the target object in the image coordinate system corresponding to the first image; and determine, according to the image position, the second position in the virtual coordinate system corresponding to the first position on the target object.
In a possible implementation, the first determining unit is specifically configured to: acquire view information corresponding to the first image, where the view information corresponding to the first image describes the target position, in the virtual coordinate system, of the camera that captured the first image and the target shooting direction, in the virtual coordinate system, of that camera when the first image was captured; calculate the position of a target ray in the virtual coordinate system according to the view information and the image position, the target ray being a ray whose end point is the target position and which passes through the second position in the virtual coordinate system; and determine the position of the intersection of the target ray with a virtual object in the virtual coordinate system as the second position in the virtual coordinate system corresponding to the first position on the target object, where the position of the virtual object in the virtual coordinate system corresponds to the actual position of the target object in real three-dimensional space.
In a fifth aspect, an embodiment of the present application provides a terminal device, where the terminal device includes: a second receiving unit, configured to receive a first image sent by a first terminal device and display the first image on the screen of the terminal device, where the first image is an image including a target object; a third receiving unit, configured to receive an operation instruction, triggered by a user on the terminal device, for a first position on the target object; a second determining unit, configured to determine first position information, where the first position information corresponds to the position in the first image of the user's operation on the terminal device for the first position on the target object; a third determining unit, configured to determine, according to the first position information, the second position in a virtual coordinate system corresponding to the first position on the target object; and a second sending unit, configured to send, to the first terminal device, the second position in the virtual coordinate system corresponding to the first position on the target object, where the second position in the virtual coordinate system corresponds to the actual position of the first position on the target object in real three-dimensional space.
In one possible implementation, the first location information describes a relative location of a first location on the target object in the first image; the device further comprises: a fourth determining unit, configured to determine the first location information according to a screen location corresponding to an operation, triggered by the user on the terminal device, for a first location on the target object and a location of a display area of the first image on a screen of the terminal device.
In a possible implementation manner, the third determining unit is specifically configured to: receiving the image resolution of the first image sent by the first terminal equipment; determining the image position of the first position on the target object in the image coordinate system corresponding to the first image according to the first position information and the image resolution of the first image; and determining a second position of the first position on the target object corresponding to the virtual coordinate system according to the image position.
In one possible implementation, the apparatus further includes: a fourth receiving unit, configured to receive information related to the virtual coordinate system, which is sent by the first terminal device, where the information related to the virtual coordinate system includes a position of a virtual object corresponding to the target object in the virtual coordinate system; and the position of the virtual object corresponding to the target object in the virtual coordinate system corresponds to the actual position of the target object in the real three-dimensional space.
In a possible implementation, the third determining unit is specifically configured to: receive view information corresponding to the first image sent by the first terminal device, where the view information corresponding to the first image describes the target position, in the virtual coordinate system, of the camera that captured the first image and the target shooting direction, in the virtual coordinate system, of that camera when the first image was captured; calculate the position of a target ray in the virtual coordinate system according to the view information and the image position, the target ray being a ray whose end point is the target position and which passes through the second position in the virtual coordinate system; and determine the position of the intersection of the target ray with the virtual object in the virtual coordinate system as the second position in the virtual coordinate system corresponding to the first position on the target object.
In a sixth aspect, an embodiment of the present application provides an electronic device, where the electronic device includes a memory and at least one processor; the memory is configured to store instructions, and the at least one processor is configured to execute the instructions in the memory to perform the method of any one of the above first aspects.
In a seventh aspect, an embodiment of the present application provides an electronic device, where the electronic device includes a memory and at least one processor; the memory is configured to store instructions, and the at least one processor is configured to execute the instructions in the memory to perform the method of any one of the above second and third aspects.
In an eighth aspect, embodiments of the present application provide a computer-readable storage medium, comprising instructions, which, when executed on a computer, cause the computer to perform the method of any one of the above first aspects.
In a ninth aspect, embodiments of the present application provide a computer-readable storage medium, which includes instructions that, when executed on a computer, cause the computer to perform the method of any one of the second and third aspects above.
In a tenth aspect, embodiments of the present application provide a computer program product comprising instructions which, when run on a computer, cause the computer to perform the method of any of the first aspects above.
In an eleventh aspect, embodiments of the present application provide a computer program product comprising instructions which, when run on a computer, cause the computer to perform the method of any one of the second and third aspects above.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly described below. It is obvious that the drawings in the following description are only some embodiments described in the present application, and other drawings can be obtained by those skilled in the art without creative effort.
Fig. 1 is a schematic diagram of an exemplary scenario of remote assistance provided in an embodiment of the present application;
fig. 2 is a schematic flowchart of a display method according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of a screen coordinate system according to an embodiment of the present disclosure;
fig. 4 is a flowchart illustrating a method for determining that a first location on a target object corresponds to a second location in a virtual coordinate system according to an embodiment of the present disclosure;
FIG. 5 is a schematic view of a viewing cone provided in accordance with an embodiment of the present application;
fig. 6 is a flowchart illustrating a method for determining that a first location on a target object corresponds to a second location in a virtual coordinate system according to an embodiment of the present disclosure;
fig. 7 is a schematic flowchart of a position determining method according to an embodiment of the present application;
fig. 8 is a schematic structural diagram of a terminal device according to an embodiment of the present application;
fig. 9 is a schematic structural diagram of a terminal device according to an embodiment of the present application;
fig. 10 is a schematic structural diagram of a terminal device according to an embodiment of the present application;
fig. 11A is a schematic diagram of another exemplary scenario of remote assistance provided in an embodiment of the present application;
FIG. 11B is a schematic diagram of an exemplary interface when a user triggers an operation instruction in the scenario of FIG. 11A according to an embodiment of the present application;
FIG. 11C is a schematic diagram of an exemplary interface after a user-triggered location in the scenario of FIG. 11B is associated with a first image, provided by an embodiment of the present application;
fig. 12 is a schematic flowchart of another display method provided in the embodiment of the present application;
FIG. 13A-1 is a schematic diagram of an exemplary scenario of a first screen capture scheme provided in an embodiment of the present application;
FIG. 13A-2 is a diagram illustrating an exemplary scenario of a second screen capture scheme provided by an embodiment of the present application;
FIG. 13A-3 is a schematic diagram of an exemplary scenario of a third screen capture scheme provided by an embodiment of the present application;
FIG. 13A-4 is a schematic diagram of an exemplary scenario of a fourth screen capture scheme provided by an embodiment of the present application;
FIG. 13B is a schematic diagram of an exemplary user interface of a second terminal device in the scenarios of FIGS. 13A-1 through 13A-4 provided by an embodiment of the present application;
FIG. 13C is a schematic illustration of an exemplary user interface in the scenarios of FIGS. 13A-3 and 13A-4 provided by an embodiment of the present application;
fig. 14A is a schematic view of a scene of a video shot by a first terminal device according to an embodiment of the present application;
fig. 14B is an exemplary scene schematic diagram illustrating a scene taken in fig. 14A for a second terminal device provided in the embodiment of the present application.
Detailed Description
The embodiments of the application provide a display method, a position determination method, and a device, which are used to solve the problem that a conventional remote assistance function implemented based on AR technology cannot achieve true "hand-holding" remote assistance.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims, as well as in the drawings, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the data so used may be interchanged under appropriate circumstances such that the embodiments described herein may be practiced otherwise than as specifically illustrated or described herein. Moreover, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
For convenience of understanding, an application scenario of the embodiment of the present application is first briefly described. The embodiment of the application can be applied to a remote assistance scenario, and the remote assistance scenario can be a video call scenario. As can be understood in conjunction with fig. 1, in practical applications, when user A and user B make a video call, the terminal device 110 used by user A displays the image captured by the terminal device 120 used by user B, and the terminal device 120 displays the image captured by the terminal device 110. In this embodiment of the application, during the video call between user A and user B, user A or user B may selectively start the remote assistance function. Taking user A starting the remote assistance function as an example: after user A starts the remote assistance function, the terminal device 110 may no longer display the image captured by the terminal device 120 but instead display the image captured by the terminal device 110 itself, while the terminal device 120 may still display the image captured by the terminal device 110.
Further, if user A starts the remote assistance function, after the terminal device 110 sends the captured image 101 to the terminal device 120, the terminal device 120 may display the image, for example the image 101 shown in fig. 1, and user B may perform a corresponding operation on the terminal device 120, for example clicking the screen position on the screen of the terminal device 120 corresponding to the target object in the image 101, such as the point P shown in fig. 1. The terminal device 120 sends the screen position information corresponding to the point P, e.g. the screen coordinates, to the terminal device 110, and the terminal device 110 renders an auxiliary mark at the screen position on its own screen that corresponds to the received screen position information, for example at the screen position corresponding to the received screen coordinates, so as to prompt user A to perform a corresponding operation based on the auxiliary mark. Since there is a certain time difference between the moment when the terminal device 110 sends the image 101 to the terminal device 120 and the moment when the terminal device 120 sends the screen position information corresponding to the point P to the terminal device 110, the image 102 captured by the terminal device 110 when it receives the screen position information differs from the image 101, so the screen position information received by the terminal device 110 no longer matches the current position of the point P in the image 102. Therefore, the position of the auxiliary mark rendered by the terminal device 110 is inconsistent with the position of the point P in the image 102, so the remote assistance function is ineffective and true "hand-holding" remote assistance cannot be achieved.
It should be noted that the above video call scenario is only one application scenario of the embodiment of the present application; the embodiment of the present application may also be applied to other scenarios, for example a remote home monitoring scenario. Specifically, a camera may send a captured first image to the terminal device 110. If the terminal device 110 starts the remote assistance function, the terminal device 110 may send the first image to the terminal device 120. A user using the terminal device 120, for example the aforementioned user B, may perform a corresponding operation on the received first image, for example clicking a certain position on the first image or drawing a circle at a certain position on the first image, and send the screen position information corresponding to that operation to the terminal device 110. The terminal device 110 may render an auxiliary mark on its screen at the screen position corresponding to the received screen position information, for example to prompt user A, who uses the terminal device 110, to perform a corresponding operation based on the auxiliary mark, or to prompt user A that an abnormality has occurred at the position corresponding to the auxiliary mark, and so on. However, considering that the camera may shake or change its shooting angle during shooting, the same problem arises that the above "hand-holding" remote assistance effect cannot be achieved.
Of course, the embodiments of the present application can also be applied to other scenarios, and are not described in detail here.
In order to solve the above problems, embodiments of the present application provide a display method, a position determining method, and an apparatus, which can solve the above problems. The following describes a display method provided by an embodiment of the present application with reference to the drawings.
In the following description of the embodiments of the present application, unless otherwise specified, the foregoing video call scenario is taken as an example for explanation, and the implementation manner of other scenarios is similar to that of the video call scenario.
Referring to fig. 2, the figure is a schematic flow chart of a display method provided in the embodiment of the present application. The display method provided by the embodiment of the application can be implemented by the following steps 101-105.
Step 101: the first terminal device sends a first image to the second terminal device, wherein the first image is an image including a target object.
It should be noted that, in the embodiment of the present application, the first terminal device and the second terminal device may be a mobile terminal device such as a smart phone and a tablet computer, or a terminal device such as a desktop computer, and the embodiment of the present application is not particularly limited.
It should be noted that the embodiment of the present application does not specifically limit the target object; the target object may be an object included in the real environment where the first terminal device is located, or a part of such an object, for example a keyboard, a certain key on the keyboard, a table, a certain position on a table top, a cabinet, a door handle of the cabinet, a cup, the handle of a cup, and the like.
It should be noted that, in the embodiment of the present application, the first image may be an image captured by the first terminal device, for example in the scenario corresponding to the aforementioned video call. The first image may also be sent to the first terminal device by another device; for example, in the scenario corresponding to the aforementioned remote home monitoring, the first image is captured by a camera and then sent to the first terminal device.
Step 102: the first terminal device receives first position information from the second terminal device, wherein the first position information corresponds to a position in the first image corresponding to the operation of the user on the second terminal device aiming at the first position on the target object.
After the second terminal device receives the first image sent by the first terminal device, the second terminal device may display the first image on a screen of the second terminal device.
After the first image is displayed on the screen of the second terminal device, a user using the second terminal device may trigger, on the second terminal device, an operation for the first position on the target object, for example clicking the screen position on the second terminal device corresponding to the first position on the target object; as another example, drawing a circle at the screen position corresponding to the first position on the target object; as another example, marking text at the screen position corresponding to the first position on the target object, and so on. It can be understood that the user triggering an operation for the first position on the target object on the second terminal device may actually be intended to prompt the user using the first terminal device to pay attention to the first position on the target object, for example to prompt that user to perform a corresponding operation at the first position on the target object. Therefore, in this embodiment of the application, the second terminal device may determine, according to the screen position corresponding to the operation triggered by the user on the second terminal device for the first position on the target object, first position information that describes the image position of the first position on the target object in the first image, and send the first position information to the first terminal device, so that the first terminal device may further perform a corresponding operation according to the first position information, so as to achieve the effect of "hand-holding" remote assistance in a practical sense.
It should be noted that the first position information is not specifically limited in the embodiments of the present application. The first position information may, for example, be information describing the absolute image position of the first position on the target object in the first image, such as coordinates describing that absolute image position; the first position information may also, for example, be information describing the relative image position of the first position on the target object in the first image, such as coordinates describing that relative image position. The embodiment of the present application is not specifically limited in this respect. In practical applications, if the information on the absolute image position of the first position on the target object in the first image is to be determined, it may be determined by combining the image resolution of the first image with the relative image position of the first position on the target object in the first image, where the image resolution of the first image may be sent by the first terminal device to the second terminal device. In one implementation of the embodiment of the present application, in order to reduce the amount of data sent by the first terminal device to the second terminal device, the first position information may be the aforementioned information describing the relative image position of the first position on the target object in the first image. In this case, the second terminal device may determine the first position information according to the aforementioned screen position and the position of the display area of the first image on the screen of the second terminal device.
Regarding the manner of determining the information describing the relative position of the first position on the target object in the first image: for example, assuming that the screen resolution of the second terminal device is 540 × 480, a screen coordinate system may be constructed for the screen of the second terminal device, which can be understood with reference to fig. 3; fig. 3 is a schematic diagram of a screen coordinate system provided in the embodiment of the present application. In fig. 3, the origin (0,0) of the screen coordinate system of the second terminal device is the point O at the lower left corner of the display screen of the second terminal device, the X coordinate axis of the screen coordinate system passes through the origin O (0,0) and is parallel to the side OX of the display screen, and the Y coordinate axis of the screen coordinate system passes through the origin O (0,0) and is parallel to the side OY of the display screen. As shown, the coordinate range of the screen coordinate system is (0,0) to (480,540). If the display area of the first image on the second terminal device is the whole screen, that is, the first image is displayed full screen on the second terminal device, and the coordinates corresponding to the screen position are (480,270), the first position information may be the coordinates (480/480, 270/540), that is, (1, 0.5). For the case where the first image is displayed in a specific display area on the screen of the second terminal device, the first position information is calculated in a similar manner and is not described in detail here.
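The same arithmetic, reduced to a minimal sketch for the full-screen case of the example above (the function name is hypothetical):

```python
def relative_position_fullscreen(screen_x, screen_y, screen_width, screen_height):
    # Full-screen display: the image display area coincides with the whole screen.
    return screen_x / screen_width, screen_y / screen_height

print(relative_position_fullscreen(480, 270, 480, 540))  # (1.0, 0.5), as in the example
```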
Step 103: the first terminal device determines, according to the first position information, the second position in the virtual coordinate system corresponding to the first position on the target object, and adds a mark at the second position in the virtual coordinate system; the second position in the virtual coordinate system corresponds to the actual position of the first position on the target object in real three-dimensional space.
It should be noted that the virtual coordinate system may be constructed by the first terminal device. Specifically, after the user using the first terminal device starts the remote assistance function on the first terminal device, the first terminal device may start, for example, an AR service, and the AR service may construct the virtual coordinate system and construct virtual objects corresponding to objects in real three-dimensional space. The coordinates, in the virtual coordinate system, of the virtual object corresponding to an object in real three-dimensional space can represent the actual position of that object in real three-dimensional space. Specifically, for the video call scenario, the AR service may, for example, acquire multiple frames of images continuously captured by the first terminal device and analyze some or all of the feature points on the surfaces of objects in the three-dimensional space, so as to determine the coordinates, in the virtual coordinate system, of the virtual objects corresponding to those objects. For the remote home monitoring scenario, the AR service may, for example, acquire multiple frames of images sent to the first terminal device by the camera and analyze some or all of the feature points on the surfaces of objects in the three-dimensional space where the target object is located, so as to determine the coordinates, in the virtual coordinate system, of the virtual objects corresponding to those objects. A virtual object may be a set of feature points, for example the points corresponding to the four corners of a keyboard, or may be a virtual plane obtained by fitting feature points, for example a virtual plane corresponding to the ground in real three-dimensional space. The AR service may also obtain depth information of an object from a depth camera, a radar, or other devices, so as to fit a virtual plane according to the depth information; for example, for a delivery box in real three-dimensional space, virtual planes corresponding to several faces of the box may be fitted according to the depth information, so as to determine the coordinates of the virtual object corresponding to the object in the virtual coordinate system.
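Purely as an illustration of what fitting a virtual plane to feature points can mean, a least-squares plane fit is sketched below; the embodiment does not prescribe any particular fitting method, so this is an assumption of the sketch.

```python
import numpy as np

def fit_virtual_plane(feature_points):
    """Least-squares plane through 3-D feature points: returns (centroid, unit normal).

    feature_points: (N, 3) array of feature-point coordinates in the virtual coordinate
    system, e.g. points sampled from the ground or from one face of a box.
    """
    pts = np.asarray(feature_points, dtype=float)
    centroid = pts.mean(axis=0)
    # The plane normal is the right singular vector associated with the smallest
    # singular value of the centered point cloud.
    _, _, vt = np.linalg.svd(pts - centroid, full_matrices=False)
    return centroid, vt[-1]
```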
In an implementation of the embodiment of the present application, for the video call scenario, the origin of the virtual coordinate system constructed by the first terminal device may be determined by the position of the first terminal device in real three-dimensional space at the moment the first terminal device starts the AR service; for example, the origin of the virtual coordinate system corresponds to the position of the first terminal device in real three-dimensional space, that is, the coordinates of the first terminal device in the world coordinate system correspond to the coordinates of the origin of the virtual coordinate system. For the aforementioned remote home monitoring scenario, considering that the camera and the first terminal device may not be located in the same three-dimensional space, for example the camera is in the living room while the first terminal device is in a bedroom or even an office, the origin of the virtual coordinate system may instead correspond to the position of the camera in real three-dimensional space, that is, the coordinates of the camera in the world coordinate system correspond to the coordinates of the origin of the virtual coordinate system. In an implementation of the embodiment of the present application, the virtual coordinate system includes three coordinate axes: an X axis, a Y axis, and a Z axis. The Y axis may be the direction opposite to gravity, the X axis may be the projection onto the horizontal plane of the direction of the USB port on the first terminal device, and the Z axis is the direction perpendicular to both the X axis and the Y axis.
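A sketch of one way such an axis triplet could be built from a measured gravity vector and the horizontal projection of a device direction (for example the USB-port direction) follows; the orthonormalization shown is an assumption about how an AR service might proceed, not something the embodiment specifies.

```python
import numpy as np

def virtual_coordinate_axes(gravity, device_dir):
    """Return unit X, Y, Z axes of the virtual coordinate system as 3-vectors.

    Y points opposite to gravity; X is the device direction projected onto the
    horizontal plane; Z is chosen perpendicular to both.
    """
    y = -np.asarray(gravity, dtype=float)
    y /= np.linalg.norm(y)                      # opposite to the direction of gravity
    d = np.asarray(device_dir, dtype=float)
    x = d - np.dot(d, y) * y                    # remove the vertical component
    x /= np.linalg.norm(x)                      # projection onto the horizontal plane
    z = np.cross(x, y)                          # perpendicular to both X and Y
    return x, y, z
```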
It can be understood that, for the video call scenario, if the first terminal device is a mobile device such as a smartphone, then each time the first terminal device starts the AR service, the virtual coordinate origin and the three coordinate axis directions set by the AR service are determined by the state of the first terminal device at that moment, such as the position of the first terminal device in real three-dimensional space and the posture of the first terminal device itself, so the origin and the three coordinate axes of the virtual coordinate system established each time the first terminal device starts the AR service may not be the same.
Of course, the above is only an exemplary illustration; the origin of the virtual coordinate system and the directions of the three coordinate axes may also be determined in other manners, which are related to the setting of the AR service itself and are not described one by one here. It should be noted that the AR service mentioned in the embodiment of the present application may be, for example, an augmented reality software development kit (AR SDK) in the conventional technology, such as AR Engine provided by Huawei, ARCore provided by Google, or ARKit provided by Apple. Of course, the AR service may also be developed anew, and the embodiment of the present application is not specifically limited.
In the embodiment of the present application, the aforementioned AR service may be used to add a mark at the second position in the virtual coordinate system corresponding to the first position on the target object. The mark is not particularly limited in the embodiments of the present application, and may be, for example, a five-pointed star, an arrow, or the like. Considering that some marks have a direction, such as an arrow, and that marks with different directions are displayed differently, in an implementation manner of the embodiment of the present application, when the first terminal device adds the mark, the direction of the mark in the virtual coordinate system may also be obtained, so that a mark with the corresponding direction is added in the virtual coordinate system, for example, an arrow pointing downward.
Step 104: rendering the mark on the second image by the first terminal device, so that the position of the mark on the second image corresponds to the position of the first position on the target object in the second image; the second image is an image including the target object displayed by the first terminal device.
It should be noted that in this scenario of remote assistance, the second image may be an image captured at a time after the first terminal device captures the first image and displayed on the first terminal device, for example, the first image is an image captured at time k, and the second image is an image captured at time k + m and displayed on the first terminal device. In other scenarios, for example, in the aforementioned remote home monitoring scenario, the second image may be an image captured at a certain time after the first image is captured by the camera, and sent to the first terminal device and displayed by the first terminal device.
In the embodiment of the present application, for the aforementioned video call scenario, it is considered that even if the position of the first terminal device changes after the first image is sent to the second terminal device, the actual position in the real three-dimensional space of the first position on the target object corresponding to the operation triggered by the user on the second terminal device does not change within the above time difference. Similarly, for the aforementioned remote home monitoring scenario, even if the camera shakes after capturing the first image, or its capturing angle changes, the actual position in the real three-dimensional space of the first position on the target object corresponding to the operation triggered by the user on the second terminal device does not change within that time difference. In this embodiment of the application, the second position in the virtual coordinate system corresponding to the first position on the target object, as determined by the first terminal device, may represent the actual position of the first position on the target object in the real three-dimensional space. After the second position is determined, a mark may be added at the second position in the virtual coordinate system and rendered on a second image including the target object displayed on the first terminal device, so that the position of the mark on the second image is consistent with the position of the first position on the target object in the second image, thereby achieving the effect of "hand-in-hand" remote assistance.
The embodiment of the present application does not particularly limit the specific implementation manner of rendering the mark on the second image. As an example, a rendering engine provided by the AR service may be called to render the mark on the second image. Specifically, the projection matrix of the first terminal device and the view information corresponding to the second image may be provided to the rendering engine, and the rendering engine may render the mark on the screen of the first terminal device according to the projection matrix of the first terminal device, the view information corresponding to the second image, and the second position of the mark in the virtual coordinate system, so that the position of the mark on the second image is consistent with the position of the first position on the target object in the second image.
The embodiment of the present application does not specifically limit the view information corresponding to the second image, and the view information may be used to describe a position where the camera capturing the second image is located in the virtual coordinate system when the second image is captured, and a capturing direction of the camera capturing the second image in the virtual coordinate system when the second image is captured. In this embodiment of the application, the first terminal device may obtain the view information corresponding to the second image by using the AR service.
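As a purely illustrative sketch of the rendering principle described above, the following Python snippet projects the mark's second position in the virtual coordinate system onto the screen using a projection matrix and a view matrix of the kind an AR service can provide. The function name, the OpenGL-style clip space, and the top-left pixel origin are assumptions for illustration, not the rendering engine's actual API.

```python
import numpy as np

def project_mark_to_screen(mark_pos, proj, view, width, height):
    # mark_pos: the mark's second position in the virtual coordinate system (x, y, z)
    # proj, view: assumed 4x4 projection and view matrices for the second image
    p = np.append(np.asarray(mark_pos, dtype=float), 1.0)  # homogeneous coordinates
    clip = proj @ view @ p                                  # virtual coordinates -> clip space
    ndc = clip[:3] / clip[3]                                # perspective divide
    u = (ndc[0] + 1.0) / 2.0 * width                        # NDC x -> pixel column
    v = (1.0 - ndc[1]) / 2.0 * height                       # NDC y -> pixel row (y flipped)
    return u, v
```

If the resulting position lies within the screen bounds, the mark is drawn there; because the second position is fixed in the virtual coordinate system, the mark tracks the first position on the target object across successive frames.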
Step 105: the first terminal device displays the mark rendered on the second image.
It can be understood that, after the first terminal device renders the mark on the second image, display content is generated in which the mark appears at the same position on the second image as the first position on the target object. In this embodiment of the application, after the mark is rendered on the second image, the rendered result may be displayed through the corresponding hardware device, so that the user can view the mark in the displayed content. In this embodiment, the first terminal device displays the mark, so that the user of the first terminal device can determine, according to the mark, the first position on the target object targeted by the operation performed by the user of the second terminal device, and can then perform further operations based on the mark.
As can be seen from the above description, with the scheme of the embodiment of the present application, the effect of "hand-in-hand" remote assistance can be achieved.
It can be understood that, during the video call, the first terminal device may send all of the multiple frames of images captured by the first terminal device to the second terminal device; similarly, for the aforementioned remote home monitoring scenario, the first terminal device may send all of the multiple frames of images from the camera to the second terminal device. In order to determine which frame of image the operation triggered by the user on the second terminal device corresponds to, in this embodiment of the application, when the first terminal device sends the first image to the second terminal device, it may also send the identifier of the first image; correspondingly, when the second terminal device sends the first position information to the first terminal device, it may also send the identifier of the first image, so that the first terminal device determines that the first position information corresponds to the first image.
The embodiment of the present application does not specifically limit the identifier of the first image, which may be, for example, the time when the first image was captured. The identifier may also be, for example, the field-of-view information corresponding to the first image, where the field-of-view information represents the target position, in the virtual coordinate system, of the camera that captured the first image at the moment of capture, and the target shooting direction of that camera in the virtual coordinate system at the moment of capture. In this embodiment of the application, the first terminal device may obtain the identifier of the first image by using the AR service, so as to send the identifier of the first image to the second terminal device.
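To make the association between the user's operation and a specific frame concrete, the following is a purely hypothetical sketch of a per-frame payload that the first terminal device could send along with each image. The field names and types are illustrative assumptions, not a format defined by the embodiment; the resolution and view information fields reflect the data that, as described later, may also need to be sent to the second terminal device.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

@dataclass
class FramePayload:
    # Hypothetical per-frame message from the first terminal device.
    image_bytes: bytes                 # the (possibly compressed) video frame, e.g. the first image
    image_id: str                      # identifier of the frame, e.g. its capture time
    resolution: Tuple[int, int]        # image resolution (width, height), since compression may hide it
    view_info: Dict[str, object]       # camera position and shooting direction in the virtual coordinate system
```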
The following describes a specific implementation manner in which the first terminal device, in the foregoing step 103, determines according to the first position information that the first position on the target object corresponds to the second position in the virtual coordinate system.
Referring to fig. 4, the flowchart is a schematic flowchart of a method for determining that a first location on a target object corresponds to a second location in a virtual coordinate system according to an embodiment of the present application. The method may be implemented, for example, by the following steps 201-202.
Step 201: and the first terminal equipment determines the image position of the first position on the target object in the image coordinate system corresponding to the first image according to the first position information and the image resolution of the first image.
In this embodiment of the application, for the video call scenario, the first terminal device may obtain the image resolution of the first image through the AR service; when the first terminal device starts the remote assistance function, the AR service may configure the image resolution of the images captured by the first terminal device. Specifically, the AR service may obtain the screen resolution of the first terminal device, so as to configure, according to the screen resolution, the image resolution of the images captured by the first terminal device, such as the first image. Generally, the AR service configures the image resolution of the first image to be smaller than the screen resolution of the first terminal device. For the remote home monitoring scenario, the image resolution of the first image may be sent to the first terminal device by the camera that captured the first image.
For the case that the aforementioned first position information may be information describing a relative image position of the first position on the target object in the first image, when the step 201 is implemented, for example, a product of the first position information and an image resolution of the first image may be determined as an image position of the first position on the target object in an image coordinate system corresponding to the first image. For example, if the first position information is coordinates (0.5,1.0), and the image resolution of the first image is 480 × 480, the image position of the first position on the target object in the image coordinate system corresponding to the first image is determined to be (480 × 0.5,480 × 1.0), that is, (240,480).
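As a minimal illustration of step 201, assuming the first position information takes the relative-position form described above, the following Python snippet reproduces the example in the text; the function name is an assumption.

```python
def to_image_coords(x_rel, y_rel, width, height):
    # Relative image position (values in 0..1) multiplied by the image resolution
    # gives the position in the image coordinate system of the first image.
    return x_rel * width, y_rel * height

# Example from the text: (0.5, 1.0) on a 480 x 480 image -> (240.0, 480.0)
print(to_image_coords(0.5, 1.0, 480, 480))
```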
Step 202: and the first terminal equipment determines a second position of the first position on the target object corresponding to the virtual coordinate system according to the image position.
Before describing the specific implementation of step 202, first, the shooting principle of the camera of the first terminal device is described with reference to fig. 5. Fig. 5 is a schematic view of a viewing cone provided in an embodiment of the present application.
It should be noted that the shape of the viewing cone determines how the image captured by the camera is mapped onto the screen of the first terminal device. Referring to fig. 5, point O in fig. 5 is the position of the camera in the virtual coordinate system, and 502 denotes the viewing angle of the camera; the plane 502 in the virtual coordinate system shown in fig. 5 corresponds to the screen of the first terminal device, and the plane 503 corresponds to the farthest range that the camera can photograph. The mark added at the second position in the virtual coordinate system corresponding to the aforementioned first position on the target object may be, for example, the virtual object 504 shown in fig. 5.
In a specific implementation, step 202 may first determine, in combination with the viewing cone of the first terminal device when the first image was captured, the direction of the first position on the target object in the virtual coordinate system, and then determine, according to that direction and the virtual object corresponding to the target object, the second position in the virtual coordinate system of the first position on the target object. The virtual object corresponding to the target object has been described in detail above in the construction of the virtual coordinate system, and is not described again here.
Referring to fig. 6, a possible implementation manner of step 202 is described below, and fig. 6 is a flowchart illustrating a method for determining that a first location on a target object corresponds to a second location in a virtual coordinate system according to an embodiment of the present application. The method may be implemented, for example, by the following steps 301-303.
Step 301: the first terminal equipment acquires the visual field information corresponding to the first image.
It should be noted that the view information corresponding to the first image may describe the position, in the virtual coordinate system, of the camera that captured the first image at the moment the first image was captured, and the shooting direction of that camera in the virtual coordinate system at that moment. For the video call scenario, the view information corresponding to the first image may describe the position of the camera of the first terminal device in the virtual coordinate system when the first terminal device captured the first image, and the shooting direction of that camera in the virtual coordinate system at that moment. For convenience of description, the "position of the camera that captured the first image in the virtual coordinate system when the first image was captured" is referred to as the "target position", and the "shooting direction of the camera that captured the first image in the virtual coordinate system when the first image was captured" is referred to as the "target shooting direction".
As described above, in the embodiment of the present application, the view information corresponding to the first image may be acquired through the AR service.
Step 302: the first terminal device calculates the position of the target ray in the virtual coordinate system according to the image position and the view information corresponding to the first image; the target ray is a ray whose end point is the target position and which passes through the second position in the virtual coordinate system.
The embodiment of the present application is not particularly limited to a specific implementation of calculating the position of the target ray in the virtual coordinate system, and as an example, the position of the target ray in the virtual coordinate system may be calculated by calculating a target equation of the target ray in the virtual coordinate system.
In the embodiment of the present application, the target equation may represent a direction of the first position on the target object in the virtual coordinate system, that is, a direction of the second position in the virtual coordinate system. In one implementation of the embodiments of the present application, the target equation may be determined by the following equations (1) and (2).
[End] = [View]^(-1) × [Projection]^(-1) × [p]    equation (1)
Direction = End - Start    equation (2)
In equation (1), [End] is the position of a point on the target ray; [End] can be a 4 × 1 column vector, i.e., a homogeneous expression of coordinate values, for example (XW, YW, ZW, 1)^T;
[View]^(-1) represents the inverse matrix of the view matrix [View] corresponding to the first image, and is determined by the view corresponding to the first image, that is, by the shooting direction, in the virtual coordinate system, of the camera that captured the first image at the moment of capture, and the position of that camera in the virtual coordinate system at that moment;
it should be noted that [View]^(-1) may also be referred to as the pose matrix of the camera when it captured the first image, and the pose matrix may be, for example, a 4 × 4 matrix. The pose matrix of the camera when it captured the first image may be, for example, [[R_C, T_C], [0, 1]], where R_C is the rotation matrix of the camera in the world coordinate system, which may be, for example, a 3 × 3 matrix, and T_C is the translation vector of the camera in the world coordinate system, which may be, for example, a 3 × 1 column vector. After the first terminal device starts the AR service, it may obtain the rotation matrix R_C and the translation vector T_C of the camera in the world coordinate system through the AR service, so as to obtain the pose matrix [View]^(-1);
project (-1) is an inverse matrix of a Projection matrix corresponding to the first image, the first terminal device may obtain the Projection matrix corresponding to the first image by using an AR service, and the first terminal device may also determine the Projection matrix corresponding to the first image according to parameters when the camera captures the first image; the parameters of the first image captured by the camera may include, for example, a horizontal-axis focal length fx of the camera, a vertical-axis focal length fy of the camera, principal point offset values cx and cy corresponding to the camera, a pixel width value width and a pixel height value height of the first image captured by the camera, and the like. The horizontal axis focal length fx and the vertical axis focal length fy of the camera and the principal point offset values cx and cy corresponding to the camera are related to the hardware configuration of the camera, generally speaking, the horizontal axis focal length fx and the vertical axis focal length fy corresponding to a plurality of cameras of the same type and the principal point offset values cx and cy corresponding to the cameras are the same;
[p] is the image position, in the image coordinate system corresponding to the first image, of the first position on the target object; similar to [End], [p] can be a 4 × 1 column vector, i.e., a homogeneous expression of coordinate values;
Direction is the quantity expressed by the target equation, that is, the direction of the target ray in the virtual coordinate system;
End is the coordinate representation of [End], that is, the coordinates of a point on the target ray in the virtual coordinate system;
Start represents the target position, in the virtual coordinate system, of the camera that captured the first image at the moment of capture, that is, the coordinates of that camera in the virtual coordinate system when the first image was captured.

In yet another implementation of the embodiments of the present application, the target equation may be determined by the following equations (3) and (4).
[End] = [extrinsic parameter matrix] × [intrinsic parameter matrix]^(-1) × [p]    equation (3)
Direction = End - Start    equation (4)
In equation (3), [End] is the position of a point on the target ray;
the extrinsic parameter matrix is determined by the position and the orientation of the camera in the virtual coordinate system when the first image was captured; the extrinsic parameter matrix may be the aforementioned pose matrix, that is, the inverse matrix of the view matrix [View] corresponding to the first image.
The intrinsic parameter matrix is determined according to the camera intrinsics corresponding to the first image, and may be determined, for example, according to the horizontal-axis focal length fx of the camera, the vertical-axis focal length fy of the camera, and the principal point offset values cx and cy corresponding to the camera. As described above, fx, fy, cx and cy are related to the hardware configuration of the camera, so the intrinsic parameter matrix can be determined by obtaining the identifier of the camera, for example, the model of the camera; a sketch of the standard pinhole form of this matrix is given after equation (4) below.
Equation (4) is the same as equation (2), and is not described in detail here.
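For illustration only, the following minimal sketch assembles an intrinsic parameter matrix from the parameters fx, fy, cx and cy named above, assuming the standard pinhole-camera layout; the document does not mandate this exact form.

```python
import numpy as np

def intrinsic_matrix(fx, fy, cx, cy):
    # Standard pinhole intrinsic (internal parameter) matrix built from the camera's
    # horizontal/vertical focal lengths and principal point offsets.
    return np.array([[fx, 0.0, cx],
                     [0.0, fy, cy],
                     [0.0, 0.0, 1.0]])
```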
In the embodiment of the present application, the target equation may also be calculated in other manners. For example, instead of using matrices, the target equation may be calculated directly from the camera intrinsics corresponding to the first image and the position and orientation of the camera in the virtual coordinate system when the first image was captured; considering that a matrix can itself be split into several equations, the foregoing target equation may also be calculated directly using an expression in equation form that is equivalent to the expression in matrix form. In either case, whether the calculation is performed with matrices or with equations, the basic principle is the same.
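As a hedged sketch of equations (1) and (2), the following Python function maps the image position of the first position on the target object to the target ray in the virtual coordinate system. The 4 × 4 projection and view matrices are assumed to come from the AR service; the pixel-to-NDC conversion, the y-axis flip, and the choice of a far-plane point are assumptions made for illustration.

```python
import numpy as np

def target_ray(u, v, width, height, proj, view):
    # Pixel coordinates -> normalized device coordinates (a point on the far clipping plane).
    p = np.array([2.0 * u / width - 1.0,
                  1.0 - 2.0 * v / height,
                  1.0,
                  1.0])
    # [End] = [View]^(-1) x [Projection]^(-1) x [p]   (equation (1))
    end_h = np.linalg.inv(view) @ np.linalg.inv(proj) @ p
    end = end_h[:3] / end_h[3]
    # Start: the target position, i.e. the camera position in the virtual coordinate
    # system, taken from the translation part of the pose matrix [View]^(-1).
    start = np.linalg.inv(view)[:3, 3]
    # Direction = End - Start   (equation (2))
    direction = end - start
    return start, direction / np.linalg.norm(direction)
```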
Step 303: the first terminal device determines the intersection point position of the target ray and the virtual object in the virtual coordinate system as the second position of the first position on the target object corresponding to the virtual coordinate system.
It can be understood that a camera captures a two-dimensional image of objects in the real three-dimensional space, so that all points on a ray such as the target ray correspond to the same point on the two-dimensional image; the image position alone therefore cannot distinguish between them. For this reason, after determining the equation of the target ray in the virtual coordinate system, the virtual object in the virtual coordinate system may further be combined to determine the second position in the virtual coordinate system corresponding to the first position on the target object.
The intersection point of the target ray and the virtual object is, in effect, the second position in the virtual coordinate system corresponding to the first position on the target object, that is, to the screen position where the user triggered the operation on the second terminal device.
In this embodiment of the present application, a collision function provided by the AR service may be called to calculate the intersection position of the target ray and the virtual object in the virtual coordinate system, so as to determine the second position in the virtual coordinate system corresponding to the first position on the target object. The collision function is not specifically limited in the embodiments of the present application, and may be, for example, the hitTest function provided by ARCore.
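The following is a minimal stand-in for such a collision test, assuming the virtual object is a fitted virtual plane. It only illustrates the ray-plane geometry; an AR SDK's actual hit test operates on the trackables it has detected and is not reproduced here.

```python
import numpy as np

def ray_plane_hit(start, direction, plane_point, plane_normal, eps=1e-6):
    # Intersect the target ray with a fitted virtual plane; the returned point is
    # the candidate second position in the virtual coordinate system.
    start = np.asarray(start, dtype=float)
    direction = np.asarray(direction, dtype=float)
    denom = float(np.dot(plane_normal, direction))
    if abs(denom) < eps:
        return None                     # ray parallel to the plane: no intersection
    t = float(np.dot(plane_normal, np.asarray(plane_point) - start)) / denom
    if t < 0:
        return None                     # intersection lies behind the ray's endpoint
    return start + t * direction
```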
The display method applied to the first terminal device in the embodiment of the present application is described above, and a position determination method applied to the second terminal device is described below with reference to the drawings.
It should be noted that the position determination method applied to the second terminal device is based on a principle similar to that of the aforementioned display method applied to the first terminal device. For the aforementioned video call scenario, even if the position of the first terminal device changes after the first image is sent to the second terminal device, the actual position, in the real three-dimensional space, of the first position on the target object corresponding to the operation triggered by the user on the second terminal device usually does not change within the time difference. Similarly, for the aforementioned remote home monitoring scenario, even if the camera shakes after capturing the first image, or its capturing angle changes, that actual position does not change within the time difference. Therefore, in the embodiment of the present application, the second terminal device may determine, according to the aforementioned first position information, the second position in the virtual coordinate system corresponding to the first position on the target object, and this second position describes the actual position of the first position on the target object in the real three-dimensional space. After the second position is determined, it may be sent to the first terminal device, so that the first terminal device adds a mark at the second position in the virtual coordinate system, renders the mark on a second image including the target object displayed on the first terminal device, and displays the mark rendered on the second image.
Referring to fig. 7, the figure is a schematic flowchart of a position determination method according to an embodiment of the present application. The method may be implemented, for example, by steps 401-403 as follows.
Step 401: the second terminal equipment receives the first image sent by the first terminal equipment and displays the first image on a screen of the second terminal equipment; the first image is an image including a target object.
Step 402: the second terminal device receives an operation instruction which is triggered by the user on the second terminal device and aims at a first position on the target object, and determines first position information, wherein the first position information corresponds to a position, in the first image, of the user on the second terminal device, aiming at the first position on the target object.
In one implementation of the embodiment of the present application, the first position information may be information describing a relative image position of the first position on the target object in the first image. For this case, the second terminal device may determine the first position information according to the aforementioned screen position and the display area of the first image on the screen of the second terminal device.
It should be noted that the principle of steps 401-402 is similar to that of steps 101-102, except that steps 101-102 are executed by the first terminal device while steps 401-402 are executed by the second terminal device; therefore, for the description of steps 401-402, reference may be made to the description of steps 101-102, and details are not repeated here.
Step 403: the second terminal equipment determines that the first position on the target object corresponds to the second position in the virtual coordinate system according to the first position information, and sends the first position on the target object corresponding to the second position in the virtual coordinate system to the first terminal equipment; the second position in the virtual coordinate system represents the actual position of the first position on the target object in the real three-dimensional space.
Regarding the description of the virtual coordinate system, reference may be made to the relevant description part in step 103 above, and details are not described here.
It should be noted that the manner in which the second terminal device determines, according to the first position information, the second position in the virtual coordinate system corresponding to the first position on the target object is similar to the manner in which the first terminal device does so: the image position of the first position on the target object in the image coordinate system corresponding to the first image is first determined according to the first position information and the image resolution of the first image, and then the second position in the virtual coordinate system corresponding to the first position on the target object is determined according to that image position.
It is considered that, when the first terminal device sends the first image to the second terminal device, the first image may be sent as a compressed image in order to reduce the bandwidth occupied by the transmission; in that case, the second terminal device may not be able to obtain the image resolution of the first image from the first image itself. Therefore, when the first terminal device sends the first image to the second terminal device, it may also send the image resolution of the first image, so that the second terminal device can determine the image position of the first position on the target object in the image coordinate system corresponding to the first image according to the first position information and the image resolution of the first image, and then determine, according to that image position, the second position in the virtual coordinate system corresponding to the first position on the target object.
It should be noted that the specific implementation manner in which the second terminal device determines, according to the image position of the first position on the target object in the image coordinate system corresponding to the first image, the second position in the virtual coordinate system corresponding to the first position on the target object is the same as that of the first terminal device: the position of the target ray in the virtual coordinate system is calculated according to the view information corresponding to the first image and the image position of the first position on the target object in the image coordinate system corresponding to the first image, where the target ray is a ray whose endpoint is the target position and which passes through the second position in the virtual coordinate system; then, the intersection position of the target ray and the virtual object in the virtual coordinate system is determined as the second position, in the virtual coordinate system, of the first position on the target object.
However, because the first image is not an image captured by the second terminal device, the second terminal device may not be able to obtain the view information corresponding to the first image by itself. Therefore, when the first terminal device sends the first image to the second terminal device, it may also send the view information corresponding to the first image, so that the second terminal device can determine, based on that view information, the second position in the virtual coordinate system corresponding to the first position on the target object.
It can be understood that, when the second terminal device determines, according to the view information corresponding to the first image, the second position in the virtual coordinate system corresponding to the first position on the target object, it needs to combine the related information of the virtual coordinate system, such as the position of the virtual object corresponding to the target object in the virtual coordinate system. As can be seen from the foregoing description, the virtual coordinate system is constructed by the first terminal device according to information in the real three-dimensional space where the target object is located, for example, the positions of objects in that space. Therefore, if the second terminal device and the target object are not in the same environment, the second terminal device cannot construct the same virtual coordinate system as that constructed by the first terminal device. For this reason, the first terminal device may also send the related information of the virtual coordinate system it constructed to the second terminal device; after receiving this information, the second terminal device may determine, in combination with it, the second position in the virtual coordinate system corresponding to the first position on the target object.
After the second terminal device sends, to the first terminal device, the second position in the virtual coordinate system corresponding to the first position on the target object, the first terminal device may add a mark at the second position in the virtual coordinate system, render the mark on the second image, and display the mark rendered on the second image; the second image is an image including the target object displayed by the first terminal device, and the second position in the virtual coordinate system reflects the actual position of the first position on the target object in the real three-dimensional space.
For the implementation manner in which, after the second terminal device sends, to the first terminal device, the second position in the virtual coordinate system corresponding to the first position on the target object, the first terminal device adds a mark at the second position in the virtual coordinate system, renders the mark on the second image, and displays the mark rendered on the second image, reference may be made to the relevant descriptions of steps 103 to 105 in the above embodiment, and details are not repeated here.
It should be noted that, in this embodiment, the specific information sent by the second terminal device to the first terminal device is not specifically limited, as long as the first terminal device can determine, according to that information, the second position in the virtual coordinate system corresponding to the first position on the target object. In view of this, in an implementation manner of the embodiment of the present application, step 403 may be replaced as follows: the second terminal device determines the position of the aforementioned target ray in the virtual coordinate system according to the first position information, and sends the position of the target ray in the virtual coordinate system to the first terminal device. The first terminal device then adds a mark at the second position in the virtual coordinate system corresponding to the first position on the target object, renders the mark on the second image, and displays the mark rendered on the second image.
In the above embodiment, the video stream sent by the first terminal device to the second terminal device includes a plurality of video frames, each of which is an image. Correspondingly, after sending the first image to the second terminal device, the first terminal device still continues to send images. From the time the second terminal device starts receiving images from the first terminal device to the time the first image is received, the second terminal device has therefore continuously received a plurality of images. When the user on the second terminal device side starts inputting an operation instruction for the first position on the target object, what the user sees may no longer be the first image but another image sent after the first image (for example, the i-th image). The operation instruction triggered by the user for the target object is nevertheless associated with the first image, that is, the position information of the first position is associated with the first image. Moreover, the first terminal device may shake during the time the user is inputting the operation instruction, so the picture seen by the user while inputting the operation instruction on the second terminal device changes; the user on the second terminal device side may then have the illusion that the input position is incorrect, which degrades the effect of the remote assistance function.
Illustratively, as shown in fig. 11A, the first terminal device continuously transmits images to the second terminal device, and accordingly the second terminal device receives an image 1101, an image 1102, an image 1103, an image 1104, and an image 1105 in sequence. The image 1101 is, for example, the first image described in the above embodiment. When the second terminal device receives and displays the image 1101, the user on the second terminal device side triggers the second terminal device to input an operation instruction. By the time the operation instruction has been completely received, what is displayed on the second terminal device is, for example, the image 1105. As shown in fig. 11B, the position where the user starts the trigger on the second terminal device is, for example, the position P of the ball, and the image 1105 is displayed when the user finishes the trigger (as shown in fig. 11C). However, the second terminal device associates the position triggered by the user with the image 1101. If the first terminal device shakes during the transmission of the images 1101 to 1105, then, because of the picture change, the position P on the image 1101 appears as the position Q illustrated in fig. 11C after the user finishes the trigger. As shown in fig. 11C, the visual effect of the position Q for the user is not the ball that the user wanted to mark; the user may therefore think that the triggered position is wrong, and the above embodiment may thus give the user on the second terminal device side a poor experience.
In order to solve the problem, the embodiment of the application further provides a display method. As shown in fig. 12, the method may be implemented, for example, by steps 501-506 as follows.
Step 501: the second terminal equipment acquires the first image and statically displays the first image on a screen of the second terminal equipment; the first image is an image including a target object.
The "static display" means that after the second terminal device acquires the first image, the first image is displayed on the main interface of the second terminal device, and even if the second terminal device still receives images from the first terminal device continuously, the received other images are not displayed on the main interface, for example, the received other images may be displayed through a small window on the main interface, or the received other images may not be displayed.
It can be understood that, during remote assistance, the first terminal device, as the party seeking help, displays the video it collects, and the second terminal device, as the party providing assistance, displays the video collected and sent by the first terminal device.
The screen capture may be triggered when the remote assistance function is started, and the remote assistance function may be started by the user on the first terminal device side or by the user on the second terminal device side. In one implementation, when the user on the first terminal device side triggers the first terminal device to send a remote assistance request to the second terminal device, the video on the first terminal device side is captured to obtain the first image, and the first terminal device then sends the first image to the second terminal device. In a second implementation, the user on the first terminal device side triggers the first terminal device to send a remote assistance request to the second terminal device, and when the second terminal device receives the remote assistance request, the second terminal device captures the video displayed on the second terminal device to obtain the first image. In a third implementation, the user on the second terminal device side triggers the remote assistance function to be started; the second terminal device sends a video screen capture request to the first terminal device, and the first terminal device, upon receiving the request, captures the video displayed on the first terminal device to obtain the first image and sends the first image to the second terminal device.
The screen capture may also be triggered by the user on the first terminal device side or the user on the second terminal device side after the remote assistance function has been turned on. In one implementation, during remote assistance, the user on the first terminal device side triggers a "screen capture" control on the interface of the first terminal device; the first terminal device receives the screen capture instruction, captures the video on the first terminal device side to obtain the first image, and then sends the first image to the second terminal device. In another implementation, during remote assistance, the user on the second terminal device side triggers a "screen capture" control on the interface of the second terminal device; the second terminal device receives the screen capture instruction and captures the video on the second terminal device side to obtain the first image. In yet another implementation, during remote assistance, the user on the second terminal device side triggers a "screen capture" control on the interface of the second terminal device; the second terminal device receives the screen capture instruction and sends a video screen capture request to the first terminal device, and the first terminal device, upon receiving the request, captures the video displayed on the first terminal device to obtain the first image and sends the first image to the second terminal device.
Illustratively, as shown in fig. 13A-1, the second terminal device is able to continuously receive images from the first terminal device, for example, the second terminal device receives an image 1301, an image 1302, an image 1303, an image 1304, and an image 1305 in this order. After the first terminal device sends the image 1301 to the second terminal device, for example, a remote assistance request is sent to the second terminal device in response to a trigger of the user on the side of the first terminal device. In this embodiment, after receiving a trigger of a user, the first terminal device captures a screen to obtain the image 1302, and further sends the image 1302 and the remote assistance request to the second terminal device. The second terminal device receives the image 1302 and displays the image 1302 on the main interface. As shown in fig. 13B, the second terminal device continues to receive the image 1303, the image 1304, and the image 1305, and displays the image 1303, the image 1304, and the image 1305, and the received subsequent images through the small window 131 on the main interface on which the image 1302 is displayed. In addition, another possible implementation manner is that after receiving the trigger of the user, the first terminal device sends a remote assistance request to the second terminal device, and then sends the first image to the second terminal device. And are not limited herein.
As shown in fig. 13A-2, the first terminal device transmits an image 1301, an image 1302, an image 1303, an image 1304, and an image 1305 in order to the second terminal device. After the first terminal device sends the image 1301 to the second terminal device, for example, a remote assistance request is sent to the second terminal device in response to a trigger of the user on the side of the first terminal device. In this embodiment, after the second terminal device receives the remote assistance request, the second terminal device captures a screen to obtain the image 1302. Further, as shown in fig. 13B, the second terminal device displays the image 1302 on the main interface, and still continues to receive the image 1303, the image 1304, and the image 1305, and displays the image 1303, the image 1304, and the image 1305, and the received subsequent images through the small window 131 on the main interface on which the image 1302 is displayed.
As shown in fig. 13A-3, the first terminal device has initiated a remote assistance request to the second terminal device. The first terminal device transmits an image 1301, an image 1302, an image 1303, an image 1304, and an image 1305 to the second terminal device in order. After the first terminal device transmits the image 1301 to the second terminal device, for example, a screen capture request input by a user on the first terminal device side is received. In response to the screen capture request, the first terminal device captures the screen to obtain the image 1302, and further, transmits the image 1302 to the second terminal device. The manner in which the second terminal device displays the image 1302 and the images 1303, 1304, and 1305 is as shown in fig. 13B and will not be described in detail here.
As shown in fig. 13A-4, the first terminal device has initiated a remote assistance request to the second terminal device. The first terminal device transmits an image 1301, an image 1302, an image 1303, an image 1304, and an image 1305 to the second terminal device in order. After receiving the image 1301, the second terminal device receives, for example, a screen capture request input by a user. In response to the screen capture request, the second terminal device captures a screen to obtain an image 1302. The manner in which the second terminal device displays the image 1302 and the images 1303, 1304, and 1305 is as shown in fig. 13B and will not be described in detail here.
It should be noted that in some embodiments, in the embodiments illustrated in fig. 13A-3 and fig. 13A-4, the user may trigger the "screen capture" shortcut key of the corresponding terminal device, so that the corresponding terminal device performs the screen capture operation. In other embodiments, during the execution of the remote assistance function, a "screen capture" control may be displayed on the interface of at least one of the first terminal device and the second terminal device. Illustratively, as shown in fig. 13C, the "screen capture" control displayed on the terminal device interface may be a button 132 labeled with a "screen capture" identifier. Of course, in other embodiments, the "screen capture" control displayed on the terminal device interface may be an icon indicating "screen capture". And are not limited herein.
It is to be understood that fig. 13A-1 to 13A-4 are only schematic descriptions and do not constitute a limitation on the method of acquiring the first image in the present embodiment. In other implementation manners, the first terminal device may further capture a screen to obtain a first image in response to an instruction of the second terminal device, and then send the first image to the second terminal device. And is not described in detail herein.
It can be understood that, when the remote assistance function is turned off or the user triggers to cancel the screen capture, the video stream sent by the first terminal device continues to be displayed on the second terminal device. For example, the interface illustrated in fig. 13B may further include a button 133 for triggering the remote assistance function to be turned off, and the button 133 may be labeled with "complete assistance", for example. Further, the user on the second terminal device side can turn off the remote assistance function by clicking the button 133 after the mark is triggered at the position P of the image 1302. Further, after the second terminal device receives the close instruction, the small window 131 may be enlarged to the whole screen to display the video stream transmitted from the first terminal device on the main interface of the second terminal device.
Before the screen capture is triggered, the video stream from the first terminal device is displayed on the second terminal device, that is, the displayed picture changes as the video content collected and sent by the first terminal device changes. After the screen capture is triggered, the first image is displayed on the second terminal device instead of the video stream; the second terminal device may still receive the video stream sent by the first terminal device but not display it, or the first image may be displayed on the main interface of the second terminal device while the video stream is displayed in a small window on the screen of the second terminal device (as in the exemplary embodiment of fig. 13B). After the screen capture is triggered, the first terminal device may continue to send the video stream to the second terminal device; alternatively, it may stop sending the video stream and resume after the user triggers cancellation of the screen capture, or stop sending and resume after the user on the first terminal device side or the user on the second terminal device side triggers completion of the remote assistance.
Step 502: the second terminal device receives an operation instruction which is triggered by a user on the first image and aims at a first position on the target object, and determines first position information.
After the screen capture, the first image remains displayed on the main interface of the second terminal device, so that while the user on the second terminal device side is triggering the mark, there is no picture change caused by shaking of the first terminal device. The first position of the target object that the user sees and triggers is the first position of the target object displayed in the first image, and the first position information corresponds to the position, in the first image, of the first position on the target object targeted by the user's operation on the second terminal device. This can improve the experience of the user on the second terminal device side.
Illustratively, the image 1302 remains displayed on the main interface of the second terminal device. Accordingly, what the user wants to mark is the first position of the target object displayed in the image 1302, such as the position P of the ball in fig. 11B. Furthermore, after the user triggers the operation instruction, the first position information determined by the second terminal device can accurately indicate the position P of the ball, and the situation in which the user mistakenly believes that the mark is wrong because of picture jitter does not occur.
Therefore, with this implementation, the position triggered by the user on the second terminal device side can be accurately identified, and the first position information can accurately indicate the first position of the target object marked by the user, which improves the accuracy of position determination and optimizes the effect of "hand-in-hand" remote assistance.
For other relevant descriptions regarding step 502, reference may be made to the description above for step 102, which is not detailed here.
Step 503: the second terminal equipment determines that the first position on the target object corresponds to the second position in the virtual coordinate system according to the first position information, and sends the first position on the target object corresponding to the second position in the virtual coordinate system to the first terminal equipment; the second position in the virtual coordinate system corresponds to an actual position of the first position on the target object in the real three-dimensional space.
Step 504: the second terminal device sends the second location to the first terminal device.
The implementation process of step 503-504 can refer to the implementation process of step 403 in fig. 7, and will not be described in detail here.
Step 505: rendering the mark on the second image according to the second position by the first terminal device, so that the position of the mark on the second image is consistent with the position of the first position on the target object in the first image; the second image is an image including the target object displayed by the first terminal device.
For the description of step 505, reference may be made to the above description of step 104, which is not detailed here.
Step 506: the first terminal device displays the mark rendered on the second image.
For the description of step 506, reference may be made to the above description of step 105, which is not detailed here.
The method illustrated in steps 501 to 506 is only an exemplary description of the present application and does not limit embodiments of the present application.
It can be understood that, after the user on the first terminal device side or the user on the second terminal device side triggers cancellation of the screen capture, or triggers closing of the remote assistance function, the second terminal device continues to display the video stream sent by the first terminal device and no longer displays the first image.
As can be seen, in the method described in this embodiment, during the period when the user on the second terminal device side triggers the mark, the second terminal device keeps displaying the first image on its main interface unchanged. When the user triggers the first position of the target object, the user is referring to the target object displayed in the first image. In this way, the position of the mark can be accurately identified, and the user can clearly see that the triggered position is accurately associated with the first position of the target object on the first image. Accordingly, when the first terminal device renders the mark on the second image including the target object displayed on the first terminal device, the position of the mark displayed on the second image can accurately correspond to the object that the user wants to mark. This improves the performance of "hand-in-hand" remote assistance and optimizes its effect.
Optionally, in some embodiments, the second terminal device may send the first location information to the first terminal device after determining the first location information in step 502. Thereafter, the first terminal device may perform the operations of steps 103-105. And is not described in detail herein.
Optionally, in some embodiments, the second terminal device acquires the view information corresponding to the first image. The content displayed by the second terminal device comes from the first terminal device; the posture of the first terminal device affects the position of the target object displayed in the first image, and this posture may change frequently, that is, the view information corresponding to the images captured by the first terminal device may change. Based on this, in some embodiments, the first terminal device may also send the view information corresponding to the first image to the second terminal device. For example, the first terminal device may send this view information when sending the first image, or may send it when the second terminal device requests it.
When the first terminal device shoots a video, feature points of the captured objects can be obtained based on the content of the video. For example, when the first terminal device shoots a cube, feature points of the cube can be generated; the feature points are used to describe the cube in the video image and are generally concentrated where light and dark meet in the video image. It can be understood that the feature points may indicate corners of the cube, or indicate its edges or faces, and that, if there are other objects in the shooting environment, feature points of those objects can also be generated. The information of the feature points may be sent to the second terminal device together with the first image, or upon request of the second terminal device.
In some embodiments, the first terminal device may not send the feature point information to the second terminal device; instead, after receiving the first image, the second terminal device may compute the feature point information from the first image.
Therefore, in some embodiments, in step 502, after the second terminal device receives the user's trigger on the first image, it determines the position information corresponding to the triggered area. The second terminal device may then select part of that position information as the first position information, based on the view information corresponding to the first image and the feature points of the captured object. The first position information is then sent to the first terminal device, which can render and display the aforementioned mark locally according to the first position information, for example by performing the operations of steps 103 to 105.
Illustratively, the first image is obtained by projecting an object in three-dimensional space into two-dimensional space. As shown in fig. 14A, a first terminal device 1403 shoots a cube 1402 and a background wall 1401 through its camera. The first terminal device 1403 can capture the screen to obtain a first image and send the first image, together with the view information corresponding to the first image, to the second terminal device. The first terminal device 1403 may further calculate feature points of the first image (for example, feature points 1411-1416), which may or may not be displayed on the display screen of the first terminal device 1403. When sending the first image and the corresponding view information, the first terminal device 1403 can also send the information of the feature points 1411-1416 of the captured object.

After receiving the data, the second terminal device displays the first image; as shown in fig. 14B, it may or may not display the feature points 1411-1416. The user on the second terminal device side marks the desired object on the first image, for example by drawing the curve 1420 (i.e., the third position information) on the upper surface of the cube 1402. To reduce the data sent to the first terminal device, the second terminal device may send only the feature point information that intersects the curve 1420, that is, the intersection 1411, 1412, 1413, 1416 of the curve 1420 with the feature points 1411-1416. Further, by combining the view information corresponding to the first image with the feature point information of the captured object, the second terminal device can determine that the user marked the cube 1402 rather than the background wall 1401, and therefore remove the feature point 1416 from the intersection. The second terminal device sends the simplified intersection 1411, 1412, 1413 to the first terminal device, which can then determine that the user marked the top surface of the cube.

It can be understood that if the second terminal device does not remove the feature point 1416, the first terminal device may also remove it when determining the marked object from the intersection 1411, 1412, 1413, 1416, using the locally stored view information corresponding to the first image and the intersection itself. If the feature point 1416 is not removed, then when the first terminal device determines the marked object, the points 1411, 1412, 1413 and 1416 may be displayed connected in 3D space, and the user on the first terminal device side may misunderstand that the user on the second terminal device side marked the background wall 1401 in addition to the cube 1402. It can also be understood that the first terminal device may not send the feature point information at all; in that case the second terminal device calculates the feature point information from the first image and then combines the view information corresponding to the first image with the user's marking curve to obtain the first position information.
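For illustration only, the following sketch shows how the second terminal device might keep only the feature points that the marking curve passes through, as in the example of curve 1420 and feature points 1411-1416 above. The distance threshold and the function name are assumptions made for the sketch.

def feature_points_hit_by_curve(curve_points, feature_points, threshold_px=8.0):
    # curve_points and feature_points are lists of (x, y) pixel positions
    # in the image coordinate system of the first image.
    hit = []
    for fx, fy in feature_points:
        if any((fx - cx) ** 2 + (fy - cy) ** 2 <= threshold_px ** 2
               for cx, cy in curve_points):
            hit.append((fx, fy))
    return hit

The second terminal device could then use the view information corresponding to the first image to remove points that do not lie on the marked object (feature point 1416 on the background wall in the example above) before sending the remaining points to the first terminal device.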
Based on the display method applied to the first terminal device provided in the above embodiments, an embodiment of the present application further provides a terminal device, which is described below with reference to the accompanying drawings.
Referring to fig. 8, the figure is a schematic structural diagram of a terminal device according to an embodiment of the present application. The terminal device 800 provided in the embodiment of the present application may be the first terminal device mentioned in the above embodiments, and the terminal device 800 may include, for example, a first sending unit 801, a first receiving unit 802, a first determining unit 803, an adding unit 804, a rendering unit 805, and a display unit 806.
A first sending unit 801, configured to send a first image to a second terminal device, where the first image is an image including a target object;
a first receiving unit 802, configured to receive first location information from the second terminal device; the first position information corresponds to a position in the first image corresponding to the operation of the user on the second terminal device aiming at the first position on the target object;
a first determining unit 803, configured to determine, according to the first position information, that a first position on the target object corresponds to a second position in a virtual coordinate system; a second position in the virtual coordinate system corresponding to an actual position of the first position on the target object in the real three-dimensional space;
an adding unit 804 for adding a marker at a second position in the virtual coordinate system;
a rendering unit 805 for rendering the marker on a second image such that a corresponding position of the marker on the second image coincides with a position of the first position on the target object in the second image; the second image is an image including the target object displayed on the terminal device;
a display unit 806 for displaying the marker rendered on the second image.
In a possible implementation manner, the first position information is determined according to a screen position corresponding to an operation, triggered by the user on the second terminal device, for the first position on the target object; the first image is displayed on a screen of the second terminal device.
In one possible implementation, the first location information describes a relative location of a first location on the target object in the first image; and the first position information is determined according to the screen position and the position of the display area of the first image on the screen of the second terminal equipment.
In a possible implementation manner, the first determining unit 803 is specifically configured to:
determining the image position of the first position on the target object in the image coordinate system corresponding to the first image according to the first position information and the image resolution of the first image;
and determining a second position of the first position on the target object corresponding to the virtual coordinate system according to the image position.
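For illustration only, the following sketch shows the two conversions involved here: from a screen position on the second terminal device to a relative position in the first image, and from that relative position to a pixel position in the image coordinate system using the image resolution. Expressing the relative position as fractions of the displayed image's width and height is an assumption made for the sketch.

def screen_to_relative(touch_x, touch_y, display_x, display_y, display_w, display_h):
    # display_x/y/w/h describe the display area of the first image on the
    # screen of the second terminal device.
    return ((touch_x - display_x) / display_w,
            (touch_y - display_y) / display_h)

def relative_to_image(rel_x, rel_y, image_w, image_h):
    # image_w and image_h are the image resolution of the first image.
    return (rel_x * image_w, rel_y * image_h)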
In a possible implementation manner, the first determining unit 803 is specifically configured to:
acquiring view information corresponding to a first image, wherein the view information corresponding to the first image is used for describing a target position of a camera shooting the first image in the virtual coordinate system and a target shooting direction of the camera shooting the first image in the virtual coordinate system when the first image is shot;
calculating the position of a target ray in the virtual coordinate system according to the visual field information and the image position; the target ray is a ray of which the end point is the target position and passes through the second position in the virtual coordinate system;
determining the intersection point position of the target ray and a virtual object in the virtual coordinate system as a second position of the first position on the target object corresponding to the virtual coordinate system; and the position of the virtual object in the virtual coordinate system corresponds to the actual position of the target object in the real three-dimensional space.
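For illustration only, the following sketch shows the ray step just described: a ray whose endpoint is the camera's target position in the virtual coordinate system is cast through the image position and intersected with a virtual object, and the intersection point plays the role of the second position. A pinhole camera model and a planar virtual object are assumptions made for the sketch; the embodiment does not prescribe either.

import numpy as np

def image_point_to_ray(cam_pos, cam_rotation, fx, fy, cx, cy, u, v):
    # (u, v) is the image position; fx, fy, cx, cy are assumed pinhole camera
    # intrinsics; cam_rotation is a 3x3 rotation from camera to virtual coordinates.
    d_cam = np.array([(u - cx) / fx, (v - cy) / fy, 1.0])
    d_world = np.asarray(cam_rotation) @ d_cam
    return np.asarray(cam_pos, dtype=float), d_world / np.linalg.norm(d_world)

def intersect_plane(ray_origin, ray_dir, plane_point, plane_normal):
    # Intersection of the target ray with a virtual object approximated by a plane.
    denom = float(np.dot(plane_normal, ray_dir))
    if abs(denom) < 1e-9:
        return None  # ray is parallel to the virtual object
    t = float(np.dot(plane_normal, np.asarray(plane_point) - ray_origin)) / denom
    return None if t < 0 else ray_origin + t * ray_dir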
Since the terminal device 800 is a terminal device corresponding to the display method executed by the terminal device provided in the above method embodiment, and the specific implementation of each unit of the terminal device 800 is the same concept as the above method embodiment, for the specific implementation of each unit of the terminal device 800, reference may be made to the description part of the above method embodiment about the display method executed by the terminal device, and details are not repeated here.
Based on the location determination method applied to the second terminal device provided in the foregoing embodiment, an embodiment of the present application further provides a terminal device, which is described below with reference to the accompanying drawings.
Referring to fig. 9, the figure is a schematic structural diagram of a terminal device according to an embodiment of the present application. The terminal device 900 provided in the embodiment of the present application may be the second terminal device mentioned in the above embodiment, and the terminal device 900 includes, for example: a second receiving unit 901, a third receiving unit 902, a second determining unit 903, a third determining unit 904 and a second transmitting unit 905.
A second receiving unit 901, configured to receive a first image sent by a first terminal device, and display the first image on a screen of the terminal device; the first image is an image including a target object;
a third receiving unit 902, configured to receive an operation instruction, triggered by the user on the terminal device, for the first location on the target object;
a second determining unit 903 for determining the first position information; the first position information corresponds to a position in the first image corresponding to the operation of the user on the terminal equipment aiming at the first position on the target object;
a third determining unit 904, configured to determine, according to the first position information, that the first position on the target object corresponds to a second position in a virtual coordinate system;
a second sending unit 905, configured to send a second position, in the virtual coordinate system, corresponding to the first position on the target object to the first terminal device; the second position in the virtual coordinate system corresponds to an actual position of the first position on the target object in the real three-dimensional space.
In one implementation, the first location information describes a relative location of a first location on the target object in the first image; the terminal apparatus 900 further includes:
a fourth determining unit, configured to determine the first location information according to a screen location corresponding to an operation, triggered by the user on the terminal device, for a first location on the target object and a location of a display area of the first image on a screen of the terminal device.
In an implementation manner, the third determining unit 904 is specifically configured to:
receiving the image resolution of the first image sent by the first terminal equipment;
determining the image position of the first position on the target object in the image coordinate system corresponding to the first image according to the first position information and the image resolution of the first image;
and determining a second position of the first position on the target object corresponding to the virtual coordinate system according to the image position.
In one implementation, the terminal device 900 further includes:
a fourth receiving unit, configured to receive information related to the virtual coordinate system, which is sent by the first terminal device, where the information related to the virtual coordinate system includes a position of a virtual object corresponding to the target object in the virtual coordinate system; and the position of the virtual object corresponding to the target object in the virtual coordinate system corresponds to the actual position of the target object in the real three-dimensional space.
In an implementation manner, the third determining unit 904 is specifically configured to:
receiving visual field information corresponding to the first image sent by a first terminal device, wherein the visual field information corresponding to the first image is used for describing a target position of a camera shooting the first image in the virtual coordinate system and a target shooting direction of the camera shooting the first image in the virtual coordinate system when the first image is shot;
calculating the position of a target ray in the virtual coordinate system according to the visual field information and the image position; the target ray is a ray of which the end point is the target position and passes through the second position in the virtual coordinate system;
and determining the intersection point position of the target ray and the virtual object in the virtual coordinate system as a second position of the first position on the target object corresponding to the virtual coordinate system.
Since the terminal device 900 is a terminal device corresponding to the position determining method executed by the terminal device provided in the above method embodiment, and the specific implementation of each unit of the terminal device 900 is the same concept as the above method embodiment, for the specific implementation of each unit of the terminal device 900, reference may be made to the description part of the position determining method executed by the terminal device in the above method embodiment, and details are not repeated here.
An embodiment of the present application further provides an electronic device, where the electronic device includes: a memory and at least one processor;
the memory to store instructions;
the at least one processor is configured to execute the instructions in the memory, and perform the display method performed by the first terminal device provided in the above embodiment.
An embodiment of the present application further provides an electronic device, where the electronic device includes: a memory and at least one processor;
the memory to store instructions;
the at least one processor, configured to execute the instructions in the memory, performs the position determining method performed by the second terminal device provided in the above embodiments.
It should be noted that both the terminal device 800 and the terminal device 900 provided in the embodiment of the present application may have the structure described in fig. 10, and fig. 10 is a schematic structural diagram of a terminal device provided in the embodiment of the present application.
Referring to fig. 10, the terminal device 1000 includes: a processor 1010, a communication interface 1020, and a memory 1030. The number of the processors 1010 in the terminal device 1000 may be one or more, and one processor is taken as an example in fig. 10. In the embodiment of the present application, the processor 1010, the communication interface 1020 and the memory 1030 may be connected by a bus system or other means, wherein the connection via the bus system 1040 is taken as an example in fig. 10.
Processor 1010 may be a Central Processing Unit (CPU), a Network Processor (NP), or a combination of a CPU and an NP. The processor 1010 may further include a hardware chip. The hardware chip may be an application-specific integrated circuit (ASIC), a Programmable Logic Device (PLD), or a combination thereof. The PLD may be a Complex Programmable Logic Device (CPLD), a field-programmable gate array (FPGA), a General Array Logic (GAL), or any combination thereof.
As known to those skilled in the art, there may also be multiple processors 1010. For example, a typical mobile phone architecture may include an Application Processor (AP), a Baseband Processor (BP), a Graphics Processing Unit (GPU), and, in many recent designs, a Neural-Network Processing Unit (NPU) and an Image Signal Processor (ISP). These processors may be discrete devices or may be integrated on the same chip; a typical System on Chip (SoC) usually integrates one or more of them.
Memory 1030 may include volatile memory, such as random-access memory (RAM); the memory 1030 may also include non-volatile memory, such as flash memory, a hard disk drive (HDD) or a solid-state drive (SSD); memory 1030 may also include a combination of the above types of memory.
The memory 1030 may store information related to the virtual coordinate systems mentioned in the previous embodiments.
Optionally, memory 1030 stores an operating system and programs, executable modules or data structures, or subsets thereof, or extensions thereof, wherein the programs may include various operating instructions for performing various operations. The operating system may include various system programs for implementing various basic services and for handling hardware-based tasks. When the terminal device 1000 is the terminal device 800 mentioned in the foregoing embodiment, the processor 1010 may read a program in the memory 1030 to implement the display method provided in the embodiment of the present application; when the terminal device 1000 is the terminal device 900 mentioned in the foregoing embodiments, the processor 1010 may read the program in the memory 1030 to implement the position determining method provided in the embodiment of the present application.
The bus system 1040 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus system 1040 may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in FIG. 10, but this is not intended to represent only one bus or type of bus.
The embodiment of the present application further provides a computer-readable storage medium including instructions which, when run on a computer, cause the computer to execute the display method performed by the first terminal device provided in the above embodiments.
An embodiment of the present application further provides a computer-readable storage medium including instructions which, when run on a computer, cause the computer to execute the position determination method performed by the second terminal device provided in the above embodiments.
The embodiment of the present application further provides a computer program product containing instructions, which when run on a computer, causes the computer to execute the display method executed by the first terminal device provided in the above embodiment.
Embodiments of the present application further provide a computer program product containing instructions, which when run on a computer, cause the computer to execute the position determining method performed by the second terminal device provided in the above embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solutions of the present application essentially, or the part contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product. The software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to perform all or part of the steps of the methods according to the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or other media capable of storing program code.
The above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (13)

1. A method of displaying, the method comprising:
the method comprises the steps that a first terminal device sends a first image to a second terminal device, wherein the first image is an image comprising a target object;
the first terminal equipment receives first position information from the second terminal equipment; the first position information corresponds to a position in the first image corresponding to the operation of the user on the second terminal device aiming at the first position on the target object;
the first terminal equipment determines that the first position on the target object corresponds to a second position in a virtual coordinate system according to the first position information, and adds a mark at the second position in the virtual coordinate system; a second position in the virtual coordinate system corresponding to an actual position of the first position on the target object in the real three-dimensional space;
the first terminal device renders the mark on a second image, so that the corresponding position of the mark on the second image is consistent with the position of the first position on the target object in the second image; the second image is an image including the target object displayed on the first terminal device;
the first terminal device displays the marker rendered on the second image.
2. The method according to claim 1, wherein the first position information is determined according to a screen position corresponding to an operation, triggered by a user on the second terminal device, for a first position on the target object; the first image is displayed on a screen of the second terminal device.
3. The method of claim 2, wherein the first location information describes a relative location of a first location on the target object in the first image; and the first position information is determined according to the screen position and the position of the display area of the first image on the screen of the second terminal equipment.
4. The method of claim 3, wherein the first terminal device determines that the first position on the target object corresponds to the second position in the virtual coordinate system according to the first position information, and the method comprises:
the first terminal device determines the image position of the first position on the target object in the image coordinate system corresponding to the first image according to the first position information and the image resolution of the first image;
and the first terminal equipment determines a second position of the first position on the target object in the virtual coordinate system according to the image position.
5. The method of claim 4, wherein the first terminal device determines that the first position on the target object corresponds to the second position in the virtual coordinate system according to the image position, and comprises:
the first terminal device acquires view information corresponding to a first image, wherein the view information corresponding to the first image is used for describing a target position of a camera shooting the first image in the virtual coordinate system and a target shooting direction of the camera shooting the first image in the virtual coordinate system when the first image is shot;
the first terminal device calculates the position of a target ray in the virtual coordinate system according to the visual field information and the image position; the target ray is a ray of which the end point is the target position and passes through the second position in the virtual coordinate system;
the first terminal device determines the intersection point position of the target ray and a virtual object in the virtual coordinate system as a second position of the first position on the target object corresponding to the virtual coordinate system; and the position of the virtual object in the virtual coordinate system corresponds to the actual position of the target object in the real three-dimensional space.
6. The method according to claim 1, wherein the first terminal device displays an image shot by the first terminal device, and before the first terminal device sends the first image to the second terminal device, the method further comprises the step that the first terminal device receives a screen capture request of the second terminal device, and the first terminal device captures the current display content to form the first image.
7. A terminal device, characterized in that the terminal device comprises:
the first sending unit is used for sending a first image to a second terminal device, wherein the first image is an image comprising a target object;
a first receiving unit, configured to receive first location information from the second terminal device; the first position information corresponds to a position in the first image corresponding to the operation of the user on the second terminal device aiming at the first position on the target object;
the first determining unit is used for determining a second position, corresponding to the first position on the target object, in the virtual coordinate system according to the first position information; a second position in the virtual coordinate system corresponds to an actual position of the first position on the target object in a real three-dimensional space;
an adding unit for adding a mark at a second position in the virtual coordinate system;
a rendering unit for rendering the marker on a second image such that a corresponding position of the marker on the second image coincides with a position of the first position on the target object in the second image; the second image is an image including the target object displayed on the terminal device;
a display unit for displaying the marker rendered on the second image.
8. The terminal device according to claim 7, wherein the first location information is determined according to a screen location corresponding to an operation, triggered by the user, on the second terminal device for the first location on the target object; the first image is displayed on a screen of the second terminal device.
9. The terminal device according to claim 8, wherein the first position information describes a relative position of a first position on the target object in the first image; and the first position information is determined according to the screen position and the position of the display area of the first image on the screen of the second terminal equipment.
10. The terminal device of claim 9, wherein the first determining unit is specifically configured to:
determining the image position of the first position on the target object in the image coordinate system corresponding to the first image according to the first position information and the image resolution of the first image;
and determining a second position of the first position on the target object corresponding to the virtual coordinate system according to the image position.
11. The terminal device of claim 10, wherein the first determining unit is specifically configured to:
acquiring view information corresponding to a first image, wherein the view information corresponding to the first image is used for describing a target position of a camera shooting the first image in the virtual coordinate system and a target shooting direction of the camera shooting the first image in the virtual coordinate system when the first image is shot;
calculating the position of a target ray in the virtual coordinate system according to the visual field information and the image position; the target ray is a ray of which the end point is the target position and passes through the second position in the virtual coordinate system;
determining the intersection point position of the target ray and a virtual object in the virtual coordinate system as a second position of the first position on the target object corresponding to the virtual coordinate system; and the position of the virtual object in the virtual coordinate system corresponds to the actual position of the target object in the real three-dimensional space.
12. An electronic device, characterized in that the electronic device comprises: a memory and at least one processor;
the memory to store instructions;
the at least one processor, configured to execute the instructions in the memory, to perform the method of any of claims 1-6.
13. A computer-readable storage medium comprising instructions which, when executed on a computer, cause the computer to perform the method of any of claims 1-6 above.
CN202010486797.6A 2019-06-03 2020-06-01 Display method, position determination method and device Active CN112039937B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2019104774837 2019-06-03
CN201910477483.7A CN110381111A (en) 2019-06-03 2019-06-03 A kind of display methods, location determining method and device

Publications (2)

Publication Number Publication Date
CN112039937A CN112039937A (en) 2020-12-04
CN112039937B true CN112039937B (en) 2022-08-09

Family

ID=68249704

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201910477483.7A Pending CN110381111A (en) 2019-06-03 2019-06-03 A kind of display methods, location determining method and device
CN202010486797.6A Active CN112039937B (en) 2019-06-03 2020-06-01 Display method, position determination method and device

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN201910477483.7A Pending CN110381111A (en) 2019-06-03 2019-06-03 A kind of display methods, location determining method and device

Country Status (1)

Country Link
CN (2) CN110381111A (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113008135B (en) * 2019-12-20 2023-09-05 北京移目科技有限公司 Method, apparatus, electronic device and medium for determining a position of a target point in space
CN112584045B (en) * 2020-12-07 2022-07-12 Oppo广东移动通信有限公司 Positioning display method, terminal and computer readable storage medium
CN115686182B (en) * 2021-07-22 2024-02-27 荣耀终端有限公司 Processing method of augmented reality video and electronic equipment
CN115314474B (en) * 2022-10-12 2023-01-17 中通服建设有限公司 Communication emergency equipment assists maintenance system based on AR

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101453662A (en) * 2007-12-03 2009-06-10 华为技术有限公司 Stereo video communication terminal, system and method
CN107656991A (en) * 2017-09-14 2018-02-02 触景无限科技(北京)有限公司 A kind of remote guide method, apparatus and system
CN109697002A (en) * 2017-10-23 2019-04-30 腾讯科技(深圳)有限公司 A kind of method, relevant device and the system of the object editing in virtual reality

Also Published As

Publication number Publication date
CN110381111A (en) 2019-10-25
CN112039937A (en) 2020-12-04

Similar Documents

Publication Publication Date Title
CN112039937B (en) Display method, position determination method and device
CN107820593B (en) Virtual reality interaction method, device and system
WO2013118458A1 (en) Image processing device, and computer program product
US9392248B2 (en) Dynamic POV composite 3D video system
US20170186219A1 (en) Method for 360-degree panoramic display, display module and mobile terminal
US9268410B2 (en) Image processing device, image processing method, and program
CN110784651B (en) Anti-shake method and electronic equipment
JP2022537614A (en) Multi-virtual character control method, device, and computer program
CN107948505B (en) Panoramic shooting method and mobile terminal
KR102197615B1 (en) Method of providing augmented reality service and server for the providing augmented reality service
JP2015114905A (en) Information processor, information processing method, and program
CN112882576B (en) AR interaction method and device, electronic equipment and storage medium
US9848168B2 (en) Method, synthesizing device, and system for implementing video conference
CN114047824A (en) Method for interaction of multiple terminal users in virtual space
CN113888452A (en) Image fusion method, electronic device, storage medium, and computer program product
CN110737414A (en) Interactive display method, device, terminal equipment and storage medium
CN108961424B (en) Virtual information processing method, device and storage medium
EP3840371A1 (en) Image display method, device, and system
CN112995491B (en) Video generation method and device, electronic equipment and computer storage medium
CN110545385A (en) image processing method and terminal equipment
CN114900743A (en) Scene rendering transition method and system based on video plug flow
CN114900742A (en) Scene rotation transition method and system based on video plug flow
CN115278084A (en) Image processing method, image processing device, electronic equipment and storage medium
JP2015100052A (en) Video projection device
JP6031016B2 (en) Video display device and video display program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant