CN117474984B - Augmented reality tag tracking method, device, equipment and storage medium - Google Patents

Augmented reality tag tracking method, device, equipment and storage medium Download PDF

Info

Publication number
CN117474984B
CN117474984B (Application CN202311812529.9A)
Authority
CN
China
Prior art keywords
camera
tag
coordinates
screen coordinates
point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311812529.9A
Other languages
Chinese (zh)
Other versions
CN117474984A (en)
Inventor
何永祺
黄伟峰
李军
刘京京
杨志成
曹雄
郑坚礼
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Kaitong Technology Co ltd
Original Assignee
Kaitong Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kaitong Technology Co ltd filed Critical Kaitong Technology Co ltd
Priority to CN202311812529.9A priority Critical patent/CN117474984B/en
Publication of CN117474984A publication Critical patent/CN117474984A/en
Application granted granted Critical
Publication of CN117474984B publication Critical patent/CN117474984B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/14Digital output to display device ; Cooperation and interconnection of the display device with other functional units
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0499Feedforward networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Human Computer Interaction (AREA)
  • Closed-Circuit Television Systems (AREA)

Abstract

The application discloses an augmented reality tag tracking method, device, equipment and storage medium, wherein the method comprises the following steps: when a tag point is marked in a camera picture, calculating camera coordinates of the tag point according to internal parameters of the camera and screen coordinates of the tag point, and calculating world coordinates of the tag point according to the camera coordinates of the tag point; when the camera rotates, a transformation matrix after the rotation of the camera is obtained, new camera coordinates are calculated through the transformation matrix and the world coordinates of the tag point, and the new camera coordinates of the tag point are mapped into the camera picture based on the internal parameters of the camera, so as to obtain the screen coordinates of the tag point after the camera rotates; when the camera zooms, the current zoom multiple is obtained, and the screen coordinates of the tag point after the camera zooms are calculated according to the current zoom multiple and the screen coordinates of the tag point before the camera zooms. The method and the device enable the tag of an object to move along with the object when the camera rotates and the picture zooms.

Description

Augmented reality tag tracking method, device, equipment and storage medium
Technical Field
The application relates to the technical field of augmented reality, in particular to an augmented reality tag tracking method, an augmented reality tag tracking device, augmented reality tag tracking equipment and a storage medium.
Background
A common camera generally does not have the function of adding augmented reality tags, which is detrimental to the user experience, whereas an existing augmented reality camera has the function of adding augmented reality tags, so that a user can add tags to the video picture. In the field of video security, marking objects in a monitoring picture is an important technology. By marking an object in the monitoring picture, the state of the object can be monitored in real time and corresponding measures can be taken in time. By marking the object, the target object in the monitoring picture can be accurately identified and tracked, the position and state of the object can be quickly located, and real-time monitoring of the state of the object can be provided. By making reasonable use of object marking technology, the efficiency of a security system can be improved, and its capacity to prevent and respond to security events can be enhanced.
However, in the prior art, when the camera rotates or the picture zooms, the position of an object in the video picture changes, but the tag of the object cannot follow this movement.
Disclosure of Invention
The application provides an augmented reality tag tracking method, device, equipment and storage medium, which are used for solving the technical problem in the prior art that, when the camera rotates or the picture zooms, the position of an object in the video picture changes but the tag of the object cannot follow this movement.
In view of this, a first aspect of the present application provides an augmented reality tag tracking method, including:
when a tag point is marked in a camera picture, calculating camera coordinates of the tag point according to internal parameters of the camera and screen coordinates of the tag point, and calculating world coordinates of the tag point according to the camera coordinates of the tag point;
when the camera rotates, a transformation matrix after the rotation of the camera is obtained, new camera coordinates are calculated through the transformation matrix and world coordinates of the tag points, and the new camera coordinates of the tag points are mapped into a camera picture based on internal parameters of the camera, so that screen coordinates of the tag points after the rotation of the camera are obtained;
when the camera zooms, the current zoom multiple is obtained, and the screen coordinates of the tag point after the camera zooms are calculated according to the current zoom multiple and the screen coordinates of the tag point before the camera zooms.
Optionally, the calculating the camera coordinates of the tag point according to the internal parameters of the camera and the screen coordinates of the tag point includes:
acquiring undistorted screen coordinates of the tag point according to the internal parameters and distortion coefficients of the camera and the screen coordinates of the tag point;
and calculating the camera coordinates of the tag points according to the internal parameters of the camera and the undistorted screen coordinates of the tag points.
Optionally, the method further comprises:
marking a plurality of first target points in a target picture of the camera, and calculating screen coordinates of each first target point after the camera rotates to obtain a screen coordinate calculated value of each first target point after the camera rotates;
acquiring actual screen coordinates of each first target point after the camera rotates;
taking a screen coordinate calculated value of the first target point after the rotation of the camera as input, and taking an actual screen coordinate of the first target point after the rotation of the camera as output, to train a convolutional neural network and obtain a coordinate offset correction model;
and physically correcting the screen coordinates of the tag points after the rotation of the camera through the coordinate offset correction model to obtain corrected screen coordinates of the tag points.
Optionally, the calculating the screen coordinates of the tag point after the zooming of the camera according to the current zoom multiple and the screen coordinates of the tag point before the zooming of the camera includes:
and inputting the current scaling multiple, the screen coordinates of the tag point before scaling of the camera and the current angle of the camera into a preset coordinate scaling model for coordinate scaling to obtain the screen coordinates of the tag point after scaling of the camera.
Optionally, the training process of the preset coordinate scaling model includes:
acquiring screen coordinates of a plurality of second target points in a camera picture, actual screen coordinates of each second target point after zooming of the camera, the corresponding zoom multiples and the current angle of the camera;
and taking the scaling multiple, the screen coordinates of each second target point before the scaling of the camera and the current angle of the camera as inputs, and taking the actual screen coordinates of each second target point after the scaling of the camera as outputs to train the convolutional neural network, so as to obtain a preset coordinate scaling model.
Optionally, the calculating the screen coordinates of the tag point after the zooming of the camera according to the current zoom multiple and the screen coordinates of the tag point before the zooming of the camera includes:
acquiring screen coordinates of the tag points after the camera zooms in by 1×, and calculating the coordinate distance between the screen coordinates of the tag points when the camera is not zoomed and the screen coordinates after the camera zooms in by 1×;
and calculating a scaling distance according to the current scaling multiple and the coordinate distance, and calculating the screen coordinates of the tag point after the scaling of the camera according to the scaling distance and the screen coordinates of the tag point before the scaling of the camera.
Optionally, the method further comprises:
calculating screen coordinates of a plurality of third target points in the camera picture after zooming of the camera according to each zoom factor, and obtaining screen coordinate calculated values of the third target points under each zoom factor;
acquiring actual screen coordinates corresponding to each third target point under each scaling multiple;
obtaining coordinate offset under each scaling multiple by fitting a screen coordinate calculated value of each third target point under each scaling multiple with an actual screen coordinate;
and correcting the screen coordinates of the tag points after the scaling of the camera according to the coordinate offset corresponding to the current scaling multiple.
A second aspect of the present application provides an augmented reality tag tracking device, comprising:
the first calculating unit is used for calculating camera coordinates of the tag points according to the internal parameters of the camera and screen coordinates of the tag points when the tag points are marked in the camera picture, and calculating world coordinates of the tag points according to the camera coordinates of the tag points;
the second calculation unit is used for obtaining a transformation matrix after the rotation of the camera when the camera rotates, calculating new camera coordinates through the transformation matrix and world coordinates of the tag points, and mapping the new camera coordinates of the tag points into a camera picture based on internal parameters of the camera to obtain screen coordinates of the tag points after the rotation of the camera;
and the third calculation unit is used for obtaining the current scaling multiple when the camera is scaled, and calculating the screen coordinates of the tag point after the scaling of the camera according to the current scaling multiple and the screen coordinates of the tag point before the scaling of the camera.
A third aspect of the present application provides an electronic device comprising a processor and a memory;
the memory is used for storing program codes and transmitting the program codes to the processor;
the processor is configured to perform the augmented reality tag tracking method according to any one of the first aspects according to instructions in the program code.
A fourth aspect of the present application provides a computer readable storage medium for storing program code which when executed by a processor implements the augmented reality tag tracking method of any one of the first aspects.
From the above technical scheme, the application has the following advantages:
the application provides an augmented reality tag tracking method, which comprises the following steps: when a label point is marked in a camera picture, calculating camera coordinates of the label point according to internal parameters of the camera and screen coordinates of the label point, and calculating world coordinates of the label point according to the camera coordinates of the label point; when the camera rotates, a transformation matrix after the rotation of the camera is obtained, new camera coordinates are calculated through the transformation matrix and world coordinates of the tag points, and the new camera coordinates of the tag points are mapped into a camera picture based on internal parameters of the camera, so that screen coordinates of the tag points after the rotation of the camera are obtained; when the camera zooms, the current zoom multiple is obtained, and the screen coordinates of the tag point after the camera zooms are calculated according to the current zoom multiple and the screen coordinates of the tag point before the camera zooms.
In the method, after the tag point is marked, the camera coordinates are calculated according to the internal parameters of the camera and the screen coordinates of the tag point, and the world coordinates of the tag point are further obtained; when the camera rotates, the world coordinates of the tag point are converted into camera coordinates according to the transformation matrix of the rotated camera to obtain new camera coordinates of the tag point, and the new camera coordinates of the tag point are then remapped into the camera picture according to the internal parameters of the camera to obtain the screen coordinates of the tag point after the camera rotates; when the camera zooms, the screen coordinates of the tag point after the camera zooms are calculated according to the zoom multiple, so that the tag follows the object, thereby solving the technical problem in the prior art that, when the camera rotates or the picture zooms, the position of the object in the video picture changes but the tag of the object cannot follow the movement.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive faculty for a person skilled in the art.
Fig. 1 is a schematic flow chart of an augmented reality tag tracking method according to an embodiment of the present application;
fig. 2 is another flow chart of an augmented reality tag tracking method according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of an augmented reality tag tracking apparatus according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the present application solution better understood by those skilled in the art, the following description will clearly and completely describe the technical solution in the embodiments of the present application with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
For ease of understanding, referring to fig. 1, an embodiment of the present application provides an augmented reality tag tracking method, including:
and 101, when the tag points are marked in the camera picture, calculating camera coordinates of the tag points according to internal parameters of the camera and screen coordinates of the tag points, and calculating world coordinates of the tag points according to the camera coordinates of the tag points.
When a target in the camera picture needs to be marked, a point can be marked on the target in the camera picture and used as a tag point, and when the tag point is marked, the screen coordinates of the tag point can be obtained.
The camera needs to be calibrated to obtain its internal and external parameters, and distortion coefficients are also obtained during calibration in consideration of lens distortion. For example, a checkerboard picture can be held in front of the camera and 3 photographs (front, side, upward) can be taken at each of the four corners and the center of the picture, and 5 more photographs can then be taken at random positions; the internal parameters, external parameters and distortion coefficients of the camera are calculated from the 20 collected photographs. The specific calculation process can refer to the prior art and is not described in detail here.
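As a point of reference, a minimal Python sketch of such a checkerboard calibration with OpenCV is shown below; the 9×6 inner-corner pattern and the image paths are assumptions for illustration, not values fixed by the application.

```python
import glob
import cv2
import numpy as np

pattern = (9, 6)  # inner corners per row/column of the checkerboard (assumption)
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2)

obj_points, img_points = [], []
for path in glob.glob("calib/*.jpg"):  # the ~20 collected photographs (assumed location)
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

# Returns the internal parameter matrix K, the distortion coefficients, and per-view extrinsics.
ret, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
```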
The internal parameters of the camera describe the relation between the camera space of the camera and the screen space, and the conversion from the screen point to the three-dimensional point of the camera space can be realized through the internal parameters, namely:
camera coordinates × internal parameters = screen coordinates
However, in reality, the screen coordinates acquired from the camera picture are coordinates distorted by the lens. In order to acquire more accurate camera coordinates, the screen coordinates on the picture need to be restored to undistorted coordinates first, and then converted through the internal parameters. The undistorted screen coordinates of the tag point can be obtained from the internal parameters and distortion coefficients of the camera and the screen coordinates of the tag point; specifically, they can be solved by calling the undistortPoints function in the image processing library OpenCV, namely:
undistorted screen coordinates = cv.undistortPoints(screen coordinates, internal parameters, distortion coefficients)
After the undistorted screen coordinates of the tag point are obtained, the camera coordinates of the tag point are calculated according to the internal parameters of the camera and the undistorted screen coordinates of the tag point. Specifically, the product of the undistorted screen coordinates of the tag point and the inverse matrix of the internal parameters is calculated to obtain the camera coordinates of the tag point, namely:
camera coordinates = undistorted screen coordinates × inverse matrix of the internal parameters
After the camera coordinates of the tag point are obtained, the world coordinates of the tag point in world space can be calculated according to the external parameters of the camera and the camera coordinates of the tag point.
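The screen-to-camera-to-world chain just described can be sketched as follows in Python with OpenCV, written with column vectors rather than the row-vector products above. K and dist are the calibrated internal parameters and distortion coefficients; R and t are assumed to be the world-to-camera rotation and translation of the external parameters, and the camera coordinate is taken on the viewing ray at unit depth, since a single pixel only fixes a ray.

```python
import cv2
import numpy as np

def tag_point_to_world(uv, K, dist, R, t):
    """uv: screen coordinates of the tag point; K, dist: internal parameters and distortion
    coefficients; R, t: assumed world-to-camera rotation and translation (external parameters)."""
    pts = np.asarray(uv, dtype=np.float32).reshape(-1, 1, 2)
    # Undistort; passing P=K keeps the result in pixel units (undistorted screen coordinates).
    undist = cv2.undistortPoints(pts, K, dist, P=K).reshape(2)
    # camera coordinates = K^-1 * homogeneous undistorted screen coordinates (ray at unit depth)
    cam = np.linalg.inv(K) @ np.array([undist[0], undist[1], 1.0])
    # world coordinates from the external parameters: X_w = R^T (X_c - t)
    world = R.T @ (cam - t.reshape(3))
    return cam, world
```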
Step 102, when the camera rotates, a transformation matrix after the rotation of the camera is obtained, new camera coordinates are calculated through the transformation matrix and world coordinates of the tag points, and the new camera coordinates of the tag points are mapped into a camera picture based on internal parameters of the camera, so that screen coordinates of the tag points after the rotation of the camera are obtained.
When the camera rotates, the position of the target in the camera picture changes and the tag point changes with it. A transformation matrix after the rotation of the camera is calculated according to the angle after the rotation of the camera, and the new camera coordinates of the tag point can be calculated through the transformation matrix and the world coordinates of the tag point, namely:
New camera coordinates = world coordinates × inverse of the transformation matrix after camera rotation
Then, the new camera coordinates of the tag point are mapped into the camera picture by combining the internal parameters of the camera, so as to obtain the screen coordinates of the tag point after the camera rotates. Note that the undistorted screen coordinates of the tag point were obtained through the distortion coefficients in the preceding steps, and the camera coordinates are likewise calculated based on the undistorted screen coordinates.
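A minimal sketch of this re-projection step, assuming R_new and t_new form the world-to-camera transform of the rotated camera (i.e. the inverse of its camera-to-world matrix, matching the formula above) and K is the internal parameter matrix:

```python
import numpy as np

def world_to_screen_after_rotation(world, K, R_new, t_new):
    """world: world coordinates of the tag point; K: internal parameter matrix;
    R_new, t_new: world-to-camera transform of the rotated camera (assumed known)."""
    cam_new = R_new @ np.asarray(world) + t_new.reshape(3)  # new camera coordinates
    uvw = K @ cam_new                                       # pinhole projection with the internal parameters
    return uvw[:2] / uvw[2]                                 # screen coordinates after the camera rotation
```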
Step 103, when the camera zooms, acquiring the current zoom multiple, and calculating the screen coordinates of the tag point after the camera zooms according to the current zoom multiple and the screen coordinates of the tag point before the camera zooms.
In one embodiment, a camera with a maximum zoom of 4× is taken as the research object. By superimposing and comparing pictures, it is found that the picture after physical zooming directly enlarges and completely covers the corresponding area of the picture in the non-zoomed state, and the two show a multiple relation. When the same point is examined in the picture at each zoom level, the displacement of that point between successive zoom levels forms an arithmetic progression during zooming. With this rule, the zoom level and the zoomed coordinates can be related, so that the new position of each screen coordinate after zooming is determined.
Specifically, the screen coordinates of the tag point after the camera zooms in by 1× are obtained, and the coordinate distance between the screen coordinates of the tag point when the camera is not zoomed and its screen coordinates after the camera zooms in by 1× is calculated, wherein the coordinate distance includes a horizontal coordinate distance and a vertical coordinate distance;
the scaling distance is then calculated according to the current zoom multiple and the coordinate distance, and the screen coordinates of the tag point after the camera zooms are calculated according to the scaling distance and the screen coordinates of the tag point before the camera zooms. That is, the horizontal coordinate distance and the vertical coordinate distance are multiplied by the current zoom multiple to obtain the horizontal and vertical scaling distances, and the screen coordinates of the tag point after the camera zooms are then calculated from these scaling distances and the screen coordinates of the tag point before the camera zooms.
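A minimal sketch of this arithmetic-progression rule; the variable names and the additive offset convention are illustrative assumptions.

```python
def screen_after_zoom(u0, v0, u1, v1, zoom):
    """u0, v0: tag screen coordinates before zooming; u1, v1: its coordinates after the camera
    zooms in by 1x; zoom: current zoom multiple."""
    du, dv = u1 - u0, v1 - v0              # per-step coordinate distance (horizontal, vertical)
    return u0 + zoom * du, v0 + zoom * dv  # scaling distance added to the pre-zoom coordinates
```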
In the embodiment of the application, the new screen coordinates of the tag point in different zoom states are calculated through the relation between zoom levels, so that tag following can be realized rapidly.
In another embodiment, in order to further improve the calculation efficiency, a preset coordinate scaling model is obtained through training, and the current scaling multiple, the screen coordinates of the tag point before scaling of the camera and the current angle of the camera are input into the preset coordinate scaling model to perform coordinate scaling, so that the screen coordinates of the tag point after scaling of the camera are obtained.
The training process of the preset coordinate scaling model comprises the following steps:
acquiring screen coordinates of a plurality of second target points in a camera picture, actual screen coordinates of each second target point after zooming of the camera, the corresponding zoom multiples and the current angle of the camera;
and taking the scaling multiple, the screen coordinates of each second target point before the scaling of the camera and the current angle of the camera as inputs, and taking the actual screen coordinates of each second target point after the scaling of the camera as outputs to train the convolutional neural network, so as to obtain a preset coordinate scaling model.
In this embodiment of the present application, the screen coordinates of a plurality of second target points in the camera picture are obtained, where the selected second target points may be corner points. The camera is then zoomed by 1×, 1.5×, 2×, 3×, 4× and so on, and the actual screen coordinates of the second target points after the camera zooms to each multiple, together with the angles of the camera, are obtained. The zoom multiple, the screen coordinates of the second target points before the camera zooms and the current angle of the camera are taken as inputs, and the actual screen coordinates of the second target points after the camera zooms are taken as outputs to train a convolutional neural network. The network may be a shallow multi-layer perceptron, which calculates the weight of each feature so as to learn the weight relationships among the features. Because the camera may have a physical offset and deviations may be introduced when collecting data, a Sigmoid function is selected as the activation function to alleviate and correct the deviations, so that the prediction result of the network is as close to the ideal state as possible. The network outputs a predicted value of the scaled screen coordinates of the second target points; a loss value is calculated from the actual screen coordinates and the predicted screen coordinates, and the network parameters are updated backwards through the loss value until the network converges (for example, the maximum number of iterations is reached, the training error falls below an error threshold, or the training error converges near a certain value), so as to obtain a trained model, which is used as the preset coordinate scaling model.
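A minimal sketch of such a shallow perceptron with Sigmoid activations, written with PyTorch as an assumed framework; the input layout (zoom multiple, pre-zoom screen coordinates, pan and tilt angles) and the layer sizes are assumptions, not values given in the application.

```python
import torch
import torch.nn as nn

# Shallow multi-layer perceptron: 5 assumed inputs
# (zoom multiple, pre-zoom u, pre-zoom v, pan angle, tilt angle) -> 2 outputs (post-zoom u, v).
model = nn.Sequential(
    nn.Linear(5, 32),
    nn.Sigmoid(),      # Sigmoid activation, as chosen above to damp collection deviations
    nn.Linear(32, 32),
    nn.Sigmoid(),
    nn.Linear(32, 2),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

def train(features, targets, max_iters=5000, err_threshold=1e-4):
    """features: N x 5 tensor of inputs; targets: N x 2 tensor of actual post-zoom coordinates."""
    for _ in range(max_iters):
        pred = model(features)
        loss = loss_fn(pred, targets)
        optimizer.zero_grad()
        loss.backward()   # update the network parameters backwards through the loss
        optimizer.step()
        if loss.item() < err_threshold:  # stop once the training error falls below the threshold
            break
    return model
```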
After the preset coordinate scaling model is obtained, the current scaling multiple, the screen coordinates of the tag point before the scaling of the camera and the current angle of the camera are input into the preset coordinate scaling model, and the screen coordinates of the tag point after the scaling of the camera are calculated according to the current scaling multiple, the camera angle and the screen coordinates of the tag point before the scaling of the camera through the preset coordinate scaling model.
In the embodiment of the application, after the tag point is marked, the camera coordinates are calculated according to the internal parameters of the camera and the screen coordinates of the tag point, and the world coordinates of the tag point are further obtained; when the camera rotates, the world coordinates of the tag point are converted into camera coordinates according to the transformation matrix of the rotated camera to obtain new camera coordinates of the tag point, and the new camera coordinates of the tag point are then remapped into the camera picture according to the internal parameters of the camera to obtain the screen coordinates of the tag point after the camera rotates; when the camera zooms, the screen coordinates of the tag point after the camera zooms are calculated according to the zoom multiple, so that the tag follows the object, thereby solving the technical problem in the prior art that, when the camera rotates or the picture zooms, the position of the object in the video picture changes but the tag of the object cannot follow the movement.
The foregoing is one embodiment of an augmented reality tag tracking method provided by the present application, and the following is another embodiment of an augmented reality tag tracking method provided by the present application.
In step 201, when a tag point is marked in a camera picture, camera coordinates of the tag point are calculated according to internal parameters of the camera and screen coordinates of the tag point, and world coordinates of the tag point are calculated according to the camera coordinates of the tag point.
The details of step 201 are the same as those of step 101, and will not be described here again.
Step 202, when the camera rotates, a transformation matrix after the rotation of the camera is obtained, new camera coordinates are calculated through the transformation matrix and world coordinates of the tag points, and the new camera coordinates of the tag points are mapped into a camera picture based on internal parameters of the camera, so that screen coordinates of the tag points after the rotation of the camera are obtained.
The details of step 202 are the same as those of step 102, and will not be described here again.
Step 203, physically correcting the screen coordinates of the tag points after the rotation of the camera to obtain corrected screen coordinates of the tag points.
In the embodiment of the application, there is a certain offset between the new screen coordinates of the tag point calculated after the camera rotates (i.e. the screen coordinates of the tag point after the rotation of the camera) and the actual screen coordinates of the tag point in the camera picture. This is because, when the camera is actually installed, it is difficult to make the virtual origin of the camera coincide with the actual physical rotation center, which causes the camera as a whole to be offset in a certain way during rotation, so that the screen coordinates after rotation are also offset. In order to acquire the screen coordinates of the tag point after the camera rotates more accurately, the embodiment of the application physically corrects the calculated screen coordinates of the tag point after the rotation of the camera.
Specifically, marking a plurality of first target points in a target picture of the camera, and calculating screen coordinates of each first target point after the camera rotates to obtain a screen coordinate calculated value of each first target point after the camera rotates;
acquiring actual screen coordinates of each first target point after the camera rotates;
taking a screen coordinate calculated value of the first target point after the rotation of the camera as input, and taking an actual screen coordinate of the first target point after the rotation of the camera as output, to train a convolutional neural network and obtain a coordinate offset correction model;
and physically correcting the screen coordinates of the tag points after the rotation of the camera through the coordinate offset correction model to obtain corrected screen coordinates of the tag points.
At least 3 first target points are marked in the camera picture and their screen coordinates are acquired; screen coordinates are then collected after the camera rotates by at least two different angles, such as 30 degrees and 180 degrees, giving the actual screen coordinates of each first target point after the camera rotates, while the screen coordinate calculated values of each first target point after the camera rotates are obtained through calculation. The screen coordinate calculated values of the first target points after the camera rotates are taken as input and the actual screen coordinates of the first target points after the camera rotates are taken as output to train a convolutional neural network, and the trained model is used as the coordinate offset correction model. The calculated screen coordinates of the tag point after the camera rotates are input into the coordinate offset correction model, so that the model physically corrects the screen coordinates of the tag point after the rotation of the camera according to the learned offset parameters and obtains the corrected screen coordinates of the tag point, thereby eliminating the physical offset of the rotated screen coordinates.
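A minimal sketch of applying such a coordinate offset correction model, assuming the same kind of shallow network and PyTorch framework as in the zoom-model sketch above and assuming the network has already been trained on the first target points:

```python
import torch
import torch.nn as nn

# Correction network: calculated post-rotation screen coordinates -> corrected coordinates.
correction_model = nn.Sequential(
    nn.Linear(2, 16),
    nn.Sigmoid(),
    nn.Linear(16, 2),
)

def correct_rotated_coords(u_calc, v_calc):
    """Apply the (already trained) coordinate offset correction model to one tag point."""
    with torch.no_grad():
        out = correction_model(torch.tensor([[u_calc, v_calc]], dtype=torch.float32))
    return out[0, 0].item(), out[0, 1].item()
```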
Step 204, when the camera zooms, acquiring the current zoom multiple, and calculating the screen coordinates of the tag point after the camera zooms according to the current zoom multiple and the screen coordinates of the tag point before the camera zooms.
In one embodiment, a camera with a maximum zoom of 4× is taken as the research object. By superimposing and comparing pictures, it is found that the picture after physical zooming directly enlarges and completely covers the corresponding area of the picture in the non-zoomed state, and the two show a multiple relation. When the same point is examined in the picture at each zoom level, the displacement of that point between successive zoom levels forms an arithmetic progression during zooming. With this rule, the zoom level and the zoomed coordinates can be related, so that the new position of each screen coordinate after zooming is determined.
Specifically, the screen coordinates of the tag point after the camera zooms in by 1× are obtained, and the coordinate distance between the screen coordinates of the tag point when the camera is not zoomed and its screen coordinates after the camera zooms in by 1× is calculated, wherein the coordinate distance includes a horizontal coordinate distance and a vertical coordinate distance;
the scaling distance is then calculated according to the current zoom multiple and the coordinate distance, and the screen coordinates of the tag point after the camera zooms are calculated according to the scaling distance and the screen coordinates of the tag point before the camera zooms. That is, the horizontal coordinate distance and the vertical coordinate distance are multiplied by the current zoom multiple to obtain the horizontal and vertical scaling distances, and the screen coordinates of the tag point after the camera zooms are then calculated from these scaling distances and the screen coordinates of the tag point before the camera zooms.
In the embodiment of the application, the new screen coordinates of the tag point in different zoom states are calculated through the relation between zoom levels, so that tag following can be realized rapidly.
Further, in order to obtain more accurate zoomed screen coordinates, the embodiment of the present application also corrects the calculated screen coordinates of the tag point after the camera zooms, specifically:
calculating screen coordinates of a plurality of third target points in the camera picture after zooming of the camera according to each zoom factor, and obtaining screen coordinate calculated values of the third target points under each zoom factor;
acquiring actual screen coordinates corresponding to each third target point under each scaling multiple;
obtaining coordinate offset under each scaling multiple by fitting a screen coordinate calculated value of each third target point under each scaling multiple with an actual screen coordinate;
and correcting the screen coordinates of the tag points after the scaling of the camera according to the coordinate offset corresponding to the current scaling multiple.
In the embodiment of the application, the coordinate offset under each scaling multiple is obtained by fitting the screen coordinate calculated values of each third target point under different scaling multiples to the corresponding actual screen coordinates. After the current scaling multiple is obtained, the coordinate offset under the current scaling multiple can be determined from the relation between each scaling multiple and its coordinate offset, and the calculated screen coordinates of the tag point after the camera zooms are offset-corrected through the coordinate offset under the current scaling multiple, so as to obtain the corrected screen coordinates of the tag point after the camera zooms and eliminate the coordinate offset.
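A minimal sketch of this offset fitting and correction; using the mean difference per zoom multiple and linear interpolation between the collected multiples are assumptions, since the application only states that the offsets are obtained by fitting.

```python
import numpy as np

def fit_offsets(calc_coords, actual_coords):
    """calc_coords, actual_coords: arrays of shape (n_zoom_levels, n_points, 2).
    Returns the mean offset (actual - calculated) per zoom level, shape (n_zoom_levels, 2)."""
    return (np.asarray(actual_coords) - np.asarray(calc_coords)).mean(axis=1)

def correct_zoomed_coords(u_calc, v_calc, zoom, zoom_levels, offsets):
    """Correct the calculated post-zoom coordinates with the offset at the current zoom multiple;
    intermediate multiples are handled by linear interpolation (an assumption)."""
    du = np.interp(zoom, zoom_levels, offsets[:, 0])
    dv = np.interp(zoom, zoom_levels, offsets[:, 1])
    return u_calc + du, v_calc + dv
```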
In another embodiment, in order to further improve the calculation efficiency, a preset coordinate scaling model is obtained through training, and the current scaling multiple, the screen coordinates of the tag point before scaling of the camera and the current angle of the camera are input into the preset coordinate scaling model to perform coordinate scaling, so that the screen coordinates of the tag point after scaling of the camera are obtained.
The training process of the preset coordinate scaling model comprises the following steps:
acquiring screen coordinates of a plurality of second target points in a camera picture, actual screen coordinates of each second target point after zooming of the camera, the corresponding zoom multiples and the current angle of the camera;
and taking the scaling multiple, the screen coordinates of each second target point before the scaling of the camera and the current angle of the camera as inputs, and taking the actual screen coordinates of each second target point after the scaling of the camera as outputs to train the convolutional neural network, so as to obtain a preset coordinate scaling model.
In this embodiment of the present application, the screen coordinates of a plurality of second target points in the camera picture are obtained, where the selected second target points may be corner points. The camera is then zoomed by 1×, 1.5×, 2×, 3×, 4× and so on, and the actual screen coordinates of the second target points after the camera zooms to each multiple, together with the angles of the camera, are obtained. The zoom multiple, the screen coordinates of the second target points before the camera zooms and the current angle of the camera are taken as inputs, and the actual screen coordinates of the second target points after the camera zooms are taken as outputs to train a convolutional neural network. The network may be a shallow multi-layer perceptron, which calculates the weight of each feature so as to learn the weight relationships among the features. Because the camera may have a physical offset and deviations may be introduced when collecting data, a Sigmoid function is selected as the activation function to alleviate and correct the deviations, so that the prediction result of the network is as close to the ideal state as possible. The network outputs a predicted value of the scaled screen coordinates of the second target points; a loss value is calculated from the actual screen coordinates and the predicted screen coordinates, and the network parameters are updated backwards through the loss value until the network converges (for example, the maximum number of iterations is reached, the training error falls below an error threshold, or the training error converges near a certain value), so as to obtain a trained model, which is used as the preset coordinate scaling model.
After the preset coordinate scaling model is obtained, the current scaling multiple, the screen coordinates of the tag point before the scaling of the camera and the current angle of the camera are input into the preset coordinate scaling model, and the screen coordinates of the tag point after the scaling of the camera are calculated according to the current scaling multiple, the camera angle and the screen coordinates of the tag point before the scaling of the camera through the preset coordinate scaling model.
In the embodiment of the application, after the tag point is marked, the camera coordinates are calculated according to the internal parameters of the camera and the screen coordinates of the tag point, and the world coordinates of the tag point are further obtained; when the camera rotates, the world coordinates of the tag point are converted into camera coordinates according to the transformation matrix of the rotated camera to obtain new camera coordinates of the tag point, and the new camera coordinates of the tag point are then remapped into the camera picture according to the internal parameters of the camera to obtain the screen coordinates of the tag point after the camera rotates; when the camera zooms, the screen coordinates of the tag point after the camera zooms are calculated according to the zoom multiple, so that the tag follows the object, thereby solving the technical problem in the prior art that, when the camera rotates or the picture zooms, the position of the object in the video picture changes but the tag of the object cannot follow the movement;
Further, the embodiment of the application trains a coordinate offset correction model and uses it to physically correct the calculated screen coordinates of the tag point after the rotation of the camera, so as to eliminate the physical deviation of the screen coordinates; and it corrects the calculated screen coordinates of the tag point after the scaling of the camera by fitting the coordinate offset under each scaling multiple, so that the tag can follow the object more accurately after the camera rotates and zooms.
Referring to fig. 3, an embodiment of the present application further provides an augmented reality tag tracking apparatus, including:
the first calculating unit is used for calculating camera coordinates of the tag points according to the internal parameters of the camera and screen coordinates of the tag points when the tag points are marked in the camera picture, and calculating world coordinates of the tag points according to the camera coordinates of the tag points;
the second calculation unit is used for obtaining a transformation matrix after the rotation of the camera when the camera rotates, calculating new camera coordinates through the transformation matrix and world coordinates of the tag points, and mapping the new camera coordinates of the tag points into a camera picture based on internal parameters of the camera to obtain screen coordinates of the tag points after the rotation of the camera;
And the third calculation unit is used for obtaining the current scaling multiple when the camera is scaled, and calculating the screen coordinates of the tag point after the scaling of the camera according to the current scaling multiple and the screen coordinates of the tag point before the scaling of the camera.
As a further refinement, the first computing unit is specifically configured to:
when a tag point is marked in a camera picture, acquiring undistorted screen coordinates of the tag point according to the internal parameters and distortion coefficients of the camera and the screen coordinates of the tag point;
calculating camera coordinates of the tag point according to the internal parameters of the camera and the undistorted screen coordinates of the tag point;
world coordinates of the tag points are calculated from camera coordinates of the tag points.
As a further improvement, the device further comprises: a physical correction unit configured to:
marking a plurality of first target points in a target picture of the camera, and calculating screen coordinates of each first target point after the camera rotates to obtain a screen coordinate calculated value of each first target point after the camera rotates;
acquiring actual screen coordinates of each first target point after the camera rotates;
taking a screen coordinate calculated value of the first target point after the rotation of the camera as input, and taking an actual screen coordinate of the first target point after the rotation of the camera as output, to train a convolutional neural network and obtain a coordinate offset correction model;
And physically correcting the screen coordinates of the tag points after the rotation of the camera through the coordinate offset correction model to obtain corrected screen coordinates of the tag points.
As a further improvement, a third calculation unit is specifically configured to:
when the camera zooms, obtaining the current zoom multiple;
and inputting the current scaling multiple, the screen coordinates of the tag point before scaling of the camera and the current angle of the camera into a preset coordinate scaling model for coordinate scaling to obtain the screen coordinates of the tag point after scaling of the camera.
As a further improvement, the training process of the preset coordinate scaling model includes:
acquiring screen coordinates of a plurality of second target points in a camera picture, actual screen coordinates of each second target point after zooming of the camera, the corresponding zoom multiples and the current angle of the camera;
and taking the scaling multiple, the screen coordinates of each second target point before the scaling of the camera and the current angle of the camera as inputs, and taking the actual screen coordinates of each second target point after the scaling of the camera as outputs to train the convolutional neural network, so as to obtain a preset coordinate scaling model.
As a further improvement, a third calculation unit is specifically configured to:
When the camera zooms, obtaining the current zoom multiple;
acquiring screen coordinates of the tag points after the camera zooms in by 1×, and calculating the coordinate distance between the screen coordinates of the tag points when the camera is not zoomed and the screen coordinates after the camera zooms in by 1×;
and calculating the scaling distance according to the current scaling multiple and the coordinate distance, and calculating the screen coordinates of the tag point after the scaling of the camera according to the scaling distance and the screen coordinates of the tag point before the scaling of the camera.
As a further improvement, the device further comprises: a coordinate shift correction unit configured to:
calculating screen coordinates of a plurality of third target points in the camera picture after zooming of the camera according to each zoom factor, and obtaining screen coordinate calculated values of the third target points under each zoom factor;
acquiring actual screen coordinates corresponding to each third target point under each scaling multiple;
obtaining coordinate offset under each scaling multiple by fitting a screen coordinate calculated value of each third target point under each scaling multiple with an actual screen coordinate;
and correcting the screen coordinates of the tag points after the scaling of the camera according to the coordinate offset corresponding to the current scaling multiple.
In the embodiment of the application, after the tag point is marked, the camera coordinates are calculated according to the internal parameters of the camera and the screen coordinates of the tag point, and the world coordinates of the tag point are further obtained; when the camera rotates, the world coordinates of the tag point are converted into camera coordinates according to the transformation matrix of the rotated camera to obtain new camera coordinates of the tag point, and the new camera coordinates of the tag point are then remapped into the camera picture according to the internal parameters of the camera to obtain the screen coordinates of the tag point after the camera rotates; when the camera zooms, the screen coordinates of the tag point after the camera zooms are calculated according to the zoom multiple, so that the tag follows the object, thereby solving the technical problem in the prior art that, when the camera rotates or the picture zooms, the position of the object in the video picture changes but the tag of the object cannot follow the movement;
Further, the embodiment of the application trains a coordinate offset correction model and uses it to physically correct the calculated screen coordinates of the tag point after the rotation of the camera, so as to eliminate the physical deviation of the screen coordinates; and it corrects the calculated screen coordinates of the tag point after the scaling of the camera by fitting the coordinate offset under each scaling multiple, so that the tag can follow the object more accurately after the camera rotates and zooms.
Referring to fig. 4, an embodiment of the present application further provides an electronic device, where the device includes a processor and a memory;
the memory is used for storing the program codes and transmitting the program codes to the processor;
the processor is configured to perform the augmented reality tag tracking method of the foregoing method embodiments according to instructions in the program code.
The embodiment of the application also provides a computer readable storage medium, and the computer readable storage medium is used for storing program codes, and when the program codes are executed by a processor, the augmented reality label tracking method in the embodiment of the method is realized.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the apparatus and units described above may refer to corresponding procedures in the foregoing method embodiments, which are not described herein again.
The terms "first," "second," "third," "fourth," and the like in the description of the present application and in the above-described figures, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that embodiments of the present application described herein may be capable of operation in sequences other than those illustrated or described herein, for example. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be understood that in this application, "at least one" means one or more, and "a plurality" means two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may mean: only A exists, only B exists, or both A and B exist, where A and B may be singular or plural. The character "/" generally indicates that the associated objects before and after it are in an "or" relationship. "At least one of the following items" or similar expressions mean any combination of these items, including any combination of a single item or a plurality of items. For example, at least one of a, b or c may mean: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b and c may be singular or plural.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application may be embodied in essence or a part contributing to the prior art or all or part of the technical solution in the form of a software product stored in a storage medium, including several instructions to execute all or part of the steps of the methods described in the embodiments of the present application by a computer device (which may be a personal computer, a server, or a network device, etc.). And the aforementioned storage medium includes: u disk, mobile hard disk, read-Only Memory (ROM), random access Memory (Random Access Memory, RAM), magnetic disk or optical disk, etc.
The above embodiments are merely for illustrating the technical solution of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the corresponding technical solutions.

Claims (9)

1. An augmented reality tag tracking method, comprising:
when a tag point is marked in a camera picture, calculating camera coordinates of the tag point according to internal parameters of the camera and screen coordinates of the tag point, and calculating world coordinates of the tag point according to the camera coordinates of the tag point;
when the camera rotates, obtaining a transformation matrix after the rotation of the camera, calculating new camera coordinates through the transformation matrix and the world coordinates of the tag point, and mapping the new camera coordinates of the tag point into the camera picture based on the internal parameters of the camera, to obtain screen coordinates of the tag point after the rotation of the camera;
when the camera zooms, obtaining the current zoom multiple, and calculating the screen coordinates of the tag point after the camera zooms according to the current zoom multiple and the screen coordinates of the tag point before the camera zooms;
the method further comprises the steps of:
marking a plurality of first target points in a target picture of the camera, and calculating screen coordinates of each first target point after the camera rotates, to obtain a screen coordinate calculated value of each first target point after the rotation of the camera;
acquiring actual screen coordinates of each first target point after the camera rotates;
taking the screen coordinate calculated value of each first target point after the rotation of the camera as input and the actual screen coordinates of each first target point after the rotation of the camera as output, training a convolutional neural network to obtain a coordinate offset correction model;
and physically correcting the screen coordinates of the tag point after the rotation of the camera through the coordinate offset correction model, to obtain corrected screen coordinates of the tag point.
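The projection geometry recited in claim 1 can be illustrated with a short numerical sketch. A pinhole camera without distortion is assumed; the intrinsic matrix, the assumed depth of the tag point and the rotation angle below are illustrative placeholders, not values taken from this application.

```python
import numpy as np

# Illustrative pinhole intrinsics (fx, fy, cx, cy) -- placeholder values.
K = np.array([[1000.0,    0.0, 960.0],
              [   0.0, 1000.0, 540.0],
              [   0.0,    0.0,   1.0]])
K_inv = np.linalg.inv(K)

def screen_to_camera(u, v, depth):
    """Back-project a screen point to camera coordinates at an assumed depth."""
    return depth * (K_inv @ np.array([u, v, 1.0]))

def camera_to_world(p_cam, R_cam_to_world):
    """Rotate camera coordinates into the fixed world frame of the pan/tilt head."""
    return R_cam_to_world @ p_cam

def world_to_screen(p_world, R_world_to_cam):
    """Map world coordinates through the new camera pose back to screen coordinates."""
    p_cam = R_world_to_cam @ p_world
    uvw = K @ p_cam
    return uvw[:2] / uvw[2]

# Tag point marked at screen position (1200, 400), assumed depth 10 units.
p_cam0 = screen_to_camera(1200.0, 400.0, 10.0)
R0 = np.eye(3)                      # camera pose when the tag was marked
p_world = camera_to_world(p_cam0, R0)

# Camera pans 5 degrees about the vertical axis; R1 is the new camera-to-world rotation.
a = np.radians(5.0)
R1 = np.array([[ np.cos(a), 0.0, np.sin(a)],
               [ 0.0,       1.0, 0.0      ],
               [-np.sin(a), 0.0, np.cos(a)]])
u_new, v_new = world_to_screen(p_world, R1.T)   # R1.T is the transformation after rotation
print(u_new, v_new)
```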
2. The augmented reality tag tracking method according to claim 1, wherein the calculating the camera coordinates of the tag point from the internal parameters of the camera and the screen coordinates of the tag point comprises:
acquiring undistorted screen coordinates of the tag point according to the internal parameters and distortion coefficients of the camera and the screen coordinates of the tag point;
and calculating the camera coordinates of the tag point according to the internal parameters of the camera and the undistorted screen coordinates of the tag point.
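The undistortion step of claim 2 is commonly realized with OpenCV's undistortPoints; the sketch below assumes that library, and the intrinsic matrix and distortion coefficients are placeholder values rather than calibration results from this application.

```python
import numpy as np
import cv2

# Illustrative intrinsics and radial/tangential distortion coefficients (k1, k2, p1, p2, k3).
K = np.array([[1000.0,    0.0, 960.0],
              [   0.0, 1000.0, 540.0],
              [   0.0,    0.0,   1.0]])
dist = np.array([-0.12, 0.03, 0.001, 0.0005, 0.0])

def undistort_screen_point(u, v):
    """Return the undistorted screen coordinates of a tag point.

    Passing P=K keeps the output in pixel units instead of normalized coordinates.
    """
    pts = np.array([[[u, v]]], dtype=np.float64)
    out = cv2.undistortPoints(pts, K, dist, P=K)
    return out[0, 0]

u_ud, v_ud = undistort_screen_point(1200.0, 400.0)
# The undistorted point can then be back-projected with K as in the claim-1 sketch above.
```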
3. The augmented reality tag tracking method according to claim 1, wherein calculating the screen coordinates of the tag point after the camera zoom according to the current zoom multiple and the screen coordinates of the tag point before the camera zoom comprises:
and inputting the current scaling multiple, the screen coordinates of the tag point before scaling of the camera and the current angle of the camera into a preset coordinate scaling model for coordinate scaling to obtain the screen coordinates of the tag point after scaling of the camera.
4. The augmented reality tag tracking method of claim 3, wherein the training process of the preset coordinate scaling model comprises:
acquiring screen coordinates of a plurality of second target points in the camera picture, actual screen coordinates of each second target point after the zooming of the camera, the scaling multiple, and the current angle of the camera;
and taking the scaling multiple, the screen coordinates of each second target point before the scaling of the camera and the current angle of the camera as inputs, and taking the actual screen coordinates of each second target point after the scaling of the camera as outputs to train the convolutional neural network, so as to obtain a preset coordinate scaling model.
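Claims 3 and 4 describe training a convolutional neural network that maps the scaling multiple, the pre-zoom screen coordinates and the current camera angle to the post-zoom screen coordinates. Because these inputs form a short feature vector rather than an image, the sketch below substitutes a small fully connected PyTorch network as a stand-in for the claimed convolutional architecture, and the training tensors are random placeholders for the measured second-target-point data.

```python
import torch
import torch.nn as nn

# Placeholder training data: [scaling multiple, u, v, pan angle, tilt angle] -> [u', v'].
# In practice these come from second target points measured before and after zooming.
X = torch.rand(256, 5)
Y = torch.rand(256, 2)

model = nn.Sequential(
    nn.Linear(5, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 2),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for epoch in range(200):
    optimizer.zero_grad()
    loss = loss_fn(model(X), Y)
    loss.backward()
    optimizer.step()

# Inference: feed the current scaling multiple, pre-zoom tag coordinates and camera angle,
# and read back the predicted post-zoom screen coordinates.
pred = model(torch.tensor([[2.0, 1200.0, 400.0, 15.0, -5.0]]))
```

An analogous training loop, fed with the calculated and actual post-rotation coordinates of the first target points, would yield the coordinate offset correction model recited in claim 1.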
5. The augmented reality tag tracking method according to claim 3, wherein calculating the screen coordinates of the tag point after the camera zoom according to the current zoom multiple and the screen coordinates of the tag point before the camera zoom comprises:
acquiring screen coordinates of the tag point after the camera zooms by 1 time, and calculating the coordinate distance between the screen coordinates of the tag point when the camera is not zoomed and the screen coordinates after the camera zooms by 1 time;
and calculating a scaling distance according to the current scaling multiple and the coordinate distance, and calculating the screen coordinates of the tag point after the scaling of the camera according to the scaling distance and the screen coordinates of the tag point before the scaling of the camera.
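One reading of claim 5 is a linear extrapolation of the displacement observed over one unit of zoom; the helper below encodes that assumption, and the coordinate values are illustrative.

```python
def zoomed_screen_coords(p0, p1x, zoom):
    """Extrapolate a tag point's screen position at an arbitrary zoom multiple.

    p0   -- screen coordinates with no zoom applied
    p1x  -- screen coordinates after the camera zooms by 1 time
    zoom -- current zoom multiple
    Assumes the displacement grows linearly with the zoom multiple.
    """
    dx, dy = p1x[0] - p0[0], p1x[1] - p0[1]     # coordinate distance per 1x of zoom
    return (p0[0] + zoom * dx, p0[1] + zoom * dy)

# Example: the tag drifts 40 px right and 22 px down per 1x of zoom.
print(zoomed_screen_coords((1200.0, 400.0), (1240.0, 422.0), 3.0))
```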
6. The augmented reality tag tracking method of claim 5, further comprising:
calculating screen coordinates of a plurality of third target points in the camera picture after the camera zooms by each scaling multiple, to obtain a screen coordinate calculated value of each third target point under each scaling multiple;
acquiring actual screen coordinates corresponding to each third target point under each scaling multiple;
obtaining coordinate offset under each scaling multiple by fitting a screen coordinate calculated value of each third target point under each scaling multiple with an actual screen coordinate;
and correcting the screen coordinates of the tag point after the scaling of the camera according to the coordinate offset corresponding to the current scaling multiple.
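Claim 6 fits a per-scaling-multiple coordinate offset from calculated and actual positions of the third target points. A minimal sketch, assuming a constant offset per scaling multiple estimated as the mean residual (which is its least-squares value); the measurement arrays are placeholders.

```python
import numpy as np

def fit_offsets(calculated, actual):
    """Fit one (dx, dy) offset per scaling multiple by averaging the residuals.

    calculated / actual -- dicts mapping scaling multiple -> (N, 2) arrays of
    screen coordinates of the third target points.
    """
    return {z: np.mean(actual[z] - calculated[z], axis=0) for z in calculated}

def correct(point, zoom, offsets):
    """Apply the offset fitted for the current scaling multiple to a tag point."""
    return np.asarray(point) + offsets[zoom]

# Placeholder measurements for scaling multiples 2x and 4x.
calc = {2.0: np.array([[100.0, 80.0], [400.0, 300.0]]),
        4.0: np.array([[150.0, 120.0], [600.0, 450.0]])}
act  = {2.0: np.array([[103.0, 82.0], [402.0, 303.0]]),
        4.0: np.array([[156.0, 125.0], [605.0, 455.0]])}

offsets = fit_offsets(calc, act)
print(correct((1200.0, 400.0), 2.0, offsets))
```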
7. An augmented reality tag tracking device, comprising:
the first calculating unit is used for calculating camera coordinates of the tag points according to the internal parameters of the camera and screen coordinates of the tag points when the tag points are marked in the camera picture, and calculating world coordinates of the tag points according to the camera coordinates of the tag points;
the second calculation unit is used for obtaining a transformation matrix after the rotation of the camera when the camera rotates, calculating new camera coordinates through the transformation matrix and world coordinates of the tag points, and mapping the new camera coordinates of the tag points into a camera picture based on internal parameters of the camera to obtain screen coordinates of the tag points after the rotation of the camera;
the third calculation unit is used for obtaining the current scaling multiple when the camera is scaled, and calculating the screen coordinates of the tag point after the scaling of the camera according to the current scaling multiple and the screen coordinates of the tag point before the scaling of the camera;
the apparatus further comprises: a physical correction unit configured to:
marking a plurality of first target points in a target picture of the camera, and calculating screen coordinates of each first target point after the camera rotates, to obtain a screen coordinate calculated value of each first target point after the rotation of the camera;
acquiring actual screen coordinates of each first target point after the camera rotates;
taking the screen coordinate calculated value of each first target point after the rotation of the camera as input and the actual screen coordinates of each first target point after the rotation of the camera as output, training a convolutional neural network to obtain a coordinate offset correction model;
and physically correcting the screen coordinates of the tag points after the rotation of the camera through the coordinate offset correction model to obtain corrected screen coordinates of the tag points.
8. An electronic device comprising a processor and a memory;
the memory is used for storing program codes and transmitting the program codes to the processor;
the processor is configured to perform the augmented reality tag tracking method of any one of claims 1 to 6 according to instructions in the program code.
9. A computer readable storage medium for storing program code which when executed by a processor implements the augmented reality tag tracking method of any one of claims 1 to 6.
CN202311812529.9A 2023-12-27 2023-12-27 Augmented reality tag tracking method, device, equipment and storage medium Active CN117474984B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311812529.9A CN117474984B (en) 2023-12-27 2023-12-27 Augmented reality tag tracking method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311812529.9A CN117474984B (en) 2023-12-27 2023-12-27 Augmented reality tag tracking method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN117474984A CN117474984A (en) 2024-01-30
CN117474984B true CN117474984B (en) 2024-04-05

Family

ID=89638239

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311812529.9A Active CN117474984B (en) 2023-12-27 2023-12-27 Augmented reality tag tracking method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117474984B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109034104A (en) * 2018-08-15 2018-12-18 罗普特(厦门)科技集团有限公司 A kind of scene tag localization method and device
CN111881763A (en) * 2020-06-30 2020-11-03 北京小米移动软件有限公司 Method and device for determining user gaze position, storage medium and electronic equipment
CN114706484A (en) * 2022-04-18 2022-07-05 Oppo广东移动通信有限公司 Sight line coordinate determination method and device, computer readable medium and electronic equipment
CN115909162A (en) * 2022-11-28 2023-04-04 重庆紫光华山智安科技有限公司 Video label transfer method, device, storage medium and label transfer equipment
CN116740716A (en) * 2023-06-14 2023-09-12 四川云从天府人工智能科技有限公司 Video labeling method, video labeling device, electronic equipment and medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100245588A1 (en) * 2009-03-31 2010-09-30 Acuity Systems Inc. Tag tracking system

Also Published As

Publication number Publication date
CN117474984A (en) 2024-01-30

Similar Documents

Publication Publication Date Title
CN108898630B (en) Three-dimensional reconstruction method, device, equipment and storage medium
CN109448090B (en) Image processing method, device, electronic equipment and storage medium
US11181624B2 (en) Method and apparatus for calibration between laser radar and camera, device and storage medium
CN108447097B (en) Depth camera calibration method and device, electronic equipment and storage medium
EP4307233A1 (en) Data processing method and apparatus, and electronic device and computer-readable storage medium
CN110147744B (en) Face image quality assessment method, device and terminal
KR101791590B1 (en) Object pose recognition apparatus and method using the same
US11940774B2 (en) Action imitation method and robot and computer readable storage medium using the same
CN113449570A (en) Image processing method and device
CN112183506A (en) Human body posture generation method and system
CN102289803A (en) Image Processing Apparatus, Image Processing Method, and Program
CN108765317A (en) A kind of combined optimization method that space-time consistency is stablized with eigencenter EMD adaptive videos
JPWO2018139461A1 (en) Moving object detection apparatus, moving object detection method, and storage medium
CN112200157A (en) Human body 3D posture recognition method and system for reducing image background interference
CN112766027A (en) Image processing method, device, equipment and storage medium
CN115546365A (en) Virtual human driving method and system
EP3185212B1 (en) Dynamic particle filter parameterization
CN113112542A (en) Visual positioning method and device, electronic equipment and storage medium
CN108305321A (en) A kind of three-dimensional human hand 3D skeleton patterns real-time reconstruction method and apparatus based on binocular color imaging system
CN112053383A (en) Method and device for real-time positioning of robot
CN117523659A (en) Skeleton-based multi-feature multi-stream real-time action recognition method, device and medium
CN111681302A (en) Method and device for generating 3D virtual image, electronic equipment and storage medium
CN114859938A (en) Robot, dynamic obstacle state estimation method and device and computer equipment
CN117372604B (en) 3D face model generation method, device, equipment and readable storage medium
CN117474984B (en) Augmented reality tag tracking method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right
Denomination of invention: An augmented reality tag tracking method, device, equipment, and storage medium
Granted publication date: 20240405
Pledgee: Zhuhai China Resources Bank Co.,Ltd. Guangzhou Branch
Pledgor: Kaitong Technology Co.,Ltd.
Registration number: Y2024980031582