CN117474984B - Augmented reality tag tracking method, device, equipment and storage medium - Google Patents
- Publication number
- CN117474984B CN117474984B CN202311812529.9A CN202311812529A CN117474984B CN 117474984 B CN117474984 B CN 117474984B CN 202311812529 A CN202311812529 A CN 202311812529A CN 117474984 B CN117474984 B CN 117474984B
- Authority
- CN
- China
- Prior art keywords
- camera
- tag
- coordinates
- screen coordinates
- point
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/14—Digital output to display device ; Cooperation and interconnection of the display device with other functional units
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0499—Feedforward networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The application discloses an augmented reality tag tracking method, device, equipment and storage medium. The method comprises the following steps: when a tag point is marked in the camera picture, camera coordinates of the tag point are calculated according to the internal parameters of the camera and the screen coordinates of the tag point, and world coordinates of the tag point are calculated according to the camera coordinates of the tag point; when the camera rotates, a transformation matrix after the camera rotation is obtained, new camera coordinates are calculated from the transformation matrix and the world coordinates of the tag point, and the new camera coordinates of the tag point are mapped into the camera picture based on the internal parameters of the camera to obtain the screen coordinates of the tag point after the camera rotation; when the camera zooms, the current zoom multiple is obtained, and the screen coordinates of the tag point after the camera zoom are calculated according to the current zoom multiple and the screen coordinates of the tag point before the camera zoom. The method and the device enable an object's tag to move with the object when the camera rotates and the picture zooms.
Description
Technical Field
The application relates to the technical field of augmented reality, in particular to an augmented reality tag tracking method, an augmented reality tag tracking device, augmented reality tag tracking equipment and a storage medium.
Background
Ordinary cameras generally do not support adding augmented reality tags, which is detrimental to user experience, whereas existing augmented reality cameras allow a user to add tags to the video picture. In the field of video security, marking objects in the monitoring picture is an important technology. By marking an object in the monitoring picture, its state can be monitored in real time and corresponding measures can be taken promptly. Object marking makes it possible to accurately identify and track a target object in the monitoring picture, to quickly locate its position and state, and to monitor its real-time state. By making reasonable use of object marking technology, the efficiency of a security system can be improved and its ability to prevent and respond to security events can be enhanced.
However, in the prior art, when the camera rotates or the picture zooms, the position of an object in the video picture changes, but the object's tag cannot move with it.
Disclosure of Invention
The application provides an augmented reality tag tracking method, device, equipment and storage medium, which are used to solve the technical problem in the prior art that, when the camera rotates or the picture zooms, the position of an object in the video picture changes but the object's tag cannot move with it.
In view of this, a first aspect of the present application provides an augmented reality tag tracking method, including:
when a label point is marked in a camera picture, calculating camera coordinates of the label point according to internal parameters of the camera and screen coordinates of the label point, and calculating world coordinates of the label point according to the camera coordinates of the label point;
when the camera rotates, a transformation matrix after the rotation of the camera is obtained, new camera coordinates are calculated through the transformation matrix and world coordinates of the tag points, and the new camera coordinates of the tag points are mapped into a camera picture based on internal parameters of the camera, so that screen coordinates of the tag points after the rotation of the camera are obtained;
when the camera zooms, the current zoom multiple is obtained, and the screen coordinates of the tag point after the camera zooms are calculated according to the current zoom multiple and the screen coordinates of the tag point before the camera zooms.
Optionally, the calculating the camera coordinates of the tag point according to the internal parameters of the camera and the screen coordinates of the tag point includes:
acquiring undistorted screen coordinates of the tag point according to the internal parameters and distortion coefficients of the camera and the screen coordinates of the tag point;
and calculating the camera coordinates of the tag point according to the internal parameters of the camera and the undistorted screen coordinates of the tag point.
Optionally, the method further comprises:
marking a plurality of first target points in a target picture of the camera, and calculating screen coordinates of each first target point after the camera rotates to obtain a screen coordinate calculated value of each first target point after the camera rotates;
acquiring actual screen coordinates of each first target point after the camera rotates;
taking the screen coordinate calculated value of the first target point after the rotation of the camera as input, and taking the actual screen coordinates of the first target point after the rotation of the camera as output, to train a convolutional neural network and obtain a coordinate offset correction model;
and physically correcting the screen coordinates of the tag points after the rotation of the camera through the coordinate offset correction model to obtain corrected screen coordinates of the tag points.
Optionally, the calculating the screen coordinates of the tag point after the zooming of the camera according to the current zoom multiple and the screen coordinates of the tag point before the zooming of the camera includes:
and inputting the current scaling multiple, the screen coordinates of the tag point before scaling of the camera and the current angle of the camera into a preset coordinate scaling model for coordinate scaling to obtain the screen coordinates of the tag point after scaling of the camera.
Optionally, the training process of the preset coordinate scaling model includes:
Acquiring screen coordinates of a plurality of second target points in a camera picture, actual screen coordinates of each second target point after zooming of the camera, zooming times and the current angle of the camera;
and taking the scaling multiple, the screen coordinates of each second target point before the scaling of the camera and the current angle of the camera as inputs, and taking the actual screen coordinates of each second target point after the scaling of the camera as outputs to train the convolutional neural network, so as to obtain a preset coordinate scaling model.
Optionally, the calculating the screen coordinates of the tag point after the zooming of the camera according to the current zoom multiple and the screen coordinates of the tag point before the zooming of the camera includes:
acquiring screen coordinates of the tag point after the camera zooms by 1 time, and calculating the coordinate distance between the screen coordinates of the tag point when the camera is not zoomed and the screen coordinates after the camera zooms by 1 time;
and calculating a scaling distance according to the current scaling multiple and the coordinate distance, and calculating the screen coordinates of the tag point after the scaling of the camera according to the scaling distance and the screen coordinates of the tag point before the scaling of the camera.
Optionally, the method further comprises:
calculating screen coordinates of a plurality of third target points in the camera picture after the camera zooms according to each zoom factor, and obtaining screen coordinate calculated values of the third target points under each zoom factor;
Acquiring actual screen coordinates corresponding to each third target point under each scaling multiple;
obtaining coordinate offset under each scaling multiple by fitting a screen coordinate calculated value of each third target point under each scaling multiple with an actual screen coordinate;
and correcting the screen coordinates of the tag point after the scaling of the camera according to the coordinate offset corresponding to the current scaling multiple.
A second aspect of the present application provides an augmented reality tag tracking device, comprising:
the first calculating unit is used for calculating camera coordinates of the tag points according to the internal parameters of the camera and screen coordinates of the tag points when the tag points are marked in the camera picture, and calculating world coordinates of the tag points according to the camera coordinates of the tag points;
the second calculation unit is used for obtaining a transformation matrix after the rotation of the camera when the camera rotates, calculating new camera coordinates through the transformation matrix and world coordinates of the tag points, and mapping the new camera coordinates of the tag points into a camera picture based on internal parameters of the camera to obtain screen coordinates of the tag points after the rotation of the camera;
and the third calculation unit is used for obtaining the current scaling multiple when the camera is scaled, and calculating the screen coordinates of the tag point after the scaling of the camera according to the current scaling multiple and the screen coordinates of the tag point before the scaling of the camera.
A third aspect of the present application provides an electronic device comprising a processor and a memory;
the memory is used for storing program codes and transmitting the program codes to the processor;
the processor is configured to perform the augmented reality tag tracking method according to any one of the first aspects according to instructions in the program code.
A fourth aspect of the present application provides a computer readable storage medium for storing program code which when executed by a processor implements the augmented reality tag tracking method of any one of the first aspects.
From the above technical scheme, the application has the following advantages:
the application provides an augmented reality tag tracking method, which comprises the following steps: when a label point is marked in a camera picture, calculating camera coordinates of the label point according to internal parameters of the camera and screen coordinates of the label point, and calculating world coordinates of the label point according to the camera coordinates of the label point; when the camera rotates, a transformation matrix after the rotation of the camera is obtained, new camera coordinates are calculated through the transformation matrix and world coordinates of the tag points, and the new camera coordinates of the tag points are mapped into a camera picture based on internal parameters of the camera, so that screen coordinates of the tag points after the rotation of the camera are obtained; when the camera zooms, the current zoom multiple is obtained, and the screen coordinates of the tag point after the camera zooms are calculated according to the current zoom multiple and the screen coordinates of the tag point before the camera zooms.
In the method, after the tag point is marked, its camera coordinates are calculated from the internal parameters of the camera and the screen coordinates of the tag point, and its world coordinates are then obtained. When the camera rotates, the world coordinates of the tag point are converted into camera coordinates according to the transformation matrix after the camera rotation, giving new camera coordinates of the tag point, and the new camera coordinates are remapped into the camera picture according to the internal parameters of the camera to obtain the screen coordinates of the tag point after the rotation. When the camera zooms, the screen coordinates of the tag point after the zoom are calculated according to the zoom multiple. The tag therefore follows the object, which solves the technical problem in the prior art that, when the camera rotates or the picture zooms, the position of an object in the video picture changes but the object's tag cannot move with it.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings required for describing the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description show only some embodiments of the present application, and that a person skilled in the art may obtain other drawings from these drawings without inventive effort.
Fig. 1 is a schematic flow chart of an augmented reality tag tracking method according to an embodiment of the present application;
fig. 2 is another flow chart of an augmented reality tag tracking method according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of an augmented reality tag tracking apparatus according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the present application solution better understood by those skilled in the art, the following description will clearly and completely describe the technical solution in the embodiments of the present application with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
For ease of understanding, referring to fig. 1, an embodiment of the present application provides an augmented reality tag tracking method, including:
and 101, when the tag points are marked in the camera picture, calculating camera coordinates of the tag points according to internal parameters of the camera and screen coordinates of the tag points, and calculating world coordinates of the tag points according to the camera coordinates of the tag points.
When a target in the camera picture needs to be marked, a point can be marked on the target in the camera picture and used as a tag point; when the tag point is marked, the screen coordinates of the tag point can be obtained.
The camera needs to be calibrated to obtain its internal parameters and external parameters; considering lens distortion, the distortion coefficients are also obtained during calibration. For example, a checkerboard picture can be held in front of the camera and 3 photographs (facing forward, tilted sideways, tilted upward) taken at each of the four corners and the center of the picture, followed by 5 photographs at random positions; the internal parameters, external parameters and distortion coefficients of the camera are then calculated from the 20 collected photographs. The specific calculation process can refer to the prior art and is not described in detail here.
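By way of illustration, a minimal calibration sketch using OpenCV's standard checkerboard routine is given below; the board geometry, file name pattern and number of views are illustrative assumptions rather than values prescribed by the application.

```python
import glob
import cv2
import numpy as np

# Assumed checkerboard geometry (illustrative): 9x6 inner corners, 25 mm squares.
pattern_size = (9, 6)
square_size = 0.025  # metres

# World coordinates of the checkerboard corners on the Z = 0 plane.
board = np.zeros((pattern_size[0] * pattern_size[1], 3), np.float32)
board[:, :2] = np.mgrid[0:pattern_size[0], 0:pattern_size[1]].T.reshape(-1, 2) * square_size

obj_points, img_points = [], []
for path in glob.glob("calib_*.jpg"):          # the ~20 collected photographs (assumed file names)
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern_size)
    if found:
        obj_points.append(board)
        img_points.append(corners)

# K: internal parameter matrix, dist: distortion coefficients,
# rvecs/tvecs: per-view external parameters.
ret, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
```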
The internal parameters of the camera describe the relation between the camera space of the camera and the screen space, and the conversion from the screen point to the three-dimensional point of the camera space can be realized through the internal parameters, namely:
camera coordinates × internal parameters = screen coordinates
However, in practice the screen coordinates acquired from the camera picture have been distorted by the lens. In order to obtain more accurate camera coordinates, the screen coordinates in the picture need to be restored to undistorted coordinates first, and the internal parameter conversion is performed afterwards. The undistorted screen coordinates of the tag point can be obtained according to the internal parameters and distortion coefficients of the camera and the screen coordinates of the tag point; specifically, they can be solved by calling the undistortPoints function in the image processing library OpenCV, namely:
undistorted screen coordinates = cv.undistortPoints(screen coordinates, internal parameters, distortion coefficients)
After the undistorted screen coordinates of the tag point are obtained, the camera coordinates of the tag point are calculated according to the internal parameters of the camera and the undistorted screen coordinates of the tag point. Specifically, the camera coordinates of the tag point are obtained as the product of the undistorted screen coordinates of the tag point and the inverse matrix of the internal parameters, namely:
camera coordinates = undistorted screen coordinates × inverse matrix of the internal parameters
After the camera coordinates of the tag point are obtained, the world coordinates of the tag point in world space can be calculated according to the external parameters of the camera and the camera coordinates of the tag point.
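Putting the above steps together, a minimal sketch of the screen-to-camera-to-world conversion might look as follows. It assumes the external parameters are available as a rotation matrix R and translation vector t (world to camera), and that a depth along the viewing ray must be chosen, since a single pixel only fixes a ray; the function name and the unit-depth default are illustrative assumptions.

```python
import cv2
import numpy as np

def tag_screen_to_world(screen_xy, K, dist, R, t, depth=1.0):
    """Map a tag point's screen coordinates to camera and world coordinates.

    K, dist : internal parameters and distortion coefficients from calibration.
    R, t    : external parameters (world -> camera rotation and translation).
    depth   : assumed depth along the viewing ray (a modelling assumption).
    """
    pts = np.array([[screen_xy]], dtype=np.float32)        # shape (1, 1, 2)
    # Undistorted, normalised image coordinates -- equivalent to multiplying the
    # undistorted pixel coordinates by the inverse of the internal parameter matrix.
    norm = cv2.undistortPoints(pts, K, dist).reshape(2)
    cam = np.array([norm[0], norm[1], 1.0]) * depth        # camera coordinates of the tag point
    world = R.T @ (cam - np.asarray(t, dtype=np.float64))  # invert the external parameters
    return cam, world
```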
Step 102, when the camera rotates, a transformation matrix after the rotation of the camera is obtained, new camera coordinates are calculated through the transformation matrix and world coordinates of the tag points, and the new camera coordinates of the tag points are mapped into a camera picture based on internal parameters of the camera, so that screen coordinates of the tag points after the rotation of the camera are obtained.
When the camera rotates, the position of the target in the camera picture changes and the tag point changes with it. The transformation matrix after the camera rotation is calculated according to the rotated angle of the camera, and the new camera coordinates of the tag point can be calculated from the transformation matrix and the world coordinates of the tag point, namely:
New camera coordinates = world coordinates × inverse of the transformation matrix after camera rotation
Then, the new camera coordinates of the tag point are mapped into the camera picture in combination with the internal parameters of the camera to obtain the screen coordinates of the tag point after the camera rotation. Note that the undistorted screen coordinates of the tag point were calculated from the distortion coefficients in the above steps, and the camera coordinates are likewise calculated based on the undistorted screen coordinates.
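A minimal sketch of this re-projection step is shown below. How the post-rotation transformation matrix is built (for example from the PTZ pan/tilt angles) is an assumption, since the application only states that it is obtained from the rotated angle of the camera.

```python
import numpy as np

def tag_world_to_screen_after_rotation(world_xyz, K, R_new, t_new):
    """Re-project a tag point into the picture after the camera has rotated.

    R_new, t_new : transformation (external parameters) of the rotated camera,
                   e.g. built from the pan/tilt angles (an assumption).
    K            : internal parameter matrix of the camera.
    """
    world = np.asarray(world_xyz, dtype=np.float64)
    # Step 1: new camera coordinates from the transformation matrix and the
    # world coordinates of the tag point.
    cam_new = R_new @ world + np.asarray(t_new, dtype=np.float64)
    # Step 2: map the new camera coordinates back into the camera picture with
    # the internal parameters (distortion is ignored here because the tag
    # coordinates were kept in undistorted form in the steps above).
    uvw = K @ cam_new
    return uvw[:2] / uvw[2]    # screen coordinates of the tag point after rotation
```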
Step 103, when the camera zooms, the current zoom multiple is acquired, and the screen coordinates of the tag point after the camera zoom are calculated according to the current zoom multiple and the screen coordinates of the tag point before the camera zoom.
In one embodiment, a camera with a maximum zoom of 4 times is taken as the research object. By overlaying and comparing pictures, it is found that the picture after physical zooming can be completely covered by directly enlarging the corresponding region of the picture in the non-zoomed state, and the two exhibit a multiple relation. Examining the same point in the picture at each zoom level, the displacement of the point between different zoom levels forms an arithmetic progression during zooming. Through this rule, the zoom multiple can be related to the zoomed coordinates, thereby determining the new position of each screen coordinate after zooming.
Specifically, the screen coordinates of the tag point after the camera zooms by 1 time are obtained, and the coordinate distance between the screen coordinates of the tag point when the camera is not zoomed and the screen coordinates after the camera zooms by 1 time is calculated; the coordinate distance includes a horizontal coordinate distance and a vertical coordinate distance.
The zoom distance is then calculated according to the current zoom multiple and the coordinate distance, and the screen coordinates of the tag point after the camera zoom are calculated according to the zoom distance and the screen coordinates of the tag point before the camera zoom. That is, the horizontal coordinate distance and the vertical coordinate distance are multiplied by the current zoom multiple to obtain the horizontal and vertical zoom distances, and the screen coordinates of the tag point after the camera zoom are then calculated from these zoom distances and the screen coordinates of the tag point before the camera zoom.
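Under the arithmetic-progression rule described above, this zoom step can be sketched as follows; the function and parameter names are illustrative.

```python
def zoomed_tag_coords(p0, p1, zoom_multiple):
    """Screen coordinates of a tag point after the camera zooms.

    p0            : (x, y) screen coordinates of the tag point before zooming.
    p1            : (x, y) screen coordinates of the same point after the camera
                    zooms by 1 time (measured once, as described above).
    zoom_multiple : current zoom multiple.

    The per-axis displacement between zoom levels is assumed to follow the
    arithmetic-progression rule, so the displacement at the current zoom
    multiple is the 1-time displacement scaled by that multiple.
    """
    dx, dy = p1[0] - p0[0], p1[1] - p0[1]        # coordinate distance at the 1-time zoom
    return (p0[0] + dx * zoom_multiple,          # zoom distance added to the
            p0[1] + dy * zoom_multiple)          # pre-zoom screen coordinates
```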
In the embodiment of the application, the new screen coordinates of the tag point in different zoom states are calculated through the relation between zoom levels, so that tag following can be realized quickly.
In another embodiment, in order to further improve the calculation efficiency, a preset coordinate scaling model is obtained through training, and the current scaling multiple, the screen coordinates of the tag point before scaling of the camera and the current angle of the camera are input into the preset coordinate scaling model to perform coordinate scaling, so that the screen coordinates of the tag point after scaling of the camera are obtained.
The training process of the preset coordinate scaling model comprises the following steps:
acquiring screen coordinates of a plurality of second target points in a camera picture, actual screen coordinates of each second target point after zooming of the camera, zooming times and the current angle of the camera;
and taking the scaling multiple, the screen coordinates of each second target point before the scaling of the camera and the current angle of the camera as inputs, and taking the actual screen coordinates of each second target point after the scaling of the camera as outputs to train the convolutional neural network, so as to obtain a preset coordinate scaling model.
In this embodiment of the present application, the screen coordinates of a plurality of second target points in the camera picture are obtained, where the selected second target points may be corner points. The camera is then zoomed by 1, 1.5, 2, 3, 4 times and so on, and the actual screen coordinates of each second target point after the camera zooms by each multiple, together with the angle of the camera, are acquired. The zoom multiple, the screen coordinates of each second target point before the camera zoom and the current angle of the camera are taken as inputs, and the actual screen coordinates of each second target point after the camera zoom are taken as outputs, to train the convolutional neural network. The convolutional neural network may be chosen as a shallow multi-layer perceptron, which calculates the weight of each feature so as to learn the weight relationships among the features. Because the camera may have a physical offset and deviations may be introduced when collecting the data, a Sigmoid function is selected as the activation function to alleviate and correct the deviations, so that the prediction result of the network is as close to the ideal state as possible. The convolutional neural network outputs a predicted value of the zoomed screen coordinates of the second target point; a loss value is calculated from the actual screen coordinates and the predicted screen coordinates, and the network parameters of the convolutional neural network are updated in reverse through the loss value until the convolutional neural network converges (for example, when the maximum number of iterations is reached, the training error falls below an error threshold, or the training error converges near a certain value), giving a trained convolutional neural network model, which is used as the preset coordinate scaling model.
After the preset coordinate scaling model is obtained, the current scaling multiple, the screen coordinates of the tag point before the scaling of the camera and the current angle of the camera are input into the preset coordinate scaling model, and the screen coordinates of the tag point after the scaling of the camera are calculated according to the current scaling multiple, the camera angle and the screen coordinates of the tag point before the scaling of the camera through the preset coordinate scaling model.
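The following sketch illustrates one way such a preset coordinate scaling model could be trained and queried, here with a small PyTorch multi-layer perceptron using a Sigmoid activation as described above; the network size, optimizer, iteration count, error threshold and placeholder data are illustrative assumptions rather than values specified by the application.

```python
import torch
import torch.nn as nn

# Illustrative training data only: each input row is
# [zoom_multiple, x_before, y_before, pan_angle, tilt_angle],
# each output row is [x_after, y_after] for a second target point.
X = torch.rand(200, 5)      # placeholder for the collected samples
Y = torch.rand(200, 2)      # placeholder for the actual post-zoom coordinates

model = nn.Sequential(                 # shallow multi-layer perceptron with Sigmoid
    nn.Linear(5, 32), nn.Sigmoid(),
    nn.Linear(32, 2),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for epoch in range(5000):              # stop earlier once the error converges
    pred = model(X)
    loss = loss_fn(pred, Y)            # loss between predicted and actual coordinates
    optimizer.zero_grad()
    loss.backward()                    # update the network parameters in reverse
    optimizer.step()
    if loss.item() < 1e-4:             # assumed error threshold
        break

# Inference: screen coordinates of the tag point after the camera zooms, given the
# current zoom multiple, pre-zoom coordinates and camera angles (illustrative values).
with torch.no_grad():
    new_xy = model(torch.tensor([[2.0, 640.0, 360.0, 15.0, -5.0]]))
```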
In the embodiment of the application, after the tag point is marked, its camera coordinates are calculated from the internal parameters of the camera and the screen coordinates of the tag point, and its world coordinates are then obtained. When the camera rotates, the world coordinates of the tag point are converted into camera coordinates according to the transformation matrix after the camera rotation, giving new camera coordinates of the tag point, and the new camera coordinates are remapped into the camera picture according to the internal parameters of the camera to obtain the screen coordinates of the tag point after the rotation. When the camera zooms, the screen coordinates of the tag point after the zoom are calculated according to the zoom multiple, so that the tag follows the object, which solves the technical problem in the prior art that, when the camera rotates or the picture zooms, the position of an object in the video picture changes but the object's tag cannot move with it.
The foregoing is one embodiment of an augmented reality tag tracking method provided by the present application, and the following is another embodiment of an augmented reality tag tracking method provided by the present application.
Step 201, when a tag point is marked in the camera picture, camera coordinates of the tag point are calculated according to the internal parameters of the camera and the screen coordinates of the tag point, and world coordinates of the tag point are calculated according to the camera coordinates of the tag point.
The details of step 201 are the same as those of step 101, and will not be described here again.
Step 202, when the camera rotates, a transformation matrix after the rotation of the camera is obtained, new camera coordinates are calculated through the transformation matrix and world coordinates of the tag points, and the new camera coordinates of the tag points are mapped into a camera picture based on internal parameters of the camera, so that screen coordinates of the tag points after the rotation of the camera are obtained.
The details of step 202 are the same as those of step 102, and will not be described here again.
Step 203, the screen coordinates of the tag point after the camera rotation are physically corrected to obtain corrected screen coordinates of the tag point.
In the embodiment of the application, there is a certain offset between the new screen coordinates of the tag point calculated after the camera rotation (i.e. the screen coordinates of the tag point after the camera rotation) and the actual screen coordinates of the tag point in the camera picture. The reason is that, when the camera is actually installed, it is difficult to make the virtual origin of the camera coincide with the actual physical rotation center, so the camera as a whole is somewhat offset during rotation, and the rotated screen coordinates are offset accordingly. In order to obtain the screen coordinates of the tag point after the camera rotation more accurately, the embodiment of the application physically corrects the calculated screen coordinates of the tag point after the camera rotation.
Specifically, marking a plurality of first target points in a target picture of the camera, and calculating screen coordinates of each first target point after the camera rotates to obtain a screen coordinate calculated value of each first target point after the camera rotates;
acquiring actual screen coordinates of each first target point after the camera rotates;
taking the screen coordinate calculated value of the first target point after the rotation of the camera as input, and taking the actual screen coordinates of the first target point after the rotation of the camera as output, to train a convolutional neural network and obtain a coordinate offset correction model;
and physically correcting the screen coordinates of the tag points after the rotation of the camera through the coordinate offset correction model to obtain corrected screen coordinates of the tag points.
Specifically, at least 3 first target points are marked in the camera picture and their screen coordinates are acquired. Screen coordinates are collected after the camera rotates in at least two directions, such as 30 degrees and 180 degrees, giving the actual screen coordinates of each first target point after the camera rotation, and the screen coordinate calculated values of each first target point after the camera rotation are obtained by calculation. The screen coordinate calculated values of the first target points after the camera rotation are taken as input, and the actual screen coordinates of the first target points after the camera rotation are taken as output, to train a convolutional neural network, and the trained convolutional neural network model is used as the coordinate offset correction model. The calculated screen coordinates of the tag point after the camera rotation are then input into the coordinate offset correction model, so that the model physically corrects them according to the learned offset parameters and outputs the corrected screen coordinates of the tag point, thereby eliminating the physical offset of the rotated screen coordinates.
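As an illustration only, the coordinate offset correction model could be trained along the following lines, here with scikit-learn's MLPRegressor (logistic, i.e. Sigmoid, activation) standing in for the convolutional neural network described above; the hidden-layer size, iteration limit and sample values are assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Illustrative data only: screen coordinate calculated values of the first target
# points after rotation, and their actual screen coordinates after rotation
# (in practice collected for at least 3 points and at least two rotation
# directions, e.g. 30 and 180 degrees).
calc_xy = np.array([[412.0, 310.0], [958.0, 290.0], [640.0, 505.0]])
actual_xy = np.array([[418.5, 303.2], [965.1, 284.0], [646.8, 498.9]])

# A small regressor with Sigmoid activation standing in for the coordinate offset
# correction model; the application describes training a convolutional neural
# network, so this is only an illustrative substitute.
offset_model = MLPRegressor(hidden_layer_sizes=(16,), activation="logistic",
                            max_iter=5000, random_state=0).fit(calc_xy, actual_xy)

def correct_rotated_tag(calculated_screen_xy):
    """Physically correct the calculated post-rotation screen coordinates of a tag point."""
    return offset_model.predict(np.asarray([calculated_screen_xy]))[0]
```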
Step 204, when the camera zooms, the current zoom multiple is acquired, and the screen coordinates of the tag point after the camera zoom are calculated according to the current zoom multiple and the screen coordinates of the tag point before the camera zoom.
In one embodiment, a camera with a maximum zoom of 4 times is taken as the research object. By overlaying and comparing pictures, it is found that the picture after physical zooming can be completely covered by directly enlarging the corresponding region of the picture in the non-zoomed state, and the two exhibit a multiple relation. Examining the same point in the picture at each zoom level, the displacement of the point between different zoom levels forms an arithmetic progression during zooming. Through this rule, the zoom multiple can be related to the zoomed coordinates, thereby determining the new position of each screen coordinate after zooming.
Specifically, the screen coordinates of the tag point after the camera zooms by 1 time are obtained, and the coordinate distance between the screen coordinates of the tag point when the camera is not zoomed and the screen coordinates after the camera zooms by 1 time is calculated; the coordinate distance includes a horizontal coordinate distance and a vertical coordinate distance.
The zoom distance is then calculated according to the current zoom multiple and the coordinate distance, and the screen coordinates of the tag point after the camera zoom are calculated according to the zoom distance and the screen coordinates of the tag point before the camera zoom. That is, the horizontal coordinate distance and the vertical coordinate distance are multiplied by the current zoom multiple to obtain the horizontal and vertical zoom distances, and the screen coordinates of the tag point after the camera zoom are then calculated from these zoom distances and the screen coordinates of the tag point before the camera zoom.
In the embodiment of the application, the new screen coordinates of the tag point in different zoom states are calculated through the relation between zoom levels, so that tag following can be realized quickly.
Further, in order to obtain more accurate zoomed screen coordinates, the embodiment of the present application also corrects the calculated screen coordinates of the tag point after the camera zoom, specifically:
calculating the screen coordinates of a plurality of third target points in the camera picture after the camera zooms according to each zoom multiple, and obtaining screen coordinate calculated values of the third target points under each zoom multiple;
acquiring the actual screen coordinates corresponding to each third target point under each zoom multiple;
obtaining the coordinate offset under each zoom multiple by fitting the screen coordinate calculated values of each third target point under each zoom multiple to the actual screen coordinates;
and correcting the screen coordinates of the tag point after the camera zoom according to the coordinate offset corresponding to the current zoom multiple.
In the embodiment of the application, the coordinate offset under each zoom multiple is obtained by fitting the screen coordinate calculated values of each third target point under different zoom multiples to the corresponding actual screen coordinates. After the current zoom multiple is obtained, the coordinate offset under the current zoom multiple can be determined from the relation between zoom multiple and coordinate offset, and the calculated screen coordinates of the tag point after the camera zoom are offset-corrected with this coordinate offset to obtain the corrected screen coordinates of the tag point after the camera zoom, thereby eliminating the coordinate offset.
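A minimal sketch of this offset fitting and correction is given below; fitting the offset as a low-order polynomial of the zoom multiple is an assumed choice, since the application does not prescribe a particular fitting model, and the measurement values are placeholders.

```python
import numpy as np

# Illustrative measurements only: zoom multiples at which the third target points
# were collected, and the mean offset (calculated minus actual screen coordinate)
# observed at each multiple.
zoom_levels = np.array([1.0, 1.5, 2.0, 3.0, 4.0])
offsets_x   = np.array([0.0, 1.8, 3.1, 6.0, 8.7])
offsets_y   = np.array([0.0, 1.1, 2.4, 4.9, 7.2])

# Fit the offset as a low-order polynomial of the zoom multiple (an assumed
# fitting model; the application only says the offsets are obtained by fitting).
fit_x = np.polynomial.Polynomial.fit(zoom_levels, offsets_x, deg=2)
fit_y = np.polynomial.Polynomial.fit(zoom_levels, offsets_y, deg=2)

def correct_zoomed_tag(calc_xy, zoom_multiple):
    """Subtract the fitted coordinate offset at the current zoom multiple
    from the calculated post-zoom screen coordinates of the tag point."""
    return (calc_xy[0] - fit_x(zoom_multiple),
            calc_xy[1] - fit_y(zoom_multiple))
```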
In another embodiment, in order to further improve the calculation efficiency, a preset coordinate scaling model is obtained through training, and the current scaling multiple, the screen coordinates of the tag point before scaling of the camera and the current angle of the camera are input into the preset coordinate scaling model to perform coordinate scaling, so that the screen coordinates of the tag point after scaling of the camera are obtained.
The training process of the preset coordinate scaling model comprises the following steps:
acquiring screen coordinates of a plurality of second target points in a camera picture, actual screen coordinates of each second target point after zooming of the camera, zooming times and the current angle of the camera;
and taking the scaling multiple, the screen coordinates of each second target point before the scaling of the camera and the current angle of the camera as inputs, and taking the actual screen coordinates of each second target point after the scaling of the camera as outputs to train the convolutional neural network, so as to obtain a preset coordinate scaling model.
In this embodiment of the present application, the screen coordinates of a plurality of second target points in the camera picture are obtained, where the selected second target points may be corner points. The camera is then zoomed by 1, 1.5, 2, 3, 4 times and so on, and the actual screen coordinates of each second target point after the camera zooms by each multiple, together with the angle of the camera, are acquired. The zoom multiple, the screen coordinates of each second target point before the camera zoom and the current angle of the camera are taken as inputs, and the actual screen coordinates of each second target point after the camera zoom are taken as outputs, to train the convolutional neural network. The convolutional neural network may be chosen as a shallow multi-layer perceptron, which calculates the weight of each feature so as to learn the weight relationships among the features. Because the camera may have a physical offset and deviations may be introduced when collecting the data, a Sigmoid function is selected as the activation function to alleviate and correct the deviations, so that the prediction result of the network is as close to the ideal state as possible. The convolutional neural network outputs a predicted value of the zoomed screen coordinates of the second target point; a loss value is calculated from the actual screen coordinates and the predicted screen coordinates, and the network parameters of the convolutional neural network are updated in reverse through the loss value until the convolutional neural network converges (for example, when the maximum number of iterations is reached, the training error falls below an error threshold, or the training error converges near a certain value), giving a trained convolutional neural network model, which is used as the preset coordinate scaling model.
After the preset coordinate scaling model is obtained, the current scaling multiple, the screen coordinates of the tag point before the scaling of the camera and the current angle of the camera are input into the preset coordinate scaling model, and the screen coordinates of the tag point after the scaling of the camera are calculated according to the current scaling multiple, the camera angle and the screen coordinates of the tag point before the scaling of the camera through the preset coordinate scaling model.
In the embodiment of the application, after the tag point is marked, its camera coordinates are calculated from the internal parameters of the camera and the screen coordinates of the tag point, and its world coordinates are then obtained. When the camera rotates, the world coordinates of the tag point are converted into camera coordinates according to the transformation matrix after the camera rotation, giving new camera coordinates of the tag point, and the new camera coordinates are remapped into the camera picture according to the internal parameters of the camera to obtain the screen coordinates of the tag point after the rotation. When the camera zooms, the screen coordinates of the tag point after the zoom are calculated according to the zoom multiple, so that the tag follows the object, which solves the technical problem in the prior art that, when the camera rotates or the picture zooms, the position of an object in the video picture changes but the object's tag cannot move with it.
Further, the embodiment of the application trains a coordinate offset correction model and uses it to physically correct the calculated screen coordinates of the tag point after the camera rotation, eliminating the physical offset of the screen coordinates; it also corrects the calculated screen coordinates of the tag point after the camera zoom by fitting the coordinate offset under each zoom multiple, so that the tag follows the object more accurately after the camera rotates and zooms.
Referring to fig. 3, an embodiment of the present application further provides an augmented reality tag tracking apparatus, including:
the first calculating unit is used for calculating camera coordinates of the tag points according to the internal parameters of the camera and screen coordinates of the tag points when the tag points are marked in the camera picture, and calculating world coordinates of the tag points according to the camera coordinates of the tag points;
the second calculation unit is used for obtaining a transformation matrix after the rotation of the camera when the camera rotates, calculating new camera coordinates through the transformation matrix and world coordinates of the tag points, and mapping the new camera coordinates of the tag points into a camera picture based on internal parameters of the camera to obtain screen coordinates of the tag points after the rotation of the camera;
And the third calculation unit is used for obtaining the current scaling multiple when the camera is scaled, and calculating the screen coordinates of the tag point after the scaling of the camera according to the current scaling multiple and the screen coordinates of the tag point before the scaling of the camera.
As a further refinement, the first computing unit is specifically configured to:
when a tag point is marked in the camera picture, acquiring undistorted screen coordinates of the tag point according to the internal parameters and distortion coefficients of the camera and the screen coordinates of the tag point;
calculating camera coordinates of the tag point according to the internal parameters of the camera and the undistorted screen coordinates of the tag point;
world coordinates of the tag points are calculated from camera coordinates of the tag points.
As a further improvement, the device further comprises: a physical correction unit configured to:
marking a plurality of first target points in a target picture of the camera, and calculating screen coordinates of each first target point after the camera rotates to obtain a screen coordinate calculated value of each first target point after the camera rotates;
acquiring actual screen coordinates of each first target point after the camera rotates;
taking the screen coordinate calculated value of the first target point after the rotation of the camera as input, and taking the actual screen coordinates of the first target point after the rotation of the camera as output, to train a convolutional neural network and obtain a coordinate offset correction model;
And physically correcting the screen coordinates of the tag points after the rotation of the camera through the coordinate offset correction model to obtain corrected screen coordinates of the tag points.
As a further improvement, a third calculation unit is specifically configured to:
when the camera zooms, obtaining the current zoom multiple;
and inputting the current scaling multiple, the screen coordinates of the tag point before scaling of the camera and the current angle of the camera into a preset coordinate scaling model for coordinate scaling to obtain the screen coordinates of the tag point after scaling of the camera.
As a further improvement, the training process of the preset coordinate scaling model includes:
acquiring screen coordinates of a plurality of second target points in a camera picture, actual screen coordinates of each second target point after zooming of the camera, zooming times and the current angle of the camera;
and taking the scaling multiple, the screen coordinates of each second target point before the scaling of the camera and the current angle of the camera as inputs, and taking the actual screen coordinates of each second target point after the scaling of the camera as outputs to train the convolutional neural network, so as to obtain a preset coordinate scaling model.
As a further improvement, a third calculation unit is specifically configured to:
When the camera zooms, obtaining the current zoom multiple;
acquiring screen coordinates of the tag point after the camera zooms by 1 time, and calculating the coordinate distance between the screen coordinates of the tag point when the camera is not zoomed and the screen coordinates after the camera zooms by 1 time;
and calculating the scaling distance according to the current scaling multiple and the coordinate distance, and calculating the screen coordinates of the tag point after the scaling of the camera according to the scaling distance and the screen coordinates of the tag point before the scaling of the camera.
As a further improvement, the device further comprises: a coordinate shift correction unit configured to:
calculating screen coordinates of a plurality of third target points in the camera picture after the camera zooms according to each zoom factor, and obtaining screen coordinate calculated values of the third target points under each zoom factor;
acquiring actual screen coordinates corresponding to each third target point under each scaling multiple;
obtaining coordinate offset under each scaling multiple by fitting a screen coordinate calculated value of each third target point under each scaling multiple with an actual screen coordinate;
and correcting the screen coordinates of the tag point after the scaling of the camera according to the coordinate offset corresponding to the current scaling multiple.
In the embodiment of the application, after the tag point is marked, its camera coordinates are calculated from the internal parameters of the camera and the screen coordinates of the tag point, and its world coordinates are then obtained. When the camera rotates, the world coordinates of the tag point are converted into camera coordinates according to the transformation matrix after the camera rotation, giving new camera coordinates of the tag point, and the new camera coordinates are remapped into the camera picture according to the internal parameters of the camera to obtain the screen coordinates of the tag point after the rotation. When the camera zooms, the screen coordinates of the tag point after the zoom are calculated according to the zoom multiple, so that the tag follows the object, which solves the technical problem in the prior art that, when the camera rotates or the picture zooms, the position of an object in the video picture changes but the object's tag cannot move with it.
Further, the embodiment of the application trains a coordinate offset correction model and uses it to physically correct the calculated screen coordinates of the tag point after the camera rotation, eliminating the physical offset of the screen coordinates; it also corrects the calculated screen coordinates of the tag point after the camera zoom by fitting the coordinate offset under each zoom multiple, so that the tag follows the object more accurately after the camera rotates and zooms.
Referring to fig. 4, an embodiment of the present application further provides an electronic device, where the device includes a processor and a memory;
the memory is used for storing the program codes and transmitting the program codes to the processor;
the processor is configured to perform the augmented reality tag tracking method of the foregoing method embodiments according to instructions in the program code.
The embodiment of the application also provides a computer readable storage medium, and the computer readable storage medium is used for storing program codes, and when the program codes are executed by a processor, the augmented reality label tracking method in the embodiment of the method is realized.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the apparatus and units described above may refer to corresponding procedures in the foregoing method embodiments, which are not described herein again.
The terms "first," "second," "third," "fourth," and the like in the description of the present application and in the above-described figures, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that embodiments of the present application described herein may be capable of operation in sequences other than those illustrated or described herein, for example. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be understood that in this application, "at least one" means one or more, and "a plurality" means two or more. "and/or" for describing the association relationship of the association object, the representation may have three relationships, for example, "a and/or B" may represent: only a, only B and both a and B are present, wherein a, B may be singular or plural. The character "/" generally indicates that the context-dependent object is an "or" relationship. "at least one of" or the like means any combination of these items, including any combination of single item(s) or plural items(s). For example, at least one (one) of a, b or c may represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", wherein a, b, c may be single or plural.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application may be embodied in essence or a part contributing to the prior art or all or part of the technical solution in the form of a software product stored in a storage medium, including several instructions to execute all or part of the steps of the methods described in the embodiments of the present application by a computer device (which may be a personal computer, a server, or a network device, etc.). And the aforementioned storage medium includes: u disk, mobile hard disk, read-Only Memory (ROM), random access Memory (Random Access Memory, RAM), magnetic disk or optical disk, etc.
The above embodiments are merely intended to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the corresponding technical solutions.
Claims (9)
1. An augmented reality tag tracking method, comprising:
when a tag point is marked in a camera picture, calculating camera coordinates of the tag point according to internal parameters of the camera and screen coordinates of the tag point, and calculating world coordinates of the tag point according to the camera coordinates of the tag point;
when the camera rotates, obtaining a transformation matrix after the rotation of the camera, calculating new camera coordinates through the transformation matrix and the world coordinates of the tag point, and mapping the new camera coordinates of the tag point into the camera picture based on the internal parameters of the camera, so as to obtain screen coordinates of the tag point after the rotation of the camera;
when the camera zooms, obtaining the current zoom multiple, and calculating the screen coordinates of the tag point after the camera zooms according to the current zoom multiple and the screen coordinates of the tag point before the camera zooms;
the method further comprises the steps of:
marking a plurality of first target points in a target picture of the camera, and calculating the screen coordinates of each first target point after the camera rotates to obtain a calculated screen coordinate value of each first target point after the camera rotation;
acquiring actual screen coordinates of each first target point after the camera rotates;
taking the calculated screen coordinate value of each first target point after the rotation of the camera as input, and taking the actual screen coordinates of each first target point after the rotation of the camera as output, training a convolutional neural network to obtain a coordinate offset correction model;
and correcting the screen coordinates of the tag point after the rotation of the camera through the coordinate offset correction model to obtain corrected screen coordinates of the tag point.
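For illustration only (not part of the claimed subject matter): a minimal sketch of how the back-projection and re-projection in claim 1 could be realized with a standard pinhole camera model, assuming the tag point's depth is known and the post-rotation pose is available as a rotation matrix and translation vector; all function names and numeric values below are placeholders.

```python
import numpy as np

# Assumption: pinhole model with X_cam = R @ X_world + t and a known depth for the tag point.

def screen_to_camera(uv, K, depth):
    """Back-project a screen point (u, v) to camera coordinates at a given depth."""
    uv1 = np.array([uv[0], uv[1], 1.0])
    return depth * (np.linalg.inv(K) @ uv1)          # X_cam = depth * K^-1 [u v 1]^T

def camera_to_world(X_cam, R, t):
    """Invert X_cam = R @ X_world + t to get world coordinates."""
    return np.linalg.inv(R) @ (X_cam - t)

def world_to_screen(X_world, K, R_new, t_new):
    """Re-project a world point with the post-rotation pose (transformation matrix)."""
    X_cam_new = R_new @ X_world + t_new
    uvw = K @ X_cam_new
    return uvw[:2] / uvw[2]                           # perspective divide -> (u, v)

# Example with made-up numbers
K = np.array([[800.0, 0.0, 640.0],
              [0.0, 800.0, 360.0],
              [0.0,   0.0,   1.0]])
R0, t0 = np.eye(3), np.zeros(3)                       # pose when the tag point is marked
theta = np.deg2rad(5.0)                               # camera pans 5 degrees
R1 = np.array([[np.cos(theta), 0.0, np.sin(theta)],
               [0.0, 1.0, 0.0],
               [-np.sin(theta), 0.0, np.cos(theta)]])
t1 = np.zeros(3)

X_cam = screen_to_camera((700.0, 400.0), K, depth=10.0)
X_world = camera_to_world(X_cam, R0, t0)
print(world_to_screen(X_world, K, R1, t1))            # screen coords after rotation
```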
2. The augmented reality tag tracking method according to claim 1, wherein the calculating the camera coordinates of the tag point from the internal parameters of the camera and the screen coordinates of the tag point comprises:
acquiring undistorted screen coordinates of the tag point according to the internal parameters and distortion coefficients of the camera and the screen coordinates of the tag point;
and calculating the camera coordinates of the tag point according to the internal parameters of the camera and the undistorted screen coordinates of the tag point.
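For illustration only: one common way to obtain undistorted coordinates is OpenCV's cv2.undistortPoints, which removes lens distortion given the intrinsic matrix and distortion coefficients; the coefficient values and depth below are placeholders.

```python
import numpy as np
import cv2

K = np.array([[800.0, 0.0, 640.0],
              [0.0, 800.0, 360.0],
              [0.0,   0.0,   1.0]])
dist = np.array([-0.12, 0.05, 0.0, 0.0, 0.0])         # k1, k2, p1, p2, k3 (example values)

pt = np.array([[[700.0, 400.0]]], dtype=np.float32)   # raw screen coordinates of the tag point

# Without P, the result is in normalized camera coordinates (x, y) with z = 1,
# which is what is needed to continue to camera coordinates at a given depth.
norm = cv2.undistortPoints(pt, K, dist)               # shape (1, 1, 2)
x, y = norm[0, 0]
depth = 10.0                                          # assumed known depth
X_cam = depth * np.array([x, y, 1.0])

# With P=K, the result is the undistorted point expressed again in pixel coordinates.
undist_px = cv2.undistortPoints(pt, K, dist, P=K)[0, 0]
print(X_cam, undist_px)
```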
3. The augmented reality tag tracking method according to claim 1, wherein calculating the screen coordinates of the tag point after the camera zoom according to the current zoom multiple and the screen coordinates of the tag point before the camera zoom comprises:
and inputting the current zoom multiple, the screen coordinates of the tag point before the camera zooms and the current angle of the camera into a preset coordinate scaling model for coordinate scaling, so as to obtain the screen coordinates of the tag point after the camera zooms.
4. The augmented reality tag tracking method of claim 3, wherein the training process of the preset coordinate scaling model comprises:
acquiring the screen coordinates of a plurality of second target points in the camera picture, the actual screen coordinates of each second target point after the camera zooms, the zoom multiple and the current angle of the camera;
and taking the zoom multiple, the screen coordinates of each second target point before the camera zooms and the current angle of the camera as input, and taking the actual screen coordinates of each second target point after the camera zooms as output, training a convolutional neural network to obtain the preset coordinate scaling model.
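For illustration only: a minimal PyTorch sketch of such a coordinate scaling model. The claim specifies a convolutional neural network; here the scalar inputs (zoom multiple, pre-zoom coordinates, camera angles) are treated as a short 1-D sequence so that 1-D convolutions apply. The architecture, the pan/tilt split of the camera angle, the hyperparameters and the random training data are all assumptions.

```python
import torch
import torch.nn as nn

# Input per sample: [zoom, u_before, v_before, pan_angle, tilt_angle] -> output: [u_after, v_after]
class CoordScaleNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=3, padding=1),  # treat the 5 features as a 1-D sequence
            nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        self.head = nn.Linear(32 * 5, 2)

    def forward(self, x):                  # x: (batch, 5)
        f = self.conv(x.unsqueeze(1))      # (batch, 32, 5)
        return self.head(f.flatten(1))     # (batch, 2)

model = CoordScaleNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Placeholder training data: (zoom, u, v, pan, tilt) -> actual post-zoom (u, v)
inputs = torch.rand(256, 5)
targets = torch.rand(256, 2)

for epoch in range(100):
    opt.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()
    opt.step()
```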
5. The augmented reality tag tracking method according to claim 3, wherein calculating the screen coordinates of the tag point after the camera zoom according to the current zoom multiple and the screen coordinates of the tag point before the camera zoom comprises:
acquiring the screen coordinates of the tag point after the camera zooms by a factor of 1, and calculating the coordinate distance between the screen coordinates of the tag point when the camera is not zoomed and the screen coordinates after the camera zooms by a factor of 1;
and calculating a zoom distance according to the current zoom multiple and the coordinate distance, and calculating the screen coordinates of the tag point after the camera zooms according to the zoom distance and the screen coordinates of the tag point before the camera zooms.
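For illustration only: a sketch of the geometric variant in claim 5, under the assumption that the tag point's displacement scales linearly with the zoom multiple relative to its displacement at a zoom of 1. The claim does not fix the exact scaling rule, so this linear reading is an assumption.

```python
import numpy as np

def zoomed_screen_coords(p_no_zoom, p_zoom_1x, zoom_multiple):
    """
    p_no_zoom:     screen coords of the tag point with no zoom, e.g. np.array([u, v])
    p_zoom_1x:     screen coords of the same point after the camera zooms by a factor of 1
    zoom_multiple: current zoom multiple

    Coordinate distance = displacement between the two reference positions;
    zoom distance = that displacement scaled by the current zoom multiple (linear assumption).
    """
    coordinate_distance = p_zoom_1x - p_no_zoom
    zoom_distance = zoom_multiple * coordinate_distance
    return p_no_zoom + zoom_distance

print(zoomed_screen_coords(np.array([700.0, 400.0]),
                           np.array([715.0, 410.0]),
                           zoom_multiple=3.0))
```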
6. The augmented reality tag tracking method of claim 5, further comprising:
calculating the screen coordinates of a plurality of third target points in the camera picture after the camera zooms according to each zoom multiple, and obtaining calculated screen coordinate values of the third target points under each zoom multiple;
acquiring the actual screen coordinates corresponding to each third target point under each zoom multiple;
obtaining a coordinate offset under each zoom multiple by fitting the calculated screen coordinate values of the third target points under each zoom multiple to the corresponding actual screen coordinates;
and correcting the screen coordinates of the tag point after the camera zooms according to the coordinate offset corresponding to the current zoom multiple.
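For illustration only: a sketch of the per-zoom-multiple offset correction in claim 6, using the mean residual between calculated and actual coordinates as the fitted offset. The claim does not fix the fitting method, and the arrays below are placeholder data.

```python
import numpy as np

# calculated[z] and actual[z]: (N, 2) arrays of screen coords for the third target
# points at zoom multiple z (placeholder data; a mean residual per zoom multiple is
# used here as the simplest possible fit).
calculated = {2.0: np.random.rand(8, 2) * 100, 4.0: np.random.rand(8, 2) * 100}
actual = {z: c + np.array([3.0, -1.5]) + np.random.randn(*c.shape) * 0.2
          for z, c in calculated.items()}

offsets = {z: (actual[z] - calculated[z]).mean(axis=0) for z in calculated}

def correct_zoomed_coords(tag_coords, zoom_multiple):
    """Apply the offset fitted for the current zoom multiple to the calculated tag coords."""
    return np.asarray(tag_coords) + offsets[zoom_multiple]

print(correct_zoomed_coords([712.0, 405.0], zoom_multiple=2.0))
```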
7. An augmented reality tag tracking device, comprising:
a first calculation unit, used for calculating, when a tag point is marked in the camera picture, camera coordinates of the tag point according to internal parameters of the camera and screen coordinates of the tag point, and calculating world coordinates of the tag point according to the camera coordinates of the tag point;
a second calculation unit, used for obtaining a transformation matrix after the rotation of the camera when the camera rotates, calculating new camera coordinates through the transformation matrix and the world coordinates of the tag point, and mapping the new camera coordinates of the tag point into the camera picture based on the internal parameters of the camera to obtain the screen coordinates of the tag point after the rotation of the camera;
a third calculation unit, used for obtaining the current zoom multiple when the camera zooms, and calculating the screen coordinates of the tag point after the camera zooms according to the current zoom multiple and the screen coordinates of the tag point before the camera zooms;
the apparatus further comprises: a physical correction unit configured to:
marking a plurality of first target points in a target picture of the camera, and calculating the screen coordinates of each first target point after the camera rotates to obtain a calculated screen coordinate value of each first target point after the camera rotation;
acquiring actual screen coordinates of each first target point after the camera rotates;
taking the calculated screen coordinate value of each first target point after the rotation of the camera as input, and taking the actual screen coordinates of each first target point after the rotation of the camera as output, training a convolutional neural network to obtain a coordinate offset correction model;
and correcting the screen coordinates of the tag point after the rotation of the camera through the coordinate offset correction model to obtain corrected screen coordinates of the tag point.
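For illustration only: a minimal PyTorch sketch of the coordinate offset correction model trained by the physical correction unit, regressing actual post-rotation screen coordinates from calculated ones; the network shape, training settings and data are placeholders.

```python
import torch
import torch.nn as nn

# Input: calculated screen coords (u, v) of a first target point after the camera rotates;
# output: its actual screen coords. Layer sizes and training settings are illustrative only.
class OffsetCorrectionNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 8, kernel_size=2),   # (batch, 8, 1) from the 2 coordinate values
            nn.ReLU(),
            nn.Flatten(),
            nn.Linear(8, 2),
        )

    def forward(self, x):                     # x: (batch, 2)
        return self.net(x.unsqueeze(1))

model = OffsetCorrectionNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

calc = torch.rand(128, 2) * 1000              # calculated coords after rotation (placeholder)
actual = calc + torch.tensor([4.0, -2.0])     # actual coords (placeholder)

for _ in range(200):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(calc), actual)
    loss.backward()
    opt.step()

corrected = model(torch.tensor([[712.0, 405.0]]))   # corrected tag-point coords
```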
8. An electronic device comprising a processor and a memory;
the memory is used for storing program codes and transmitting the program codes to the processor;
The processor is configured to perform the augmented reality tag tracking method of any one of claims 1-6 according to instructions in the program code.
9. A computer-readable storage medium for storing program code which, when executed by a processor, implements the augmented reality tag tracking method of any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311812529.9A CN117474984B (en) | 2023-12-27 | 2023-12-27 | Augmented reality tag tracking method, device, equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117474984A (en) | 2024-01-30
CN117474984B (en) | 2024-04-05
Family
ID=89638239
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311812529.9A (granted as CN117474984B, active) | Augmented reality tag tracking method, device, equipment and storage medium | 2023-12-27 | 2023-12-27 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117474984B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109034104A (en) * | 2018-08-15 | 2018-12-18 | 罗普特(厦门)科技集团有限公司 | A kind of scene tag localization method and device |
CN111881763A (en) * | 2020-06-30 | 2020-11-03 | 北京小米移动软件有限公司 | Method and device for determining user gaze position, storage medium and electronic equipment |
CN114706484A (en) * | 2022-04-18 | 2022-07-05 | Oppo广东移动通信有限公司 | Sight line coordinate determination method and device, computer readable medium and electronic equipment |
CN115909162A (en) * | 2022-11-28 | 2023-04-04 | 重庆紫光华山智安科技有限公司 | Video label transfer method, device, storage medium and label transfer equipment |
CN116740716A (en) * | 2023-06-14 | 2023-09-12 | 四川云从天府人工智能科技有限公司 | Video labeling method, video labeling device, electronic equipment and medium |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100245588A1 (en) * | 2009-03-31 | 2010-09-30 | Acuity Systems Inc. | Tag tracking system |
Also Published As
Publication number | Publication date |
---|---|
CN117474984A (en) | 2024-01-30 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
PE01 | Entry into force of the registration of the contract for pledge of patent right | Denomination of invention: An augmented reality tag tracking method, device, equipment, and storage medium. Granted publication date: 20240405. Pledgee: Zhuhai China Resources Bank Co.,Ltd. Guangzhou Branch. Pledgor: Kaitong Technology Co.,Ltd. Registration number: Y2024980031582 |