CN117079169B - Map scene adaptation method and system - Google Patents

Map scene adaptation method and system

Info

Publication number
CN117079169B
Authority
CN
China
Prior art keywords
data
image data
scene
video
video data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311345362.XA
Other languages
Chinese (zh)
Other versions
CN117079169A (en)
Inventor
杨海宁
邓泽西
栾德龙
张丙锐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
One Station Development Beijing Cloud Computing Technology Co ltd
Original Assignee
One Station Development Beijing Cloud Computing Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by One Station Development Beijing Cloud Computing Technology Co ltd
Priority to CN202311345362.XA
Publication of CN117079169A
Application granted
Publication of CN117079169B
Active legal status
Anticipated expiration

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/20 Scenes; Scene-specific elements in augmented reality scenes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75 Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2203/00 Indexing scheme relating to G06F3/00 - G06F3/048
    • G06F2203/01 Indexing scheme relating to G06F3/01
    • G06F2203/012 Walk-in-place systems for allowing a user to walk in a virtual environment while constraining him to a given position in the physical environment

Abstract

The invention discloses a map scene adaptation method and system in the technical field of virtual reality, comprising the steps of: acquiring first video data; identifying scene identification data in the first video data; acquiring position data of a user and judging whether the position data matches the scene identification data, and if so, generating a first interaction bubble; and synthesizing the first video data and the first interaction bubble into second video data. By grabbing local image data in the 3D scene in this way and adapting the positions of the different first interaction bubbles, the method prevents the first interaction bubbles from being placed where they would appear blurred or incompletely displayed, making them convenient for the user to view and operate.

Description

Map scene adaptation method and system
Technical Field
The invention belongs to the technical field of virtual reality, and particularly relates to a map scene adaptation method.
Background
The current three-dimensional visual interaction field includes VR and AR technologies.
VR is virtual reality technology, sometimes rendered in Chinese sources as "lingjing" (spirit realm) technology, a practical technology developed in the 20th century. It combines computer, electronic-information, and simulation technology: with computing as its core, it integrates the latest achievements of three-dimensional graphics, multimedia, simulation, display, servo, and other high technologies to generate, by means of computers and related equipment, a realistic virtual world offering three-dimensional visual, tactile, olfactory, and other sensory experiences, so that a person in the virtual world feels immersed in it. With the continuous development of social productivity and science and technology, demand for VR technology from all industries keeps growing. VR technology has made tremendous progress and has gradually become a new field of science and technology.
AR is augmented reality (Augmented Reality) technology, which skillfully fuses virtual information with the real world. It makes wide use of multimedia, three-dimensional modeling, real-time tracking and registration, intelligent interaction, sensing, and other technical means to simulate computer-generated virtual information such as text, images, three-dimensional models, music, and video, and to apply it to the real world, where the two kinds of information complement each other, thereby achieving an "augmentation" of the real world.
Current virtual reality technology does no more than add three-dimensional models to video, giving the user a sense of being present in the scene. It cannot be combined with an actual map scene to give the user additional interactive displays; that is, it cannot turn this novelty into something with a genuinely useful function.
For example, when people see a store, they immediately search for relevant coupons based on the store's text information. But this operating logic is too limited: the user cannot simultaneously view the coupons or other information of all the stores within their field of view.
Thus, there is a need for a map scene adaptation method that can adapt information meaningful to the user to the map scene.
Disclosure of Invention
The invention aims to provide a map scene adaptation method which is used for solving the problems in the prior art.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
The invention provides a map scene adaptation method, comprising the following steps:
Acquiring first video data;
identifying scene identification data in the first video data;
acquiring position data of a user, and judging whether the position data is matched with scene identification data or not; if so, generating a first interaction bubble;
synthesizing the first video data and the first interactive bubble into second video data;
wherein identifying scene identification data in the first video data comprises the following steps:
judging whether local image data of the first video data is completely consistent with pre-stored first image data; if so, converting the local image data into scene identification data; if not, selecting second image data from the pre-stored first image data using an image frame, wherein the area of the second image data is greater than or equal to a first proportion of the area of the first image data;
dividing the second image data into a plurality of grid images with a plurality of grids; judging whether local image data of one frame of the first video data is identical to all grid images of one piece of second image data; if so, taking that frame's local image as the center, judging whether the adjacent preceding and following frames of the first video data are identical to a second proportion of the grid images of the second image data; if so, judging whether the number of consecutive frames containing the local image data in the first video is greater than a first preset threshold; and if so, outputting the scene identification data corresponding to the local image data;
wherein synthesizing the first video data and the first interaction bubble into the second video data comprises the following steps:
if the local image data is completely consistent with the pre-stored first image data, configuring the first interaction bubble at the center of the scene identification data, and synthesizing the first video data and the first interaction bubble into the second video data;
if the local image data is not completely consistent, configuring the first interaction bubble at the center of the one frame of local image data of the scene identification data that is identical to all grid images of one piece of second image data, and synthesizing the first video data and the first interaction bubble into the second video data.
The invention provides a map scene adaptation method, wherein the step of judging whether the number of consecutive frames containing the local image data in the first video is greater than the first preset threshold comprises:
if so, acquiring the rotation track distance R and the moving track distance S of the user over the preceding first time period, and outputting the dynamic speed V according to the formula V = (R + 3S) / t, where t is the duration of the first time period; judging whether the dynamic speed V exceeds a second preset threshold; if not, judging whether the number of consecutive frames containing the local image data in the first video is greater than the first preset threshold; if so, judging whether the number of frames containing the local image data within the first time period is greater than a third preset threshold, and if so, judging that the number of consecutive frames containing the local image data in the first video is greater than the first preset threshold.
The invention provides a map scene adaptation method, wherein the area of the first interaction bubble is 1/3 of the area of the scene identification data.
The invention provides a map scene adaptation method, wherein judging whether the position data matches the scene identification data comprises the following steps:
checking whether the distance between the user's specific position and the scene position coordinates of the scene identification data is below a preset position threshold; if not, judging that they do not match; if so, configuring a sector with a central angle of 90 degrees, taking the user's specific position coordinates as the center of the circle, the preset position threshold as the radius, and the user's orientation as the center line of the sector, and judging whether the scene position coordinates of the scene identification data lie within the sector; if so, judging that they match, and if not, judging that they do not match.
The invention provides a map scene adaptation method, wherein the duration t of the first time period is less than or equal to the time length of the first video data.
The invention provides a system for the map scene adaptation method, comprising:
A video acquisition module for acquiring first video data;
a video identification module for identifying scene identification data in the first video data;
The position filtering module is used for acquiring position data of a user and judging whether the position data is matched with scene identification data or not; if so, generating a first interaction bubble;
a video synthesis module for synthesizing the first video data and the first interactive bubble into second video data;
wherein identifying scene identification data in the first video data comprises the following steps:
judging whether local image data of the first video data is completely consistent with pre-stored first image data; if so, converting the local image data into scene identification data; if not, selecting second image data from the pre-stored first image data using an image frame, wherein the area of the second image data is greater than or equal to a first proportion of the area of the first image data;
dividing the second image data into a plurality of grid images with a plurality of grids; judging whether local image data of one frame of the first video data is identical to all grid images of one piece of second image data; if so, taking that frame's local image as the center, judging whether the adjacent preceding and following frames of the first video data are identical to a second proportion of the grid images of the second image data; if so, judging whether the number of consecutive frames containing the local image data in the first video is greater than a first preset threshold; and if so, outputting the scene identification data corresponding to the local image data;
wherein synthesizing the first video data and the first interaction bubble into the second video data comprises the following steps:
if the local image data is completely consistent with the pre-stored first image data, configuring the first interaction bubble at the center of the scene identification data, and synthesizing the first video data and the first interaction bubble into the second video data;
if the local image data is not completely consistent, configuring the first interaction bubble at the center of the one frame of local image data of the scene identification data that is identical to all grid images of one piece of second image data, and synthesizing the first video data and the first interaction bubble into the second video data.
The invention provides a system for the map scene adaptation method, wherein the step of judging whether the number of consecutive frames containing the local image data in the first video is greater than the first preset threshold comprises:
if so, acquiring the rotation track distance R and the moving track distance S of the user over the preceding first time period, and outputting the dynamic speed V according to the formula V = (R + 3S) / t, where t is the duration of the first time period; judging whether the dynamic speed V exceeds a second preset threshold; if not, judging whether the number of consecutive frames containing the local image data in the first video is greater than the first preset threshold; if so, judging whether the number of frames containing the local image data within the first time period is greater than a third preset threshold, and if so, judging that the number of consecutive frames containing the local image data in the first video is greater than the first preset threshold.
The invention provides a system for the map scene adaptation method, wherein the area of the first interaction bubble is 1/3 of the area of the scene identification data.
The invention provides a system for the map scene adaptation method, wherein judging whether the position data matches the scene identification data comprises the following steps:
checking whether the distance between the user's specific position and the scene position coordinates of the scene identification data is below a preset position threshold; if not, judging that they do not match; if so, configuring a sector with a central angle of 90 degrees, taking the user's specific position coordinates as the center of the circle, the preset position threshold as the radius, and the user's orientation as the center line of the sector, and judging whether the scene position coordinates of the scene identification data lie within the sector; if so, judging that they match, and if not, judging that they do not match.
The invention provides a system for the map scene adaptation method, wherein the duration t of the first time period is less than or equal to the time length of the first video data.
The beneficial effects are as follows:
According to the method, local image data in the 3D scene is grabbed in this way and matched against the first image data pre-stored in the database. During matching, scene identification data that occupies a relatively large area and appears continuously is preferentially treated as the object to be identified, and the positions of different first interaction bubbles are adapted according to whether the images appearing in the first video are completely or incompletely displayed local image data. This prevents the first interaction bubbles from being placed where they would be excessively blurred or incompletely displayed, making them convenient for the user to view and operate.
Drawings
FIG. 1 is a flow chart of a map scene adaptation method and system according to the present invention;
FIG. 2 is a schematic diagram of one frame of first video data of a map scene adaptation method and system according to the present invention;
FIG. 3 is a schematic diagram of a first frame of second video data of a map scene adaptation method and system according to the present invention;
FIG. 4 is a schematic diagram of a map scene adaptation method and system according to the present invention for a second frame of second video data;
FIG. 5 is a schematic diagram of a third frame of second video data of a map scene adaptation method and system according to the present invention;
FIG. 6 is a diagram of counting consecutive frames in a map scene adaptation method and system according to the present invention;
FIG. 7 is a diagram of another case of counting consecutive frames in a map scene adaptation method and system according to the present invention;
FIG. 8 is a partial enlarged view of the left side of FIG. 3;
fig. 9 is a partial enlarged view of the right side of fig. 3.
Detailed Description
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the present invention is briefly described below with reference to the accompanying drawings and the description of the embodiments or the prior art. Obviously, the drawings described below show only some embodiments of the present invention, and a person skilled in the art can obtain other drawings from them without inventive effort. It should be noted that the description of these examples is intended to aid understanding of the present invention, not to limit it.
Examples:
As shown in FIGS. 1-7, and referring to FIGS. 1, 2 and 3, the present embodiment provides a map scene adaptation method, comprising:
Acquiring first video data;
identifying scene identification data in the first video data;
acquiring position data of a user, and judging whether the position data is matched with scene identification data or not; if so, generating a first interaction bubble;
synthesizing the first video data and the first interactive bubble into second video data;
wherein identifying scene identification data in the first video data comprises the following steps:
judging whether local image data of the first video data is completely consistent with pre-stored first image data; if so, converting the local image data into scene identification data; if not, selecting second image data from the pre-stored first image data using an image frame, wherein the area of the second image data is greater than or equal to a first proportion of the area of the first image data;
dividing the second image data into a plurality of grid images with a plurality of grids; judging whether local image data of one frame of the first video data is identical to all grid images of one piece of second image data; if so, taking that frame's local image as the center, judging whether the adjacent preceding and following frames of the first video data are identical to a second proportion of the grid images of the second image data; if so, judging whether the number of consecutive frames containing the local image data in the first video is greater than a first preset threshold; and if so, outputting the scene identification data corresponding to the local image data;
wherein synthesizing the first video data and the first interaction bubble into the second video data comprises the following steps:
if the local image data is completely consistent with the pre-stored first image data, configuring the first interaction bubble at the center of the scene identification data, and synthesizing the first video data and the first interaction bubble into the second video data;
if the local image data is not completely consistent, configuring the first interaction bubble at the center of the one frame of local image data of the scene identification data that is identical to all grid images of one piece of second image data, and synthesizing the first video data and the first interaction bubble into the second video data.
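For readers who find pseudocode easier to follow, the four steps above can be outlined as below. This is a minimal illustrative sketch rather than the patented implementation: all function names, types, and the placeholder matching logic are hypothetical assumptions, and the actual identification and synthesis steps are detailed in the remainder of this section.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class Scene:
    identifier: str                    # key of the pre-stored first image data
    coordinates: Tuple[float, float]   # scene position coordinates in the database

def identify_scenes(frames: List[str]) -> List[Scene]:
    # Placeholder for the grid-matching identification sketched later on.
    return [Scene("store_a", (30.0, 20.0))]

def position_matches(user_xy: Tuple[float, float], scene_xy: Tuple[float, float],
                     threshold: float = 50.0) -> bool:
    # Simplified distance-only check; the 90-degree sector variant appears below.
    dx, dy = scene_xy[0] - user_xy[0], scene_xy[1] - user_xy[1]
    return (dx * dx + dy * dy) ** 0.5 < threshold

def composite(frames: List[str], bubble: str, scene: Scene) -> List[str]:
    # Stand-in for overlaying the bubble at the scene's center in every frame.
    return [f"{frame}+{bubble}@{scene.identifier}" for frame in frames]

def adapt_map_scene(frames: List[str], user_xy: Tuple[float, float],
                    bubbles: Dict[str, str]) -> List[str]:
    second_video = frames
    for scene in identify_scenes(frames):                  # steps 1-2
        if position_matches(user_xy, scene.coordinates):   # step 3
            second_video = composite(second_video, bubbles[scene.identifier], scene)  # step 4
    return second_video

print(adapt_map_scene(["frame0", "frame1"], (0.0, 0.0), {"store_a": "coupon 50 off 100"}))
```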
According to the method, local image data in the 3D scene is grabbed in this way and matched against the first image data pre-stored in the database. During matching, scene identification data that occupies a relatively large area and appears continuously is preferentially treated as the object to be identified, and the positions of different first interaction bubbles are adapted according to whether the images appearing in the first video are completely or incompletely displayed local image data. This prevents the first interaction bubbles from being placed where they would be excessively blurred or incompletely displayed, making them convenient for the user to view and operate.
When the device is used, referring to FIGS. 2, 3, 4 and 5, a user can acquire and watch first video data through an AR device. For example, first interaction bubbles such as coupons are attached to the scene identification data of stores, so that the user can see at a glance the first interaction bubbles of all the scene identification data within the line of sight; this makes it easy to compare the coupon information of the stores in view, helping the user make a better choice.
When the local image data is completely consistent, configuring the first interaction bubble at the center of the scene identification data can be understood as follows: the area of the first interaction bubble is 1/10 to 1/2, preferably 1/3, of the area of the scene identification data. The "center" is the centroid of the figure; in the other frames, the first interaction bubble keeps the same positional relation to the center of that frame's local image data.
When the local image data is not completely consistent, configuring the first interaction bubble at the center of the one frame of local image data of the scene identification data that is identical to all grid images of one piece of second image data, and synthesizing the first video data and the first interaction bubble into the second video data, can be understood as follows:
if one frame among the consecutive frames of the first video data is determined to be identical to all grid images of the second image data, the first interaction bubble is placed at the center of that matched image, and in the other frames the first interaction bubble keeps the same positional relation to the center of that frame's local image data. In other words, synthesizing the first interaction bubble with the first video data can be understood as adding the image features of the first interaction bubble to the local image data, so that however the local image data moves relative to the first image data, the first interaction bubble moves with it.
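As a concrete illustration of anchoring the bubble at the centroid, the sketch below computes the centroid of a boolean mask marking the matched local image data. Representing the local image data as a mask and the function name are assumptions for illustration, not part of the patent text.

```python
import numpy as np

def bubble_anchor(local_mask: np.ndarray) -> tuple:
    """Centroid (row, col) of the matched local image data: the point where the
    first interaction bubble's center is configured. In other frames the bubble
    keeps this same relation to that frame's local image data."""
    rows, cols = np.nonzero(local_mask)
    return float(rows.mean()), float(cols.mean())

# A 4x4 mask whose set pixels mark the matched local image data.
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:4] = True
print(bubble_anchor(mask))  # (1.5, 2.0), the centroid of the set pixels
```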
Wherein dividing the second image data into a plurality of grid images with a plurality of grids can be understood as follows:
the grids are identical in size, and a grid endpoint is placed at the highest point of the second image data, or at the upper-left, upper-right, lower-left or lower-right corner of the first video data.
When the local image data is completely consistent with the pre-stored first image data, configuring the first interaction bubble at the center of the scene identification data means that the center of the first interaction bubble coincides with the center of the local image data; when the match is only partial, the first interaction bubble is aligned with the center of the one frame of local image data that completely matches the grid images.
That is, "at the center" means at the centroid of the image in question.
Dividing the second image data into a plurality of grid images with a plurality of grids can be understood as follows: the side length of each grid may be 1/100000 to 1/10, preferably 1/100, of the maximum height of the second image data. In other words, if the maximum height of the second image data is 10000 pixels, square grids with a side length of 100 pixels are formed, each framing one grid image.
Further, grids lying outside the second image data may be deleted from its full set of grid images. That is, the full grid set of the second image data should include only complete grids that lie within the outline of the second image data.
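Under the stated assumptions (a grid side of 1/100 of the maximum height, and only complete grids inside the outline retained), the grid division might be sketched as follows. Representing the second image data as a boolean mask and anchoring the grids at the top-left corner are simplifications of the anchoring options described above.

```python
import numpy as np

def divide_into_grids(mask: np.ndarray, fraction: float = 1 / 100):
    """Split the second image data (a boolean mask of its outline) into square
    grids with side = fraction * maximum height, keeping only complete grids
    that lie entirely within the outline."""
    side = max(1, int(mask.shape[0] * fraction))
    grids = []
    for top in range(0, mask.shape[0] - side + 1, side):
        for left in range(0, mask.shape[1] - side + 1, side):
            if mask[top:top + side, left:left + side].all():
                grids.append((top, left, side))
    return grids

# Example: a 200x300 mask whose filled rectangle stands for the image outline.
m = np.zeros((200, 300), dtype=bool)
m[20:180, 50:250] = True
print(len(divide_into_grids(m)))  # 8000 complete 2x2-pixel grids
```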
Acquiring the first video data can be understood as acquiring it through 3D glasses, which may be VR glasses or AR glasses.
Acquiring the user's position data can be understood as obtaining it through a GPS module or from the position of the nearest connected base station. Alternatively, the user's position data can be obtained by positioning methods similar to those of navigation software. The user's position data may include the user's specific position coordinates, or both the specific position coordinates and the user's orientation.
The database also pre-stores scene position coordinates corresponding to the scene identification data.
In other words, judging whether the position data matches the scene identification data can be understood as:
checking whether the distance between the user's specific position and the scene position coordinates of the scene identification data is below a preset position threshold; if so, judging that they match, and if not, judging that they do not match.
As a variant, judging whether the position data matches the scene identification data can be understood as:
checking whether the distance between the user's specific position and the scene position coordinates of the scene identification data is below a preset position threshold; if not, judging that they do not match; if so, configuring a sector with a central angle of 90 degrees, taking the user's specific position coordinates as the center of the circle, the preset position threshold as the radius, and the user's orientation as the center line of the sector, and judging whether the scene position coordinates of the scene identification data lie within the sector; if so, judging that they match, and if not, judging that they do not match.
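A minimal sketch of this sector test follows, assuming 2D coordinates in meters and an orientation given as an angle in degrees; all names and the coordinate convention are illustrative assumptions.

```python
import math

def position_matches(user_xy, heading_deg, scene_xy,
                     threshold=50.0, half_angle=45.0):
    """True if the scene position lies within the preset position threshold of
    the user AND inside a 90-degree sector centered on the user's orientation."""
    dx, dy = scene_xy[0] - user_xy[0], scene_xy[1] - user_xy[1]
    if math.hypot(dx, dy) >= threshold:
        return False  # farther away than the preset position threshold
    bearing = math.degrees(math.atan2(dy, dx))
    # Smallest signed difference between bearing and heading, in (-180, 180]
    diff = (bearing - heading_deg + 180.0) % 360.0 - 180.0
    return abs(diff) <= half_angle

# A scene 30 m away, 20 degrees off the user's heading: inside the sector.
scene = (30 * math.cos(math.radians(50)), 30 * math.sin(math.radians(50)))
print(position_matches((0.0, 0.0), 30.0, scene))  # True
```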
Wherein, if matched, generating the first interaction bubble can be understood as follows:
the database pre-stores first interaction bubbles corresponding to the scene identification data;
the first interaction bubble corresponding to the scene identification data is output according to the scene identification data.
The first interaction bubble can be understood as a text bubble like those in WeChat chats, or another type of interaction bubble. The first interaction bubble comprises text information and a background plate corresponding to the text information, and may be understood as an image arranged on the local image data.
Wherein synthesizing the first video data and the first interaction bubble into the second video data can be understood as:
configuring the first interaction bubble on the scene identification data of the first video data and synthesizing them into the second video data.
Wherein judging whether the local image data of the first video data is identical to the pre-stored first image data can be understood as:
configuring each closed image in the first video as local image data, where a closed image may be one closed by its own outline, or by its own outline together with an edge of the first video frame;
comparing the local image data closed by its own outline with the pre-stored first image data; if they are completely consistent, judging that they are consistent, and otherwise judging that they are inconsistent.
Wherein, if consistent, converting it into scene identification data can be understood as: converting the local image data closed by its own outline into scene identification data.
If inconsistent, selecting second image data from the pre-stored first image data using an image frame can be understood as follows:
the image frame may be of any size and may be a rectangle of any proportion, preferably a 16:9 rectangle, and the first image may likewise be of any size. In other words, the invention first uses a 16:9 frame to frame a portion of the pre-stored first image, and then compares the result as second image data, so as to find more accurately whether each frame's local image in the first video is the same as the second image data.
The first proportion may be 1% to 99%, preferably 25%. That is, the area of the second image data is at least 25% of the area of the first image data. Further, the areas here can be understood as the displayed areas of the first image data and the second image data within the same first video data. If the first proportion is too low, many misjudgments tend to occur, and the first interaction bubble becomes difficult to display promptly.
Wherein the partial image data may be understood as a part of one of the frames of the first video data.
Wherein, taking one frame's local image as the center, judging whether the adjacent preceding and following frames of the first video data are identical to a second proportion of the grid images of the second image data, and if so, judging whether the number of consecutive frames containing the local image data in the first video is greater than the first preset threshold, can be understood as follows:
the frame of first video data whose local image data completely matches the second image data is captured as the reference frame, and the percentage of grid images of the second image data matched by each adjacent frame before and after it is computed. Scanning backward from the reference frame, the start frame is the frame immediately after the first frame found to match less than 75% of the grid images; scanning forward, the end frame is the frame immediately before the first frame found to match less than 75%. The number of frames containing the local image data from the start frame to the end frame is then compared with the first preset threshold; if it is greater, the local image data of the first video data from the start frame to the end frame is scene identification data.
The first preset threshold may be 10 to positive infinity, preferably 120 frames. In other words, only matching local image data appearing in 120 consecutive frames is identified as scene identification data. Since ordinary video runs at 60 to 120 frames per second, this means local image data appearing continuously for about 1 to 2 seconds can be identified as scene identification data.
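The reference-frame logic above can be sketched as follows, given a list of per-frame ratios of matched grid images (1.0 for a complete match). The function name, the ratio representation, and the strict "greater than" comparison are illustrative assumptions.

```python
def passes_frame_threshold(match_ratios, first_threshold=120,
                           second_proportion=0.75):
    """Find a reference frame with a complete match (ratio 1.0), extend the run
    backward and forward while adjacent frames match at least the second
    proportion (75%) of grid images, and compare the run length with the first
    preset threshold (e.g. 120 frames)."""
    for ref, ratio in enumerate(match_ratios):
        if ratio < 1.0:
            continue
        start = ref
        while start > 0 and match_ratios[start - 1] >= second_proportion:
            start -= 1
        end = ref
        while end + 1 < len(match_ratios) and match_ratios[end + 1] >= second_proportion:
            end += 1
        if end - start + 1 > first_threshold:
            return True
    return False

# 130 consecutive frames at >= 75% around one fully matched frame: passes.
ratios = [0.5] * 10 + [0.8] * 60 + [1.0] + [0.8] * 69 + [0.5] * 10
print(passes_frame_threshold(ratios))  # True
```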
Preferably, the upper time limit of each piece of first video data may be 2 seconds to positive infinity, preferably 10 seconds. For example, with each piece of first video data taken as a 5-second section, the method refreshes every 5 seconds, which both ensures the responsiveness of the first interaction bubble and prevents it from moving or jumping so frequently that it degrades the user experience and causes visual fatigue.
It should be noted that, within such a 5-second section of first video, for the case defined above as not completely consistent, the relative position of the first interaction bubble and the scene identification data remains unchanged after the first video data and the first interaction bubble are synthesized.
The second proportion may be 1% to 99%, preferably 75%. In other words, the consecutive frames may show some dynamic change relative to the second image data: a piece of scene identification data at the edge of the first video, although still in the first video data, moves and shakes with the person, so some offset and error may appear in the first video data. Therefore only one frame needs to be identical to all of the grids, while the others need only be identical to 75% of the grids for the sequence to be judged continuously consistent.
Synthesizing the first video data and the first interaction bubble into the second video data: this part determines the position of the first interaction bubble on the scene identification data, ensures that it does not block key information, and bounds its size with a threshold.
The method and the device retain existing items such as coupons, activities, and "i station" as first interaction bubbles and blend them into the 3D scene.
In particular, referring to FIGS. 2, 3 and 4, it should be noted that the different pieces of local image data in FIGS. 2, 3 and 4 are only given minimal distinguishing treatment in these small figures; in real images the differences between pieces of local image data are greater, and the first image data and the second image data are far less likely to coincide with all or part of the grid images of unrelated local image data. That is, the local image data in FIGS. 2, 3 and 4 are drawn this way only to help those skilled in the art understand the spirit of the invention.
As shown in FIGS. 2, 3 and 4, and referring to FIG. 3, "full 100 minus 50" may be a first interaction bubble; each store may be a specifically identified piece of local image data, and each store also counts as scene identification data. For a store cut off at the edge of the frame, the bubble is configured at the centroid of the portion visible in the current image.
Wherein, referring to FIGS. 6 and 7, the step of judging whether the number of consecutive frames containing the local image data in the first video is greater than the first preset threshold comprises:
if so, acquiring the rotation track distance R and the moving track distance S of the user over the preceding first time period, and outputting the dynamic speed V according to the formula V = (R + 3S) / t, where t is the duration of the first time period; judging whether the dynamic speed V exceeds a second preset threshold; if not, judging whether the number of consecutive frames containing the local image data in the first video is greater than the first preset threshold; if so, judging whether the number of frames containing the local image data within the first time period is greater than a third preset threshold, and if so, judging that the number of consecutive frames containing the local image data in the first video is greater than the first preset threshold.
By grabbing the user's preceding rotation track distance R and moving track distance S to obtain the dynamic speed V, the invention can tell whether the scene captured in the first video data is changing too quickly. If it is, the tolerance for the number of consecutive frames is increased, so that the first interaction bubble can be combined with the scene identification data more responsively and the user can see the scene identification data carrying the first interaction bubble as early, and from as far away, as possible.
The third preset threshold may be 60% or more, preferably 80%, of the total number of frames in the first time period. That is, if the dynamic speed V exceeds the second preset threshold, it is judged whether the number of frames containing the local image data in the first time period is greater than 80% of the total number of frames in the period; if so, the frame count is judged to be greater than the first preset threshold.
The duration t of the first time period is less than or equal to the length of the first video data.
For example, if the duration of the first time period is 4 seconds and the length of each piece of first video data is 5 seconds, the rotation track distance R may represent that the user's orientation rotated 60 degrees in total within those 4 seconds, and the moving track distance S may represent that the trajectory of the user's position data totaled 6 meters. By the formula V = (R + 3S) / t, the dynamic speed V is (60 + 3 × 6) / 4 = 19.5. If the second preset threshold is 10, the dynamic speed V exceeds it, which shows that the user moved quickly during this period and the captured images of the first video data changed greatly; the judgment of the number of consecutive frames is therefore relaxed, ensuring higher responsiveness of the output first interaction bubble.
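The formula and the worked example translate directly into code; the small sketch below uses hypothetical helper names, and the relaxed-check rule encodes the decision described in this paragraph.

```python
def dynamic_speed(rotation_deg: float, movement_m: float, t_seconds: float) -> float:
    """V = (R + 3S) / t, matching the worked example (60 + 3*6) / 4 = 19.5."""
    return (rotation_deg + 3.0 * movement_m) / t_seconds

def frame_check_is_relaxed(v: float, second_threshold: float = 10.0) -> bool:
    """When V exceeds the second preset threshold, the strict consecutive-frame
    test is replaced by the looser test against the third preset threshold
    (e.g. 80% of the frames in the first time period)."""
    return v > second_threshold

v = dynamic_speed(60.0, 6.0, 4.0)
print(v, frame_check_is_relaxed(v))  # 19.5 True
```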
Wherein, referring to FIG. 3, the area of the first interaction bubble is 1/3 of the area of the scene identification data.
The "center" is the centroid of the figure; in the other frames, the first interaction bubble keeps the same positional relation to the center of that frame's local image data.
Wherein, referring to FIG. 3, judging whether the position data matches the scene identification data comprises the following steps:
checking whether the distance between the user's specific position and the scene position coordinates of the scene identification data is below a preset position threshold; if not, judging that they do not match; if so, configuring a sector with a central angle of 90 degrees, taking the user's specific position coordinates as the center of the circle, the preset position threshold as the radius, and the user's orientation as the center line of the sector, and judging whether the scene position coordinates of the scene identification data lie within the sector; if so, judging that they match, and if not, judging that they do not match.
The duration t of the first time period is less than or equal to the time length of the first video data.
As shown in FIGS. 1-5, and referring to FIGS. 1, 2 and 3, the invention provides a system for the map scene adaptation method, comprising:
A video acquisition module for acquiring first video data;
a video identification module for identifying scene identification data in the first video data;
the position filtering module is used for acquiring position data of a user and judging whether the position data is matched with scene identification data or not; if so, generating a first interaction bubble;
a video synthesis module for synthesizing the first video data and the first interactive bubble into second video data;
wherein identifying scene identification data in the first video data comprises the following steps:
judging whether local image data of the first video data is completely consistent with pre-stored first image data; if so, converting the local image data into scene identification data; if not, selecting second image data from the pre-stored first image data using an image frame, wherein the area of the second image data is greater than or equal to a first proportion of the area of the first image data;
dividing the second image data into a plurality of grid images with a plurality of grids; judging whether local image data of one frame of the first video data is identical to all grid images of one piece of second image data; if so, taking that frame's local image as the center, judging whether the adjacent preceding and following frames of the first video data are identical to a second proportion of the grid images of the second image data; if so, judging whether the number of consecutive frames containing the local image data in the first video is greater than a first preset threshold; and if so, outputting the scene identification data corresponding to the local image data;
wherein synthesizing the first video data and the first interaction bubble into the second video data comprises the following steps:
if the local image data is completely consistent with the pre-stored first image data, configuring the first interaction bubble at the center of the scene identification data, and synthesizing the first video data and the first interaction bubble into the second video data;
if the local image data is not completely consistent, configuring the first interaction bubble at the center of the one frame of local image data of the scene identification data that is identical to all grid images of one piece of second image data, and synthesizing the first video data and the first interaction bubble into the second video data.
According to the method, local image data in the 3D scene is grabbed in this way and matched against the first image data pre-stored in the database. During matching, scene identification data that occupies a relatively large area and appears continuously is preferentially treated as the object to be identified, and the positions of different first interaction bubbles are adapted according to whether the images appearing in the first video are completely or incompletely displayed local image data. This prevents the first interaction bubbles from being placed where they would be excessively blurred or incompletely displayed, making them convenient for the user to view and operate.
Referring to FIGS. 6 and 7, the invention provides a system for the map scene adaptation method, wherein the step of judging whether the number of consecutive frames containing the local image data in the first video is greater than the first preset threshold comprises:
if so, acquiring the rotation track distance R and the moving track distance S of the user over the preceding first time period, and outputting the dynamic speed V according to the formula V = (R + 3S) / t, where t is the duration of the first time period; judging whether the dynamic speed V exceeds a second preset threshold; if not, judging whether the number of consecutive frames containing the local image data in the first video is greater than the first preset threshold; if so, judging whether the number of frames containing the local image data within the first time period is greater than a third preset threshold, and if so, judging that the number of consecutive frames containing the local image data in the first video is greater than the first preset threshold.
By grabbing the user's preceding rotation track distance R and moving track distance S to obtain the dynamic speed V, the invention can tell whether the scene captured in the first video data is changing too quickly. If it is, the tolerance for the number of consecutive frames is increased, so that the first interaction bubble can be combined with the scene identification data more responsively and the user can see the scene identification data carrying the first interaction bubble as early, and from as far away, as possible.
Referring to FIGS. 2 and 3, the area of the first interaction bubble is 1/3 of the area of the scene identification data.
The "center" is the centroid of the figure; in the other frames, the first interaction bubble keeps the same positional relation to the center of that frame's local image data.
Wherein, referring to FIGS. 2 and 3, judging whether the position data matches the scene identification data comprises the following steps:
checking whether the distance between the user's specific position and the scene position coordinates of the scene identification data is below a preset position threshold; if not, judging that they do not match; if so, configuring a sector with a central angle of 90 degrees, taking the user's specific position coordinates as the center of the circle, the preset position threshold as the radius, and the user's orientation as the center line of the sector, and judging whether the scene position coordinates of the scene identification data lie within the sector; if so, judging that they match, and if not, judging that they do not match.
The duration t of the first time period is less than or equal to the time length of the first video data.
Finally, it should be noted that: the foregoing description is only of the preferred embodiments of the invention and is not intended to limit the scope of the invention. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A map scene adaptation method, characterized by comprising the following steps:
Acquiring first video data;
identifying scene identification data in the first video data;
acquiring position data of a user, and judging whether the position data is matched with scene identification data or not; if so, generating a first interaction bubble;
synthesizing the first video data and the first interactive bubble into second video data;
wherein identifying scene identification data in the first video data comprises the steps of
Judging whether the local image data of the first video data are completely consistent with the pre-stored first image data, if so, converting the local image data into scene identification data, and if not, selecting second image data by utilizing an image frame from the pre-stored first image data, wherein the area of the second image data is larger than or equal to the first proportion of the area of the first image data;
dividing the second image data into a plurality of grid images by a plurality of grids, judging whether the local image data of one frame of the first video data is identical to all grid images of one second image data, if so, judging whether adjacent frames of the first video data before and after the local image data are identical to grid images of a second proportion of the second image data by taking the local image of one frame as the center, if so, judging whether the number of continuous frames of the local image data in the first video is larger than a first preset threshold value, and if so, outputting scene identification data corresponding to the local data;
wherein synthesizing the first video data and the first interaction bubble into the second video data comprises the following steps:
judging whether the local image data of one frame of the first video data is identical to all grid images of one piece of second image data; if so, configuring the first interaction bubble at the center of the scene identification data, and synthesizing the first video data and the first interaction bubble into the second video data;
judging whether the local image data of the first video data is completely consistent with the pre-stored first image data; if not, configuring the first interaction bubble at the center of the one frame of local image data of the scene identification data that is identical to all grid images of one piece of second image data, and synthesizing the first video data and the first interaction bubble into the second video data.
2. The map scene adaptation method according to claim 1, wherein the step of judging whether the number of consecutive frames containing the local image data in the first video is greater than the first preset threshold comprises:
if so, acquiring the rotation track distance R and the moving track distance S of the user over the preceding first time period, and outputting the dynamic speed V according to the formula V = (R + 3S) / t, where t is the duration of the first time period; judging whether the dynamic speed V exceeds a second preset threshold; if not, judging whether the number of consecutive frames containing the local image data in the first video is greater than the first preset threshold; if so, judging whether the number of frames containing the local image data within the first time period is greater than a third preset threshold, and if so, judging that the number of consecutive frames containing the local image data in the first video is greater than the first preset threshold.
3. The map scene adaptation method according to claim 1, wherein the area of the first interaction bubble is 1/3 of the area of the scene identification data.
4. The map scene adaptation method according to claim 1, wherein judging whether the position data matches the scene identification data comprises the following steps:
checking whether the distance between the user's specific position and the scene position coordinates of the scene identification data is below a preset position threshold; if not, judging that they do not match; if so, configuring a sector with a central angle of 90 degrees, taking the user's specific position coordinates as the center of the circle, the preset position threshold as the radius, and the user's orientation as the center line of the sector, and judging whether the scene position coordinates of the scene identification data lie within the sector; if so, judging that they match, and if not, judging that they do not match.
5. A map scene adaptation method according to claim 2, wherein the duration t of the first time period is less than or equal to the length of time of the first video data.
6. A system for a map scene adaptation method, comprising:
A video acquisition module for acquiring first video data;
a video identification module for identifying scene identification data in the first video data;
the position filtering module is used for acquiring position data of a user and judging whether the position data is matched with scene identification data or not; if so, generating a first interaction bubble;
a video synthesis module for synthesizing the first video data and the first interactive bubble into second video data;
wherein identifying scene identification data in the first video data comprises the following steps:
judging whether local image data of the first video data is completely consistent with pre-stored first image data; if so, converting the local image data into scene identification data; if not, selecting second image data from the pre-stored first image data using an image frame, wherein the area of the second image data is greater than or equal to a first proportion of the area of the first image data;
dividing the second image data into a plurality of grid images with a plurality of grids; judging whether local image data of one frame of the first video data is identical to all grid images of one piece of second image data; if so, taking that frame's local image as the center, judging whether the adjacent preceding and following frames of the first video data are identical to a second proportion of the grid images of the second image data; if so, judging whether the number of consecutive frames containing the local image data in the first video is greater than a first preset threshold; and if so, outputting the scene identification data corresponding to the local image data;
wherein synthesizing the first video data and the first interaction bubble into the second video data comprises the following steps:
judging whether the local image data of one frame of the first video data is identical to all grid images of one piece of second image data; if so, configuring the first interaction bubble at the center of the scene identification data, and synthesizing the first video data and the first interaction bubble into the second video data;
judging whether the local image data of the first video data is completely consistent with the pre-stored first image data; if not, configuring the first interaction bubble at the center of the one frame of local image data of the scene identification data that is identical to all grid images of one piece of second image data, and synthesizing the first video data and the first interaction bubble into the second video data.
7. The system of claim 6, wherein the step of judging whether the number of consecutive frames containing the local image data in the first video is greater than the first preset threshold comprises:
if so, acquiring the rotation track distance R and the moving track distance S of the user over the preceding first time period, and outputting the dynamic speed V according to the formula V = (R + 3S) / t, where t is the duration of the first time period; judging whether the dynamic speed V exceeds a second preset threshold; if not, judging whether the number of consecutive frames containing the local image data in the first video is greater than the first preset threshold; if so, judging whether the number of frames containing the local image data within the first time period is greater than a third preset threshold, and if so, judging that the number of consecutive frames containing the local image data in the first video is greater than the first preset threshold.
8. The system of the map scene adaptation method according to claim 6, wherein the area of the first interaction bubble is 1/3 of the area of the scene identification data.
9. The system of the map scene adaptation method according to claim 6, wherein judging whether the position data matches the scene identification data comprises the following steps:
checking whether the distance between the user's specific position and the scene position coordinates of the scene identification data is below a preset position threshold; if not, judging that they do not match; if so, configuring a sector with a central angle of 90 degrees, taking the user's specific position coordinates as the center of the circle, the preset position threshold as the radius, and the user's orientation as the center line of the sector, and judging whether the scene position coordinates of the scene identification data lie within the sector; if so, judging that they match, and if not, judging that they do not match.
10. The system of a map scene adaptation method according to claim 7, wherein the duration t of the first period of time is less than or equal to the length of time of the first video data.
CN202311345362.XA 2023-10-18 2023-10-18 Map scene adaptation method and system Active CN117079169B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311345362.XA CN117079169B (en) 2023-10-18 2023-10-18 Map scene adaptation method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311345362.XA CN117079169B (en) 2023-10-18 2023-10-18 Map scene adaptation method and system

Publications (2)

Publication Number Publication Date
CN117079169A CN117079169A (en) 2023-11-17
CN117079169B (en) 2023-12-22

Family

ID=88713857

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311345362.XA Active CN117079169B (en) 2023-10-18 2023-10-18 Map scene adaptation method and system

Country Status (1)

Country Link
CN (1) CN117079169B (en)


Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7523411B2 (en) * 2000-08-22 2009-04-21 Bruce Carlin Network-linked interactive three-dimensional composition and display of saleable objects in situ in viewer-selected scenes for purposes of object promotion and procurement, and generation of object advertisements

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110019600A (en) * 2017-10-13 2019-07-16 腾讯科技(深圳)有限公司 A kind of maps processing method, apparatus and storage medium
CN110868639A (en) * 2019-11-28 2020-03-06 北京达佳互联信息技术有限公司 Video synthesis method and device
WO2022052481A1 (en) * 2020-09-08 2022-03-17 平安科技(深圳)有限公司 Artificial intelligence-based vr interaction method, apparatus, computer device, and medium
CN112330819A (en) * 2020-11-04 2021-02-05 腾讯科技(深圳)有限公司 Interaction method and device based on virtual article and storage medium

Also Published As

Publication number Publication date
CN117079169A (en) 2023-11-17


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant