CN111325665A - Video lightweight embedding method based on a network panorama - Google Patents

Video lightweight embedding method based on a network panorama

Info

Publication number
CN111325665A
Authority
CN
China
Prior art keywords
video
moving target
network
target sequence
panorama
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010262762.4A
Other languages
Chinese (zh)
Inventor
Yu Dafei (余大飞)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tongchuanglantian Investment Management Beijing Co ltd
Original Assignee
Tongchuanglantian Investment Management Beijing Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tongchuanglantian Investment Management Beijing Co ltd
Priority to CN202010262762.4A
Publication of CN111325665A
Legal status: Pending


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 - Geometric image transformations in the plane of the image
    • G06T3/14 - Transformations for image registration, e.g. adjusting or mapping for alignment of images
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 - Image enhancement or restoration
    • G06T5/50 - Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/10 - Image acquisition modality
    • G06T2207/10016 - Video; Image sequence
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20212 - Image combination
    • G06T2207/20221 - Image fusion; Image merging

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The invention discloses a lightweight video embedding method based on a network panorama, comprising the following steps: selecting a moving target in a video, separating the moving target from the video frame by frame to form a moving target sequence, storing the position of each pixel of the moving target in the video, and performing interpolation or deletion operations on the pixels of adjacent moving targets in the sequence; setting an embedding position for the moving target sequence in the network panorama, and filtering each moving target in the sequence together with the background at the embedding position so that the pixel data of the moving target and the background become smooth; determining the embedding position by position matching or registration between the video and the network panorama; and dynamically displaying the moving target sequence at the embedding position while the network panorama is played. By integrating the moving target extracted from the video with the panorama, the technical scheme reduces the network bandwidth needed for content transmission and expands the network panorama's capability to display dynamic content.

Description

Video lightweight embedding method based on a network panorama
Technical Field
The invention relates to the field of network panoramas, and in particular to a lightweight video embedding method based on a network panorama.
Background
With the wide adoption of the HTML5 standard on the internet, network panoramas no longer depend on plug-ins such as Flash or Silverlight for distribution; users interact with server-side panoramas directly through HTML5-capable clients. Thanks to their strong interactivity, high resolution and small footprint, network panoramas are widely used in virtual tourism, virtual shopping, digital exhibition halls and geographic navigation. In engineering practice, however, a panorama is a spatial composite of physical scenes, and its ability to present time-varying video content is poor. In the prior art, a video is embedded into the panorama as an independent module of the network panorama. On the one hand, the video is large, occupies more network bandwidth and consumes more system resources; on the other hand, the video content and the panorama content are not fused, which limits how the content can be presented. In particular, when the video and the panorama depict the same scene, the two cannot display their content against a shared background, and the user experience is poor.
In view of this, an object of the present invention is to provide a method for embedding a video into a network panorama in a lightweight manner: the moving target is extracted from the video and then embedded into the network panorama, alleviating the problems in the prior art.
Disclosure of Invention
A lightweight video embedding method based on a network panorama comprises the following steps: selecting a moving target in the video, separating the moving target from the video frame by frame to form a moving target sequence, and storing the position of each pixel of the moving target in the video; performing interpolation or deletion operations on the pixels of adjacent moving targets in the moving target sequence; setting an embedding position for the moving target sequence in the network panorama, and filtering each moving target in the sequence together with the background at the embedding position so that their pixel data become smooth; determining the embedding position by position matching or registration between the video and the network panorama; and dynamically displaying the moving target sequence at the embedding position while the network panorama is played.
Further, the method of selecting the moving target in the video comprises at least one of: detecting the moving target in each frame of the video according to its time-domain characteristics; and detecting the outline of the moving target in the video according to its optical flow characteristics.
Further, the position matching between the video and the network panorama comprises the following steps: selecting any moving target from the moving target sequence, setting at least one first mark point on the moving target, setting the same number of second mark points on the network panorama, and mapping the figure formed by the first mark points onto the figure formed by the second mark points through at least one of the graphic operations of rotation, translation, scaling and affine transformation; and applying the same graphic operations to the moving target sequence to map it to the embedding position in the network panorama.
Further, the registration of the video with the network panorama comprises: if the network panorama and the video correspond to the same physical scene, registering frame images of the video with the network panorama to obtain a transformation matrix from the video to the network panorama; and mapping the moving target sequence to the embedding position in the network panorama through the transformation matrix.
Further, before being embedded into the network panorama, the moving target sequence undergoes at least one of the following processes: adjusting the brightness of the moving target sequence to be consistent with the network panorama; and applying brightness degradation to the moving targets according to the light source configuration of the network panorama.
Further, dynamically displaying the moving target sequence at the embedding position comprises at least one of the following: if the display area of the network panorama contains the embedding position of the moving target sequence, dynamically displaying the moving target sequence at the embedding position; and if the network panorama receives a trigger from at least one of a mouse, a keyboard, voice, video and an image, dynamically displaying the moving target sequence at the embedding position.
Further, if the video contains sound, the sound is played in sync with the start time of the moving target sequence when the sequence is dynamically displayed.
Further, if the network panorama contains more than two moving target sequences, one of them is dynamically displayed; or several moving target sequences are dynamically displayed, with control over their playback speed or sound.
Further, the dynamic display progress of the moving target sequence is stored for display progress control.
The invention has the following beneficial effects: the technical scheme realizes lightweight embedding of a video into a network panorama; the moving target and the panorama are integrated into one whole, reducing the network bandwidth required for content transmission. In addition, content fusion between heterogeneous media is achieved, expanding the network panorama's capability to display dynamic content.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
In order to more clearly illustrate the embodiments of the present invention and the technical solutions in the prior art, the drawings used in their description are briefly introduced below. The drawings described below show one embodiment of the present invention, and those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic flow chart of a network panorama-based video lightweight embedding method according to an embodiment of the present invention.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, embodiments of the present invention.
Fig. 1 is a flowchart of a video lightweight embedding method based on a network panorama according to an embodiment of the present invention. As shown in Fig. 1, the method includes the following four steps.
Step S101: the moving target sequence is separated from the video. Specifically, a moving target is selected in the video and separated from the video frame by frame to form a moving target sequence, with the position of each pixel of the moving target in the video stored; interpolation or deletion operations are then performed on the pixels of adjacent moving targets in the sequence.
In an alternative embodiment, the moving target in each frame of the video is detected according to its time-domain characteristics. Specifically, since the position of a moving target in a video frame changes over time, all moving targets in the video are obtained from the difference between a background frame and the current frame, or from the difference between two adjacent frames.
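Purely as an illustrative sketch (not part of the patent text), the adjacent-frame differencing described above could look as follows in Python with OpenCV; the threshold value and the closing kernel size are assumptions for demonstration:

```python
import cv2
import numpy as np

def detect_moving_target_masks(video_path, diff_threshold=25):
    """Per-frame moving-target masks via adjacent-frame differencing.

    A minimal sketch of detection by time-domain characteristics; the
    threshold and 5x5 closing kernel are illustrative assumptions.
    """
    cap = cv2.VideoCapture(video_path)
    ok, prev = cap.read()
    if not ok:
        raise ValueError("cannot read video: %s" % video_path)
    prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
    masks = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # Pixels that change between adjacent frames belong to moving targets.
        diff = cv2.absdiff(gray, prev_gray)
        _, mask = cv2.threshold(diff, diff_threshold, 255, cv2.THRESH_BINARY)
        # Close small holes so each moving target forms a connected region.
        mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, np.ones((5, 5), np.uint8))
        masks.append(mask)
        prev_gray = gray
    cap.release()
    return masks
```

Differencing against a fixed background frame instead of the previous frame works the same way, with `prev_gray` held constant.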
In another alternative embodiment, the contours of moving targets in the video are detected according to their optical flow characteristics. Specifically, the parameters of the video's optical flow field are estimated, the pixel regions that conform to a motion model are solved, and these regions are merged to form the moving target.
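Again as a non-authoritative sketch, the optical-flow variant might use Farneback dense flow; the magnitude threshold and flow parameters below are assumptions, not values from the patent:

```python
import cv2
import numpy as np

def moving_target_contours(prev_gray, gray, mag_threshold=1.0):
    """Keep pixels whose estimated motion magnitude fits a moving region,
    merge them, and trace the resulting moving-target contours."""
    flow = cv2.calcOpticalFlowFarneback(
        prev_gray, gray, None,
        pyr_scale=0.5, levels=3, winsize=15,
        iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
    mag, _ = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    mask = (mag > mag_threshold).astype(np.uint8) * 255
    # Remove isolated pixels, then merge the motion-consistent regions.
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((3, 3), np.uint8))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    return contours, mask
```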
In addition, even when a moving target is separated according to temporal or optical flow characteristics, edge pixels may be missing or redundant pixels may remain beyond the edge; interpolation or deletion must therefore be applied to the moving target's pixels. In engineering practice, the video frame rate typically exceeds 25 frames per second, so pixel differences between adjacent moving targets are small. When the pixel difference between two adjacent moving targets exceeds a first threshold, at least one moving target nearest to the pair is introduced to form a subsequence, and the edge pixels of each moving target are interpolated or deleted according to the mean or median pixel count of the moving targets in the subsequence.
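One plausible reading of this subsequence step, sketched with morphological grow/shrink as a stand-in for per-pixel edge editing; the tolerance, iteration cap and window size are assumptions:

```python
import cv2
import numpy as np

def regularize_edge_pixels(masks, window=2, tolerance=0.1, max_steps=5):
    """Nudge each moving-target mask toward the median pixel count of its
    neighbouring sub-sequence: dilate to interpolate missing edge pixels,
    erode to delete redundant ones."""
    counts = [int(np.count_nonzero(m)) for m in masks]
    kernel = np.ones((3, 3), np.uint8)
    out = []
    for i, mask in enumerate(masks):
        lo, hi = max(0, i - window), min(len(masks), i + window + 1)
        target = int(np.median(counts[lo:hi]))  # sub-sequence median count
        m = mask.copy()
        for _ in range(max_steps):
            n = int(np.count_nonzero(m))
            if target == 0 or abs(n - target) <= tolerance * target:
                break
            m = cv2.dilate(m, kernel) if n < target else cv2.erode(m, kernel)
        out.append(m)
    return out
```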
Step S102: the embedding position of the moving target sequence is determined. Specifically, an embedding position for the moving target sequence is set in the network panorama, and the embedding position is determined by position matching or registration between the video and the network panorama.
In an optional embodiment, any moving target 1 is selected from the moving target sequence, at least one first mark point 2 is set on the moving target 1, and the same number of second mark points 3 are set on the network panorama; the figure formed by the first mark points 2 is mapped onto the figure formed by the second mark points 3 through at least one of the graphic operations of rotation, translation, scaling and affine transformation, and the same graphic operations are applied to the moving target sequence to map it to the embedding position in the network panorama.
It should be noted that the moving targets are images, and because the position of each pixel of a moving target in the video is stored when it is separated, the relative positions and motion between moving targets are known. By setting at least one first mark point 2 on any moving target 1 in the sequence, the positions of the other moving targets relative to the first mark point 2 can be obtained. The same number of second mark points 3 are set on the network panorama, and the figure formed by the first mark points 2 is mapped onto the figure formed by the second mark points 3; the mapping is a transformation between figures, concretely rotation, translation, scaling and affine transformation. Applying the same graphic operations to the moving target sequence yields the embedding position matched to the network panorama.
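As a sketch of this mark-point mapping (function names and the panorama size are hypothetical), an affine transform can be fitted to the point pairs and then applied to every image of the sequence:

```python
import cv2
import numpy as np

def map_sequence_by_mark_points(first_marks, second_marks, sequence):
    """Fit the transform that carries the figure of first mark points onto
    the figure of second mark points, then apply the same graphic
    operation to each image of the moving target sequence.
    Needs at least 3 point pairs for a full affine fit."""
    src = np.asarray(first_marks, dtype=np.float32)
    dst = np.asarray(second_marks, dtype=np.float32)
    # One 2x3 matrix covers rotation, translation, scaling and shear.
    matrix, _ = cv2.estimateAffine2D(src, dst)
    pano_w, pano_h = 4096, 2048  # hypothetical panorama dimensions
    return [cv2.warpAffine(img, matrix, (pano_w, pano_h)) for img in sequence]
```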
In another optional embodiment, if the network panorama and the video correspond to the same physical scene, frame images of the video are registered with the network panorama to obtain a transformation matrix from the video to the panorama, and the moving target sequence is mapped to the embedding position through this transformation matrix. It should be noted that, because the position of each pixel of the moving target in the video is stored, the moving target can be mapped into the network panorama according to the transformation matrix obtained from the registration, thereby realizing the embedding.
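A feature-based registration sketch under the same-scene assumption; ORB matching with RANSAC is one common choice, not necessarily the patent's:

```python
import cv2
import numpy as np

def register_frame_to_panorama(frame, panorama):
    """Register one video frame against the network panorama and return
    the 3x3 matrix used to map stored moving-target pixel positions."""
    orb = cv2.ORB_create(2000)
    kp1, des1 = orb.detectAndCompute(frame, None)
    kp2, des2 = orb.detectAndCompute(panorama, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)[:200]
    src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    # RANSAC discards matches inconsistent with the dominant mapping.
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    return H  # apply with cv2.perspectiveTransform on stored pixel positions
```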
Step S103: the moving target sequence is processed. Specifically, each moving target in the sequence is filtered together with the background at the embedding position in the network panorama so that the pixel data of the moving target and the background become smooth.
It should be noted that, for the moving target sequence to give a good visual experience after being embedded into the network panorama, each moving target must be filtered together with the background at the embedding position. Suitable filters include mean filtering, median filtering, Gaussian filtering, box filtering, Laplacian filtering and bilateral filtering; after filtering, the pixel data of the moving target at the embedding position becomes smooth.
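A feathered composite is one way to realize this smoothing: Gaussian-filtering the target mask yields a soft alpha ramp across the seam. The function name, box convention and feather width below are assumptions of this sketch:

```python
import cv2
import numpy as np

def blend_target_into_background(panorama, target, mask, top_left, feather=7):
    """Composite one moving target onto the embedding-position background,
    filtering the seam so both sides transition smoothly."""
    x, y = top_left
    h, w = target.shape[:2]
    roi = panorama[y:y + h, x:x + w].astype(np.float32)
    # Blurring the binary mask turns the hard edge into a soft alpha ramp.
    alpha = cv2.GaussianBlur(mask, (feather, feather), 0).astype(np.float32)
    alpha = (alpha / 255.0)[..., None]
    blended = alpha * target.astype(np.float32) + (1.0 - alpha) * roi
    panorama[y:y + h, x:x + w] = blended.astype(np.uint8)
    return panorama
```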
In an alternative embodiment, the brightness of the moving target sequence is adjusted to match the network panorama.
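A minimal sketch of this adjustment, assuming simple mean-luma matching against the background at the embedding position (the patent does not prescribe a specific model):

```python
import cv2
import numpy as np

def match_brightness(target, background_roi):
    """Scale the moving target's luma channel so its mean equals that of
    the panorama background at the embedding position."""
    t = cv2.cvtColor(target, cv2.COLOR_BGR2YCrCb).astype(np.float32)
    b = cv2.cvtColor(background_roi, cv2.COLOR_BGR2YCrCb).astype(np.float32)
    gain = b[..., 0].mean() / max(t[..., 0].mean(), 1e-6)
    t[..., 0] = np.clip(t[..., 0] * gain, 0, 255)
    return cv2.cvtColor(t.astype(np.uint8), cv2.COLOR_YCrCb2BGR)
```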
In another alternative embodiment, brightness degradation is applied to the moving targets according to the light source configuration of the network panorama. It should be noted that the brightness of each pixel of the moving target is adjusted according to a degradation model in which brightness decreases with distance from the light source, improving the visual experience after the moving target is embedded.
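The exact degradation model is not specified in the patent; the sketch below assumes a simple inverse falloff with distance from a hypothetical light source position:

```python
import numpy as np

def apply_light_falloff(target, mask, light_xy, k=0.002):
    """Darken each moving-target pixel as its distance from the panorama's
    configured light source grows; k controls how fast brightness decays."""
    h, w = mask.shape
    ys, xs = np.mgrid[0:h, 0:w]
    dist = np.hypot(xs - light_xy[0], ys - light_xy[1])
    gain = 1.0 / (1.0 + k * dist)            # brightness drops with distance
    out = target.astype(np.float32) * gain[..., None]
    out[mask == 0] = target[mask == 0]       # leave non-target pixels alone
    return np.clip(out, 0, 255).astype(np.uint8)
```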
Step S104: the moving target sequence is displayed in the network panorama. Specifically, the moving target sequence is dynamically displayed at the embedding position while the network panorama is played. If the display area of the network panorama contains the embedding position of the moving target sequence, the sequence is dynamically displayed there.
In an alternative embodiment, the moving target sequence is dynamically displayed at the embedding position if the network panorama receives a trigger from at least one of a mouse, a keyboard, voice, video and an image.
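Both display conditions reduce to a simple per-tick check; the (x, y, w, h) box convention and the trigger flag are assumptions of this sketch:

```python
def should_display(viewport, embed_box, triggered=False):
    """Advance the embedded sequence this render tick if the panorama's
    visible area fully contains the embedding position, or if an external
    trigger (mouse, keyboard, voice, video or image event) arrived."""
    vx, vy, vw, vh = viewport
    ex, ey, ew, eh = embed_box
    in_view = (vx <= ex and vy <= ey and
               ex + ew <= vx + vw and ey + eh <= vy + vh)
    return in_view or triggered
```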
In a preferred embodiment, if the video contains sound, the sound is played in sync with the start time of the moving target sequence when the sequence is dynamically displayed.
In another alternative embodiment, if the network panorama contains more than two moving target sequences, one of them is dynamically displayed; or several sequences are dynamically displayed, with control over their playback speed or sound.
In yet another alternative embodiment, the dynamic display progress of the moving target sequence is stored so that display can resume from the same point when the sequence is shown again.
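Progress storage can be as small as a per-sequence frame index; the class below is a hypothetical sketch, not the patent's data structure:

```python
class SequenceProgress:
    """Remember where each embedded sequence stopped so that a later
    visit resumes its dynamic display from the same frame."""

    def __init__(self):
        self._progress = {}  # sequence id -> last displayed frame index

    def save(self, sequence_id, frame_index):
        self._progress[sequence_id] = frame_index

    def resume(self, sequence_id):
        return self._progress.get(sequence_id, 0)
```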
Finally, it should be noted that the above-mentioned embodiments are only specific embodiments of the present invention, intended to illustrate rather than limit its technical solutions, and the protection scope of the present invention is not limited to them. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that the technical solutions described therein can still be modified, or some of their technical features can be equivalently replaced, within the technical scope of the present disclosure; such modifications or substitutions do not depart from the spirit and scope of the embodiments of the present invention and shall be covered by it. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (9)

1. A video lightweight embedding method based on a network panorama is characterized by comprising the following steps:
selecting a moving target in the video, separating the moving target from the video by taking a frame as a unit to form a moving target sequence, and storing the position of each pixel of the moving target in the video; carrying out interpolation or deletion operation on pixels of adjacent moving targets in the moving target sequence;
setting the embedding positions of the moving target sequences in the network panorama, and filtering each moving target in the moving target sequences and the background of the embedding positions of the network panorama so as to smooth the pixel data of the moving target and the background of the embedding positions;
the determination method of the embedding position comprises position matching or registration of the video and the network panorama;
and in the playing process of the network panorama, dynamically displaying the moving target sequence at the embedding position.
2. The method of claim 1, wherein the method of selecting a moving object in the video comprises at least one of:
detecting the moving target in each frame of the video according to the time domain characteristics of the moving target;
and detecting the outline of the moving target in the video according to the optical flow characteristics of the moving target.
3. The method of claim 1, wherein the matching of the video with the location of the network panorama comprises:
selecting any moving target (1) in the moving target sequence, setting at least one first mark point (2) on the moving target (1), setting the same number of second mark points (3) on the network panorama, and mapping a figure formed by the at least one first mark point (2) onto a figure formed by the second mark points (3) through at least one of the graphic operations of rotation, translation, scaling and affine transformation; and applying the same graphic operations to the moving target sequence to map it to the embedding position of the network panorama.
4. The method of claim 1, wherein the registering of the video with the network panorama comprises:
if the network panoramic image and the video correspond to the same physical real scene, carrying out image registration on an image formed by the video by taking a frame as a unit and the network panoramic image to obtain a transformation matrix from the video to the network panoramic image; and mapping the active target sequence to the embedding position of the network panorama by the transformation matrix operation.
5. The method of claim 1, wherein the moving target sequence undergoes at least one of the following processes before being embedded into the network panorama:
adjusting the brightness of the moving target sequence to be consistent with the network panorama;
and setting brightness degradation on the moving target according to the light source configuration of the network panorama by the moving target sequence.
6. The method of claim 1, wherein dynamically displaying the moving target sequence at the embedding position comprises at least one of:
if the display area of the network panorama comprises the embedding position of the moving target sequence, dynamically displaying the moving target sequence at the embedding position;
and if the network panorama receives a trigger from at least one of a mouse, a keyboard, voice, video and an image, dynamically displaying the moving target sequence at the embedding position.
7. The method according to any one of claims 1 to 6, wherein if the video contains sound, the sound is played according to the start time of the moving target sequence when the sequence is dynamically displayed.
8. The method according to any one of claims 1 to 7, wherein if said network panorama comprises more than two of said moving target sequences, one of said moving target sequences is dynamically displayed; or a plurality of said moving target sequences are dynamically displayed, with control over their playback speed or sound.
9. The method of any of claims 1 to 8, further comprising: storing the dynamic display progress of the moving target sequence for display progress control.
CN202010262762.4A 2020-04-07 2020-04-07 Video lightweight embedding method based on a network panorama Pending CN111325665A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010262762.4A CN111325665A (en) 2020-04-07 2020-04-07 Video lightweight embedding method based on a network panorama

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010262762.4A CN111325665A (en) 2020-04-07 2020-04-07 Video lightweight embedding method based on a network panorama

Publications (1)

Publication Number Publication Date
CN111325665A (en) 2020-06-23

Family

ID=71173598

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010262762.4A Pending CN111325665A (en) Video lightweight embedding method based on a network panorama

Country Status (1)

Country Link
CN (1) CN111325665A (en)


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090219300A1 (en) * 2005-11-15 2009-09-03 Yissum Research Deveopment Company Of The Hebrew University Of Jerusalem Method and system for producing a video synopsis
CN103020849A (en) * 2012-12-19 2013-04-03 屠建江 Method and system of moving scene display network sale
CN105376500A (en) * 2014-08-18 2016-03-02 三星电子株式会社 Video processing apparatus for generating paranomic video and method thereof
US20200090303A1 (en) * 2016-12-16 2020-03-19 Hangzhou Hikvision Digital Technology Co., Ltd. Method and device for fusing panoramic video images
CN106991690A (en) * 2017-04-01 2017-07-28 电子科技大学 A kind of video sequence synchronous method based on moving target timing information
CN109743584A (en) * 2018-11-13 2019-05-10 百度在线网络技术(北京)有限公司 Panoramic video synthetic method, server, terminal device and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Zhou Jinguang; Huang Lei; Liu Changping: "Panoramic video generation in PTZ autonomous tracking", Journal of Image and Graphics (中国图象图形学报) *
Li Yandi; Xu Xiping; Chen Jiang; Wang Hecheng: "Application of background updating based on dynamic feature-block matching in motion detection", Chinese Journal of Scientific Instrument (仪器仪表学报) *

Similar Documents

Publication Publication Date Title
JP6741784B2 (en) View-oriented 360-degree video streaming
US20180115743A1 (en) Predictive virtual reality content streaming techniques
JP2015528961A (en) Video playback method, video playback system and portable terminal based on augmented reality technology
CN108965847B (en) Method and device for processing panoramic video data
US20030222883A1 (en) Optimized mixed media rendering
US10586378B2 (en) Stabilizing image sequences based on camera rotation and focal length parameters
CN109906600B (en) Simulated depth of field
JPH11296667A (en) Method and device for displaying image of interesting scene
CN113284073B (en) Image restoration method, device and storage medium
US20140082209A1 (en) Personalized streaming internet video
CN108604389B (en) Continuous depth-ordered image synthesis
CN110060201B (en) Hot spot interaction method for panoramic video
Luo et al. A disocclusion inpainting framework for depth-based view synthesis
CN113891060A (en) Free viewpoint video reconstruction method, playing processing method, device and storage medium
WO2019193364A1 (en) Method and apparatus for generating augmented reality images
CN113705520A (en) Motion capture method and device and server
Langlotz et al. AR record&replay: situated compositing of video content in mobile augmented reality
CN109600667B (en) Video redirection method based on grid and frame grouping
US11341611B2 (en) Automatic generation of perceived real depth animation
CN111818265B (en) Interaction method and device based on augmented reality model, electronic equipment and medium
CN113963094A (en) Depth map and video processing and reconstruction method, device, equipment and storage medium
CN111325665A (en) Video lightweight embedding method based on a network panorama
KR101373631B1 (en) System for composing images by real time and method thereof
JP2020101847A (en) Image file generator, method for generating image file, image generator, method for generating image, image generation system, and program
CN110996173B (en) Image data processing method and device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 100039 Room 608, 6th floor, 326 zhengchangzhuang, Fengtai District, Beijing

Applicant after: Beijing Tongchuang blue sky cloud Technology Co.,Ltd.

Address before: 100039 Room 608, 6th floor, 326 zhengchangzhuang, Fengtai District, Beijing

Applicant before: TONGCHUANG LANTIAN INVESTMENT MANAGEMENT (BEIJING) CO.,LTD.

WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200623