CN106127680B - 720-degree panoramic video fast browsing method

720-degree panoramic video fast browsing method

Info

Publication number
CN106127680B
CN106127680B
Authority
CN
China
Prior art keywords
video
image
shot
point
frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610496238.7A
Other languages
Chinese (zh)
Other versions
CN106127680A (en)
Inventor
罗文峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Youxiang Computing Technology Co Ltd
Original Assignee
Shenzhen Youxiang Computing Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Youxiang Computing Technology Co Ltd
Priority to CN201610496238.7A
Publication of CN106127680A
Application granted
Publication of CN106127680B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformation in the plane of the image
    • G06T3/60 Rotation of a whole image or part thereof
    • G06T3/608 Skewing or deskewing, e.g. by two-pass or three-pass rotation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/74 Browsing; Visualisation therefor
    • G06F16/743 Browsing; Visualisation therefor a collection of video files or sequences
    • G06T3/04
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformation in the plane of the image
    • G06T3/40 Scaling the whole image or part thereof
    • G06T3/4038 Scaling the whole image or part thereof for image mosaicing, i.e. plane images composed of plane sub-images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/49 Segmenting video sequences, i.e. computational techniques such as parsing or cutting the sequence, low-level clustering or determining units such as shots or scenes

Abstract

The invention discloses a 720-degree panoramic video fast browsing method. The method first reconstructs the 720-degree panoramic video images by back projection to obtain the corresponding view in each sight-line direction of a spherical viewpoint space, then judges shot length by calculating the absolute luminance frame difference of adjacent image frames in the video sequence, and finally extracts key frames to achieve fast browsing of the panoramic video. The method can quickly generate perspective views of a virtual scene in different sight directions, effectively simulates camera rotation and zoom within the views in all directions, improves the browsing speed of the virtual scene, and is well suited to the specific application fields of virtual reality systems.

Description

720-degree panoramic video fast browsing method
Technical Field
The invention belongs to the technical field of image processing, relates to panoramic video image processing, and particularly relates to a 720-degree panoramic video fast browsing method.
Background
With the development of information technology, the demand for scene information over a wide viewing angle keeps growing, yet traditional photography can only capture image frames within a limited viewing angle. Image stitching technology emerged and developed rapidly to solve this problem: it stitches two or more pictures with overlapping information into a complete, ultra-wide-angle image, reducing image redundancy while capturing wider viewing-angle information. Panoramic image generation is a typical application of image stitching.
The 720-degree panoramic video is a video image sequence based on a spherical model that enables panoramic browsing in any viewing direction over 360 degrees horizontally and 360 degrees vertically. During browsing, the spherical video image must be back-projected according to the current sight direction and field of view to obtain a planar perspective image that matches human visual habits. In this way the rotational and zoom motions of a camera can be simulated simultaneously while the field of view changes.
Retrieving and playing back massive video data consumes a great deal of time and energy. The traditional drag-to-browse method easily misses brief, sudden abnormal events, and searching through long stretches of video data hinders the extraction of effective information. The panoramic video therefore needs further processing to support fast browsing; the core work is segmenting the original video and extracting its key sequences.
Currently, video segmentation and key frame extraction methods fall into four main categories:
First, simple generation algorithms, which extract key frames by sampling the video sequence uniformly at equal time intervals; because the amount of video information varies over short spans, this approach tends to extract too many key frames or produce an unrepresentative set.
Second, generation methods based on visual information, which apply various video-processing techniques (scene clustering, shot detection, key-frame extraction, and so on) to visual cues such as color, shape, and texture to produce a condensed video; these visual-feature methods clearly improve on simple generation algorithms but ignore audio, subtitles, and other information in the original video.
Third, multi-feature fusion methods, which, for example, use face recognition to detect the appearance of important figures in news and audio processing to detect highlights in sports video, fusing multiple video features with other image-processing techniques; the resulting processing pipeline is relatively complex.
Fourth, generation methods based on video syntax and semantics, which search for structural rules between shots and between scenes and build the video summary on that basis.
In summary, fast video browsing methods differ with video type and purpose. Current panoramic video technology is widely applied to online virtual displays such as tourist attractions, home furnishings, automobile showrooms, leisure clubs, and urban construction planning; these video scenes mainly provide an immersive experience and present the full scene for better publicity.
Disclosure of Invention
The invention provides a 720-degree panoramic video fast browsing method that realizes omnidirectional viewing, 360 degrees horizontally and 360 degrees vertically, through back projection, extracts key frames according to the differing lengths of the shots in different scenes of the video, and forms a video summary for fast browsing.
A 720-degree panoramic video fast browsing method comprises the following steps:
S1, first reconstruct the 720-degree panoramic video image by back projection to obtain the view sequence corresponding to each sight-line direction of the spherical viewpoint space.
S2, judge shot length by calculating the absolute luminance frame difference of adjacent image frames in the video sequence, then extract key frames to realize fast browsing of the panoramic video.
S1 includes the following steps:
S1.1, complete the stitching of the 720-degree panoramic image based on the spherical viewpoint space model and establish two coordinate systems centered on the sphere center: a world coordinate system XYZ and a camera coordinate system xyz. The camera coordinate system xyz is obtained by rotating the world coordinate system XYZ by an angle α around its X axis and then by an angle β around its Y axis.
The stitching of the 720-degree panoramic image based on the spherical viewpoint space model in S1.1 is done as follows. Using the property that straight lines parallel to the y axis of the camera coordinate system xyz remain vertical lines, perpendicular to the image's horizontal axis, in images generated by the spherical parameter transformation formula, the live-action images shot by the fisheye lens are corrected by rotation transformations to obtain the direction information of the pixels of each live-action image in the viewpoint space. The images are then stitched using this direction information to eliminate the repeated information that may exist between the live-action images, and finally projected onto the sphere and stored as a spherical panoramic image.
S1.2, unify the basic measurement units of the pixels under the two coordinate systems of S1.1 and calculate the focal length measured in pixels, i.e., estimate the pixel focal length f from the viewpoint to the view plane for each pixel in the camera coordinate system.
In S1.2, let image S be the stitched spherical panoramic image, Q any pixel on the spherical panoramic image S with image coordinates (x', y'), J the view to be generated, and P the point on view J corresponding to point Q on the spherical panoramic image, with image coordinates (x, y); f denotes the pixel focal length and is estimated according to the lens used to shoot the live-action images.
The pixel focal length f of a wide-angle or standard lens is estimated as follows: if the camera shoots n live-action images over one full horizontal rotation, its horizontal viewing angle is 360/n degrees. With W the width of a live-action image, the trigonometric relation gives the pixel focal length estimate for an ordinary lens:
f = W/(2tan(180/n)).
The pixel focal length f of a fisheye lens is estimated as follows: let W be the image width after the black border of the fisheye image is removed; then f = W/φ, where φ is the horizontal field of view of the fisheye lens.
S1.3, use the pixel focal length f to establish the conversion relation between two-dimensional image point coordinates and the corresponding three-dimensional parameter coordinate points on the sphere. Since the camera coordinate system xyz is obtained from the world coordinate system XYZ by rotating by α around the X axis and then by β around the Y axis, the representation of a pixel on each coordinate component changes with the rotation of the axes (after the rotation, the corresponding position of the pixel must be re-expressed in the new coordinate system; the coordinate components are the components on the three axes x, y, and z). These changes can be expressed on each coordinate component through trigonometric relations, yielding the transformation matrix H between corresponding points of the two coordinate systems.
S1.4, build the inverse transformation function from the transformation matrix H, find the correspondence from any point on the panoramic image to the points of each view in the spherical space, and compute the coordinates of each point to obtain the corresponding view in every sight-line direction of the viewpoint space.
In S1.3, the pixel focal length f is used to establish the conversion relation between the coordinates of a two-dimensional image point and the corresponding three-dimensional parameter coordinate point on the sphere; this spherical parameter transformation, which maps the panorama coordinates Q(x', y') to the spherical point Q'(u, v, w), is recorded as formula (1).
The transformation matrix H of corresponding points under the two coordinate systems is the composition of the two rotations described above,
H = R_Y(β) · R_X(α),  (2)
where
R_X(α) = [1, 0, 0; 0, cosα, -sinα; 0, sinα, cosα],
R_Y(β) = [cosβ, 0, sinβ; 0, 1, 0; -sinβ, 0, cosβ].
In S1.4, expressions (1) and (2) of S1.3 show that a point Q'(u, v, w) in the world coordinate system XYZ corresponds to the coordinate (u', v', w') = H(u, v, w) in the camera coordinate system xyz.
Knowing that the live-action images shot in the video have width W and height H, establish the functional relation between any point Q on the spherical panoramic image and its corresponding point P on view J, and compute the coordinates of each corresponding point with formula (3) to obtain the view corresponding to each sight-line direction of the viewpoint space.
The method of S2 of the invention comprises the following steps:
S2.1, structure the panoramic video sequence: the panoramic video corresponds to a video sequence formed by a group of views in each sight-line direction; classify the video sequence obtained in step S1 according to the view frame sequences projected onto the viewing angles in different directions to obtain video sequence groups that can be browsed independently at multiple viewing angles;
S2.2, segment the video sequence group in each viewing-angle direction: calculate the absolute luminance frame difference of adjacent image frames in the video sequence, determine the shot-change nodes, and split the video sequence into several shot segments;
S2.3, calculate the motion sum of each shot segment, set a motion-amount measurement threshold, and judge from the shot's duration whether the current shot is a long shot or a short shot;
S2.4, extract key frames from long and short shots separately: extract one key frame at random from a short shot, and extract multiple frames at equal intervals as the key frames of a long shot;
S2.5, recombine the extracted key frame sequences and restore them to the different viewing-angle directions to generate a video summary; by operating on the summary, an observer browses the video quickly.
In S2.2, the absolute luminance frame difference AIFD is selected as the feature quantity measuring the degree of change of the video content; it is defined as
AIFD(t) = (1/(W×H)) · Σ_{x=1..W} Σ_{y=1..H} |f(x, y, t+1) - f(x, y, t)|,
where f(x, y, t) and f(x, y, t+1) denote the luminance of the pixel at coordinate (x, y) in the image frame at time t and in the next frame, and W and H are the width and height of the video frame. If N image frames are played for the complete video in a given viewing-angle direction, the mean luminance frame difference of the video is
AIFD_mean = (1/(N-1)) · Σ_{t=1..N-1} AIFD(t).
The mean luminance frame difference serves as the judgment reference. Two coefficients a and b are set (values that are too small cause false detections, values that are too large cause misses; in the experiments a = 1.2 and b = 2.3, empirical values) and used to weight the mean, giving the low and high thresholds
thresh_low = a · AIFD_mean, thresh_high = b · AIFD_mean,
which serve as the conditions for judging whether, and in which manner, a shot change occurs.
In S2.2, the video sequence group is segmented as follows:
First initialize the input video frame data, compute the AIFD feature value of the two adjacent frames at time t, and compare the feature value of the current frame with the judgment thresholds to detect whether a shot change occurs between the current frame and the next.
In S2.3, the shot is judged to be a long shot or a short shot by computing its motion sum and comparing it with the preset motion-amount measurement threshold, where M_f(t) denotes the relative motion between the two adjacent video frames at time t and S_time denotes the duration of the shot; when the motion sum of the shot exceeds the motion-amount measurement threshold, the shot is judged to be a long shot, otherwise a short shot.
The 720-degree panoramic video fast browsing method provided by the invention can quickly generate perspective views of a virtual scene in different sight directions, effectively simulates camera rotation and zoom within the views in all directions, improves the browsing speed of the virtual scene, and is well suited to the specific application fields of virtual reality systems.
Drawings
FIG. 1 is a schematic diagram of the coordinate systems for back projection of the panoramic image
FIG. 2 is a block diagram of panoramic video segmentation and reconstruction in different view directions
FIG. 3 is a block diagram of key frame extraction
FIG. 4 is a schematic diagram of the trigonometric relation between W, f, and θ
Detailed Description
The present invention is described in further detail below with reference to the accompanying drawings.
To browse a 720-degree panoramic video effectively in all directions, the first step of the invention reconstructs the panoramic video by back projection to obtain the view sequence corresponding to each sight-line direction of the spherical viewpoint space, simulating camera rotation and zoom so the video can be browsed at different viewing angles. The specific steps are as follows:
S1.1, complete the stitching of the 720-degree panoramic image based on the spherical viewpoint space model and establish two coordinate systems centered on the sphere center: a world coordinate system XYZ and a camera coordinate system xyz.
The stitching of the 720-degree panoramic image based on the spherical viewpoint space model is done as follows. Using the property that straight lines parallel to the y axis of the camera coordinate system xyz remain vertical lines, perpendicular to the image's horizontal axis, in images generated by the spherical parameter transformation formula, the live-action images shot by the fisheye lens are corrected by rotation transformations to obtain the direction information of the pixels of each live-action image in the viewpoint space. The images are then stitched using this direction information to eliminate the repeated information that may exist between the live-action images, and finally projected onto the sphere and stored as a spherical panoramic image.
The camera coordinate system xyz is obtained by rotating the world coordinate system XYZ by an angle α around its X axis and then by an angle β around its Y axis.
Let image S be the stitched spherical panoramic image, Q any pixel on the spherical panoramic image S with image coordinates (x', y'), and J the view to be generated (i.e., the view in the sight-line direction finally required). As shown in FIG. 1, P is the point on view J corresponding to point Q on the spherical panoramic image, with image coordinates (x, y); f denotes the pixel focal length and is estimated according to whether an ordinary lens (a general wide-angle or standard lens) or a fisheye lens was used to shoot the live-action images.
S1.2, estimate the pixel focal length f of the lens so as to unify the basic measurement units of the pixels under the two coordinate systems.
The pixel focal length of an ordinary lens (a general wide-angle or standard lens) is estimated as follows: if the camera shoots n live-action images over one full horizontal rotation, its horizontal viewing angle is 360/n degrees. With W the width of a live-action image, the trigonometric relation gives the pixel focal length estimate f = W/(2tan(180/n)). The trigonometric relation refers to the sine, cosine, and tangent relations in a right triangle.
Referring to FIG. 4, which shows a cut-plane view of the panorama, the quantities can be related trigonometrically. Let θ be the horizontal viewing angle of the camera, θ = 360/n; the figure shows that W, f, and θ satisfy tan(θ/2) = (W/2)/f.
Rearranging gives f = W/(2tan(θ/2)), that is, f = W/(2tan(180/n)).
The pixel focal length of the fisheye lens follows from its equidistant imaging model: let W be the image width after the black border of the fisheye image is removed; then f = W/φ, where φ is the horizontal field of view of the fisheye lens, which can be looked up in the lens specification.
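For concreteness, the two focal-length estimates of S1.2 can be written down directly. The following Python sketch is illustrative only; the function names and the example values (8 shots per revolution, a 185-degree fisheye) are assumptions, not values fixed by the patent.

```python
import math

def pixel_focal_length_normal(image_width: float, n_images: int) -> float:
    """f = W / (2 tan(180/n)) for a wide-angle or standard lens, where
    the camera shoots n_images pictures over one full horizontal turn."""
    return image_width / (2.0 * math.tan(math.radians(180.0 / n_images)))

def pixel_focal_length_fisheye(cropped_width: float, fov_degrees: float) -> float:
    """f = W / phi for a fisheye lens under the equidistant imaging model;
    cropped_width is the width after removing the black border, and phi
    (the horizontal field of view) is converted to radians here."""
    return cropped_width / math.radians(fov_degrees)

# Illustrative values: 8 shots per revolution of 1920-px-wide frames,
# and a 185-degree fisheye with 1440 px of usable width.
print(pixel_focal_length_normal(1920, 8))     # ~2317.6 px
print(pixel_focal_length_fisheye(1440, 185))  # ~446.0 px
```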
S1.3, following the inverse of the operations in the panoramic image generation process, convert the two-dimensional image coordinates into three-dimensional parameter coordinates: the image coordinate point Q corresponds to a point on the sphere through the conversion relation of the spherical parameter transformation, recorded as formula (1).
The transformation matrix H of corresponding points under the two coordinate systems is the composition of the two rotations,
H = R_Y(β) · R_X(α),  (2)
where
R_X(α) = [1, 0, 0; 0, cosα, -sinα; 0, sinα, cosα],
R_Y(β) = [cosβ, 0, sinβ; 0, 1, 0; -sinβ, 0, cosβ].
With the transformation matrix obtained, the two equations above show that a point Q'(u, v, w) in the coordinate system XYZ corresponds to the coordinate (u', v', w') = H(u, v, w) in the coordinate system xyz.
S1.4, knowing that the live-action image frames shot in the video have width W and height H, establish the functional relation between any point Q on the spherical panoramic image and its corresponding point P on view J, and compute the coordinates of each corresponding point.
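As a concrete illustration of S1.3 and S1.4, the Python sketch below renders one perspective view from an equirectangular spherical panorama. Since the patent's formulas (1) and (3) are not reproduced in this text, the sketch assumes the standard equirectangular sphere-to-panorama parameterization and a right-handed rotation convention; all names are illustrative.

```python
import numpy as np

def rotation_h(alpha: float, beta: float) -> np.ndarray:
    """Compose the two rotations of S1.3: alpha about the X axis,
    then beta about the Y axis (assumed right-handed convention)."""
    ca, sa = np.cos(alpha), np.sin(alpha)
    cb, sb = np.cos(beta), np.sin(beta)
    r_x = np.array([[1, 0, 0], [0, ca, -sa], [0, sa, ca]])
    r_y = np.array([[cb, 0, sb], [0, 1, 0], [-sb, 0, cb]])
    return r_y @ r_x

def render_view(pano: np.ndarray, alpha: float, beta: float,
                f: float, view_w: int, view_h: int) -> np.ndarray:
    """Back-project the panorama into the view for sight direction
    (alpha, beta): cast a ray through each view pixel P(x, y) at pixel
    focal length f, rotate it into the world frame, and look up the
    corresponding panorama pixel Q(x', y')."""
    pano_h, pano_w = pano.shape[:2]
    xs, ys = np.meshgrid(np.arange(view_w) - view_w / 2.0,
                         np.arange(view_h) - view_h / 2.0)
    rays = np.stack([xs, ys, np.full_like(xs, f)], axis=-1)  # camera frame
    rays = rays @ rotation_h(alpha, beta).T                  # world frame
    rays /= np.linalg.norm(rays, axis=-1, keepdims=True)
    lon = np.arctan2(rays[..., 0], rays[..., 2])   # longitude in [-pi, pi]
    lat = np.arcsin(np.clip(rays[..., 1], -1, 1))  # latitude in [-pi/2, pi/2]
    qx = ((lon / (2 * np.pi) + 0.5) * (pano_w - 1)).round().astype(int)
    qy = ((lat / np.pi + 0.5) * (pano_h - 1)).round().astype(int)
    return pano[qy, qx]  # nearest-neighbour sampling
```

Rendering one such view per sight direction for every panorama frame yields the per-direction view sequences that step S2 then segments.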
This completes the first step of the method: the image frames of the 720-degree panoramic video have been back-projected to obtain the corresponding view in any sight-line direction of the viewpoint space, so the 720-degree panoramic video can be watched in any direction. Browsing the 720-degree panoramic video directly, direction by direction, involves too much data: it easily fatigues the observer and lowers the efficiency of extracting key information.
A video is composed of several different scenes, each scene contains several shots (which may be long or short), and each shot plays several associated image frames in a fixed order; video frames are thus the most basic units of a video. To browse video quickly, acquiring the key frames of the video images is the key to extracting the video's effective information. Generally, different types of video emphasize some parts of the shot scene over others according to their subject, and shot length likewise differs with the point of focus, so detecting and distinguishing long and short shots in the video benefits key frame extraction.
S2.1, the first step yielded view sequences of the panoramic video in different directions. Classify the video sequences according to the view sequences projected onto the viewing angles in the different directions (the unfolded 360-degree panorama is stitched from images at several different viewing angles; back projection restores the panorama to several views at different viewing angles, which are arranged in order and classified by sequence number), obtaining view sequence groups that can be browsed independently at multiple viewing angles. Back-projecting one panorama frame of the panoramic video yields several views along the sight directions; back-projecting the whole panoramic video yields a video view sequence for each of several viewing-angle directions, with many views per direction.
S2.2, segment the video sequence groups of the different viewing-angle directions separately.
The absolute luminance frame difference AIFD (Absolute Intensity Frame Difference) is used as the feature quantity measuring the degree of change of the video content and is defined as
AIFD(t) = (1/(W×H)) · Σ_{x=1..W} Σ_{y=1..H} |f(x, y, t+1) - f(x, y, t)|,
where f(x, y, t) and f(x, y, t+1) denote the luminance of the pixel at coordinate (x, y) in the image frame at time t and in the frame at time t+1, and W and H denote the width and height of the video frame. If N image frames are played for the complete video in a given viewing-angle direction, the mean luminance frame difference of the video is
AIFD_mean = (1/(N-1)) · Σ_{t=1..N-1} AIFD(t).
Because the luminance frame difference changes little under the same shot and is fairly uniformly distributed, the mean luminance frame difference can serve as the judgment reference. Two coefficients a and b are set; values that are too small cause false detections, and values that are too large cause missed detections (in the experiments a = 1.2 and b = 2.3, empirical values). Weighting the mean gives the low and high thresholds thresh_low = a · AIFD_mean and thresh_high = b · AIFD_mean, the conditions for judging whether, and in which manner, a shot change occurs.
The segmentation of a video sequence proceeds as follows: initialize the input video frame data, compute the AIFD feature value of the two adjacent frames at time t, and compare the current frame's feature value with the judgment thresholds to detect whether a shot change occurs between the current frame and the next. The judgment rule is: if the feature value of the current frame is below thresh_low, there is no shot switch; if it is between thresh_low and thresh_high, a gradual shot transition is possible; if it is above thresh_high, an abrupt shot transition is possible. Whether gradual or abrupt, the current frame is recorded as a shot-change node.
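A minimal sketch of this segmentation step, assuming the input frames are grayscale NumPy arrays; the thresholds follow the a- and b-weighted AIFD mean described above, and the function names are illustrative.

```python
import numpy as np

def aifd(f0: np.ndarray, f1: np.ndarray) -> float:
    """AIFD(t): mean absolute luminance difference between two
    consecutive frames of shape (H, W)."""
    return float(np.abs(f1.astype(np.float64) - f0.astype(np.float64)).mean())

def shot_change_nodes(frames, a: float = 1.2, b: float = 2.3):
    """Detect shot-change nodes in one viewing-angle direction.

    thresh_low = a * mean(AIFD), thresh_high = b * mean(AIFD); a value
    above thresh_high marks a possible abrupt change, one between the
    thresholds a possible gradual change; both are recorded as nodes."""
    diffs = [aifd(f0, f1) for f0, f1 in zip(frames, frames[1:])]
    mean_diff = sum(diffs) / len(diffs)
    thresh_low, thresh_high = a * mean_diff, b * mean_diff
    nodes = []
    for t, d in enumerate(diffs):
        if d >= thresh_high:
            nodes.append((t, "abrupt"))
        elif d >= thresh_low:
            nodes.append((t, "gradual"))
    return nodes
```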
S2.3, motion components usually characterize how content changes in a video. The total motion of a shot is computed and compared with the preset motion-amount measurement threshold to judge whether the shot is a long shot or a short shot. Under the same shot, the histogram difference of two image frames is generally small (the two frames are not completely unchanged; the difference value is merely small). M_f(t) denotes the relative motion between the two adjacent video frames at time t, i.e., their difference, measured by the histogram difference rate, and S_time denotes the duration of the shot. When the motion sum of the shot, accumulated over S_time, exceeds the motion-amount measurement threshold, the shot is judged to be a long shot; otherwise it is a short shot.
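Once the per-frame-pair motion M_f(t) is accumulated over a shot, the long/short decision is a single comparison. A sketch, with an assumed 64-bin grayscale histogram as the difference measure and an illustrative threshold parameter:

```python
import numpy as np

def motion_amount(f0: np.ndarray, f1: np.ndarray, bins: int = 64) -> float:
    """M_f(t): histogram difference rate between two adjacent frames,
    normalised by the number of pixels."""
    h0, _ = np.histogram(f0, bins=bins, range=(0, 256))
    h1, _ = np.histogram(f1, bins=bins, range=(0, 256))
    return float(np.abs(h1 - h0).sum()) / f0.size

def classify_shot(shot_frames, motion_threshold: float) -> str:
    """Accumulate M_f(t) over the shot's duration S_time and compare
    the sum with the preset motion-amount measurement threshold."""
    total = sum(motion_amount(f0, f1)
                for f0, f1 in zip(shot_frames, shot_frames[1:]))
    return "long" if total > motion_threshold else "short"
```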
S2.4, extract a key frame for each short shot by random selection; for a long shot, select several frames at equal intervals, starting from the shot's first frame, as its key frames.
S2.5, recombine the extracted key frame sequences and restore them to the different viewing-angle directions to generate a video summary; by operating on the summary, an observer browses the video quickly.
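Steps S2.4 and S2.5 admit an equally short sketch; the 30-frame spacing for long shots is an assumed example, not a value fixed by the patent, and the data layout is illustrative.

```python
import random

def extract_keyframes(shot_frames, kind: str, interval: int = 30):
    """S2.4: one randomly chosen key frame for a short shot; frames at
    equal intervals, counted from the shot's start frame, for a long shot."""
    if kind == "short":
        return [random.choice(shot_frames)]
    return shot_frames[::interval]

def build_summary(shots_by_direction):
    """S2.5: recombine the extracted key frames of every shot, per
    viewing-angle direction, into a browsable video summary.

    shots_by_direction maps a direction label to a list of
    (kind, shot_frames) pairs produced by S2.2 and S2.3."""
    return {direction: [kf for kind, frames in shots
                        for kf in extract_keyframes(frames, kind)]
            for direction, shots in shots_by_direction.items()}
```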

Claims (8)

1. A 720-degree panoramic video fast browsing method, characterized by comprising the following steps:
S1, first reconstructing the 720-degree panoramic video image by a back projection method to obtain a view sequence corresponding to each sight-line direction of the spherical viewpoint space;
S2, judging shot length by calculating the absolute luminance frame difference of adjacent image frames in the video sequence, and then extracting key frames to realize fast browsing of the panoramic video, comprising the following steps:
S2.1, structuring the panoramic video sequence, and classifying the video sequence obtained in step S1 according to the view frame sequences projected onto the viewing angles in different directions to obtain video sequence groups that can be browsed independently at multiple viewing angles;
S2.2, segmenting the video sequence group in each viewing-angle direction: calculating the absolute luminance frame difference of adjacent image frames in the video sequence, judging the shot-change nodes, and segmenting the video sequence into several shot segments;
S2.3, comparing the motion sum of each shot segment with a preset motion-amount measurement threshold to judge whether the shot is a long shot or a short shot, wherein M_f(t) represents the relative motion between two adjacent video frames at time t and S_time represents the duration of the shot; when the motion sum of the shot is greater than the motion-amount measurement threshold, the shot is judged to be a long shot, otherwise a short shot;
S2.4, extracting key frames from the long shots and the short shots respectively: one key frame is extracted at random from a short shot, and multiple frames are extracted at equal intervals as the key frames of a long shot;
S2.5, recombining the extracted key frame sequences and restoring them to the different viewing-angle directions to generate a video summary, through which an observer browses the video quickly.
2. The 720-degree panoramic video fast browsing method as claimed in claim 1, wherein S1 comprises the following steps:
S1.1, completing the stitching of the 720-degree panoramic image based on the spherical viewpoint space model, and establishing two coordinate systems centered on the sphere center: a world coordinate system XYZ and a camera coordinate system xyz; the camera coordinate system xyz is obtained by rotating the world coordinate system XYZ by an angle α around its X axis and then by an angle β around its Y axis;
S1.2, unifying the basic measurement units of the pixels under the two coordinate systems of S1.1 and calculating the focal length measured in pixels, namely estimating the pixel focal length f from the viewpoint to the view plane for each pixel in the camera coordinate system;
S1.3, establishing, with the pixel focal length f, the conversion relation between the two-dimensional image point coordinates and the corresponding three-dimensional parameter coordinate points on the sphere; since the camera coordinate system is rotated by α around the X axis of the world coordinate system XYZ and then by β around its Y axis, the representation of a pixel on each coordinate component changes correspondingly with the rotation of the axes, and the changes can be expressed on each coordinate component through trigonometric relations, yielding the transformation matrix H of corresponding points under the two coordinate systems;
S1.4, establishing the inverse transformation function from the transformation matrix H, finding the correspondence from any point on the panoramic image to the points of each view in the spherical space, and calculating the coordinates of each point to obtain the corresponding view in every sight-line direction of the viewpoint space.
3. The 720-degree panoramic video fast browsing method as claimed in claim 1, wherein in S1.2, an image S is the stitched spherical panoramic image, Q is any pixel point on the spherical panoramic image S with image coordinates (x', y'); J is the view to be generated, a point P is the point on view J corresponding to point Q on the spherical panoramic image, with image coordinates (x, y); f represents the pixel focal length and is estimated according to the lens used for shooting the live-action images;
the pixel focal length f of a wide-angle or standard lens is estimated as follows: if the camera shoots n live-action images over one full horizontal rotation, its horizontal viewing angle is 360/n degrees; with W the width of a live-action image, the trigonometric relation gives the pixel focal length estimate f = W/(2tan(180/n));
the pixel focal length f of a fisheye lens is estimated as follows: with W the image width after the black border of the fisheye image is removed, the pixel focal length estimate is f = W/φ, where φ is the horizontal field of view of the fisheye lens.
4. The 720-degree panoramic video fast browsing method as claimed in claim 3, wherein in S1.3, the pixel focal length f is used to establish the conversion relation between the two-dimensional image point coordinates and the corresponding three-dimensional parameter coordinate point on the sphere, recorded as formula (1);
the transformation matrix H of corresponding points under the two coordinate systems is the composition of the two rotations, H = R_Y(β) · R_X(α) (2), where R_X(α) = [1, 0, 0; 0, cosα, -sinα; 0, sinα, cosα] and R_Y(β) = [cosβ, 0, sinβ; 0, 1, 0; -sinβ, 0, cosβ].
5. The method according to claim 4, wherein in S1.4, as shown by equations (1) and (2) in S1.3, the point Q'(u, v, w) in the coordinate system XYZ corresponds to (u', v', w') = H(u, v, w) in the coordinate system xyz;
knowing that the live-action images shot in the video have width W and height H, a functional relation is established between any point Q(x', y') on the spherical panoramic image and its corresponding point P(x, y) on view J, and the coordinates of each corresponding point are calculated with formula (3) to obtain the view corresponding to each sight-line direction of the viewpoint space;
6. The method according to claim 1, wherein in S2.2, the absolute luminance frame difference AIFD is selected as the feature quantity measuring the degree of change of the video content and is defined as
AIFD(t) = (1/(W×H)) · Σ_{x=1..W} Σ_{y=1..H} |f(x, y, t+1) - f(x, y, t)|,
where f(x, y, t) and f(x, y, t+1) respectively represent the luminance of the pixel at coordinate (x, y) in the image frame at time t and in the next frame, and W and H respectively represent the width and height of the video frame; if the number of image frames of the video completely played in a certain viewing-angle direction is N, the mean luminance frame difference of the video is
AIFD_mean = (1/(N-1)) · Σ_{t=1..N-1} AIFD(t);
with the mean luminance frame difference as the judgment reference, two coefficients a and b are set and used to weight the mean, giving the high and low thresholds thresh_high and thresh_low as the conditions for judging whether, and in which manner, a shot change occurs, wherein thresh_low = a · AIFD_mean and thresh_high = b · AIFD_mean.
7. The 720-degree panoramic video fast browsing method according to claim 6, wherein a takes the value 1.2 and b takes the value 2.3.
8. The 720-degree panoramic video fast browsing method according to claim 6, wherein in S2.2, the video sequence group is segmented as follows:
first, the input video frame data are initialized, the AIFD feature values of the two adjacent frames at time t are calculated, and the feature value of the current frame is compared with the judgment thresholds, thereby detecting whether a shot change exists between the current frame and the next frame.
CN201610496238.7A 2016-06-29 2016-06-29 720-degree panoramic video fast browsing method Active CN106127680B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610496238.7A CN106127680B (en) 2016-06-29 2016-06-29 720-degree panoramic video fast browsing method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610496238.7A CN106127680B (en) 2016-06-29 2016-06-29 720-degree panoramic video fast browsing method

Publications (2)

Publication Number Publication Date
CN106127680A CN106127680A (en) 2016-11-16
CN106127680B 2019-12-17

Family

ID=57284438

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610496238.7A Active CN106127680B (en) 2016-06-29 2016-06-29 720-degree panoramic video fast browsing method

Country Status (1)

Country Link
CN (1) CN106127680B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108122191B (en) * 2016-11-29 2021-07-06 成都美若梦景科技有限公司 Method and device for splicing fisheye images into panoramic image and panoramic video
CN108174265B (en) * 2016-12-07 2019-11-29 华为技术有限公司 A kind of playback method, the apparatus and system of 360 degree of panoramic videos
CN106792151A (en) * 2016-12-29 2017-05-31 上海漂视网络科技有限公司 A kind of virtual reality panoramic video player method
CN108269234B (en) * 2016-12-30 2021-11-19 成都美若梦景科技有限公司 Panoramic camera lens attitude estimation method and panoramic camera
CN107213636B (en) * 2017-05-31 2021-07-23 网易(杭州)网络有限公司 Lens moving method, device, storage medium and processor
CN107172412A (en) * 2017-06-11 2017-09-15 成都吱吖科技有限公司 A kind of interactive panoramic video storage method and device based on virtual reality
CN107484004B (en) * 2017-07-24 2020-01-03 北京奇艺世纪科技有限公司 Video processing method and device
US10779006B2 (en) * 2018-02-14 2020-09-15 Qualcomm Incorporated Signaling 360-degree video information
CN108769731B (en) * 2018-05-25 2021-09-24 北京奇艺世纪科技有限公司 Method and device for detecting target video clip in video and electronic equipment
CN111669547B (en) * 2020-05-29 2022-03-11 成都易瞳科技有限公司 Panoramic video structuring method
CN114342909A (en) * 2022-01-04 2022-04-15 阳光电源股份有限公司 Laser bird repelling method and related device

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6459822B1 (en) * 1998-08-26 2002-10-01 The United States Of America As Represented By The Administrator Of The National Aeronautics And Space Administration Video image stabilization and registration
CN102833525A (en) * 2012-07-19 2012-12-19 中国人民解放军国防科学技术大学 Browsing operation method of 360-degree panoramic video
CN103338343A (en) * 2013-05-29 2013-10-02 山西绿色光电产业科学技术研究院(有限公司) Multi-image seamless splicing method and apparatus taking panoramic image as reference
CN104219584B (en) * 2014-09-25 2018-05-01 广东京腾科技有限公司 Panoramic video exchange method and system based on augmented reality
CN105678693B (en) * 2016-01-25 2019-05-14 成都易瞳科技有限公司 Panoramic video browses playback method

Also Published As

Publication number Publication date
CN106127680A (en) 2016-11-16


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant