WO2022028407A1 - Panoramic video editing method and apparatus, storage medium, and device - Google Patents

Panoramic video editing method and apparatus, storage medium, and device Download PDF

Info

Publication number
WO2022028407A1
WO2022028407A1 (PCT/CN2021/110259)
Authority
WO
WIPO (PCT)
Prior art keywords
target
panoramic video
sky
perspective
frame
Prior art date
Application number
PCT/CN2021/110259
Other languages
English (en)
French (fr)
Inventor
贾顺
那强
江振祺
蔡锦霖
Original Assignee
影石创新科技股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 影石创新科技股份有限公司 filed Critical 影石创新科技股份有限公司
Publication of WO2022028407A1 publication Critical patent/WO2022028407A1/zh

Links

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44016Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving splicing one content stream with another content stream, e.g. for substituting a video clip
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/04Context-preserving transformations, e.g. by using an importance map
    • G06T3/047Fisheye or wide-angle transformations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/4402Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/4402Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440281Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by altering the temporal resolution, e.g. by frame skipping
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/472End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/472End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/47205End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for manipulating displayed content, e.g. interacting with MPEG-4 objects, editing locally
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules

Definitions

  • the present application belongs to the field of video processing, and in particular relates to a panoramic video editing method, apparatus, storage medium, and device.
  • Panoramic video is a video obtained by using a panoramic camera to shoot 360 degrees in all directions. Users can arbitrarily watch the dynamic video within the shooting angle of the panoramic camera.
  • since a flat-panel display can only show the image of one viewing angle of the panoramic video at a given moment, when the user wants to watch a certain salient target object during a certain period of panoramic video playback, the target may disappear from the current viewing angle, so the user has to keep controlling the display to rotate the viewing angle; the operation is cumbersome and the viewing experience is affected.
  • Embodiments of the present invention provide a panoramic video editing method, apparatus, storage medium, and device, which are used to solve the problem that, because the prior art cannot provide an effective panoramic video editing method, the output panoramic video is not smooth.
  • an embodiment of the present invention provides a panoramic video editing method, the method includes the following steps:
  • Traverse the panoramic video, edit the panoramic video according to the viewing angle of the forward direction, the viewing angle of the salient target, the viewing angle of the symmetry target, and the viewing angle of the sky target, and generate a target video corresponding to the panoramic video.
  • a panoramic video shot by a panoramic camera is acquired, and the forward direction viewing angle of the panoramic camera during shooting is recorded, specifically including:
  • the panoramic video is an original spherical video.
  • identifying and acquiring a saliency target, a symmetry target, and a sky target according to the panoramic video frame and/or the double fisheye image frame specifically includes:
  • a preset salient target detection and recognition algorithm is used to detect the panoramic video frame and/or the double fisheye image frame to obtain a salient target.
  • detecting the panoramic video frame and/or the double fisheye image frame to obtain a saliency target, which further includes:
  • when salient targets of a preset saliency target category are detected in the currently detected panoramic video frame and/or double fisheye image frame, the target with the largest saliency value is set as the current saliency target of the currently detected panoramic video frame and/or double fisheye image frame.
  • identifying and acquiring a saliency target, a symmetry target, and a sky target according to the panoramic video frame and/or the double fisheye image frame specifically includes:
  • a preset symmetrical target detection and recognition algorithm is used to detect the images in the vertical upward direction of the panoramic video frame and/or the double fisheye image frame to obtain a symmetrical target.
  • identifying and acquiring a saliency target, a symmetry target, and a sky target according to the panoramic video frame and/or the double fisheye image frame specifically includes:
  • the panoramic video frame and/or the image in the vertical upward direction of the double fisheye image frame is detected to obtain the sky target.
  • a preset target tracking algorithm is used to track the salient target, the symmetry target, and the sky target to obtain the viewing angle where the salient target is located, the viewing angle where the symmetry target is located, and the viewing angle where the sky target is located, specifically including:
  • the preset target tracking algorithm is used to track the current saliency target, the current symmetry target, and the current sky target in subsequent panoramic video frames and/or double fisheye image frames in turn, and the viewing angle of the current saliency target, the viewing angle of the current symmetry target, and the viewing angle of the current sky target are obtained.
  • the method further includes:
  • after a preset target tracking algorithm is used to sequentially track the current saliency target, the current symmetry target, and the current sky target in subsequent panoramic video frames and/or double fisheye image frames, and the viewing angle of the current saliency target, the viewing angle of the current symmetry target, and the viewing angle of the current sky target are obtained, the method specifically further includes:
  • the center coordinates of the tracking boxes of the current saliency target, the current symmetry target, and the current sky target in the currently detected panoramic video frame and/or double fisheye image frame are respectively obtained, and the spherical viewpoint coordinates of the current saliency target, the current symmetry target, and the current sky target are respectively calculated from the center coordinates;
  • according to the spherical viewpoint coordinates, the lens images corresponding to the viewing angle of the salient target, the viewing angle of the symmetry target, and the viewing angle of the sky target are respectively obtained;
  • a saliency target view segment, a symmetry target view segment, and a sky target view segment are respectively generated from the lens images corresponding to the viewing angle of the salient target, the viewing angle of the symmetry target, and the viewing angle of the sky target.
  • the stop tracking event is the loss of the current saliency target, the current symmetry target, or the current sky target, or the area of the tracking box being smaller than a preset area.
  • editing the panoramic video according to the viewing angle of the forward direction, the viewing angle of the salient target, the viewing angle of the symmetry target, and the viewing angle of the sky target, and generating a target video corresponding to the panoramic video, specifically includes:
  • the target video is a single-view video or a plane video.
  • the present invention provides a panoramic video editing device, wherein the device includes:
  • Acquisition module: used to acquire the panoramic video shot by the panoramic camera, and record the forward direction viewing angle of the panoramic camera during shooting;
  • Frame extraction module: used to perform a frame extraction operation on the obtained panoramic video to obtain corresponding panoramic video frames and/or double fisheye image frames;
  • Identification module: used to identify and acquire salient targets, symmetry targets, and sky targets according to the panoramic video frame and/or the double fisheye image frame;
  • Tracking module: used to track the salient target, the symmetry target, and the sky target using a preset target tracking algorithm, and obtain the viewing angle of the salient target, the viewing angle of the symmetry target, and the viewing angle of the sky target;
  • Processing module: used to traverse the panoramic video, edit the panoramic video according to the viewing angle of the forward direction, the viewing angle of the salient target, the viewing angle of the symmetry target, and the viewing angle of the sky target, and generate the target video corresponding to the panoramic video.
  • the present invention provides a computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, characterized in that, when the computer program is executed by a processor, the steps of the method according to the first aspect are implemented.
  • the present invention provides a panoramic video editing device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that, when the processor executes the computer program, the steps of the method as described in the first aspect are implemented.
  • the present invention provides a panoramic video editing method.
  • a panoramic video shot by a panoramic camera is acquired, and the forward direction viewing angle of the panoramic camera during shooting is recorded; a frame extraction operation is performed on the acquired panoramic video to obtain corresponding panoramic video frames and/or double fisheye image frames; a salient target, a symmetry target, and a sky target are identified and acquired according to the panoramic video frames and/or double fisheye image frames; a preset target tracking algorithm is used to track the salient target, the symmetry target, and the sky target to obtain the viewing angle of the salient target, the viewing angle of the symmetry target, and the viewing angle of the sky target; the panoramic video is edited according to the viewing angle of the forward direction, the viewing angle of the salient target, the viewing angle of the symmetry target, and the viewing angle of the sky target, and a target video corresponding to the panoramic video is generated, which realizes automatic editing of the panoramic video while ensuring the smoothness of transitions in the target video and the effectiveness and interest of the content.
  • FIG. 1 is a flowchart of the implementation of the panoramic video editing method provided by the first embodiment of the present invention.
  • FIG. 2 is a schematic structural diagram of a panoramic video editing apparatus according to Embodiment 2 of the present invention.
  • FIG. 3 is a schematic structural diagram of a panoramic video editing device provided in Embodiment 3 of the present invention.
  • FIG. 1 shows an implementation process of a panoramic video editing method provided by Embodiment 1 of the present invention.
  • the panoramic video editing method provided by the embodiments of the present invention can be applied to a computing device, where the computing device may be, but not limited to, various personal computers, notebook computers, smart phones, tablet computers, and panoramic cameras.
  • FIG. 1 only shows the part related to the embodiment of the present invention, and the details are as follows:
  • the embodiment of the present invention is applicable to panoramic video editing for editing panoramic video
  • the panoramic video is captured by a panoramic camera
  • the panoramic camera is composed of two or more fisheye lenses
  • the obtained panoramic video is an original spherical video.
  • the moving direction of the panoramic camera is obtained according to the panoramic video, and then the lens image corresponding to the viewing angle of the forward direction is obtained, which may further be as follows:
  • the displacement amount of the current video frame of the panoramic camera relative to the previous video frame is optimized to obtain the optimized displacement amount
  • a frame extraction operation is performed on the obtained panoramic video according to a preset time interval, so as to obtain corresponding panoramic video frames and/or double fisheye image frames.
  • S103: identify and acquire a salient target, a symmetry target, and a sky target according to the panoramic video frame and/or the double fisheye image frame;
  • the panoramic video frames and/or double fisheye image frames are detected in sequence, and the salient target, the symmetry target, and the sky target are identified and acquired through the following steps:
  • when salient targets of a preset target category are detected in the currently detected panoramic video frame and/or double fisheye image frame, the target with the largest saliency value is set as the current saliency target of the currently detected panoramic video frame and/or double fisheye image frame, so that the saliency target to be tracked is accurately obtained when multiple saliency targets are detected; when no salient target of the target categories is detected in the currently detected panoramic video frame and/or double fisheye image frame, it is confirmed that no saliency target exists in the currently detected panoramic video frame and/or double fisheye image frame.
  • the target type can be set according to the preset shooting scene of the panoramic camera to further improve the accuracy of target detection.
  • the target type can be sculpture, stone monument, flower bed, and landmark building, and the salient targets under the sculpture category can be animal sculptures, plant sculptures, human sculptures, etc.
  • the symmetrical target detection is performed on the panoramic video frame and/or the double fisheye image frame, which specifically includes:
  • a preset symmetrical target detection and recognition algorithm is used to detect the images in the vertical upward direction of the panoramic video frame and/or the double fisheye image frame to obtain a symmetrical target.
  • sky target detection is performed on panoramic video frames and/or double fisheye image frames, specifically including:
  • the panoramic video frame and/or the image in the vertical upward direction of the double fisheye image frame is detected to obtain the sky target.
  • the panoramic video frames and/or the double fish-eye image frames are detected in sequence, and the salient target, the symmetry target and the sky target are identified and obtained.
  • when target detection and recognition are performed on the panoramic video frames and/or double fisheye image frames, algorithms including but not limited to the FT (Frequency-tuned Salient Region Detection) algorithm or superpixel convolutional neural networks (e.g., A Superpixelwise Convolutional Neural Network for Salient Object Detection) can be used, thereby improving the accuracy of target detection and ensuring the stability of target detection.
  • In Embodiment 1 of the present invention, when the current saliency target, the current symmetry target, and the current sky target are detected in the currently detected panoramic video frame and/or double fisheye image frame, the preset target tracking algorithm is used to track the current saliency target, the current symmetry target, and the current sky target in subsequent panoramic video frames and/or double fisheye image frames in sequence, and the viewing angle of the current salient target, the viewing angle of the current symmetry target, and the viewing angle of the current sky target are obtained.
  • the preset target tracking algorithm may include, but is not limited to, KCF (High-speed Tracking with Kernelized Correlation filters, high-speed tracking based on kernel correlation filters) algorithm or DSST (Accurate Scale Estimation for Robust Visual Tracking, accurate scale estimation for robust visual tracking) algorithm, etc.
  • the method further includes:
  • after a preset target tracking algorithm is used to sequentially track the current salient target, the current symmetry target, and the current sky target in subsequent panoramic video frames and/or double fisheye image frames, and the viewing angle of the current salient target, the viewing angle of the current symmetry target, and the viewing angle of the current sky target are obtained, the method further includes:
  • the center coordinates of the tracking boxes of the current salient target, the current symmetry target, and the current sky target in the currently detected panoramic video frame and/or double fisheye image frame are respectively obtained, and the spherical viewpoint coordinates of the current salient target, the current symmetry target, and the current sky target are respectively calculated from the center coordinates;
  • according to the spherical viewpoint coordinates, the lens images corresponding to the viewing angle of the salient target, the viewing angle of the symmetry target, and the viewing angle of the sky target are respectively obtained;
  • a saliency target view segment, a symmetry target view segment, and a sky target view segment are respectively generated from the lens images corresponding to the viewing angle of the salient target, the viewing angle of the symmetry target, and the viewing angle of the sky target.
  • the tracking stop event is the loss of the current saliency target, the current symmetry target, and the current sky target, or the area of the tracking frame is smaller than a preset area.
  • S105: Traverse the panoramic video, and edit the panoramic video according to the viewing angle of the forward direction, the viewing angle where the salient target is located, the viewing angle where the symmetry target is located, and the viewing angle where the sky target is located, and generate the target video corresponding to the panoramic video.
  • editing the panoramic video according to the viewing angle of the forward direction, the viewing angle of the salient target, the viewing angle of the symmetry target, and the viewing angle of the sky target, and generating the target video corresponding to the panoramic video, specifically includes:
  • the target video is a single-view video or a plane video.
  • before setting the number of saliency target view segments and/or sky target view segments and/or symmetry target view segments to be played and the corresponding playback speeds, the method further includes:
  • setting the lens rotation mode and/or the corresponding playback speed according to the duration of the panoramic video and the number and duration of the saliency target view segments and/or sky target view segments and/or symmetry target view segments.
  • In Embodiment 1 of the present invention, according to the duration of the saliency target view segment and the time interval between two adjacent saliency segments, the lens image corresponding to the viewing angle where the salient target is located in the panoramic video frame and/or the double fisheye image frame is edited, and the playback speed of the lens image is set, which can further specifically be:
  • when the duration of the saliency target view segment is less than a preset first threshold, the saliency target view segment is discarded; when the duration of the saliency target view segment is greater than the preset first threshold and less than a preset second threshold, the saliency target view segment is correspondingly expanded according to a preset expansion rule, and if the expanded saliency target view segment is still smaller than the preset second threshold, the saliency target view segment is discarded, while if the expanded saliency target view segment is larger than the preset second threshold, the expanded saliency target view segment is kept and the playback speed of the lens image is set to a preset first speed; when the duration of the saliency target view segment is greater than the preset second threshold and the time interval between the saliency target view segment and the previous saliency target view segment is less than a preset threshold, the playback speed of the lens image is set to the preset first speed; when the duration of the saliency target view segment is greater than the preset second threshold and the time interval between the saliency target view segment and the previous saliency target view segment is greater than the preset threshold, the playback speed of the lens images of the first half of the saliency view segment is set to the preset first speed and the playback speed of the lens images of the second half of the saliency view segment is set to a preset second speed; the first speed may be greater than or less than the second speed, and the duration of the first half of the saliency view segment may be greater than or less than the duration of the second half.
  • In Embodiment 1 of the present invention, according to the duration of the symmetry target view segment and the duration of the panoramic video, the lens image corresponding to the viewing angle of the symmetry target in the panoramic video frame and/or the double fisheye image frame to be edited is determined, and the playback speed of the lens image is set, which can further specifically be:
  • when the duration of the symmetry target view segment is less than a preset threshold, the symmetry target view segment is discarded; when the duration of the panoramic video is greater than a preset threshold and the duration of the symmetry target view segment is also greater than a preset duration threshold, the playback speed of the lens image is set to a preset third speed; when the duration of the panoramic video is less than the preset threshold and the duration of the symmetry target view segment is greater than the preset threshold, the playback speed of the lens image is set to a preset fourth speed, wherein the preset third speed may be greater than or less than the preset fourth speed.
  • setting the lens rotation mode and/or the corresponding playback speed can further specifically be: calculating the rotation direction from the viewing angle of the forward direction to the viewing angle of the salient target; if the rotation direction is greater than a preset threshold, the lens image corresponding to the viewing angle of the salient target is first rotated clockwise and then counterclockwise based on the rotation direction; if the rotation direction is less than the preset threshold, the lens image is first rotated counterclockwise and then clockwise;
  • the rotation direction may be the rotation angle from the viewing angle of the forward direction to the viewing angle of the salient target, and may specifically refer to the angle rotated in the Yaw direction (about the axis perpendicular to the ground) from the viewing angle of the forward direction to the viewing angle of the salient target.
  • the saliency target view segments and/or sky target view segments and/or symmetry target view segments are automatically edited to generate a target video corresponding to the panoramic video, which may further specifically be:
  • if the interval between the first saliency target segment and the second saliency target segment is less than a preset threshold, after the end of the first saliency target segment the video transfers directly to the lens image corresponding to the forward direction viewing angle of the panoramic video frame and/or double fisheye image frame, and then enters the second saliency target segment;
  • if the interval between the first saliency target segment and the second saliency target segment is greater than the preset threshold, a preset symmetry target segment and/or sky target segment is inserted between the first saliency target segment and the second saliency target segment.
  • the saliency target view segments and/or sky target view segments and/or symmetry target view segments are automatically edited to generate a target video corresponding to the panoramic video, which may further specifically be: key points of the saliency target view segments and/or sky target view segments and/or symmetry target view segments are selected and combined into a smooth curve according to preset rules, and any point between the key points is then interpolated along the curve according to preset rules, thereby ensuring smooth switching between the multiple target viewing angles;
  • the key points may use time as the abscissa and the relative rotation angle as the ordinate.
  • a panoramic video shot by a panoramic camera is obtained, and the forward direction viewing angle of the panoramic camera during shooting is recorded; a frame extraction operation is performed on the obtained panoramic video to obtain corresponding panoramic video frames and/or double fisheye image frames; a salient target, a symmetry target, and a sky target are identified and obtained according to the panoramic video frames and/or double fisheye image frames; a preset target tracking algorithm is used to track the salient target, the symmetry target, and the sky target to obtain the viewing angle of the salient target, the viewing angle of the symmetry target, and the viewing angle of the sky target; the panoramic video is edited according to the viewing angle of the forward direction, the viewing angle of the salient target, the viewing angle of the symmetry target, and the viewing angle of the sky target, and a target video corresponding to the panoramic video is generated, which realizes automatic editing of the panoramic video while ensuring the smoothness of transitions in the target video and the effectiveness and interest of the content.
  • Embodiment 2:
  • FIG. 2 shows the structure of the panoramic video editing apparatus provided by the second embodiment of the present invention. For the convenience of description, only the parts related to the embodiment of the present invention are shown.
  • the panoramic video editing device includes an acquisition module 21, a frame extraction module 22, an identification module 23, a tracking module 24, and a processing module 25, wherein:
  • Obtaining module 21: used to obtain the panoramic video shot by the panoramic camera, and record the forward direction viewing angle of the panoramic camera during shooting;
  • Frame extraction module 22: used to perform a frame extraction operation on the obtained panoramic video to obtain corresponding panoramic video frames and/or double fisheye image frames;
  • Recognition module 23: used to identify and acquire salient targets, symmetry targets, and sky targets according to the panoramic video frame and/or the double fisheye image frame;
  • Tracking module 24: used to track the salient target, the symmetry target, and the sky target using a preset target tracking algorithm, and obtain the viewing angle where the salient target is located, the viewing angle where the symmetry target is located, and the viewing angle where the sky target is located;
  • Processing module 25: used to traverse the panoramic video, edit the panoramic video according to the viewing angle of the forward direction, the viewing angle of the salient target, the viewing angle of the symmetry target, and the viewing angle of the sky target, and generate the target video corresponding to the panoramic video.
  • each module of the panoramic video editing apparatus may be implemented by corresponding hardware or software units, and each module may be an independent software and hardware unit, or may be integrated into one software and hardware unit, which is not intended to limit the invention.
  • a computer-readable storage medium is provided, where a computer program is stored in the computer-readable storage medium, and when the computer program is executed by a processor, the steps in the above-mentioned embodiments of the panoramic video editing method are implemented, for example, steps S101 to S105 shown in FIG. 1.
  • when the computer program is executed by the processor, the functions of each unit in the above-mentioned apparatus embodiments, for example, the functions of units 21 to 25 shown in FIG. 2, are implemented.
  • the computer-readable storage medium of the embodiments of the present invention may include any entity or device capable of carrying computer program code, or a recording medium, for example, memory such as ROM/RAM, a magnetic disk, an optical disk, flash memory, and the like.
  • Embodiment 4:
  • FIG. 3 shows the structure of the panoramic video editing device provided by the third embodiment of the present invention. For convenience of description, only the parts related to the embodiment of the present invention are shown.
  • the panoramic video editing device 3 of the embodiment of the present invention includes a processor 30 , a memory 31 , and a computer program 32 stored in the memory 31 and executable on the processor 30 .
  • the processor 30 executes the computer program 32
  • the steps in the above-mentioned embodiment of the panoramic video editing method are implemented, for example, steps S101 to S105 shown in FIG. 1 .
  • the processor 30 executes the computer program 32
  • the functions of the modules in the above-mentioned apparatus embodiments, such as the functions of the modules 21 to 25 shown in FIG. 2, are implemented.
  • the panoramic video editing device in the embodiment of the present invention may be a smart phone, a personal computer, a panoramic camera itself, or the like.
  • for the steps implemented when the processor 30 in the panoramic video editing device 3 executes the computer program 32 to implement the panoramic video editing method, reference may be made to the descriptions of the foregoing method embodiments, which will not be repeated here.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Studio Devices (AREA)

Abstract

Applicable to the technical field of panoramic video, provided are a panoramic video editing method and apparatus, a storage medium, and a device. The method comprises: acquiring a panoramic video shot by a panoramic camera, and recording the forward-direction viewing angle of the panoramic camera during mobile shooting; performing a frame extraction operation on the acquired panoramic video to obtain corresponding panoramic video frames and/or double fisheye image frames; identifying and acquiring a saliency target, a symmetry target, and a sky target according to the panoramic video frames and/or double fisheye image frames; tracking the saliency target, the symmetry target, and the sky target using a preset target tracking algorithm to obtain the viewing angle of the saliency target, the viewing angle of the symmetry target, and the viewing angle of the sky target; and editing the panoramic video according to said viewing angles to generate a target video corresponding to the panoramic video, thereby achieving automatic editing of the panoramic video while ensuring smooth transitions in the target video and effective, engaging content.

Description

Panoramic video editing method and apparatus, storage medium, and device
Technical Field
The present application belongs to the field of video processing, and in particular relates to a panoramic video editing method, apparatus, storage medium, and device.
Background
A panoramic video is a video obtained by shooting omnidirectionally through 360 degrees with a panoramic camera; the user can freely watch the dynamic video within the shooting angle range of the panoramic camera. When watching a panoramic video, because a flat display can only show the image of one viewing angle of the panoramic video at a given moment, when the user wants to watch a certain salient target object during a certain period of playback, the target may disappear from the current viewing angle, so the user has to keep controlling the display to rotate the viewing angle; the operation is cumbersome and the viewing experience is affected.
Technical Problem
Embodiments of the present invention provide a panoramic video editing method, apparatus, storage medium, and device, which are used to solve the problem that, because the prior art cannot provide an effective panoramic video editing method, the output panoramic video is not smooth.
Technical Solution
In a first aspect, an embodiment of the present invention provides a panoramic video editing method, the method comprising the following steps:
acquiring a panoramic video shot by a panoramic camera, and recording the forward-direction viewing angle of the panoramic camera during shooting;
performing a frame extraction operation on the acquired panoramic video to obtain corresponding panoramic video frames and/or double fisheye image frames;
identifying and acquiring a saliency target, a symmetry target, and a sky target according to the panoramic video frames and/or double fisheye image frames;
tracking the saliency target, the symmetry target, and the sky target using a preset target tracking algorithm to obtain the viewing angle of the saliency target, the viewing angle of the symmetry target, and the viewing angle of the sky target;
traversing the panoramic video, and editing the panoramic video according to the forward-direction viewing angle, the viewing angle of the saliency target, the viewing angle of the symmetry target, and the viewing angle of the sky target, to generate a target video corresponding to the panoramic video.
With reference to the first aspect, in a possible implementation, acquiring a panoramic video shot by a panoramic camera and recording the forward-direction viewing angle of the panoramic camera during shooting specifically includes:
obtaining, from the panoramic video, the forward direction of the panoramic camera during mobile shooting, and obtaining the lens image corresponding to the forward-direction viewing angle;
obtaining the forward-direction viewing angle of the panoramic camera during shooting from the lens image;
wherein the panoramic video is an original spherical video.
With reference to the first aspect, in a possible implementation, identifying and acquiring a saliency target, a symmetry target, and a sky target according to the panoramic video frames and/or double fisheye image frames specifically includes:
detecting the panoramic video frames and/or double fisheye image frames using a preset saliency target detection and recognition algorithm to acquire a saliency target.
With reference to the first aspect, in a possible implementation, detecting the panoramic video frames and/or double fisheye image frames to acquire a saliency target specifically further includes:
when saliency targets of preset saliency target categories are detected in the currently detected panoramic video frame and/or double fisheye image frame, setting the target with the largest saliency value as the current saliency target of the currently detected panoramic video frame and/or double fisheye image frame.
With reference to the first aspect, in a possible implementation, identifying and acquiring a saliency target, a symmetry target, and a sky target according to the panoramic video frames and/or double fisheye image frames specifically includes:
obtaining the timestamp of the panoramic video frame and/or double fisheye image frame according to the panoramic video frame and/or double fisheye image frame;
obtaining the first rotation matrix of the panoramic video frame and/or double fisheye image frame according to the timestamp of the panoramic video frame and/or double fisheye image frame;
rendering the panoramic video frame and/or double fisheye image frame according to the first rotation matrix to obtain the vertically upward image of the panoramic video frame and/or double fisheye image frame;
detecting the vertically upward image of the panoramic video frame and/or double fisheye image frame using a preset symmetry target detection and recognition algorithm to acquire a symmetry target.
With reference to the first aspect, in a possible implementation, identifying and acquiring a saliency target, a symmetry target, and a sky target according to the panoramic video frames and/or double fisheye image frames specifically includes:
obtaining the timestamp of the panoramic video frame and/or double fisheye image frame according to the panoramic video frame and/or double fisheye image frame;
obtaining the first rotation matrix of the panoramic video frame and/or double fisheye image frame according to the timestamp of the panoramic video frame and/or double fisheye image frame;
decomposing the travel-direction matrix of the panoramic video frame and/or double fisheye image frame into first Euler angles (Yaw, Pitch, and Roll respectively), and setting the Pitch angle to PI/2 to obtain second Euler angles;
converting the second Euler angles to obtain a second rotation matrix;
rendering the panoramic video frame and/or double fisheye image frame according to the second rotation matrix to obtain the vertically upward image of the panoramic video frame and/or double fisheye image frame;
detecting the vertically upward image of the panoramic video frame and/or double fisheye image frame using a preset sky target detection and recognition algorithm to acquire a sky target.
With reference to the first aspect, in a possible implementation, tracking the saliency target, the symmetry target, and the sky target using a preset target tracking algorithm to obtain the viewing angle of the saliency target, the viewing angle of the symmetry target, and the viewing angle of the sky target specifically includes:
when a current saliency target, a current symmetry target, and a current sky target are detected in the currently detected panoramic video frame and/or double fisheye image frame, tracking the current saliency target, the current symmetry target, and the current sky target in subsequent panoramic video frames and/or double fisheye image frames in turn using the preset target tracking algorithm, and acquiring the viewing angle of the current saliency target, the viewing angle of the current symmetry target, and the viewing angle of the current sky target.
With reference to the first aspect, in a possible implementation, after acquiring the viewing angle of the current saliency target, the viewing angle of the current symmetry target, and the viewing angle of the current sky target, the method further includes:
respectively detecting stop-tracking events of the current saliency target, the current symmetry target, and the current sky target; when a stop-tracking event of the current saliency target, the current symmetry target, or the current sky target is detected, respectively jumping back to the step of performing the frame extraction operation on the panoramic video, and continuing to separately identify and acquire a saliency target, a symmetry target, and a sky target.
With reference to the first aspect, in a possible implementation, after tracking the current saliency target, the current symmetry target, and the current sky target in subsequent panoramic video frames and/or double fisheye image frames in turn using the preset target tracking algorithm and acquiring the viewing angle of the current saliency target, the viewing angle of the current symmetry target, and the viewing angle of the current sky target, the method specifically further includes:
respectively obtaining the center coordinates of the tracking boxes of the current saliency target, the current symmetry target, and the current sky target in the currently detected panoramic video frame and/or double fisheye image frame, and respectively calculating the spherical viewpoint coordinates of the current saliency target, the current symmetry target, and the current sky target from the center coordinates;
respectively obtaining, according to the spherical viewpoint coordinates, the lens images corresponding to the viewing angle of the saliency target, the viewing angle of the symmetry target, and the viewing angle of the sky target;
respectively generating a saliency target view segment, a symmetry target view segment, and a sky target view segment from the lens images corresponding to the viewing angle of the saliency target, the viewing angle of the symmetry target, and the viewing angle of the sky target.
With reference to the first aspect, in a possible implementation, the stop-tracking event is the loss of the current saliency target, the current symmetry target, or the current sky target, or the area of the tracking box becoming smaller than a preset area.
With reference to the first aspect, in a possible implementation, editing the panoramic video according to the forward-direction viewing angle, the viewing angle of the saliency target, the viewing angle of the symmetry target, and the viewing angle of the sky target to generate the target video corresponding to the panoramic video specifically includes:
according to the duration of the panoramic video;
and/or the number and corresponding durations of the saliency target view segments and/or sky target view segments and/or symmetry target view segments;
and/or the relationship between the viewing angle of the saliency target and/or the viewing angle of the symmetry target and/or the viewing angle of the sky target and the forward-direction viewing angle;
setting the number of saliency target view segments and/or sky target view segments and/or symmetry target view segments to be played and the corresponding playback speeds, and automatically editing the saliency target view segments and/or sky target view segments and/or symmetry target view segments to generate the target video corresponding to the panoramic video;
wherein the target video is a single-view video or a planar video.
With reference to the first aspect, in a possible implementation, before setting the number of saliency target view segments and/or sky target view segments and/or symmetry target view segments to be played and the corresponding playback speeds, the method further includes:
setting the lens rotation mode and/or the corresponding playback speed according to the duration of the panoramic video and the number and duration of the saliency target view segments and/or sky target view segments and/or symmetry target view segments.
In a second aspect, the present invention provides a panoramic video editing apparatus, characterized in that the apparatus includes:
an acquisition module: configured to acquire a panoramic video shot by a panoramic camera and record the forward-direction viewing angle of the panoramic camera during shooting;
a frame extraction module: configured to perform a frame extraction operation on the acquired panoramic video to obtain corresponding panoramic video frames and/or double fisheye image frames;
a recognition module: configured to identify and acquire a saliency target, a symmetry target, and a sky target according to the panoramic video frames and/or double fisheye image frames;
a tracking module: configured to track the saliency target, the symmetry target, and the sky target using a preset target tracking algorithm to obtain the viewing angle of the saliency target, the viewing angle of the symmetry target, and the viewing angle of the sky target;
a processing module: configured to traverse the panoramic video and edit the panoramic video according to the forward-direction viewing angle, the viewing angle of the saliency target, the viewing angle of the symmetry target, and the viewing angle of the sky target, to generate a target video corresponding to the panoramic video.
In a third aspect, the present invention provides a computer-readable storage medium storing a computer program, characterized in that, when the computer program is executed by a processor, the steps of the method according to the first aspect are implemented.
In a fourth aspect, the present invention provides a panoramic video editing device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the method according to the first aspect.
Beneficial Effects
The present invention provides a panoramic video editing method: a panoramic video shot by a panoramic camera is acquired, and the forward-direction viewing angle of the panoramic camera during shooting is recorded; a frame extraction operation is performed on the acquired panoramic video to obtain corresponding panoramic video frames and/or double fisheye image frames; a saliency target, a symmetry target, and a sky target are identified and acquired according to the panoramic video frames and/or double fisheye image frames; the saliency target, the symmetry target, and the sky target are tracked using a preset target tracking algorithm to obtain the viewing angle of the saliency target, the viewing angle of the symmetry target, and the viewing angle of the sky target; the panoramic video is edited according to the forward-direction viewing angle, the viewing angle of the saliency target, the viewing angle of the symmetry target, and the viewing angle of the sky target, and a target video corresponding to the panoramic video is generated, thereby achieving automatic editing of the panoramic video while ensuring smooth transitions in the target video and effective, engaging content.
Brief Description of the Drawings
FIG. 1 is a flowchart of the implementation of the panoramic video editing method provided by Embodiment 1 of the present invention.
FIG. 2 is a schematic structural diagram of the panoramic video editing apparatus provided by Embodiment 2 of the present invention.
FIG. 3 is a schematic structural diagram of the panoramic video editing device provided by Embodiment 3 of the present invention.
Detailed Description of the Embodiments
In order to make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only intended to explain the present invention and are not intended to limit it.
The specific implementation of the present invention is described in detail below with reference to specific embodiments:
Embodiment 1:
FIG. 1 shows the implementation flow of the panoramic video editing method provided by Embodiment 1 of the present invention. The panoramic video editing method provided by the embodiments of the present invention can be applied to a computing device, where the computing device may be, but is not limited to, a personal computer, a notebook computer, a smartphone, a tablet computer, or a panoramic camera.
For ease of description, FIG. 1 only shows the parts related to the embodiment of the present invention, detailed as follows:
S101: acquire a panoramic video shot by a panoramic camera, and record the forward-direction viewing angle of the panoramic camera during shooting;
The embodiment of the present invention is applicable to panoramic video editing. The panoramic video is shot by a panoramic camera composed of two or more fisheye lenses, and the resulting panoramic video is an original spherical video. When recording the forward-direction viewing angle of the panoramic camera during shooting, specifically, the forward direction of the panoramic camera during mobile shooting is obtained from the panoramic video, and the lens image corresponding to the forward-direction viewing angle is then obtained, so that the forward-direction viewing angle of the panoramic camera is recorded through that lens image.
Specifically, obtaining the forward direction of the panoramic camera during mobile shooting from the panoramic video and then obtaining the lens image corresponding to the forward-direction viewing angle may further specifically be:
obtaining the rotation of the panoramic camera relative to the world coordinate system when the current video frame was shot, and the multi-channel fisheye images corresponding to the current video frame and the previous video frame of the panoramic video;
separately extracting corner points from the multi-channel fisheye images corresponding to the previous video frame of the panoramic video to obtain corner point sequences to be tracked;
separately tracking the corner point sequences to be tracked to obtain matching point pairs to be tracked in the fisheye images corresponding to the current video frame and the previous video frame;
optimizing, according to the matching point pairs, the displacement of the current video frame of the panoramic camera relative to the previous video frame to obtain an optimized displacement;
taking the optimized displacement as the forward direction of a virtual camera, calculating the rotation matrix of the current virtual camera, and performing transition rendering on the current video frame of the panoramic video using the rotation of the panoramic camera relative to the world coordinate system when the current video frame was shot and the rotation matrix of the current virtual camera.
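The corner-tracking step above can be illustrated with a minimal sketch. This is a hypothetical example assuming OpenCV is available and consecutive frames are supplied as grayscale images; it only averages tracked corner displacements and maps the horizontal component to a yaw offset, rather than reproducing the patent's full displacement optimization and transition-rendering pipeline.

```python
import cv2
import numpy as np

def estimate_forward_yaw(prev_gray, curr_gray):
    """Roughly estimate a forward-direction yaw offset (radians) between two
    consecutive frames by tracking corner points with sparse optical flow."""
    corners = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200,
                                      qualityLevel=0.01, minDistance=8)
    if corners is None:
        return 0.0
    tracked, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, corners, None)
    good = status.flatten() == 1
    if not np.any(good):
        return 0.0
    # Mean horizontal displacement of the matched point pairs; mapping pixel shift
    # to yaw assumes an equirectangular-like relation between x and longitude.
    dx = float(np.mean(tracked[good, 0, 0] - corners[good, 0, 0]))
    return 2.0 * np.pi * dx / prev_gray.shape[1]
```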
S102: perform a frame extraction operation on the acquired panoramic video to obtain corresponding panoramic video frames and/or double fisheye image frames;
In the embodiment of the present invention, the frame extraction operation is performed on the acquired panoramic video at a preset time interval to obtain the corresponding panoramic video frames and/or double fisheye image frames.
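For illustration, frame extraction at a preset time interval could look like the following sketch; it assumes an OpenCV-readable video file and a one-second interval, which are illustrative choices rather than values given in the patent.

```python
import cv2

def extract_frames(video_path, interval_s=1.0):
    """Read the panoramic (or dual-fisheye) video and keep one frame every interval_s seconds."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0   # fall back if FPS metadata is missing
    step = max(1, int(round(fps * interval_s)))
    frames, idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % step == 0:
            frames.append(frame)
        idx += 1
    cap.release()
    return frames
```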
S103: identify and acquire a saliency target, a symmetry target, and a sky target according to the panoramic video frames and/or double fisheye image frames;
In Embodiment 1 of the present invention, the panoramic video frames and/or double fisheye image frames are detected in sequence, and the saliency target, the symmetry target, and the sky target are identified and acquired through the following steps:
(1) In the embodiment of the present invention, when saliency target detection is performed on the panoramic video frames and/or double fisheye image frames, if saliency targets of preset target categories are detected in the currently detected panoramic video frame and/or double fisheye image frame, the target with the largest saliency value is set as the current saliency target of the currently detected panoramic video frame and/or double fisheye image frame, so that the saliency target to be tracked is accurately obtained when multiple saliency targets are detected; if no saliency target of the target categories is detected in the currently detected panoramic video frame and/or double fisheye image frame, it is confirmed that no saliency target exists in that frame. The target categories may be set according to the preset shooting scene of the panoramic camera to further improve the accuracy of target detection; for example, the target categories may be sculptures, stone tablets, flower beds, and landmark buildings, and the saliency targets under the sculpture category may include animal sculptures, plant sculptures, human sculptures, and the like.
(2) In Embodiment 1 of the present invention, performing symmetry target detection on the panoramic video frames and/or double fisheye image frames specifically includes:
obtaining the timestamp of the panoramic video frame and/or double fisheye image frame according to the panoramic video frame and/or double fisheye image frame;
obtaining the first rotation matrix of the panoramic video frame and/or double fisheye image frame according to the timestamp of the panoramic video frame and/or double fisheye image frame;
rendering the panoramic video frame and/or double fisheye image frame according to the first rotation matrix to obtain the vertically upward image of the panoramic video frame and/or double fisheye image frame;
detecting the vertically upward image of the panoramic video frame and/or double fisheye image frame using a preset symmetry target detection and recognition algorithm to acquire a symmetry target.
(3) In Embodiment 1 of the present invention, performing sky target detection on the panoramic video frames and/or double fisheye image frames specifically includes:
obtaining the timestamp of the panoramic video frame and/or double fisheye image frame according to the panoramic video frame and/or double fisheye image frame;
obtaining the first rotation matrix of the panoramic video frame and/or double fisheye image frame according to the timestamp of the panoramic video frame and/or double fisheye image frame;
decomposing the travel-direction matrix of the panoramic video frame and/or double fisheye image frame into first Euler angles (Yaw, Pitch, and Roll respectively), and setting the Pitch angle to PI/2 to obtain second Euler angles;
converting the second Euler angles to obtain a second rotation matrix;
rendering the panoramic video frame and/or double fisheye image frame according to the second rotation matrix to obtain the vertically upward image of the panoramic video frame and/or double fisheye image frame;
detecting the vertically upward image of the panoramic video frame and/or double fisheye image frame using a preset sky target detection and recognition algorithm to acquire a sky target.
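A minimal sketch of the Pitch-to-PI/2 step is given below. It assumes SciPy's Rotation helper and a ZYX (Yaw-Pitch-Roll) Euler convention; the patent does not specify either, so both are illustrative assumptions.

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

def upward_rotation(forward_matrix):
    """Given the frame's travel-direction rotation matrix, return a second rotation
    matrix whose pitch is fixed to PI/2 so the rendered view points vertically up."""
    yaw, pitch, roll = R.from_matrix(forward_matrix).as_euler("ZYX")  # first Euler angles
    pitch = np.pi / 2.0                                               # force Pitch = PI/2
    return R.from_euler("ZYX", [yaw, pitch, roll]).as_matrix()        # second rotation matrix
```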
(4) In Embodiment 1 of the present invention, the panoramic video frames and/or double fisheye image frames are detected in sequence, and the saliency target, the symmetry target, and the sky target are identified and acquired. Specifically, when target detection and recognition are performed on the panoramic video frames and/or double fisheye image frames, algorithms including but not limited to the FT (Frequency-tuned Salient Region Detection) algorithm or superpixel convolutional neural networks (e.g., A Superpixelwise Convolutional Neural Network for Salient Object Detection) may be used, thereby improving the accuracy of target detection while ensuring its stability.
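As an illustration of the FT-style detection named above, the sketch below computes a frequency-tuned saliency map and keeps the region with the largest summed saliency, mirroring the "target with the largest saliency value" rule; the Otsu thresholding and bounding-box extraction are assumptions added for the example, not steps from the patent.

```python
import cv2
import numpy as np

def ft_saliency(bgr):
    """Frequency-tuned saliency: distance of each Lab pixel from the mean Lab colour."""
    lab = cv2.cvtColor(cv2.GaussianBlur(bgr, (5, 5), 0), cv2.COLOR_BGR2LAB).astype(np.float32)
    mean_lab = lab.reshape(-1, 3).mean(axis=0)
    return np.linalg.norm(lab - mean_lab, axis=2)

def most_salient_box(bgr):
    """Return the bounding box (x, y, w, h) of the region with the largest summed saliency."""
    sal = ft_saliency(bgr)
    norm = cv2.normalize(sal, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    _, mask = cv2.threshold(norm, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None  # no saliency target in this frame
    def region_score(c):
        x, y, w, h = cv2.boundingRect(c)
        return sal[y:y + h, x:x + w].sum()
    return cv2.boundingRect(max(contours, key=region_score))
```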
S104: track the saliency target, the symmetry target, and the sky target using a preset target tracking algorithm to obtain the viewing angle of the saliency target, the viewing angle of the symmetry target, and the viewing angle of the sky target;
In Embodiment 1 of the present invention, when a current saliency target, a current symmetry target, and a current sky target are detected in the currently detected panoramic video frame and/or double fisheye image frame, the current saliency target, the current symmetry target, and the current sky target are tracked in subsequent panoramic video frames and/or double fisheye image frames in turn using the preset target tracking algorithm, and the viewing angle of the current saliency target, the viewing angle of the current symmetry target, and the viewing angle of the current sky target are acquired.
Specifically, the preset target tracking algorithm may include, but is not limited to, the KCF (High-speed Tracking with Kernelized Correlation Filters) algorithm or the DSST (Accurate Scale Estimation for Robust Visual Tracking) algorithm.
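The tracking step can be sketched with OpenCV's KCF implementation as below. This is only an illustration: it assumes an opencv-contrib build that provides TrackerKCF_create, and it treats a tracker failure as one form of the "stop tracking event" (losing the target).

```python
import cv2

def track_target(frames, init_box):
    """frames: list of BGR frames; init_box: (x, y, w, h) of the detected target in frames[0].
    Returns the tracked boxes until the target is lost."""
    tracker = cv2.TrackerKCF_create()
    tracker.init(frames[0], init_box)
    boxes = [init_box]
    for frame in frames[1:]:
        ok, box = tracker.update(frame)
        if not ok:
            break  # stop-tracking event: the current target was lost
        boxes.append(tuple(int(v) for v in box))
    return boxes
```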
In Embodiment 1 of the present invention, after the viewing angle of the current saliency target, the viewing angle of the current symmetry target, and the viewing angle of the current sky target are acquired, the method further includes:
respectively detecting stop-tracking events of the current saliency target, the current symmetry target, and the current sky target; when a stop-tracking event of the current saliency target, the current symmetry target, or the current sky target is detected, respectively jumping back to the step of performing the frame extraction operation on the panoramic video, and continuing to separately identify and acquire a saliency target, a symmetry target, and a sky target.
In Embodiment 1 of the present invention, after the current saliency target, the current symmetry target, and the current sky target are tracked in subsequent panoramic video frames and/or double fisheye image frames in turn using the preset target tracking algorithm and the viewing angle of the current saliency target, the viewing angle of the current symmetry target, and the viewing angle of the current sky target are acquired, the method specifically further includes:
respectively obtaining the center coordinates of the tracking boxes of the current saliency target, the current symmetry target, and the current sky target in the currently detected panoramic video frame and/or double fisheye image frame, and respectively calculating the spherical viewpoint coordinates of the current saliency target, the current symmetry target, and the current sky target from the center coordinates;
respectively obtaining, according to the spherical viewpoint coordinates, the lens images corresponding to the viewing angle of the saliency target, the viewing angle of the symmetry target, and the viewing angle of the sky target;
respectively generating a saliency target view segment, a symmetry target view segment, and a sky target view segment from the lens images corresponding to the viewing angle of the saliency target, the viewing angle of the symmetry target, and the viewing angle of the sky target.
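A small sketch of turning a tracking-box center into spherical viewpoint angles is shown below. It assumes the frame is an equirectangular panorama whose x axis maps linearly to longitude (yaw) and y axis to latitude (pitch); that layout is an assumption for the example, not something stated in the patent.

```python
import numpy as np

def pixel_to_sphere(cx, cy, width, height):
    """Map a box center (cx, cy) of a W x H equirectangular frame to (yaw, pitch)
    in radians: yaw in [-pi, pi], pitch in [-pi/2, pi/2]."""
    yaw = (cx / width - 0.5) * 2.0 * np.pi
    pitch = (0.5 - cy / height) * np.pi
    return yaw, pitch
```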
In Embodiment 1 of the present invention, the stop-tracking event is the loss of the current saliency target, the current symmetry target, or the current sky target, or the area of the tracking box becoming smaller than a preset area.
S105: traverse the panoramic video, edit the panoramic video according to the forward-direction viewing angle, the viewing angle of the saliency target, the viewing angle of the symmetry target, and the viewing angle of the sky target, and generate a target video corresponding to the panoramic video.
In Embodiment 1 of the present invention, editing the panoramic video according to the forward-direction viewing angle, the viewing angle of the saliency target, the viewing angle of the symmetry target, and the viewing angle of the sky target to generate the target video corresponding to the panoramic video specifically includes:
according to the duration of the panoramic video;
and/or the number and corresponding durations of the saliency target view segments and/or sky target view segments and/or symmetry target view segments;
and/or the relationship between the viewing angle of the saliency target and/or the viewing angle of the symmetry target and/or the viewing angle of the sky target and the forward-direction viewing angle;
setting the number of saliency target view segments and/or sky target view segments and/or symmetry target view segments to be played and the corresponding playback speeds, and automatically editing the saliency target view segments and/or sky target view segments and/or symmetry target view segments to generate the target video corresponding to the panoramic video;
wherein the target video is a single-view video or a planar video.
In Embodiment 1 of the present invention, before setting the number of saliency target view segments and/or sky target view segments and/or symmetry target view segments to be played and the corresponding playback speeds, the method further includes:
setting the lens rotation mode and/or the corresponding playback speed according to the duration of the panoramic video and the number and duration of the saliency target view segments and/or sky target view segments and/or symmetry target view segments.
In Embodiment 1 of the present invention, the above steps may further specifically be:
traversing the panoramic video frames and/or double fisheye image frames; when no saliency target, symmetry target, or sky target exists in the traversed panoramic video frame and/or double fisheye image frame, clipping the lens image corresponding to the forward-direction viewing angle in the panoramic video frame and/or double fisheye image frame, and setting the playback speed of the lens image;
when a saliency target exists in the panoramic video frame and/or double fisheye image frame, clipping the lens image corresponding to the viewing angle of the saliency target in the panoramic video frame and/or double fisheye image frame according to the duration of the saliency target view segment and the time interval between two adjacent saliency segments, and setting the playback speed of the lens image;
when a symmetry target exists in the panoramic video frame and/or double fisheye image frame, determining the lens image corresponding to the viewing angle of the symmetry target to be clipped in the panoramic video frame and/or double fisheye image frame according to the duration of the symmetry target view segment and the duration of the panoramic video, and setting the playback speed of the lens image;
when a sky target exists in the panoramic video frame and/or double fisheye image frame, determining the lens image corresponding to the viewing angle of the sky target to be clipped in the panoramic video frame and/or double fisheye image frame according to preset rules, and setting the playback speed of the lens image;
generating the target video corresponding to the panoramic video from the clipped lens images and the corresponding playback speeds that have been set;
wherein the target video is a single-view video or a planar video.
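The per-frame view selection described above can be summarized with the following sketch; the frame dictionaries and the priority order among target types are illustrative assumptions, since the patent only states which view is used when a given target type is present.

```python
def select_views(frames):
    """frames: iterable of dicts with optional 'salient', 'symmetric', 'sky' detections.
    Returns the chosen view label per frame, falling back to the forward direction."""
    plan = []
    for f in frames:
        if f.get("salient"):
            plan.append("salient")
        elif f.get("symmetric"):
            plan.append("symmetric")
        elif f.get("sky"):
            plan.append("sky")
        else:
            plan.append("forward")  # no target in this frame: use the forward-direction view
    return plan
```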
In Embodiment 1 of the present invention, clipping the lens image corresponding to the viewing angle of the saliency target in the panoramic video frame and/or double fisheye image frame according to the duration of the saliency target view segment and the time interval between two adjacent saliency segments, and setting the playback speed of the lens image, may further specifically be:
when the duration of the saliency target view segment is less than a preset first threshold, the saliency target view segment is discarded; when the duration of the saliency target view segment is greater than the preset first threshold and less than a preset second threshold, the saliency target view segment is expanded accordingly according to a preset expansion rule, and if the expanded saliency target view segment is still smaller than the preset second threshold, the saliency target view segment is discarded, while if the expanded saliency target view segment is larger than the preset second threshold, the expanded saliency target view segment is kept and the playback speed of the lens image is set to a preset first speed; when the duration of the saliency target view segment is greater than the preset second threshold and the time interval between the saliency target view segment and the previous saliency target view segment is less than a preset threshold, the playback speed of the lens image is set to the preset first speed; when the duration of the saliency target view segment is greater than the preset second threshold and the time interval between the saliency target view segment and the previous saliency target view segment is greater than the preset threshold, the playback speed of the lens images of the first half of the saliency view segment is set to the preset first speed and the playback speed of the lens images of the second half of the saliency view segment is set to a preset second speed, where the first speed may be greater than or less than the second speed, and the duration of the first half of the saliency view segment may be greater than or less than the duration of the second half;
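A compact sketch of these duration rules is given below. All numeric thresholds and speed values are placeholders (the patent only speaks of preset thresholds and preset first/second speeds), and the expansion rule is passed in as a callable.

```python
FIRST_THRESH = 2.0    # seconds, stand-in for the preset first threshold
SECOND_THRESH = 6.0   # seconds, stand-in for the preset second threshold
GAP_THRESH = 10.0     # seconds, stand-in for the preset gap threshold
SPEED_1, SPEED_2 = 1.0, 2.0  # stand-ins for the preset first/second playback speeds

def plan_saliency_segment(duration, gap_to_prev, expand):
    """Return None to discard the segment, or (duration, [playback speeds])."""
    if duration < FIRST_THRESH:
        return None
    if duration < SECOND_THRESH:
        duration = expand(duration)          # preset expansion rule
        if duration < SECOND_THRESH:
            return None                      # still too short after expansion
        return duration, [SPEED_1]
    # duration already exceeds the second threshold
    if gap_to_prev < GAP_THRESH:
        return duration, [SPEED_1]
    # long segment far from the previous one: two halves at different speeds
    return duration, [SPEED_1, SPEED_2]
```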
In Embodiment 1 of the present invention, determining the lens image corresponding to the viewing angle of the symmetry target to be clipped in the panoramic video frame and/or double fisheye image frame according to the duration of the symmetry target view segment and the duration of the panoramic video, and setting the playback speed of the lens image, may further specifically be:
when the duration of the symmetry target view segment is less than a preset threshold, the symmetry target view segment is discarded; when the duration of the panoramic video is greater than a preset threshold and the duration of the symmetry target view segment is also greater than a preset duration threshold, the playback speed of the lens image is set to a preset third speed; when the duration of the panoramic video is less than the preset threshold and the duration of the symmetry target view segment is greater than the preset threshold, the playback speed of the lens image is set to a preset fourth speed, where the preset third speed may be greater than or less than the preset fourth speed.
Preferably, in Embodiment 1 of the present invention, setting the lens rotation mode and/or the corresponding playback speed according to the duration of the panoramic video and the number and duration of the saliency target view segments and/or sky target view segments and/or symmetry target view segments may further specifically be:
calculating the rotation direction from the forward-direction viewing angle to the viewing angle of the saliency target; if the rotation direction is greater than a preset threshold, based on the rotation direction, the lens image corresponding to the viewing angle of the saliency target is first rotated clockwise and then rotated counterclockwise; if the rotation direction is less than the preset threshold, based on the rotation direction, the lens image corresponding to the viewing angle of the saliency target is first rotated counterclockwise and then rotated clockwise;
wherein the rotation direction may be the rotation angle from the forward-direction viewing angle to the viewing angle of the saliency target, and may specifically refer to the angle rotated in the Yaw direction (about the axis perpendicular to the ground) from the forward-direction viewing angle to the viewing angle of the saliency target.
Generating, according to the rotation result, the saliency target segment corresponding to the viewing angle of the saliency target;
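The yaw comparison that drives this rotation choice can be sketched as follows; wrapping the difference into [-pi, pi) and the particular threshold value are assumptions made for the example.

```python
import numpy as np

def yaw_rotation_plan(forward_yaw, target_yaw, thresh=np.pi / 2):
    """Return the signed yaw difference and the assumed rotation order."""
    delta = (target_yaw - forward_yaw + np.pi) % (2.0 * np.pi) - np.pi  # wrap to [-pi, pi)
    if abs(delta) > thresh:
        return delta, ("clockwise", "counterclockwise")
    return delta, ("counterclockwise", "clockwise")
```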
Preferably, in Embodiment 1 of the present invention, automatically editing the saliency target view segments and/or sky target view segments and/or symmetry target view segments to generate the target video corresponding to the panoramic video may further specifically be:
if the interval between the first saliency target segment and the second saliency target segment is less than a preset threshold, after the end of the first saliency target segment the video cuts directly to the lens image corresponding to the forward-direction viewing angle of the panoramic video frame and/or double fisheye image frame, and then enters the second saliency target segment;
if the interval between the first saliency target segment and the second saliency target segment is greater than the preset threshold, a preset symmetry target segment and/or sky target segment is inserted between the first saliency target segment and the second saliency target segment.
Preferably, in Embodiment 1 of the present invention, automatically editing the saliency target view segments and/or sky target view segments and/or symmetry target view segments to generate the target video corresponding to the panoramic video may further specifically be:
selecting key points of the saliency target view segments and/or sky target view segments and/or symmetry target view segments, combining the key points into a smooth curve according to preset rules, and then interpolating any point between the key points along the curve according to preset rules; through the above operations, smooth switching between the multiple target viewing angles can be guaranteed;
wherein time may be chosen as the abscissa of the key points and the relative rotation angle as the ordinate.
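As an illustration of combining key points (time on the abscissa, relative rotation angle on the ordinate) into a smooth curve and interpolating between them, the sketch below uses a cubic spline; the spline is an assumed stand-in for the patent's unspecified "preset rule".

```python
import numpy as np
from scipy.interpolate import CubicSpline

def smooth_view_angles(key_times, key_angles, frame_times):
    """key_times must be strictly increasing; key_angles are the relative rotation
    angles at those key points. Returns one interpolated angle per output frame."""
    spline = CubicSpline(np.asarray(key_times), np.asarray(key_angles))
    return spline(np.asarray(frame_times))
```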
In the present application, a panoramic video shot by a panoramic camera is acquired, and the forward-direction viewing angle of the panoramic camera during shooting is recorded; a frame extraction operation is performed on the acquired panoramic video to obtain corresponding panoramic video frames and/or double fisheye image frames; a saliency target, a symmetry target, and a sky target are identified and acquired according to the panoramic video frames and/or double fisheye image frames; the saliency target, the symmetry target, and the sky target are tracked using a preset target tracking algorithm to obtain the viewing angle of the saliency target, the viewing angle of the symmetry target, and the viewing angle of the sky target; the panoramic video is edited according to the forward-direction viewing angle, the viewing angle of the saliency target, the viewing angle of the symmetry target, and the viewing angle of the sky target, and a target video corresponding to the panoramic video is generated, thereby achieving automatic editing of the panoramic video while ensuring smooth transitions in the target video and effective, engaging content.
Embodiment 2:
FIG. 2 shows the structure of the panoramic video editing apparatus provided by Embodiment 2 of the present invention. For ease of description, only the parts related to the embodiment of the present invention are shown.
In the embodiment of the present invention, the panoramic video editing apparatus includes an acquisition module 21, a frame extraction module 22, a recognition module 23, a tracking module 24, and a processing module 25, wherein:
the acquisition module 21 is configured to acquire a panoramic video shot by a panoramic camera and record the forward-direction viewing angle of the panoramic camera during shooting;
the frame extraction module 22 is configured to perform a frame extraction operation on the acquired panoramic video to obtain corresponding panoramic video frames and/or double fisheye image frames;
the recognition module 23 is configured to identify and acquire a saliency target, a symmetry target, and a sky target according to the panoramic video frames and/or double fisheye image frames;
the tracking module 24 is configured to track the saliency target, the symmetry target, and the sky target using a preset target tracking algorithm to obtain the viewing angle of the saliency target, the viewing angle of the symmetry target, and the viewing angle of the sky target;
the processing module 25 is configured to traverse the panoramic video and edit the panoramic video according to the forward-direction viewing angle, the viewing angle of the saliency target, the viewing angle of the symmetry target, and the viewing angle of the sky target, to generate a target video corresponding to the panoramic video.
In the embodiment of the present invention, each module of the panoramic video editing apparatus may be implemented by corresponding hardware or software units; the modules may be independent software and hardware units or may be integrated into a single software and hardware unit, which is not intended to limit the present invention.
Embodiment 3:
In the embodiment of the present invention, a computer-readable storage medium is provided. The computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the steps in the above embodiment of the panoramic video editing method are implemented, for example, steps S101 to S105 shown in FIG. 1. Alternatively, when the computer program is executed by the processor, the functions of the units in the above apparatus embodiments are implemented, for example, the functions of units 21 to 25 shown in FIG. 2.
The computer-readable storage medium of the embodiment of the present invention may include any entity or device capable of carrying computer program code, or a recording medium, for example, memory such as ROM/RAM, a magnetic disk, an optical disk, or flash memory.
Embodiment 4:
FIG. 3 shows the structure of the panoramic video editing device provided by Embodiment 3 of the present invention. For ease of description, only the parts related to the embodiment of the present invention are shown.
The panoramic video editing device 3 of the embodiment of the present invention includes a processor 30, a memory 31, and a computer program 32 stored in the memory 31 and executable on the processor 30. When the processor 30 executes the computer program 32, the steps in the above embodiment of the panoramic video editing method are implemented, for example, steps S101 to S105 shown in FIG. 1. Alternatively, when the processor 30 executes the computer program 32, the functions of the modules in the above apparatus embodiments are implemented, for example, the functions of modules 21 to 25 shown in FIG. 2.
The panoramic video editing device of the embodiment of the present invention may be a smartphone, a personal computer, the panoramic camera itself, or the like. For the steps implemented when the processor 30 in the panoramic video editing device 3 executes the computer program 32 to implement the panoramic video editing method, reference may be made to the descriptions of the foregoing method embodiments, which are not repeated here.
The above are only preferred embodiments of the present invention and are not intended to limit the present invention. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.

Claims (15)

  1. A panoramic video automatic editing method, characterized in that the method comprises:
    acquiring a panoramic video shot by a panoramic camera, and recording the forward-direction viewing angle of the panoramic camera during shooting;
    performing a frame extraction operation on the acquired panoramic video to obtain corresponding panoramic video frames and/or double fisheye image frames;
    identifying and acquiring a saliency target, a symmetry target, and a sky target according to the panoramic video frames and/or double fisheye image frames;
    tracking the saliency target, the symmetry target, and the sky target using a preset target tracking algorithm to obtain the viewing angle of the saliency target, the viewing angle of the symmetry target, and the viewing angle of the sky target;
    traversing the panoramic video, and editing the panoramic video according to the forward-direction viewing angle, the viewing angle of the saliency target, the viewing angle of the symmetry target, and the viewing angle of the sky target, to generate a target video corresponding to the panoramic video.
  2. The method according to claim 1, characterized in that acquiring a panoramic video shot by a panoramic camera and recording the forward-direction viewing angle of the panoramic camera during shooting specifically comprises:
    obtaining, from the panoramic video, the forward direction of the panoramic camera during mobile shooting, and obtaining the lens image corresponding to the forward-direction viewing angle;
    obtaining the forward-direction viewing angle of the panoramic camera during shooting from the lens image;
    wherein the panoramic video is an original spherical video.
  3. The method according to claim 1, characterized in that identifying and acquiring a saliency target, a symmetry target, and a sky target according to the panoramic video frames and/or double fisheye image frames specifically comprises:
    detecting the panoramic video frames and/or double fisheye image frames using a preset saliency target detection and recognition algorithm to acquire a saliency target.
  4. The method according to claim 3, characterized in that detecting the panoramic video frames and/or double fisheye image frames to acquire a saliency target specifically further comprises:
    when saliency targets of preset saliency target categories are detected in the currently detected panoramic video frame and/or double fisheye image frame, setting the target with the largest saliency value as the current saliency target of the currently detected panoramic video frame and/or double fisheye image frame.
  5. The method according to claim 1, characterized in that identifying and acquiring a saliency target, a symmetry target, and a sky target according to the panoramic video frames and/or double fisheye image frames specifically comprises:
    obtaining the timestamp of the panoramic video frame and/or double fisheye image frame according to the panoramic video frame and/or double fisheye image frame;
    obtaining the first rotation matrix of the panoramic video frame and/or double fisheye image frame according to the timestamp of the panoramic video frame and/or double fisheye image frame;
    rendering the panoramic video frame and/or double fisheye image frame according to the first rotation matrix to obtain the vertically upward image of the panoramic video frame and/or double fisheye image frame;
    detecting the vertically upward image of the panoramic video frame and/or double fisheye image frame using a preset symmetry target detection and recognition algorithm to acquire a symmetry target.
  6. The method according to claim 1, characterized in that identifying and acquiring a saliency target, a symmetry target, and a sky target according to the panoramic video frames and/or double fisheye image frames specifically comprises:
    obtaining the timestamp of the panoramic video frame and/or double fisheye image frame according to the panoramic video frame and/or double fisheye image frame;
    obtaining the first rotation matrix of the panoramic video frame and/or double fisheye image frame according to the timestamp of the panoramic video frame and/or double fisheye image frame;
    decomposing the travel-direction matrix of the panoramic video frame and/or double fisheye image frame into first Euler angles (Yaw, Pitch, and Roll respectively), and setting the Pitch angle to PI/2 to obtain second Euler angles;
    converting the second Euler angles to obtain a second rotation matrix;
    rendering the panoramic video frame and/or double fisheye image frame according to the second rotation matrix to obtain the vertically upward image of the panoramic video frame and/or double fisheye image frame;
    detecting the vertically upward image of the panoramic video frame and/or double fisheye image frame using a preset sky target detection and recognition algorithm to acquire a sky target.
  7. The method according to claim 1, characterized in that tracking the saliency target, the symmetry target, and the sky target using a preset target tracking algorithm to obtain the viewing angle of the saliency target, the viewing angle of the symmetry target, and the viewing angle of the sky target specifically comprises:
    when a current saliency target, a current symmetry target, and a current sky target are detected in the currently detected panoramic video frame and/or double fisheye image frame, tracking the current saliency target, the current symmetry target, and the current sky target in subsequent panoramic video frames and/or double fisheye image frames in turn using the preset target tracking algorithm, and acquiring the viewing angle of the current saliency target, the viewing angle of the current symmetry target, and the viewing angle of the current sky target.
  8. The method according to claim 7, characterized in that, after acquiring the viewing angle of the current saliency target, the viewing angle of the current symmetry target, and the viewing angle of the current sky target, the method further comprises:
    respectively detecting stop-tracking events of the current saliency target, the current symmetry target, and the current sky target; when a stop-tracking event of the current saliency target, the current symmetry target, or the current sky target is detected, respectively jumping back to the step of performing the frame extraction operation on the panoramic video, and continuing to separately identify and acquire a saliency target, a symmetry target, and a sky target.
  9. The method according to claim 7, characterized in that, after tracking the current saliency target, the current symmetry target, and the current sky target in subsequent panoramic video frames and/or double fisheye image frames in turn using the preset target tracking algorithm and acquiring the viewing angle of the current saliency target, the viewing angle of the current symmetry target, and the viewing angle of the current sky target, the method specifically further comprises:
    respectively obtaining the center coordinates of the tracking boxes of the current saliency target, the current symmetry target, and the current sky target in the currently detected panoramic video frame and/or double fisheye image frame, and respectively calculating the spherical viewpoint coordinates of the current saliency target, the current symmetry target, and the current sky target from the center coordinates;
    respectively obtaining, according to the spherical viewpoint coordinates, the lens images corresponding to the viewing angle of the saliency target, the viewing angle of the symmetry target, and the viewing angle of the sky target;
    respectively generating a saliency target view segment, a symmetry target view segment, and a sky target view segment from the lens images corresponding to the viewing angle of the saliency target, the viewing angle of the symmetry target, and the viewing angle of the sky target.
  10. The method according to claim 6, characterized in that the stop-tracking event is the loss of the current saliency target, the current symmetry target, or the current sky target, or the area of the tracking box becoming smaller than a preset area.
  11. The method according to any one of claims 1 to 10, characterized in that editing the panoramic video according to the forward-direction viewing angle, the viewing angle of the saliency target, the viewing angle of the symmetry target, and the viewing angle of the sky target to generate the target video corresponding to the panoramic video specifically comprises:
    according to the duration of the panoramic video;
    and/or the number and corresponding durations of the saliency target view segments and/or sky target view segments and/or symmetry target view segments;
    and/or the relationship between the viewing angle of the saliency target and/or the viewing angle of the symmetry target and/or the viewing angle of the sky target and the forward-direction viewing angle;
    setting the number of saliency target view segments and/or sky target view segments and/or symmetry target view segments to be played and the corresponding playback speeds, and automatically editing the saliency target view segments and/or sky target view segments and/or symmetry target view segments to generate the target video corresponding to the panoramic video;
    wherein the target video is a single-view video or a planar video.
  12. The method according to claim 11, characterized in that, before setting the number of saliency target view segments and/or sky target view segments and/or symmetry target view segments to be played and the corresponding playback speeds, the method further comprises:
    setting the lens rotation mode and/or the corresponding playback speed according to the duration of the panoramic video and the number and duration of the saliency target view segments and/or sky target view segments and/or symmetry target view segments.
  13. A panoramic video editing apparatus, characterized in that the apparatus comprises:
    an acquisition module: configured to acquire a panoramic video shot by a panoramic camera and record the forward-direction viewing angle of the panoramic camera during shooting;
    a frame extraction module: configured to perform a frame extraction operation on the acquired panoramic video to obtain corresponding panoramic video frames and/or double fisheye image frames;
    a recognition module: configured to identify and acquire a saliency target, a symmetry target, and a sky target according to the panoramic video frames and/or double fisheye image frames;
    a tracking module: configured to track the saliency target, the symmetry target, and the sky target using a preset target tracking algorithm to obtain the viewing angle of the saliency target, the viewing angle of the symmetry target, and the viewing angle of the sky target;
    a processing module: configured to traverse the panoramic video and edit the panoramic video according to the forward-direction viewing angle, the viewing angle of the saliency target, the viewing angle of the symmetry target, and the viewing angle of the sky target, to generate a target video corresponding to the panoramic video.
  14. A computer-readable storage medium storing a computer program, characterized in that, when the computer program is executed by a processor, the steps of the method according to any one of claims 1 to 12 are implemented.
  15. A panoramic video editing device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 12.
PCT/CN2021/110259 2020-08-03 2021-08-03 Panoramic video editing method and apparatus, storage medium, and device WO2022028407A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010764998.8A CN114095780A (zh) 2020-08-03 2020-08-03 Panoramic video editing method and apparatus, storage medium, and device
CN202010764998.8 2020-08-03

Publications (1)

Publication Number Publication Date
WO2022028407A1 true WO2022028407A1 (zh) 2022-02-10

Family

ID=80119986

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/110259 WO2022028407A1 (zh) 2020-08-03 2021-08-03 Panoramic video editing method and apparatus, storage medium, and device

Country Status (2)

Country Link
CN (1) CN114095780A (zh)
WO (1) WO2022028407A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115243101A (zh) * 2022-06-20 2022-10-25 上海众源网络有限公司 Video motion-to-static ratio identification method and apparatus, electronic device, and storage medium

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114660097B (zh) * 2022-03-23 2023-06-02 成都智元汇信息技术股份有限公司 Synchronization correction method and system based on dual sources and dual viewing angles

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018223370A1 (zh) * 2017-06-09 2018-12-13 深圳大学 Video saliency detection method and system based on spatio-temporal constraints
CN110197126A (zh) * 2019-05-06 2019-09-03 深圳岚锋创视网络科技有限公司 Target tracking method and apparatus, and portable terminal
CN111163267A (zh) * 2020-01-07 2020-05-15 影石创新科技股份有限公司 Panoramic video editing method, apparatus, device, and storage medium
CN111242975A (zh) * 2020-01-07 2020-06-05 影石创新科技股份有限公司 Panoramic video rendering method capable of automatically adjusting viewing angle, storage medium, and computer device
US20200195847A1 (en) * 2017-08-31 2020-06-18 SZ DJI Technology Co., Ltd. Image processing method, and unmanned aerial vehicle and system

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018223370A1 (zh) * 2017-06-09 2018-12-13 深圳大学 Video saliency detection method and system based on spatio-temporal constraints
US20200195847A1 (en) * 2017-08-31 2020-06-18 SZ DJI Technology Co., Ltd. Image processing method, and unmanned aerial vehicle and system
CN110197126A (zh) * 2019-05-06 2019-09-03 深圳岚锋创视网络科技有限公司 Target tracking method and apparatus, and portable terminal
CN111163267A (zh) * 2020-01-07 2020-05-15 影石创新科技股份有限公司 Panoramic video editing method, apparatus, device, and storage medium
CN111242975A (zh) * 2020-01-07 2020-06-05 影石创新科技股份有限公司 Panoramic video rendering method capable of automatically adjusting viewing angle, storage medium, and computer device

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115243101A (zh) * 2022-06-20 2022-10-25 上海众源网络有限公司 Video motion-to-static ratio identification method and apparatus, electronic device, and storage medium
CN115243101B (zh) * 2022-06-20 2024-04-12 上海众源网络有限公司 Video motion-to-static ratio identification method and apparatus, electronic device, and storage medium

Also Published As

Publication number Publication date
CN114095780A (zh) 2022-02-25

Similar Documents

Publication Publication Date Title
WO2021139731A1 (zh) Panoramic video editing method, apparatus, device, and storage medium
WO2021227360A1 (zh) Interactive video projection method, apparatus, device, and storage medium
Lai et al. Semantic-driven generation of hyperlapse from 360 degree video
US9813607B2 (en) Method and apparatus for image capture targeting
WO2018201809A1 (zh) Image processing apparatus and method based on dual cameras
US10021381B2 (en) Camera pose estimation
WO2021139583A1 (zh) Panoramic video rendering method capable of automatically adjusting viewing angle, storage medium, and computer device
US10937142B2 (en) Arrangement for generating head related transfer function filters
US10586378B2 (en) Stabilizing image sequences based on camera rotation and focal length parameters
WO2022028407A1 (zh) Panoramic video editing method and apparatus, storage medium, and device
CN113973190A (zh) Video virtual background image processing method and apparatus, and computer device
WO2021217398A1 (zh) Image processing method and apparatus, movable platform and control terminal therefor, and computer-readable storage medium
US10297285B2 (en) Video data processing method and electronic apparatus
WO2019157922A1 (zh) Image processing method and apparatus, and AR device
El-Saban et al. Improved optimal seam selection blending for fast video stitching of videos captured from freely moving devices
JP2020053774A (ja) Imaging device and image recording method
TW201905850A (zh) Method for removing an object to be processed from an image and device for performing the method
WO2022206312A1 (zh) Automatic editing method and apparatus for panoramic video, terminal, and storage medium
CN110245549A (zh) Real-time face and object manipulation
Wang et al. Video stabilization: A comprehensive survey
CN109089058B (zh) Video picture processing method, electronic terminal, and apparatus
CN107392850B (zh) Image processing method and system
CN109889736B (zh) Image acquisition method, apparatus, and device based on dual cameras and multiple cameras
CN115589532A (zh) Anti-shake processing method and apparatus, electronic device, and readable storage medium
JP7492012B2 (ja) Panoramic video editing method, apparatus, device, and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21852642

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 12-07-2023)

122 Ep: pct application non-entry in european phase

Ref document number: 21852642

Country of ref document: EP

Kind code of ref document: A1