CN110602479A - Video conversion method and system - Google Patents

Video conversion method and system

Info

Publication number
CN110602479A
Authority
CN
China
Prior art keywords
background
foreground
dimensional
depth information
video
Prior art date
Legal status
Pending
Application number
CN201910858867.3A
Other languages
Chinese (zh)
Inventor
Lin Haipeng (林海鹏)
Current Assignee
HAILIN COMPUTER TECHNOLOGY (SHENZHEN) CO LTD
Original Assignee
HAILIN COMPUTER TECHNOLOGY (SHENZHEN) CO LTD
Priority date
Filing date
Publication date
Application filed by HAILIN COMPUTER TECHNOLOGY (SHENZHEN) CO LTD filed Critical HAILIN COMPUTER TECHNOLOGY (SHENZHEN) CO LTD
Priority to CN201910858867.3A priority Critical patent/CN110602479A/en
Publication of CN110602479A publication Critical patent/CN110602479A/en
Pending legal-status Critical Current

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00: Stereoscopic video systems; Multi-view video systems; Details thereof
        • H04N13/10: Processing, recording or transmission of stereoscopic or multi-view image signals
            • H04N13/106: Processing image signals
                • H04N13/156: Mixing image signals
        • H04N13/20: Image signal generators
            • H04N13/275: Image signal generators from 3D object models, e.g. computer-generated stereoscopic image signals
            • H04N13/293: Generating mixed stereoscopic images; Generating mixed monoscopic and stereoscopic images, e.g. a stereoscopic image overlay window on a monoscopic image background

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a video conversion method and system. The method comprises: dividing an input two-dimensional video into a plurality of frame images; performing scene segmentation on the frame images to divide them into a foreground and a background; judging whether the background moves: if the background is static, classifying the background objects with a deep learning algorithm and calculating the depth information of the background from the classification result; if the background moves, establishing a global motion model from the motion information of the background and calculating the depth information of the background from the model; and performing edge detection on the foreground to obtain an accurate foreground object contour, then calculating the depth information of the foreground from the position of that contour in the background and the depth information of the background. The method can accurately segment the background objects and foreground objects in a two-dimensional video, improves the accuracy of the calculated background and foreground depth information, and thereby improves the realism of the resulting three-dimensional video.

Description

Video conversion method and system
Technical Field
The present invention relates to the field of video conversion technologies, and in particular, to a video conversion method and system.
Background
Videos watched on electronic devices such as mobile phones, tablets and televisions are generally two-dimensional. Compared with three-dimensional video, two-dimensional video gives the viewer a noticeably weaker sense of depth, which has motivated techniques for converting two-dimensional video into three-dimensional video. However, current 2D-to-3D conversion mainly relies on depth-map-based synthesis, and the depth information estimated from a two-dimensional video is often not accurate enough.
Disclosure of Invention
The invention aims to provide a video conversion method and system that can accurately calculate the depth information of a two-dimensional image, so as to improve the realism of the resulting three-dimensional video.
To achieve the above object, the invention adopts the following technical solution:
a video conversion method, comprising the steps of:
dividing an input two-dimensional video into a plurality of frame images;
performing scene segmentation on the plurality of frame images to divide them into a foreground and a background;
judging whether the background moves: if the background is static, classifying the background objects with a deep learning algorithm and calculating the depth information of the background from the classification result; if the background moves, establishing a global motion model from the motion information of the background and calculating the depth information of the background from the global motion model;
performing three-dimensional reconstruction of the background according to its depth information to obtain a three-dimensional background;
performing edge detection on the foreground to obtain an accurate foreground object contour, calculating the depth information of the foreground from the position of that contour in the background and the depth information of the background, and performing three-dimensional reconstruction of the foreground according to its depth information to obtain a three-dimensional foreground;
and synthesizing the three-dimensional background and the three-dimensional foreground into a three-dimensional video and outputting it.
The above method can accurately segment the background objects and foreground objects in a two-dimensional video, improves the accuracy of the calculated background and foreground depth information, and thereby improves the realism of the resulting three-dimensional video.
To achieve the above object, the invention further adopts the following technical solution:
a video conversion system, comprising:
a frame dividing module, for dividing an input two-dimensional video into a plurality of frame images;
a scene segmentation module, for performing scene segmentation on the plurality of frame images to divide them into a foreground and a background;
a background type judging module, for judging whether the background moves: if the background is static, classifying the background objects with a deep learning algorithm and calculating the depth information of the background from the classification result; if the background moves, establishing a global motion model from the motion information of the background and calculating the depth information of the background from the global motion model;
a background three-dimensional reconstruction module, for performing three-dimensional reconstruction of the background according to its depth information to obtain a three-dimensional background;
a foreground three-dimensional reconstruction module, for performing edge detection on the foreground to obtain an accurate foreground object contour, calculating the depth information of the foreground from the position of that contour in the background and the depth information of the background, and performing three-dimensional reconstruction of the foreground according to its depth information to obtain a three-dimensional foreground;
and a three-dimensional video output module, for synthesizing the three-dimensional background and the three-dimensional foreground into a three-dimensional video and outputting it.
Drawings
FIG. 1 is a flow chart illustrating a video conversion method according to an embodiment;
FIG. 2 is a schematic structural diagram of a video conversion system according to an embodiment.
Detailed Description
To facilitate an understanding of the invention, the invention will now be described more fully with reference to the accompanying drawings. Preferred embodiments of the present invention are shown in the drawings. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention.
In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise; "several" means at least one, e.g., one, two, etc., unless specifically limited otherwise.
Referring to fig. 1, the present embodiment provides a video conversion method including step S10, step S20, step S30, step S40, step S50, and step S60, which are detailed as follows:
in step S10, the input two-dimensional video is divided into several frame images.
In step S20, the frame images are subjected to scene segmentation and divided into a foreground and a background.
In this embodiment, the foreground refers to the main targets of interest in the two-dimensional video, including but not limited to persons, animals, vehicles, and the like; the background refers to the objects in the two-dimensional video other than those targets of interest, including but not limited to the sky, the ground, trees, buildings, and the like. Specifically, the scene segmentation may use any method commonly employed by those skilled in the art, for example a segmentation method based on color information; this embodiment is not limited in this respect.
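The description leaves the segmentation method open and only names color-based segmentation as one possibility. As an illustration only, a minimal color-distance segmentation might be sketched as follows; the function name, the tolerance value, and the synthetic frame are all assumptions, not part of the patent:

```python
import numpy as np

def segment_scene(frame, bg_color, tol=30.0):
    """Split a frame into foreground/background masks by color distance.

    A pixel whose Euclidean distance to the dominant background color
    `bg_color` is within `tol` is labelled background; everything else
    is treated as foreground. This is only one of the color-based
    segmentation methods the description alludes to.
    """
    diff = frame.astype(np.float64) - np.asarray(bg_color, dtype=np.float64)
    dist = np.linalg.norm(diff, axis=-1)          # per-pixel color distance
    background_mask = dist <= tol
    return ~background_mask, background_mask

# Synthetic 8x8 RGB frame: blue "sky" background with a red "object".
frame = np.zeros((8, 8, 3), dtype=np.uint8)
frame[:, :] = (0, 0, 200)          # background color everywhere
frame[2:5, 2:5] = (200, 0, 0)      # a 3x3 foreground object
fg, bg = segment_scene(frame, bg_color=(0, 0, 200))
print(fg.sum(), bg.sum())          # 9 foreground pixels, 55 background pixels
```

Real scenes would of course need more robust cues (texture, motion), but the mask pair is the interface the later steps consume.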
In step S30, it is determined whether the background is moving, and if the background is stationary, a deep learning algorithm is used to classify the background objects, and the depth information of the background is calculated according to the classification result; if the background moves, a global motion model is established according to the motion information of the background, and the depth information of the background is calculated according to the global motion model.
In this embodiment, for a static background, a large number of background images of the sky, the ground, buildings, trees and the like are collected, and a feature database for these background categories is built up through continuous training and learning; the background objects of an input two-dimensional video frame can then be accurately classified against this database, so that the depth information of the background is accurately calculated. For a moving background, the depth information is calculated from a global motion model, i.e. a mathematical model describing the global motion of the video frames, which is mainly produced by camera operations including, but not limited to, rotation, translation, panning, tilting, zooming, and the like.
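The patent does not state how the global motion model is parameterised or fitted. One common choice is a 2-D affine model estimated by least squares from tracked point correspondences; the sketch below (function names, the static/moving threshold, and the synthetic correspondences are all assumptions for illustration) shows that idea:

```python
import numpy as np

def fit_global_affine(src, dst):
    """Least-squares fit of a 2-D affine global motion model dst ~ A @ src + t.

    `src`/`dst` are (N, 2) arrays of tracked point positions in two
    consecutive frames. How the points are tracked is outside the
    patent text; this only sketches the model-fitting step.
    """
    src = np.asarray(src, float)
    dst = np.asarray(dst, float)
    ones = np.ones((len(src), 1))
    X = np.hstack([src, ones])                     # (N, 3) design matrix
    # Solve X @ P = dst; P is 3x2 with rows [a11 a21], [a12 a22], [tx ty].
    P, *_ = np.linalg.lstsq(X, dst, rcond=None)
    A, t = P[:2].T, P[2]
    return A, t

def background_is_static(src, dst, thresh=0.5):
    """Declare the background static when the mean tracked displacement is tiny."""
    disp = np.linalg.norm(np.asarray(dst, float) - np.asarray(src, float), axis=1)
    return float(disp.mean()) < thresh

# Synthetic correspondences: the camera translates the scene by (3, -1).
src = np.array([[0, 0], [10, 0], [0, 10], [10, 10], [5, 5]], float)
dst = src + np.array([3.0, -1.0])
A, t = fit_global_affine(src, dst)
print(np.round(A, 3))                    # close to the identity matrix
print(np.round(t, 3))                    # close to [3, -1]
print(background_is_static(src, dst))    # False: the background moves
```

With the model in hand, motion magnitude per region can be turned into relative depth (stronger parallax suggests nearer structure), which is the step the patent then feeds into reconstruction.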
In step S40, a three-dimensional background is obtained by performing three-dimensional reconstruction of the background based on the depth information of the background.
In step S50, edge detection is performed on the foreground to obtain an accurate foreground object contour, the depth information of the foreground is calculated from the position of that contour in the background and the depth information of the background, and three-dimensional reconstruction of the foreground is performed according to its depth information to obtain a three-dimensional foreground.
In this embodiment, performing edge detection on the foreground yields an accurate foreground object contour, so the foreground object can be segmented precisely and the accuracy of the calculated foreground depth information is improved.
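The text says only that "edge detection" is applied to the foreground, without naming a detector. A Sobel gradient-magnitude detector is one standard realisation; the NumPy sketch below (the threshold and all names are assumptions) shows the idea on a synthetic image:

```python
import numpy as np

def sobel_edges(gray, thresh=1.0):
    """Gradient-magnitude edge map via 3x3 Sobel filters (valid region only).

    The description only says "edge detection"; Sobel is one standard
    choice (Canny would be another). Returns a boolean map two pixels
    smaller than the input in each dimension.
    """
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
    ky = kx.T
    g = gray.astype(float)
    H, W = g.shape
    gx = np.zeros((H - 2, W - 2))
    gy = np.zeros((H - 2, W - 2))
    # Accumulate the correlation as nine shifted, weighted copies of the image.
    for i in range(3):
        for j in range(3):
            patch = g[i:i + H - 2, j:j + W - 2]
            gx += kx[i, j] * patch
            gy += ky[i, j] * patch
    mag = np.hypot(gx, gy)
    return mag > thresh

# A bright square on a dark background: edges appear at the square border.
img = np.zeros((10, 10))
img[3:7, 3:7] = 10.0
edges = sobel_edges(img)
print(edges.any(), edges[4, 4])  # edges exist; the square interior is not an edge
```

The boolean edge map can then be traced into a closed contour, which is the "accurate foreground object contour" the step requires before computing foreground depth.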
In step S60, the three-dimensional background and the three-dimensional foreground are synthesized into a three-dimensional video, which is then output.
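Step S60 leaves the synthesis itself abstract. Since the background section names depth-map-based synthesis as the mainstream approach, a naive depth-image-based rendering (DIBR) pass can be sketched as follows; the disparity scale, the hole-filling strategy, and every name here are assumptions for illustration, not the patent's method:

```python
import numpy as np

def render_stereo(image, depth, max_disp=4):
    """Naive depth-image-based rendering: near pixels get large disparity.

    `depth` is normalised to [0, 1] with 1 = nearest. Each pixel is
    shifted left/right by half its disparity to form the two views;
    holes left by the warp are filled from the previous valid pixel on
    the row. Real DIBR pipelines handle occlusion and inpainting far
    more carefully; this only illustrates the synthesis step.
    """
    H, W = depth.shape
    disp = np.rint(depth * max_disp).astype(int)
    left = np.zeros_like(image)
    right = np.zeros_like(image)
    filled_l = np.zeros((H, W), bool)
    filled_r = np.zeros((H, W), bool)
    # Paint far-to-near (small d first) so nearer pixels overwrite correctly.
    for d in range(0, max_disp + 1):
        ys, xs = np.nonzero(disp == d)
        xl = np.clip(xs + d // 2, 0, W - 1)
        xr = np.clip(xs - (d - d // 2), 0, W - 1)
        left[ys, xl] = image[ys, xs]
        right[ys, xr] = image[ys, xs]
        filled_l[ys, xl] = True
        filled_r[ys, xr] = True
    # Crude hole filling: propagate the previous valid pixel along each row.
    for view, filled in ((left, filled_l), (right, filled_r)):
        for y in range(H):
            for x in range(1, W):
                if not filled[y, x]:
                    view[y, x] = view[y, x - 1]
    return left, right

img = np.tile(np.arange(8.0), (4, 1))            # horizontal ramp image
depth = np.zeros((4, 8))
depth[:, 3:5] = 1.0                              # a near object in the middle
left, right = render_stereo(img, depth)
print(left.shape, right.shape)
```

Running this per frame on the reconstructed background and foreground depth maps, then interleaving the left/right views, would yield the stereoscopic output the step describes.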
In summary, the video conversion method described above can accurately segment the background objects and foreground objects in a two-dimensional video, improves the accuracy of the calculated background and foreground depth information, and thereby improves the realism of the resulting three-dimensional video.
Referring to FIG. 2, the present application further provides a video conversion system comprising a frame dividing module 100, a scene segmentation module 200, a background type judging module 300, a background three-dimensional reconstruction module 400, a foreground three-dimensional reconstruction module 500, and a three-dimensional video output module 600, wherein:
the frame dividing module 100 divides an input two-dimensional video into a plurality of frame images.
The scene segmentation module 200 performs scene segmentation on the plurality of frames of images, and segments the plurality of frames of images into a foreground and a background.
The background type judging module 300 is used for judging whether the background moves or not, if the background is static, the background object classification is carried out on the background by adopting a deep learning algorithm, and the depth information of the background is calculated according to the classification result; if the background moves, a global motion model is established according to the motion information of the background, and the depth information of the background is calculated according to the global motion model.
And the background three-dimensional reconstruction module 400 is used for performing three-dimensional reconstruction on the background according to the depth information of the background to obtain a three-dimensional background.
The foreground three-dimensional reconstruction module 500 performs edge detection on the foreground to obtain an accurate foreground object contour, calculates the depth information of the foreground according to the position information of the foreground object contour in the background and the depth information of the background, and performs three-dimensional reconstruction on the foreground according to the depth information of the foreground to obtain the three-dimensional foreground.
And the three-dimensional video output module 600 synthesizes the three-dimensional background and the three-dimensional foreground into a three-dimensional video for output.
The video conversion system can accurately segment the background objects and foreground objects in a two-dimensional video, improves the accuracy of the calculated background and foreground depth information, and thereby improves the realism of the resulting three-dimensional video.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations are described, but any combination of these technical features should be considered within the scope of this specification as long as it contains no contradiction.
The above embodiments express only several implementations of the present invention, and their description is specific and detailed, but should not therefore be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and improvements without departing from the inventive concept, and these all fall within the protection scope of the present invention. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (2)

1. A video conversion method, comprising the steps of:
dividing an input two-dimensional video into a plurality of frame images;
performing scene segmentation on the plurality of frame images to divide them into a foreground and a background;
judging whether the background moves: if the background is static, classifying the background objects with a deep learning algorithm and calculating the depth information of the background from the classification result; if the background moves, establishing a global motion model from the motion information of the background and calculating the depth information of the background from the global motion model;
performing three-dimensional reconstruction of the background according to its depth information to obtain a three-dimensional background;
performing edge detection on the foreground to obtain an accurate foreground object contour, calculating the depth information of the foreground from the position of that contour in the background and the depth information of the background, and performing three-dimensional reconstruction of the foreground according to its depth information to obtain a three-dimensional foreground;
and synthesizing the three-dimensional background and the three-dimensional foreground into a three-dimensional video and outputting it.
2. A video conversion system, comprising:
a frame dividing module, for dividing an input two-dimensional video into a plurality of frame images;
a scene segmentation module, for performing scene segmentation on the plurality of frame images to divide them into a foreground and a background;
a background type judging module, for judging whether the background moves: if the background is static, classifying the background objects with a deep learning algorithm and calculating the depth information of the background from the classification result; if the background moves, establishing a global motion model from the motion information of the background and calculating the depth information of the background from the global motion model;
a background three-dimensional reconstruction module, for performing three-dimensional reconstruction of the background according to its depth information to obtain a three-dimensional background;
a foreground three-dimensional reconstruction module, for performing edge detection on the foreground to obtain an accurate foreground object contour, calculating the depth information of the foreground from the position of that contour in the background and the depth information of the background, and performing three-dimensional reconstruction of the foreground according to its depth information to obtain a three-dimensional foreground;
and a three-dimensional video output module, for synthesizing the three-dimensional background and the three-dimensional foreground into a three-dimensional video and outputting it.
CN201910858867.3A 2019-09-11 2019-09-11 Video conversion method and system Pending CN110602479A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910858867.3A CN110602479A (en) 2019-09-11 2019-09-11 Video conversion method and system


Publications (1)

Publication Number Publication Date
CN110602479A (en) 2019-12-20

Family

ID=68858847


Country Status (1)

Country Link
CN (1) CN110602479A (en)


Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101640809A (en) * 2009-08-17 2010-02-03 浙江大学 Depth extraction method of merging motion information and geometric information
CN101917636A (en) * 2010-04-13 2010-12-15 上海易维视科技有限公司 Method and system for converting two-dimensional video of complex scene into three-dimensional video
CN102223553A (en) * 2011-05-27 2011-10-19 山东大学 Method for converting two-dimensional video into three-dimensional video automatically
CN102263979A (en) * 2011-08-05 2011-11-30 清华大学 Depth map generation method and device for plane video three-dimensional conversion
CN102724532A (en) * 2012-06-19 2012-10-10 清华大学 Planar video three-dimensional conversion method and system using same
CN102724531A (en) * 2012-06-05 2012-10-10 上海易维视科技有限公司 Method and system for converting two-dimensional video into three-dimensional video


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022071875A1 (en) * 2020-09-30 2022-04-07 脸萌有限公司 Method and apparatus for converting picture into video, and device and storage medium
US11871137B2 (en) 2020-09-30 2024-01-09 Lemon Inc. Method and apparatus for converting picture into video, and device and storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191220