CN110602479A - Video conversion method and system - Google Patents
- Publication number
- CN110602479A CN110602479A CN201910858867.3A CN201910858867A CN110602479A CN 110602479 A CN110602479 A CN 110602479A CN 201910858867 A CN201910858867 A CN 201910858867A CN 110602479 A CN110602479 A CN 110602479A
- Authority
- CN
- China
- Prior art keywords
- background
- foreground
- dimensional
- depth information
- video
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/10—Processing, recording or transmission of stereoscopic or multi-view image signals
- H04N13/106—Processing image signals
- H04N13/156—Mixing image signals
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/20—Image signal generators
- H04N13/275—Image signal generators from 3D object models, e.g. computer-generated stereoscopic image signals
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/20—Image signal generators
- H04N13/293—Generating mixed stereoscopic images; Generating mixed monoscopic and stereoscopic images, e.g. a stereoscopic image overlay window on a monoscopic image background
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Image Analysis (AREA)
Abstract
The invention relates to a video conversion method and system. The video conversion method comprises the following steps: dividing an input two-dimensional video into a plurality of frame images; performing scene segmentation on the frame images to separate each into a foreground and a background; judging whether the background moves: if the background is static, classifying the background objects with a deep learning algorithm and calculating the depth information of the background from the classification result; if the background moves, establishing a global motion model from the motion information of the background and calculating the depth information of the background from the global motion model; and performing edge detection on the foreground to obtain an accurate foreground object contour, and calculating the depth information of the foreground from the position of the foreground object contour in the background and the depth information of the background. The method and system can accurately segment the background objects and foreground objects in a two-dimensional video, improve the accuracy of the calculated background and foreground depth information, and thereby improve the realism of the resulting three-dimensional video.
Description
Technical Field
The present invention relates to the field of video conversion technologies, and in particular, to a video conversion method and system.
Background
Videos watched on electronic devices such as mobile phones, tablets, or televisions are generally two-dimensional. Compared with three-dimensional video, two-dimensional video gives users an inferior visual experience, which has motivated techniques for converting two-dimensional video into three-dimensional video. However, current two-dimensional-to-three-dimensional conversion mainly relies on depth-map-based three-dimensional synthesis, and the depth information extracted from a two-dimensional video is often not accurate enough.
Disclosure of Invention
The invention aims to provide a video conversion method and system that can accurately calculate the depth information of a two-dimensional image, so as to improve the realism of the resulting three-dimensional video.
In order to achieve the above object, the invention adopts the following technical scheme:
a video conversion method, comprising the steps of:
dividing an input two-dimensional video into a plurality of frame images;
carrying out scene segmentation on the plurality of frames of images, and segmenting the plurality of frames of images into a foreground and a background;
judging whether the background moves, if the background is static, classifying background objects of the background by adopting a deep learning algorithm, and calculating the depth information of the background according to the classification result; if the background moves, establishing a global motion model according to the motion information of the background, and calculating the depth information of the background according to the global motion model;
performing three-dimensional reconstruction on the background according to the depth information of the background to obtain a three-dimensional background;
performing edge detection on the foreground to obtain an accurate foreground object contour, calculating the depth information of the foreground according to the position information of the foreground object contour in the background and the depth information of the background, and performing three-dimensional reconstruction on the foreground according to the depth information of the foreground to obtain a three-dimensional foreground;
and synthesizing the three-dimensional background and the three-dimensional foreground into a three-dimensional video and outputting the three-dimensional video.
The video conversion method comprises the following steps: dividing an input two-dimensional video into a plurality of frame images; performing scene segmentation on the frame images to separate each into a foreground and a background; judging whether the background moves: if the background moves, establishing a global motion model from the motion information of the background and calculating the depth information of the background from the global motion model; if the background is static, classifying the background objects with a deep learning algorithm and calculating the depth information of the background from the classification result; performing three-dimensional reconstruction of the background according to its depth information to obtain a three-dimensional background; performing edge detection on the foreground to obtain an accurate foreground object contour, calculating the depth information of the foreground from the position of the foreground object contour in the background and the depth information of the background, and performing three-dimensional reconstruction of the foreground according to its depth information to obtain a three-dimensional foreground; and synthesizing the three-dimensional background and the three-dimensional foreground into a three-dimensional video and outputting it. The method can accurately segment the background objects and foreground objects in the two-dimensional video, improve the accuracy of the calculated background and foreground depth information, and thereby improve the realism of the three-dimensional video.
In order to realize the purpose of the invention, the invention also adopts the following technical scheme:
a video conversion system comprising:
the frame dividing module is used for dividing the input two-dimensional video into a plurality of frame images;
the scene segmentation module is used for carrying out scene segmentation on the plurality of frames of images and segmenting the plurality of frames of images into a foreground and a background;
the background type judging module is used for judging whether the background moves or not, if the background is static, a deep learning algorithm is adopted to classify background objects of the background, and the depth information of the background is calculated according to the classification result; if the background moves, establishing a global motion model according to the motion information of the background, and calculating the depth information of the background according to the global motion model;
the background three-dimensional reconstruction module is used for performing three-dimensional reconstruction on the background according to the depth information of the background to obtain a three-dimensional background;
the foreground three-dimensional reconstruction module is used for carrying out edge detection on the foreground to obtain an accurate foreground object outline, calculating the depth information of the foreground according to the position information of the foreground object outline in the background and the depth information of the background, and carrying out three-dimensional reconstruction on the foreground according to the depth information of the foreground to obtain a three-dimensional foreground;
and the three-dimensional video output module synthesizes the three-dimensional background and the three-dimensional foreground into a three-dimensional video for output.
Drawings
FIG. 1 is a flow chart illustrating a video conversion method according to an embodiment;
fig. 2 is a schematic structural diagram of a video conversion system according to an embodiment.
Detailed Description
To facilitate an understanding of the invention, the invention will now be described more fully with reference to the accompanying drawings. Preferred embodiments of the present invention are shown in the drawings. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention.
In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise, and "several" means at least one, e.g., one, two, etc., unless specifically limited otherwise.
Referring to fig. 1, the present embodiment provides a video conversion method including step S10, step S20, step S30, step S40, step S50, and step S60, which are detailed as follows:
in step S10, the input two-dimensional video is divided into several frame images.
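Step S10 amounts to demultiplexing the video into individual frame images. As a minimal illustrative sketch (not the patent's implementation), a decoded clip can be modeled as a NumPy array of shape (T, H, W, 3) and split per frame; in practice a decoder such as OpenCV's `VideoCapture` would supply the frames.

```python
import numpy as np

def split_into_frames(video: np.ndarray) -> list:
    """Split a decoded 2D video, modeled as a (T, H, W, 3) uint8 array,
    into a list of per-frame images (step S10)."""
    return [video[t] for t in range(video.shape[0])]

# A synthetic 4-frame, 6x8 RGB clip stands in for a decoded video.
clip = np.zeros((4, 6, 8, 3), dtype=np.uint8)
frames = split_into_frames(clip)
```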
In step S20, the images of the frames are subjected to scene segmentation, and the images of the frames are segmented into foreground and background.
In this embodiment, the foreground refers to the main targets of interest in the two-dimensional video, including but not limited to people, animals, vehicles, and the like; the background refers to objects that are not the main targets of interest, including but not limited to the sky, the ground, trees, buildings, and the like. The frame images may be segmented with any scene segmentation method commonly used by those skilled in the art, for example a segmentation method based on color information; this embodiment is not limited in this respect.
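The color-information-based segmentation mentioned above can be sketched as a simple color-distance threshold. The reference background color and tolerance below are illustrative assumptions; a practical scene segmentation would be considerably more robust.

```python
import numpy as np

def foreground_mask(frame: np.ndarray, bg_color, tol: float = 30.0) -> np.ndarray:
    """Mark pixels whose color lies farther than `tol` from the reference
    background color as foreground (True); the rest are background."""
    dist = np.linalg.norm(frame.astype(np.int32) - np.asarray(bg_color), axis=-1)
    return dist > tol

frame = np.full((4, 4, 3), 200, dtype=np.uint8)  # uniform light "sky" background
frame[1:3, 1:3] = (30, 30, 30)                   # dark 2x2 foreground object
mask = foreground_mask(frame, bg_color=(200, 200, 200))
```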
In step S30, it is determined whether the background is moving, and if the background is stationary, a deep learning algorithm is used to classify the background objects, and the depth information of the background is calculated according to the classification result; if the background moves, a global motion model is established according to the motion information of the background, and the depth information of the background is calculated according to the global motion model.
In this embodiment, for a static background, a large number of background images of the sky, the ground, buildings, trees, and the like are acquired, and a feature database of these background categories is built through continuous training and learning. Background objects in an input two-dimensional video frame can then be accurately classified against this feature database, so that the depth information of the background is calculated accurately. For a moving background, the depth information of the background is calculated from a global motion model: a mathematical model that describes the global motion of the video frames, which is mainly caused by camera operations including, but not limited to, rotation, translation, horizontal panning, vertical panning, and zooming.
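A common concrete choice for such a global motion model (an assumption here, since the text does not fix one) is a 2D affine transform fitted by least squares to point correspondences between consecutive frames; the fitted parameters separate camera pan (the translation part) from zoom and rotation (the linear part).

```python
import numpy as np

def fit_affine_global_motion(src: np.ndarray, dst: np.ndarray):
    """Least-squares fit of a 2D affine model dst ~ A @ src + t from
    matched points (src, dst each of shape (n, 2)) between two frames."""
    n = src.shape[0]
    X = np.zeros((2 * n, 6))
    X[0::2, 0:2] = src   # x' = p0*x + p1*y + p4
    X[0::2, 4] = 1.0
    X[1::2, 2:4] = src   # y' = p2*x + p3*y + p5
    X[1::2, 5] = 1.0
    p, *_ = np.linalg.lstsq(X, dst.reshape(-1), rcond=None)
    A = np.array([[p[0], p[1]], [p[2], p[3]]])
    t = p[4:6]
    return A, t

# Synthetic camera motion: a 1.1x zoom plus a small pan.
rng = np.random.default_rng(0)
src = rng.uniform(0, 100, (20, 2))
dst = 1.1 * src + np.array([3.0, -2.0])
A, t = fit_affine_global_motion(src, dst)
```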
In step S40, a three-dimensional background is obtained by performing three-dimensional reconstruction of the background based on the depth information of the background.
In step S50, performing edge detection on the foreground to obtain an accurate foreground object contour, calculating depth information of the foreground according to position information of the foreground object contour in the background and depth information of the background, and performing three-dimensional reconstruction of the foreground according to the depth information of the foreground to obtain a three-dimensional foreground.
In this embodiment, the foreground is subjected to edge detection to obtain an accurate foreground object profile, so that the foreground object can be accurately segmented, and the accuracy of calculating the foreground depth information can be improved.
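The embodiment does not name a particular edge detector; as one hedged illustration, a Sobel gradient-magnitude map (a standard building block of detectors such as Canny) already localizes a step edge along an object boundary.

```python
import numpy as np

def sobel_edges(gray: np.ndarray, thresh: float = 1.0) -> np.ndarray:
    """Boolean edge map: True where the Sobel gradient magnitude of a
    2D grayscale image exceeds `thresh` (border pixels left False)."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
    ky = kx.T
    H, W = gray.shape
    gx = np.zeros((H, W))
    gy = np.zeros((H, W))
    for i in range(1, H - 1):
        for j in range(1, W - 1):
            patch = gray[i - 1:i + 2, j - 1:j + 2]
            gx[i, j] = (patch * kx).sum()
            gy[i, j] = (patch * ky).sum()
    return np.hypot(gx, gy) > thresh

img = np.zeros((8, 8))
img[:, 4:] = 1.0          # vertical step edge at column 4
edges = sobel_edges(img)
```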
In step S60, the three-dimensional background and the three-dimensional foreground are synthesized into a three-dimensional video, which is then output.
The present application further provides a video conversion system, referring to fig. 2, comprising a frame dividing module 100, a scene segmentation module 200, a background type determination module 300, a background three-dimensional reconstruction module 400, a foreground three-dimensional reconstruction module 500, and a three-dimensional video output module 600, wherein,
the frame dividing module 100 divides an input two-dimensional video into a plurality of frame images.
The scene segmentation module 200 performs scene segmentation on the plurality of frames of images, and segments the plurality of frames of images into a foreground and a background.
The background type judging module 300 is used for judging whether the background moves or not, if the background is static, the background object classification is carried out on the background by adopting a deep learning algorithm, and the depth information of the background is calculated according to the classification result; if the background moves, a global motion model is established according to the motion information of the background, and the depth information of the background is calculated according to the global motion model.
And the background three-dimensional reconstruction module 400 is used for performing three-dimensional reconstruction on the background according to the depth information of the background to obtain a three-dimensional background.
The foreground three-dimensional reconstruction module 500 performs edge detection on the foreground to obtain an accurate foreground object contour, calculates the depth information of the foreground according to the position information of the foreground object contour in the background and the depth information of the background, and performs three-dimensional reconstruction on the foreground according to the depth information of the foreground to obtain the three-dimensional foreground.
And the three-dimensional video output module 600 synthesizes the three-dimensional background and the three-dimensional foreground into a three-dimensional video for output.
The video conversion system can likewise accurately segment the background objects and foreground objects in a two-dimensional video, improve the accuracy of the calculated background and foreground depth information, and thereby improve the realism of the three-dimensional video.
The technical features of the embodiments described above may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described, but any combination that contains no contradiction should be considered within the scope of this specification.
The above-mentioned embodiments express only several implementations of the present invention, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and improvements without departing from the inventive concept, and these fall within the protection scope of the present invention. Therefore, the protection scope of this patent shall be subject to the appended claims.
Claims (2)
1. A video conversion method, comprising the steps of:
dividing an input two-dimensional video into a plurality of frame images;
carrying out scene segmentation on the plurality of frames of images, and segmenting the plurality of frames of images into a foreground and a background;
judging whether the background moves, if the background is static, classifying background objects of the background by adopting a deep learning algorithm, and calculating the depth information of the background according to the classification result; if the background moves, establishing a global motion model according to the motion information of the background, and calculating the depth information of the background according to the global motion model;
performing three-dimensional reconstruction on the background according to the depth information of the background to obtain a three-dimensional background;
performing edge detection on the foreground to obtain an accurate foreground object contour, calculating the depth information of the foreground according to the position information of the foreground object contour in the background and the depth information of the background, and performing three-dimensional reconstruction on the foreground according to the depth information of the foreground to obtain a three-dimensional foreground;
and synthesizing the three-dimensional background and the three-dimensional foreground into a three-dimensional video and outputting the three-dimensional video.
2. A video conversion system, comprising:
the frame dividing module is used for dividing the input two-dimensional video into a plurality of frame images;
the scene segmentation module is used for carrying out scene segmentation on the plurality of frames of images and segmenting the plurality of frames of images into a foreground and a background;
the background type judging module is used for judging whether the background moves or not, if the background is static, a deep learning algorithm is adopted to classify background objects of the background, and the depth information of the background is calculated according to the classification result; if the background moves, establishing a global motion model according to the motion information of the background, and calculating the depth information of the background according to the global motion model;
the background three-dimensional reconstruction module is used for performing three-dimensional reconstruction on the background according to the depth information of the background to obtain a three-dimensional background;
the foreground three-dimensional reconstruction module is used for carrying out edge detection on the foreground to obtain an accurate foreground object outline, calculating the depth information of the foreground according to the position information of the foreground object outline in the background and the depth information of the background, and carrying out three-dimensional reconstruction on the foreground according to the depth information of the foreground to obtain a three-dimensional foreground;
and the three-dimensional video output module synthesizes the three-dimensional background and the three-dimensional foreground into a three-dimensional video for output.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910858867.3A CN110602479A (en) | 2019-09-11 | 2019-09-11 | Video conversion method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910858867.3A CN110602479A (en) | 2019-09-11 | 2019-09-11 | Video conversion method and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110602479A true CN110602479A (en) | 2019-12-20 |
Family
ID=68858847
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910858867.3A Pending CN110602479A (en) | 2019-09-11 | 2019-09-11 | Video conversion method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110602479A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2022071875A1 (en) * | 2020-09-30 | 2022-04-07 | 脸萌有限公司 | Method and apparatus for converting picture into video, and device and storage medium |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101640809A (en) * | 2009-08-17 | 2010-02-03 | 浙江大学 | Depth extraction method of merging motion information and geometric information |
CN101917636A (en) * | 2010-04-13 | 2010-12-15 | 上海易维视科技有限公司 | Method and system for converting two-dimensional video of complex scene into three-dimensional video |
CN102223553A (en) * | 2011-05-27 | 2011-10-19 | 山东大学 | Method for converting two-dimensional video into three-dimensional video automatically |
CN102263979A (en) * | 2011-08-05 | 2011-11-30 | 清华大学 | Depth map generation method and device for plane video three-dimensional conversion |
CN102724532A (en) * | 2012-06-19 | 2012-10-10 | 清华大学 | Planar video three-dimensional conversion method and system using same |
CN102724531A (en) * | 2012-06-05 | 2012-10-10 | 上海易维视科技有限公司 | Method and system for converting two-dimensional video into three-dimensional video |
- 2019-09-11: application CN201910858867.3A filed in China; publication CN110602479A (en), status Pending
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101640809A (en) * | 2009-08-17 | 2010-02-03 | 浙江大学 | Depth extraction method of merging motion information and geometric information |
CN101917636A (en) * | 2010-04-13 | 2010-12-15 | 上海易维视科技有限公司 | Method and system for converting two-dimensional video of complex scene into three-dimensional video |
CN102223553A (en) * | 2011-05-27 | 2011-10-19 | 山东大学 | Method for converting two-dimensional video into three-dimensional video automatically |
CN102263979A (en) * | 2011-08-05 | 2011-11-30 | 清华大学 | Depth map generation method and device for plane video three-dimensional conversion |
CN102724531A (en) * | 2012-06-05 | 2012-10-10 | 上海易维视科技有限公司 | Method and system for converting two-dimensional video into three-dimensional video |
CN102724532A (en) * | 2012-06-19 | 2012-10-10 | 清华大学 | Planar video three-dimensional conversion method and system using same |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2022071875A1 (en) * | 2020-09-30 | 2022-04-07 | 脸萌有限公司 | Method and apparatus for converting picture into video, and device and storage medium |
US11871137B2 (en) | 2020-09-30 | 2024-01-09 | Lemon Inc. | Method and apparatus for converting picture into video, and device and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20191220 |