TW201911239A

TW201911239A - Method and apparatus for generating three-dimensional panoramic video

Info

Publication number: TW201911239A
Application number: TW107100258A
Authority: TW
Inventors: 鄭廷威
Original assignee: 威盛電子股份有限公司
Priority date: 2017-08-16
Filing date: 2018-01-04
Publication date: 2019-03-16
Also published as: TWI683280B

Abstract

A method and an apparatus for generating a three-dimensional (3D) panoramic video are provided. In the method, plural frames are captured from a panoramic video. Each frame is transformed into a polyhedral mapping projection comprising plural side planes, a top plane and a bottom plane. Displacements of plural pixels in the side planes are calculated by using the plural side planes of each frame, and displacements of plural pixels in the top plane and the bottom plane are calculated by using the displacements of the side planes. Then, the pixels in the side planes, the top plane and the bottom plane of each frame are shifted according to the displacements of the polyhedral mapping projection to generate a shifted polyhedral mapping projection. The shifted polyhedral mapping projection is transformed into a shifted frame in a two-dimensional (2D) space. The shifted frames and the corresponding frames form 3D images, and the 3D images are encoded into a 3D panoramic video.

Description

Method and apparatus for generating three-dimensional panoramic video

The present invention relates to a method and an apparatus for generating a video, and more particularly to a method and an apparatus for generating a three-dimensional (3D) panoramic video.

A spherical panoramic camera system mainly uses one or more cameras to capture a spherical panoramic video that covers a 360-degree field of view (FOV) along the horizontal axis and a 180-degree field of view along the vertical axis. In this way, the entire environment in every direction around the camera system (or the intended viewer) can be captured for use in, for example, virtual reality (VR) applications. In recent years, technology has advanced to the point where video captured by a spherical panoramic camera system can be presented stereoscopically on a display device.

However, most spherical panoramic video content is intended only for two-dimensional (2D) display. There is therefore a need to convert digital content presented as 2D spherical panoramic video into content presented as 3D spherical panoramic video.

The invention provides a method and an apparatus for generating a 3D panoramic video, in which the frames of a panoramic video are projected onto a polyhedron and the displacements of the pixels in each projection are calculated. The pixels are shifted according to these displacements to obtain frames with parallax, which can be combined with the original frames to produce a 3D panoramic video.

The method for generating a 3D panoramic video of the invention is applicable to an electronic device having a processor. In the method, a plurality of frames are captured from a panoramic video, and each frame is transformed into a polyhedral mapping projection that includes a plurality of side planes, a top plane and a bottom plane. Next, displacements of a plurality of pixels in the side planes are calculated by using the side planes of the polyhedral mapping projection of each frame, and displacements of a plurality of pixels in the top plane and the bottom plane are calculated by using the displacements of the side planes. Then, the pixels in the side planes, the top plane and the bottom plane of the polyhedral mapping projection of each frame are shifted according to the calculated displacements to generate a shifted polyhedral mapping projection. Finally, the shifted polyhedral mapping projection is transformed into a shifted frame in a 2D space format, and the shifted frame and the corresponding frame are combined into a 3D image to be encoded into a 3D panoramic video.

The apparatus for generating a 3D panoramic video of the invention includes a connection device, a storage device and a processor. The connection device is connected to an image source device and receives a panoramic video from the image source device. The storage device stores a plurality of modules. The processor is coupled to the connection device and the storage device, and loads and executes the modules in the storage device. The modules include a frame capturing module, a mapping module, a parallax calculating module, a pixel shifting module, a transforming module and a video encoding module.

The frame capturing module captures a plurality of frames from the panoramic video. The mapping module transforms each frame into a polyhedral mapping projection that includes a plurality of side planes, a top plane and a bottom plane. The parallax calculating module calculates displacements of a plurality of pixels in the side planes by using the side planes of the polyhedral mapping projection of each frame, and calculates displacements of a plurality of pixels in the top plane and the bottom plane by using the displacements of the side planes. The pixel shifting module shifts the pixels in the side planes, the top plane and the bottom plane of the polyhedral mapping projection of each frame according to the calculated displacements to generate a shifted polyhedral mapping projection. The transforming module transforms the shifted polyhedral mapping projection into a shifted frame in a 2D space format. The video encoding module combines the shifted frame and the corresponding frame into a 3D image to be encoded into a 3D panoramic video.

Based on the above, the method and apparatus of the invention transform each frame of a spherical panoramic video in a 2D space format into a mapping projection onto a polyhedron in 3D space, calculate the displacement of the pixels on each plane of the mapping projection, shift the pixels accordingly, and transform the shifted projection back into the 2D space format. The shifted frame and the corresponding original frame can then be combined into a 3D panoramic image that is encoded into a 3D panoramic video.

To make the above features and advantages of the invention more comprehensible, embodiments are described in detail below with reference to the accompanying drawings.

To present a 2D spherical panoramic video stereoscopically, the apparatus of the invention not only transforms each frame of the spherical panoramic video into a mapping projection onto a polyhedron in 3D space and calculates the displacement of each pixel in the side planes of each frame, but also estimates the displacement of each pixel in the top plane and the bottom plane from the displacements at the edges of the side planes. According to these calculated or estimated displacements, each pixel in the side planes, the top plane and the bottom plane of the mapping projection of each frame is shifted to obtain a shifted mapping projection. The shifted mapping projection is then transformed back into a shifted frame in the 2D space format, and the shifted frame and the corresponding original frame are assigned to the left eye and the right eye, respectively, to obtain a 3D image. Finally, the 3D images are encoded to produce a panoramic video with a 3D effect.
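The shifting and pairing steps described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names shift_row and pair_for_eyes, the hole-filling rule, and the side-by-side packing of the two views are all assumptions made for illustration.

```python
def shift_row(row, displacements):
    """Shift each pixel of one image row horizontally by its own integer displacement.

    Target positions that receive no shifted pixel keep the original value; this
    hole-filling rule is an assumption made for the sketch.
    """
    width = len(row)
    shifted = list(row)  # start from the original row so unfilled positions keep their value
    for x, (value, d) in enumerate(zip(row, displacements)):
        target = x + d
        if 0 <= target < width:
            shifted[target] = value  # later source pixels overwrite earlier ones
    return shifted


def pair_for_eyes(original_rows, shifted_rows):
    """Place the original and the shifted frame side by side, one view per eye.

    Which eye receives which view, and the side-by-side layout, are assumptions.
    """
    return [left + right for left, right in zip(original_rows, shifted_rows)]


original = [[10, 20, 30, 40]]
shifted = [shift_row(original[0], [0, 1, 1, 0])]
stereo = pair_for_eyes(original, shifted)
```

Pixels with a larger displacement move further between the two views, which is what produces the parallax.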

FIG. 1 is a block diagram of an apparatus for generating a 3D panoramic video according to an embodiment of the invention. The apparatus of this embodiment is exemplified by the electronic device 10 of FIG. 1, which is, for example, a camera, a video camera, a mobile phone, a personal computer, a VR headset, a cloud server or another device with computing capability. The electronic device 10 includes at least a connection device 12, a storage device 14 and a processor 16, whose functions are described below.

The connection device 12 is, for example, a wired or wireless transmission interface such as Universal Serial Bus (USB), RS232, Bluetooth or Wireless Fidelity (Wi-Fi), and can be used to connect to an image source device and receive a video from it. The image source device is, for example, a panoramic camera capable of capturing a panoramic video, a hard disk or memory card storing a panoramic video, or a remote server storing a panoramic video, and is not limited herein.

The storage device 14 is, for example, any type of fixed or removable random access memory (RAM), read-only memory (ROM), flash memory, a similar element, or a combination of these elements. In this embodiment, the storage device 14 records a frame capturing module 141, a mapping module 142, a parallax calculating module 143, a pixel shifting module 144, a transforming module 145 and a video encoding module 146.

The processor 16 is, for example, a central processing unit (CPU), or another programmable general-purpose or special-purpose microprocessor, a digital signal processor (DSP), a programmable controller, an application specific integrated circuit (ASIC), a programmable logic device (PLD), another similar device, or a combination of these devices. The processor 16 is connected to the connection device 12 and the storage device 14.

In this embodiment, the modules stored in the storage device 14 are, for example, computer programs that can be loaded by the processor 16 to execute the method for generating a 3D panoramic video of this embodiment. The detailed steps of the method are described in the following embodiments.

FIG. 2 is a flowchart of a method for generating a 3D panoramic video according to an embodiment of the invention. Referring to FIG. 1 and FIG. 2 together, the method of this embodiment is applicable to the electronic device 10 of FIG. 1. The detailed steps of the method are described below with reference to the components of the electronic device 10 in FIG. 1.

First, the frame capturing module 141 captures a plurality of frames from the video received by the connection device 12 (step S202). The frames may have a 2D space format, but the invention is not limited thereto. The video is, for example, a panoramic video received by the electronic device 10 from the image source device. The image source device may be a plurality of cameras disposed on the electronic device that capture videos of the fields of view in front of and behind the electronic device, and so on; corresponding frames of these field-of-view videos are stitched into spherical panoramic frames, for example in a 2D space format, to form the spherical panoramic video. The image source device may also be the storage device 14 of the electronic device itself, which stores a previously captured spherical panoramic video, but is not limited thereto.

It should be noted that, in this embodiment, the cameras used to capture the video are, for example, fisheye cameras with a viewing angle close to 180 degrees, and the field-of-view videos may cover partially overlapping images to facilitate stitching. In other embodiments, the cameras may be cameras other than fisheye cameras, and the field-of-view videos of the cameras may also be non-overlapping.

In addition, in this embodiment, each frame of the spherical panoramic video is represented in the equirectangular projection format, in which longitude is mapped to vertical straight lines of constant spacing and latitude is mapped to horizontal straight lines of constant spacing. In other embodiments, projections other than the equirectangular projection, such as the Miller cylindrical projection or the Cassini projection, may also be used to represent each frame of the spherical panoramic video.
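The constant-spacing property of the equirectangular format means that pixel coordinates and spherical angles are related linearly. The following Python sketch shows one common convention; the function names, the half-pixel centre offset, and the choice of longitude range (-pi to pi, left to right) and latitude range (pi/2 at the top row) are assumptions, not taken from the source.

```python
import math


def equirect_pixel_to_lonlat(x, y, width, height):
    """Map pixel centre (x, y) of an equirectangular frame to (longitude, latitude) in radians."""
    lon = (x + 0.5) / width * 2.0 * math.pi - math.pi    # columns span -pi .. pi
    lat = math.pi / 2.0 - (y + 0.5) / height * math.pi   # rows span pi/2 (top) .. -pi/2 (bottom)
    return lon, lat


def lonlat_to_equirect_pixel(lon, lat, width, height):
    """Inverse mapping: (longitude, latitude) in radians to fractional pixel coordinates."""
    x = (lon + math.pi) / (2.0 * math.pi) * width - 0.5
    y = (math.pi / 2.0 - lat) / math.pi * height - 0.5
    return x, y


lon, lat = equirect_pixel_to_lonlat(100, 50, 4096, 2048)
x_back, y_back = lonlat_to_equirect_pixel(lon, lat, 4096, 2048)
```

Because both directions are linear, equal steps in longitude or latitude always correspond to equal steps in pixels.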

Returning to the flow of FIG. 2, after the frames are captured, the mapping module 142 transforms each frame of the spherical panoramic video into a polyhedral mapping projection (step S204). In an embodiment, the mapping module 142 uses, for example, a cube mapping projection (cube map) to project each frame of the spherical panoramic video onto the six square faces of a cube in 3D space, and stores the projection as six square textures or unfolded as a single texture with six regions. For example, FIG. 3 is a schematic diagram of the correspondence between the equirectangular projection of a frame and the cube mapping projection according to an embodiment of the invention. Referring to FIG. 3, the equirectangular projection 30 of each frame in this embodiment contains the image portions corresponding to a plurality of side planes 32, a top plane 34 and a bottom plane 36 of the cube mapping projection. The side planes 32 include a left plane 322, a front plane 324, a right plane 326 and a back plane 328. In other embodiments, the mapping module 142 may also use a polyhedron other than a cube, such as a triangular prism or a hexagonal prism, for the environment mapping projection of each frame of the spherical panoramic video.

In general, each frame of the spherical panoramic video can be projected onto a polyhedron that includes a top face, a bottom face and a plurality of side faces to generate a top plane, a bottom plane and a plurality of side planes. The top plane and the bottom plane each have a plurality of edges, and each edge of the top plane (or bottom plane) corresponds to the upper (or lower) edge of one of the side planes. For example, in the case of the cube mapping projection, there are four side planes, and the top plane and the bottom plane each have four edges that correspond to the edges of the four side planes.

For example, FIG. 4A and FIG. 4B are schematic diagrams of a cube mapping projection according to an embodiment of the invention. Referring to FIG. 3, FIG. 4A and FIG. 4B together, FIG. 4B is an unfolded view of the cube 40. In this embodiment, the equirectangular projection of each frame of the spherical panoramic video is mapped by cube mapping, using the six faces of the cube 40 as the mapping shape. That is, the image portions of the equirectangular projection 30 of each frame that correspond to the left plane 322, the front plane 324, the right plane 326, the back plane 328, the top plane 34 and the bottom plane 36 are respectively projected onto the six faces of the cube 40 to generate a left plane 422, a front plane 424, a right plane 426, a back plane 428, a top plane 44 and a bottom plane 46, which are stored as six squares. Methods for converting between the equirectangular projection and the cube mapping projection are well known to those skilled in the art and are not repeated here.
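One common way to perform this conversion is to compute, for each pixel of a cube face, the viewing direction through that pixel and then look up where that direction falls in the equirectangular frame. The Python sketch below illustrates the idea. The axis convention (x right, y up, z forward), the face orientations and the function names are assumptions chosen for the sketch; the source does not prescribe them.

```python
import math


def cube_face_direction(face, u, v):
    """Return the view direction for face coordinates u, v in [-1, 1] (u right, v down)."""
    if face == "front":
        return (u, -v, 1.0)
    if face == "right":
        return (1.0, -v, -u)
    if face == "back":
        return (-u, -v, -1.0)
    if face == "left":
        return (-1.0, -v, u)
    if face == "top":
        return (u, 1.0, v)
    if face == "bottom":
        return (u, -1.0, -v)
    raise ValueError("unknown face: " + face)


def cube_to_equirect_coords(face, col, row, face_size, eq_width, eq_height):
    """Find the fractional equirectangular position seen through one cube-face pixel."""
    u = 2.0 * (col + 0.5) / face_size - 1.0
    v = 2.0 * (row + 0.5) / face_size - 1.0
    x, y, z = cube_face_direction(face, u, v)
    lon = math.atan2(x, z)                 # 0 at the centre of the front face
    lat = math.atan2(y, math.hypot(x, z))  # pi/2 straight up
    ex = (lon + math.pi) / (2.0 * math.pi) * eq_width
    ey = (math.pi / 2.0 - lat) / math.pi * eq_height
    return ex, ey


front_centre = cube_to_equirect_coords("front", 0, 0, 1, 4096, 2048)
right_centre = cube_to_equirect_coords("right", 0, 0, 1, 4096, 2048)
top_centre = cube_to_equirect_coords("top", 0, 0, 1, 4096, 2048)
```

A full converter would call cube_to_equirect_coords for every pixel of every face and sample the equirectangular frame at the returned position, for example with bilinear interpolation.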

It is worth noting that, because a cube map is generated by rendering the scene six times from one viewpoint, with each view defined by a 90-degree view frustum representing one cube face, its pixel calculations can be performed in a more linear manner than with other types of mapping projection. In addition, when a virtual reality device is used to display the spherical panoramic video, video in the cube mapping projection format is better supported by 3D graphics hardware acceleration. In an embodiment, the left plane 422, the front plane 424, the right plane 426, the back plane 428, the top plane 44 and the bottom plane 46 of the mapping projection are squares with the same width (in pixels). In another embodiment, the width W (in pixels) of the left plane 422, the front plane 424, the right plane 426 and the back plane 428 may be one quarter of the horizontal resolution (in pixels) of each frame of the spherical panoramic video represented in the equirectangular projection. However, the invention is not limited thereto. In other embodiments, other tetragons, heights or widths may also be used for the left plane 422, the front plane 424, the right plane 426, the back plane 428, the top plane 44 and the bottom plane 46.

Returning to the flow of FIG. 2, after the polyhedral mapping projection is obtained, the parallax calculating module 143 uses the side planes of the mapping projection of each frame to calculate the displacements of a plurality of pixels in the side planes, and calculates the displacements of a plurality of pixels in the top plane and the bottom plane of the mapping projection according to the displacements of the side planes (step S206). Note that depth perception arises from the parallax caused by the horizontal separation of the eyes. The displacements of the top plane and the bottom plane (which are roughly parallel to the horizontal plane in which the eyes are separated) should therefore be calculated by a different method from the displacements of the side planes (which are roughly orthogonal to that horizontal plane).

In an embodiment, to obtain the displacement of each pixel of each frame of the spherical panoramic video, a known depth estimation technique, such as relative blurriness, block-based matching or optical flow, may be applied to the side planes of the mapping projection of each frame to calculate the initial displacement of each pixel in the side planes.

Among these depth estimation techniques, the relative blurriness method relies on the observation that, with respect to the camera's focus, objects farther from the camera appear blurrier than objects closer to the camera. The method therefore calculates the displacement of each pixel based on the degree of blur of that pixel in the frame.

In the block-based matching method, one or more frames of the video are divided into a plurality of blocks, and each block of the current frame is compared with blocks of the same size at shifted positions in a reference frame. The shift associated with the minimum matching cost is taken as the estimated motion magnitude of all pixels in the block, and this motion magnitude can be used to calculate the displacement of the pixels in the one or more frames. For example, pixels with a larger motion magnitude can be regarded as closer to the camera.
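As an illustration of block-based matching, the following Python sketch searches a small window of the reference frame for the shift that minimises a matching cost. The sum of absolute differences (SAD) is used as the cost here; the source does not name a particular cost function, so SAD, the exhaustive search and all function names are assumptions.

```python
def get_block(frame, top, left, size):
    """Extract a size x size block whose top-left corner is at (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def match_block(current, reference, top, left, size, search):
    """Return the (dy, dx) shift within +/- search pixels with the lowest SAD."""
    height, width = len(reference), len(reference[0])
    target = get_block(current, top, left, size)
    best_cost, best_shift = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            t, l = top + dy, left + dx
            if t < 0 or l < 0 or t + size > height or l + size > width:
                continue  # candidate block would fall outside the reference frame
            cost = sad(target, get_block(reference, t, l, size))
            if best_cost is None or cost < best_cost:
                best_cost, best_shift = cost, (dy, dx)
    return best_shift


def frame_with_patch(top, left):
    """Build a 6 x 6 test frame that is zero except for a 2 x 2 textured patch."""
    frame = [[0] * 6 for _ in range(6)]
    frame[top][left], frame[top][left + 1] = 1, 2
    frame[top + 1][left], frame[top + 1][left + 1] = 3, 4
    return frame


shift = match_block(frame_with_patch(2, 1), frame_with_patch(2, 3), 2, 1, 2, 2)
```

In this toy example the patch moves two pixels to the right between the frames, so the shift assigned to every pixel of the block is (0, 2).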

The optical flow method, on the other hand, identifies the motion of the brightness patterns of objects. For example, the optical flow of each frame of the video can be regarded as a motion field in which each point is assigned a velocity vector describing its motion. Optical flow techniques may relate object velocity to pixel-gradient-based brightness changes through the brightness constancy equation, and may use global or local optimization techniques to calculate the optical flow motion vectors of the pixels in one or more frames. Suitable known optical flow methods include the Farneback method, the Lucas-Kanade method, the principal component analysis (PCA) method, and the like. In some embodiments, any optical flow method may be used to calculate the initial displacements of the side planes.
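For reference, the brightness constancy equation mentioned above is usually written as follows. The notation is the conventional textbook one and is an assumption here, since the source does not spell it out: I is the image intensity, and (u, v) is the motion of a point between two consecutive frames.

```latex
% Brightness constancy: a point keeps its intensity as it moves by (u, v) between frames.
I(x, y, t) = I(x + u,\; y + v,\; t + 1)

% A first-order Taylor expansion of the right-hand side gives the optical flow constraint:
\frac{\partial I}{\partial x}\,u + \frac{\partial I}{\partial y}\,v + \frac{\partial I}{\partial t} = 0
```

This is one equation in two unknowns per pixel, which is why the methods listed above add a further assumption, such as locally constant or globally smooth motion, in order to solve for (u, v).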

In this embodiment, the displacements of the pixels in the side planes are calculated by first computing the optical flow field between the side planes of the mapping projections of the frames, where the optical flow field represents the initial displacement, along a plurality of axes, of each pixel in the side planes of the mapping projection of each frame. Then, optionally, Gaussian smoothing is applied to the initial displacements to obtain smoothed displacements. In an embodiment, the Gaussian smoothing may include Gaussian smoothing along the time axis or Gaussian smoothing in space.

For example, FIG. 5A and FIG. 5B show examples of side planes of the mapping projections of frames of a spherical panoramic video according to an embodiment of the invention. Referring to FIG. 5A and FIG. 5B together, the frame 50a is the frame at time T in the video, and the frame 50b is the frame at time T+1 in the video. In this embodiment, the number of pixels by which each pixel moves between the frame 50a and the frame 50b represents the initial displacement of that pixel in the side planes. Pixels of objects farther from the viewer have smaller initial displacements, whereas pixels of objects closer to the viewer have larger initial displacements.

In detail, the initial displacement of each pixel (i, j) of the frame 50a in the video may include a horizontal displacement and a vertical displacement in Euclidean coordinates.

Optionally, a Gaussian filter may further be used to smooth the initial displacements over time to produce smoothed displacements.
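A minimal Python sketch of temporal Gaussian smoothing is given below. It smooths the displacement of a single pixel over consecutive frames; the window radius, the standard deviation, the clamping of the window at the ends of the sequence and the function names are all assumptions, since the filter parameters are not given in the text.

```python
import math


def gaussian_kernel(radius, sigma):
    """Return normalised Gaussian weights for offsets -radius .. radius."""
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_over_time(values, radius=2, sigma=1.0):
    """Gaussian-smooth one pixel's displacement sequence along the time axis.

    Frames before the first or after the last are replaced by the nearest
    valid frame (clamping), which is an assumption of this sketch.
    """
    kernel = gaussian_kernel(radius, sigma)
    last = len(values) - 1
    smoothed = []
    for t in range(len(values)):
        acc = 0.0
        for k, w in zip(range(-radius, radius + 1), kernel):
            acc += w * values[min(max(t + k, 0), last)]
        smoothed.append(acc)
    return smoothed


spike = smooth_over_time([0.0, 0.0, 10.0, 0.0, 0.0])
flat = smooth_over_time([3.0] * 5)
```

A sudden one-frame jump in displacement is spread over the neighbouring frames, while a constant displacement is left unchanged.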

To make the magnitude of the pixel displacements suitable for the image resolutions of different displays, the initial displacement (or the smoothed displacement) may be multiplied by a normalization factor P. The normalization factor P is proportional to the width W (in pixels) of the side planes of the mapping projection of each frame of the spherical panoramic video; in some of the embodiments described above, W is in turn proportional to the horizontal resolution (in pixels) of each frame represented in the equirectangular projection format. The product is the final displacement Δd_(i,j) of each pixel.
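The normalization step can be sketched in Python as follows. The text only states that P is proportional to the side-plane width W; the reference width of 1024 pixels, the scale constant and the function names below are hypothetical values chosen for illustration.

```python
def normalization_factor(face_width, reference_width=1024, scale=1.0):
    """Return a factor P proportional to the side-plane width W (in pixels).

    reference_width and scale are hypothetical; the source gives no constants.
    """
    return scale * face_width / reference_width


def final_displacement(displacement, face_width, reference_width=1024, scale=1.0):
    """Multiply an initial (or smoothed) displacement by the normalization factor P."""
    return normalization_factor(face_width, reference_width, scale) * displacement


doubled = final_displacement(4.0, 2048)    # side planes twice the reference width
unchanged = final_displacement(4.0, 1024)  # P equals 1 at the reference width
```

With these assumed constants, a displacement measured on side planes twice as wide as the reference is doubled, so the apparent depth stays comparable across resolutions.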

Although the time T is omitted from the subscript of the final displacement, those skilled in the art should understand that the final displacement is also calculated for each time T. In another embodiment, if the image resolution of the display happens to match the horizontal resolution of each frame of the spherical panoramic video, the normalization factor P may be 1, that is, the final displacement equals the initial displacement. Optionally, the initial displacements may further be smoothed in space, for example with a Gaussian blur filter, to obtain smoother final displacements. Those skilled in the art should also understand that the order in which normalization, temporal smoothing and spatial smoothing are performed may be changed according to the situation to obtain the final displacements. In addition, the parameters and/or filters used in temporal smoothing and spatial smoothing may be adjusted.

For example, FIG. 6A to FIG. 6C show an example of calculating displacements according to an embodiment of the invention. Referring to FIG. 6A to FIG. 6C, the image 60a is the optical flow field of a side plane of the mapping projection of a frame, the image 60b shows the initial displacements of that side plane, and the image 60c shows the smoothed displacements of that side plane after Gaussian blur is applied.

After calculating the final displacement of each pixel in each side plane, the parallax calculating module 143 uses the final displacements of the edge pixels of each side plane that lie on the boundaries with the top plane and the bottom plane to calculate the displacements of the edge pixels of the top plane and the bottom plane, and then uses the displacements of these edge pixels as initial values to calculate the displacements of the other pixels in the top plane and the bottom plane. In detail, each row of pixels on the edges of the top plane and the bottom plane (that is, the outermost row of pixels in the top plane or bottom plane) is assigned a displacement (for example, the final displacement Δd_(i,j)). The assigned displacement is calculated from the final displacements of the corresponding edge pixels in the upper and lower edges of the side planes (that is, the uppermost or lowermost row of pixels in the side planes).

For example, FIG. 7 shows the cube mapping projection of a frame according to an embodiment of the invention. Referring to FIG. 7, the cube mapping projection 70 of this embodiment consists of six squares: a left plane 722, a front plane 724, a right plane 726, a back plane 728, a top plane 74 and a bottom plane 76. In this embodiment, the top plane 74 includes four edges 74a, 74b, 74c and 74d. The left plane 722 includes an edge 72a corresponding to the top plane 74 and an edge 72a' corresponding to the bottom plane 76. The front plane 724 includes an edge 72b corresponding to the top plane 74 and an edge 72b' corresponding to the bottom plane 76. The right plane 726 includes an edge 72c corresponding to the top plane 74 and an edge 72c' corresponding to the bottom plane 76. The back plane 728 includes an edge 72d corresponding to the top plane 74 and an edge 72d' corresponding to the bottom plane 76. The bottom plane 76 includes four edges 76a, 76b, 76c and 76d.

Specifically, in an embodiment, the displacements of the pixels on edges 74a, 74b, 74c and 74d of the top plane 74 are calculated as follows. The final displacements of the pixels on edge 72a of the left plane 722 are assigned to the corresponding edge 74a and become the displacements of the pixels on edge 74a; the final displacements of the pixels on edge 72b of the front plane 724 are assigned to the corresponding edge 74b and become the displacements of the pixels on edge 74b; the final displacements of the pixels on edge 72c of the right plane 726 are assigned to the corresponding edge 74c and become the displacements of the pixels on edge 74c; and the final displacements of the pixels on edge 72d of the back plane 728 are assigned to the corresponding edge 74d and become the displacements of the pixels on edge 74d. Each of the four corner pixels of the top plane 74 corresponds to the two topmost corner pixels of two adjacent side planes, so its displacement may be set to the final displacement of either of those two pixels or to the average of the two. The edge pixels of the bottom plane 76 are handled in the same way as those of the top plane 74, so the details are not repeated.
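The edge assignment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patent's implementation: the function name `assign_top_edges`, the `top[row][column]` indexing, and the orientation of the top plane (row 0 touching the back plane, the last row touching the front plane, each side-plane row listed left to right as seen from inside the cube) are all hypothetical choices. Corner pixels use the averaging option.

```python
def assign_top_edges(left, front, right, back):
    """Build a W x W top-plane displacement grid whose border is filled
    from the top rows of the four side planes; interior cells stay None.

    Each argument is the top row (length W) of final displacements of one
    side plane, listed left to right as seen from inside the cube.
    Assumed layout (not given by the source): row 0 of the top plane
    touches the back plane and row W-1 touches the front plane.
    """
    w = len(front)
    if not (len(left) == len(right) == len(back) == w) or w < 2:
        raise ValueError("all four rows must share one length W >= 2")
    top = [[None] * w for _ in range(w)]
    for k in range(1, w - 1):
        top[w - 1][k] = front[k]            # front row -> bottom edge
        top[0][k] = back[w - 1 - k]         # back row  -> top edge, mirrored
        top[k][0] = left[k]                 # left row  -> left edge
        top[k][w - 1] = right[w - 1 - k]    # right row -> right edge, mirrored
    # Each corner meets the top corners of two adjacent side planes;
    # here the mean of the two final displacements is used.
    top[0][0] = (left[0] + back[w - 1]) / 2
    top[0][w - 1] = (back[0] + right[w - 1]) / 2
    top[w - 1][0] = (left[w - 1] + front[0]) / 2
    top[w - 1][w - 1] = (front[w - 1] + right[0]) / 2
    return top
```

Choosing one of the two corner values instead of their mean, which the text also allows, would only change the last four assignments.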

After assigning the displacements of the edge pixels of the top plane and the bottom plane in this way, the disparity calculation module 143 divides each of the top plane and the bottom plane into a plurality of blocks and, for each pixel in these planes, calculates the pixel's displacement from the displacements of several neighboring pixels, chosen according to the block to which the pixel belongs. In short, once the displacements of the edge rows and columns of the top plane and the bottom plane are determined, the displacements of the interior pixels can be obtained in order from the outside inward, each from the displacements of its neighboring pixels.

For example, FIG. 8A to FIG. 8E illustrate an example of calculating the displacements of the top plane according to an embodiment of the invention. Referring to FIG. 8A, the top plane 84 of the cube mapping projection in this embodiment is divided into four blocks I to IV, and the displacements of the pixels on edges 84a, 84b, 84c and 84d have already been assigned. FIG. 8B to FIG. 8E show ways of calculating the displacements of the interior pixels of blocks I to IV. In the embodiment of FIG. 8B, the displacement of an interior pixel (i, j) is the arithmetic mean of the displacements of three neighboring pixels, for example pixels (i-1, j), (i-1, j-1) and (i, j-1). In another embodiment, shown in FIG. 8C, it is the arithmetic mean of the displacements of three neighboring pixels, for example pixels (i-1, j-1), (i, j-1) and (i+1, j-1). In another embodiment, shown in FIG. 8D, it is the arithmetic mean of the displacements of two neighboring pixels, for example pixels (i-1, j) and (i, j-1). In another embodiment, shown in FIG. 8E, it is the arithmetic mean of the displacements of four neighboring pixels, for example pixels (i-1, j), (i-1, j-1), (i, j-1) and (i+1, j-1).
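The four neighbor patterns of FIG. 8B to FIG. 8E differ only in which already-known pixels are averaged, which the following Python sketch makes explicit. The names `STENCILS` and `neighbor_mean` are hypothetical, and the sketch assumes that i is the column index, j is the row index, and the grid is stored as `d[j][i]`; the source does not state which index is which.

```python
# Neighbor offsets (di, dj) relative to pixel (i, j), one entry per figure.
STENCILS = {
    "8B": [(-1, 0), (-1, -1), (0, -1)],
    "8C": [(-1, -1), (0, -1), (1, -1)],
    "8D": [(-1, 0), (0, -1)],
    "8E": [(-1, 0), (-1, -1), (0, -1), (1, -1)],
}


def neighbor_mean(d, i, j, stencil="8B"):
    """Arithmetic mean of the known displacements around pixel (i, j).

    d is a list of rows, read as d[j][i] (an assumed convention). The
    neighbors named by the stencil must already hold numeric values.
    """
    values = [d[j + dj][i + di] for di, dj in STENCILS[stencil]]
    return sum(values) / len(values)


# A 3 x 3 grid whose first row and first column are already assigned.
grid = [
    [1, 2, 3],
    [4, None, None],
    [7, None, None],
]
grid[1][1] = neighbor_mean(grid, 1, 1, "8B")  # mean of 4, 1 and 2
```

Scanning a block from its assigned edges toward the center and calling `neighbor_mean` at each step fills the block from the outside inward, as described above.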

In another embodiment of the invention, the displacements of the interior pixels of blocks I to IV are calculated with one equation per block, in which i = 0, i = W-1, j = 0 and j = W-1 (W being the width of the top plane) correspond to the pixel rows and columns of the four edges. [The equations for blocks I, II, III and IV and their index ranges are not reproduced in this text.]

FIG. 8B shows the relative positions of the three neighboring pixels on which the calculation for each pixel in block I is based. The decreasing parameter g in the above equations gradually reduces the displacements of the interior pixels, so that the closer a pixel is to the center of the top plane, the closer its displacement is to zero. The parameter g may be any number between 0 and 1 (for example, 0.1). In other embodiments, the factor 3/(3 + g) may be omitted so that the displacement does not gradually decrease.
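The block I equation itself is not legible in this text. A form consistent with the description (the arithmetic mean of the three neighbors of FIG. 8B, scaled by the factor 3/(3 + g)) would be the following; this is a reconstruction from the surrounding prose, and the index ranges for i and j cannot be recovered, so they are left out.

```latex
\Delta d_{(i,j)}
  = \frac{3}{3+g}\cdot\frac{1}{3}
    \left(\Delta d_{(i-1,j)} + \Delta d_{(i-1,j-1)} + \Delta d_{(i,j-1)}\right)
  = \frac{1}{3+g}
    \left(\Delta d_{(i-1,j)} + \Delta d_{(i-1,j-1)} + \Delta d_{(i,j-1)}\right)
```

Under this form, with g = 0.1 the factor is 3/3.1 ≈ 0.968, so each step inward scales the neighbor mean by about 0.968; with g = 0 the factor is 1 and the plain arithmetic mean of FIG. 8B is recovered.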

Note that the calculation methods illustrated in FIG. 8B to FIG. 8E are only some embodiments of the invention and are not intended to limit its scope. In another embodiment, the top plane 84 may be divided along its diagonals into four blocks I’ to IV’, as shown in FIG. 9, for calculating the displacements of the interior pixels, or it may be divided into another number of blocks (for example, six or eight). Another embodiment calculates the displacements of the interior pixels of block I’ of FIG. 9 with a corresponding equation. [The equation for block I’ and its index ranges are not reproduced in this text.]

In other embodiments, the displacement of an interior pixel may be determined from the geometric mean or the median of the displacements of its neighboring pixels. In this embodiment, the displacements of the edge pixels of the top plane and the bottom plane are set to the final displacements of the outermost row of edge pixels at the upper and lower edges of the corresponding side planes. In another embodiment, they may instead be set to those final displacements multiplied by a constant. In yet another embodiment, they may be determined from the final displacements of several edge pixels nearest the upper and lower edges of the side planes (for example, the average of the final displacements of the two outermost edge pixels). The bottom plane is handled in the same way as the top plane 84, so the details are not repeated.

Returning to the flow of FIG. 2, after the displacements of the pixels in the side planes, the top plane and the bottom plane are calculated, the amount and direction by which each pixel is shifted between the left-eye frame and the right-eye frame of each frame of the 3D spherical panoramic video can be determined from the pixel's displacement. In an embodiment, pixels of objects farther from the viewer should have smaller shifts, and pixels of objects closer to the viewer should have larger shifts. In an embodiment, the shift of each pixel between the left-eye and right-eye frames may equal the displacement calculated as described above. In another embodiment, it may equal that displacement multiplied by a constant, and the constants applied to the side planes and to the top and bottom planes may be the same or different. However, the relation between the shift of a pixel and its displacement is not limited thereto. The pixel shifting module 144 shifts the pixels in the side planes, the top plane and the bottom plane of the mapping projection transformed from each frame according to the calculated displacements of the mapping projection, to generate a shifted mapping projection (step S208).

As for the side planes, the pixel shifting module 144 shifts the pixels of the side planes according to the displacements of the side planes to generate shifted side planes; the side planes are shifted, for example, to the right with horizontal wraparound. For example, referring to the cube mapping projection 70 of FIG. 7, when the left plane 722, the front plane 724, the right plane 726 and the back plane 728 are shifted to the right, the pixels pushed out past the rightmost edge of the back plane 728 are wrapped around to the leftmost edge of the left plane 722. However, the shifting direction of the side planes is not limited thereto; they may also be shifted to the left.
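A minimal Python sketch of the horizontal wraparound shift follows. It treats one pixel row of the four side planes laid side by side (left, front, right, back) as a ring; the function name `shift_row_wraparound`, the rounding to whole pixels, and the rule that a later write overwrites an earlier one are assumptions made for illustration.

```python
def shift_row_wraparound(row, disp, direction=1):
    """Forward-shift one pixel row of the concatenated side planes.

    row:       pixel values across the left, front, right and back planes.
    disp:      per-pixel displacement, rounded here to whole pixels.
    direction: +1 shifts to the right, -1 shifts to the left.

    Pixels pushed past one end re-enter at the other end. Positions that
    no pixel lands on stay None, and when two pixels land on the same
    position the later write wins (an arbitrary choice in this sketch).
    """
    n = len(row)
    if len(disp) != n:
        raise ValueError("row and disp must have the same length")
    out = [None] * n
    for x, (value, d) in enumerate(zip(row, disp)):
        out[(x + direction * int(round(d))) % n] = value
    return out
```

With a uniform displacement of one pixel, `shift_row_wraparound(["a", "b", "c", "d"], [1, 1, 1, 1])` moves the last pixel to the front of the row. With non-uniform displacements some positions can be left as `None`, which corresponds to the missing pixels that an inpainting step would later fill.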

As for the top plane and the bottom plane, the pixel shifting module 144 rotates the pixels of the top plane in one of the clockwise and counterclockwise directions according to the displacements of the top plane to generate a shifted top plane, and rotates the pixels of the bottom plane in the other of the two directions according to the displacements of the bottom plane to generate a shifted bottom plane. The shifted side planes, the shifted top plane and the shifted bottom plane constitute the shifted mapping projection. In an embodiment, when the pixels of the side planes are shifted to the right with horizontal wraparound, the pixels of the top plane rotate counterclockwise and those of the bottom plane rotate clockwise; when the pixels of the side planes are shifted to the left with horizontal wraparound, the pixels of the top plane rotate clockwise and those of the bottom plane rotate counterclockwise. As to how the rotation is carried out, in an embodiment the pixel shifting module 144 divides each of the top plane and the bottom plane from its center into a plurality of blocks (for example, triangular blocks), one per edge, and shifts the pixels within each block according to the displacements of the top plane and the bottom plane. The pixels within a block may be shifted in a direction parallel to the edge that the block corresponds to, or in a direction approximating clockwise or counterclockwise rotation about the center. When the shift of a pixel of the top plane crosses from its own block into an adjacent block, the pixel turns to the shifting direction of the adjacent block and continues shifting within that block; the same applies to the pixels of the bottom plane.

For example, FIG. 9 is a schematic diagram illustrating the rotational shift of the top plane of a cube mapping projection according to an embodiment of the invention. Referring to the top plane 94 of FIG. 9, in this embodiment the top plane 94 is divided from its center along its diagonals into four triangular blocks, one per edge: triangular blocks 942, 944, 946 and 948. The top plane is rotated in an approximately counterclockwise direction according to the displacement of each pixel, as follows: the pixels of triangular block 942 are shifted downward, those of triangular block 944 to the right, those of triangular block 946 upward, and those of triangular block 948 to the left.
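The block-wise rotation can be illustrated with a short Python sketch that moves a pixel one step at a time and re-evaluates its direction at every step, so that a pixel crossing a diagonal picks up the direction of the block it enters. The sketch assumes image coordinates (x to the right, y downward) and that the blocks shifted down, right, up and left are the left, bottom, right and top triangles respectively; the function names and the tie-breaking for pixels lying exactly on a diagonal are hypothetical.

```python
def step_direction(x, y, w):
    """Unit step (dx, dy) for a counterclockwise walk on a w x w plane.

    Image coordinates are assumed: x grows rightward, y grows downward.
    """
    last = w - 1
    r = min(x, y, last - x, last - y)   # square ring that holds (x, y)
    lo, hi = r, last - r
    if lo == hi:
        return (0, 0)                   # center pixel of an odd-sized plane
    if x == lo and y < hi:
        return (0, 1)                   # left triangle: move down
    if y == hi and x < hi:
        return (1, 0)                   # bottom triangle: move right
    if x == hi and y > lo:
        return (0, -1)                  # right triangle: move up
    return (-1, 0)                      # top triangle: move left


def rotate_pixel(x, y, amount, w):
    """Move (x, y) by `amount` whole-pixel steps, turning at block borders."""
    for _ in range(amount):
        dx, dy = step_direction(x, y, w)
        x, y = x + dx, y + dy
    return x, y
```

On a 4 x 4 plane, `rotate_pixel(0, 0, 4, 4)` travels three steps down the left edge and then turns right along the bottom edge. Negating both components of each step gives the clockwise walk that would be used for the bottom plane.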

Note that the pixels of the bottom plane rotate in the direction opposite to that of the top plane. In the embodiment above, the pixels of the bottom plane are rotated in an approximately clockwise direction. The block-wise rotation of the bottom plane is carried out in the same way as for the top plane 94, so the details are not repeated. However, the rotation directions of the top plane and the bottom plane are not limited thereto. In another embodiment, the top plane and the bottom plane may be divided into blocks of another number or shape so that the shift follows the counterclockwise or clockwise direction more closely. In another embodiment, the top plane and the bottom plane need not be divided into blocks at all; all of their pixels may be shifted directly clockwise or counterclockwise about the center. In other embodiments, pixels may be missing after the shift, and they can be restored with a known image inpainting algorithm (for example, the Fast Marching Method or a method based on the Navier-Stokes equations).

Returning to the flow of FIG. 2, after the shifted mapping projection is generated, the transformation module 145 transforms the shifted mapping projection into a shifted frame in a two-dimensional (2D) space format (step S210). Specifically, in an embodiment of the invention, each frame in the shifted cube mapping projection format is transformed back into a frame in the equirectangular projection format. However, the 2D space format is not limited thereto.
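Transforming a cube mapping projection back to an equirectangular frame is commonly done by inverse mapping: for every output pixel, find the cube plane and the position on that plane that it looks at, then sample there. The Python sketch below shows only that lookup. The axis conventions (longitude 0 at the center of the front plane, the top plane stored with its last row touching the front plane) and the name `equirect_to_cube` are assumptions, not details given in the text.

```python
import math


def equirect_to_cube(u, v, width, height):
    """Map equirectangular pixel (u, v) to (plane, s, t).

    s and t are normalized plane coordinates in [0, 1], with s growing
    rightward and t growing downward as seen from inside the cube.
    """
    lon = ((u + 0.5) / width) * 2.0 * math.pi - math.pi    # -pi..pi, 0 = front
    lat = math.pi / 2.0 - ((v + 0.5) / height) * math.pi   # +pi/2 = straight up
    x = math.cos(lat) * math.sin(lon)   # points right
    y = math.sin(lat)                   # points up
    z = math.cos(lat) * math.cos(lon)   # points forward
    ax, ay, az = abs(x), abs(y), abs(z)
    if ay >= ax and ay >= az:           # top or bottom plane
        if y > 0:
            return ("top", (x / ay + 1) / 2, (z / ay + 1) / 2)
        return ("bottom", (x / ay + 1) / 2, (1 - z / ay) / 2)
    if az >= ax:                        # front or back plane
        if z > 0:
            return ("front", (x / az + 1) / 2, (1 - y / az) / 2)
        return ("back", (1 - x / az) / 2, (1 - y / az) / 2)
    if x > 0:                           # right or left plane
        return ("right", (1 - z / ax) / 2, (1 - y / ax) / 2)
    return ("left", (z / ax + 1) / 2, (1 - y / ax) / 2)
```

A full conversion would loop over all (u, v) of the output frame, call this lookup, and copy (or interpolate) the pixel at (s, t) of the named shifted plane.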

Finally, after the shifted frames are obtained, the video encoding module 146 combines each shifted frame with the corresponding original frame into a 3D image and encodes the 3D images into a 3D video (step S210). The 2D space format of the shifted frames may be the same as or different from the format the original frames had before being transformed into the polyhedral mapping projection. If the two formats differ, the shifted frames and the corresponding original frames may additionally be converted into the same 2D space format. In detail, in an embodiment of the invention, the shifted frame in the equirectangular projection format and the corresponding original frame in the equirectangular projection format of the spherical panoramic video serve as the viewer's left-eye image and right-eye image respectively, so that a 3D image can be presented. Although in the above embodiment the shifted frame is the left-eye image and the original frame is the right-eye image, those skilled in the art will appreciate that the original frame may instead be shifted in the opposite direction, so that the shifted frame is the right-eye image and the original frame is the left-eye image. In an embodiment, the left-eye image and the right-eye image may be presented simultaneously to the viewer's left eye and right eye respectively, but the invention is not limited thereto.

After the steps of mapping projection, disparity calculation, pixel shifting, format transformation and 3D image composition have been performed on each frame of the spherical panoramic video, the 3D images can be encoded to produce a 3D spherical panoramic video. Because the images of a panoramic video in a 2D space format (for example, the equirectangular projection format) are highly nonlinear, known depth estimation techniques cannot be applied directly to calculate the displacements of all pixels of the whole image. The invention first transforms the panoramic video into the polyhedral mapping projection format, calculates the displacements of the side planes directly, and then calculates the displacements of the top plane and the bottom plane from the displacements of the side planes. In this way, the shift between the left-eye image and the right-eye image can be determined accurately and simply to produce a 3D spherical panoramic video.

In summary, the method and apparatus for generating a 3D panoramic video of the invention transform each frame of a spherical panoramic video, for example one in a 2D space format, into a mapping projection onto a polyhedron in three-dimensional space; perform depth estimation on the side planes of each transformed polyhedral mapping projection to calculate the displacement of each pixel in the side planes of each frame; calculate the displacements of the top plane and the bottom plane from the displacements of the side planes; shift each pixel according to its displacement; and transform the shifted polyhedral mapping projection back into the 2D space format of the original frame. The shifted frames and the corresponding original frames can thereby form 3D panoramic images, which are encoded into a 3D panoramic video.

Although the invention has been disclosed by way of the above embodiments, they are not intended to limit the invention. Anyone having ordinary skill in the art may make minor changes and refinements without departing from the spirit and scope of the invention, so the scope of protection of the invention is defined by the appended claims.

10‧‧‧Electronic apparatus

12‧‧‧Connection device

14‧‧‧Storage device

141‧‧‧Frame capturing module

142‧‧‧Mapping module

143‧‧‧Disparity calculation module

144‧‧‧Pixel shifting module

145‧‧‧Transformation module

146‧‧‧Video encoding module

16‧‧‧Processor

30‧‧‧Equirectangular projection image

40‧‧‧Cube

70‧‧‧Cube mapping projection

32‧‧‧Side planes

34, 44, 74, 84, 94‧‧‧Top plane

36, 46, 76‧‧‧Bottom plane

322, 422, 722‧‧‧Left plane

324, 424, 724‧‧‧Front plane

326, 426, 726‧‧‧Right plane

328, 428, 728‧‧‧Back plane

50a, 50b‧‧‧Frames

60a-60c‧‧‧Images

72a-72d, 72a’-72d’, 74a-74d, 76a-76d, 84a-84d‧‧‧Edges

942-948‧‧‧Triangular blocks

S202-S210‧‧‧Steps of the method for generating a 3D panoramic video according to an embodiment of the invention

FIG. 1 is a block diagram of an apparatus for generating a 3D panoramic video according to an embodiment of the invention.
FIG. 2 is a flowchart of a method for generating a 3D panoramic video according to an embodiment of the invention.
FIG. 3 is a schematic diagram of the correspondence between the equirectangular projection of a frame and a cube mapping projection according to an embodiment of the invention.
FIG. 4A and FIG. 4B are schematic diagrams of a cube mapping projection according to an embodiment of the invention.
FIG. 5A and FIG. 5B are examples of the side planes of the mapping projection transformed from frames of a spherical panoramic video according to an embodiment of the invention.
FIG. 6A to FIG. 6C are examples of calculating displacements according to an embodiment of the invention.
FIG. 7 is a cube mapping projection of a frame according to an embodiment of the invention.
FIG. 8A to FIG. 8E are examples of calculating the displacements of the top plane according to an embodiment of the invention.
FIG. 9 is a schematic diagram of the rotational shift of the top plane of a cube mapping projection according to an embodiment of the invention.
The attachment is an example of the side planes during the calculation of displacements according to an embodiment of the invention.

Claims

A method for generating a three-dimensional (3D) panoramic video, adapted to an electronic apparatus having a processor, the method comprising: capturing a plurality of frames from a panoramic video; transforming each of the frames into a polyhedral mapping projection, wherein the polyhedral mapping projection comprises a plurality of side planes, a top plane and a bottom plane; calculating displacements of a plurality of pixels in the side planes by using the side planes of the polyhedral mapping projection transformed from each of the frames, and calculating displacements of a plurality of pixels in the top plane and the bottom plane of the polyhedral mapping projection by using the displacements of the side planes; shifting the pixels in the side planes, the top plane and the bottom plane of the polyhedral mapping projection transformed from each of the frames according to the calculated displacements of the polyhedral mapping projection, to generate a shifted polyhedral mapping projection; transforming the shifted polyhedral mapping projection into a shifted frame in a two-dimensional (2D) space format; and combining the shifted frames with the corresponding frames to form 3D images, and encoding the 3D images into a 3D panoramic video.

The method of claim 1, wherein the step of calculating the displacements of the pixels in the side planes by using the side planes of the polyhedral mapping projection transformed from each of the frames comprises: calculating an initial displacement of each of the pixels in the side planes of the polyhedral mapping projection transformed from each of the frames, and multiplying the initial displacement by a normalization factor to obtain the displacement of the pixel, wherein the normalization factor is proportional to a horizontal resolution of the frames of the panoramic video.

The method of claim 1, wherein the step of calculating the displacements of the pixels in the top plane and the bottom plane of the polyhedral mapping projection by using the displacements of the side planes comprises: calculating displacements of a plurality of edge pixels of the top plane and the bottom plane located at boundaries with the side planes, according to the displacements of a plurality of edge pixels of each of the side planes located at boundaries with the top plane and the bottom plane; and calculating the displacements of the other pixels in the top plane and the bottom plane according to the displacements of the edge pixels of the top plane and the bottom plane.

The method of claim 3, wherein the step of calculating the displacements of the other pixels in the top plane and the bottom plane according to the displacements of the edge pixels of the top plane and the bottom plane comprises: dividing each of the top plane and the bottom plane into a plurality of blocks, and, for each of the pixels in the top plane and the bottom plane, calculating the displacement of the pixel from the displacements of a plurality of neighboring pixels around the pixel according to the block to which the pixel belongs.

The method of claim 4, wherein the step of calculating the displacement of the pixel from the displacements of the neighboring pixels around the pixel according to the block to which the pixel belongs comprises: calculating the displacement of the pixel according to the displacements of the neighboring pixels around the pixel and a decreasing parameter, wherein the decreasing parameter makes the displacement of an interior pixel of the top plane and the bottom plane smaller the closer the pixel is to the center.

The method of claim 1, wherein the step of shifting the pixels in the side planes, the top plane and the bottom plane of the polyhedral mapping projection transformed from each of the frames according to the calculated displacements of the polyhedral mapping projection, to generate the shifted polyhedral mapping projection, comprises: shifting the pixels of the side planes to the right or to the left with horizontal wraparound according to the displacements of the side planes, to generate shifted side planes; rotating the pixels of the top plane in one of a clockwise direction and a counterclockwise direction according to the displacements of the top plane, to generate a shifted top plane; and rotating the pixels of the bottom plane in the other of the clockwise direction and the counterclockwise direction according to the displacements of the bottom plane, to generate a shifted bottom plane, wherein the shifted side planes, the shifted top plane and the shifted bottom plane constitute the shifted polyhedral mapping projection.

The method of claim 6, wherein the steps of rotating the pixels of the top plane in the one of the clockwise direction and the counterclockwise direction according to the displacements of the top plane to generate the shifted top plane, and rotating the pixels of the bottom plane in the other of the clockwise direction and the counterclockwise direction according to the displacements of the bottom plane to generate the shifted bottom plane, comprise: dividing the top plane from its center into a plurality of blocks, and shifting each of the pixels in each of the blocks, according to the displacements of the top plane, in a direction approximating the one of the clockwise direction and the counterclockwise direction, wherein when the shift of a pixel crosses from its block into an adjacent block, the pixel turns toward said direction and continues shifting within the adjacent block; and dividing the bottom plane from its center into a plurality of blocks, and shifting each of the pixels in each of the blocks, according to the displacements of the bottom plane, in a direction approximating the other of the clockwise direction and the counterclockwise direction, wherein when the shift of a pixel crosses from its block into an adjacent block, the pixel turns toward said other direction and continues shifting within the adjacent block.

The method of claim 6, wherein when the pixels of the side planes are shifted to the right with horizontal wraparound, the pixels of the top plane are rotated in the counterclockwise direction and the pixels of the bottom plane are rotated in the clockwise direction; and when the pixels of the side planes are shifted to the left with horizontal wraparound, the pixels of the top plane are rotated in the clockwise direction and the pixels of the bottom plane are rotated in the counterclockwise direction.

The method of claim 1, wherein the polyhedral mapping projection is a cube mapping projection.

The method of claim 1, wherein the step of calculating the displacements of the pixels in the side planes by using the side planes of the polyhedral mapping projection transformed from each of the frames comprises: calculating the displacements of the pixels in the side planes by using a principal-component optical flow method.

An apparatus for generating a three-dimensional (3D) panoramic video, comprising: a connection device, connected to an image source device to receive a panoramic video from the image source device; a storage device, storing a plurality of modules; and a processor, coupled to the connection device and the storage device, and loading and executing the modules in the storage device, the modules comprising: a frame capturing module, capturing a plurality of frames from the panoramic video; a mapping module, transforming each of the frames into a polyhedral mapping projection, wherein the polyhedral mapping projection comprises a plurality of side planes, a top plane and a bottom plane; a disparity calculation module, calculating displacements of a plurality of pixels in the side planes by using the side planes of the polyhedral mapping projection transformed from each of the frames, and calculating displacements of a plurality of pixels in the top plane and the bottom plane of the polyhedral mapping projection by using the displacements of the side planes; a pixel shifting module, shifting the pixels in the side planes, the top plane and the bottom plane of the polyhedral mapping projection transformed from each of the frames according to the calculated displacements of the polyhedral mapping projection, to generate a shifted polyhedral mapping projection; a transformation module, transforming the shifted polyhedral mapping projection into a shifted frame in a two-dimensional (2D) space format; and a video encoding module, combining the shifted frames with the corresponding frames to form 3D images and encoding the 3D images into a 3D panoramic video.

The apparatus of claim 11, wherein the disparity calculation module calculates an initial displacement of each pixel in the side planes of the polyhedral mapping projection transformed from each frame, and multiplies the initial displacement by a normalization factor to obtain the displacement of each pixel, wherein the normalization factor is proportional to a horizontal resolution of the frames in the panoramic video.
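
The normalization step recited above can be illustrated with a minimal Python sketch. The function name and the `reference_resolution` constant are hypothetical and chosen only for illustration; the claim requires only that the factor be proportional to the horizontal resolution of the frames.

```python
def normalize_displacements(initial, horizontal_resolution, reference_resolution=1920):
    """Scale initial per-pixel displacements by a resolution-dependent factor.

    initial: list of rows of initial displacements (in pixels).
    reference_resolution: hypothetical constant (an assumption, not from the
    claims) that fixes the proportionality between width and factor.
    """
    # The factor grows linearly with the frame width, so it is proportional
    # to the horizontal resolution as the claim requires.
    factor = horizontal_resolution / reference_resolution
    return [[d * factor for d in row] for row in initial]
```

Under this assumed reference width, a 3840-pixel-wide frame doubles every initial displacement, so the relative parallax stays comparable across resolutions.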

The apparatus of claim 11, wherein the disparity calculation module calculates, according to the displacements of a plurality of edge pixels in each side plane located at the boundaries with the top plane and the bottom plane, the displacements of a plurality of edge pixels in the top plane and the bottom plane located at the boundaries with each side plane, and calculates the displacements of the other pixels in the top plane and the bottom plane according to the displacements of the edge pixels of the top plane and the bottom plane.

The apparatus of claim 13, wherein the disparity calculation module further divides each of the top plane and the bottom plane into a plurality of blocks and, for each pixel in the top plane and the bottom plane, calculates the displacement of the pixel from the displacements of a plurality of adjacent pixels around the pixel according to the block to which the pixel belongs.

The apparatus of claim 14, wherein the disparity calculation module further calculates the displacement of each pixel according to the displacements of the adjacent pixels around the pixel and a decrement parameter, wherein the decrement parameter makes the displacements of pixels closer to the center of the top plane and the bottom plane smaller.
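
One way to picture the block-based propagation with a decrement parameter is the following Python sketch. It rests on assumptions that the claims do not state: the plane is square, the blocks are the four regions nearest each edge, each interior pixel uses the three adjacent pixels one step closer to its edge, and the neighbor displacements are averaged. The function name and the default `decrement` value are likewise hypothetical.

```python
def fill_inner_displacements(plane, decrement=0.9):
    """Fill the interior of a square plane whose outer ring already holds
    edge-pixel displacements, working inward one ring at a time.

    decrement: assumed value below 1, so displacements shrink toward the center.
    """
    n = len(plane)
    out = [row[:] for row in plane]
    for k in range(1, (n + 1) // 2):          # k = distance from the border
        for r in range(k, n - k):
            for c in range(k, n - k):
                if min(r, c, n - 1 - r, n - 1 - c) != k:
                    continue                   # pixel is not on ring k
                # Choose the block by the nearest edge (assumed partition;
                # ties are resolved in the order top, bottom, left, right).
                if r == k:
                    nbrs = [(r - 1, c + d) for d in (-1, 0, 1)]
                elif n - 1 - r == k:
                    nbrs = [(r + 1, c + d) for d in (-1, 0, 1)]
                elif c == k:
                    nbrs = [(r + d, c - 1) for d in (-1, 0, 1)]
                else:
                    nbrs = [(r + d, c + 1) for d in (-1, 0, 1)]
                # Average the three outward neighbors, then attenuate.
                mean = sum(out[i][j] for i, j in nbrs) / 3
                out[r][c] = decrement * mean
    return out
```

Because each ring reads only the ring outside it, the displacement decays geometrically toward the center, which matches the stated purpose of the decrement parameter.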

The apparatus of claim 11, wherein the pixel shifting module shifts the pixels in the side planes to the right or to the left with horizontal wrap-around according to the displacements of the side planes to generate shifted side planes, rotates the pixels in the top plane in one of a clockwise direction and a counterclockwise direction according to the displacements of the top plane to generate a shifted top plane, and rotates the pixels in the bottom plane in the other of the clockwise direction and the counterclockwise direction according to the displacements of the bottom plane to generate a shifted bottom plane, wherein the shifted side planes, the shifted top plane and the shifted bottom plane constitute the shifted polyhedral mapping projection.
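
The horizontal wrap-around shift of the side planes can be sketched in Python as follows. The sketch assumes that the side planes are concatenated into one strip covering the full 360-degree ring, that displacements are rounded to whole pixels, and that positions receiving no pixel keep their original value; none of these choices is specified by the claims, and the function name is hypothetical.

```python
def shift_side_strip(strip, displacements, direction=1):
    """Shift each pixel of the concatenated side planes horizontally.

    strip: list of rows of pixel values spanning all side planes.
    displacements: per-pixel displacement, same shape as strip.
    direction: +1 shifts to the right, -1 shifts to the left.
    """
    width = len(strip[0])
    # Start from a copy so unfilled positions keep the source pixel
    # (an illustrative hole-filling choice, not taken from the claims).
    out = [row[:] for row in strip]
    for y, row in enumerate(strip):
        for x, value in enumerate(row):
            # The modulo wraps a pixel leaving one end of the strip
            # back in at the other end.
            nx = (x + direction * round(displacements[y][x])) % width
            out[y][nx] = value
    return out
```

For example, shifting the one-row strip `[10, 20, 30, 40]` to the right by a uniform displacement of 1 moves the last pixel around to the first column.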

The apparatus of claim 16, wherein the pixel shifting module further divides the top plane from its center into a plurality of blocks and shifts each pixel in each block, according to the displacements of the top plane, in a direction approximating the one of the clockwise direction and the counterclockwise direction, wherein when the shift of a pixel crosses from its block into an adjacent block, the direction is turned so that the shift continues within the adjacent block; and divides the bottom plane from its center into a plurality of blocks and shifts each pixel in each block, according to the displacements of the bottom plane, in a direction approximating the other of the clockwise direction and the counterclockwise direction, wherein when the shift of a pixel crosses from its block into an adjacent block, the direction is turned so that the shift continues within the adjacent block.

The apparatus of claim 16, wherein when the pixels of the side planes are shifted to the right with horizontal wrap-around, the pixels of the top plane are rotated in the counterclockwise direction and the pixels of the bottom plane are rotated in the clockwise direction; and when the pixels of the side planes are shifted to the left with horizontal wrap-around, the pixels of the top plane are rotated in the clockwise direction and the pixels of the bottom plane are rotated in the counterclockwise direction.

The apparatus of claim 11, wherein the polyhedral mapping projection is a cube mapping projection.

The apparatus of claim 11, wherein the disparity calculation module calculates the displacements of the pixels in the side planes by using a principal component analysis (PCA) optical flow method.