TWM529333U - Embedded three-dimensional image system - Google Patents

Embedded three-dimensional image system

Info

Publication number
TWM529333U
TWM529333U (application TW105209678U)
Authority
TW
Taiwan
Prior art keywords
image
image data
depth
data
units
Prior art date
Application number
TW105209678U
Other languages
Chinese (zh)
Inventor
Qi-Zhou Gao
Original Assignee
Nat Univ Tainan
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nat Univ Tainan
Priority to TW105209678U priority Critical patent/TWM529333U/en
Publication of TWM529333U publication Critical patent/TWM529333U/en

Description

Stereoscopic embedded image system

The present creation relates to an image system, and more particularly to a stereoscopic embedded image system.

When a person views a stereoscopic (3D) image on a display, the viewer's two eyes, being horizontally offset from each other, see the same displayed picture from slightly different angles, so the image received by each eye differs slightly. The brain fuses the two images into a single image and, through the binocular depth-perception mechanism, judges depth from the similarities and differences between the two views, thereby producing a stereoscopic visual effect.

The conventional way of generating a stereoscopic image is to combine a two-dimensional (2D) image with its corresponding depth information, given as a grayscale image, to synthesize the pair of images to be presented separately to the two eyes. After the 2D image and the depth information are synthesized into a composite image, however, the composite contains pixels with no color information, known as holes; where the depth changes sharply, very large holes appear, and in those hole regions the brain cannot assemble a complete stereoscopic image. A common remedy is to smooth the depth map: although this reduces the size of the holes, it destroys the original depth information and greatly weakens the stereoscopic effect of the composite image.

The main objective of the present creation is therefore to provide a stereoscopic embedded image system that reduces the impact of holes on the stereoscopic image.

The stereoscopic embedded image system of the present creation comprises: a High Definition Multimedia Interface (HDMI) receiver that receives original color image data; a field programmable gate array (FPGA) device electrically connected to the HDMI receiver to receive the original color image data, the FPGA device comprising: a stereo matching unit that converts the original color image data into color image data and depth image data; and a depth-image-based renderer comprising: a depth-map preprocessing unit whose input is connected to the stereo matching unit to receive the color image data and the depth image data and to produce from them a preprocessing result containing a two-dimensional image map and a depth map; two image warping units whose inputs are connected to the stereo matching unit and the depth-map preprocessing unit and which warp the color image data according to the preprocessing result to produce first image data and second image data, respectively; two patching-order estimation units whose inputs are connected to the two image warping units to receive the first image data and the second image data, respectively, and which estimate a patching order for each to produce first estimated image patching data and second estimated image patching data, respectively; and two best-patch matching units whose inputs are connected to the two patching-order estimation units to receive the first estimated image patching data and the second estimated image patching data, respectively, and which perform best-patch matching on each to produce first simulated image data and second simulated image data, respectively; and an HDMI transmitter electrically connected to the FPGA device to output the first simulated image and the second simulated image.

According to the present creation, taken as a whole, the color image data and the depth image data pass in sequence through image warping, patching-order estimation, and patch matching to produce the first and second simulated images. The creation fills holes using patching-order estimation and best-patch search, so that the size of the synthesized holes, and hence the distortion of the image, is reduced without noticeably degrading the stereoscopic effect of the composite image.

Referring to FIG. 1, the stereoscopic embedded image system of the present creation comprises a High Definition Multimedia Interface (HDMI) receiver 10, a Field Programmable Gate Array (FPGA) device 11, and an HDMI transmitter 12. The input of the FPGA device 11 is electrically connected to the HDMI receiver 10, and the HDMI transmitter 12 is electrically connected to the output of the FPGA device 11.

The HDMI receiver 10 and the HDMI transmitter 12 use the now widely adopted digital audio/video transmission interface, which can receive or transmit uncompressed audio and video signals. In this embodiment, the HDMI receiver 10 is connected via a cable to an image generating device, which may be a digital camera, a mobile device, or a computer product, and receives from it the original color image data RXdata to be processed together with a synchronization signal SYNC. The HDMI transmitter 12 is connected via another cable to a terminal display device, which may be a display or a computer product.

The FPGA device 11 is a programmable logic component common in the field of image processing. In general, the FPGA device 11 contains a stereo matching unit 110 and a depth-image-based rendering (DIBR) engine 111. The stereo matching unit 110 converts the original color image data RXdata into color image data IMcolor and depth image data IMdepth; this is an existing function of the stereo matching unit 110, and the DIBR engine 111 operates according to the synchronization signal SYNC, so neither is detailed here.

Referring to FIG. 2, in outline the DIBR engine 111 contains a depth-map preprocessing unit 112, two image warping units 113, two patching-order estimation units 114, and two best-patch matching units 115.

The input of the depth-map preprocessing unit 112 is connected to the stereo matching unit 110 to receive the color image data IMcolor and the depth image data IMdepth processed by the stereo matching unit 110, and it performs a depth-map preprocessing step on them to produce a preprocessing result R, which contains a two-dimensional (2D) image map and a depth map.

The inputs of the two image warping units 113 are connected to the stereo matching unit 110 and the depth-map preprocessing unit 112. Each warps the color image data IMcolor according to the preprocessing result R to produce first image data IM1 and second image data IM2, respectively. For example, if the first image data IM1 is to be viewed by the user's left eye, the second image data IM2 is viewed by the user's right eye.

The inputs of the two patching-order estimation units 114 are connected to the two image warping units 113 to receive the first image data IM1 and the second image data IM2, respectively. Each performs a patching-order estimation step on its input, producing first estimated image patching data IME1 and second estimated image patching data IME2, respectively.

The inputs of the two best-patch matching units 115 are connected to the two patching-order estimation units 114, and their outputs are connected to the HDMI transmitter 12. The two best-patch matching units 115 receive the first estimated image patching data IME1 and the second estimated image patching data IME2 from the two patching-order estimation units 114, respectively, and perform a best-patch matching step on each to produce first simulated image data IMS1 and second simulated image data IMS2, which are then sent to the HDMI transmitter 12.

The first simulated image data IMS1 and the second simulated image data IMS2 are transmitted through the HDMI transmitter 12 to the terminal display device, which displays the composited result. For example, the first simulated image data IMS1 may be viewed by the person's left eye and the second simulated image data IMS2 by the right eye; when the user watches the picture presented by the terminal display device, the brain combines IMS1 and IMS2 into an image with a stereoscopic (3D) visual effect.

By providing the depth-map preprocessing unit 112, the present creation produces a preprocessing result R that lets the subsequent image warping and hole-filling steps run more smoothly.

The operation of the depth-map preprocessing unit 112, the image warping units 113, the patching-order estimation units 114, and the best-patch matching units 115 is described below in turn.

The depth-map preprocessing unit 112 determines, from the color image data IMcolor and the depth image data IMdepth, whether the edge of an object block is a depth edge or an image edge.

For the depth-edge decision, a difference measure D is first defined: (1)

In formula (1), d_max denotes the maximum disparity (horizontal displacement) between two corresponding points in the left and right images, d_min denotes the minimum such disparity, and d denotes the disparity of the two corresponding points under consideration. The disparity d is computed from the camera focal length f as: (2) where b is the baseline distance between the two cameras, z is the perpendicular distance between the object and the cameras, and f is the camera focal length.
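Formula (2) is the standard pinhole stereo relation; a minimal sketch, under the assumption that (2) reads d = f·b/z and that the measure D of (1) normalizes d between d_min and d_max (the formula images themselves are not reproduced in this text):

```python
def disparity(f, b, z):
    """Horizontal displacement (disparity) of a point at depth z, for
    focal length f and camera baseline b: assumed d = f*b/z per (2)."""
    return f * b / z

def normalized_difference(d, d_min, d_max):
    """Difference measure D of (1), assumed to map d into [0, 1]
    between the minimum and maximum disparities."""
    return (d - d_min) / (d_max - d_min)
```

Nearer objects (smaller z) produce larger disparities, which is why strong depth changes create large holes after warping.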

The depth difference value D_diff is defined as: (3)

In formula (3), D_i,j denotes the depth block centered on the pixel at coordinates (i, j).

Accordingly, the depth-map preprocessing unit 112 decides the depth edge according to: (4)

In formula (4), λ is a first decision threshold, expressed as: (5)

where b is the baseline distance between the two cameras, H is the area of the so-called hole, and f is the camera focal length. If DepthEdge = 1, the block is judged to be a depth edge.
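The depth-edge test of (3)–(4) can be sketched as follows; because the formula images are not reproduced here, taking D_diff as the maximum absolute difference from the block's center pixel is an assumption:

```python
def depth_difference(block, i, j):
    """D_diff of (3), assumed: the largest absolute difference between
    the center pixel (i, j) and any pixel in the depth block."""
    center = block[i][j]
    return max(abs(v - center) for row in block for v in row)

def is_depth_edge(d_diff, lam):
    """Decision rule (4): DepthEdge = 1 when the depth difference
    exceeds the first threshold lambda of (5)."""
    return 1 if d_diff > lam else 0
```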

For the image-edge decision, the depth-map preprocessing unit 112 first computes an image intensity value Var_i,j for the object block: (6)

In formula (6), Var_i,j represents the expected value of the object's block centered on pixel (i, j), μ is the mean difference between all image blocks and the block centered on (i, j), and Y_i,j denotes the intensity of the pixel at coordinates (i, j), expressed as: (7)

Accordingly, the depth-map preprocessing unit 112 decides the image edge according to: (8)

In formula (8), γ is a second decision threshold, expressed as: (9)

In formula (9), Ng denotes the number of distinct gray levels and h_i denotes the histogram value of gray level i.
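The image-edge test of (6)–(8) can be sketched as block-variance thresholding. The plain sample variance for (6) and the ITU-R BT.601 luma weights for Y in (7) are assumptions, since the formula images are not reproduced in this text:

```python
def block_variance(block):
    """Var of (6), assumed: sample variance of pixel intensities in the
    block around (i, j); high variance suggests texture or an edge."""
    n = sum(len(row) for row in block)
    mean = sum(v for row in block for v in row) / n
    return sum((v - mean) ** 2 for row in block for v in row) / n

def luma(r, g, b):
    """Y of (7), assumed to be BT.601 luma: 0.299R + 0.587G + 0.114B."""
    return 0.299 * r + 0.587 * g + 0.114 * b

def is_image_edge(var, gamma):
    """Decision rule (8): ImageEdge = 1 when the block variance exceeds
    the second threshold gamma of (9)."""
    return 1 if var > gamma else 0
```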

If, on the other hand, an object block is judged to be both a depth edge and an image edge, the depth-map preprocessing unit 112 performs a depth-alignment step and an extension step. The depth-alignment step sets the depth values d_p,q of the block to a minimum depth, as follows: (10)

In formula (10), 1 ≤ p ≤ m and 1 ≤ q ≤ n, where (p, q) are pixel coordinates and m, n are threshold values.

Further, the present creation determines the edge direction for depth extension as follows: (11)

In the aforementioned extension step, k neighboring pixels are copied at the depth edge, as follows: (12)

In formula (12), 1 ≤ p ≤ k, where k is a threshold value.
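The alignment (10) and extension (12) steps can be sketched on a single scanline. Treating the local minimum depth as background and copying it across k pixels past the edge is an assumed reading of the undisplayed formulas:

```python
def align_and_extend(depth_row, edge, k):
    """One-dimensional sketch of depth alignment (10) and extension (12):
    around position `edge`, set the k pixels past the edge to the local
    minimum depth, i.e. copy the background value across the depth edge
    so that later warping opens smaller holes."""
    out = list(depth_row)
    lo = max(0, edge - k)
    hi = min(len(depth_row), edge + k + 1)
    background = min(depth_row[lo:hi])  # minimum depth = background
    for p in range(edge, min(len(out), edge + k)):
        out[p] = background
    return out
```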

The depth-map preprocessing unit 112 then outputs an image map and a depth map to the image warping units 113, making the warping step easier to carry out.

Each image warping unit 113 essentially projects the pixels lying in the plane of the color image data IMcolor to their corresponding real three-dimensional points in the 3D coordinate system, and then projects them onto the virtual image plane of the terminal display device. Let f denote the camera focal length and t_x the camera baseline. A point with depth in the X, Y, Z dimensions is expected to project onto corresponding pixel positions on the camera image planes, given by the following two formulas: (13) (14)

In formulas (13) and (14), xc and xr are obtained from the color image data IMcolor and the depth image data IMdepth, respectively, Zc is the zero-parallax setting plane, and Z is the perpendicular distance between the object and the camera. Formulas (13) and (14) assume the object lies in front of the terminal display device (i.e., between the user and the display). When presenting 3D video, an appropriate range must be defined for objects, so that the user perceives different depths when viewing the same object from different angles; the perceived depth is proportional to the screen parallax, as in the following formula: (15)

In formula (15), p is the screen parallax, Z is the perpendicular distance between the object and the camera, e is the distance between the user's eyes, and d is the reference disparity between two corresponding points in the left and right images. In formula (15), the viewing distance Z stands for the camera's true distance in the real world, whereas the depth map expresses distance as normalized 8-bit gray levels. Each image warping unit 113 can therefore convert depth values by linear quantization, as in formula (16) below, where g is defined as the gray level in the depth image data IMdepth and D is the true distance between the object and the camera in 3D space. When an object is at position Z_c, the left- and right-eye images coincide, and the apparent stereo depth equals the screen position exactly. That is, objects beyond zero parallax appear to lie inside the screen, while objects in front of zero parallax appear to pop out of it. The present creation therefore sets zero parallax by making Z_c equal to the viewing distance. (16)

Substituting formula (16) into the corresponding terms of formulas (13) and (14) yields: (17)

In formula (17), the term can be expressed as follows, which allows the multiplication to be implemented with shift registers: (18)

Formula (18) eliminates the complicated division of formula (17). Unlike conventional techniques that rely on many complex mathematical expressions, the present creation thus makes the image warping step easy to realize in hardware (i.e., each image warping unit 113 may contain shift registers that carry out the computation of formula (18)). Formulas (1) through (17) above are existing techniques in the field of image processing, so their derivations are not detailed here.
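The hardware idea behind (18) — replacing a runtime division by a multiplication with a precomputed fixed-point constant followed by a right shift — can be sketched as follows (the constant and shift width are illustrative, not taken from the patent):

```python
def shift_multiply(x, coeff_num, shift):
    """Multiply x by the fixed-point constant coeff_num / 2**shift using
    only an integer multiply and a right shift, the way (18) removes the
    division of (17) so the warp maps onto FPGA adders and shift
    registers."""
    return (x * coeff_num) >> shift
```

For example, multiplying by 0.75 becomes `(x * 3) >> 2`, and dividing by a known viewing distance can be folded into such a constant ahead of time, so no divider circuit is needed per pixel.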

The image warping units 113 output 3D images containing holes to be repaired to the patching-order estimation units 114.

It should be noted that, owing to the change of viewpoint, some regions occluded in the original image become visible in the virtual left- and right-eye images, yet no information for them exists in either view, leaving holes. These newly exposed regions are called disocclusions in computer graphics; their content can be found neither in the original color image data IMcolor nor in the associated depth image data IMdepth. The present creation fills these newly exposed regions by averaging the textures of neighboring pixels, a method known as hole-filling.

Regarding each patching-order estimation unit 114: following Criminisi et al. (Criminisi, A., Perez, P., Toyama, K., "Region filling and object removal by exemplar-based image inpainting," IEEE Transactions on Image Processing, 13(9), pp. 1200-1212, 2004), filling should first propagate along linear structures before the hole is filled, so that structure extension connects with the hole boundary shape. Criminisi et al. designed a priority function to decide the filling order, summarized as follows. Given a patch Ψ_p centered at point p, with p in the hole region Ω, the priority P(p) is computed as: (19)

In formula (19), the confidence term C(p) and the data term D(p) are defined respectively as: (20) (21)

In formulas (20) and (21), ∇I_p^⊥ is the isophote at point p, Φ is the non-hole region, ⊥ denotes the orthogonal operator, n_p is the unit vector orthogonal to ∂Ω, and α is a normalization factor. In short, C(p) is the proportion of Ψ_p that is not hole, and D(p) is the inner product of the gradient term ∇I_p^⊥ and the normal n_p. Criminisi et al. define the priority P(p) as the product of the confidence term C(p) and the data term D(p): the data term D(p) gives larger weight to isolated blocks, while the confidence term C(p) gives higher weight to blocks surrounded by non-hole regions. As the filling proceeds into the hole, the confidence term C(p), lying in the range [0, 1], becomes smaller with each update, reflecting growing uncertainty about the image values of points filled inside the hole; since the priority P(p) may then decay to zero, P(p) = C(p)D(p) can cause the isophote propagation to stop. In view of this, the present creation departs from Criminisi et al. and defines the priority P(p) as follows: (22)
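The priority computation can be sketched as follows. C(p) follows Criminisi's published definition; the additive combination standing in for (22), with the depth term Dp(p) of (23), is an assumption, since the patent's exact formula is not reproduced in this text:

```python
def confidence(patch_mask, conf_map):
    """C(p) of (20): mean confidence of the already-filled pixels in
    patch Psi_p; patch_mask holds 1 for filled (non-hole) pixels."""
    return sum(c * m for m, c in zip(patch_mask, conf_map)) / len(patch_mask)

def priority(c_p, d_p, dp_p):
    """Priority standing in for (22).  Criminisi uses C(p)*D(p), which
    can decay to zero and halt isophote propagation; an additive
    combination with the depth term Dp(p) of (23) is assumed here."""
    return c_p + d_p + dp_p
```

An additive form keeps the priority nonzero even when the confidence term has decayed, which is the failure mode of the product the text describes.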

The depth term D_p(p) gives high weight to background regions and is defined as follows: (23)

Thus, in the present creation, missing regions can be repaired using the existing surrounding points.

Each patching-order estimation unit 114 outputs the estimated hole-patching order to the corresponding best-patch matching unit 115, which then searches for the best-matching patch to perform the repair.

Regarding each best-patch matching unit 115: to search for the best-matching patch, the present creation computes the similarity over every pixel between patches. In DIBR there is not only image information but also depth information, so the similarity measure used by each best-patch matching unit 115, called the depth-based gray-level distance (DBGLD) method, adds depth information to the gray-level distance method (see L. Zhang and W.J. Tam, "Stereoscopic image generation based on depth images for 3D TV," IEEE Transactions on Broadcasting, vol. 51, no. 2, pp. 191-199, June 2005).

Let Ψ be a patch with holes to be filled. The vertical distance VD(Ψ) between Ψ and its vertical neighbor patch is defined as: (24)

In formula (24), v denotes a vertical neighbor patch of Ψ.

Further, the horizontal smooth gray-level distance HD(Ψ) between Ψ and its horizontal neighbor patch is defined as: (25)

In formula (25), h denotes a horizontal neighbor patch of Ψ.

The depth-based gray-level distance (DBGLD) is defined as: (26)

The best-matching patch Ψ′ for Ψ can then be obtained as: (27)

Accordingly, the computations of the two best-patch matching units 115 produce the first simulated image data IMS1 and the second simulated image data IMS2.
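The best-patch search of (24)–(27) can be sketched as follows. Combining a gray-level SSD with a weighted depth SSD is an assumed form of the DBGLD (the weight `w` and the exact combination are not reproduced in this text):

```python
def ssd(a, b):
    """Sum of squared differences between two equal-length patches."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def dbgld(gray, cand_gray, depth, cand_depth, w=0.5):
    """Depth-based gray-level distance standing in for (26): gray-level
    SSD plus a weighted depth SSD; w and the combination are assumed."""
    return ssd(gray, cand_gray) + w * ssd(depth, cand_depth)

def best_patch(gray, depth, candidates):
    """(27): pick the candidate (gray, depth) pair minimising the DBGLD."""
    return min(candidates, key=lambda c: dbgld(gray, c[0], depth, c[1]))
```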

The present creation uses the public depth image sequences provided by Microsoft Research, which supply continuous images at a resolution of 1024×768 together with their corresponding depth maps. Two stereoscopic sequences, "ballet" and "break-dancers", are used. The "ballet" sequence contains many depth discontinuities across two distinct depth ranges, and these numerous discontinuities produce large disoccluded regions at different depth levels. The "break-dancers" sequence, in contrast, contains many objects at almost the same depth level, and its gradual depth discontinuities produce small disoccluded regions.

Peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) serve as references for evaluating imaging quality. The present creation achieves a PSNR of 30.8883 and an SSIM of 0.6834 on the "ballet" sequence, and a PSNR of 31.0556 and an SSIM of 0.6767 on the "break-dancers" sequence, both showing excellent imaging quality.
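For reference, the PSNR used in this evaluation is a standard metric and can be computed as follows (SSIM requires a windowed computation and is omitted here):

```python
import math

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images given as flat
    sequences of pixel values; higher means closer to the reference."""
    mse = sum((a - b) ** 2 for a, b in zip(ref, test)) / len(ref)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(peak * peak / mse)
```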

10‧‧‧HDMI receiver
11‧‧‧FPGA device
110‧‧‧stereo matching unit
111‧‧‧depth-image-based renderer
112‧‧‧depth-map preprocessing unit
113‧‧‧image warping unit
114‧‧‧patching-order estimation unit
115‧‧‧best-patch matching unit
12‧‧‧HDMI transmitter
RXdata‧‧‧original color image data
SYNC‧‧‧synchronization signal
IMcolor‧‧‧color image data
IMdepth‧‧‧depth image data
R‧‧‧preprocessing result
IM1‧‧‧first image data
IM2‧‧‧second image data
IME1‧‧‧first estimated image patching data
IME2‧‧‧second estimated image patching data
IMS1‧‧‧first simulated image data
IMS2‧‧‧second simulated image data

FIG. 1 is a block diagram of an embodiment of the stereoscopic embedded image system of the present creation.
FIG. 2 is a schematic flow diagram of an embodiment of the stereoscopic embedded image system of the present creation.


Claims (2)

1. A stereoscopic embedded image system, comprising:
a high-definition multimedia interface (HDMI) receiver that receives raw color image data;
a field-programmable gate array (FPGA) device electrically connected to the HDMI receiver to receive the raw color image data, the FPGA device comprising:
a stereo matching unit that converts the raw color image data into color image data and depth image data; and
a depth image renderer comprising:
a depth map pre-processing unit, whose input is connected to the stereo matching unit to receive the color image data and the depth image data, and which generates a pre-processing result from them, the pre-processing result comprising a two-dimensional image and a depth map;
two image warping units, whose inputs are connected to the stereo matching unit and the depth map pre-processing unit, and which warp the color image data according to the pre-processing result to produce first image data and second image data, respectively;
two inpainting order estimation units, whose inputs are connected to the two image warping units to receive the first image data and the second image data, respectively, and which estimate an inpainting order for the first image data and the second image data to produce first estimated inpainting data and second estimated inpainting data, respectively; and
two best patch matching units, whose inputs are connected to the two inpainting order estimation units to receive the first estimated inpainting data and the second estimated inpainting data, respectively, and which perform best-patch matching on them to produce first simulated image data and second simulated image data, respectively; and
an HDMI transmitter electrically connected to the FPGA device to output the first simulated image data and the second simulated image data.
2. The stereoscopic embedded image system as claimed in claim 1, wherein each of the image warping units comprises a shift register.
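The claimed pipeline (warp the color image according to depth, estimate an inpainting order for the resulting holes, then fill them by best-patch matching) can be sketched in software. The sketch below is an illustrative approximation, not the patented FPGA implementation: `warp_view` performs a simple horizontal disparity shift that ignores occlusion ordering, and `repair_order` ranks hole-boundary pixels by their count of known neighbours as a simplified stand-in for a Criminisi-style priority term; all function names and parameters are assumptions.

```python
import numpy as np

def warp_view(color, depth, baseline=8.0):
    """Shift each pixel horizontally by a disparity proportional to its
    depth value. Target pixels never written to remain holes. A full
    implementation would also resolve occlusions by depth ordering."""
    h, w = depth.shape
    out = np.zeros_like(color)
    filled = np.zeros((h, w), dtype=bool)
    disparity = np.rint(baseline * depth / 255.0).astype(int)
    for y in range(h):
        for x in range(w):
            tx = x + disparity[y, x]
            if 0 <= tx < w:
                out[y, tx] = color[y, x]
                filled[y, tx] = True
    return out, ~filled  # warped view and its hole mask

def repair_order(holes):
    """Rank hole pixels by how many known neighbours they have, so that
    hole-boundary pixels adjacent to the most valid data are filled
    first -- a simplified inpainting-priority estimate."""
    h, w = holes.shape
    order = []
    for y in range(h):
        for x in range(w):
            if holes[y, x]:
                nb = holes[max(y - 1, 0):y + 2, max(x - 1, 0):x + 2]
                known = nb.size - nb.sum()
                if known > 0:  # skip pixels deep inside the hole
                    order.append((known, y, x))
    order.sort(reverse=True)
    return [(y, x) for _, y, x in order]

# Demo on a tiny synthetic image: a uniform depth map shifts everything
# right by two pixels, leaving a two-column hole on the left edge.
color = np.arange(25, dtype=np.uint8).reshape(5, 5)
depth = np.full((5, 5), 255, dtype=np.uint8)
view, holes = warp_view(color, depth, baseline=2.0)
order = repair_order(holes)
```

In hardware, the claim's per-view duplication (two warping units, two order-estimation units, two patch-matching units) lets the left and right views be synthesized in parallel; the shift registers of claim 2 suit the row-wise pixel shifting that `warp_view` expresses as nested loops here.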
TW105209678U 2016-06-28 2016-06-28 Embedded three-dimensional image system TWM529333U (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW105209678U TWM529333U (en) 2016-06-28 2016-06-28 Embedded three-dimensional image system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW105209678U TWM529333U (en) 2016-06-28 2016-06-28 Embedded three-dimensional image system

Publications (1)

Publication Number Publication Date
TWM529333U true TWM529333U (en) 2016-09-21

Family

ID=57444888

Family Applications (1)

Application Number Title Priority Date Filing Date
TW105209678U TWM529333U (en) 2016-06-28 2016-06-28 Embedded three-dimensional image system

Country Status (1)

Country Link
TW (1) TWM529333U (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110312158A (en) * 2018-03-27 2019-10-08 北京市博汇科技股份有限公司 A kind of monitoring method and device of embedded more pictures
TWI736335B (en) * 2020-06-23 2021-08-11 國立成功大學 Depth image based rendering method, electrical device and computer program product


Similar Documents

Publication Publication Date Title
JP4214976B2 (en) Pseudo-stereoscopic image creation apparatus, pseudo-stereoscopic image creation method, and pseudo-stereoscopic image display system
US8471898B2 (en) Medial axis decomposition of 2D objects to synthesize binocular depth
KR101185870B1 (en) Apparatus and method for processing 3 dimensional picture
JP5402483B2 (en) Pseudo stereoscopic image creation device and pseudo stereoscopic image display system
TWI496452B (en) Stereoscopic image system, stereoscopic image generating method, stereoscopic image adjusting apparatus and method thereof
Cheng et al. Spatio-temporally consistent novel view synthesis algorithm from video-plus-depth sequences for autostereoscopic displays
Po et al. Automatic 2D-to-3D video conversion technique based on depth-from-motion and color segmentation
US9196080B2 (en) Medial axis decomposition of 2D objects to synthesize binocular depth
Reel et al. Joint texture-depth pixel inpainting of disocclusion holes in virtual view synthesis
Xu et al. Depth-aided exemplar-based hole filling for DIBR view synthesis
US8976171B2 (en) Depth estimation data generating apparatus, depth estimation data generating method, and depth estimation data generating program, and pseudo three-dimensional image generating apparatus, pseudo three-dimensional image generating method, and pseudo three-dimensional image generating program
Knorr et al. Stereoscopic 3D from 2D video with super-resolution capability
JP2006186795A (en) Depth signal generating apparatus, depth signal generating program, pseudo stereoscopic image generating apparatus, and pseudo stereoscopic image generating program
TWM529333U (en) Embedded three-dimensional image system
Kao Stereoscopic image generation with depth image based rendering
Jiufei et al. A new virtual view rendering method based on depth image
Knorr et al. From 2D-to stereo-to multi-view video
Lee et al. Segment-based multi-view depth map estimation using belief propagation from dense multi-view video
KR20140113066A (en) Multi-view points image generating method and appararus based on occulsion area information
Kim et al. Effects of depth map quantization for computer-generated multiview images using depth image-based rendering
Jung et al. Superpixel matching-based depth propagation for 2D-to-3D conversion with joint bilateral filtering
Kim et al. Automatic object-based 2D-to-3D conversion
Ramirez et al. An effective inpainting technique for hole filling in DIBR synthesized images
Ming et al. A Novel Method of Multi-view Virtual Image Synthesis for Auto-stereoscopic Display
Priya et al. 3d Image Generation from Single 2d Image using Monocular Depth Cues

Legal Events

Date Code Title Description
MM4K Annulment or lapse of a utility model due to non-payment of fees