TWI410896B - Compressibility-aware media retargeting with structure preserving - Google Patents

Compressibility-aware media retargeting with structure preserving

Info

Publication number
TWI410896B
Authority
TW
Taiwan
Prior art keywords
image
processing unit
content
video
adjustment
Prior art date
Application number
TW099107658A
Other languages
Chinese (zh)
Other versions
TW201133396A (en)
Inventor
shu fan Wang
Shang Hong Lai
Original Assignee
Nat Univ Tsing Hua
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nat Univ Tsing Hua
Priority to TW099107658A
Publication of TW201133396A
Application granted
Publication of TWI410896B

Landscapes

  • Image Processing (AREA)

Abstract

The present invention provides a method for retargeting an image, comprising determining a total block structure energy of an input image content. A compressibility rate of the input image content is determined based on the total block structure energy. An optimal scaling factor of the input image content is obtained. The input image content is warped using new coordinate matrices and is then uniformly scaled to a target image resolution.

Description

Compressibility-aware multimedia retargeting method with structure preservation

The present invention relates to an image processing method for retargeting image/video data, and in particular to a compressibility-aware multimedia retargeting method with structure preservation.

Image resizing is a standard tool in many image processing applications. In practice, it is performed by uniformly scaling the image to a target size. Recent interest in image resizing lies in changing the image size while keeping important features intact, where these features can be detected by top-down or bottom-up methods. Top-down methods use tools such as face detectors to detect important regions in the image, whereas bottom-up methods rely on visual saliency methods to build a visual saliency map of the image. Once the visual saliency map has been constructed, cropping can be used to display the most important region of the image.

In recent years, content-aware image/video retargeting has become increasingly important because of the growing need to display images on display devices with different resolutions or aspect ratios. Several algorithms have been proposed to adapt image or video content to different display settings. Most existing methods fall into three categories. The first applies a cropping or carving strategy to remove less important regions (Reference 1: S. Avidan and A. Shamir, "Seam carving for content-aware image resizing," ACM Transactions on Graphics (SIGGRAPH), vol. 26, no. 3, 2007; Reference 2: M. Rubinstein, A. Shamir, and S. Avidan, "Improved seam carving for video retargeting," ACM Transactions on Graphics (SIGGRAPH), pp. 1-9, 2008). The second segments the image into foreground and background layers and scales each layer independently (Reference 3: V. Setlur, S. Takagi, R. Raskar, M. Gleicher, and B. Gooch, "Automatic image retargeting," ACM SIGGRAPH Sketches, vol. 154, p. 4, 2004; Reference 4: C. Tao, J. Jia, and H. Sun, "Active window oriented dynamic video retargeting," Workshop on Dynamical Vision ICCV, 2007). The third adaptively warps the image according to local image saliency (Reference 5: L. Wolf, M. Guttmann, and D. Cohen-Or, "Non-homogeneous content-driven video-retargeting," International Conference on Computer Vision, pp. 1-6, 2007; Reference 6: Y.-S. Wang, C.-L. Tai, O. Sorkine, and T.-Y. Lee, "Optimized scale-and-stretch for image resizing," ACM Transactions on Graphics (SIGGRAPH Asia), pp. 1-8, 2008).

Cropping-based methods remove less important regions from the image by cropping. They may discard a large amount of image information, and they sometimes fail when important features lie in distant parts of the image/video. To address the problems caused by cropping, Avidan and Shamir (Reference 1) proposed the interesting idea of incrementally removing or inserting regions, known as seam carving. However, a straightforward extension of seam carving to video retargeting produces jittery artifacts. An improved seam carving algorithm was therefore proposed, which reduces artifacts by computing a forward energy to find the lowest-cost seam (Reference 2). Although artifacts or distortion of the content structure cannot be avoided in some cases, seam carving remains an effective technique for image/video retargeting.

Instead of cropping or carving the image content, image segmentation offers another way to separate image regions of different importance. Recently, Wolf et al. (Reference 5) proposed a warping-based method that combines saliency measurement, a face detector and motion estimation to detect important regions automatically for video retargeting. They formulated the mapping of the retargeting grid as the solution of a large sparse linear system. Based on a similar concept, Wang et al. (Reference 6) proposed a method that allows important regions to be scaled uniformly while homogeneous regions are distorted. This method gives more freedom to exploit homogeneous image regions. The results of References 5 and 6 show that these constraints better preserve the overall shape of salient objects in the retargeted image/video. Different methods may suit images with different content, and in some cases several methods may need to be combined to resize an image properly.

In the human visual system, the eye is sensitive to the shapes of certain objects, such as circles or straight lines. Relying on image saliency measurement, most prior work restricts retargeting to a bottom-up strategy that preserves local structural continuity, which may fail to preserve the global image structure as a whole. In seam carving (References 1 and 2), for example, the image is resized by removing or inserting an 8-connected seam.

Most prior techniques resize the image according to an estimated energy and minimize the deformation of neighboring pixels (References 1, 2, 5 or 6). With these bottom-up methods, deformation accumulates and produces unexpected distortion of the image structure or noticeable artifacts. In warping-based methods, most of the distortion is caused by non-uniform warping of vulnerable objects. Furthermore, following the basic concept of the retargeting algorithm of Wolf et al. (Reference 5), the image resizing problem is formulated as solving a constrained linear system. Most prior work focuses on image/video resizing in the spatial domain. Some other work addresses video resizing along the temporal domain, where trimming less important frames shortens the image sequence. More recently, some researchers have treated image resizing as a jigsaw puzzle, that is, as a rearrangement of patches or pixels. These methods can be used for interactive image editing. All of the aforementioned methods share a problem common to most prior work: structural distortion.

In view of the foregoing drawbacks, the present invention provides an improved multimedia image resizing method that overcomes the problems encountered in most of the prior work.

To overcome the problems of the prior art, the present invention provides a technique based on top-down block structure energy evaluation and multimedia compressibility evaluation in order to preserve the structure of the image content.

An object of the present invention is to provide a method for preserving the global image structure.

In contrast to prior work, object structures can be protected, and a global approach is used to reduce distortion and to balance content awareness against multimedia compressibility.

A further object of the present invention is to provide a compressibility-aware multimedia retargeting method with structure preservation, so that image content and object structures are better preserved without excessive compression or stretching.

To achieve the above objects, the present invention provides an image retargeting method comprising: determining, by a central processing unit or processing unit, a total block structure energy of an input image content; determining, by the central processing unit or processing unit, a compressibility rate of the input image content based on the total block structure energy; determining, by the central processing unit or processing unit, an optimal scaling factor of the input image content; and warping, by the central processing unit or processing unit, the input image content using new coordinate matrices and uniformly scaling the input image content to output a target image resolution.

The image retargeting method of the present invention further comprises, before warping the input image content, solving for the least-squares solution of a sparse linear system by the central processing unit or processing unit.

According to another aspect of the present invention, a video retargeting method is disclosed, comprising: determining, by a central processing unit or processing unit, a total block structure energy and a compressibility rate of an input video content for each frame; determining, by the central processing unit or processing unit, an optimal scaling factor of the input video content; and warping, by the central processing unit or processing unit, the input video content using new coordinate matrices and uniformly scaling the input video content to output a target video resolution.

In addition, the video retargeting method of the present invention further comprises, before warping the input video content, solving for the least-squares solution of a sparse linear system with constraints, and smoothing the optimal scaling factor.

The present invention is described in detail below with reference to its preferred embodiments and the accompanying drawings. It should be understood that all preferred embodiments of the invention are illustrative only and not limiting. Accordingly, besides the preferred embodiments described herein, the invention may be widely applied in other embodiments. The invention is not limited to any embodiment, and its scope is defined by the appended claims and their equivalents.

To overcome the drawbacks of the prior art, the present invention proposes an automatic image resizing algorithm that preserves the salient structures in an image well. Instead of reducing the distortion of neighboring pixels or grid cells, the invention defines a block structure energy that distributes the local structure energy uniformly over the pixels inside the bounding box of each detected structure segment. Based on this energy, the proposed algorithm forces the deformation of each block region to be as uniform as possible. The invention further proposes to evaluate, from the image content, the compressibility rate in each direction. The compressibility rates and the total entropy help determine the optimal scaling factors used to resize the image to the best resolution having the same aspect ratio as the target image size. Experimental comparison with the earlier retargeting methods shows that image resizing with the algorithm of the invention achieves better structure preservation.

The present invention proposes a top-down strategy for preserving image structure, which uses a block saliency map that adapts to the size of each structured object. The compressibility rate of each image can be evaluated from the distribution of image gradient magnitudes and orientations. The retargeting can then be optimized based on the compressibility rates evaluated in the x and y directions. The retargeted frame thus meets the basic requirements of keeping the salient content and preserving the global structure. In the present invention, different strategies can be used to resize the image automatically and adaptively so that salient image content and structure are optimally preserved.

In image/video resizing, the image content is adapted to different types of display devices, and its aspect ratio may change while the important parts of the content are kept as intact as possible. An earlier study by the inventors focused on preserving line structures in images (Reference 7: S.-F. Wang and S.-H. Lai, "Fast structure-preserving image retargeting," Proceedings of the 2009 IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 1049-1052, 2009), which is insufficient for other kinds of object structures. In the present invention, image resizing is approached from a new viewpoint, namely as an adjustment of the image aspect ratio. Without regard to the actual target size, the optimal width and height that best adapt the original image to the target aspect ratio are determined first, and the result is then uniformly scaled to the target image size. The proposed algorithm is summarized in FIG. 1.

Referring to FIG. 1, a flowchart of image resizing according to the present invention is shown. The image resizing can be performed in an image/video processing device or display device 10, as shown in FIG. 2. FIG. 2 is a block diagram of the image/video processing device 10, which includes an image/video adjustment module 11, a central processing unit or processing unit 12, a display 13, an operating system 15, an input unit 16, an output unit 17, a memory unit 19, a frame memory 20, a video memory 21, and the like. The block diagram is for illustration only and does not limit the scope of the invention. Examples of such image/video processing devices or display devices include digital televisions, computers, notebook computers, mobile phones, the Apple iPhone, the Apple iPod music player, e-book readers and digital photo frames. An input image 5 is fed to the image/video adjustment module 11, which serves as an image processing unit that pre-processes the input image signal output from a display device. Because the image/video adjustment module is not a main feature of the invention, its detailed description is omitted. The memory unit 19 includes a read-only memory, a random access memory and a non-volatile flash memory. The frame memory 20 stores frame images processed by the image/video adjustment module 11. The video memory 21 stores video images processed by the image/video adjustment module 11. The input unit 16 reads data and programs from an external device, and the output unit 17 outputs data from the processing unit 12.

In the first step 100, a central processing unit or processing unit, such as the processing unit 12 in FIG. 2, determines (computes) the total block structure energy of the content structure in an input image. The input image 5 may first be processed by the image processing unit (the image/video adjustment module 11), after which the central processing unit or processing unit computes the total block structure energy of the image content structure. Many potentially important image features can be handled through the block structure energy. In one embodiment, a block structure energy is introduced into a warping-based method for the purpose of structure preservation. To evaluate importance within an image, the invention uses the intensity gradient, the block structure energy and a content saliency map to resize the image, while protecting object structures and minimizing distortion through a global approach. In other words, the invention applies a global, top-down approach to each structural component, rather than working on connected pixels (seam carving), neighboring pixels (the method of Wolf et al., 2007) or neighboring grid cells (the method of Wang et al., 2008), whose distortions accumulate through local smoothness. The global approach forces each bounding box to scale uniformly, so that the content structure inside the box is kept safe. Relative importance is determined by the proposed block structure energy.

To extract the contours of a color image, the color tensor proposed in Reference 9 (J. V. D. Weijer, T. Gevers, and A. W. M. Smeulders, "Robust photometric invariant features from the color tensor," IEEE Trans. Image Processing, vol. 15, p. 2006, 2004) can be applied to detect salient edges. An extracted contour may connect to the contours of other objects and spread over the whole image. The extracted edges are therefore decomposed into many pieces by simply cutting them at corners, which can be detected with the Harris corner detector. Note that structure segments containing too few pixels are treated as noise and removed.

The present invention provides a simpler and more effective way to protect the extracted structure segments, namely the block structure energy. The basic idea of structure preservation is that all pixels inside a block should be stretched or compressed as uniformly as possible. To maintain salient structures, each structure segment should be protected as a single unit. Take a straight line as an example: after resizing, the pixels on the line segment should still share the same slope, so all pixels on the segment are constrained to have the same slope. Salient structure segments of many different shapes are protected after resizing in the same way. As shown in FIG. 3, an extracted (content) structure segment 200 contains straight lines and curves and is forced to deform uniformly in each direction. In FIG. 3, the block structure energy forms a jelly-like space that protects the structure segment from severe distortion. For example, the extracted structure segment 200 can be compressed along the x-axis, along the y-axis, and along both the x- and y-axes to become structure segments 201, 202 and 203, respectively. Note that the straight parts of structure segments 201, 202 and 203 remain straight, and their curved parts bend appropriately without producing artifacts.
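The straight-line argument can be checked numerically. The following is a minimal sketch, in which the helper names `scale_points` and `slopes` are illustrative and not part of the disclosure: it applies one uniform pair of scale factors to points on a line and lets the reader confirm that every consecutive pair of points keeps the same slope.

```python
def scale_points(points, sx, sy):
    """Scale every (x, y) point by sx along x and sy along y."""
    return [(x * sx, y * sy) for x, y in points]


def slopes(points):
    """Slopes between consecutive points (x values must differ)."""
    return [(y1 - y0) / (x1 - x0)
            for (x0, y0), (x1, y1) in zip(points, points[1:])]


# Five points on the line y = 0.5 * x + 2.
line = [(float(x), 0.5 * x + 2.0) for x in range(5)]
# Compress x to 60% while leaving y unchanged (uniform inside the block).
warped = scale_points(line, sx=0.6, sy=1.0)
```

Under a uniform deformation every slope becomes the original slope multiplied by sy / sx, so the segment stays straight. A non-uniform warp inside the block would give different slopes for different pixel pairs, which is the distortion the block structure energy is meant to prevent.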

Next, the resizing flexibility of each structure segment in the x and y directions is defined. Let B_i denote the set of pixels inside the bounding block that delimits and protects structure i. For resizing in the x direction, all pixels inside block B_i share the same block structure energy value, which is defined as:

where e_i denotes the index set of the edge pixels of structure i, and the gradient term denotes the partial derivative of the input image with respect to x, that is, the intensity gradient. Note that a pixel P_j may be covered by more than one block, so the combined total energy along the x direction can be defined as:

where E_sal denotes the saliency map, which can be obtained with the method of Itti et al. (Reference 8: L. Itti, C. Koch, and E. Niebur, "A model of saliency-based visual attention for rapid scene analysis," IEEE Trans. Pattern Anal. Mach. Intell., 1998). Note that |G_x|, E_sal and the block structure energy are normalized to the range from 0 to 1. The total block structure energy along the y direction can be defined in a similar way:

According to Equations 2 and 3, the importance of the input image is evaluated from the intensity gradient, the block structure energy and the saliency map. The total block structure energy is selected or determined from these three quantities, for example as their maximum. The combined energy maps for the different axes (for example, the energy maps for horizontal and for vertical resizing) are defined separately to give more flexibility for resizing in the x and y directions. Note that the block structure energies in different directions differ according to the orientation of each structure. In other words, if a structure is vulnerable in one direction, the block containing it is reinforced in that direction.
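The construction described above can be sketched as follows. This is only an illustration under stated assumptions: the per-segment energy is taken here as the mean normalized |dI/dx| over the segment's edge pixels (an assumption, since the defining equations are not legible in this text), and the combination uses the per-pixel maximum that the description mentions as one example. The function names `block_energy_map` and `total_energy` are hypothetical.

```python
def block_energy_map(grad_x, segments):
    """Spread each segment's energy uniformly over its bounding box.

    grad_x   : 2-D list of normalized |dI/dx| values in [0, 1]
    segments : list of segments, each a list of (row, col) edge pixels
    ASSUMPTION: a segment's energy is the mean |dI/dx| over its edge pixels.
    """
    h, w = len(grad_x), len(grad_x[0])
    energy = [[0.0] * w for _ in range(h)]
    for seg in segments:
        e = sum(grad_x[r][c] for r, c in seg) / len(seg)
        rows = [r for r, _ in seg]
        cols = [c for _, c in seg]
        for r in range(min(rows), max(rows) + 1):
            for c in range(min(cols), max(cols) + 1):
                # A pixel covered by several blocks keeps the largest energy.
                energy[r][c] = max(energy[r][c], e)
    return energy


def total_energy(grad_x, saliency, block):
    """Per-pixel maximum of gradient, saliency and block energy."""
    return [[max(g, s, b) for g, s, b in zip(g_row, s_row, b_row)]
            for g_row, s_row, b_row in zip(grad_x, saliency, block)]


# Tiny example: one diagonal segment inside a 3 x 4 image.
grad_x = [[0.8, 0.0, 0.0, 0.0],
          [0.0, 0.4, 0.0, 0.0],
          [0.0, 0.0, 0.0, 0.0]]
saliency = [[0.0] * 4 for _ in range(3)]
block = block_energy_map(grad_x, [[(0, 0), (1, 1)]])
total = total_energy(grad_x, saliency, block)
```

Every pixel of the 2 x 2 bounding box receives the same block energy, including pixels that carry no gradient themselves, which is what lets the whole box deform as one unit. A separate map would be built from |dI/dy| for vertical resizing.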

Next, referring to FIG. 1, in the second step 101 the central processing unit or processing unit determines or evaluates the compressibility rate of the input image content structure in each direction. Excessive compression or stretching may distort the content structure, so distributing the compression and stretching between the x and y directions is important for minimizing distortion. As described above, the importance of the input image is evaluated from the intensity gradient, the block structure energy and the saliency map. In other words, the optimal compressibility rate can be evaluated from the energy map of the image content. In one embodiment, the invention resizes in both directions, for example by exploiting homogeneous image regions to shrink and stretch in the x and y directions (see Reference 6). This approach amounts to changing both the width and the height to match the target aspect ratio. Images with different content have different flexibility for resizing, and the compressibility rate in each direction can be evaluated. The compressibility rate is defined as follows:

where the first quantity denotes the intensity gradient magnitude at pixel (x, y), and the two thresholds, one for each direction, are the means of the maximum intensity gradient values along the x and y directions. These two thresholds are the critical values for an important intensity gradient magnitude, and they also set the baseline for the compressibility rates r_x and r_y. The compressibility rates r_x and r_y are evaluated by Equation 4. FIG. 4 shows an example of the compressibility rate. In FIG. 4, for an input image 300 such as a car, the intensity gradient magnitude distributions 301 and 302 along the x and y directions are shown, and a compressible region 303 can be selected or determined. The compressible region 303 is likely to be a relatively unimportant region. Because this embodiment uses the maximum operator, this measure is a conservative safety line.
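One possible reading of this step is sketched below. It is an assumption, not the patent's Equation 4: each column (or row) is summarized by its maximum gradient magnitude, the threshold is the mean of those maxima, and the compressibility rate is the fraction of columns (or rows) whose maximum falls below the threshold. The function name `compressibility` is hypothetical.

```python
def compressibility(grad_mag):
    """Estimate (r_x, r_y) from a 2-D list of gradient magnitudes.

    ASSUMPTION: a column/row is compressible when its maximum gradient
    magnitude is below the mean of all column/row maxima.
    """
    h, w = len(grad_mag), len(grad_mag[0])
    col_max = [max(grad_mag[r][c] for r in range(h)) for c in range(w)]
    row_max = [max(row) for row in grad_mag]
    thr_x = sum(col_max) / w  # threshold for the x direction
    thr_y = sum(row_max) / h  # threshold for the y direction
    r_x = sum(1 for m in col_max if m < thr_x) / w
    r_y = sum(1 for m in row_max if m < thr_y) / h
    return r_x, r_y


# A vertical edge in the second column of a 3 x 4 image.
grad_mag = [[0.0, 0.9, 0.0, 0.0],
            [0.0, 0.8, 0.1, 0.0],
            [0.0, 0.0, 0.0, 0.0]]
r_x, r_y = compressibility(grad_mag)
```

In this toy image three of the four columns are nearly flat, so most of the width can absorb compression, whereas only one of the three rows is flat. Taking the maximum along each column or row is the conservative choice mentioned in the text: a single strong gradient is enough to mark the whole line as not compressible.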

Next, referring to FIG. 1, in the third step 102 the central processing unit or processing unit determines the optimal scaling factor of the input image content structure in each direction. In the present invention, only the aspect ratios of the source and target images need to be considered. Image resizing is defined as resizing a source (input) image of size W_o × H_o into a target (retargeted) image of size W_r × H_r, where W_o and H_o are the width and height of the source image and W_r and H_r are the width and height of the retargeted image. In one embodiment, the optimal scaling factors between the source and target images depend only on their aspect ratios. With the adjustment factors in the x and y directions denoted S_x and S_y, the linear relation between the resolutions of the input image and the retargeted image can be defined as:

$$\frac{S_x W_o}{S_y H_o} = \frac{W_r}{H_r} \tag{5}$$

Depending on the image content, the optimal S_x and S_y may differ even when the target aspect ratio (W_r / H_r) stays the same.
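The aspect-ratio relation can be illustrated with a short worked example. The sketch below uses the constant β = W_r H_o / (W_o H_r) and the relation S_x = β·S_y stated elsewhere in the description; the function names and the sample sizes are illustrative assumptions.

```python
def aspect_constant(w_o, h_o, w_r, h_r):
    """beta = (W_r * H_o) / (W_o * H_r), so that S_x = beta * S_y."""
    return (w_r * h_o) / (w_o * h_r)


def intermediate_size(w_o, h_o, s_y, beta):
    """Size after warping with factors (beta * s_y, s_y)."""
    s_x = beta * s_y
    return s_x * w_o, s_y * h_o


# Retarget a 640 x 480 source to a 320 x 480 target.
beta = aspect_constant(640, 480, 320, 480)          # 0.5
# Two different choices of S_y, e.g. for two different image contents.
size_a = intermediate_size(640, 480, 1.0, beta)     # compress x only
size_b = intermediate_size(640, 480, 1.2, beta)     # compress x, stretch y
# Uniform factor that brings size_b to the target width.
uniform_b = 320 / size_b[0]
```

Both intermediate sizes already have the target aspect ratio, so a single uniform scale finishes the job in either case. What differs is how the deformation is split between the two axes, which is the degree of freedom that the content-dependent choice of S_y exploits.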

FIG. 5 shows the coordinate system of the adjustment factors (S_x, S_y), in which the x-direction adjustment factor S_x is plotted on the x-axis and the y-direction adjustment factor S_y on the y-axis. For a position (S_x, 0) along the x direction, lying to the right or to the left of the point (1, 0) indicates stretching or compression, respectively. Similarly, for a position (0, S_y) along the y direction, lying above or below the point (0, 1) indicates stretching or compression, respectively. As shown in FIG. 5, the optimal adjustment factors must satisfy the following requirements: the aspect ratio constraint (Equation 5), the compressibility rates, and closeness to the original size, that is, to the point (1, 1). In other words, to reduce distortion of the retargeted image, the first consideration is to keep the adjustment factors (S_x, S_y) close to the original-size point (1, 1). The second consideration is to keep the adjustment factors (S_x, S_y) close to the safety line defined by the evaluated compressibility rates, for example (1 - r_x, 1 + r_y).

As shown in FIG. 5, combining the two considerations above with the aspect-ratio constraint, the problem can be formulated as:

[Equation (6)]

where $\beta = W_r H_o / (W_o H_r)$ is a constant. The closed-form solution of the above formulation, namely the optimal scaling factors, is therefore readily obtained as:

[Equation (7)]
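Because the formulation and its closed-form solution are not legible in this text, the following is only a sketch under an assumed objective, not the patent's formula. It assumes the squared distance to $(1, 1)$ plus $\omega$ times the squared distance to a single "safe" point such as $(1 - r_x, 1 + r_y)$, minimized along the line $S_x = \beta S_y$. The name `optimal_scaling` and the parameter `safe_point` are hypothetical.

```python
def optimal_scaling(beta, omega, safe_point):
    """Closed-form minimizer of an ASSUMED quadratic objective.

    Minimizes (Sx - 1)^2 + (Sy - 1)^2
              + omega * ((Sx - ax)^2 + (Sy - ay)^2)
    subject to Sx = beta * Sy, where (ax, ay) = safe_point.
    Setting the derivative with respect to Sy to zero gives the
    expression below.
    """
    ax, ay = safe_point
    sy = (beta + 1.0 + omega * (beta * ax + ay)) / ((1.0 + omega) * (1.0 + beta * beta))
    return beta * sy, sy

# With omega = 0 the result is the projection of (1, 1) onto Sx = beta * Sy.
sx, sy = optimal_scaling(beta=0.5, omega=0.0, safe_point=(0.7, 1.3))

# A positive omega pulls the solution toward the safe point.
sx_w, sy_w = optimal_scaling(beta=0.5, omega=1.0, safe_point=(0.7, 1.3))
```

Under this assumed objective, a larger $\omega$ gives the compressibility estimate more influence, while $\omega = 0$ simply picks the point on the aspect-ratio line nearest to the original size.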

Note that $\omega$ may be user-defined or a constant. Depending on the image content, $\omega$ can also be determined automatically by estimating the structure complexity. If the structure content is more distinct, its weight on the compressibility rate is larger, and vice versa. Structure complexity can be defined as a function (combination) of the image gradient magnitude and direction.

$$\omega = c \cdot \exp(-H_{all})$$

$$H_{all} = \left(H_{|g|} + H_{\angle g}\right) \qquad (8)$$

where $H_{|g|}$ is the entropy of the image gradient magnitude, which represents the complexity of the gradient variation, and $H_{\angle g}$ is the entropy of the gradient direction, which relates to the density of the content structure. $c$ is a constant used to rescale $\omega$ to the range between 0 and $c$. Note that $H_{all}$ is non-negative and reflects the complexity of the image gradient magnitude and direction; therefore, the larger $H_{all}$ is, the smaller $\omega$ becomes. The structure complexity $\omega$ can be used as a degree of freedom for image retargeting.
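The automatic choice of $\omega$ can be illustrated with a small pure-Python sketch. The forward-difference gradients, the 8-bin histograms, the natural-log Shannon entropy, and the plain sum of the two entropies are all assumptions made for this example; the text does not specify these details.

```python
import math

def entropy(values, bins, lo, hi):
    """Shannon entropy (natural log) of a histogram of `values` over [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins or 1.0  # avoid a zero bin width for flat data
    for v in values:
        counts[min(int((v - lo) / width), bins - 1)] += 1
    n = len(values)
    return -sum((c / n) * math.log(c / n) for c in counts if c)

def structure_weight(image, c=1.0, bins=8):
    """Return omega = c * exp(-H_all) for a grayscale image (list of rows)."""
    mags, dirs = [], []
    for y in range(len(image) - 1):
        for x in range(len(image[0]) - 1):
            gx = image[y][x + 1] - image[y][x]   # forward difference in x
            gy = image[y + 1][x] - image[y][x]   # forward difference in y
            mags.append(math.hypot(gx, gy))
            dirs.append(math.atan2(gy, gx))
    h_mag = entropy(mags, bins, 0.0, max(mags))
    h_dir = entropy(dirs, bins, -math.pi, math.pi)
    return c * math.exp(-(h_mag + h_dir))

flat = [[7] * 4 for _ in range(4)]
busy = [[0, 10, 0, 30], [5, 0, 20, 0], [0, 40, 0, 10], [15, 0, 25, 0]]
omega_flat = structure_weight(flat)   # both entropies are zero here
omega_busy = structure_weight(busy)   # varied gradients lower omega
```

A flat image has zero gradient entropy and therefore receives the maximum weight $c$, while an image with varied gradient magnitudes and directions receives a smaller weight.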

Referring to FIG. 1, in step 103 the central processing unit or processing unit solves for the least-squares solution of a sparse linear system. As noted above, the stretch/shrink relationship between the x and y directions is defined by the resolutions of the source image and the target image as $S_x = \beta \cdot S_y$. The optimal scaling factors for different image content can be determined by Equation 7. If $\beta$ is less than 1, the image should be compressed along the x axis and stretched along the y axis. If $\beta$ is greater than 1, the roles are swapped: the image is compressed along the y axis and stretched along the x axis. In the present invention, retargeting a given image of size $W_o \times H_o$ to size $S_x W_o \times S_y H_o$ requires re-establishing the new coordinates $(X_p, Y_p)$ of each pixel $p = (x, y)$ under three types of constraints, which may, for example, be geometric constraints. Take the computation of $X_p$ as an example. The first constraint assumes that each pixel lies at a fixed distance from its left and right neighbors: $X_{x,y} - X_{x-1,y} = 1$ and $X_{x+1,y} - X_{x,y} = 1$. The second constraint maps each pixel to a position similar to that of its upper and lower neighbors: $X_{x,y} - X_{x,y+1} = 0$. The third constraint fits the warped image to the retargeted image size: $X_{1,y} = 1$ and a corresponding condition at the opposite image border.

In one embodiment, for content-aware image retargeting, an important pixel should be warped to a position about one pixel away from its neighbors, whereas a less important pixel may be blended with, or separated from, its neighboring pixels. The first constraint should therefore be weighted by the corresponding structure energy value (Equation 2). Because the shrinking and stretching of the pixels within each block structure should be as uniform as possible, the maximum importance value of 1 may be assigned to the second constraint (the smoothness term) for every pixel belonging to the set of all pixels covered by structure blocks. The equations of the constrained linear system are as follows:

[Equation (9)]

The smoothness weight represents the importance between a pixel and its upper and lower neighbors and controls the smoothness at pixel $X_{x,y}$. All the equations in Equation 9 together form an over-determined constrained sparse linear system, in which the first three equations express the neighbor constraints and the fourth expresses the boundary constraint. The optimal new pixel coordinates are obtained by minimizing the sum of squared errors of these equations, which is equivalent to finding the least-squares solution $\mathbf{x}$ of the sparse linear system. Similarly, the coordinate variables $Y_{x,y}$ of pixel $(x, y)$ can be obtained from the least-squares solution of the corresponding sparse linear system:

[Equation (10)]

The smoothness weight represents the importance between a pixel and its left and right neighbors and controls the smoothness at pixel $Y_{x,y}$. Similarly, all the equations in Equation 10 together form an over-determined constrained sparse linear system, in which the first three equations express the neighbor constraints and the fourth expresses the boundary constraint. In other words, the least-squares solutions of the sparse linear systems are obtained from Equations 9 and 10, and the degree of image warping follows from these solutions.
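The effect of weighting the spacing constraints can be seen in a deliberately simplified one-dimensional sketch. It handles a single image row, ignores the vertical smoothness term, and treats the two border conditions as hard constraints, so it is not the full sparse system described above. Under these assumptions the weighted least-squares problem has a closed form via a Lagrange multiplier. The function name `warp_row` and the example weights are hypothetical.

```python
def warp_row(weights, target_width):
    """New x coordinates for one row of len(weights) + 1 pixels.

    Minimizes sum_i w_i * (d_i - 1)^2 over the spacings d_i between
    neighboring pixels, subject to X_1 = 1 and X_n = target_width.
    The optimum is d_i = 1 + slack * (1 / w_i) / sum_j (1 / w_j), so
    low-weight (unimportant) gaps absorb most of the size change.
    No fold-over prevention is attempted in this sketch.
    """
    n = len(weights) + 1
    inv = [1.0 / w for w in weights]
    slack = target_width - n          # total change in row length
    total = sum(inv)
    xs = [1.0]
    for v in inv:
        xs.append(xs[-1] + 1.0 + slack * v / total)
    return xs

# Five pixels shrunk to width 3: the first two gaps are "important".
xs = warp_row([10.0, 10.0, 1.0, 1.0], target_width=3.0)
spacings = [b - a for a, b in zip(xs, xs[1:])]
```

In this example the two high-weight gaps stay close to one pixel wide while the two low-weight gaps shrink strongly, which mirrors the intent of weighting the first constraint by structure energy.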

Referring to FIG. 1, in step 104 the central processing unit or processing unit warps the input image using the new coordinate matrices ($X$ and $Y$) and uniformly scales the result to the target image resolution. Based on Equations 9 and 10, the retargeted image is generated by warping the original image according to the new coordinates ($X$ and $Y$) of all pixels. Note that the warped size ($S_x W_o \times S_y H_o$) may not equal the actual target image resolution ($W_r \times H_r$). Although the image resolution after warping is ($S_x W_o \times S_y H_o$) rather than ($W_r \times H_r$), the two have the same aspect ratio, so the warped image can be uniformly scaled to ($W_r \times H_r$). Examples of the compressibility rate and scaling-factor estimation are shown in FIGS. 3, 4 and 5. From the different energy maps in the x and y directions, it can be seen whether the estimate is suitable for retargeting along a particular direction.
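The final uniform scaling step can be sketched as follows. The function name and the 400×300 to 200×300 sizes are assumptions for illustration. The scaling factors 0.62 and 1.24 are taken from the example in the text and happen to be consistent with $\beta = 0.5$.

```python
import math

def uniform_scale(wo, ho, wr, hr, sx, sy):
    """Return (warped_width, warped_height, k), where k scales the warped
    image of size (sx * wo) x (sy * ho) uniformly to the target wr x hr."""
    warped_w, warped_h = sx * wo, sy * ho
    kx, ky = wr / warped_w, hr / warped_h
    # The two ratios agree only when sx / sy equals beta.
    if not math.isclose(kx, ky, rel_tol=1e-9):
        raise ValueError("scaling factors violate the aspect-ratio constraint")
    return warped_w, warped_h, kx

warped_w, warped_h, k = uniform_scale(400, 300, 200, 300, sx=0.62, sy=1.24)
```

Here the warped image is about 248×372 pixels, and a single factor of roughly 0.806 brings both dimensions to 200×300 without further distortion.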

For example, with $S_x = 0.62$ and $S_y = 1.24$, the compressibility rate is high in both directions, so the compression along the x direction may be larger while the stretching along the y direction may be smaller. To visualize the final result over the whole image, the distances between neighboring pixels along the x and y directions are measured on the $X$ and $Y$ matrices, respectively. The corresponding compression/stretch distributions are colorized with a jet color bar. It can be observed that, thanks to the block structure energy, the structures are well preserved.

Another aspect of the present invention relates to video retargeting. The algorithm for video retargeting is summarized in FIG. 6. Video retargeting is a method of converting an existing video to fit the size of an arbitrary display. Please refer to FIG. 6, which shows the flow of the video retargeting method according to the present invention. Similarly, video retargeting can be implemented in an image/video processing device or display, such as the device in FIG. 2. In the first step 500, the central processing unit or processing unit determines (computes) the total block structure energy and the compressibility rate of each frame. As consumer-grade video cameras become increasingly affordable, users can easily produce their own multimedia content. To fit different types of image/video processing devices or display devices, such as digital televisions, computers, notebook computers, mobile phones, the Apple iPhone, the Apple iPod, e-books and digital photo frames, frames should be retargeted adaptively according to their content. However, retargeting each frame individually may cause flickering artifacts. To meet the needs of video retargeting, the image retargeting technique described above is modified and extended. First, the most salient regions that attract a viewer's attention in a video clip are moving objects. Based on this consideration, the saliency measure in Equations 2 and 3, combined with the normalized motion energy $E_m$, can be expressed as:

[Equation (11)]

where $E_m$ denotes the magnitude of the motion field computed from neighboring frames. For efficiency, the motion vectors may be estimated only at detected corners. Camera motion is approximated by the average of the motion vectors, and the true object motion can be estimated by removing the camera motion. Note that $E_m$ is also normalized to the range between 0 and 1.
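A minimal sketch of the motion-energy idea follows. It assumes the motion vectors at detected corners are already available as `(dx, dy)` pairs and normalizes by the largest residual magnitude. The normalization rule and the function name `motion_energy` are assumptions, since the text only states that $E_m$ lies between 0 and 1.

```python
import math

def motion_energy(vectors):
    """Normalized object-motion magnitude for each (dx, dy) motion vector.

    Camera motion is approximated by the mean vector and subtracted, and
    the residual magnitudes are divided by their maximum (an assumed
    normalization) so the result lies in [0, 1].
    """
    n = len(vectors)
    cam_dx = sum(dx for dx, _ in vectors) / n
    cam_dy = sum(dy for _, dy in vectors) / n
    residual = [math.hypot(dx - cam_dx, dy - cam_dy) for dx, dy in vectors]
    peak = max(residual)
    if peak == 0.0:
        return [0.0] * n  # pure camera motion: no object stands out
    return [r / peak for r in residual]

# Three background corners follow a pan while one corner moves faster.
energy = motion_energy([(2.0, 0.0), (2.0, 0.0), (2.0, 0.0), (6.0, 0.0)])
```

The corner that moves differently from the estimated camera motion receives the highest energy, while a pure camera pan yields zero energy everywhere.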

Next, referring to FIG. 6, in the second step 501 the central processing unit or processing unit determines the optimal scaling factor of each frame. Based on the estimated total energy, the optimal scaling factors of each frame in the video can be estimated. As described above, these optimal scaling factors are readily determined by Equation 7.

Then, in the third step 502, the central processing unit or processing unit solves the sparse linear system with additional constraints and smooths the scaling factors. To maintain temporal coherence, one of the optimal scaling factors can be smoothed over time with a moving-average method, and the other is then determined from the smoothed factor through the aspect-ratio constraint. Similar to the temporal smoothing used in Reference 5, smoothness of the mapping between adjacent frames is enforced by adding the following constraint (Equation 12) to the linear systems formed by Equations 9 and 10. The constraint equation is as follows:

[Equation (12)]
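The temporal smoothing of the scaling factors can be sketched with a causal moving average. Which of the two factors is smoothed is not legible in this text, so this sketch assumes $S_y$ is smoothed and $S_x$ is derived from $S_x = \beta S_y$. The window length and the function name `smooth_factors` are also assumptions.

```python
def smooth_factors(sy_per_frame, beta, window=5):
    """Return a list of (sx, sy) pairs, one per frame.

    sy is the mean of the current and up to window - 1 previous per-frame
    estimates, and sx follows from the aspect-ratio constraint.
    """
    smoothed = []
    for t in range(len(sy_per_frame)):
        chunk = sy_per_frame[max(0, t - window + 1): t + 1]
        sy = sum(chunk) / len(chunk)
        smoothed.append((beta * sy, sy))
    return smoothed

# Per-frame estimates that jump between frames are averaged out.
factors = smooth_factors([1.0, 1.2, 1.4], beta=0.5, window=2)
```

Because $S_x$ is recomputed from the smoothed $S_y$, every frame still satisfies the aspect-ratio constraint after smoothing.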

Similarly, the least-squares solution of the sparse linear system is readily determined from Equations 9 and 10 together with the additional constraint (Equation 12).

Referring to FIG. 6, in step 503 the central processing unit or processing unit warps the input image using the coordinate matrices ($X$ and $Y$) and uniformly scales the result to output the target video resolution. Based on Equations 9 and 10 and the additional constraint, the retargeted image is generated by warping the original image according to the new coordinates of all pixels. Because the warping matrices $X$ and $Y$ are smooth and similar between adjacent frames, they can be estimated starting from the initial guess obtained from the previous frame, and the corresponding linear system can be solved efficiently with the symmetric LQ method (Reference 10: C. C. Paige and M. A. Saunders, "Solution of sparse indefinite systems of linear equations," SIAM J. Numer. Anal., vol. 12, pp. 617-629, 1975).

For more comparison results, please refer to the demonstration videos at http://cv.cs.nthu.edu.tw/research/CAMR-SP/index.htm.

Referring to FIG. 6, in step 504 the estimated coordinate matrices $X$ and $Y$ are retained as the initial solution for the next frame. Then, in step 505, the procedure returns to step 500 to begin retargeting the next frame, and the above steps are repeated.

As will be understood by those skilled in the art, the present invention has been described above by way of preferred embodiments, which are not intended to limit the spirit of the invention. Modifications and similar arrangements made without departing from the spirit and scope of the invention shall be included within the scope of the following claims, which should be accorded the broadest interpretation so as to cover all such modifications and similar structures.

5 ... input image

10 ... image/video processing device or display device

11 ... image/video retargeting module

12 ... central processing unit or processing unit

13 ... display

15 ... operating system

16 ... input unit

17 ... output unit

19 ... memory unit

20 ... frame memory

21 ... video memory

100 ... compute the total block structure energy

101 ... estimate the compressibility rate

102 ... determine the optimal scaling factor

103 ... solve for the least-squares solution of the sparse linear system

104 ... warp the input image using the X and Y coordinate matrices and uniformly scale it to the target image resolution

200, 201, 202, 203 ... structure segments

300 ... input image

301 ... distribution of intensity gradient magnitude in the x direction

302 ... distribution of intensity gradient magnitude in the y direction

303 ... compressed region

500 ... compute the total block structure energy and compressibility rate of each frame

501 ... determine the optimal scaling factor of each frame

502 ... solve the sparse linear system with additional constraints and smooth the scaling factors

503 ... warp the input image using the coordinate matrices X and Y and uniformly scale it to output the target video resolution

504 ... retain the estimated coordinate matrices X and Y as the initial solution for the next frame

505 ... step

The above elements, as well as other features and advantages of the present invention, will become more apparent upon reading the description of the embodiments together with the drawings:

FIG. 1 shows a flowchart of image retargeting according to the present invention.

FIG. 2 shows a block diagram of an image/video processing device.

FIG. 3 shows the compression of content structures along the x and y directions.

FIG. 4 shows an example of a compressibility rate.

FIG. 5 shows a schematic diagram of the aspect-ratio constraint on the optimal scaling factor according to the present invention.

FIG. 6 shows a flowchart of video retargeting according to the present invention.

Claims (8)

1. A method for retargeting an image, comprising: determining, by a processing unit, a total block structure energy of an input image content; determining, by the processing unit, a compressibility rate of the input image content based on the total block structure energy, wherein the compressibility rate ($r_x$ and $r_y$) of the input image content is obtained by the following equation: [equation], where $G(x, y)$ equals [equation] and denotes the intensity gradient magnitude of pixel $(x, y)$, and two further terms denote the averages of the maximum intensity gradient values along the x and y directions; determining, by the processing unit, an optimal scaling factor of the input image content; and warping, by the processing unit, the input image content using new coordinate matrices and uniformly scaling the input image content to output a target image resolution.

2. The method of claim 1, further comprising, before warping the input image content, solving, by the processing unit, for a least-squares solution of a sparse linear system.

3. The method of claim 1, wherein the total block structure energy is determined by an intensity gradient, a block structure energy and a saliency map.

4. The method of claim 1, wherein the optimal scaling factor is obtained by the following equation: [equation], where $\beta$ equals $W_r \cdot H_o / (W_o \cdot H_r)$ and $(S_x, S_y)$ denotes a scaling factor.

5. A method for retargeting a video, comprising: determining, by a processing unit, a total block structure energy and a compressibility rate of each frame of an input video content, wherein the compressibility rate ($r_x$ and $r_y$) of the input video content is obtained by the following equation: [equation], where $G(x, y)$ equals [equation] and denotes the intensity gradient magnitude of pixel $(x, y)$, and two further terms denote the averages of the maximum intensity gradient values along the x and y directions; determining, by the processing unit, an optimal scaling factor of the input video content; and warping, by the processing unit, the input video content using new coordinate matrices and uniformly scaling the input video content to output a target video resolution.

6. The method of claim 5, further comprising, before warping the input video content, solving, by the processing unit, for a least-squares solution of a sparse linear system including a constraint, and smoothing the optimal scaling factor, wherein the constraint includes [equation], where $X$ and $Y$ denote the new coordinate matrices and $x$ and $y$ denote an original coordinate matrix.

7. The method of claim 5, wherein the total block structure energy is determined by an intensity gradient, a block structure energy and a saliency map.

8. The method of claim 5, wherein the optimal scaling factor is obtained by the following equation: [equation], where $\beta$ equals $W_r \cdot H_o / (W_o \cdot H_r)$ and $(S_x, S_y)$ denotes a scaling factor.
TW099107658A 2010-03-16 2010-03-16 Compressibility-aware media retargeting with structure preserving TWI410896B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW099107658A TWI410896B (en) 2010-03-16 2010-03-16 Compressibility-aware media retargeting with structure preserving

Publications (2)

Publication Number Publication Date
TW201133396A TW201133396A (en) 2011-10-01
TWI410896B true TWI410896B (en) 2013-10-01

Family

ID=46751208

Family Applications (1)

Application Number Title Priority Date Filing Date
TW099107658A TWI410896B (en) 2010-03-16 2010-03-16 Compressibility-aware media retargeting with structure preserving

Country Status (1)

Country Link
TW (1) TWI410896B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5454043A (en) * 1993-07-30 1995-09-26 Mitsubishi Electric Research Laboratories, Inc. Dynamic and static hand gesture recognition through low-level image analysis
US20030185445A1 (en) * 2002-03-29 2003-10-02 Industrial Technology Research Institute Method for extracting and matching gesture features of image
TW200933538A (en) * 2008-01-31 2009-08-01 Univ Nat Chiao Tung Nursing system
TW200939796A (en) * 2007-12-21 2009-09-16 Sony Corp Image pickup apparatus, color noise reduction method, and color noise reduction program

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Shu-Fan Wang, and Shang-Hong Lai, "FAST STRUCTURE-PRESERVING IMAGE RETARGETING", the IEEE International Conference on Acoustics, Speech and Signal Processing, 2009, ICASSP 2009, pp.1049-1052, April 19-24, *
