TW200922338A - Image encoding/decoding device and method - Google Patents

Image encoding/decoding device and method

Info

Publication number
TW200922338A
Authority
TW
Taiwan
Prior art keywords
image
edge
unit
recorded
edge component
Prior art date
Application number
TW97135352A
Other languages
Chinese (zh)
Inventor
Naofumi Wada
Takeshi Chujoh
Original Assignee
Toshiba Kk
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Toshiba Kk filed Critical Toshiba Kk
Publication of TW200922338A


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 9/00: Image coding
    • G06T 9/004: Predictors, e.g. intraframe, interframe coding
    • G06T 9/20: Contour coding, e.g. using detection of edges
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10: using adaptive coding
    • H04N 19/134: adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N 19/136: incoming video signal characteristics or properties
    • H04N 19/14: coding unit complexity, e.g. amount of activity or edge presence estimation
    • H04N 19/169: adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N 19/17: the unit being an image region, e.g. an object
    • H04N 19/60: using transform coding
    • H04N 19/61: transform coding in combination with predictive coding

Abstract

An image encoding device comprises: an extractor (121) for extracting a first edge component image from an original image; a separator (130) for separating a reference image into a second edge component image and an edge-removed image; an auxiliary information generator (122) for generating auxiliary information for predicting the first edge component image from the second edge component image; a predictor (141) for predicting, using the auxiliary information, a third edge component image from the second edge component image; and a prediction image generator (142) for generating a prediction image by combining the edge-removed image and the third edge component image.
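The abstract describes a pure data flow between the claimed units. As an illustration only, a minimal Python sketch of that flow is given below; the helper callables (extract_edge, make_aux_info, predict_edge) and their signatures are assumptions made for this sketch, not anything specified by the patent.

```python
import numpy as np

def encode_frame(original, reference, extract_edge, make_aux_info, predict_edge):
    """Illustrative data flow of the abstract; all callables are assumed helpers."""
    original = np.asarray(original, dtype=np.float64)
    reference = np.asarray(reference, dtype=np.float64)
    first_edge = extract_edge(original)                # extractor (121)
    second_edge = extract_edge(reference)              # separator (130): edge component image
    edge_removed = reference - second_edge             # separator (130): edge-removed image
    aux_info = make_aux_info(second_edge, first_edge)  # auxiliary information generator (122)
    third_edge = predict_edge(second_edge, aux_info)   # predictor (141)
    prediction = edge_removed + third_edge             # prediction image generator (142)
    residual = original - prediction                   # residual that is then transformed and coded
    return residual, aux_info
```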

Description

IX. Description of the Invention

[Technical Field of the Invention]
The present invention relates to the encoding and decoding of images.

[Prior Art]
MPEG-2 and H.264/MPEG-4 AVC (hereinafter abbreviated as H.264) are known video coding methods. In these coding methods, the original image is divided into blocks of a predetermined size, and inter-frame prediction by motion estimation and motion compensation is performed in units of those blocks to generate a predicted image. The prediction residual between the predicted image and the original image is then subjected to a discrete cosine transform (hereinafter abbreviated as DCT), quantization and entropy coding to generate coded data. Furthermore, in the motion estimation of H.264, a block size suited to the shape of the object or the complexity of its motion is selected from among many block sizes (16x16, 16x8, 8x16, 8x8, 8x4, 4x8 and 4x4 pixels), so the prediction efficiency is higher than that of MPEG-2 and the like.
However, the motion estimation accompanying the selection of such a block size is suitable for inter-frame prediction when the rigid body moves in parallel within the picture, but for example, when the object between the frames is deformed, the frame prediction is performed. Will not be suitable, there will be a large prediction residual. In particular, in a place (area) where the pixel 値 changes in the space such as the edge and the texture, the prediction residual is likely to occur, and the predicted residual is easily viewed as a mosquito noise in the decoded image, and the drawing is caused. Deterioration. The prediction residual is described above, although the smaller block -4-200922338 size can be selected to reduce the motion timing, but if a smaller block size is selected, the coding amount of the header information containing the motion vector will increase. Big. Even in a place where the edge or texture changes spatially in a pixel, the conversion coefficient after orthogonal conversion such as DCT is easily dispersed in the high-frequency component' because the quantization error of the conversion coefficient results in decoding the image. Deterioration of edge sharpness or defect of texture information occurs. Further, the prediction residual contained in the locally decoded video is transmitted as a reference image when it is predicted as a reference image. In Japanese Laid-Open Patent Publication No. Hei 7-240925, the edge component is extracted in advance from the original image, and the image of the removed edge component is encoded by the existing block based on MPEG-2 or H.264, and only the edge component is encoded. The image is additionally encoded correctly (so that it does not produce a prediction residual). In this way, since the visually important edge components are individually encoded correctly, the coding distortion around the edges can be reduced compared to the previous animation coding method, so the mosquito noise in the decoded image is The occurrence can be suppressed. Therefore, especially when encoding is performed at a high compression ratio, it is expected that the subjective image quality in the reproduced image is improved. In the Japanese Patent Publication No. 3 2 3 3 1 8 5, when performing motion compensation, projection conversion is performed in consideration of parallel movement, rotation, expansion and contraction of an object, etc., in order to determine the projection conversion parameter 'will parameters The outline information of the above-mentioned object extracted from the original image is used as auxiliary information. Therefore, even if there is a deformation of the object between the frames, it is possible to generate a predicted image by adding geometric distortion to the reference image based on the above-described contour information. Therefore, the prediction residual can be reduced. In addition, 'not the outline information extracted from the original image is encoded as -5, 200922338 code', but the residual of the contour information and the contour information extracted from the referenced image that has been encoded is encoded, thereby suppressing The amount of coding. In the Japanese Patent Application Laid-Open No. Hei 7-240925, since only the image of the edge component is separately encoded correctly, the amount of encoding increases as compared with the case where the entire image is encoded. 
Further, since only the encoding of the edge component image is performed separately from the encoding of the image from which the edge component is removed, it is difficult to control the balance between the encoding amount and the image quality of both the encoded data. For example, when only the amount of generated code generated by the image of the edge component is increased, if the total amount of code is to be maintained, the image quality of the image other than the edge component is deteriorated, and the edge component is not to be maintained while maintaining the overall image quality. The amount of code generated by the image will increase. Further, in Japanese Patent No. 3 23 3 1 85, the contour information is used as the auxiliary information in order to determine the projection conversion parameter, and although the outline information reflects the shape information of the edge, the movement of the object is caused. The degradation of the edge sharpness caused by the contour blur or quantization error is not reflected. Therefore, even if contour information is used to determine projection conversion parameters, the reproducibility of edge sharpness cannot be improved. Therefore, an object of the present invention is to generate a prediction pixel which enhances reproducibility based on a coding target image included in a characteristic of a spatial pixel change which is representative of an edge component, and can reduce animation residual encoding/decoding of prediction residuals. Device. An image coding apparatus according to an aspect of the present invention includes: an extraction unit, -6-200922338 extracts a first edge component image from an original image; and a separation unit separates the reference image into a second edge component image and edge And the auxiliary information generating unit generates auxiliary information required to predict the first edge component image from the second edge component image; and the prediction unit uses the pre-recording auxiliary information to record the second edge from the front The component image is used to predict the third edge component image; and the predicted image generation unit combines the pre-recorded edge-removed image and the pre-recorded third edge component image to generate a predicted image; and the prediction residual calculation unit obtains the pre-recorded original image. The prediction residual between the predicted image and the pre-recorded image; and the coding unit encodes the pre-recorded residual and the pre-recorded auxiliary information. According to still another aspect of the present invention, a video encoding apparatus includes: a decoding unit that decodes encoded data that has been input, and obtains a first edge component image for predicting a residual of a target image and a video of a target image; The auxiliary information required for prediction; and the separation unit separates the decoded reference image into the second edge component image of the reference image and the edge removal after removing the second edge component image from the reference image. 
The image and the prediction unit use the pre-recording auxiliary information to predict the first edge component image from the second edge component image; and the synthesis unit combines the pre-edge edge-removed image and the pre-recorded second edge component image to The predicted video is generated, and the decoded video generating unit generates a decoded video of the pre-recorded video using the pre-recorded residual and the pre-recorded video. [Cross Embodiment] Hereinafter, embodiments of the present invention will be described with reference to the drawings. -7-200922338 (First Embodiment) Fig. 1 is a view showing a video encoding apparatus according to a first embodiment of the present invention. The video encoding apparatus according to the present embodiment includes an encoding unit 1 and an encoding control unit 150. The coding unit 1 includes a subtractor 1 〇1, a conversion/quantization unit 102, an inverse transform/inverse quantization unit 103, an entropy encoder 104, an adder 105, a frame memory 106, and a motion estimation/motion compensation unit 107. And prediction unit 1 10. The encoding unit 100 is controlled by the encoding control unit 150. The encoding unit 1〇〇 performs an animation encoding process called hybrid encoding on the original video signal 10 of the input animation, and then outputs the encoded material 14. That is, the encoding unit 100 converts/quantizes the quantized conversion coefficients from the predicted residual signal between the predicted video signal predicted by the encoded reference video signal and the original video signal 1 〇. Entropy coding is performed, and then the encoded data 14 is output. Hereinafter, in the encoding unit 100, the original video signal 1 is divided into blocks of a predetermined size by a block scanning converter (not shown), and the processing is performed in the block unit. It can also be processed in a grid or field unit. The subtraction unit 101 subtracts the second predicted video signal from the pre-measurement unit 1 1 0 from the original video signal 1 !! 9 to calculate the predicted residual signal 丨! And input to the conversion/quantization section 1 〇 2. The conversion/quantization unit 102 performs conversion of, for example, DCT on the prediction residual signal 1 1 from the subtractor 1 , 1, and performs quantization based on the quantization parameter set by the encoding control unit 15 5 ,, and has been quantized. The conversion coefficient 12 is input to the inverse conversion/inverse quantization unit 103 and the entropy encoder 1〇4. Further, the conversion performed by the conversion/quantization section 200922338 102 is not limited to the DCT, and may be, for example, wavelet transform or independent component analysis, and may be other orthogonal transforms. The inverse conversion/inverse quantization unit 103 performs inverse quantization on the quantized conversion coefficient 1 2 based on the above-mentioned quantization parameter, for example, performs inverse conversion such as IDCT, and inputs the decoded prediction residual signal 1 3 to the adder 105. . Further, the inverse conversion performed by the inverse conversion/inverse quantization unit 1 〇3 on the quantized conversion coefficient 1 is not limited to the ID CT, but it is necessary to perform the prediction residual signal 1 1 on the conversion/quantization unit 1〇2. The reciprocal conversion of the conversions that have been performed. 
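As a rough illustration of the hybrid loop around units 101 to 106 described above, the following NumPy/SciPy sketch transforms and quantizes a residual block and then locally decodes it so that the encoder's reference stays in step with the decoder. The uniform quantizer and the use of an orthonormal 2-D DCT are assumptions; the patent leaves the exact transform and the quantization control performed by unit 150 open.

```python
import numpy as np
from scipy.fft import dctn, idctn  # type-II DCT and its inverse

def code_block(orig_block, pred_block, qp_step):
    """One pass of the hybrid loop: transform/quantize the residual, then
    locally decode it so encoder and decoder reconstructions match."""
    residual = orig_block.astype(np.float64) - pred_block        # subtractor 101
    coeffs = dctn(residual, norm="ortho")                        # transform (unit 102)
    q_coeffs = np.round(coeffs / qp_step)                        # quantization (unit 102), assumed uniform
    decoded_residual = idctn(q_coeffs * qp_step, norm="ortho")   # inverse transform/quantization (unit 103)
    local_decoded = pred_block + decoded_residual                # adder 105, stored in frame memory 106
    return q_coeffs, local_decoded
```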
The entropy encoder 104 is a pair of quantized conversion coefficients 12 from the conversion/quantization unit 1 〇2, a motion vector 18 from the motion estimation/motion compensation unit 107, which will be described later, and an auxiliary information 20 from the prediction unit 110. Entropy coding such as Huffman coding or arithmetic coding is performed, and the output becomes the coded material 14. Further, the 'entropy coder 104' encodes the quantization parameter or the prediction mode information in the same manner. The adder 1 〇 5 ' adds the decoded prediction residual signal 13 from the inverse transform/inverse quantization unit 1 〇 3 and the second predicted video signal 19 ′ from the prediction unit 1 1 后 described later to become local decoding. The image signal 15 is input to the frame memory 1 0 6 . The frame memory 1 0 6 temporarily holds the locally decoded video signal 1 5 ' from the adder 1 〇 5 as the reference video signal 丨6. This reference image signal 1 6 ' is referred to by the motion estimation/motion compensation unit 1 后 7 which will be described later. In addition, a deblocking filter may be further provided in the front stage of the frame memory 106 to remove block distortion from the locally decoded video signal 15. The motion estimation/motion compensation unit 1 07 performs motion estimation/motion compensation processing to generate motion by using the original video signal 丨〇 and the reference video signal 16 stored in the frame memory 106 in -9-200922338. 1 Predicts the image signal 1 7 and simultaneously generates a motion vector 18. The first predicted video signal 17 is input to the prediction unit 110, and the motion vector 18 is input to the entropy encoder 1〇4. The motion estimation/motion compensation processing performed by the motion estimation/motion compensation unit 107 is performed, for example, in a block unit of a predetermined size; a block is performed between the coding target block of the original video signal 1 and the reference video signal 16. match. As a result of the block matching, the block of the reference video signal 16 having the smallest encoding cost is output as the first predicted video signal, and the representation of the first predicted video signal 17 is located at the position of the reference video signal 16. Motion vector 1 8 ' will be generated. In addition, the above coding cost is based on the absolute sum (s AD) of the difference between the original video signal 1 〇 and the first predicted video signal 17 〇 and 'because the reference video signal 16 has been encoded, so it is not limited The temporally past frames can also be predicted based on the more future frames, so it is not limited to one frame but can be predicted based on the complex frames. The coding cost can be obtained, for example, by the following equation (1). [Expression 1] K = SAD + /1X OH (1 ) Here, K is the coding cost, SAD is the absolute 値 of the difference, and λ is the fixed number determined by the 垔 width or the quantization parameter - 10- 200922338 ' The system indicates the header information such as the index of the motion vector and the referenced frame. In addition, the coding cost k can also be used only by using the header information 〇 ,, or the difference can be converted or approximated. Others, the activity of the original image signal 10 can also be used to create a cost function. 
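Equation (1), which appears garbled above, reads K = SAD + λ × OH. A minimal sketch of block matching under that cost is shown below; the header_bits callable standing in for OH and the exhaustive full search are assumptions made for illustration, not a procedure fixed by the patent.

```python
import numpy as np

def block_matching(orig_block, reference, top, left, search, lam, header_bits):
    """Exhaustive search minimizing K = SAD + lambda * OH (equation (1)).
    header_bits(dx, dy) is an assumed stand-in for the header cost OH."""
    h, w = orig_block.shape
    best_mv, best_cost = None, np.inf
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + h > reference.shape[0] or x + w > reference.shape[1]:
                continue
            cand = reference[y:y + h, x:x + w]
            sad = np.abs(orig_block.astype(np.int64) - cand.astype(np.int64)).sum()
            cost = sad + lam * header_bits(dx, dy)
            if cost < best_cost:
                best_mv, best_cost = (dy, dx), cost
    return best_mv, best_cost  # motion vector 18 and its coding cost
```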
The prediction unit 1 1 取得 obtains the first predicted video signal 17 from the motion estimation/motion compensation unit 107, and generates reproducibility of the characteristic portion of the spatial pixel 値 change based on the edge or the like, for example. The second predicted video signal 192 inputs the second predicted video signal 19 to the subtractor 101 and the adder 105. Further, the prediction unit 110 inputs the auxiliary information 20 necessary for predicting the second predicted video signal 19 from the first predicted video signal 17 to the entropy encoder 104. Further, an example of the characteristic components in the following description is described by an edge. However, the characteristic components applicable to the encoder described in the present embodiment are not limited to edges, and may be, for example, texture, contrast, and noise. As shown in Fig. 2, the prediction unit 1 1 includes a feature extraction unit 121, an auxiliary information generation unit 1 22, a feature separation unit 1 30, a feature prediction unit 14 1 and a signal synthesis unit 142. The feature separating unit 130 extracts the feature component from the first predicted video signal 17 from the motion estimation/motion compensation unit 〇7 to generate the first feature signal 21, and generates the first image signal from the first predicted video signal. The feature of the feature signal 21 removes the signal 22. That is, the feature separating unit 130 separates the first predicted video signal 17 into the first characteristic signal 2 1 and the feature removing signal 22. The first characteristic signal 2 1 is input to the auxiliary information generating unit 1 22 and the feature predicting unit 141, and the feature removing signal 22 is input to the signal synthesizing unit 142. As shown in Fig. 2, an example of the feature separating unit 138 includes a feature extracting unit 1 3 1 and a minus -11 - 200922338 calculator 132. The feature extracting unit 131 performs, for example, edge extraction processing or filtering processing on the first predicted video signal 17 from the motion estimation/motion compensation unit 107 to extract edge components, generate a first characteristic signal 2 1, and then input to the subtractor 132. The feature prediction unit 141 and the signal synthesizing unit 142. The feature extracting unit 131 is an edge detecting method performed by using a general differential operator in image processing. For example, a Sobel operator belonging to 1 differential or a Laplacian operator belonging to 2 differentials may be used. Wait. The subtractor 132 subtracts the first characteristic signal 2 1 from the feature extracting unit 13 1 from the first predicted video signal 17 from the motion estimation/motion compensating unit 107, and inputs the feature removing signal 22 to the signal synthesizing unit 1 42. Further, the feature separating unit 1 30 is not limited to the configuration shown in Fig. 2, and may be, for example, the configuration shown in Fig. 3A or Fig. 3B. As shown in Fig. 3A, a modified example of the feature separating unit 130 includes a smoothing filter 133 and a subtractor 134. The smoothing filter 1 3 3 is, for example, a smoothing filter commonly used in image processing such as a moving average filter, a weighted average filter, a median filter, or a Gaussian filter. The smoothing filter 133 removes the high frequency component from the first predicted video signal 17 from the motion estimation/motion compensation unit 107. 
In general, if the image is frequency-converted, since the edge system mainly contains high-frequency components, the feature removal signal 22 can be generated by removing the high-frequency component from the first predicted image signal 17. The smoothing filter 133 inputs the feature removal signal 22 to the subtractor 134 and the signal synthesizing unit 142 ° -12- 200922338. The subtractor 134 is the first predicted video signal from the motion estimation/motion compensation unit 1〇7. The feature removal signal 22 from the smoothing filter 133 is subtracted to generate the first characteristic signal 2 1, and is input to the auxiliary information generating unit 1 22 and the feature predicting unit 14 1 . As shown in Fig. 3B, another modification of the feature separating unit 1 30 includes a band dividing unit 135. The band division unit 135 performs frequency band division using frequency components such as wavelet transform, discrete cosine transform, or independent component analysis on the first predicted video signal 17 from the motion estimation/motion compensation unit 1 ,7. It is divided into high frequency components and low frequency components. As described above, when the video is frequency-converted, since the edge system mainly contains the high-frequency component, the band division unit 135 regards the high-frequency component of the first predicted video signal 17 as the first characteristic signal 21, and The low frequency components are used as feature removal signals 22 and are output separately. The first characteristic signal 2 1 is input to the auxiliary information generating unit 122 and the feature predicting unit 141, and the feature removing signal 22 is input to the signal synthesizing unit 142. Further, in the feature separating unit 130, an edge model in which an edge is expressed by a function may be used, and a portion highly correlated with the edge model may be extracted as an edge component. Further, the extraction of the characteristic components of the feature separating unit 130 can be carried out in the same manner as in the decoding device, and various methods can be employed. The feature extracting unit 1 1 1 extracts the feature component from the original video signal 1 以 to generate a second feature signal 'input to the auxiliary information generating unit 122. Similarly to the above-described feature separating unit 13A, the feature extracting unit 121 extracts the feature components by various methods such as filtering processing or band dividing processing. -13- 200922338 The auxiliary information generating unit 22 generates the first characteristic signal 21 from the feature separating unit 130 and the second characteristic signal 23' from the feature extracting unit 121, respectively, for generating the first characteristic signal from the top. The auxiliary information 20 used to predict the parameters required for the second characteristic signal 23 is 2 1 . The auxiliary information 20 is input to the feature prediction unit 1 41 and the entropy encoder 104. In addition, a detailed description of the auxiliary information 20 will be described later. The feature prediction unit 141 predicts the second feature signal 23 based on the first feature signal 2 1 from the feature separation unit 130 using the auxiliary information 20 from the auxiliary information generating unit 122 to generate the feature prediction signal 24, and inputs To signal synthesis unit 1 4 2 . 
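For illustration, the two separation strategies described above (edge extraction plus subtraction as in Fig. 2, and smoothing as in Fig. 3A) might look as follows in NumPy/SciPy. Using the Sobel gradient magnitude directly as the edge component image, and a Gaussian as the smoothing filter, are assumptions; the patent only names the candidate operators.

```python
import numpy as np
from scipy import ndimage

def separate_by_edge_extraction(image, strength=1.0):
    """Fig. 2 style: extract an edge component (first feature signal 21) with a
    Sobel operator, then subtract it to obtain the feature-removed signal 22."""
    image = image.astype(np.float64)
    gx = ndimage.sobel(image, axis=1)
    gy = ndimage.sobel(image, axis=0)
    edge = strength * np.hypot(gx, gy)   # assumed: gradient magnitude as the edge component
    return edge, image - edge

def separate_by_smoothing(image, sigma=1.5):
    """Fig. 3A style: a smoothing (here Gaussian) filter gives the feature-removed
    signal 22 directly; the difference is the edge component 21."""
    image = image.astype(np.float64)
    smoothed = ndimage.gaussian_filter(image, sigma=sigma)
    return image - smoothed, smoothed
```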
The signal synthesizing unit 142 combines the feature removal signal 22 from the feature separation unit 130 and the feature prediction signal 24 from the feature prediction unit 14 1 to generate a second predicted video signal 19, which is input to the subtractor 101 and added. 105. In addition, the feature signal extracted from the original video signal 1 or the first predicted video signal 17 for prediction or synthesis is not limited to one, and the plurality of characteristic signals may be extracted and predicted separately, and all the characteristic signals are extracted. Synthesize at a glance. Further, when the plurality of first characteristic signals 2 1 are extracted from the first predicted video signal 17 , it is not necessary to generate the auxiliary information 20 with respect to all of the extracted first characteristic signals 21 . In other words, the first characteristic signal 21 is not generated as the auxiliary information 20, and the extracted first characteristic signal 21 is synthesized as the feature prediction signal 24 to the feature removal signal 22. The coding control unit 150 performs control of the entire coding unit 1 including the control of the amount of feedback, the control of the quantization characteristic, the control of the motion estimation accuracy, and the feature prediction signal. Next, an example of the operation of the prediction unit 110 will be described with reference to Figs. 4 and 5A to J. Here, the processing of generating the second predicted video signal 19 shown in FIG. 5J from the original video signal 10 shown in FIG. 5A and the first predicted video signal 17 shown in FIG. 5C corresponding thereto is illustrated. Process. First, the feature separating unit 130 acquires the first predicted video signal 1 7 ' of Fig. 5C by the motion estimation/motion compensating unit 1 〇7. Next, the feature separating unit 130 extracts the first characteristic signal 2 1 shown in Fig. 5D from the first predicted video signal 17 of Fig. 5C acquired in step S201 (step S2〇2). Next, the feature separating unit 130 removes the first characteristic signal 21 of FIG. 5D acquired in step S202 from the first predicted video signal 17 of FIG. 5C acquired in step S201, and generates the first characteristic signal 21 of FIG. 5D obtained in step S202. The feature removal signal 22 (step S203). Here, the order of steps S02 and S203 is not limited to the above. For example, in the case of the feature separating unit 1 30 shown in FIG. 3A, the order of the steps S 2 0 2 and the step S 2 0 3 may be reversed; if the feature separating unit 130 shown in FIG. 3B, the step S202 And step S 2 0 3 can also be performed simultaneously. On the other hand, the feature extracting unit 1 1 1 acquires the original video signal 1 of Fig. 5A (step S204). Next, the feature extracting unit 121 extracts the second characteristic signal 23 shown in Fig. 5B from the original video signal 10 of Fig. 5A (step S205). Here, the order in which the steps S201 and S203, the step S204, and the step S205 are performed is not limited to the above, and may be mutually reversed or may be performed simultaneously. Next, the auxiliary information generating unit 122 generates the auxiliary information based on the first characteristic signal 21 of FIG. 5D and the second characteristic signal 23 of FIG. 5E extracted in step S20, which are extracted in step S202 from -15 to 200922338. Below 20° 'Detailed description for auxiliary information 2 0. 
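Before the auxiliary information 20 is described in detail, the flow of steps S201 through S208 in Fig. 4 (S207 and S208 appear further below) can be sketched as follows; the helper callables are assumed stand-ins for units 130, 121, 122 and 141.

```python
def second_prediction(first_pred, original, separate, extract_edge,
                      make_aux_info, predict_edge):
    """Sketch of prediction unit 110 (Fig. 4): returns the second predicted
    image signal 19 and the auxiliary information 20 to be entropy coded."""
    first_feature, feature_removed = separate(first_pred)    # S201 to S203 (unit 130)
    second_feature = extract_edge(original)                  # S204 to S205 (unit 121)
    aux_info = make_aux_info(first_feature, second_feature)  # S206 (unit 122)
    feature_pred = predict_edge(first_feature, aux_info)     # S207 (unit 141)
    return feature_removed + feature_pred, aux_info          # S208 (unit 142)
```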
In the present embodiment, the edge is represented by three pieces of information: shape, intensity, and amplitude. The auxiliary information 2 indicates that the top of the first feature signal 2 1 and the second feature signal 2 3 are three. The difference between at least one of the information. First, the generation of the auxiliary information 20 regarding the shape will be explained. The auxiliary information generating unit 1 2 2 thins the first characteristic signal 2 1 and the second characteristic signal 2 3 in order to generate the auxiliary information 2 关于 about the shape to extract the edge center line. Further, the thinning processing performed by the auxiliary information generating unit 1 22 on the first characteristic signal 2 1 and the second characteristic signal 23 may be a commonly used technique in image processing. The auxiliary information generating unit 122 thins the first characteristic signal 2 1 of Fig. 5D and the second characteristic signal 2 3 of Fig. 5B to obtain the edge center line shown in Fig. 5F. In Fig. 5F, the edge center line obtained from the first characteristic signal 2 1 of Fig. 5D is indicated by a solid line. The edge center line obtained from the second characteristic signal 23 of Fig. 5B is indicated by a broken line. The auxiliary information generating unit 1 22 generates a shape error of the center line of the two edges as the auxiliary information 20 for the shape. Further, the shape error can be expressed by a high-order parametric curve such as a chain code or a B-spline curve. Next, the generation of the auxiliary information 20 regarding the intensity and the amplitude will be described. As shown in FIG. 5G, the auxiliary information generating unit 122 extracts the complex vertical information about the intensity and the amplitude, and draws the two perpendicular points in the vertical direction as the start point and the end point, respectively, and pulls out the complex vertical lines, respectively. The edge intensity distribution of the vertical line-16-200922338 is obtained, and the complex intensity distribution is averaged to obtain one intensity distribution. When the auxiliary information generating unit 122 obtains the upper intensity distribution for the first characteristic signal 21 of Fig. 5D and the second characteristic signal 23 of Fig. 5B, the intensity distribution shown in Fig. 5A is obtained. In Fig. 5, the horizontal axis represents the relative position on the vertical line, and the vertical axis represents the edge intensity at the position on the vertical line. The intensity distribution obtained from the first characteristic signal 2 1 of Fig. 5D is a solid line. It is shown that the intensity distribution obtained from the second characteristic signal 23 of Fig. 5A is indicated by a broken line. In Fig. 5, the edge intensity at each position is expressed by the absolute 値 of the difference between the pixel 该 at the position and the pixel 値 at the start and end points of the above. The auxiliary information generating unit 1 22 generates a difference between the two intensity distributions with the auxiliary information 20 regarding the intensity and the amplitude. The auxiliary information generating unit 1 2 2 approximates the intensity distribution using a certain function. For example, in the case of the intensity distribution shown in Fig. 
5H, if the position on the perpendicular indicated by the horizontal axis is x and the edge intensity at that position is f(x), then the distribution can be expressed by the following equation.

[Equation 2]  f(x) = E exp( -(x - a)^2 / (2σ^2) )    (2)
Here, E represents the intensity of the edge, σ represents the width (dispersion) of the edge, and a represents the center coordinate of the edge. For each of the intensity distribution obtained from the first feature signal 21 and the intensity distribution obtained from the second feature signal 23, the auxiliary information generating unit 122 finds the intensity E and width σ that minimize the error with respect to the distribution expressed by equation (2), and generates the differences of these values as the auxiliary information 20 on intensity and width. The function used to approximate the intensity distribution is not limited to equation (2); the approximation may also be made with, for example, a parametric curve such as a B-spline curve drawn between the above start and end points, or with some other function. Moreover, a fixed function need not be used; a function suited to the approximation may be selected from among a plurality of candidate functions.
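Under the Gaussian reading of equation (2) given above, the intensity/width part of the auxiliary information could be computed roughly as follows: fit E and σ to the averaged perpendicular profiles of the two feature signals and keep the differences. The sampling of the profiles and the use of scipy.optimize.curve_fit are assumptions made for this sketch.

```python
import numpy as np
from scipy.optimize import curve_fit

def edge_model(x, E, a, sigma):
    """Equation (2): a Gaussian-shaped edge intensity profile."""
    return E * np.exp(-((x - a) ** 2) / (2.0 * sigma ** 2))

def fit_profile(profile):
    """Fit (E, a, sigma) to an averaged perpendicular intensity profile."""
    x = np.arange(len(profile), dtype=np.float64)
    p0 = (profile.max(), float(np.argmax(profile)), 1.0)  # crude initial guess
    (E, a, sigma), _ = curve_fit(edge_model, x, profile, p0=p0, maxfev=2000)
    return E, abs(sigma)

def intensity_width_aux_info(profile_ref, profile_orig):
    """Differences of fitted intensity and width between the reference-side edge
    (first feature signal 21) and the original-side edge (second feature signal 23);
    these differences form part of auxiliary information 20."""
    E1, s1 = fit_profile(np.asarray(profile_ref, dtype=np.float64))
    E2, s2 = fit_profile(np.asarray(profile_orig, dtype=np.float64))
    return E2 - E1, s2 - s1
```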
Further, the auxiliary information generating unit 1 22 may perform motion estimation/motion compensation processing using the first characteristic signal 2 1 and the second characteristic signal 23 to generate a motion vector as the auxiliary information 20. Next, the feature prediction unit 141 generates the feature prediction signal 24 shown in FIG. 51 using the auxiliary information 20 generated in step S206 and the first characteristic signal 21 of FIG. 5D extracted in step S202 (step S207). . Next, the signal synthesizing unit 142 combines the feature removal signal 22 of FIG. 5A generated in step S203 and the feature prediction signal 24 of FIG. 51 generated in step S207 to generate a second predicted video signal as shown in FIG. 1 9 (step S 2 0 8). As described above, the animation encoding apparatus according to the present embodiment predicts the second characteristic signal of the original video signal to generate the feature prediction signal by using the first characteristic signal and the auxiliary information extracted from the first predicted video signal. A feature removal signal is removed from the first predicted image signal from the first predicted image signal, and a feature prediction signal is synthesized to generate a predicted residual signal between the second predicted image signal and the original image signal, together with the auxiliary information. Encode. Therefore, according to the animation encoding apparatus -18-200922338 according to the embodiment, the weight of the characteristic signal can be further improved as compared with the case where the prediction residual between the first predicted video signal and the original video signal is encoded. Now, reduce the prediction residual. Further, in the animation encoding apparatus according to the present embodiment, since the second characteristic signal is not directly encoded, the first characteristic signal extracted from the first predicted video signal and the auxiliary information are used to predict the second feature. Signal, so you can reduce the amount of coding. Moreover, although the reproducibility of the feature signal is determined by the content of the auxiliary information, the amount of code generated by the auxiliary information can be controlled by the quantization parameter set by the coding control unit, etc., and therefore the characteristic signal The balance between reproducibility and the amount of code generated can be easily controlled. Further, the animation encoding apparatus according to the present embodiment can be realized by, for example, using a general-purpose computer device as a basic hardware. That is, the subtractor 101, the conversion/quantization unit 102, the inverse transform/inverse quantization unit 103, the entropy encoder 104, the adder 105, the motion estimation/motion compensation unit 107, the prediction unit 1 1〇, and the encoding control unit 150, This can be realized by executing the program by the processor mounted on the computer device. Moreover, the animation coding method according to the embodiment is not limited to the coding of the animation, and is a code for encoding the prediction residual signal between the input image signal and the predicted image signal by generating the predicted image signal based on the already encoded image. In this way, the encoding of still images can also be applied. Further, in the present embodiment, the motion estimation/motion compensation unit 1 〇 7 and the prediction unit 1 1 0 may be integrated into one. 
In this modification, the reference image is input from the frame memory unit 106 to the prediction unit 11 in units of cells. Prediction -19- 200922338 Part 110 is separated into each feature. The motion estimation/motion compensation is performed in the auxiliary information generating unit 122 using the separated components. Then, the motion vector 18 is transmitted as the auxiliary information 20. (Second Embodiment) As shown in Fig. 6, the decoding apparatus according to the second embodiment of the present invention includes a decoding unit 300 and a decoding control unit 306, and the decoding unit 300 includes an entropy decoder 301 and inverse conversion. / inverse quantization unit 302, adder 303, frame memory 704, motion compensation unit 305, and prediction unit 3 1 0. The decoding unit 300 is controlled by the decoding control unit 320, performs entropy decoding on the input encoded data 30, performs inverse quantization/inverse orthogonal conversion, and decodes the prediction residual signal, and adds a residual signal to the decoding prediction. The predicted video signal predicted by the decoded reference video signal is used to generate the decoded video signal 35. The entropy decoder 301 decodes the encoded data according to the predetermined data structure, and restores the quantized transform coefficient 3 1 , the auxiliary information 32 , the motion vector 3 3 , the quantization parameter, and the prediction mode information. The entropy decoder 301 inputs the quantized conversion coefficient 3 1 to the inverse conversion/inverse quantization unit 312, inputs the auxiliary information 32 to the prediction unit 310, and inputs the motion vector 33 to the motion compensation unit 305. The inverse conversion/inverse quantization unit 302 inversely quantizes the quantized conversion coefficient 31 from the entropy decoder 301 in accordance with the restored quantization parameter, and performs inverse conversion such as ID CT to decode the decoded residual. The difference signal 3 4 is input to the adder 3 0 3. Further, the inverse conversion performed by the inverse transform/inverse quantization unit 306 on the conversion coefficient 3 1 that has been quantized -20-200922338 is not limited to IDC Τ, and may be, for example, inverse wavelet transform or other inverse orthogonal transform. However, a reciprocal conversion of the conversion of the prediction residual signal on the encoding side is required. The adder 303 adds the decoded prediction residual signal 34 from the inverse transform/inverse quantization unit 302 and the second predicted video signal 38 from the prediction unit 3 1 0 to be described later to generate a decoded video signal 3 . 5. The generated decoded video signal 3 5 is output to the outside of the decoding device and input to the frame memory 3 04. The frame memory 3 04 temporarily holds the decoded video signal 3 5 from the adder 303 as the reference video signal 36. The reference video signal 36 is referred to by a motion compensation unit 305 which will be described later. In addition, a deblocking filter may be further provided in the front portion of the frame memory 306 to remove block distortion from the decoded image signal 35. The motion compensation unit 305 obtains the reference video signal 36 stored in the frame memory 307, and takes the field of the reference video signal 36 indicated by the motion vector 3 3 of the entropy decoder 301. The first predicted video signal 3 7 is input to the prediction unit 3 1 0. 
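On the decoder side, the reconstruction of a block from the restored coefficients and the second predicted image signal 38 reduces to the inverse of the encoder loop sketched earlier; the uniform inverse quantizer below is the same assumption as before.

```python
from scipy.fft import idctn

def decode_block(q_coeffs, qp_step, second_pred_block):
    """Decoder counterpart of the hybrid loop: inverse quantization and inverse
    DCT (unit 302), then addition of the second predicted image signal 38
    (adder 303) to obtain the decoded image block (signal 35)."""
    decoded_residual = idctn(q_coeffs * qp_step, norm="ortho")
    return second_pred_block + decoded_residual  # stored in frame memory 304 as reference signal 36
```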
The prediction unit 3 1 0 ' obtains the first predicted video signal 3 7 from the motion compensation unit 305 and the auxiliary information 3 2 from the entropy coder 3 0 1 to generate, for example, an intra-image spatial pixel based on an edge or the like. The second predicted video signal 38 whose reproducibility of the characteristic portion of the change is increased is input to the adder 3 0 3 by the second predicted video signal 38. In addition, an example of the characteristic components in the following description is described by an edge. However, the characteristic components applicable to the encoder described in this embodiment are not limited to edges, and may be, for example, texture, contrast - 21 - 200922338, and noise. . As shown in FIG. 7, the prediction unit 301 includes a feature separation unit 3 1 1 , a feature prediction unit 312, and a signal synthesis unit 313. Similarly to the feature separation unit 1300, the feature separation unit 3 1 1 extracts the feature component from the first predicted video signal 37 from the motion compensation unit 305 to generate the first feature signal 39, and generates the first feature signal 39. The feature removal signal 40 of the first characteristic signal 39 is removed from the predicted image signal. The first characteristic signal 3 9 is input to the feature prediction unit 312, and the feature removal signal 40 is input to the signal synthesizing unit 1 42. Further, the feature separating unit 31 may be configured in the same manner as the feature separating unit 130 shown in any of Figs. 2, 3A or 3B, or may have another configuration. The feature prediction unit 3 1 2 generates the feature prediction signal 41 based on the first characteristic signal 39 from the feature separating unit 31 1 using the auxiliary information 32 from the entropy decoder 310, and inputs it to the signal synthesizing unit 313. The signal synthesizing unit 313 combines the feature removal signal 40 from the feature separation unit 311 and the feature prediction signal 4 1 from the feature prediction unit 3 1 2 to generate a second predicted video signal 38, which is input to the adder 3 03. . The decoding control unit 306 performs control of the entire decoding unit 3〇0 including the decoding timing control. Next, the operation of the prediction unit 301 will be described using Fig. 8 . First, the feature separating unit 311 obtains the first predicted video signal 17 from the motion compensating unit 305 (step s 401). Next, the feature separating unit 31 1 is obtained from the step S401! In the predicted video signal 37, the first characteristic signal 39 is extracted (step S4〇2). Next, the feature separating unit 311 removes the number obtained in step S4〇2 from the first predicted video signal 37 of Fig. 40C acquired in step S401, -22-200922338! The feature signal 39 generates a feature removal signal 40 (step S403). Here, the order of steps S4〇2 and S403 is not limited to the above. For example, if the feature separating unit 311 is the same as the feature separating unit 13A shown in FIG. 3A, the order of the steps S402 and S403 may be reversed; if it is the same as the feature separating unit 13A shown in FIG. 3B. In the feature separating unit 311, steps S402 and S403 may be simultaneously performed. Next, the feature predicting unit 3 1 2 acquires the auxiliary information 32 from the entropy decoder 301 (step S404). 
Next, the feature prediction unit 312 generates the feature prediction signal 41 using the auxiliary information 32 acquired in step S404 and the first characteristic signal 39 extracted in step S402 (step S405). Next, the signal synthesis unit 313 combines the feature removal signal 40 generated in step S403 and the feature prediction signal 41 generated in step S405 to generate the second predicted video signal 38 (step S406).

As described above, in the video decoding apparatus according to the present embodiment, the feature prediction signal is generated by predicting the second characteristic signal of the original video signal from the first characteristic signal extracted from the first predicted video signal and the auxiliary information restored from the encoded data; the feature removal signal, obtained by removing the first characteristic signal from the first predicted video signal, is synthesized with the feature prediction signal to generate the second predicted video signal; and the prediction residual signal decoded from the encoded data is added to the second predicted video signal to generate the decoded video signal. Therefore, compared with the case where the decoded video signal is generated simply by adding the first predicted video signal and the prediction residual signal, the video decoding apparatus according to the present embodiment can further improve the reproducibility of the characteristic signal and reduce the prediction residual.

Further, the video decoding apparatus according to the present embodiment can be realized by using, for example, a general-purpose computer device as basic hardware. In other words, the entropy decoder 301, the inverse conversion/inverse quantization unit 302, the adder 303, the motion compensation unit 305, the prediction unit 310, and the decoding control unit 320 can be realized by causing a processor mounted on the computer device to execute a program. The apparatus according to each embodiment of the present invention may be realized by installing the program in the computer device in advance, or by storing the program in a storage medium such as a CD-ROM or distributing it through a network and installing it in the computer device as appropriate.

The present invention is not limited to the above-described embodiments, and the constituent elements may be modified and embodied in the implementation stage without departing from the gist of the invention. Various inventions may be formed by appropriately combining the constituent elements disclosed in the above embodiments; for example, some constituent elements may be deleted from all the constituent elements shown in the embodiments, and constituent elements described in different embodiments may be combined as appropriate.

[Brief Description of the Drawings]
[Fig. 1] Fig. 1 is a block diagram of the video encoding apparatus according to the first embodiment.
[Fig. 2] Fig. 2 is a block diagram of the prediction unit of Fig. 1.
[Fig. 3A] Fig. 3A is a block diagram showing a modification of the feature separation unit of Fig. 2.
[Fig. 3B] Fig. 3B is a block diagram showing another modification of the feature separation unit of Fig. 2.
[Fig. 4] Fig. 4 is a flowchart showing the operation of the prediction unit of Fig. 2.
[Fig. 5A] Fig. 5A is a diagram showing an example of the original video signal.
[Fig. 5B] Fig. 5B is a diagram showing an example of the second characteristic signal extracted from the original video signal of Fig. 5A.
[Fig. 5C] Fig. 5C is a diagram showing an example of the first predicted video signal after motion estimation/motion compensation of the original video signal of Fig. 5A.
[Fig. 5D] Fig. 5D is a diagram showing an example of the first characteristic signal extracted from the first predicted video signal of Fig. 5C.
[Fig. 5E] Fig. 5E is a diagram showing an example of the feature removal signal obtained by removing the first characteristic signal of Fig. 5D from the first predicted video signal of Fig. 5C.
[Fig. 5F] Fig. 5F is a diagram showing the edge center lines of the first characteristic signal of Fig. 5D and the second characteristic signal of Fig. 5B.
[Fig. 5G] Fig. 5G is a diagram showing an example of the method for determining the edge intensity distribution.
[Fig. 5H] Fig. 5H is a graph showing the intensity distributions of the first characteristic signal of Fig. 5D and the second characteristic signal of Fig. 5B.
[Fig. 5I] Fig. 5I is a diagram showing an example of the feature prediction signal generated using the first characteristic signal of Fig. 5D and the auxiliary information.
[Fig. 5J] Fig. 5J is a diagram showing an example of the second predicted video signal obtained by synthesizing the feature removal signal of Fig. 5E and the feature prediction signal of Fig. 5I.
[Fig. 6] Fig. 6 is a block diagram of the video decoding apparatus according to the second embodiment.
[Fig. 7] Fig. 7 is a block diagram of the prediction unit of Fig. 6.
[Fig. 8] Fig. 8 is a flowchart showing the operation of the prediction unit of Fig. 7.
[Main component symbol description]
10: Original video signal
11: Prediction residual signal
12: Conversion coefficient
13: Decoded prediction residual signal
14: Encoded data
15: Local decoded video signal
16: Reference video signal
17: 1st predicted video signal
18: Motion vector
19: 2nd predicted video signal
20: Auxiliary information
21: 1st characteristic signal
22: Feature removal signal
23: 2nd characteristic signal
24: Feature prediction signal
30: Encoded data
31: Conversion coefficient
32: Auxiliary information
33: Motion vector
34: Decoded prediction residual signal
35: Decoded video signal
36: Reference video signal
37: 1st predicted video signal
38: 2nd predicted video signal
39: 1st characteristic signal
40: Feature removal signal
41: Feature prediction signal
100: Encoding unit
101: Subtractor
102: Conversion/quantization unit
103: Inverse conversion/inverse quantization unit
104: Entropy encoder
105: Adder
106: Frame memory
107: Motion estimation/motion compensation unit
110: Prediction unit
121: Feature extraction unit
122: Auxiliary information generation unit
130: Feature separation unit
131: Feature extraction unit
132: Subtractor
133: Smoothing filter
134: Subtractor
135: Band division unit
141: Feature prediction unit
142: Signal synthesis unit
150: Encoding control unit
300: Decoding unit
301: Entropy decoder
302: Inverse conversion/inverse quantization unit
303: Adder
304: Frame memory
305: Motion compensation unit
310: Prediction unit
311: Feature separation unit
312: Feature prediction unit
313: Signal synthesis unit
320: Decoding control unit

Claims (1)

X. Scope of Patent Application

1. An image encoding device, characterized by comprising:
an extraction unit that extracts a first edge component image from an original image;
a separation unit that separates a reference image into a second edge component image and an edge-removed image;
an auxiliary information generation unit that generates auxiliary information for predicting the first edge component image from the second edge component image;
a prediction unit that predicts a third edge component image from the second edge component image using the auxiliary information;
a predicted image generation unit that synthesizes the edge-removed image and the third edge component image to generate a predicted image;
a prediction residual calculation unit that obtains a prediction residual between the original image and the predicted image; and
an encoding unit that encodes the prediction residual and the auxiliary information.

2. The image encoding device according to claim 1, wherein the auxiliary information includes information indicating at least one difference, among shape, intensity, and amplitude, between a first edge included in the first edge component image and a second edge included in the second edge component image.

3. The image encoding device according to claim 1, wherein the separation unit includes:
a second extraction unit that extracts the second edge component image from the reference image; and
a subtractor that subtracts the second edge component image from the reference image to generate the edge-removed image.

4. The image encoding device according to claim 1, wherein the separation unit includes:
a smoothing filter that extracts the edge-removed image from the reference image; and
a subtractor that subtracts the edge-removed image from the reference image to generate the second edge component image.

5. The image encoding device according to claim 1, wherein the separation unit separates the reference image into a high-frequency component and a low-frequency component, outputs the high-frequency component as the second edge component image, and outputs the low-frequency component as the edge-removed image.

6. An image decoding device, characterized by comprising:
a decoding unit that decodes input encoded data to obtain a prediction residual of a target image and auxiliary information for predicting a first edge component image of the target image;
a separation unit that separates a decoded reference image into a second edge component image of the reference image and an edge-removed image obtained by removing the second edge component image from the reference image;
a prediction unit that predicts the first edge component image from the second edge component image using the auxiliary information;
a synthesis unit that synthesizes the edge-removed image and the second edge component image to generate a predicted image; and
a decoded image generation unit that generates a decoded image of the target image using the prediction residual and the predicted image.

7. The image decoding device according to claim 6, wherein the auxiliary information includes information indicating at least one difference, among shape, intensity, and amplitude, between a first edge included in the first edge component image and a second edge included in the second edge component image.

8. An image encoding method, characterized by comprising:
a step of extracting a first edge component image from an original image;
a step of separating a reference image into a second edge component image and an edge-removed image;
a step of generating auxiliary information for predicting the first edge component image from the second edge component image;
a step of predicting a third edge component image from the second edge component image using the auxiliary information;
a step of synthesizing the edge-removed image and the third edge component image to generate a predicted image;
a step of obtaining a prediction residual between the original image and the predicted image; and
a step of encoding the prediction residual and the auxiliary information.

9. The image encoding method according to claim 8, wherein the auxiliary information includes information indicating at least one difference, among shape, intensity, and amplitude, between a first edge included in the first edge component image and a second edge included in the second edge component image.

10. An image decoding method, characterized by comprising:
a step of decoding input encoded data to obtain a prediction residual of a target image and auxiliary information for predicting a first edge component image of the target image;
a step of separating a decoded reference image into a second edge component image of the reference image and an edge-removed image obtained by removing the second edge component image from the reference image;
a step of predicting the first edge component image from the second edge component image using the auxiliary information;
a step of synthesizing the edge-removed image and the second edge component image to generate a predicted image; and
a decoded image generation step of generating a decoded image of the target image using the prediction residual and the predicted image.

11. The image decoding method according to claim 10, wherein the auxiliary information includes information indicating at least one difference, among shape, intensity, and amplitude, between a first edge included in the first edge component image and a second edge included in the second edge component image.
TW97135352A 2007-09-26 2008-09-15 Image encoding/decoding device and method TW200922338A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP2007249654A JP2010287917A (en) 2007-09-26 2007-09-26 Image encoding/decoding device and method

Publications (1)

Publication Number Publication Date
TW200922338A true TW200922338A (en) 2009-05-16

Family

ID=40511132

Family Applications (1)

Application Number Title Priority Date Filing Date
TW97135352A TW200922338A (en) 2007-09-26 2008-09-15 Image encoding/decoding device and method

Country Status (3)

Country Link
JP (1) JP2010287917A (en)
TW (1) TW200922338A (en)
WO (1) WO2009041243A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8537897B2 (en) 2009-08-13 2013-09-17 Samsung Electronics Co., Ltd. Method and apparatus for encoding/decoding motion vector

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9510018B2 (en) 2011-11-23 2016-11-29 Luca Rossato Signal analysis and generation of transient information
CN103780904B (en) * 2014-01-23 2016-11-09 西安电子科技大学 High spectrum image self-adaptive non-loss predictive coding system and method based on edge

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH04277982A (en) * 1991-03-06 1992-10-02 Matsushita Electric Ind Co Ltd High efficient coder for video signal
JP3169147B2 (en) * 1992-09-02 2001-05-21 日本電信電話株式会社 High-efficiency coding device for video data
US7421127B2 (en) * 2001-10-26 2008-09-02 Koninklijke Philips Electronics N.V. Spatial scalable compression scheme using spatial sharpness enhancement techniques

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8537897B2 (en) 2009-08-13 2013-09-17 Samsung Electronics Co., Ltd. Method and apparatus for encoding/decoding motion vector
TWI413417B (en) * 2009-08-13 2013-10-21 Samsung Electronics Co Ltd Method for decoding image
US8787463B2 (en) 2009-08-13 2014-07-22 Samsung Electronics Co., Ltd. Method and apparatus for encoding/decoding motion vector
US8792558B2 (en) 2009-08-13 2014-07-29 Samsung Electronics Co., Ltd. Method and apparatus for encoding/decoding motion vector
US8811488B2 (en) 2009-08-13 2014-08-19 Samsung Electronics Co., Ltd. Method and apparatus for encoding/decoding motion vector
US9544588B2 (en) 2009-08-13 2017-01-10 Samsung Electronics Co., Ltd. Method and apparatus for encoding/decoding motion vector
US9883186B2 (en) 2009-08-13 2018-01-30 Samsung Electronics Co., Ltd. Method and apparatus for encoding/decoding motion vector
US10110902B2 (en) 2009-08-13 2018-10-23 Samsung Electronics Co., Ltd. Method and apparatus for encoding/decoding motion vector

Also Published As

Publication number Publication date
WO2009041243A1 (en) 2009-04-02
JP2010287917A (en) 2010-12-24

Similar Documents

Publication Publication Date Title
KR101369746B1 (en) Method and apparatus for Video encoding and decoding using adaptive interpolation filter
KR101200865B1 (en) An video encoding/decoding method and apparatus
KR101362757B1 (en) Method and apparatus for image encoding and decoding using inter color compensation
KR101311402B1 (en) An video encoding/decoding method and apparatus
JP5203379B2 (en) Spatial convention guidance time prediction for video compression
JP5902814B2 (en) Video encoding method and apparatus, video decoding method and apparatus, and programs thereof
JP2008507190A (en) Motion compensation method
KR20090039720A (en) Methods and apparatus for adaptive reference filtering
JPWO2009050889A1 (en) Video decoding method and video encoding method
TW201008290A (en) Video image encoding method, video image decoding method, video image encoding apparatus, video image decoding apparatus, program and integrated circuit
CN1713730A (en) Method of and apparatus for estimating noise of input image, and method and recording media of eliminating noise
JP6042899B2 (en) Video encoding method and device, video decoding method and device, program and recording medium thereof
JP4494803B2 (en) Improved noise prediction method and apparatus based on motion compensation, and moving picture encoding method and apparatus using the same
KR20090098214A (en) Method and apparatus for video encoding and decoding
JP2008301336A (en) Image processing device, image encoding device and image decoding device
WO2009133845A1 (en) Video encoding/decoding device and method
TW200922338A (en) Image encoding/decoding device and method
JP2009089177A (en) Image processor and its method
JP6016488B2 (en) Video compression format conversion apparatus, video compression format conversion method, and program
JP2011130192A (en) Encoder and image converting apparatus
KR101711680B1 (en) Image coding with texture refinement using representative patches
JP2009182768A (en) Moving image coding method, apparatus and program, and computer readable recording medium
JP7310919B2 (en) Filter generation method, filter generation device and program
KR100751401B1 (en) Method for variable block size transform for transformer of encoder and the transformer using the same
JP4250553B2 (en) Image data processing method and apparatus