TW200529650A - Video coding method and apparatus thereof - Google Patents

Video coding method and apparatus thereof

Publication number
TW200529650A
TW200529650A TW93104201A TW93104201A TW200529650A TW 200529650 A TW200529650 A TW 200529650A TW 93104201 A TW93104201 A TW 93104201A TW 93104201 A TW93104201 A TW 93104201A TW 200529650 A TW200529650 A TW 200529650A
Authority
TW
Taiwan
Prior art keywords
input
patent application
scope
item
image coding
Prior art date
Application number
TW93104201A
Other languages
Chinese (zh)
Other versions
TWI241130B (en
Inventor
Ming-Chieh Chi
Mei-Juan Chen
Original Assignee
Leadtek Research Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Leadtek Research Inc
Priority to TW93104201A
Publication of TW200529650A
Application granted
Publication of TWI241130B

Abstract

A region-of-interest (ROI) video coding method and apparatus based on fuzzy logic control are provided for a video encoder. Given an image having a plurality of ROI regions and a plurality of non-ROI regions, the method first separates the ROI regions from the non-ROI regions. The ROI regions of the input image are then sent to a fuzzy logic controller, which performs fuzzy operations that enhance the quality of the ROI regions and thereby improve the ROI quality of the output image. The method and apparatus are particularly useful for videophones and videoconferencing.

Description

Technical Field

The present invention relates to a method and apparatus for improving video quality, and more particularly to a region-of-interest (ROI) video coding method and apparatus based on fuzzy logic control for a video encoder.

Prior Art

Demand for digital video communication applications such as videoconferencing and videophones has grown steadily. Because network transmission rates are limited, very-low-bit-rate video coding is an important technique for these applications: it reduces the data rate of a picture sequence without degrading its quality. Most implementations of the relevant coding standards treat every block as equally important. Different blocks within the same picture may be coded in different modes, but no block matters more than another. This conventional model is not suited to applications involving a region of interest (ROI) in a video sequence. In the H.263+ standard, the distortion weight parameter and the signal variation at the macroblock (MB) layer are adjusted to control the quality of different regions. Blocks belonging to the focus area are more important than blocks in the background or other non-essential areas.
Although this sacrifices quality in the background or non-focus areas, more bandwidth can be allocated to the area the user cares about, which is a good coding strategy for video sequences such as videoconferencing. Besides giving the ROI higher quality, part of the background information may also be ignored to increase coding speed. As in maximum bit transfer (MBT), the background is always coded at the coarsest quantization level. Conventionally, a region-based blurring algorithm is used to reduce the bit rate in very-low-bit-rate video coding. Another approach applies three fixed factors to each ROI MB and non-ROI MB, raising ROI quality while reducing the bits spent on the background, and thereby improves ROI quality substantially. The present invention adaptively improves ROI quality by means of fuzzy-logic rate control and is suitable for real-time videoconferencing.

Fuzzy logic was first proposed in 1965 by L. A. Zadeh, then working at Berkeley, and is modeled on three aspects of how people naturally arrive at solutions. First, different rules may be used to solve the same problem. Second, more than one rule may be applied to the same problem at once. Third, a certain degree of imprecision is accepted, which helps considerably in reaching an acceptable solution. The conventional rate-control algorithms used in standard test models such as TMN5 and TMN8 clearly fit these three points.
In each test model, a specific mathematical solution determines the quantization parameter of each MB, and some inaccuracy is accepted in estimating the bit rate of the next MB. Fuzzy logic control therefore appears well suited to rate control in video coding.

FIG. 1a is a block diagram of a conventional feedback control system 100. The controller in the figure decides what to do next according to a mathematical model of the process, or a fixed set of mathematical relationships.

FIG. 1b is a block diagram of a fuzzy logic control system 150. The fuzzy logic controller 150 takes as its operating guide a set of response rules drawn up by an experienced operator or system engineer. Referring to FIG. 1b, the quantizer 152 obtains data from the sensor 157 and converts it into a format usable by the fuzzy logic controller 153. The fuzzy logic controller 153 then performs calculations to determine the fuzzy situation of the given data.

In summary, as the information highway develops under limited transmission rates, a method for improving video is needed. Region-of-interest (ROI) methods that improve video quality have recently appeared, but current ROI solutions still face performance obstacles. A method or algorithm that yields high-quality video is therefore much needed.

Summary of the Invention

In view of this, one object of the present invention is to provide a method and apparatus that can be used to address image-quality requirements in applications such as videophones and videoconferencing.
To achieve the above and other objects, the present invention provides a new method and apparatus based on region of interest (ROI) and fuzzy logic control, described in detail below by way of embodiments.

First, the method separates a plurality of ROI regions of an image from a plurality of non-ROI regions. Next, an input from the ROI regions is sent to a fuzzy logic control, which is used to improve the quality of the ROI regions and the overall quality of the output image.

In a preferred embodiment of the invention, the input from the ROI regions is computed from a first control input and a second control input from the ROI regions. The first and second control inputs respectively comprise a first variance, taken from the current i-th macroblock, and a variance difference. The variance difference is obtained by subtracting from the first variance a second variance, that of the preceding (i-1)-th macroblock, and dividing the result by the first variance. The i-th and (i-1)-th macroblocks denote positions in the sequence of macroblocks within one of the ROI regions, the (i-1)-th macroblock immediately preceding the i-th.

In another preferred embodiment, the fuzzy logic control includes a rule for converting the control inputs into fuzzy predicates.

In another preferred embodiment, the fuzzy logic control includes a control function that computes a linguistic membership function used to determine the fuzzy situation of the main control input.
The control function uses a center-of-area (COA) method to determine the linguistic membership function.

In another embodiment of the invention, the fuzzy logic control includes a plurality of lookup tables used to set a decision level and generate a weighting factor, thereby emphasizing the quality of one of the ROI regions.

In yet another embodiment, the lookup tables include a plurality of scaled lookup tables that give one of the ROI regions a priority-like quality. The scaled lookup tables are formed using a one-fixed and one-various membership function.

In summary, the present invention provides fuzzy-controlled ROI video coding, which adaptively adjusts the output quality of the video: the method readily improves ROI quality, holds a fixed bit rate to avoid buffer overflow, and delivers better quality at a lower bit rate than the prior art.
Multiple-ROI video coding can substantially improve the output quality of each ROI without complex computation.

To make the above and other objects, features, and advantages of the invention more readily understood, preferred embodiments are described in detail below with reference to the accompanying drawings.

Embodiments

Preferred embodiments of the invention are described in detail below with reference to the accompanying drawings. Although the invention is described here by way of these embodiments, it is not limited to them; rather, they are provided so that those skilled in the art can appreciate the scope and details of the invention. In the following, like reference numerals denote like elements.

First, fuzzy-controlled ROI video coding consists of two parts: (1) the region of interest and (2) fuzzy control. Referring to FIG. 2, the region-of-interest part includes a segmentation unit 302. A fuzzy logic controller 320 includes a differential-variance calculation unit 303, a quantizer 304, fuzzy subsets 305, a fuzzy controller 306, a fuzzy variance operator 307, a weighted defuzzifier 308, and a fuzzy lookup table 309. The overall coding system further includes an H.263+ video encoder and a virtual buffer.

Referring to FIG. 2, the fuzzy logic controller 320 improves ROI quality according to a variance σi 332 and a variance difference Δσi 334. After a frame 301 is input, the segmentation unit 302, using for example appearance detection and motion detection, segments frame 301 into ROI 330 and non-ROI 331.
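As a rough illustration of the segmentation step, the sketch below partitions the macroblocks of a frame into ROI and non-ROI lists from a binary pixel mask. The mask source (for example a face detector), the function name `split_macroblocks`, the 16-pixel macroblock size, and the rule that a macroblock counts as ROI when any of its pixels is masked are all assumptions made for illustration; the description does not specify them.

```python
MB_SIZE = 16  # assumed macroblock edge length in pixels


def split_macroblocks(mask, mb_size=MB_SIZE):
    """Partition macroblock coordinates into ROI and non-ROI lists.

    mask is a list of rows of 0/1 values, one per pixel, whose width and
    height are multiples of mb_size. A macroblock is treated as ROI when
    any pixel inside it is set (a hypothetical rule, not from the patent).
    Coordinates are returned as (mb_row, mb_col) in raster order.
    """
    height, width = len(mask), len(mask[0])
    roi, non_roi = [], []
    for top in range(0, height, mb_size):
        for left in range(0, width, mb_size):
            hit = any(
                mask[y][x]
                for y in range(top, top + mb_size)
                for x in range(left, left + mb_size)
            )
            target = roi if hit else non_roi
            target.append((top // mb_size, left // mb_size))
    return roi, non_roi
```

Only the ROI list would then feed the fuzzy logic controller, while the non-ROI macroblocks go straight to ordinary rate control.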
Macroblocks in the non-ROI 331 are sent directly to a QP selector 310 under bit-rate control, without any parameter adjustment. The variance difference Δσi 334 of the i-th macroblock in ROI 330 is computed from σi 332 and σi' 333, the variances of the current i-th macroblock and of the preceding macroblock, respectively. The variance difference Δσi 334 and the variance σi 332 of the current macroblock are the two inputs to the fuzzy logic method, and ωσi 335 is a fuzzy output to be used as the input weighting factor.

FIGS. 3 and 4 show graphs representing σi 332 and Δσi 334, respectively. Referring to FIGS. 3 and 4, the linguistic-set symbols at 351 and 401, 352 and 402, 353 and 403, 354 and 404, and 355 and 405 denote, respectively, "Large Positive", "Small Positive", "Zero", "Small Negative", and "Large Negative". Except that all σi 332 are positive and that, statistically, the variance σi of most macroblocks lies at the center of ZE 353, the symbols in FIG. 3 are identical to those in FIG. 4. FIG. 4 shows the subsets of the variance difference, defined as Δσi = (σi - σi')/σi. Statistically, most Δσi 334 fall within the interval [-10, +10]. Next, the quantizer 304 feeds σi 332 and Δσi 334 into the fuzzy subsets 305 and converts them into degrees of membership in the five linguistic sets 351 to 355. From the quantized σi 332 and Δσi 334, the fuzzy controller 306 then computes the linguistic membership function and uses the center-of-area (COA) method to determine the fuzzy situation.
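The two fuzzy inputs and the fuzzification step can be sketched as follows. Only the formula Δσi = (σi - σi')/σi and the [-10, +10] range come from the description. The triangular membership shape, the evenly spaced set centers, the zero-variance convention, and the discrete form of center of area (a membership-weighted average of one representative output per set) are assumptions chosen for illustration, since the actual curves of FIGS. 3 and 4 are not reproduced in the text.

```python
def variance(pixels):
    """Population variance of a flat list of pixel values."""
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)


def variance_difference(sigma_cur, sigma_prev):
    """Delta sigma_i = (sigma_i - sigma_i') / sigma_i."""
    if sigma_cur == 0:
        return 0.0  # assumed convention for a perfectly flat macroblock
    return (sigma_cur - sigma_prev) / sigma_cur


# Hypothetical set centers spread evenly over [-10, +10].
CENTERS = {"LN": -10.0, "SN": -5.0, "ZE": 0.0, "SP": 5.0, "LP": 10.0}
HALF_WIDTH = 5.0  # each triangle reaches zero at the neighbouring center


def fuzzify(x):
    """Degree of membership of x in each linguistic set (triangular)."""
    x = max(-10.0, min(10.0, x))  # clip to the stated statistical range
    return {
        name: max(0.0, 1.0 - abs(x - center) / HALF_WIDTH)
        for name, center in CENTERS.items()
    }


def center_of_area(degrees, outputs):
    """Discrete center of area: membership-weighted mean of set outputs."""
    total = sum(degrees.values())
    if total == 0:
        raise ValueError("input activates no linguistic set")
    return sum(degrees[name] * outputs[name] for name in degrees) / total
```

For instance, `fuzzify(2.5)` activates ZE and SP equally under these assumed triangles, so the defuzzified value lands midway between the outputs assigned to those two sets.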
After the calculation is completed, each σi/Δσi pair has a corresponding main control input value. The decision table is stored in memory in the form of the fuzzy lookup table 309 shown in FIG. 5. Based on the fuzzy lookup table 309, the weighted defuzzifier 308 considers the two states of σi and Δσi and produces the weighting factor ωσi 335 to emphasize the quality of the macroblocks in ROI 330. In another embodiment of the invention, to give different ROIs 330 different priorities, a set of different output fuzzy tables can be scaled from the original output fuzzy table. FIG. 6 shows an example of one-fixed and one-various membership functions, used to apply and distinguish the priority of each of the different ROIs 330. The weighting factor is computed with fuzzy rules for each given macroblock in the H.263+ video encoder 311.

In experimental results for one embodiment of the invention, the embodiment can be verified to outperform other existing conventional algorithms. The experiment tests three sequences: Carphone, Claire, and Foreman. To define the ROI in a frame, face detection is used to select the ROI automatically. Four methods are compared on the test sequences: coding frames without ROI (WR), coding the ROI multiplied by a weighting factor α (WA), coding the ROI with three factors (TF), and the present invention (fuzzy). All four methods are set to similar average bit rates. For a target bit rate of 64 kbit/s, the QP for I-frames and P-frames is set to 5 and 3; for a target bit rate of 32 kbit/s, the QP is set to 15 and 13. In WA, the weighting factor is set to 450. In TF, the three factors are set to 450, 2, and 10. To compare with the other two methods under similar weighting, ZE13 is set to 450, and LP1 to LN25 are set to 350 to 550.
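A 25-entry decision table of this kind can be sketched as a 5 × 5 lookup indexed by the linguistic labels of σi and Δσi. The values of FIG. 5 are not available in the text, so the table below is purely a placeholder: it assumes raster numbering of entries 1 to 25 and a linear ramp from 350 to 550, which happens to put 450 at entry 13. The row and column assignment, the function names, and the use of a multiplicative priority factor for the scaled tables are likewise assumptions.

```python
LABELS = ["LP", "SP", "ZE", "SN", "LN"]  # assumed ordering of both axes


def build_table(low=350.0, high=550.0):
    """Placeholder 5x5 weighting table: linear ramp over entries 1..25.

    This is NOT the table of FIG. 5; it only matches the stated range
    (350 to 550) and the stated middle value (entry 13 = 450).
    """
    n = len(LABELS)
    last = n * n - 1
    return [
        [low + (high - low) * (row * n + col) / last for col in range(n)]
        for row in range(n)
    ]


def lookup_weight(table, sigma_label, delta_label):
    """Weighting factor for one (sigma, delta sigma) pair of labels."""
    return table[LABELS.index(sigma_label)][LABELS.index(delta_label)]


def scale_table(table, priority):
    """Derive a per-ROI table by scaling the original (assumed rule)."""
    return [[weight * priority for weight in row] for row in table]
```

With one scaled table per ROI, each region would draw its weighting factor from its own table, which is one plausible way to realize the priority-like behavior described above.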
As FIGS. 7 to 10 show, at similar bit rates the embodiment of the invention achieves a better ROI PSNR than the other methods. Because WA and TF both improve ROI quality with fixed parameters, neither can adjust its weighting factor when macroblock complexity varies widely. In summary, the embodiment obtains better ROI quality, and frame skipping occurs less often even at lower bit rates.

The invention is applicable to any video processing task, particularly real-time video coding. It can therefore readily improve ROI quality while holding the bit rate so as to avoid buffer overflow. Compared with the prior art, it readily improves picture quality at a lower bit rate. In addition, multiple-ROI video coding can substantially improve the quality of each ROI without complex computation.

Although the present invention has been disclosed above by way of preferred embodiments, they are not intended to limit it. Anyone skilled in the art may make various changes and refinements without departing from the spirit and scope of the invention, whose scope of protection is therefore defined by the appended claims.

Brief Description of the Drawings

FIG. 1a is a block diagram of a conventional feedback control algorithm.
FIG. 1b is a block diagram of a conventional fuzzy logic control algorithm.
FIG. 2 is a block diagram of ROI video coding performed by a fuzzy logic control algorithm according to an embodiment of the invention.
FIG. 3 is an example of the subsets of the variance σi in the fuzzy logic control apparatus shown in FIG. 2.
FIG. 4 is an example of the subsets of the variance difference Δσi in the fuzzy logic control apparatus shown in FIG. 2.
FIG. 5 is an example of the fuzzy output lookup table in the fuzzy logic control apparatus shown in FIG. 2.
FIG. 6 is an example of one-fixed and one-various membership functions.
FIG. 7 is a comparison table of the methods for the Carphone sequence, 100 frames at 64 kbit/s.
FIG. 8 is a comparison table of the methods for the Claire sequence, 150 frames at 32 kbit/s.
FIG. 9 is a comparison table of the methods for the Foreman sequence, 150 frames at 64 kbit/s.
FIG. 10 is a multiple-ROI comparison table for the News sequence, 150 frames at 64 kbit/s.

Description of Reference Numerals

100: feedback control system
101: set point
102: controller
103: process
104: mathematical model of the system
105: sensor
150: fuzzy logic control system
151: set point
152: quantizer
153: fuzzy logic controller
154: defuzzifier
155: human-based rule set
156: process
157: sensor
301: frame input
302: segmentation unit
303: differential-variance calculation unit
304: quantizer
305: fuzzy subsets
306: fuzzy controller
307: fuzzy variance operator
308: weighted defuzzifier
309: fuzzy lookup table
310: weighted QP selector
311: H.263+ video encoder
312: virtual buffer


Claims (1)

200529650 六、申請專利範圍 1. 一種影像編碼方法,適用於視訊電話與視訊會 議,包括: 將一影像的複數個目標區與複數個非目標區分離; 以及 將來自該些目標區的一輸入,傳送至一模糊邏輯控 制,其中該模糊邏輯控制是用來改善該些目標區之一品 質,以及改善一輸出影像整體品質。 2. 如申請專利範圍第1項所述之影像編碼方法,其 中來自該些目標區的該輸入是從來自該些目標區的一第 一控制輸入與一第二控制輸入所計算而得。 3. 如申請專利範圍第2項所述之影像編碼方法,其 中該第一控制輸入與該第二控制輸入分別包括一來自一 目前第i個大方塊的第一變異數與一變異數差,該變異數 差是由將該第一變異數減去前一第i_l個大方塊的一第二 變異數,並且再除以該第一變異數所得,該第i個大方塊 與該第i-Ι個大方塊代表在該些目標區的其中之一之内的 該大方塊的一序列,而且該第i - 1個大方塊是該第i個大 方塊的一前一大方塊。 4. 如申請專利範圍第1項所述之影像編碼方法,其 中該模糊邏輯控制包括一用來將來自該些目標區的該輸 入,轉換成複數個模糊判定的法則。 5. 如申請專利範圍第1項所述之影像編碼方法,其 中該模糊邏輯控制包括一控制功能,藉以計算一用來決 定一模糊狀態的語言從屬功能。200529650 6. Scope of patent application 1. An image coding method suitable for video calls and video conferences, including: separating a plurality of target areas of an image from a plurality of non-target areas; and an input from the target areas, Send to a fuzzy logic control, where the fuzzy logic control is used to improve the quality of one of the target areas and to improve the overall quality of an output image. 2. The image coding method described in item 1 of the scope of patent application, wherein the input from the target areas is calculated from a first control input and a second control input from the target areas. 3. The image coding method as described in item 2 of the scope of patent application, wherein the first control input and the second control input respectively include a first variation number and a variation number difference from a current i-th large block, The variation number difference is obtained by subtracting a first variation number from a second variation number of a previous i_l large block, and then dividing the first variation number by the first variation number. The I large block represents a sequence of the large blocks within one of the target areas, and the i-1 large block is a previous large block of the i th large block. 4. 
The image coding method described in item 1 of the patent application scope, wherein the fuzzy logic control includes a rule for converting the input from the target areas into a plurality of fuzzy decisions. 5. The image coding method as described in item 1 of the scope of patent application, wherein the fuzzy logic control includes a control function to calculate a language subordinate function for determining a fuzzy state. 12429TWF.PTD 第18頁 200529650 六、 申請專利範圍 6. 如 申請專 利範 圍 第5項所述之影像編碼方法 ,其 中 該 控制 功 能包括 一中 央 面積區(C 0 A )方法’以決定該語 言 從 屬功 能 〇 7. 如 申請專 利範 圍 第1項所述之影像編碼方法 ,其 中 該 模糊 邏 輯控制 包括 用 來設定一決策位準與產生一 加 權 因 數的 複 數個探 查表 藉以加重該些目標區的其中 之 一 的 品質 〇 8. 如 申請專 利範 圍 第7項所述之影像編碼方法 ,其 中 該 些探 查 表包括 複數 個 縮放探查表,用來對該些目 標 區 的 其中 之 一提供 一類 似 優先權品質。 9. 如 申請專 利範 圍 第8項所述之影像編碼方法 ,其 中 該 些縮 放 探查表 是使 用 一one-fixed 與one-various 從 屬 功 能成 形 〇 10. 如申請專利範圍第1項所述之影像編碼方法 其 中 該模 糊 邏輯控 制更 包 括· 將來自該些巨 1標區的一輸入,轉換成複數個模糊判 定 使用- -用來決定- •模糊狀態的每一該些模糊判定的 控 制 功能 1 計算一 語言 從 屬功能,以及 從用來設定- -決策位準與產生一加權因數的該模糊 狀 態 ,產 生 複數個 探查 表 ,藉以加重該些目標區的其 中 之 一 的品 質 〇 1 1 . 如申請專利範圍第1 0項所述之影像編碼方法, 其 中 來自 該 些目標 區的 該 輸入是從來自該些目標區的12429TWF.PTD Page 18 200529650 6. Patent application scope 6. The image coding method described in item 5 of the patent application scope, wherein the control function includes a central area (C 0 A) method to determine the language subordinate function 〇7. The image coding method described in item 1 of the patent application scope, wherein the fuzzy logic control includes a plurality of probe tables for setting a decision level and generating a weighting factor to aggravate one of the target areas The quality of the image coding method described in item 7 of the scope of the patent application, wherein the look-up tables include a plurality of zoom look-up tables to provide a similar priority quality to one of the target areas. 9. The image coding method described in item 8 of the scope of patent application, wherein the zoom lookup tables are formed using a one-fixed and one-various subordinate function. 10. 
The image coding method described in item 1 of the scope of patent application, wherein the fuzzy logic control further includes: converting an input from the target areas into a plurality of fuzzy decisions; using a control function to determine a linguistic membership function of each of the fuzzy decisions in a fuzzy state; and generating, from the fuzzy state, a plurality of lookup tables used to set a decision level and to generate a weighting factor that emphasizes the quality of one of the target areas.
11. The image coding method described in item 10 of the scope of patent application, wherein the input from the target areas is calculated from a first control input and a second control input of the target areas.
12. The image coding method described in item 11 of the scope of patent application, wherein the first control input and the second control input respectively include a first variance from a current i-th macro-block and a variance difference, the variance difference being obtained by subtracting a second variance of the preceding (i-1)-th macro-block from the first variance and then dividing the result by the first variance, wherein the i-th macro-block and the (i-1)-th macro-block represent a sequence of the macro-blocks within one of the target areas, and the (i-1)-th macro-block is the macro-block immediately preceding the i-th macro-block.
13. The image coding method described in item 10 of the scope of patent application, wherein the control function uses a center-of-area (COA) method to determine the linguistic membership function.
14. The image coding method described in item 10 of the scope of patent application, wherein the lookup tables include a plurality of scaling lookup tables used to provide a priority-like quality to one of the target areas.
15. The image coding method described in item 14 of the scope of patent application, wherein the scaling lookup tables are formed using a one-fixed and one-various membership function.
16. An image coding device, suitable for video telephony and video conferencing, including: an encoder having an input terminal and an output terminal, wherein the input terminal of the encoder is electrically coupled to an input frame; a segmentation device having an input terminal, a first output terminal, and a second output terminal, wherein the input terminal of the segmentation device is electrically coupled to the input frame; and a fuzzy logic control device having an input terminal and an output terminal, wherein the input terminal of the fuzzy logic control device is electrically coupled to the first output terminal of the segmentation device, and the output terminal of the fuzzy logic control device is electrically coupled to the input terminal of the encoder.
17. The image coding device described in item 16 of the scope of patent application, wherein the fuzzy logic control device further includes: a quantizer having an input terminal and an output terminal, wherein the input terminal of the quantizer is electrically coupled to the first output terminal of the segmentation device, and the quantizer converts a signal from the first output terminal of the segmentation device into a fuzzy decision; a first controller having an input terminal and an output terminal, wherein the input terminal of the first controller is electrically coupled to the output terminal of the quantizer, and the first controller converts the fuzzy decision into a fuzzy state; and a second controller having an input terminal and an output terminal, wherein the input terminal and the output terminal of the second controller are electrically coupled to the output terminal of the first controller and the input terminal of the encoder, respectively, and the second controller converts the logic state into an output of the fuzzy logic control device.
18. The image coding device described in item 17 of the scope of patent application, further including a differentiating device having an input terminal and an output terminal, wherein the input terminal and the output terminal of the differentiating device are electrically coupled to the first output terminal of the segmentation device and the input terminal of the quantizer, respectively.
19. The image coding device described in item 18 of the scope of patent application, wherein the input terminal of the encoder is electrically coupled to the first output terminal of the segmentation device.
20. The image coding device described in item 19 of the scope of patent application, further including a buffer having an input terminal and an output terminal, wherein the input terminal and the output terminal of the buffer are electrically coupled to the output terminal of the encoder and the first output terminal of the segmentation device, respectively.
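Items 10 to 13 name the two control inputs (the variance of the current macro-block and its normalized variance difference) and a center-of-area (COA) step, but they do not give the membership functions, the rule base, or any numeric ranges. The following is a minimal sketch under stated assumptions, not the patented implementation: the ramp input sets, the three triangular output sets, the four rules, the scale `v_max`, and every function name are hypothetical and chosen only to show how the pieces could fit together.

```python
def variance(pixels):
    """Population variance of a flat list of pixel values."""
    n = len(pixels)
    mean = sum(pixels) / n
    return sum((p - mean) ** 2 for p in pixels) / n


def control_inputs(curr_mb, prev_mb):
    """Item 12: variance of macro-block i and its variance difference.

    difference = (var_i - var_{i-1}) / var_i
    """
    var_i = variance(curr_mb)
    var_prev = variance(prev_mb)
    # Returning 0.0 for a flat block (var_i == 0) is an assumption.
    diff = (var_i - var_prev) / var_i if var_i else 0.0
    return var_i, diff


def clamp01(x):
    """Limit x to the range [0, 1]."""
    return max(0.0, min(1.0, x))


def triangle(z, left, peak, right):
    """Triangular membership function; requires left < peak < right."""
    return max(0.0, min((z - left) / (peak - left), (right - z) / (right - peak)))


# Hypothetical output sets for the weighting factor on the range [0, 1].
OUTPUT_SETS = {
    "small": lambda z: triangle(z, -0.5, 0.0, 0.5),
    "medium": lambda z: triangle(z, 0.0, 0.5, 1.0),
    "large": lambda z: triangle(z, 0.5, 1.0, 1.5),
}


def weighting_factor(v, d, steps=100):
    """Map normalized inputs v, d in [0, 1] to a weight by COA defuzzification."""
    v, d = clamp01(v), clamp01(d)
    low_v, high_v = 1.0 - v, v  # ramp membership functions (assumed)
    low_d, high_d = 1.0 - d, d
    # Hypothetical rule base: (firing strength, output set name).
    rules = [
        (min(low_v, low_d), "small"),
        (min(low_v, high_d), "medium"),
        (min(high_v, low_d), "medium"),
        (min(high_v, high_d), "large"),
    ]
    num = den = 0.0
    for k in range(steps + 1):
        z = k / steps
        # Clip each output set by its rule strength, then take the maximum.
        mu = max(min(strength, OUTPUT_SETS[name](z)) for strength, name in rules)
        num += mu * z
        den += mu
    return num / den  # center of the aggregated area


def roi_weight(curr_mb, prev_mb, v_max=4096.0):
    """Weight for the current macro-block; v_max is a hypothetical scale."""
    var_i, diff = control_inputs(curr_mb, prev_mb)
    return weighting_factor(var_i / v_max, abs(diff))
```

Calling `roi_weight(curr_mb, prev_mb)` on two consecutive macro-blocks of a target area returns a value in [0, 1]. Under these assumed sets, a block with high variance and a large variance change lands near the top of that range. An encoder could map that value to a quantizer adjustment, although the items above leave the mapping to the lookup tables.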
TW93104201A 2004-02-20 2004-02-20 Video coding method and apparatus thereof TWI241130B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW93104201A TWI241130B (en) 2004-02-20 2004-02-20 Video coding method and apparatus thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW93104201A TWI241130B (en) 2004-02-20 2004-02-20 Video coding method and apparatus thereof

Publications (2)

Publication Number Publication Date
TW200529650A (en) 2005-09-01
TWI241130B (en) 2005-10-01

Family

ID=37013064

Family Applications (1)

Application Number Title Priority Date Filing Date
TW93104201A TWI241130B (en) 2004-02-20 2004-02-20 Video coding method and apparatus thereof

Country Status (1)

Country Link
TW (1) TWI241130B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102263943A (en) * 2010-05-25 2011-11-30 财团法人工业技术研究院 Video bit rate control device and method
CN102263943B (en) * 2010-05-25 2014-06-04 财团法人工业技术研究院 Video bit rate control device and method

Also Published As

Publication number Publication date
TWI241130B (en) 2005-10-01

Similar Documents

Publication Publication Date Title
US11089305B2 (en) Video frame coding method during scene change, terminal and storage medium
CN101389026B (en) Image coding apparatus and image coding method
DE60003070T2 (en) ADAPTIVE MOTION VECTOR FIELD CODING
CN100574427C (en) The control method of video code bit rate
CN110876060B (en) Code rate adjusting method and device in coding process
CN104270649B (en) Image coding device and video encoding method
CN110620924B (en) Method and device for processing coded data, computer equipment and storage medium
CN113573140B (en) Code rate self-adaptive decision-making method supporting face detection and real-time super-resolution
CN104322065A (en) Terminal and video image compression method
WO2009121234A1 (en) A video compression code rate control method
CN106961603A (en) Intracoded frame code rate allocation method and device
CN110225340A (en) A kind of control method and device of Video coding calculate equipment and storage medium
CN110545418B (en) Self-adaptive video coding method based on scene
KR19980074651A (en) Rate control device for MPEG video signal using fuzzy control
US20050140781A1 (en) Video coding method and apparatus thereof
Gao et al. Rate-distortion modeling for bit rate constrained point cloud compression
Ngan et al. Improved single-video-object rate control for MPEG-4
TW200529650A (en) Video coding method and apparatus thereof
Chi et al. Region-of-interest video coding based on rate and distortion variations for H. 263+
CN110740324B (en) Coding control method and related device
JP4183432B2 (en) Image data encoding method
CN101389012A (en) Method and device for rate distortion rate control
JP2004040811A (en) Method and apparatus for controlling amount of dct computation performed to encode motion image
CN108924555B (en) Code rate control bit distribution method suitable for video slice
Chi et al. Region-of-interest video coding by fuzzy control for H. 263+ standard

Legal Events

Date Code Title Description
MM4A Annulment or lapse of patent due to non-payment of fees