TW400502B

TW400502B - A networking device and control method for dynamically planning the voice and video channels in a network

Info

Publication number: TW400502B
Application number: TW87112113A
Authority: TW
Inventors: Wen-Shin Yang; Ying-Yu Shiau; Hua-Sheng Shiu; Jen-Jia Shiu; Ren-Jie Jeng
Original assignee: Ind Tech Res Inst
Priority date: 1998-07-24
Filing date: 1998-07-24
Publication date: 2000-08-01

Abstract

This invention relates to a network device and control method that can dynamically plan the video and voice channels in a network. The data receiver generates a short-term network bandwidth parameter (ABR) according to the received video data packets and voice data packets. The data transmitter includes a video data encoding unit and a voice data encoding unit, which correspond respectively to a video encoding parameter and a voice encoding parameter, such as the video encoding bit rate (VEBR) or the minimum bandwidth required for voice encoding (LSBR). The network device also includes an adjusting unit, which dynamically adjusts the data flow of the video data packets transmitted by the data transmitter according to the ABR, the video encoding parameter, and the voice encoding parameter, so as to satisfy the lowest transmission requirement of the voice data packets and thereby plan the video and voice channels.

Description

Printed by the Employees' Consumer Cooperative of the Central Bureau of Standards, Ministry of Economic Affairs. A7 B7. V. Description of the Invention

The present invention relates to a network device and a network control method, and in particular to a network device and control method capable of dynamically planning video and audio channels on the Internet. In an Internet video telephone system, for example, the video and speech data of a video call can be transmitted over the bandwidth-limited Internet, the amount of transmitted data can be adjusted dynamically by monitoring the degree of congestion of the transmission network, and the synchronization requirements of the video and speech data can be taken into account.

Many types of applications currently transmit video and speech data over the Internet, such as video telephony, audio-visual e-mail, and video conferencing. Because the Internet is a shared network rather than a dedicated one, how to carry multimedia transmission that requires real-time service over the Internet has become a very important issue. For example, the technique disclosed in Republic of China Patent Publication No. 286460 dynamically allocates channels for transmitting and receiving multimedia data according to the needs of the application. Its main features are: (1) when the system starts, the type and attributes of each channel are defined, and the application requests channel allocation according to its needs and sets channel parameters as required; (2) each channel has its own corresponding single-layer packet queue, which temporarily stores data packets; and (3) the schedule for transmitting and receiving packets is determined from the attributes of the allocated channels and their corresponding parameters, and data packets are then taken from the packet queue of the selected channel according to this schedule for assembly or transmission.
In this dynamic multimedia channel allocation technique, a good scheduling module, channel planning definition, and parameter definition module make it possible to synchronize data. This conventional technique relies mainly on the multimedia channel planning mechanism to solve the problems of transmitting audio-visual data. Under more congested network conditions, however, channel planning alone cannot effectively solve the multimedia transmission problem, because the bandwidth of the Internet is limited and the network is shared. A new control method is therefore needed, and the most effective approach is to adjust the video data and speech data to be transmitted directly according to the actual state of the network.

In view of this, the main object of the present invention is to provide a communication device and control method capable of dynamically planning network video and audio channels, which can dynamically adjust the bandwidth of the video data according to the bandwidth actually available on the Internet at the moment, thereby ensuring a certain quality of message transmission. Another object of the present invention is to provide a communication device and control method capable of dynamically planning network video and audio channels, which can keep the video data and speech data synchronized. In accordance with these objects, the present invention provides a network device and method capable of dynamically planning network video and audio channels.
Its main features are: (1) the data receiving end generates a short-term network bandwidth parameter (ABR) from the received video data packets and speech data packets; (2) the data transmitting end includes a video data encoding unit and a speech data encoding unit, which correspond respectively to a video encoding parameter and a speech encoding parameter, such as the video encoding bit rate (VEBR) or the minimum bandwidth required for speech encoding (LSBR); and (3) an adjustment unit dynamically adjusts, according to the short-term network bandwidth parameter, the video encoding parameter, and the speech encoding parameter, the amount of video data packets sent by the data transmitting end so that the minimum transmission requirement of the speech data packets is met.

The adjustment unit in turn includes: (1) a parameter calculation unit, which determines from the short-term network bandwidth parameter and the video encoding parameter an average network bandwidth parameter (MBR), a maximum effective speech bandwidth parameter, a minimum effective speech bandwidth parameter, and so on; (2) a state description unit, which determines the current bandwidth state of the network from these judgment parameters (the average network bandwidth parameter, the maximum and minimum effective speech bandwidth parameters, and so on) together with the speech encoding parameter; and (3) an adjustment amount calculation unit, which dynamically adjusts the amount of video data packets sent by the data transmitting end according to the bandwidth state obtained by the state description unit, so that the minimum transmission requirement of the speech data packets is met. With this architecture, the bandwidth of the video channel can be adjusted according to the actual network bandwidth, thereby planning the video and audio channels.

Brief description of the drawings:

To make the above objects, features, and advantages of the present invention easier to understand, a preferred embodiment is described in detail below with reference to the accompanying drawings:

FIG. 1 is a block diagram of the video data transmitting unit of the communication device in an embodiment of the present invention.
FIG. 2 is a block diagram of the speech data transmitting unit of the communication device in an embodiment of the present invention.
FIG. 3 is a block diagram of the video data receiving unit of the communication device in an embodiment of the present invention.
FIG. 4 is a block diagram of the speech data receiving unit of the communication device in an embodiment of the present invention.
FIG. 5 is a schematic diagram of the data flow of the adjustment unit in an embodiment of the present invention.
FIG. 6 is a block diagram of the structure of the adjustment unit in an embodiment of the present invention.
FIG. 7 is a flowchart of the processing of the adjustment unit in an embodiment of the present invention.

Explanation of symbols:

1: video data transmitting unit; 10: camera device; 11: frame list buffer; 12: re-synchronization checker; 13: H.263 encoder; 14: network management unit; 15: data statistics unit; 16: adjustment unit; 17: system control unit; 99: Internet; 2: speech data transmitting unit; 20: speech recording device; 21: speech buffer; 22: valid speech judgment unit; 23: frame-edge speech judgment unit; 24: speech encoder; 27: valid speech; 28: invalid speech; 29: frame-edge speech; 3: video data receiving unit; 30: H.263 packet list buffer; 31: filter; 32: checker; 33: H.263 decoder; 34: display device; 40: judgment unit; 41: speech packet list buffer; 42: monitoring unit; 43: frame-edge speech packet list buffer; 44: speech playback buffer; 45: speech decoder; 46: frame-edge speech; 47: valid speech; 14a, 14b: RTP channels; 14c: RTCP channel; 13a: H.263 encoder/decoder; 24a: speech encoder/decoder; 16a: parameter calculation unit; 16b: state description unit; 16c: adjustment amount calculation unit.

Embodiment:

The present invention is described below by way of an embodiment.
In a typical point-to-point video telephone or other multimedia data transmission application, the communication device of each party acts as both a transmitting end (for speech data and video data) and a receiving end (likewise for speech data and video data). At any given moment of processing, however, one party is the transmitting end and the other is the receiving end.

In this embodiment, when the system begins transmission, system-related parameters are set between the transmitting end and the receiving end. The transmitting end then establishes a connection with the receiving end through the RTCP protocol (real-time control protocol). If the connection succeeds without error, the two parties of the Internet video telephone (or other application requiring video and speech communication) can begin the call, so RTP (real-time protocol) channels over which the video and audio data of the two parties can be exchanged are established through a network agent module (an RTP video channel and an RTP speech channel, respectively).

Once the two parties are connected, the video capture device at the transmitting end (for example, a CCD camera) begins capturing images. It produces one QCIF-format image frame every 1/30 second and stores it in a designated video buffer, where it waits to be encoded by the H.263 video encoder. The data stream encoded by the H.263 video encoder is packed, in units of 2K bytes, into RTP packets through the RTP channel and sent to the receiving end. Meanwhile, the audio capture device at the transmitting end (for example, a sound card and microphone) records sound at a sampling rate of 8K samples/sec. Each 180 ms of sound is converted into 16-bit PCM (pulse code modulation) audio data, which then triggers the speech encoder. The data stream encoded by the speech encoder is packed, in units of 54 bytes, into RTP packets through the RTP channel and sent to the receiving end.

The video data transmitting end (FIG. 1), speech data transmitting end (FIG. 2), video data receiving end (FIG. 3), and speech data receiving end (FIG. 4) of the communication device in this embodiment are described in detail next.

FIG. 1 is a block diagram of the video data transmitting unit 1 of the communication device of this embodiment. In FIG. 1, the video data transmitting unit 1 mainly includes a camera device 10, a frame list buffer 11, a re-synchronization checker 12, an H.263 encoder 13, and a network management unit 14; the data statistics unit 15, the adjustment unit 16, and the system control unit 17 are responsible for adjusting the amount of video data. In the video data transmitting unit 1, the camera device 10 captures images. Each captured image is called an image frame and is stored in sequence in the frame list buffer 11.
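The packetization units mentioned above (2K bytes for encoded video, 54 bytes for encoded speech) and the size of one raw 180 ms PCM unit can be illustrated with a short sketch. It is only an illustration under stated assumptions: "2K" is taken to mean 2048 bytes, a final short payload is sent as-is, and the function names are hypothetical rather than taken from the patent.

```python
from typing import List

VIDEO_UNIT_BYTES = 2 * 1024  # assumption: "2K bytes" read as 2048 bytes
SPEECH_UNIT_BYTES = 54       # 54-byte speech unit, as stated in the text


def split_into_payloads(stream: bytes, unit_size: int) -> List[bytes]:
    """Cut an encoded stream into consecutive payloads of at most unit_size bytes.

    The last payload may be shorter than unit_size (an assumption; the text
    does not say how a trailing partial unit is handled).
    """
    if unit_size <= 0:
        raise ValueError("unit_size must be positive")
    return [stream[i:i + unit_size] for i in range(0, len(stream), unit_size)]


def pcm_bytes_per_unit(sample_rate_hz: int = 8000,
                       unit_ms: int = 180,
                       bytes_per_sample: int = 2) -> int:
    """Size in bytes of one raw PCM processing unit before speech encoding."""
    samples = sample_rate_hz * unit_ms // 1000
    return samples * bytes_per_sample
```

With the figures given in the text (8K samples/sec, 180 ms, 16-bit samples), `pcm_bytes_per_unit()` evaluates to 1440 samples times 2 bytes.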
The re-synchronization checker (re-synch checker) 12 reads each image frame in sequence from the frame list buffer 11 and passes it to the H.263 encoder 13 for compression and encoding. The resulting data stream is handed to the network management unit 14, which packs it into RTP video data packets and sends them to the Internet 99. The re-synchronization checker 12 also forces the generation of intraframe packets, which are used to synchronize video and speech playback. Specifically, at fixed intervals (for example, every 5 seconds) the re-synchronization checker 12 forces the H.263 encoder 13 to encode an intraframe packet, which the network management unit 14 likewise packs into an RTP intraframe packet and sends to the Internet 99. In other words, the packets sent by the network management unit 14 are normally RTP video data packets, and at each interval an intraframe packet. These periodically issued intraframe packets can be used to correct loss of synchronization between video and speech playback; the synchronization method is described in detail later. In addition, each RTP data packet and RTP intraframe packet carries a global time stamp that indicates its playback time.

The data statistics unit 15 calculates various parameter values for later use, such as VEBR (video encoding bit rate), VEFR (video encoding frame rate), and SNR (signal-to-noise ratio). In this embodiment, the data statistics unit 15 sends one of the video encoding parameters, namely VEBR, to the adjustment unit 16.
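The periodic forcing of intraframe packets by the re-synchronization checker 12, described above, might be sketched as follows. This is a minimal sketch, not the patented implementation: the class and method names are hypothetical, and forcing an intraframe on the very first frame is an assumption that the text does not state.

```python
from typing import Optional

INTRA_INTERVAL_S = 5.0  # example interval given in the text


class ResyncChecker:
    """Decides, per captured frame, whether the encoder must emit an intraframe."""

    def __init__(self, interval_s: float = INTRA_INTERVAL_S) -> None:
        self.interval_s = interval_s
        self._last_intra_time: Optional[float] = None

    def must_force_intra(self, capture_time_s: float) -> bool:
        """Return True when at least interval_s has passed since the last forced intraframe."""
        first_frame = self._last_intra_time is None  # assumption: first frame is forced
        if first_frame or capture_time_s - self._last_intra_time >= self.interval_s:
            self._last_intra_time = capture_time_s
            return True
        return False
```

In use, the caller would pass each frame's capture time and request intraframe coding from the encoder whenever the method returns `True`.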
The adjustment unit 16 uses the VEBR, the ABR (available bit rate) estimated by the network management unit 14 from the received packets, and the LSBR (least speech bit rate, the minimum bandwidth required for speech encoding) sent by the speech data transmitting unit 2 to generate a parameter signal ρΓ for adjusting the H.263 encoder 13 and a VCR parameter for adjusting the video capture rate of the camera device 10. Both are used to adjust the amount of video data; the details are given later.

FIG. 2 is a block diagram of the speech data transmitting unit 2 of the communication device of this embodiment. As shown in FIG. 2, the speech data transmitting unit 2 includes a speech recording device 20, a speech buffer 21, a valid speech judgment unit 22, a frame-edge speech judgment unit 23, a speech encoder 24, and the network management unit 14 (the same one as in FIG. 1). Sound is captured by the speech recording device 20, which may consist of a microphone and a sound card (performing analog-to-digital conversion). The speech data recorded by the speech recording device 20 is stored in the speech buffer 21. As noted above, this embodiment treats 180 ms of 16-bit PCM audio data as one processing unit. For each processing unit (180 ms) of audio data, the valid speech judgment unit 22 determines whether it is valid speech (voiced) or invalid speech (unvoiced).
The judgment works as follows. A threshold and a percentage are set in advance. Every audio sample in the processing unit is compared with the threshold to count the samples that exceed it; the proportion of such samples among all samples in the unit is then computed and compared with the preset percentage to decide whether the processing unit is valid or invalid speech. As a concrete example, sampling 180 ms of sound at 8K samples/sec yields 1440 audio samples, that is, 2880 bytes. With the preset threshold and percentage, the proportion of these 1440 samples that exceed the threshold can be determined; if it exceeds the preset percentage, the processing unit is judged to be valid speech, and otherwise invalid speech.

Valid speech 27 is sent directly to the speech encoder 24 for compression and encoding, then packed by the network management unit 14 into RTP speech data packets and sent to the Internet 99. Invalid speech 28, on the other hand, is first sent to the frame-edge speech (frame-edge voice) judgment unit 23. Frame-edge speech is the first unit of invalid speech that follows valid speech; however, if the time difference between this frame-edge speech and the previous frame-edge speech falls within a certain range, it is still treated as invalid speech.
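The threshold-and-percentage test and the frame-edge rule described above can be sketched in a few lines. This is an illustrative sketch only: comparing the absolute value of each sample with the threshold, reading "within a certain range" as "closer than a minimum gap", and all names (`is_valid_speech`, `FrameEdgeJudge`, `min_gap_s`) are assumptions, not details given in the patent.

```python
from typing import Optional, Sequence


def is_valid_speech(samples: Sequence[int], threshold: int, min_fraction: float) -> bool:
    """True if the share of samples above the threshold exceeds min_fraction."""
    if not samples:
        return False
    # Assumption: magnitude is compared, since PCM samples are signed.
    loud = sum(1 for s in samples if abs(s) > threshold)
    return loud / len(samples) > min_fraction


class FrameEdgeJudge:
    """Labels each processing unit as 'valid', 'frame_edge', or 'discard'."""

    def __init__(self, min_gap_s: float) -> None:
        self.min_gap_s = min_gap_s  # hypothetical name for the "certain range"
        self._prev_was_valid = False
        self._last_edge_time: Optional[float] = None

    def classify(self, unit_time_s: float, valid: bool) -> str:
        if valid:
            self._prev_was_valid = True
            return "valid"
        follows_valid = self._prev_was_valid
        self._prev_was_valid = False
        if not follows_valid:
            return "discard"  # noise or silence, not the first invalid unit
        too_soon = (self._last_edge_time is not None
                    and unit_time_s - self._last_edge_time < self.min_gap_s)
        if too_soon:
            return "discard"  # still treated as invalid speech
        self._last_edge_time = unit_time_s
        return "frame_edge"
```

Units labelled `valid` or `frame_edge` would go on to the speech encoder; units labelled `discard` would be dropped, which is what saves bandwidth.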
Similarly, frame-edge speech 29 is sent to the speech encoder 24 for compression and encoding, then packed by the network management unit 14 into RTP speech data packets and sent to the Internet 99. All sound other than valid speech 27 and frame-edge speech 29 is therefore regarded as noise or silence and can be discarded without processing, which reduces the amount of audio data sent over the network. The speech data transmitting unit 2 also uses the data statistics unit 15 shown in FIG. 1 to generate, from the encoding state of the speech encoder 24, the speech encoding parameter LSBR corresponding to the minimum bandwidth required for speech encoding, and sends it to the adjustment unit 16; its typical value is 2.4 kbps. Each RTP speech data packet also carries its own global time stamp.

FIG. 3 is a block diagram of the video data receiving unit 3 of the communication device of this embodiment. In FIG. 3, the video data receiving unit 3 includes an H.263 packet list buffer 30, a filter 31, a checker 32, an H.263 decoder 33, and a display device 34.

RTP video data packets received from the Internet 99 first have the header of the RTP protocol removed by a network management unit (not shown) and are then placed in the H.263 packet list buffer 30. The filter 31 takes out the H.263 video data packets stored in the H.263 packet list buffer 30, and can do so in two ways. The first is to take the H.263 video data packets out in sequence, which is the normal mode. The second is to jump directly to the position of the next intraframe packet and take out the H.263 video data packets that follow; that is, all H.263 video data packets between the packet currently taken out and the next intraframe packet are skipped. As described above, intraframe packets are generated and sent periodically (every 5 seconds). When the checker 32 of the video data receiving unit 3 detects that video and sound are out of step (because video data is larger, sound usually plays ahead of video), jumping directly to the intraframe packet shortens the difference in playback time between the two.

The checker 32 checks the synchronization state of video and speech. It uses the global time stamps in the H.263 video data packets and speech data packets to determine whether their playback times are synchronized. Synchronization is adjusted through the way the filter 31 takes out H.263 video data packets, as described in detail later. The H.263 decoder 33 then decodes every H.263 video data packet taken out, and the result is shown on the display device 34.

FIG. 4 is a block diagram of the speech data receiving unit 4 of the communication device of this embodiment. In FIG. 4, the speech data receiving unit 4 includes a judgment unit 40, a speech packet list buffer 41, a monitoring unit 42, a frame-edge speech packet list buffer 43, a speech playback buffer 44, and a speech decoder 45. As with video, RTP speech data packets received from the Internet 99 first have the RTP header removed by the network management unit (not shown) and are then passed to the judgment unit 40, which determines whether each packet contains valid speech or frame-edge speech. A speech data packet made up of valid speech is first stored in the speech packet list buffer 41; a speech data packet made up of frame-edge speech is sent directly to the frame-edge speech packet list buffer 43 and stored there.

The monitoring unit 42 monitors how quickly data accumulates in the speech packet list buffer 41 and controls the output of packets from that buffer. When the monitoring unit 42 finds that the number of packets in the speech packet list buffer 41 exceeds an upper limit, it forcibly clears all packets from the buffer, to avoid an excessively long sound delay during the call, and sends them to the speech playback buffer 44, from which they are decoded by the speech decoder 45 and played. Frame-edge speech packets, by contrast, are stored in the frame-edge speech packet list buffer 43. Under normal conditions, therefore, when a packet made up of frame-edge speech is received, all the preceding valid-speech packets are cleared from the speech packet list buffer 41 and sent to the speech playback buffer 44, then decoded by the speech decoder 45 and played. The global time stamp attached to each packet can be used to determine the timing relationship between packets.
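The buffering behavior of the speech data receiving unit 4 described above (flush on a frame-edge packet, or when the buffer grows past an upper limit) might look like the following sketch. The class name, the `max_buffered` parameter, and the use of in-memory lists are illustrative assumptions; decoding and what later happens to the stored frame-edge packets are left out because this part of the text does not specify them.

```python
from collections import deque
from typing import Deque, List


class SpeechReceiver:
    """Toy model of buffers 41, 43, and 44 and the monitoring unit 42."""

    def __init__(self, max_buffered: int) -> None:
        self.max_buffered = max_buffered                 # hypothetical upper limit
        self.speech_packets: Deque[bytes] = deque()      # speech packet list buffer 41
        self.frame_edge_packets: Deque[bytes] = deque()  # frame-edge packet list buffer 43
        self.playback: List[bytes] = []                  # speech playback buffer 44

    def _flush(self) -> None:
        """Move every buffered valid-speech packet to the playback buffer, in order."""
        self.playback.extend(self.speech_packets)
        self.speech_packets.clear()

    def receive(self, payload: bytes, is_frame_edge: bool) -> None:
        if is_frame_edge:
            # Normal case: a frame-edge packet releases the preceding speech.
            self.frame_edge_packets.append(payload)
            self._flush()
            return
        self.speech_packets.append(payload)
        if len(self.speech_packets) > self.max_buffered:
            # Monitoring unit: force a flush to bound the playback delay.
            self._flush()
```

The forced flush trades a possible mid-utterance cut for a bounded delay, which matches the stated goal of avoiding an overly long sound delay during a call.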

The following describes in detail the role of the checker 32 in FIG. 3 (video data receiving section 3) and how an adjustment is made when synchronization is lost. As described above, every H.263 video data packet and every voice data packet carries its own global time tag. The checker 32 therefore takes the global time tag TS_263 of the H.263 video data packet currently being processed and the global time tag TS_speech of the voice data packet currently being processed, and evaluates the following condition:

TS_263 < TS_speech − TTS    (1)

where TTS is the allowable time difference, for example 5 × (TS_speech(n) − TS_speech(n−1)). If condition (1) holds, synchronization has been lost: processing of the audio is running ahead of processing of the video. The adjustment must therefore speed up video processing. This embodiment does so by jumping directly to the next inter-frame packet and continuing from there, ignoring all H.263 video data packets in between. Skipping these H.263 video data packets shortens the time difference between audio and video and so restores synchronization.

The following describes how this embodiment plans the video and audio channels dynamically by controlling the amount of data in response to changes in the network state (for example, the network bandwidth). For a typical point-to-point Internet video telephone application, two assumptions are made first:

(1) Each end of the call acts as both a receiving end and a transmitting end. When either end detects a change in network bandwidth, it can therefore respond directly by changing the amount of data it sends, thereby adjusting its use of the network bandwidth.

(2) Compared with the amount of data produced by a typical H.263 encoder, a 2.4 kbps speech encoder produces little data, and for a video telephone the information carried by the voice is more important than that carried by the video.

Two conclusions follow from these assumptions. First, because both communication devices in a call have transmitting and receiving functions, each device can determine the current network bandwidth on its own from the packets arriving at its receiving end, and can then plan the video and audio channels dynamically from that bandwidth together with its own data encoding parameters (for both video and voice). Second, because the voice information is more important than the video information, the video and audio channels can be planned by controlling the amount of data on the video channel. In this embodiment, the amount of data is controlled mainly through the bit rate and image quality produced by the H.263 video encoder and through the video capture amount of the camera device.

FIG. 5 is a data flow diagram of the adjustment unit 16 of this embodiment and shows the situation inside a single communication device. As described above, each communication device includes a transmitting end (1, 2) and a receiving end (3, 4) and is connected to the Internet 99 through the network management unit 14. The network management unit 14 uses the RTCP channel 14c to establish a control mechanism with the Internet 99 and establishes the RTP channels 14a and 14b, which serve as the voice channel and the video channel, respectively. That is, the voice and video data encoded by the speech encoder 24 and the H.263 encoder 13 in the transmitting end (1, 2) are sent to the Internet 99 over the RTP channels 14a and 14b, respectively.
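The synchronization check of condition (1), described earlier in this section, can be sketched in Python as follows. This is an illustration under stated assumptions rather than the patented implementation: `VideoPacket`, `out_of_sync`, and `next_video_index` are hypothetical names, time tags are treated as plain integers, and the pending video packets are assumed to be available as a sequence.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class VideoPacket:
    timestamp: int          # global time tag TS_263 of the packet
    is_inter_frame: bool    # True for the periodically inserted inter-frame packet


def out_of_sync(ts_263: int, ts_speech: int, tts: int) -> bool:
    """Condition (1): video lags speech by more than the allowable difference TTS."""
    return ts_263 < ts_speech - tts


def next_video_index(packets: Sequence[VideoPacket], current: int,
                     ts_speech: int, tts: int) -> int:
    """Return the index of the video packet that should be processed next."""
    if not out_of_sync(packets[current].timestamp, ts_speech, tts):
        return current                      # still in sync: no packet is skipped
    # Out of sync: skip ahead to the next inter-frame packet.
    for i in range(current + 1, len(packets)):
        if packets[i].is_inter_frame:
            return i
    # Assumption: if no inter-frame packet is buffered yet, nothing is processed.
    return len(packets)
```

Skipping to an inter-frame packet rather than to an arbitrary packet gives the decoder a defined restart point, which is why the sketch searches for the flag instead of comparing time tags alone.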

In the other direction, voice and video data packets that the RTP channels 14a and 14b receive from the Internet 99 are passed to the speech decoder 45 and the H.263 decoder 33 of the receiving end (3, 4) for decoding.

As shown in FIG. 5, the adjustment unit 16 of this embodiment makes its adjustment according to the ABR (available bit rate) parameter sent by the network management unit 14, the LSBR (minimum bandwidth required for speech encoding) parameter of the speech encoder 24 at the transmitting end (1, 2), and the VEBR (video encoding bit rate) parameter of the H.263 encoder 13. The adjustment itself is applied where the video data is generated, that is, through the VCR (video capture amount) parameter and the Pr parameter (the encoding parameters and encoding mode of the H.263 video encoder).

The way the network management unit 14 generates the ABR parameter is described next. The ABR parameter corresponds to a short-term network bandwidth. Two cases are distinguished. When the underlying network supports RSVP (Resource Reservation Protocol), the ABR parameter can be obtained directly from RTCP:

L = Arrival_Time(RTCP_n) − Initiate_Time(RTCP_n)    (2)
ABR = Lengthof(RTCP_n) / L    (3)

where

Initiate_Time(RTCP) is the global time tag at which the RTCP packet was sent, Arrival_Time(RTCP) is the global time tag at which the RTCP packet was received, and Lengthof(RTCP) is the length of the RTCP packet. If the underlying network does not support RSVP, the ABR parameter must instead be obtained from the total amount of data received:

ABR = bps(H.263 video packets) + bps(voice packets)    (4)

where bps(H.263 video packets) is the average number of bits of H.263 packets currently received per second at the receiving end, and bps(voice packets) is the average number of bits of voice packets currently received per second at the receiving end.

FIG. 6 is a block diagram of the adjustment unit 16 of this embodiment and labels the parameters involved, namely ABR, VEBR, LSBR, VCR, and Pr. The adjustment unit 16 consists of a parameter calculation unit 16a, a state description unit 16b, and an adjustment amount calculation unit 16c.

The parameter calculation unit 16a computes a set of judgment parameters from the ABR parameter sent by the network management unit 14 and the VEBR parameter sent by the H.263 codec 13a. In this embodiment these judgment parameters are the average network bandwidth (average available bit rate) MBR, the maximum and minimum effective audio bandwidths MF_u and MF_l, and the maximum and minimum effective network bandwidths MBR_u and MBR_l. The average network bandwidth MBR is obtained by averaging the short-term ABR values over successive measurements, where n denotes the sequence number of the measurement. The remaining judgment parameters are determined from two preset, adjustable values, LB and UB, which give the lower and upper limits of the adjustment interval as percentages:

MF_u = (MBR − VEBR) × UB%    (6)
MF_l = (MBR − VEBR) × LB%    (7)
MBR_u = MBR × UB%    (8)
MBR_l = MBR × LB%    (9)

The state description unit 16b uses these judgment parameters together with the LSBR parameter sent by the speech codec 24a to determine how the current network bandwidth relates to the video and audio bandwidths, and sends the resulting state to the adjustment amount calculation unit 16c. Finally, the adjustment amount calculation unit 16c uses that state to calculate the amount of video data to be adjusted. It either calculates the new VCR parameter value and sends it to the system control unit 17 to control the camera device, or attaches the H.263 encoding parameter values to be changed to the Pr parameter and sends them to the H.263 codec 13a.

The calculations described above are not intended to limit the present invention. Those skilled in the art may use other video encoding parameters, speech encoding parameters, or other judgment parameters to achieve the same purpose. Nor is the device that performs each calculation limited by the above description. For example, the average network bandwidth MBR may be calculated in the network management unit 14 and then sent to the adjustment unit 16, and some of the judgment parameters calculated in the parameter calculation unit 16a may instead be calculated in the state description unit 16b.

FIG. 7 is a detailed processing flowchart of the adjustment unit 16 of this embodiment. It is used below to describe the judgment performed by the state description unit 16b and the calculation performed by the adjustment amount calculation unit 16c. First, the parameter calculation unit 16a of the adjustment unit 16 calculates and sets the judgment parameters, such as MF (MF_u and MF_l), MBR (MBR_u and MBR_l), LB, and UB (step S1).
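The ABR estimates of equations (2) to (4) and the judgment parameters of equations (6) to (9) can be sketched as below. This is a minimal illustration, not a definitive implementation: the function names are hypothetical, the RTCP packet length is assumed to be in bits and the time tags in seconds so that ABR comes out in bits per second, and MBR is taken as an input because the exact averaging formula is not given in this section. The UB and LB values in the usage note are illustrative only.

```python
def abr_from_rtcp(length_bits: float, initiate_time: float, arrival_time: float) -> float:
    """ABR when RSVP is supported: equations (2) and (3)."""
    latency = arrival_time - initiate_time          # equation (2)
    if latency <= 0:
        raise ValueError("arrival time must be later than initiate time")
    return length_bits / latency                    # equation (3)


def abr_from_throughput(video_bps: float, voice_bps: float) -> float:
    """ABR when RSVP is not supported: equation (4)."""
    return video_bps + voice_bps


def judgment_parameters(mbr: float, vebr: float, lb_pct: float, ub_pct: float):
    """Return (MF_u, MF_l, MBR_u, MBR_l) per equations (6) to (9)."""
    mf_u = (mbr - vebr) * ub_pct / 100              # equation (6)
    mf_l = (mbr - vebr) * lb_pct / 100              # equation (7)
    mbr_u = mbr * ub_pct / 100                      # equation (8)
    mbr_l = mbr * lb_pct / 100                      # equation (9)
    return mf_u, mf_l, mbr_u, mbr_l
```

For example, with MBR = 100000, VEBR = 60000, LB = 80 and UB = 120, `judgment_parameters` returns an audio band of 32000 to 48000 and a network band of 80000 to 120000.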

The bandwidth state is then judged from these parameters. The first check is whether voice transmission has sufficient bandwidth (step S2). Using the maximum effective audio bandwidth MF_u, the following condition is tested:

MF_u < LSBR    (10)

If condition (10) holds, the current network bandwidth is seriously insufficient, and the amount of video data must be adjusted to achieve the planning objective. The adjustment made in this case (step S3) acts directly on the video capture amount VCR, as follows:

VCR_new = VCR_old × (1 − (LSBR − MF_u) / MBR)    (11)

where VCR_old is the original video capture amount and VCR_new is the adjusted video capture amount.

If condition (10) does not hold, the current bandwidth is not insufficient. The next check is whether the bandwidth state is abnormal. There are two abnormal states: the bandwidth is falling (step S4) or the bandwidth is rising (step S6). In either case the amount of video data also needs to be adjusted. In this embodiment, that adjustment is made by controlling the video capture amount VCR and the encoding parameters of the H.263 video encoder.

Step S4 checks whether the bandwidth is falling using the following condition:

MF_l > LSBR and MBR_l > ABR    (12)

If condition (12) holds, the current bandwidth still meets the bandwidth requirement of the voice, but the network bandwidth is gradually becoming insufficient. Adjustment step S5 can take either of the following two forms:

(A) Reduce the number of bits needed to transmit the video by increasing the quantization parameter (quantizer) used by the H.263 video encoder, which sacrifices image quality, and by encoding in PB mode.

(B) Adjust the video capture amount VCR:

VCR_new = VCR_old × (1 − (MBR_l − ABR) / MBR)    (13)

Step S6 checks whether the bandwidth is rising using the following condition:

MF_l > LSBR and ABR > MBR_u    (14)

If condition (14) holds, the bandwidth is becoming sufficient in the short term but is not yet sufficient in the long term. Adjustment step S7 proceeds as follows:

(C) If VCR_old < 30 and VCR_new ≤ 30, adjust the video capture amount VCR:

VCR_new = VCR_old × (1 + (ABR − MBR_u) / MBR)    (15)

(D) If VCR_old = 30, the bandwidth has reached a safe range, so image quality can be improved instead, for example by reducing the quantization parameter used by the H.263 video encoder or by encoding in AP mode.

Finally, if the current parameter settings already match the bandwidth state, no adjustment is needed. Processing returns to step S1, and the judgment and adjustment described above are repeated with the new ABR, VEBR, and LSBR parameters.
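The decision flow of steps S2 to S7 can be sketched as a small state classifier plus a VCR update. Several points are assumptions and not taken from the source: the names `BandwidthState`, `classify`, and `adjust_vcr` are hypothetical; the rising test is written as ABR > MBR_u, which is the reading under which equation (15) yields a positive increment; the result of equation (15) is capped at the limit of 30 rather than rejected when it would exceed it; and the quantizer and PB/AP mode options (A) and (D) are not modelled.

```python
from enum import Enum


class BandwidthState(Enum):
    VOICE_INSUFFICIENT = "voice bandwidth insufficient"
    FALLING = "bandwidth falling"
    RISING = "bandwidth rising"
    STABLE = "no adjustment needed"


def classify(abr, lsbr, mf_u, mf_l, mbr_u, mbr_l) -> BandwidthState:
    """Steps S2, S4 and S6: conditions (10), (12) and (14)."""
    if mf_u < lsbr:                              # condition (10)
        return BandwidthState.VOICE_INSUFFICIENT
    if mf_l > lsbr and mbr_l > abr:              # condition (12)
        return BandwidthState.FALLING
    if mf_l > lsbr and abr > mbr_u:              # condition (14), assumed direction
        return BandwidthState.RISING
    return BandwidthState.STABLE


def adjust_vcr(state, vcr_old, abr, lsbr, mbr, mf_u, mbr_u, mbr_l, vcr_limit=30.0):
    """Steps S3, S5(B) and S7(C): equations (11), (13) and (15)."""
    if state is BandwidthState.VOICE_INSUFFICIENT:
        return vcr_old * (1 - (lsbr - mf_u) / mbr)       # equation (11)
    if state is BandwidthState.FALLING:
        return vcr_old * (1 - (mbr_l - abr) / mbr)       # equation (13)
    if state is BandwidthState.RISING and vcr_old < vcr_limit:
        vcr_new = vcr_old * (1 + (abr - mbr_u) / mbr)    # equation (15)
        return min(vcr_limit, vcr_new)                   # assumed cap at the limit
    return vcr_old                                       # stable, or already at the limit
```

With MBR = 100000 and a network band of 80000 to 120000, an ABR of 70000 is classified as falling and reduces a VCR of 20 by 10 percent, while an ABR of 130000 is classified as rising and raises it by 10 percent.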

As described above, the communication device and control method of this embodiment, which can dynamically plan network video and audio channels, adjust the bandwidth of the video data dynamically according to the actual available bandwidth of the Internet. This guarantees at least the transmission quality of the voice information, which is the more important information in applications such as Internet telephony. In addition, the global time tag attached to each packet and the inter-frame packets inserted among the video data packets keep the video data and the voice data synchronized, thereby achieving the object of the present invention.

Although the present invention has been disclosed above by way of a preferred embodiment, the embodiment is not intended to limit the invention. Anyone skilled in the art may make minor changes and refinements without departing from the spirit and scope of the invention. The scope of protection of the invention is therefore defined by the appended claims.

Claims

1. A network device capable of dynamically planning network video and audio channels, the network device being connected to a network to send and receive video data packets and voice data packets and comprising a data receiving end and a data transmitting end, characterized in that:
the data receiving end generates a short-term network bandwidth parameter according to the received video data packets and voice data packets;
the data transmitting end comprises a video data encoding section and a voice data encoding section, corresponding respectively to a video encoding parameter and a voice encoding parameter; and
the network device further comprises an adjustment section, coupled to the data receiving end and the data transmitting end, which dynamically adjusts the amount of data of the video data packets sent by the data transmitting end according to the short-term network bandwidth parameter, the video encoding parameter, and the voice encoding parameter, so as to meet the minimum transmission requirement of the voice data packets.

2. The network device as described in claim 1, wherein the video encoding parameter is the video encoding bit rate (VEBR) and the voice encoding parameter is the minimum bandwidth required for voice encoding (LSBR).

3. The network device as described in claim 1 or 2, wherein the adjustment section comprises:
a parameter calculation section, which determines an average network bandwidth parameter, a maximum effective voice bandwidth parameter, and a minimum effective voice bandwidth parameter according to the short-term network bandwidth parameter and the video encoding parameter;
a state description section, coupled to the parameter calculation section, which determines the bandwidth state of the network according to the average network bandwidth parameter, the maximum effective voice bandwidth parameter, the minimum effective voice bandwidth parameter, and the voice encoding parameter; and
an adjustment amount calculation section, coupled to the state description section, which dynamically adjusts the amount of data of the video data packets sent by the data transmitting end according to the bandwidth state obtained by the state description section, so as to meet the minimum transmission requirement of the voice data packets.

4. The network device as described in claim 3, wherein the parameter calculation section averages the short-term network bandwidth parameter to obtain the average network bandwidth parameter.

5. The network device as described in claim 3, wherein the parameter calculation section includes an adjustment interval upper limit and an adjustment interval lower limit, and the parameter calculation section determines the maximum effective voice bandwidth parameter according to the average network bandwidth parameter, the video encoding parameter, and the adjustment interval upper limit, and determines the minimum effective voice bandwidth parameter according to the average network bandwidth parameter, the video encoding parameter, and the adjustment interval lower limit.

6. The network device as described in claim 3, wherein the bandwidth state of the network determined by the state description section is one of three states: insufficient voice network bandwidth, falling network bandwidth, and rising network bandwidth.

7. The network device as described in claim 6, wherein, when the state description section determines that the bandwidth state of the network is insufficient voice network bandwidth, the adjustment amount calculation section directly adjusts the image capture amount of a video capture device in the data transmitting end; and when the state description section determines that the bandwidth state of the network is falling network bandwidth or rising network bandwidth, the adjustment amount calculation section adjusts the image capture amount of a video capture device in the data transmitting end or adjusts the video data encoding section.

8. The network device as described in claim 1, wherein the video data packets include an inter-frame packet sent at a predetermined interval, and the data receiving end uses the inter-frame packet to synchronize the received video data packets and voice data packets.

9. The network device as described in claim 1, wherein the video data packets and the voice data packets each contain a global time tag, and the data receiving end synchronizes video and voice according to the global time tag of the video data packets and the global time tag of the voice data packets.

10. A control method capable of dynamically planning network video and audio channels, for controlling the sending and receiving of video data packets and voice data packets on a network, comprising the following steps:
determining a short-term network bandwidth parameter according to the received video data packets and voice data packets;
obtaining the video encoding parameter of the video data encoding section used to encode video data and the voice encoding parameter of the voice data encoding section used to encode voice data; and
adjusting the amount of data of the video data packets sent according to the short-term network bandwidth parameter, the video encoding parameter, and the voice encoding parameter, so as to meet the minimum transmission requirement of the voice data packets.

11. The control method as described in claim 10, wherein the video encoding parameter is the video encoding bit rate (VEBR) and the voice encoding parameter is the minimum bandwidth required for voice encoding (LSBR).

12. The control method as described in claim 10 or 11, wherein the adjusting step comprises the following steps:
determining an average network bandwidth parameter, a maximum effective voice bandwidth parameter, and a minimum effective voice bandwidth parameter according to the short-term network bandwidth parameter and the video encoding parameter;
determining the bandwidth state of the network according to the average network bandwidth parameter, the maximum effective voice bandwidth parameter, the minimum effective voice bandwidth parameter, and the voice encoding parameter; and
adjusting the amount of data of the video data packets sent according to the bandwidth state obtained by the state description section, so as to meet the minimum transmission requirement of the voice data packets.

13. The control method as described in claim 12, wherein, in the step of determining the average network bandwidth parameter, the short-term network bandwidth parameter is averaged to obtain the average network bandwidth parameter.

14. The control method as described in claim 12, wherein, in the step of determining the maximum effective voice bandwidth parameter and the minimum effective voice bandwidth parameter, the maximum effective voice bandwidth parameter is determined according to the average network bandwidth parameter, the video encoding parameter, and an adjustment interval upper limit, and the minimum effective voice bandwidth parameter is determined according to the average network bandwidth parameter, the video encoding parameter, and an adjustment interval lower limit.

15. The control method as described in claim 14, wherein the step of determining the bandwidth state of the network determines the bandwidth state to be one of three states: insufficient voice network bandwidth, falling network bandwidth, and rising network bandwidth.

16. The control method as described in claim 15, wherein, when the bandwidth state of the network is determined to be insufficient voice network bandwidth, the image capture amount of a video capture device is adjusted directly; and when the bandwidth state of the network is determined to be falling network bandwidth or rising network bandwidth, the image capture amount of a video capture device or the video data encoding section is adjusted.

17. The control method as described in claim 10, wherein the video data packets include an inter-frame packet sent at a predetermined interval, and the data receiving end uses the inter-frame packet to synchronize the received video data packets and voice data packets.

18. The control method as described in claim 10, wherein the video data packets and the voice data packets each contain a global time tag, and the control method further comprises a step of synchronizing video and voice according to the global time tag of the video data packets and the global time tag of the voice data packets.