TW202326526A - Method for reducing network bandwidth required for image streaming by using artificial intelligence processing module


Info

Publication number
TW202326526A
Authority
TW
Taiwan
Prior art keywords
training
images
resolution
image
weighting parameters
Prior art date
Application number
TW110148954A
Other languages
Chinese (zh)
Inventor
郭榮昌
曹文凱
吳英豪
Original Assignee
日商優必達株式會社股份有限公司
Priority date
Filing date
Publication date
Application filed by 日商優必達株式會社股份有限公司 filed Critical 日商優必達株式會社股份有限公司
Priority to TW110148954A priority Critical patent/TW202326526A/en
Publication of TW202326526A publication Critical patent/TW202326526A/en


Landscapes

  • Information Transfer Between Computers (AREA)

Abstract

A method for reducing the network bandwidth required for image streaming by using an Artificial Intelligence (AI) processing module. The resolution of the images to be transmitted is first reduced on the server side, and the resulting low-resolution images are then transmitted to the client device over the network, thereby reducing the network bandwidth required to transmit the image streams. A pre-trained AI processing module on the client device restores the received low-resolution images to high-resolution images, allowing users to enjoy high-quality image streaming and low network bandwidth consumption at the same time.

Description

Method of Using Artificial Intelligence Processing Module to Reduce Network Bandwidth Required for Video Streaming

The present invention relates to a method for reducing the network bandwidth required for video streaming by using an artificial intelligence processing module, and more particularly to a method in which the resolution of the images to be transmitted is first reduced on the server side before they are transmitted over the network to the client, and a pre-trained Artificial Intelligence (AI) processing module on the client then restores the received images to high resolution, thereby reducing the network bandwidth required for video streaming.

In recent years, online games have become increasingly popular around the world. With the development of cloud-based computing systems and technologies, a cloud technology has also been developed in which a server delivers game content to players as a video stream over the network to provide online game services.

Traditionally, such cloud-based online game services let the server perform almost all of the computation. In other words, when an online game service is provided, a specific application runs on the server to generate a virtual 3D environment containing many 3D (three-dimensional) objects, including 3D objects that can be controlled or moved by the player. Then, according to the player's control input, the server renders these 3D objects and the virtual 3D environment into a 2D (two-dimensional) game frame for display on the player's device. Next, the server encodes and compresses the rendered images into a 2D video stream and transmits it to the player's device over the network. The player's device only needs to decode the received 2D video stream and "play" it, without performing any 3D rendering computation. However, this kind of cloud online game service still has several problems, for example: the heavy server load when providing 3D rendering for a large number of players simultaneously, the degradation of picture quality caused by the encoding, compression, and streaming procedures, and the large amount of communication bandwidth consumed by transmitting the 2D video stream over the network.

One known way to address the degradation of picture quality is to raise the resolution of the original images generated by the game application on the server side and to raise the bitrate used when transmitting the images, that is, to lower the compression ratio used when the server encodes the original images into the 2D video stream. Obviously, however, doing so significantly increases both the server load and the bandwidth consumption because of the higher resolution and higher transmission bitrate. For example, assuming the frame rate and the encoding compression ratio are both fixed, when the resolution of the original images generated by the server-side game application is raised from 720p to 1080p, both the server's computing load and the required network transmission bitrate increase by a factor of 2.25. Conversely, if one tries to reduce the server load or the network bandwidth consumption, the picture quality of the game images is sacrificed. Perfect image quality and economical bandwidth consumption therefore become a dilemma in which one cannot have both.
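The 2.25× figure follows directly from the per-frame pixel counts. A quick sanity check, assuming square pixels and holding frame rate and compression ratio constant:

```python
# Pixel counts for the two resolutions mentioned in the text.
pixels_720p = 1280 * 720    # 921,600 pixels per frame
pixels_1080p = 1920 * 1080  # 2,073,600 pixels per frame

# With frame rate and compression ratio fixed, server load and
# transmission bitrate scale roughly with pixels per frame.
ratio = pixels_1080p / pixels_720p
print(ratio)  # 2.25
```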

Another way to address this problem is to lower the resolution of the original images generated by the server-side game application, or to encode the original images into the 2D video stream at a higher compression ratio, or both. This reduces the bandwidth consumed by transmitting the 2D video stream over the network, although the picture quality of the game images is again sacrificed. At the same time, an image enhancement technique is used on the client device: once the 2D video stream is received, the client device decodes it and applies the enhancement technique to improve the visual quality of the images. Histogram equalization (HE), thanks to its simplicity and efficiency, is one of the most commonly used methods for improving image contrast. However, HE can cause excessive contrast enhancement and feature loss, resulting in an unnatural appearance and a loss of detail in the processed images. Moreover, not only HE but all other image enhancement techniques known in the art face the same dilemma: they all try to use one fixed set of algorithms to process a wide variety of images with completely different picture content, and that approach is simply not workable.

Take a cloud online game service as an example: the picture content of the original images generated by the server changes significantly as the game scene changes. A city scene may fill the game's original images with objects that have simple, clear outlines and colors that differ but belong to roughly the same color family. A dark cave scene fills the original images with flat colors of low tone and low chroma, with irregular yet inconspicuous landscape outlines. A lush garden scene fills the original images with many vibrant, brightly colored objects with detailed and complex silhouettes. Undoubtedly, no single conventional enhancement technique can provide equally good image enhancement for such varied scenes with completely different picture content.

In addition, another drawback of these conventional image enhancement techniques is that although their mathematical formulas can improve picture attributes such as contrast, sharpness, and saturation, the formulas and their parameters are completely unrelated to the original images generated by the server. Consequently, their enhancement process can never make the enhanced image visually closer to its corresponding original image, and players on the client side therefore cannot fully enjoy the picture quality of the original images produced by the server-side game application.

Accordingly, the main object of the present invention is to provide a method for reducing the network bandwidth required for video streaming by using an artificial intelligence processing module. The resolution of the images to be transmitted is first reduced on the server side, and the low-resolution images are then transmitted to the client over the network, thereby reducing the network bandwidth required to transmit the video stream. A pre-trained Artificial Intelligence (AI) processing module on the client then restores the received low-resolution images to high-resolution images, providing the twin advantages of high image quality and low network bandwidth consumption.

To achieve the above object, an embodiment of the method of the present invention for reducing the network bandwidth required for video streaming by using an artificial intelligence processing module includes: Step (A): executing a first application in a server to generate a plurality of source images corresponding to a plurality of original images; the source images have a first resolution; the source images are encoded and compressed by an encoder in the server to generate a corresponding plurality of encoded images. Step (B): executing a second application in a client device remote from the server; the second application is associated with and cooperates with the first application. Step (C): the client device connects to the server via a network and then receives, as a video stream over the network, the encoded images generated by the server. Step (D): the client device decodes the encoded images into a corresponding plurality of decoded images and uses an Artificial Intelligence (AI) processing module to raise the resolution of the decoded images, generating a corresponding plurality of high-resolution images; the high-resolution images have a second resolution, where the second resolution is higher than the first resolution and the resolution of the original images equals the second resolution. Step (E): the client device sequentially outputs the high-resolution images to a screen as the output images to be played. The AI processing module processes the decoded images using at least one mathematical formula and a plurality of weighting parameters obtained in advance by analyzing the differences between the decoded images and the corresponding original images; the resulting high-resolution images thus have the same resolution as the corresponding original images, which is higher than the resolution of the source images. The at least one mathematical formula and the weighting parameters of the AI processing module are defined in advance by a training procedure executed by an artificial neural network module in a training server.
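A minimal end-to-end sketch of steps (A) through (E), assuming nearest-neighbor downscaling on the server and a placeholder `ai_upscale` standing in for the trained AI processing module; the function names and the elided "encoding" step are illustrative, not from the patent:

```python
# Toy grayscale frames as nested lists; encoding/decoding is modeled as
# identity so the sketch focuses on the resolution flow of steps (A)-(E).

def downscale(frame, factor):
    """Server side: reduce resolution by keeping every factor-th pixel."""
    return [row[::factor] for row in frame[::factor]]

def ai_upscale(frame, factor):
    """Client side: placeholder for the trained AI processing module;
    a nearest-neighbor upscale stands in for the learned formula."""
    return [[px for px in row for _ in range(factor)]
            for row in frame for _ in range(factor)]

# Step (A): the first application produces an original image at the
# second (high) resolution; the server derives a low-resolution
# source image from it before encoding.
original = [[(x + y) % 256 for x in range(8)] for y in range(8)]
source = downscale(original, 2)            # first resolution (low)

# Steps (B)-(C): the encoded source image travels over the network.
received = source                          # encode/stream/decode elided

# Step (D): the AI processing module restores the second resolution.
restored = ai_upscale(received, 2)

# Step (E): the restored frame matches the original's resolution.
print(len(restored), len(restored[0]))     # 8 8
```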

Preferably, the encoded images in step (A) are generated by the following steps: executing the first application in the server to generate the original images, which have the second resolution; using a resolution reduction procedure to reduce the resolution of the original images to the first resolution, obtaining the corresponding source images; and using the encoder to encode the source images, obtaining the corresponding encoded images.

Preferably, the server includes an AI encoding module, and the encoded images in step (A) are generated by the following steps: executing the first application in the server to generate the original images, which have the second resolution; and using the AI encoding module both to reduce the resolution of the original images to obtain the corresponding source images and to encode the source images to obtain the corresponding encoded images. The AI encoding module contains at least one preset AI encoding formula, and the at least one AI encoding formula contains a plurality of preset encoding weighting parameters.

Preferably, the at least one mathematical formula of the AI processing module includes a first preset AI formula and a second preset AI formula; the first preset AI formula includes a plurality of first weighting parameters, and the second preset AI formula includes a plurality of second weighting parameters. The first preset AI formula together with the first weighting parameters is used to raise image resolution, so an image processed by the first preset AI formula with the first weighting parameters is raised from the first resolution to the second resolution. The second preset AI formula together with the second weighting parameters is used to enhance image quality, so an image processed by the second preset AI formula with the second weighting parameters has higher quality than the decoded image and is closer to the quality of the original image.

Preferably, after the client device decodes the received encoded images into the corresponding decoded images, the client device processes the decoded images in one of the following ways:

Method 1: the client device first processes the decoded images with the first preset AI formula and the first weighting parameters to generate corresponding resolution-upscaled images at the second resolution; the client device then processes the resolution-upscaled images with the second preset AI formula and the second weighting parameters to generate the high-resolution images, which have both high image quality and the second resolution.

Method 2: the client device first processes the decoded images with the second preset AI formula and the second weighting parameters to generate corresponding quality-enhanced images of high image quality; the client device then processes the quality-enhanced images with the first preset AI formula and the first weighting parameters to generate the high-resolution images, which have both the second resolution and high image quality.
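The two orderings above can be sketched as function composition. `upscale_formula` and `enhance_formula` below are hypothetical stand-ins for the two preset AI formulas and their weighting parameters, applied to a toy 1-D signal so the example stays self-contained:

```python
# Hypothetical stand-ins for the two preset AI formulas; a "frame" is a
# 1-D list of pixel values to keep the sketch short.

def upscale_formula(frame, weights=(0.5, 0.5)):
    """First preset AI formula: doubles resolution by inserting a
    weighted blend of each pair of neighboring pixels."""
    out = []
    for a, b in zip(frame, frame[1:] + frame[-1:]):
        out += [a, weights[0] * a + weights[1] * b]
    return out

def enhance_formula(frame, gain=1.1):
    """Second preset AI formula: a toy quality enhancement that
    stretches contrast around the mean."""
    mean = sum(frame) / len(frame)
    return [mean + gain * (px - mean) for px in frame]

decoded = [10.0, 20.0, 30.0, 40.0]

# Method 1: raise resolution first, then enhance quality.
method1 = enhance_formula(upscale_formula(decoded))

# Method 2: enhance quality first, then raise resolution.
method2 = upscale_formula(enhance_formula(decoded))

# Both orderings end at the second (doubled) resolution.
print(len(method1), len(method2))  # 8 8
```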

1, 501, 701: server

2, 21, 22, 23, 502, 702: client device

3: base station

30: router

4: network

100, 200: application (App)

101, 201: memory

102: encoding module

103: streaming module

104: network equipment

105: artificial neural network module

106: neural network

107: decoding module

108: comparison and training module

202: network module

203: decoding module

204: AI enhancement module

205: output module

301-308, 400-466, 711-723, 7161-7229: steps

The preferred embodiments of the present invention are described with reference to the following drawings, in which:

FIG. 1 schematically illustrates the system of the present invention for reducing the network bandwidth required for video streaming by using an artificial intelligence processing module;

FIG. 2 is a schematic diagram of an embodiment of the system architecture of the present invention for reducing the network bandwidth required for video streaming by using an artificial intelligence processing module;

FIG. 3 is a schematic diagram of a first embodiment of the method of the present invention for processing a video stream using an artificial intelligence processing module;

FIG. 4 is a schematic diagram of a first embodiment of the training procedure of the artificial neural network module 105 of the present invention;

FIG. 5 is a schematic diagram of a second embodiment of the training procedure of the artificial neural network module 105 of the present invention;

FIG. 6 is a schematic diagram of a third embodiment of the training procedure of the artificial neural network module 105 of the present invention;

FIG. 7 is a schematic diagram of an embodiment of the training procedure of the discriminator shown in FIG. 6;

FIG. 8 discloses an embodiment of the training process of the neural network of the present invention, in which the original image is YUV420 and the output image is RGB or YUV420;

FIG. 9 is a schematic diagram of an embodiment of the procedure of the present invention for processing a decoded image in YUV420 format;

FIG. 10 is a schematic diagram of another embodiment of the procedure of the present invention for processing a decoded image in YUV420 format;

FIG. 11A is a schematic diagram of a second embodiment of the method of the present invention for reducing the network bandwidth required for video streaming by using an artificial intelligence processing module;

FIG. 11B is a schematic diagram of a third embodiment of the method of the present invention for reducing the network bandwidth required for video streaming by using an artificial intelligence processing module;

FIG. 12A is a schematic diagram of a fourth embodiment of the method of the present invention for reducing the network bandwidth required for video streaming by using an artificial intelligence processing module;

FIG. 12B is a schematic diagram of a fifth embodiment of the method of the present invention for reducing the network bandwidth required for video streaming by using an artificial intelligence processing module;

FIG. 13 is a schematic diagram of an embodiment of the training method of the first preset AI formula and the first weighting parameters of the AI processing module of the present invention;

FIG. 14A is a schematic diagram of a sixth embodiment of the method of the present invention for reducing the network bandwidth required for video streaming by using an artificial intelligence processing module;

FIG. 14B is a schematic diagram of a seventh embodiment of the method of the present invention for reducing the network bandwidth required for video streaming by using an artificial intelligence processing module;

FIG. 15 is a schematic diagram of an embodiment of the training method of the first preset AI formula, the second preset AI formula, the first weighting parameters, and the second weighting parameters of the AI processing module of the present invention;

FIG. 16 is a schematic diagram of an eighth embodiment of the method of the present invention for reducing the network bandwidth required for video streaming by using an artificial intelligence processing module;

FIG. 17 is a schematic diagram of an embodiment of the training method of the AI encoding formula, the AI decoding formula, the first preset AI formula, the second preset AI formula, the first weighting parameters, and the second weighting parameters of the artificial neural network module of the present invention.

The present invention relates to a method for reducing the network bandwidth required for video streaming by using an artificial intelligence processing module. The resolution of the images to be transmitted is first reduced on the server side, and the low-resolution images are then transmitted to the client over the network, reducing the network bandwidth required to transmit the video stream. A pre-trained Artificial Intelligence (AI) processing module on the client then restores the received low-resolution images to high-resolution images, providing the twin advantages of high image quality and low network bandwidth consumption.

One application of the present invention is cloud-based online games, in which a player uses a client device to connect over a network to a server and play a game provided by the server. The server responds to commands entered by the player and generates the corresponding video images. For example, the player may issue a movement command on the client device. The movement command is sent to the server over the network; the server then computes an image according to the command and sends the image back to be played on the client device. In many games, the server generates 2D images containing a number of 3D-rendered objects that lie within the visible range.

Please refer to FIG. 1, which schematically illustrates the system of the present invention for reducing the network bandwidth required for video streaming by using an artificial intelligence processing module. A server 1 provides a service through an application running on the server 1; the service may be, but is not limited to, a cloud online game service. A plurality of client devices 21, 22, 23 can connect to (log in to) the server 1 via a network 4 to use the service provided by the application running on the server 1. In this embodiment, the network 4 is the Internet, and the client devices 21, 22, 23 may be any type of network-capable electronic device, such as (but not limited to) a smartphone 21, a digital tablet, a notebook computer 22, a desktop computer 23, a video game console, or even a smart TV. Some client devices 21, 22 connect to the network 4 wirelessly through a wireless communication base station 3 or a wireless router 30, while others connect to the network 4 by wire through a network router or network sharing device. The application running on the server 1 generates a virtual 3D environment containing a plurality of 3D objects; some of the 3D objects can be moved or destroyed according to the user's operations, while others cannot.

In a preferred embodiment, each client device has its own independently running instance of the application; that is, each instance of the application provides the service to only one client device, but the server 1 can execute multiple instances simultaneously to serve multiple client devices. The client devices 21, 22, 23 connect to the server 1 via the network 4 to receive frames that are generated by the application and contain at least some of the 3D objects. The architecture and functions of the system of the present invention are described in detail with reference to FIG. 2 and its related description.

FIG. 2 is a schematic diagram of an embodiment of the system architecture of the present invention. An application (App) 100, typically a 3D game program, is stored in memory 101 and executed on the server 1; it generates 3D-rendered results consisting of a series of original images. Encoding module 102 and streaming module 103 accept the original images produced by the application 100 and encode and stream them into a 2D video stream. The 2D video stream is then transmitted through the server's network equipment 104 over the network 4 to a remote client device 2. Each client device 2 has a pre-installed application 200, stored in the memory 201 of the client device 2, that is associated with and cooperates with the application 100 on the server 1. The application 200 of the client device 2 establishes a connection with the application 100 of the server 1 and receives the encoded 2D video stream from the server 1 through the network module 202.

The encoded 2D video stream is then decoded by the decoding module 203 to produce decoded images. Because of the encoding, streaming, and decoding procedures, the quality of the decoded images is obviously much worse than that of the original images. The AI processing module 204 built into the client device 2 enhances the quality of the decoded images to produce corresponding enhanced images. The AI processing module 204 processes the decoded images using at least one mathematical formula obtained by analyzing the differences between the decoded images and the corresponding original images; the resulting enhanced images are therefore visually closer to the original images than the decoded images are. The enhanced images are then output (played) on the screen (display panel) of the client device 2 through the output module 205. In the present invention, the mathematical formula used by the AI processing module 204 of the client device 2 is defined by a training procedure executed by an artificial neural network module 105 located on the server 1. The artificial neural network module 105 resides in the server 1 and includes an artificial neural network 106, a decoding module 107, and a comparison and training module 108. Embodiments of the training procedure of the artificial neural network module 105 are described in detail later.
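The training procedure sketched above (compare the network's output against the original image and adjust the weighting parameters to shrink the difference) can be illustrated as a generic supervised loop. The model, loss, and update rule here are illustrative stand-ins for modules 106-108, not the patent's specific formulas:

```python
# A minimal supervised training sketch: a single learnable gain tries to
# map "decoded" pixel values back toward their "original" values, the
# role played by modules 106-108 (network, decode, compare-and-train).

def train(pairs, lr=0.01, epochs=200):
    w = 0.0  # the weighting parameter to be learned
    for _ in range(epochs):
        for decoded, original in pairs:
            pred = w * decoded                      # the "mathematical formula"
            grad = 2 * (pred - original) * decoded  # d(MSE)/dw
            w -= lr * grad                          # comparison-and-training step
    return w

# Decoded pixels are dimmed copies of the originals (original = 2 * decoded),
# so the loop should recover a gain of about 2.
pairs = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w = train(pairs)
print(round(w, 3))  # 2.0
```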

FIG. 3 is a schematic diagram of a first embodiment of the method of the present invention for processing an image stream using an artificial intelligence processing module. Using the system and architecture of the present invention shown in FIG. 2 and FIG. 3, the method generally includes the following steps:

Step 301: execute a first application program in a server. The first application program generates a plurality of original images according to at least one instruction (step 302). Next, the original images are resolution-reduced by a resolution reduction module in the server (step 3021) and then encoded and compressed by an encoder in the server (step 303) to generate a plurality of encoded images. The encoded images are then transmitted to the client device over the network in the form of a 2D image stream (step 304). Because the images have already been reduced in resolution before being transmitted to the client device, the network bandwidth required to transmit the image stream is correspondingly reduced.

A second application program is executed in a client device remote from the server (step 305). The second application program is associated with and cooperates with the first application program; thereby the client device can be operated by a user to generate and send instructions to the server so as to enjoy the service provided by the first application program of the server. The client device transmits the instructions to the server over the network and then receives, over the network, the encoded images generated by the server in response to the instructions. The client device then decodes the encoded images (step 306) into a plurality of decoded images and uses an AI processing module (step 307) to enhance the quality of the decoded images so as to generate a plurality of enhanced images. The AI processing module processes the decoded images with at least one mathematical formula obtained in advance by analyzing the differences between the decoded images and the corresponding original images; the resulting enhanced images are therefore visually closer to the original images than the decoded images are. Afterwards, the client device outputs (step 308) the enhanced images to the screen (display panel) as the output images to be played.

In the present invention, the at least one mathematical formula used by the AI processing module in the client device includes a plurality of Weighted Parameters. The weighting parameters are related to the differences between the decoded images and the corresponding original images and are defined by a training procedure executed by an artificial neural network module in the server. In one embodiment of the present invention, the weighting parameters are pre-stored in the client device. In another embodiment, the weighting parameters are downloaded from the server into the client device only when the client device executes the second application program.

In one embodiment of the present invention, the picture content of the original images generated by the server varies drastically with the game scene. For example, a city game scene may make the original images of the game contain many objects with simple, clear outlines and colors that differ but belong to roughly the same color family. A dark-cave game scene fills the original images with monotonous colors of low tone and low chroma, but with irregular yet inconspicuous landscape outlines. Yet another scene of a lush garden makes the original images contain many vibrant, brightly colored objects with detailed and complex outlines. The method of the present invention applies several different sets of weighting parameters to adapt to these different game scenes, so that the quality of the output images enhanced by the same AI processing module can be maintained at a high and stable level even when the picture content of the original images changes drastically.

Preferably, the original images generated by the first application program can be divided into a plurality of scene-modes, each scene containing a plurality of the original images. The weighting parameters are likewise divided into a plurality of sets; each set contains a plurality of weighting parameters and corresponds to one of the scenes. The decoded images corresponding to the original images of different scenes are enhanced by the same AI processing module using, among the different sets of weighting parameters, the set corresponding to that scene. In one embodiment of the present invention, all the different sets of weighting parameters are pre-stored in the client device; whenever the scene changes, the set of weighting parameters corresponding to the new scene is applied in the AI processing module to generate the enhanced images. In another embodiment, all the different sets of weighting parameters are stored on the server side; whenever the scene changes, the set of weighting parameters corresponding to the new scene is transmitted from the server to the client device and then applied in the AI processing module to generate the enhanced images.
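The scene-dependent switching described above amounts to a lookup from a scene identifier to a pre-trained weight set. The sketch below illustrates this; the names `SCENE_WEIGHTS` and `select_weights` and all numeric values are hypothetical, not part of the patent.

```python
# Hypothetical sketch: one set of pre-trained weighting parameters per
# scene-mode; the AI processing module is reconfigured when the scene changes.
SCENE_WEIGHTS = {
    "city":   {"w": [0.8, 0.1], "b": [0.05]},   # placeholder values
    "cave":   {"w": [1.2, 0.3], "b": [0.10]},
    "garden": {"w": [0.9, 0.2], "b": [0.00]},
}

def select_weights(scene, cache=SCENE_WEIGHTS):
    """Return the weight set for the current scene (pre-stored on the client,
    or downloaded from the server when the scene changes)."""
    if scene not in cache:
        raise KeyError(f"no trained weights for scene {scene!r}")
    return cache[scene]
```

On a scene change, the returned set would be loaded into the AI processing module before the next decoded image is enhanced.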

FIG. 4 is a schematic diagram of a first embodiment of the training procedure of the artificial neural network module 105 of the present invention. In the present invention, the mathematical formula used by the AI processing module 204 of the client device 2 is trained and defined by a training procedure executed by the artificial neural network module 105 in the server 1. The training procedure includes the following steps:

Step 400: execute the first application program in a training mode to generate a plurality of training original images (step 401), and perform resolution reduction on these training original images (step 4011);

Step 402: encode the resolution-reduced training original images into a plurality of training encoded images by means of the encoder;

Step 403: decode the training encoded images into a plurality of training decoded images by means of a training decoder in the server;

Step 404: the artificial neural network module accepts the training decoded images and processes them one by one using at least one training mathematical formula to generate a plurality of training output images (step 405); the at least one training mathematical formula contains a plurality of training weighting parameters; and

Step 406: the comparison-and-training module compares, one by one, the differences between each training output image and the corresponding training original image, and adjusts the training weighting parameters of the at least one training mathematical formula accordingly. The training weighting parameters are adjusted so as to minimize the difference between the training output image and the corresponding training original image. Each time the training weighting parameters are adjusted, the adjusted training weighting parameters are fed back into the at least one training mathematical formula for processing the next training decoded image in step 404. After a predetermined number of comparisons between training output images and the corresponding training original images, and a predetermined number of adjustments of the training weighting parameters, the training weighting parameters obtained when training is finally completed (step 407) are taken out and applied in the AI processing module of the client device as the weighting parameters of its mathematical formula.
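As a minimal illustration of the feedback loop in step 406, the toy below fits a single weighting parameter by gradient descent so that the "training output image" (here simply w*x) approaches the "training original image" (the target). Real training adjusts a full network with an optimizer such as Adam, SGD, or RMSProp; this sketch implements only plain gradient descent on a squared difference.

```python
import numpy as np

def train_weight(xs, targets, lr=0.1, epochs=100):
    """Toy version of step 406: after each comparison, the adjusted weight
    is fed back for the next training image."""
    w = 0.0
    for _ in range(epochs):
        for x, t in zip(xs, targets):
            out = w * x                # "training output image"
            grad = 2 * (out - t) * x   # derivative of the squared difference
            w -= lr * grad             # adjust toward a smaller difference
    return w

xs = np.array([1.0, 2.0, 3.0])
targets = 1.5 * xs                     # "training original images"
w = train_weight(xs, targets)          # converges toward 1.5
```

After enough iterations the difference is minimized and the learned weight is taken out for use, mirroring step 407.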

In the first embodiment of the present invention, the training decoded images are input to the artificial neural network module to generate the corresponding training output images. Each training output image and the corresponding training original image are then compared to calculate a difference value. A mathematical optimization algorithm, such as the Adam algorithm, Stochastic Gradient Descent (SGD), or Root Mean Square Propagation (RMSProp), is then used to learn the weighting parameters of the artificial neural network (usually called the weight w and the bias b), making the difference value as small as possible so that the training output image comes closer to its corresponding training original image. Different methods can be used to calculate the difference value (or an approximation of it) to suit different needs, for example: mean square error (MSE), L1 regularization (using the absolute value error), peak signal-to-noise ratio (PSNR), structural similarity (SSIM), generative adversarial network loss (GAN loss), and/or other methods. In the first embodiment, the following methods are used to calculate the difference value: (1) a weighted average of MSE, L1, and GAN loss; (2) MSE; (3) GAN loss while simultaneously training the Discriminator; (4) a weighted average of MSE and the Edge of MSE. More details of the training procedure are described later.
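Some of the difference measures named above can be written directly in NumPy; GAN loss and SSIM require a trained network or more machinery and are omitted from this sketch.

```python
import numpy as np

def mse(a, b):
    """Mean square error between two images."""
    return float(np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2))

def l1(a, b):
    """Mean absolute-value error (the L1 measure named above)."""
    return float(np.mean(np.abs(a.astype(np.float64) - b.astype(np.float64))))

def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB for 8-bit images."""
    m = mse(a, b)
    return float("inf") if m == 0 else 10.0 * np.log10(peak ** 2 / m)

orig = np.array([[100, 100], [100, 100]], dtype=np.uint8)
out  = np.array([[100, 104], [ 96, 100]], dtype=np.uint8)
# mse = (0 + 16 + 16 + 0) / 4 = 8.0 ; l1 = (0 + 4 + 4 + 0) / 4 = 2.0
```

A weighted average of such terms, as in method (1), would simply be a linear combination of these scalar values.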

FIG. 5 is a schematic diagram of a second embodiment of the training procedure of the artificial neural network module 105 of the present invention. In the present invention, the training procedure of the second embodiment includes the following steps:

Step 410: execute the first application program in a training mode to generate a plurality of training original images (step 411), wherein the color format of the training original images is RGB; and perform resolution reduction on these training original images (step 4111);

Step 412: encode the resolution-reduced training original images into a plurality of training encoded images by means of the encoder;

Step 413: decode the training encoded images into a plurality of training decoded images by means of the training decoder in the server;

Step 414: in the second embodiment, when the training decoded image and the training output image have the same color format (both RGB in this second embodiment), a residual network module, which may also be called a Convolutional Neural Network (CNN), can be used in the artificial neural network module. The output of the residual network module for a given training decoded image is summed up with that training decoded image (step 415); the sum of the two is then output as the training output image (step 416); and

Step 417: the comparison-and-training module compares, one by one, the differences between each training output image and the corresponding training original image (calculating the difference value), and adjusts the training weighting parameters of the at least one training mathematical formula accordingly. The training weighting parameters are adjusted so as to minimize the difference between the training output image and the corresponding training original image. Each time the training weighting parameters are adjusted, the adjusted training weighting parameters are fed back into the artificial neural network for processing the next training decoded image in step 414. After a predetermined number of comparisons between training output images and the corresponding training original images, and a predetermined number of adjustments of the training weighting parameters, the training weighting parameters obtained when training is finally completed (step 418) are taken out and applied in the AI processing module of the client device as the weighting parameters of its mathematical formula.

FIG. 6 is a schematic diagram of a third embodiment of the training procedure of the artificial neural network module 105 of the present invention. In the third embodiment, the comparison-and-training module uses a Discriminator to compare the differences between the training output images and the corresponding training original images and adjusts the training weighting parameters accordingly. The training procedure of the third embodiment includes the following steps:

Step 420: execute the first application program in a training mode to generate a plurality of training original images (step 421), wherein the training original images include n channels, n being a positive integer greater than 2; and perform resolution reduction on these training original images (step 4211);

Step 422: encode the resolution-reduced training original images into a plurality of training encoded images by means of the encoder;

Step 423: decode the training encoded images into a plurality of training decoded images by means of the training decoder in the server, wherein the training decoded images include m channels, m being a positive integer greater than 2; and

Step 424: the artificial neural network module accepts the training decoded images (m channels) and processes them one by one using at least one training mathematical formula to generate a plurality of training output images (n channels) (step 425); the at least one training mathematical formula contains a plurality of training weighting parameters. Each training output image (n channels) is combined with its corresponding training decoded image (m channels) (step 426) to generate a plurality of training combined images (with m+n channels); these training combined images are then fed to a discriminator (step 427) for analyzing the quality of the training output images, thereby training the artificial neural network.

FIG. 7 is a schematic diagram of an embodiment of the training procedure of the discriminator shown in FIG. 6. The training procedure of the discriminator includes the following steps:

Step 430: execute the first application program in a training mode to generate a plurality of training original images (step 431), wherein the training original images include n channels, n being a positive integer greater than 2; and perform resolution reduction on these training original images (step 4311);

Step 432: encode the resolution-reduced training original images into a plurality of training encoded images by means of the encoder;

Step 433: decode the training encoded images into a plurality of training decoded images by means of the training decoder in the server, wherein the training decoded images include m channels, m being a positive integer greater than 2; and

Step 434: the artificial neural network module accepts the training decoded images and processes them (m channels) one by one using at least one training mathematical formula to generate a plurality of training output images (step 435); the at least one training mathematical formula contains a plurality of training weighting parameters; the training output images include n channels;

Step 436: each n-channel training output image is combined with its corresponding m-channel training decoded image to generate a plurality of false samples with m+n channels; and each n-channel training original image is combined with its corresponding m-channel training decoded image to generate a plurality of true samples with m+n channels (step 437); and

Step 438: the m+n-channel false samples and the m+n-channel true samples are fed to the discriminator of the comparison-and-training module so as to train the discriminator's ability to detect and distinguish the false samples from the true samples.
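The sample construction of steps 436-437 is a concatenation along the channel axis. A shape-level NumPy sketch (channels-first layout; the channel counts and resolution are illustrative):

```python
import numpy as np

m, n, H, W = 3, 3, 8, 8        # illustrative channel counts and resolution
decoded   = np.zeros((m, H, W))   # m-channel training decoded image
generated = np.zeros((n, H, W))   # n-channel training output image
original  = np.ones((n, H, W))    # n-channel training original image

false_sample = np.concatenate([decoded, generated], axis=0)  # step 436
true_sample  = np.concatenate([decoded, original],  axis=0)  # step 437

# Both samples have m+n channels; the discriminator of step 438 is trained
# to tell false samples from true samples.
```

Pairing each sample with the decoded image it came from is what lets the discriminator judge the output conditioned on its input, as in conditional GAN training.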

After the artificial neural network 105 (shown in FIG. 2) has been properly trained on the server 1 side, the resulting weighting parameters (weight w, bias b) are applied to the AI processing module 204 in the client device. The AI processing module 204 and its associated weighting parameters (weight w, bias b) are stored in the client device 2. Afterwards, whenever the client device receives and decodes the encoded images contained in the 2D image stream from the server, each encoded image is processed by the AI processing module to generate an enhanced image. The client device then plays the enhanced images on its screen as the output images. The neural network can learn and enhance the colors, brightness, and details of the images. Because some details of the original images are damaged or lost during encoding and streaming, a properly trained neural network can repair these damaged or lost details. In an embodiment of the present invention, the neural network of the AI processing module requires the following information to operate:

Related functions and parameters:

X: the input image;

Conv2d(X,a,b,c,d,w,b): a 2D convolution with bias applied to X; amount of output channels = a; kernel_size = b; stride = c; padding size = d; the trained weighting parameters are the kernel w and the bias b;

Conv2dTranspose(X,a,b,c,w,b): a 2D transpose convolution with bias applied to X; amount of output channels = a; kernel_size = b; stride = c; cropping size = d; the trained weighting parameters are the kernel w and the bias b;

σ(X): a non-linear activation function applied to X;

uint8(x): clamps the floating-point value x to between 0 and 255 (inclusive), using truncation, and converts it to unsigned int8;

R(X,w): residual blocks operating on X, comprising many conv2d and batchnorm layers, each of which contains its own weighting parameters to be trained (for more background, see: https://stats.stackexchange.com/questions/246928/what-exactly-is-a-residual-learning-block-in-the-context-of-deep-residual-networ).
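The uint8(x) helper defined above can be realized in NumPy as follows (a sketch; "truncation" is taken to mean rounding toward zero):

```python
import numpy as np

def uint8(x):
    """Clamp float values to [0, 255] and truncate to unsigned 8-bit."""
    return np.clip(np.trunc(x), 0, 255).astype(np.uint8)

# e.g. uint8([-3.0, 0.7, 254.9, 300.0]) -> [0, 0, 254, 255]
```

This is the final step that maps the network's floating-point output back into a displayable 8-bit image.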

Because the input and output images may have different color formats, such as RGB, YUV420, YUV444, and so on, the cases of input and output images with different color formats are discussed in more detail below.

Case 1: the original image is RGB and the output image is also RGB.

This case is the simplest, because both the input and output images are RGB images. To increase processing speed, a relatively large kernel size (e.g., 8x8 with stride = 4 in the convolution and transpose-convolution structures) is used to accelerate computation, so as to cope with the high resolution of Full HD images. A residual network is used in this case to make convergence easier and more stable.

Related functions and parameters:

X: the input image, in RGB format, with each color channel in unsigned int8 format;

X2 = (X - 128)/128;

Y = uint8((Conv2dTranspose(σ(Conv2d(X2,a,b,c,d,w_1,b_1)),w_2,b_2) + X2)*128+128);

w_1 is a matrix of size b*b*3*a; b_1 is a vector of size a;

w_2 is a matrix of size b*b*3*a; b_2 is a vector of size 3;

The parameters used include:

The resolution of X is 1280x720;

a=128, b=10, c=5, d=0, σ = leaky ReLU with alpha = 0.2;

a=128, b=9, c=5, d=4, σ = leaky ReLU with alpha = 0.2;

a=128, b=8, c=4, d=0, σ = leaky ReLU with alpha = 0.2;

If the client device has a faster processing speed, the following formula can be used:

Y = uint8((Conv2dTranspose(R(σ(Conv2d(X2,a,b,c,d,w_1,b_1)),w_R),w_2,b_2) + X2)*128+128);

w_1 is a matrix of size b*b*3*a; b_1 is a vector of size a;

w_2 is a matrix of size b*b*3*a; b_2 is a vector of size 3;

where R denotes residual blocks with n layers;

R contains many neural network layers, each with its own trained weighting parameters, collectively referred to as w_R;

The parameters used include:

a=128, b=8, c=4, d=0, σ = leaky ReLU with alpha = 0.2; n=2;

a=128, b=8, c=4, d=0, σ = leaky ReLU with alpha = 0.2; n=6.
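For the residual addition of X2 to be valid, the kernel/stride choices above must let the transpose convolution restore the input resolution. The standard output-size formulas verify this for the 1280x720, b=8, c=4, d=0 configuration (this sketch covers only the shape arithmetic, not the convolution itself):

```python
def conv_out(size, kernel, stride, pad):
    # standard convolution output-size formula
    return (size + 2 * pad - kernel) // stride + 1

def tconv_out(size, kernel, stride, crop=0):
    # standard transpose-convolution output-size formula
    return (size - 1) * stride + kernel - 2 * crop

W, H, b, c, d = 1280, 720, 8, 4, 0
w1, h1 = conv_out(W, b, c, d), conv_out(H, b, c, d)   # after Conv2d
w2, h2 = tconv_out(w1, b, c), tconv_out(h1, b, c)     # after Conv2dTranspose
# (w2, h2) comes back to (1280, 720), so the +X2 residual addition is valid.
```

The intermediate map is 319x179, and the transpose convolution maps it back exactly to 1280x720.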

Case 2: the original image is YUV420 and the output image is RGB or YUV444.

If the input original image is YUV420 and the output image is RGB or YUV444, the residual network cannot be applied directly, because the input and output images differ in resolution and format. The method of the present invention first decodes the YUV420 input image and then uses another neural network (called the A network, where N=3) to process the decoded image and obtain an image in RGB or YUV444 format (called X2). This X2 image is then fed into the neural network (residual network) described in Case 1 for training. The same training method is also applied to the A network, comparing the differences between X2 and the original image so as to train the A network.

X_y is the Y component of the YUV420 input image, in unsigned int8 format;

X_uv is the UV component of the YUV420 input image, in unsigned int8 format;

X2_y = (X_y - 128)/128;

X2_uv = (X_uv - 128)/128;

X2 = Conv2d(X2_y,3,e,1,w_y,b_y) + Conv2dTranspose(X2_uv,3,f,2,w_uv,b_uv);

w_y is a matrix of size e*e*1*3; b_y is a vector of size 3;

w_uv is a matrix of size f*f*3*2; b_uv is a vector of size 3;

The above is the first embodiment of the A network (neural network A);

Finally, the formula used to produce the output image is the same as the one used in Case 1, where both the input and output images are in RGB format:

Y = uint8((Conv2dTranspose(σ(Conv2d(X2,a,b,c,d,w_1,b_1)),w_2,b_2) + X2)*128+128);

w_1 is a matrix of size b*b*3*a; b_1 is a vector of size a;

w_2 is a matrix of size b*b*3*a; b_2 is a vector of size 3;

The parameters used are likewise the same as those used when both the input and output images are in RGB format:

The resolution of X is 1280x720;

a=128, b=8, c=4, d=0, e=1, f=2, σ = leaky ReLU with alpha = 0.2;

a=128, b=8, c=4, d=0, e=1, f=2, σ = leaky ReLU with alpha = 0.2.
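In the A-network formula above, e=1 makes the Y branch a per-pixel linear map, and a 2x2, stride-2 transpose convolution on the half-resolution UV planes reduces, for a constant kernel, to exact 2x2 pixel replication. The NumPy toy below illustrates this structure with a single output channel and illustrative weights; it is a sketch of the data flow, not the trained A network.

```python
import numpy as np

def a_network_toy(x_y, x_uv, w_y=1.0, w_uv=0.5, bias=0.0):
    """Toy A network: full-resolution Y plane plus upsampled half-resolution
    UV plane. A 2x2/stride-2 transpose conv with a constant kernel equals
    np.kron replication upsampling."""
    uv_up = np.kron(x_uv, np.ones((2, 2))) * w_uv   # Conv2dTranspose branch
    return x_y * w_y + uv_up + bias                 # 1x1 Conv2d branch + sum

y  = np.full((4, 4), 2.0)     # full-resolution Y plane
uv = np.full((2, 2), 4.0)     # half-resolution chroma plane
x2 = a_network_toy(y, uv)
# every output pixel: 2.0*1.0 + 4.0*0.5 = 4.0, at full Y resolution
```

The key point is that both branches produce full-resolution outputs that can be summed, which is what makes the result X2 compatible with the Case 1 residual network.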

Referring to FIG. 8, which discloses an embodiment of the training process of the neural network of the present invention in which the original image is YUV420 and the output image is RGB or YUV444. The training process of the neural network includes the following steps:

Step 440: execute the first application program in a training mode to generate a plurality of training original images in RGB or YUV444 format, and perform resolution reduction on these training original images (step 4401);

Step 441: encode the resolution-reduced training original images into a plurality of training encoded images by means of the encoder;

Step 442: decode the training encoded images into a plurality of training decoded images by means of the training decoder in the server, wherein the training decoded images are in YUV420 format;

Step 443: the artificial neural network module includes a first neural network and a second neural network. The first neural network (also called the A network) accepts the training decoded images (YUV420) and processes them one by one using at least one training mathematical formula to generate a plurality of first output images (also called X2; step 444), which have the same encoding format as the training original images; the at least one training mathematical formula contains a plurality of training weighting parameters;

步驟445:該第二神經網路是一卷積神經網路(Convolutional Neural Network;簡稱CNN);該第二神經網路(也稱為CNN網路)接受該第一輸出影像X2並使用至少一訓練數學運算式來逐一處理該第一輸出影像X2以供產生複數個第二輸出影像;該至少一訓練數學運算式包含複數個訓練加權參數;接著,該第一輸出影像X2和該第二輸出影像兩者被相加(步驟446)以產生該訓練輸出影像(步驟447); Step 445: The second neural network is a convolutional neural network (CNN for short); the second neural network (also called CNN network) receives the first output image X2 and uses at least one training a mathematical operation formula to process the first output image X2 one by one to generate a plurality of second output images; the at least one training mathematical operation formula includes a plurality of training weighting parameters; then, the first output image X2 and the second output The two images are added (step 446) to generate the training output image (step 447);

該比較與訓練模組包含一第一比較器及一第二比較器;於步驟448中,該第一比較器比較該第一輸出影像X2與其相對應之訓練原圖影像之間的差異以供訓練該第一神經網路;於步驟449中,該第二比較器比較該訓 練輸出影像與其相對應之訓練原圖影像之間的差異以供訓練該第二神經網路。 The comparison and training module includes a first comparator and a second comparator; in step 448, the first comparator compares the difference between the first output image X2 and its corresponding original image for training training the first neural network; in step 449, the second comparator compares the training The difference between the training output image and its corresponding training original image is used for training the second neural network.

FIG. 9 is a schematic diagram of an embodiment of the procedure of the present invention for processing decoded images in YUV420 format. The procedure includes:

Step 451: The first neural network accepts and processes the training decoded images in YUV420 color format by the following steps:

Step 452: Extract the Y-part data of the training decoded image, and process the Y-part data with the neural network at standard (original) size to generate Y-part output data with N channels (for example, stride = 1 in the convolution; step 454);

Step 453: Extract the UV-part data of the training decoded image, and process the UV-part data with the neural network at 2x upscaling to generate UV-part output data with N channels (for example, stride = 2 in the transposed convolution; step 455);

Step 456: Add the Y-part output data and the UV-part output data together to generate the training output image (step 457).
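The shape bookkeeping behind steps 452~457 can be sketched with the standard convolution size formulas. The 3x3/padding-1 kernel for the Y branch below is an illustrative assumption (the text fixes only the strides):

```python
def conv_out(size, kernel, stride, pad=0):
    # Spatial size after a strided convolution (integer arithmetic).
    return (size + 2 * pad - kernel) // stride + 1

def conv_transpose_out(size, kernel, stride, pad=0):
    # Spatial size after a transposed (fractionally strided) convolution.
    return (size - 1) * stride - 2 * pad + kernel

# A 1280x720 YUV420 frame: Y is full size, U/V are half size in each axis.
y_w, y_h = 1280, 720
uv_w, uv_h = y_w // 2, y_h // 2

# Step 454: a stride-1 convolution keeps the Y-part at full resolution
# (illustrated with an assumed 3x3 kernel and padding 1).
out_y = (conv_out(y_w, 3, 1, 1), conv_out(y_h, 3, 1, 1))

# Step 455: a stride-2 transposed convolution doubles the UV-part
# (2x2 kernel, stride 2, no padding).
out_uv = (conv_transpose_out(uv_w, 2, 2), conv_transpose_out(uv_h, 2, 2))

# Both branches now share the same spatial size, so their N-channel
# outputs can be added element-wise (step 456).
assert out_y == out_uv == (1280, 720)
```

This is why the two branches can be summed despite the UV plane starting at half resolution.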

The third case: the original image is YUV420 and the output image is YUV444, processed in another, faster way.

If the input image is YUV420 and the output image is YUV444, then besides the aforementioned method there is another way to implement the first neural network (the A network), which is a special case with higher speed. The decoded image in YUV420 format is first converted by the first neural network (the A network) into an image in YUV444 format (also called X2); X2 is then fed into the aforementioned neural network (the residual network) for training. The same training method is also applied to the A network, comparing the difference between X2 and the original image so as to train the A network.

X_y is the Y part of the input image in YUV420 format; its format is unsigned int8;

X_uv is the UV part of the input image in YUV420 format; its format is unsigned int8;

X2_y = (float(X_y) - 128) / 128;

X2_uv = (float(X_uv) - 128) / 128;

X3_uv = Conv2dTranspose(X2_uv, 2, 2, 2, w_uv, b_uv);

w_uv is a matrix of size 2*2*2*2; b_uv is a vector of size 2;

X2 = concat(X2_y, X3_uv);

The above is another embodiment of the A network (neural network A), in which the "concat" function concatenates the inputs along the channel direction;
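As a minimal sketch of the channel-direction "concat" used to form X2, assuming feature maps stored as height x width x channel nested lists:

```python
def concat_channels(a, b):
    # Concatenate two feature maps along the channel axis.
    # Each map is height x width x channels, stored as nested lists.
    assert len(a) == len(b) and len(a[0]) == len(b[0]), "spatial sizes must match"
    return [[a[i][j] + b[i][j] for j in range(len(a[0]))] for i in range(len(a))]

# 2x2 feature maps: X2_y with 1 channel, X3_uv (upscaled UV) with 2 channels.
x2_y  = [[[1], [2]], [[3], [4]]]
x3_uv = [[[5, 6], [7, 8]], [[9, 10], [11, 12]]]

x2 = concat_channels(x2_y, x3_uv)
# Every pixel now carries 1 + 2 = 3 channels (one Y-like and two UV-like),
# i.e. a YUV444-style layout.
assert x2[0][0] == [1, 5, 6]
assert all(len(px) == 3 for row in x2 for px in row)
```

The spatial sizes must already match, which is why X2_uv is first upscaled to X3_uv by the transposed convolution.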

Finally, the mathematical expression used to produce the output image is the same as the one used in the aforementioned first case, where both the input and output images are in RGB format:

Y = uint8((Conv2dTranspose(σ(Conv2d(X2, a, b, c, d, w_1, b_1)), w_2, b_2) + X2) * 128 + 128);

w_1 is a matrix of size b*b*3*a; b_1 is a vector of size a;

w_2 is a matrix of size b*b*3*a; b_2 is a vector of size 3;

The parameters used are likewise the same as those used above when both the input and output images are in RGB format:

The resolution of X is 1280x720;

a=128, b=10, c=5, d=0, σ = leaky ReLU with alpha = 0.2;

a=128, b=9, c=5, d=4, σ = leaky ReLU with alpha = 0.2;

a=128, b=8, c=4, d=0, σ = leaky ReLU with alpha = 0.2.
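A quick way to see how these large kernel/stride settings shrink the network's internal resolution, using the standard convolution output-size formula with b = kernel size, c = stride, d = padding:

```python
def conv_out(size, kernel, stride, pad=0):
    # Output spatial size of Conv2d(X, a, b, c, d, ...) where b = kernel,
    # c = stride, d = padding (floor division, as in standard convolutions).
    return (size + 2 * pad - kernel) // stride + 1

w, h = 1280, 720  # resolution of X

# The three parameter sets listed above, as (b, c, d):
for b, c, d in [(10, 5, 0), (9, 5, 4), (8, 4, 0)]:
    print((conv_out(w, b, c, d), conv_out(h, b, c, d)))
# -> (255, 143), (256, 144), (319, 179)
```

A stride of 4 or 5 cuts each spatial axis by roughly that factor in a single layer, which is the speed advantage claimed later for large kernels and large strides.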

FIG. 10 is a schematic diagram of another embodiment of the procedure of the present invention for processing decoded images in YUV420 format. As shown in FIG. 10, the procedure includes:

Step 461: The first neural network accepts and processes the training decoded images in YUV420 color format by the following steps, wherein the training decoded image includes N channels, and N is a positive integer greater than 2;

Step 462: Extract the Y-part data of the training decoded image to generate Y-part output data;

Step 463: Extract the UV-part data of the training decoded image, and process the UV-part data with the first neural network at 2x upscaling to generate UV-part output data with N-1 channels (for example, stride = 2 in the transposed convolution; step 464);

Step 465: Process the Y-part data and the UV-part data with a concatenation function (concat) to generate the training output image (step 466).

The fourth case: the original image is YUV420 and the output image is also YUV420.

If the input image is YUV420 and the output image is also YUV420, the processing is similar to the aforementioned RGB-to-RGB case. However, since the Y and UV channels have different resolutions, different convolution settings are applied to the different channels. For example, when the neural network processes the Y-part of the image with a kernel size of 8x8 and a stride of 4, it can be switched to a kernel size of 4x4 and a stride of 2 to process the UV-part of the image.
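The kernel/stride halving can be checked arithmetically: with the standard convolution output-size formula, the half-resolution UV plane processed with half the kernel and half the stride lands on the same spatial grid as the Y plane:

```python
def conv_out(size, kernel, stride, pad=0):
    # Spatial output size of a strided convolution.
    return (size + 2 * pad - kernel) // stride + 1

w, h = 1280, 720          # Y-part resolution of a YUV420 frame
uw, uh = w // 2, h // 2   # UV-part is half-resolution in each axis

# Y-part: kernel 8x8, stride 4; UV-part: kernel 4x4, stride 2.
y_out  = (conv_out(w, 8, 4),  conv_out(h, 8, 4))
uv_out = (conv_out(uw, 4, 2), conv_out(uh, 4, 2))

# Halving both kernel and stride for the half-resolution UV plane yields
# the same spatial size, so the two Conv2d results can be summed into X3.
assert y_out == uv_out == (319, 179)
```

This is the alignment that the b/2 and c/2 arguments in the X3 expression below rely on.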

X_y is the Y part of the input image in YUV420 format; its format is unsigned int8;

X_uv is the UV part of the input image in YUV420 format; its format is unsigned int8;

X2_y = (float(X_y) - 128) / 128;

X2_uv = (float(X_uv) - 128) / 128;

X3 = σ(Conv2d(X2_y, a, b, c, w_y, b_y) + Conv2d(X2_uv, a, b/2, c/2, w_uv, b_uv));

w_y is a matrix of size b*b*1*a; b_y is a vector of size a;

w_uv is a matrix of size (b/2)*(b/2)*2*a; b_uv is a vector of size a;

X4_y = Conv2dTranspose(X3, 1, b, c, w_1, b_1) + X2_y;

X4_uv = Conv2dTranspose(X3, 2, b/2, c/2, w_2, b_2) + X2_uv;

w_1 is a matrix of size b*b*1*a; b_1 is a vector of size 1;

w_2 is a matrix of size (b/2)*(b/2)*2*a; b_2 is a vector of size 2;

The above is another embodiment of the A network (neural network A), in which the "concat" function concatenates the inputs along the channel direction;

The final output:

Y_y = uint8(X4_y * 128 + 128);

Y_uv = uint8(X4_uv * 128 + 128);

The parameters used:

a=128, b=8, c=4, d=0, e=2, f=2, σ = leaky ReLU with alpha = 0.2.
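A small round-trip check of the value scaling, assuming the input normalization (X - 128)/128 that the output step uint8(X*128+128) inverts (the normalization itself is inferred, not stated explicitly):

```python
def normalize(x):
    # Assumed input mapping: uint8 sample -> roughly [-1, 1).
    return (x - 128) / 128.0

def denormalize(x):
    # Y = uint8(x * 128 + 128), clamped to the uint8 range.
    return max(0, min(255, int(round(x * 128 + 128))))

# Round trip over every possible 8-bit sample value.
assert all(denormalize(normalize(v)) == v for v in range(256))
```

The clamp matters in practice because the network's residual output can push values slightly outside [-1, 1).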

The parameters used in the present invention are described in detail as follows:

Training parameters:

The initial values of the weighting parameters are drawn from a Gaussian distribution with mean = 0 and stddev = 0.02;

The Adam algorithm is used during training, with learning rate = 1e-4 and beta1 = 0.9;

Mini-batch size = 1;
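A minimal single-parameter sketch of the Adam update with the stated learning rate 1e-4 and beta1 = 0.9; beta2 and epsilon are common defaults assumed here, since they are not stated:

```python
import math

def adam_step(w, grad, m, v, t, lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    # One Adam update for a single weighting parameter.
    m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad * grad   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w, m, v

w, m, v = 0.5, 0.0, 0.0
for t in range(1, 101):
    grad = 2 * (w - 0.3)        # gradient of the toy loss (w - 0.3)^2
    w, m, v = adam_step(w, grad, m, v, t)

# With mini-batch size 1, each training sample triggers one such update;
# w moves from 0.5 toward the minimizer 0.3 at roughly lr per step.
assert 0.3 < w < 0.5
```

The tiny learning rate means many small steps, which matches a long training run over many frames rather than a few large jumps.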

The primary error function is:

100*(L2 + L2e) + λ*L1 + γ*D + α*Lg;

The standard values of the parameters used are:

λ=0, γ=0, α=0;

λ=0, γ=0, α=100;

λ=0, γ=1, α=0;

λ=10, γ=0, α=0;

λ=10, γ=0, α=100;

λ=10, γ=1, α=0;

where:

L2 = mean((T - Y)^2), where mean denotes the average value and T is the training target;

L1 = mean(|T - Y|), where mean denotes the average value and T is the training target;

D is the generative adversarial network loss (GAN loss); a general GAN training method is used to train a discriminator to distinguish (X, Y) from (X, T);

The mathematical expression of Lg is:

For a WxH image,

Y_dx(i,j) = Y(i+1,j) - Y(i,j), 0 <= i < W-1, 0 <= j < H

T_dx(i,j) = T(i+1,j) - T(i,j), 0 <= i < W-1, 0 <= j < H

Y_dy(i,j) = Y(i,j+1) - Y(i,j), 0 <= i < W, 0 <= j < H-1

T_dy(i,j) = T(i,j+1) - T(i,j), 0 <= i < W, 0 <= j < H-1

Lg = mean((T_dx - Y_dx)^2) + mean((T_dy - Y_dy)^2)

In the RGB mode, the training target T is the original RGB game image;

In the YUV444 mode, the training target T is the original RGB game image;

In the RGB->RGB and YUV420->YUV420 modes, L2e = 0;

In the YUV420->RGB and YUV420->YUV444 modes,

L2e = mean((T - X2)^2).

As can be seen from the foregoing description, the method of the present invention has the following advantages:

The neural network can be continuously trained on various images with different content, so that different enhancement effects can be performed for different image content; for example, for images with a cartoon style, a realistic style, or different scenes, different weighting parameters w, b can be pre-stored in the client device or dynamically downloaded to the client device;

As for the way of determining which mode an original image belongs to, the neural network on the server side can automatically determine the mode of the original image and transmit this information to the client device. Because the content of the original images is consistent, this determination process can be performed periodically by the server, for example once per second; alternatively, in another embodiment, the process of determining the image mode can also be performed periodically by the client device, for example once every few seconds, depending on the computing power of the client device;

The training is carried out on real video images, so the degree of enhancement can be actually measured. For example, when the method of the present invention is used to enhance a video stream with a resolution of 1280x720 and a bitrate of 3000, the PSNR value of similar scenes can be increased by about 1.5~2.2 dB, which proves that the method of the present invention can genuinely improve the quality of the output images and make them visually closer to the quality of the original images. Moreover, the present invention differs from well-known image enhancement techniques, which can only increase the contrast, smoothing, and color filtering of the output images and cannot, as the present invention does, make the output images visually closer to the original images;

By using a simplified neural network model with large kernels and large stride values, the internal resolution of the neural network is rapidly reduced and the processing speed of the model can be greatly increased; even a client device with limited computing power can reach the target of output images at 60 fps and HD resolution; and

By bringing the color-format conversion (between YUV420 and RGB) into the neural network, and by exploiting the fact that the resolution of the UV channels is lower than that of the Y channel, the stride value of the UV channels can be set to half that of the Y channel, which increases the computation speed of the neural network.

FIG. 11A is a schematic diagram of a second embodiment of the method of the present invention for reducing the network bandwidth required for video streaming by using an artificial intelligence processing module, which includes the following steps:

Step 711: Execute a first application program in a server 701. The first application program generates a plurality of original images with high resolution according to at least one instruction. The resolution of these original images may be 4K or higher (hereinafter also referred to as the second resolution). The at least one instruction is generated by a client device 702 and transmitted to the server 701 through the network.

Step 712: Use a conventional sampling method in the server 701 to reduce the resolution of the original images so as to obtain source images with a low resolution (for example 1080i, 720p, or lower; hereinafter also referred to as the first resolution); the first resolution is lower than the second resolution.

Step 713: Use an encoder in the server 701 to encode and compress the source images to generate a plurality of corresponding encoded images.

Step 714: The server 701, according to the instruction from the client device 702, transmits the encoded images to the client device 702 via the network in the form of a 2D video stream (step 304). Since the resolution of the images is reduced before they are transmitted to the client device, the network bandwidth required to transmit the video stream is reduced accordingly.
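As a rough illustration of the bandwidth saving in step 714, comparing the raw pixel counts of an assumed 4K original against a 720p reduced stream (actual codec savings depend on content and encoder settings):

```python
# Raw pixel counts of the original and the resolution-reduced stream.
full_4k = 3840 * 2160      # second (high) resolution
hd_720p = 1280 * 720       # first (low) resolution

ratio = full_4k / hd_720p
# The 720p stream carries 1/9 of the pixels of the 4K original, so at a
# comparable bitrate-per-pixel the video stream needs roughly 1/9 of the
# network bandwidth.
assert ratio == 9.0
```

The client then spends local computation (the AI processing module) to win that resolution back, trading bandwidth for client-side processing.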

Step 715: The client device 702 receives the encoded images and decodes them into a plurality of corresponding decoded images.

In the present invention, the client device 702 includes an AI processing module, which includes at least one preset mathematical expression. The at least one mathematical expression includes a plurality of weighting parameters. The weighting parameters are predefined by a training mode of an artificial neural network module of a training server. A second application program is executed on the client device 702; it is associated with and cooperates with the first application program, so that a user can operate the client device 702 to generate the instruction. The client device 702 transmits the instruction to the server 701 through the network and receives from the server the encoded images generated according to the instruction.

In this embodiment, the at least one mathematical expression includes a first preset AI expression and a second preset AI expression. The first preset AI expression includes a plurality of first weighting parameters. The second preset AI expression includes a plurality of second weighting parameters. The first preset AI expression together with the first weighting parameters can be used to increase the resolution of images, whereby the resolution of images processed by the first preset AI expression with the first weighting parameters can be increased from the first resolution to the second resolution. The second preset AI expression together with the second weighting parameters can be used to enhance the quality of images, whereby the quality of images processed by the second preset AI expression with the second weighting parameters is higher than the quality of the decoded images and closer to the quality of the original images.

Step 716: After the client device 702 decodes the received encoded images into the corresponding decoded images, the client device first uses the first preset AI expression and the first weighting parameters to process the decoded images so as to generate corresponding resolution-enhanced images with the second resolution. Then, in step 717, the client device 702 uses the second preset AI expression and the second weighting parameters to process the resolution-enhanced images so as to generate high-resolution images with high image quality and the second resolution. Afterwards, in step 718, the client device 702 takes the high-resolution images as output images and outputs them to the screen (display).

The first weighting parameters of the first preset AI expression are predefined by analyzing the differences between the low-resolution source images and the corresponding original images, so that the resolution-enhanced images are visually closer to the original images than to the source images. Likewise, the second weighting parameters of the second preset AI expression are predefined by analyzing the differences between the decoded images and the corresponding original images, so that the high-resolution images are visually closer to the original images than to the decoded images.

FIG. 11B is a schematic diagram of a third embodiment of the method of the present invention for reducing the network bandwidth required for video streaming by using an artificial intelligence processing module. Since most of the steps shown in FIG. 11B are the same as those shown in FIG. 11A, the same or similar steps are given the same numbers and their details are not repeated. In the embodiment shown in FIG. 11B, the first application program running on the server 701 generates a plurality of source images with the first resolution (step 719); in other words, the server 701 directly generates low-resolution source images, so no separate resolution-reduction procedure is needed. Afterwards, these source images are processed according to the same steps 713~718 described for FIG. 11A. Since the server 701 directly generates the low-resolution source images, it consumes fewer computing resources than generating high-resolution original images; therefore, in addition to the network-bandwidth savings of the embodiment of FIG. 11A, the embodiment shown in FIG. 11B also has the advantage of saving the computing resources of the server.

FIG. 12A is a schematic diagram of a fourth embodiment of the method of the present invention for reducing the network bandwidth required for video streaming by using an artificial intelligence processing module. Since most of the steps shown in FIG. 12A are the same as those shown in FIG. 11A and FIG. 11B, the same or similar steps are given the same numbers and their details are not repeated. In the embodiment shown in FIG. 12A, the first application program executed in the server 701 generates a plurality of original images with the second resolution (step 711). These original images are then resolution-reduced into a plurality of corresponding source images with the first resolution (step 712). Afterwards, these source images are encoded (step 713) into encoded images and transmitted to the client device 702 (step 714). The client device 702 decodes the received encoded images into decoded images (step 715). Then, in step 717 of the embodiment shown in FIG. 12A, the client device 702 first uses the second preset AI expression and the second weighting parameters to process the decoded images so as to generate quality-enhanced images with high image quality but still at the first resolution. Next, the client device 702 uses the first preset AI expression and the first weighting parameters to process the quality-enhanced images so as to generate high-resolution images with the second resolution and high image quality. Afterwards, in step 718, the client device 702 takes the high-resolution images as output images and outputs them to the screen.

FIG. 12B is a schematic diagram of a fifth embodiment of the method of the present invention for reducing the network bandwidth required for video streaming by using an artificial intelligence processing module. Since most of the steps shown in FIG. 12B are the same as those shown in FIG. 12A and FIG. 11B, the same or similar steps are given the same numbers and their details are not repeated. In the embodiment shown in FIG. 12B, the first application program running on the server 701 generates a plurality of source images with the first resolution (step 719); in other words, the server 701 directly generates low-resolution source images, so no separate resolution-reduction procedure is needed. Afterwards, these source images are processed according to the same steps 713~718 described for FIG. 12A.

FIG. 13 is a schematic diagram of an embodiment of the way in which the first preset AI expression and the first weighting parameters of the AI processing module of the present invention are trained. In the present invention, the first preset AI expression and the plurality of first weighting parameters in the AI processing module of the client device 702 are predefined by executing a training procedure of an artificial neural network on the training server. After the training is completed, the first preset AI expression and the first weighting parameters are applied in the AI processing module of the client device 702 to perform the AI resolution-enhancement step described in step 716 of FIGS. 11A, 11B, 12A, and 12B. The steps of training the first preset AI expression and the first weighting parameters in the training server include:

Step 7161: Enable a training mode in the training server to generate a plurality of training original images (step 7162); the training original images have the second (high) resolution.

Step 7163: Execute a resolution-reduction procedure to reduce the resolution of the training original images from the second resolution to the first resolution, so as to generate a plurality of training low-resolution images with the first resolution (step 7164).

Step 7165: The artificial neural network module accepts the training low-resolution images and uses a first training expression to process them one by one so as to generate a plurality of corresponding training output images with the second resolution (step 7166); the first training expression has a plurality of first training weighting parameters.

Step 7167: Use a comparison module to compare, one by one, the differences between the training output images and the corresponding training original images, and adjust the first training weighting parameters of the first training expression accordingly. The first training weighting parameters are adjusted so as to minimize the difference between each training output image and its corresponding training original image. Each time the first training weighting parameters are adjusted, the adjusted parameters are fed back to the first training expression for processing the next training low-resolution image. After a predetermined number of comparisons between the training output images and the corresponding training original images, and a predetermined number of adjustments of the first training weighting parameters, the finally obtained first training weighting parameters are applied in the AI processing module of the client device 702 as the weighting parameters of its at least one mathematical expression, in order to perform the AI resolution-enhancement step described in step 716 of FIGS. 11A, 11B, 12A, and 12B.
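Steps 7161~7167 can be sketched as a toy training loop; the single-weight "network", the 1-D images, and the plain gradient step are stand-ins for the real multi-parameter expression and optimizer:

```python
def downscale(img):
    # Step 7163: naive 2x resolution reduction by averaging pixel pairs (1-D here).
    return [(img[i] + img[i + 1]) / 2 for i in range(0, len(img), 2)]

def upscale(img, w):
    # First training expression: duplicate each pixel and scale it by a single
    # trainable weighting parameter w (a tiny stand-in for the network).
    return [w * p for p in img for _ in (0, 1)]

original = [0.2, 0.2, 0.8, 0.8, 0.4, 0.4]   # training original image (2nd resolution)
low_res = downscale(original)               # training low-resolution image (step 7164)
expanded = [p for p in low_res for _ in (0, 1)]

w, lr = 0.0, 0.1
for _ in range(200):
    output = upscale(low_res, w)            # training output image (step 7166)
    # Step 7167: compare output with the original and adjust w to shrink the
    # mean squared difference; the adjusted w feeds the next pass.
    grad = sum(2 * (o - t) * p for o, t, p in zip(output, original, expanded)) / len(original)
    w -= lr * grad

# Here the trained parameter reproduces the original exactly (the pixel pairs
# are constant, so downscaling loses nothing and w converges to 1).
assert abs(w - 1.0) < 1e-3
```

The real procedure repeats this over many frames and many parameters, then ships the converged weights to the client's AI processing module.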

In this embodiment, the second preset AI expression and the second weighting parameters of the AI processing module of the client device 702 are trained in the same way as the aforementioned artificial neural network module 105 shown in FIG. 4, FIG. 5, or FIG. 6. After the training is completed, the obtained second training weighting parameters are applied in the AI processing module of the client device 702 as the weighting parameters of its at least one mathematical expression, in order to perform the AI image-quality-enhancement step described in step 717 of FIGS. 11A, 11B, 12A, and 12B.

圖十四A是本發明利用人工智慧處理模組來降低影像串流所需網路頻寬的方法的第六實施例示意圖。由於圖十四A所示大部分步驟都和圖十一A所示相同，所以相同或類似的步驟將給予相同的編號且不贅述其細節。於圖十四A所示實施例中，在伺服器701中執行的第一應用程式產生具第二解析度的複數個原圖影像(步驟711)。這些原圖影像接著被降低解析度處理，成為相對應具有第一解析度的複數個來源影像(步驟712)。之後，這些來源影像被編碼(步驟713)成編碼後影像並傳送給客戶端裝置702(步驟714)。客戶端裝置702把接收到的編碼後影像進行解碼成為解碼後影像(步驟715)。於本實施例中，該第一預設的AI運算式、該第二預設的AI運算式、複數個該第一加權參數、以及複數個該第二加權參數全部都包含在該客戶端裝置702的同一個該AI處理模組內，以供把複數個該解碼後影像直接處理成具高影像品質且具該第二解析度的複數個該高解析度影像。所以，於步驟720中，客戶端裝置702的AI處理模組接受並使用該第一預設的AI運算式、該第二預設的AI運算式、複數個該第一加權參數、以及複數個該第二加權參數來處理該些解碼後影像以產生相對應具第二解析度的複數個該高解析度影像。之後，如步驟718，客戶端裝置702將該些高解析度影像做為輸出影像並輸出至螢幕上。 Figure 14A is a schematic diagram of a sixth embodiment of the method of the present invention for reducing the network bandwidth required for image streaming by using an artificial intelligence processing module. Since most of the steps shown in Figure 14A are the same as those shown in Figure 11A, the same or similar steps are given the same numbers and their details are not repeated. In the embodiment shown in Figure 14A, the first application program executed in the server 701 generates a plurality of original images with the second resolution (step 711). These original images are then resolution-reduced into a corresponding plurality of source images with the first resolution (step 712). Afterwards, these source images are encoded (step 713) into encoded images and transmitted to the client device 702 (step 714). The client device 702 decodes the received encoded images into decoded images (step 715). In this embodiment, the first preset AI formula, the second preset AI formula, the plurality of first weighting parameters, and the plurality of second weighting parameters are all included in the same AI processing module of the client device 702, so that the plurality of decoded images can be directly processed into a plurality of high-resolution images with high image quality and the second resolution. Therefore, in step 720, the AI processing module of the client device 702 receives and uses the first preset AI formula, the second preset AI formula, the plurality of first weighting parameters, and the plurality of second weighting parameters to process the decoded images to generate a corresponding plurality of high-resolution images with the second resolution. Afterwards, in step 718, the client device 702 takes these high-resolution images as output images and outputs them to the screen.
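The single-module processing of step 720 can be sketched as follows. This is a minimal, hypothetical illustration: a nearest-neighbour expansion stands in for the learned upscaling formula and a simple gain/bias pass stands in for the learned quality-enhancement formula; the `weights` layout, the function name, and the 2x scale are assumptions, since the patent fixes no particular network topology.

```python
import numpy as np

def ai_upscale_enhance(decoded_frame, weights):
    """Single-pass sketch of step 720: one client-side module holds both the
    upscaling parameters and the quality-enhancement parameters."""
    scale = 2  # assumed ratio between the second and first resolutions
    # nearest-neighbour expansion stands in for the learned upscaling formula
    up = np.repeat(np.repeat(decoded_frame, scale, axis=0), scale, axis=1)
    # a gain/bias pass stands in for the learned quality-enhancement formula
    return weights["gain"] * up + weights["bias"]

frame_lo = np.full((4, 4), 0.5)           # decoded frame at the first resolution
weights = {"gain": 1.1, "bias": -0.05}    # stand-ins for trained parameters
frame_hi = ai_upscale_enhance(frame_lo, weights)
print(frame_hi.shape)  # (8, 8)
```

The point of the sketch is only the data flow: one set of parameters, applied in one pass, turns a first-resolution frame into a second-resolution frame.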

圖十四B是本發明利用人工智慧處理模組來降低影像串流所需網路頻寬的方法的第七實施例示意圖。由於圖十四B所示大部分步驟都和圖十四A及圖十一B所示相同，所以相同或類似的步驟將給予相同的編號且不贅述其細節。於圖十四B所示實施例中，執行於伺服器701的第一應用程式產生具第一解析度的複數來源影像(步驟719)，換言之，伺服器701會直接產生低解析度的來源影像，所以無須另執行降低解析度程序。之後，這些來源影像會依據相同於圖十四A所述之步驟713、714、715、720及718被處理。 Figure 14B is a schematic diagram of a seventh embodiment of the method of the present invention for reducing the network bandwidth required for image streaming by using an artificial intelligence processing module. Since most of the steps shown in Figure 14B are the same as those shown in Figure 14A and Figure 11B, the same or similar steps are given the same numbers and their details are not repeated. In the embodiment shown in Figure 14B, the first application program running on the server 701 generates a plurality of source images with the first resolution (step 719); in other words, the server 701 directly generates low-resolution source images, so no separate resolution-reduction procedure is needed. Afterwards, these source images are processed according to the same steps 713, 714, 715, 720, and 718 described for Figure 14A.

圖十五是本發明所述AI處理模組的第一預設的AI運算式、第二預設的AI運算式、第一加權參數及第二加權參數的訓練方式的一實施例示意圖。於本發明中，客戶端裝置702內的AI處理模組中的第一預設的AI運算式、第二預設的AI運算式、第一加權參數及第二加權參數是藉由在該訓練伺服器上執行一人工神經網路的訓練程序所預先定義。當訓練完成後，第一預設的AI運算式、第二預設的AI運算式、第一加權參數及第二加權參數被應用在客戶端裝置702的AI處理模組中以執行如圖十四A及圖十四B所示之步驟720所述的AI提升解析度+增強的步驟。在訓練伺服器中訓練第一預設的AI運算式、第二預設的AI運算式、第一加權參數及第二加權參數的步驟包括: Figure 15 is a schematic diagram of an embodiment of the training method for the first preset AI formula, the second preset AI formula, the first weighting parameters, and the second weighting parameters of the AI processing module of the present invention. In the present invention, the first preset AI formula, the second preset AI formula, the first weighting parameters, and the second weighting parameters in the AI processing module of the client device 702 are predefined by executing an artificial neural network training procedure on the training server. After the training is completed, the first preset AI formula, the second preset AI formula, the first weighting parameters, and the second weighting parameters are applied in the AI processing module of the client device 702 to execute the combined AI resolution-upscaling and quality-enhancement step described in step 720 of Figures 14A and 14B. The steps of training the first preset AI formula, the second preset AI formula, the first weighting parameters, and the second weighting parameters in the training server include:

步驟7201:在該訓練伺服器中啟用一訓練模式以產生複數個訓練原圖影像(步驟7202);複數個該訓練原圖影像具有該第二解析度(高解析度)。 Step 7201: Enable a training mode in the training server to generate a plurality of training original images (step 7202); the plurality of training original images have the second resolution (high resolution).

步驟7203:執行一解析度降低程序，將複數個該訓練原圖影像的解析度由該第二解析度降低至該第一解析度，以產生具該第一解析度的複數個訓練低解析度影像(步驟7204)。 Step 7203: Execute a resolution reduction procedure to reduce the resolution of the plurality of training original images from the second resolution to the first resolution, so as to generate a plurality of training low-resolution images with the first resolution (step 7204).

步驟7205:執行一編碼程序，藉由訓練伺服器內的一編碼器來把複數個該訓練低解析度影像編碼成相對應的複數個訓練編碼後影像。 Step 7205: Execute an encoding procedure to encode the plurality of training low-resolution images into a corresponding plurality of training encoded images by an encoder in the training server.

步驟7206:執行一解碼程序，藉由訓練伺服器內的一解碼器來把複數個該訓練編碼後影像解碼成相對應的複數個訓練解碼後影像;複數個該訓練解碼後影像具有該第一解析度。 Step 7206: Execute a decoding procedure to decode the plurality of training encoded images into a corresponding plurality of training decoded images by a decoder in the training server; the plurality of training decoded images have the first resolution.

步驟7207:由該人工神經網路模組接受並使用一第一訓練運算式以及一第二訓練運算式來逐一處理複數個該訓練解碼後影像以產生相對應之具該第二解析度的複數個訓練輸出影像(步驟7208)。該第一訓練運算式具有複數個第一訓練加權參數。該第二訓練運算式具有複數個第二訓練加權參數。 Step 7207: The artificial neural network module receives and uses a first training formula and a second training formula to process the plurality of training decoded images one by one to generate a corresponding plurality of training output images with the second resolution (step 7208). The first training formula has a plurality of first training weighting parameters. The second training formula has a plurality of second training weighting parameters.

步驟7209:使用一比較模組來逐一比較複數個該訓練輸出影像和相對應的複數個該訓練原圖影像之間的差異，並據以調整該第一訓練運算式的該些第一訓練加權參數以及該第二訓練運算式的該些第二訓練加權參數。該些第一訓練加權參數以及該些第二訓練加權參數會被調整成可讓該訓練輸出影像與相對應之該訓練原圖影像之間的差異最小化。每一次當該些第一訓練加權參數以及該些第二訓練加權參數被調整後，調整後的該些第一訓練加權參數以及該些第二訓練加權參數就會被回饋給該第一訓練運算式以及該第二訓練運算式以供處理下一個該訓練低解析度影像。在進行過預定數量的該訓練輸出影像與相對應之該訓練原圖影像的比較、以及預定次數的複數個該第一訓練加權參數以及該些第二訓練加權參數的調整程序後，最後所得到的該些第一訓練加權參數以及該些第二訓練加權參數會被應用在該客戶端裝置的該AI處理模組內來作為其至少一該數學運算式所包含的該第一訓練運算式以及該第二訓練運算式的加權參數，以供執行如圖十四A及圖十四B中的步驟720所述的AI提升解析度+增強影像品質的步驟。 Step 7209: Use a comparison module to compare, one by one, the differences between the plurality of training output images and the corresponding plurality of training original images, and adjust the first training weighting parameters of the first training formula and the second training weighting parameters of the second training formula accordingly. The first training weighting parameters and the second training weighting parameters are adjusted so as to minimize the difference between each training output image and the corresponding training original image. Each time the first training weighting parameters and the second training weighting parameters are adjusted, the adjusted parameters are fed back to the first training formula and the second training formula for processing the next training low-resolution image. After a predetermined number of comparisons between the training output images and the corresponding training original images, and a predetermined number of adjustments of the first training weighting parameters and the second training weighting parameters, the finally obtained first training weighting parameters and second training weighting parameters are applied in the AI processing module of the client device as the weighting parameters of the first training formula and the second training formula contained in its at least one mathematical formula, so as to execute the combined AI resolution-upscaling and image-quality-enhancement step described in step 720 of Figures 14A and 14B.
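Steps 7201 to 7209 above amount to a standard supervised training loop: downscale a high-resolution original, run it through the trainable formula, compare the output with the original, and feed the adjusted weights back before processing the next image. The sketch below is a deliberately minimal, hypothetical instance with a single trainable weight, block-mean downscaling, and nearest-neighbour upsampling; a real implementation would use a multi-layer network such as those named later in the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical one-weight "network": output = w * upsampled_input.
w = 0.1    # initial training weighting parameter (illustrative)
lr = 0.5   # learning rate (illustrative)
for _ in range(50):  # a predetermined number of adjustment rounds
    original = rng.uniform(0.2, 1.0, (4, 4))         # training original image (second resolution)
    low = original.reshape(2, 2, 2, 2).mean((1, 3))  # resolution-reduction procedure (step 7203)
    up = np.repeat(np.repeat(low, 2, 0), 2, 1)       # naive upsample back to the second resolution
    out = w * up                                     # training output image (steps 7207-7208)
    diff = out - original                            # comparison module (step 7209)
    w -= lr * np.mean(diff * up)                     # adjusted weight, fed back for the next image
print(round(w, 2))  # converges toward 1.0
```

Each pass through the loop performs exactly one comparison and one weight adjustment, matching the "feed back before the next image" wording of step 7209.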

於本發明的一較佳實施例中，客戶端裝置702的AI處理模組僅包含單一組的AI運算式及複數個加權參數，其是藉由如圖十五所述步驟7201至7209的方式來進行訓練，所以也可以提供如圖十四A及圖十四B中的步驟720所述的「AI提升解析度+增強影像品質」的合併功能。 In a preferred embodiment of the present invention, the AI processing module of the client device 702 contains only a single set of AI formulas and a plurality of weighting parameters, which are trained by means of steps 7201 to 7209 described in Figure 15; it can therefore also provide the combined "AI resolution upscaling + image quality enhancement" function described in step 720 of Figures 14A and 14B.

圖十六是本發明利用人工智慧處理模組來降低影像串流所需網路頻寬的方法的第八實施例示意圖。由於圖十六所示大部分步驟都和圖十四A及圖十四B所示相同，所以相同或類似的步驟將給予相同的編號且不贅述其細節。於圖十六所示實施例中，伺服器701更包含一AI編碼模組。執行於伺服器701的第一應用程式依據指令產生具第二解析度的複數原圖影像(步驟721)。接著，於步驟722中，伺服器701使用該AI編碼模組來將複數個該原圖影像進行降低解析度以獲得相對應的複數個該來源影像、以及將複數個該來源影像進行編碼以獲得相對應的複數個該編碼後影像。該AI編碼模組包含預設的至少一AI編碼運算式;該至少一AI編碼運算式包含預設的複數個編碼加權參數。然後，編碼後影像以影像串流的方式傳送給客戶端裝置702(步驟714)。於本實施例中，客戶端裝置702的AI處理模組更包括一AI解碼運算式以供將所接收之編碼後影像解碼成為相對應的解碼後影像。換言之，該AI解碼運算式、該第一預設的AI運算式、該第二預設的AI運算式、複數個該第一加權參數、以及複數個該第二加權參數全部都包含在該客戶端裝置702的同一個該AI處理模組內，以供把接收到的複數個該編碼後影像直接處理成解碼後且具高影像品質且具該第二解析度的複數個該高解析度影像。所以，於步驟723中，客戶端裝置702的AI處理模組接受並使用該AI解碼運算式、該第一預設的AI運算式、該第二預設的AI運算式、複數個該第一加權參數、以及複數個該第二加權參數來處理該些編碼後影像以直接產生相對應具第二解析度的複數個該高解析度影像。之後，如步驟718，客戶端裝置702將該些高解析度影像做為輸出影像並輸出至螢幕上。 Figure 16 is a schematic diagram of an eighth embodiment of the method of the present invention for reducing the network bandwidth required for image streaming by using an artificial intelligence processing module. Since most of the steps shown in Figure 16 are the same as those shown in Figures 14A and 14B, the same or similar steps are given the same numbers and their details are not repeated. In the embodiment shown in Figure 16, the server 701 further includes an AI encoding module. The first application program executed on the server 701 generates a plurality of original images with the second resolution according to the instructions (step 721). Next, in step 722, the server 701 uses the AI encoding module to reduce the resolution of the plurality of original images to obtain a corresponding plurality of source images, and to encode the plurality of source images to obtain a corresponding plurality of encoded images. The AI encoding module contains at least one preset AI encoding formula; the at least one AI encoding formula contains a plurality of preset encoding weighting parameters. The encoded images are then transmitted to the client device 702 as an image stream (step 714). In this embodiment, the AI processing module of the client device 702 further includes an AI decoding formula for decoding the received encoded images into corresponding decoded images. In other words, the AI decoding formula, the first preset AI formula, the second preset AI formula, the plurality of first weighting parameters, and the plurality of second weighting parameters are all included in the same AI processing module of the client device 702, so that the received plurality of encoded images can be directly processed into a decoded plurality of high-resolution images with high image quality and the second resolution. Therefore, in step 723, the AI processing module of the client device 702 receives and uses the AI decoding formula, the first preset AI formula, the second preset AI formula, the plurality of first weighting parameters, and the plurality of second weighting parameters to process the encoded images to directly generate a corresponding plurality of high-resolution images with the second resolution. Afterwards, in step 718, the client device 702 takes these high-resolution images as output images and outputs them to the screen.

圖十七是本發明所述人工神經網路模組的該AI編碼運算式、該AI解碼運算式、第一預設的AI運算式、第二預設的AI運算式、第一加權參數及第二加權參數的訓練方式的一實施例示意圖。於本發明中，伺服器701內的AI編碼運算式及其加權參數、以及客戶端裝置702內的AI處理模組中的AI解碼運算式及其加權參數、第一預設的AI運算式、第二預設的AI運算式、第一加權參數及第二加權參數都是藉由在該訓練伺服器上執行人工神經網路的訓練程序所預先定義。當訓練完成後，AI編碼運算式及其加權參數會被應用於伺服器701的AI編碼模組中以供執行圖十六所示的步驟722(AI編碼步驟);同時，AI解碼運算式及其加權參數、第一預設的AI運算式、第二預設的AI運算式、第一加權參數及第二加權參數被應用在客戶端裝置702的AI處理模組中以執行如圖十六所示之步驟723(合併處理AI解碼+提升解析度+增強影像品質的步驟)。在訓練伺服器中訓練該AI編碼運算式及其加權參數、該AI解碼運算式及其加權參數、第一預設的AI運算式、第二預設的AI運算式、第一加權參數及第二加權參數的步驟包括: Figure 17 is a schematic diagram of an embodiment of the training method for the AI encoding formula, the AI decoding formula, the first preset AI formula, the second preset AI formula, the first weighting parameters, and the second weighting parameters of the artificial neural network module of the present invention. In the present invention, the AI encoding formula and its weighting parameters in the server 701, as well as the AI decoding formula and its weighting parameters, the first preset AI formula, the second preset AI formula, the first weighting parameters, and the second weighting parameters in the AI processing module of the client device 702, are all predefined by executing an artificial neural network training procedure on the training server. After the training is completed, the AI encoding formula and its weighting parameters are applied in the AI encoding module of the server 701 to execute step 722 shown in Figure 16 (the AI encoding step); meanwhile, the AI decoding formula and its weighting parameters, the first preset AI formula, the second preset AI formula, the first weighting parameters, and the second weighting parameters are applied in the AI processing module of the client device 702 to execute step 723 shown in Figure 16 (the combined AI decoding + resolution upscaling + image quality enhancement step). The steps of training the AI encoding formula and its weighting parameters, the AI decoding formula and its weighting parameters, the first preset AI formula, the second preset AI formula, the first weighting parameters, and the second weighting parameters in the training server include:

步驟7221:在該訓練伺服器中啟用一訓練模式並在訓練模式中執行第一應用程式以產生複數個訓練原圖影像(步驟7222);複數個該訓練原圖影像具有該第二解析度(高解析度)。 Step 7221: Enable a training mode in the training server and execute the first application program in the training mode to generate a plurality of training original images (step 7222); the plurality of training original images have the second resolution (high resolution).

步驟7223:執行一解析度降低程序，將複數個該訓練原圖影像的解析度由該第二解析度降低至該第一解析度，以產生具該第一解析度的複數個訓練低解析度影像(步驟7224)。 Step 7223: Execute a resolution reduction procedure to reduce the resolution of the plurality of training original images from the second resolution to the first resolution, so as to generate a plurality of training low-resolution images with the first resolution (step 7224).

步驟7225:使用一第一人工神經網路模組來接受並使用一訓練編碼運算式來逐一處理複數個該訓練低解析度影像以產生相對應之具該第一解析度的複數個訓練編碼影像(步驟7226);該訓練編碼運算式具有複數個訓練編碼加權參數。 Step 7225: Use a first artificial neural network module to receive and use a training encoding formula to process the plurality of training low-resolution images one by one to generate a corresponding plurality of training encoded images with the first resolution (step 7226); the training encoding formula has a plurality of training encoding weighting parameters.

步驟7227:使用一第二人工神經網路模組來接受並使用一訓練解碼運算式來逐一處理複數個該訓練編碼影像以產生相對應之具該第二解析度的複數個訓練輸出影像(步驟7228);該訓練解碼運算式具有複數個訓練解碼加權參數。 Step 7227: Use a second artificial neural network module to receive and use a training decoding formula to process the plurality of training encoded images one by one to generate a corresponding plurality of training output images with the second resolution (step 7228); the training decoding formula has a plurality of training decoding weighting parameters.

步驟7229:使用一比較模組來逐一比較複數個該訓練輸出影像和相對應的複數個該訓練原圖影像之間的差異，並據以調整該訓練編碼運算式的該些訓練編碼加權參數以及該訓練解碼運算式的該些訓練解碼加權參數。該些訓練編碼加權參數以及該些訓練解碼加權參數會被調整成可讓該訓練輸出影像與相對應之該訓練原圖影像之間的差異最小化。每一次當該些訓練編碼加權參數以及該些訓練解碼加權參數被調整後，調整後的該些訓練編碼加權參數以及該些訓練解碼加權參數就會分別被回饋給該訓練編碼運算式以及該訓練解碼運算式以供處理下一個該訓練低解析度影像。於步驟7220中，在進行過預定數量的該訓練輸出影像與相對應之該訓練原圖影像的比較、以及預定次數的複數個該訓練編碼加權參數以及該些訓練解碼加權參數的調整程序後，最後所得到的該些訓練編碼加權參數會被應用在該伺服器的該AI編碼模組的AI編碼運算式中;並且，所得到的該些訓練解碼加權參數會被應用在該客戶端裝置的該AI處理模組的至少一該數學運算式中。藉此，該伺服器的AI編碼模組可以如圖十六之步驟722般合併地處理將原圖影像的解析度降低與編碼的程序;並且，該客戶端裝置的該AI處理模組可以如圖十六之步驟723般對所接收到的該編碼後影像合併性地執行解碼、解析度提升以及影像品質增強的程序。 Step 7229: Use a comparison module to compare, one by one, the differences between the plurality of training output images and the corresponding plurality of training original images, and adjust the training encoding weighting parameters of the training encoding formula and the training decoding weighting parameters of the training decoding formula accordingly. The training encoding weighting parameters and the training decoding weighting parameters are adjusted so as to minimize the difference between each training output image and the corresponding training original image. Each time the training encoding weighting parameters and the training decoding weighting parameters are adjusted, the adjusted parameters are fed back to the training encoding formula and the training decoding formula, respectively, for processing the next training low-resolution image. In step 7220, after a predetermined number of comparisons between the training output images and the corresponding training original images, and a predetermined number of adjustments of the training encoding weighting parameters and the training decoding weighting parameters, the finally obtained training encoding weighting parameters are applied in the AI encoding formula of the AI encoding module of the server, and the obtained training decoding weighting parameters are applied in the at least one mathematical formula of the AI processing module of the client device. Thereby, the AI encoding module of the server can jointly perform the resolution reduction and encoding of the original images as in step 722 of Figure 16, and the AI processing module of the client device can jointly perform decoding, resolution upscaling, and image quality enhancement on the received encoded images as in step 723 of Figure 16.
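The joint adjustment in steps 7225 to 7229 can be illustrated with a deliberately tiny, hypothetical model: one scalar encoder weight (server side) and one scalar decoder weight (client side), both updated from the same reconstruction error, with the adjusted weights fed back before the next image. For simplicity the error here is measured against the low-resolution input rather than a high-resolution original; real systems would use the autoencoder-style and super-resolution networks named below rather than scalars.

```python
import numpy as np

rng = np.random.default_rng(1)

# One scalar encoder weight and one scalar decoder weight, trained jointly.
w_enc, w_dec, lr = 0.5, 0.5, 0.3   # illustrative starting values
for _ in range(200):               # a predetermined number of adjustments
    x = rng.uniform(0.2, 1.0, 8)   # training low-resolution image (flattened)
    code = w_enc * x               # training encoded image (steps 7225-7226)
    out = w_dec * code             # training output image (steps 7227-7228)
    diff = out - x                 # comparison module (step 7229)
    g_dec = np.mean(diff * code)       # gradient for the decoder weight
    g_enc = np.mean(diff * w_dec * x)  # gradient for the encoder weight
    w_dec -= lr * g_dec            # both adjusted weights are fed back
    w_enc -= lr * g_enc
print(round(w_enc * w_dec, 2))  # the encode-decode product approaches 1.0
```

The design point carried over from the patent is that a single loss drives both sides: the server-side encoding parameters and the client-side decoding parameters are learned together, then deployed to two different machines.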

於一較佳實施例中，客戶端裝置702的AI處理模組僅包含單一組的AI運算式及複數個加權參數，其是藉由如圖十七所述步驟7221至7229的方式來進行訓練，所以也可以提供如圖十六中的步驟723所述的「AI解碼+AI提升解析度+AI增強影像品質」的合併功能。 In a preferred embodiment, the AI processing module of the client device 702 contains only a single set of AI formulas and a plurality of weighting parameters, which are trained by means of steps 7221 to 7229 described in Figure 17; it can therefore also provide the combined "AI decoding + AI resolution upscaling + AI image quality enhancement" function described in step 723 of Figure 16.

在本發明的一個實施例中，可以使用以下現有技術中的任一種人工神經網絡技術作為第一人工神經網絡模組在伺服器中執行AI編碼步驟:自動編碼器(Autoencoder;簡稱AE)、去噪自動編碼器(Denoising Autoencoder;簡稱DAE)、變分自編碼器(Variational Autoencoder;簡稱VAE)和矢量量化變分自編碼器(Vector-Quantized Variational Autoencoder;簡稱VQ-VAE)。用於在客戶端裝置中執行AI解碼、AI提升解析度和AI增強影像品質的第二人工神經網絡模組可以選自以下現有的人工神經網絡技術:SRCNN、EDSR、RCAN、EnhanceNet、SRGAN和ESRGAN。 In one embodiment of the present invention, any of the following existing artificial neural network techniques may be used as the first artificial neural network module for performing the AI encoding step in the server: Autoencoder (AE), Denoising Autoencoder (DAE), Variational Autoencoder (VAE), and Vector-Quantized Variational Autoencoder (VQ-VAE). The second artificial neural network module, used in the client device for performing AI decoding, AI resolution upscaling, and AI image quality enhancement, may be selected from the following existing artificial neural network techniques: SRCNN, EDSR, RCAN, EnhanceNet, SRGAN, and ESRGAN.

於一較佳實施例中，複數個該原圖影像可以是三維(three-dimensional;簡稱3D)影像;每一個3D影像分別包含以並排方式組合在一個圖像幀中的左眼視圖和右眼視圖。因此，在客戶端裝置產生的相對應的輸出影像也會是3D影像。 In a preferred embodiment, the plurality of original images may be three-dimensional (3D) images; each 3D image includes a left-eye view and a right-eye view combined side by side in one image frame. Therefore, the corresponding output images generated on the client device will also be 3D images.
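A side-by-side 3D frame as described above is simply one frame whose left half is the left-eye view and whose right half is the right-eye view, so recovering the two views is a single slice. The helper name and the even-width assumption below are illustrative, not taken from the patent.

```python
import numpy as np

def split_side_by_side(frame):
    """Split one side-by-side 3D frame into its left-eye and right-eye views.
    Assumes an even frame width."""
    w = frame.shape[1]
    return frame[:, : w // 2], frame[:, w // 2 :]

# left half = left-eye view, right half = right-eye view, in one image frame
frame = np.hstack([np.zeros((4, 4)), np.ones((4, 4))])
left, right = split_side_by_side(frame)
print(left.shape, right.shape)  # (4, 4) (4, 4)
```

Because both views travel in one frame, the whole low-bandwidth pipeline (downscale, encode, stream, AI restore) applies to 3D content unchanged.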

於一較佳實施例中，根據本發明利用人工智慧處理模組來降低影像串流所需網路頻寬的系統還可以應用於機器人的遠程控制系統。本發明的伺服器可以是機器人，其包括運動模組、攝像頭模組、通訊模組和控制模組。本發明的客戶端裝置可以是包括控制器模組和顯示器的機器人控制設備。機器人通過網際網路或其他無線通信技術與控制設備遠程連接。控制器模組可由使用者操作以向機器人發送控制指令，從而遠程控制和操作機器人的運動和動作。機器人的攝像頭模組包括雙眼影像擷取模組，以獲取3D影像(左眼視圖和右眼視圖並排組合在一個圖像幀中)。依據從控制設備接收到的控制指令，機器人可以進行移動和其他動作，也可以拍攝機器人周圍環境的3D影像，然後將這些3D影像發送回控制設備並顯示在顯示器上。通過使用本發明的方法，客戶端裝置(控制設備)可以配備預先訓練過的AI處理模組;藉此，機器人的雙眼影像擷取模組只需要拍攝少量數據的低解析度影像並消耗較少的網路頻寬快速地傳送到客戶端裝置，然後客戶端裝置可以使用AI處理模組恢復3D影像的高解析度以及高影像品質。此外，由於機器人是拍攝及處理低解析度影像，其所需消耗的運算資源相對更低而且更省電，可以延長機器人的遠端作業時間。 In a preferred embodiment, the system of the present invention for reducing the network bandwidth required for image streaming by using an artificial intelligence processing module can also be applied to a remote control system for robots. The server of the present invention may be a robot, which includes a motion module, a camera module, a communication module, and a control module. The client device of the present invention may be a robot control device including a controller module and a display. The robot is remotely connected with the control device through the Internet or other wireless communication technologies. The controller module can be operated by the user to send control commands to the robot, so as to remotely control and operate the robot's movements and actions. The camera module of the robot includes a binocular image capture module for obtaining 3D images (the left-eye view and the right-eye view combined side by side in one image frame). According to the control commands received from the control device, the robot can move and perform other actions, and can also capture 3D images of its surroundings and send these 3D images back to the control device for display. By using the method of the present invention, the client device (the control device) can be equipped with a pre-trained AI processing module; thereby, the binocular image capture module of the robot only needs to capture low-resolution images containing a small amount of data, which consume little network bandwidth and are quickly transmitted to the client device, and the client device can then use the AI processing module to restore the high resolution and high image quality of the 3D images. In addition, since the robot captures and processes low-resolution images, it consumes relatively fewer computing resources and less power, which can extend the robot's remote operating time.

以上所述僅為本發明的較佳實施例，並非用來侷限本發明的可實施範圍。本發明的保護範圍應以申請專利範圍內容所載為準。任何熟習本項技術之人對於本發明的各種修改與改變，都可能未偏離本發明的發明精神與可實施範圍，而仍受到本發明的申請專利範圍內容所涵蓋。 The above descriptions are only preferred embodiments of the present invention and are not intended to limit its applicable scope. The scope of protection of the present invention shall be determined by the contents of the claims. Any person skilled in the art may make various modifications and changes to the present invention without departing from its spirit and scope, and such modifications and changes remain covered by the claims of the present invention.

701:伺服器 701: server

702:客戶端裝置 702: client device

711-718:步驟 711-718: Steps

Claims (11)

一種利用人工智慧處理模組來降低影像串流所需網路頻寬的方法，包括: A method for reducing the network bandwidth required for image streaming by using an artificial intelligence processing module, comprising:

步驟(A):在一伺服器中執行一第一應用程式以產生相對應於複數個原圖影像的複數個來源影像;複數個該來源影像具有一第一解析度;複數個該來源影像被該伺服器內的一編碼器進行編碼與壓縮以產生相對應的複數個編碼後影像; Step (A): executing a first application program in a server to generate a plurality of source images corresponding to a plurality of original images; the plurality of source images have a first resolution; the plurality of source images are encoded and compressed by an encoder in the server to generate a corresponding plurality of encoded images;

步驟(B):在遠離該伺服器的一客戶端裝置內執行一第二應用程式;該第二應用程式是關聯於且合作於該第一應用程式; Step (B): executing a second application program in a client device remote from the server; the second application program is associated with and cooperates with the first application program;

步驟(C):該客戶端裝置經由一網路連結於該伺服器，然後以一影像串流的方式經由該網路接收由該伺服器產生的該些編碼後影像; Step (C): the client device is connected to the server via a network, and then receives, in the form of an image stream via the network, the encoded images generated by the server;

步驟(D):該客戶端裝置將該些編碼後影像解碼成相對應的複數個解碼後影像，並使用一人工智慧(Artificial Intelligence;簡稱AI)處理模組來提升該些解碼後影像的解析度，以產生相對應的複數個高解析度影像;複數個該高解析度影像具有一第二解析度;其中，該第二解析度是高於該第一解析度，並且，複數個該原圖影像的解析度等於該第二解析度; Step (D): the client device decodes the encoded images into a corresponding plurality of decoded images, and uses an Artificial Intelligence (AI) processing module to upscale the resolution of the decoded images so as to generate a corresponding plurality of high-resolution images; the plurality of high-resolution images have a second resolution; wherein the second resolution is higher than the first resolution, and the resolution of the plurality of original images is equal to the second resolution;

步驟(E):該客戶端裝置將複數個該高解析度影像依序輸出至一螢幕以作為被播放的複數個輸出影像; Step (E): the client device sequentially outputs the plurality of high-resolution images to a screen as a plurality of output images to be played;

其中，該AI處理模組藉由分析該些解碼後影像與相對應之該些原圖影像之間的差異所預先得到的至少一數學運算式以及複數個加權參數來處理該些解碼後影像;藉此，所得到的該些高解析度影像的解析度會等於相對應的該些原圖影像、且高於複數個該來源影像的解析度;該AI處理模組的該至少一數學運算式以及複數個該加權參數是預先藉由一訓練伺服器內的一人工神經網路模組所執行的一訓練程序來定義。 Wherein the AI processing module processes the decoded images using at least one mathematical formula and a plurality of weighting parameters obtained in advance by analyzing the differences between the decoded images and the corresponding original images; thereby, the resolution of the obtained high-resolution images is equal to that of the corresponding original images and higher than that of the plurality of source images; the at least one mathematical formula and the plurality of weighting parameters of the AI processing module are defined in advance by a training procedure executed by an artificial neural network module in a training server.

如請求項1之所述的方法，其中，步驟(A)中所述的複數個該編碼後影像是藉由以下步驟來產生: The method as described in claim 1, wherein the plurality of encoded images described in step (A) are generated by the following steps:

在該伺服器中執行該第一應用程式以產生複數個該原圖影像，複數個該原圖影像具有該第二解析度; executing the first application program in the server to generate a plurality of original images, the plurality of original images having the second resolution;

使用一解析度降低程序，將複數個該原圖影像的解析度降低至該第一解析度，以獲得相對應的複數個該來源影像;以及 using a resolution reduction procedure to reduce the resolution of the plurality of original images to the first resolution so as to obtain the corresponding plurality of source images; and

使用該編碼器來將複數個該來源影像進行編碼，以獲得相對應的複數個該編碼後影像。 using the encoder to encode the plurality of source images to obtain the corresponding plurality of encoded images.
如請求項1之所述的方法,其中,該伺服器包括一AI編碼模組;步驟(A)中所述的複數個該編碼後影像是藉由以下步驟來產生: The method as described in claim 1, wherein the server includes an AI encoding module; the plurality of encoded images described in step (A) are generated by the following steps: 在該伺服器中執行該第一應用程式以產生複數個該原圖影像,複數個該原圖影像具有該第二解析度; executing the first application program in the server to generate a plurality of the original image images, the plurality of the original image images having the second resolution; 使用該AI編碼模組來將複數個該原圖影像進行降低解析度以獲得相對應的複數個該來源影像、以及將複數個該來源影像進行編碼以獲得相對應的複數個該編碼後影像; Using the AI encoding module to reduce the resolution of the plurality of original images to obtain corresponding plurality of source images, and encode the plurality of source images to obtain corresponding plurality of encoded images; 其中,該AI編碼模組包含預設的至少一AI編碼運算式;該至少一AI編碼運算式包含預設的複數個編碼加權參數。 Wherein, the AI coding module includes at least one preset AI coding formula; the at least one AI coding formula includes a plurality of preset coding weighting parameters. 
如請求項1之所述的方法,其中,該AI處理模組的該至少一數學運算式包括一第一預設的AI運算式以及一第二預設的AI運算式;該第一預設的AI運算式包括複數個第一加權參數;該第二預設的AI運算式包括複數個第二加權參數; The method as described in claim 1, wherein the at least one mathematical calculation formula of the AI processing module includes a first preset AI calculation formula and a second preset AI calculation formula; the first preset The AI formula includes a plurality of first weighting parameters; the second preset AI formula includes a plurality of second weighting parameters; 其中,該第一預設的AI運算式搭配複數個該第一加權參數可用於提高影像的解析度,藉此,由該第一預設的AI運算式搭配複數個該第一加權參數所處理過的影像的解析度可以由該第一解析度提高到該第二解析度; Wherein, the first preset AI calculation formula with a plurality of the first weighting parameters can be used to improve the resolution of the image, thereby, the first preset AI calculation formula with the plurality of the first weighting parameters is processed The resolution of the processed image can be increased from the first resolution to the second resolution; 其中,該第二預設的AI運算式搭配複數個該第二加權參數可用於增強影像的品質,藉此,由該第二預設的AI運算式搭配複數個該第二加權參數所處理過的影像的品質比該解碼後影像的品質更高、且更接近於該原圖影像的品質。 Wherein, the second preset AI calculation formula with a plurality of the second weighting parameters can be used to enhance the quality of the image, thereby, the second preset AI calculation formula with the plurality of the second weighting parameters processed The quality of the image is higher than that of the decoded image and closer to the quality of the original image. 
如請求項4之所述的方法，其中，當該客戶端裝置將所接收到的複數個該編碼後影像解碼成相對應的複數個解碼後影像以後，該客戶端裝置會使用以下其中之一方式來處理複數個該解碼後影像: The method as described in claim 4, wherein, after the client device decodes the received plurality of encoded images into a corresponding plurality of decoded images, the client device processes the plurality of decoded images in one of the following ways:

方式一:該客戶端裝置先使用該第一預設的AI運算式及複數個該第一加權參數來處理複數個該解碼後影像以產生相對應的具第二解析度的複數個解析度提升影像;接著，該客戶端裝置使用該第二預設的AI運算式及複數個該第二加權參數來處理複數個該解析度提升影像以產生具高影像品質且具該第二解析度的複數個該高解析度影像; Way 1: the client device first uses the first preset AI formula and the plurality of first weighting parameters to process the plurality of decoded images to generate a corresponding plurality of resolution-upscaled images with the second resolution; then, the client device uses the second preset AI formula and the plurality of second weighting parameters to process the plurality of resolution-upscaled images to generate a plurality of high-resolution images with high image quality and the second resolution;

方式二:該客戶端裝置先使用該第二預設的AI運算式及複數個該第二加權參數來處理複數個該解碼後影像以產生相對應的具高影像品質的複數個品質提升影像;接著，該客戶端裝置使用該第一預設的AI運算式及複數個該第一加權參數來處理複數個該品質提升影像以產生具該第二解析度且具高影像品質的複數個該高解析度影像。 Way 2: the client device first uses the second preset AI formula and the plurality of second weighting parameters to process the plurality of decoded images to generate a corresponding plurality of quality-enhanced images with high image quality; then, the client device uses the first preset AI formula and the plurality of first weighting parameters to process the plurality of quality-enhanced images to generate a plurality of high-resolution images with the second resolution and high image quality.

如請求項4之所述的方法，其中，該第一預設的AI運算式、該第二預設的AI運算式、複數個該第一加權參數、以及複數個該第二加權參數全部都包含在該客戶端裝置的同一個該AI處理模組內，以供把複數個該解碼後影像處理成具高影像品質且具該第二解析度的複數個該高解析度影像。 The method as described in claim 4, wherein the first preset AI formula, the second preset AI formula, the plurality of first weighting parameters, and the plurality of second weighting parameters are all included in the same AI processing module of the client device, for processing the plurality of decoded images into a plurality of high-resolution images with high image quality and the second resolution.

如請求項4之所述的方法，其中，該AI處理模組更包括一AI解碼運算式，用於將所接受的複數個該編碼後影像解碼為複數個該解碼後影像;其中，該AI解碼運算式、該第一預設的AI運算式、該第二預設的AI運算式、複數個該第一加權參數、以及複數個該第二加權參數全部都包含在該客戶端裝置的同一個該AI處理模組內，以供把所接受的複數個該編碼後影像一次性地處理成具高影像品質且具該第二解析度的複數個該高解析度影像。 The method as described in claim 4, wherein the AI processing module further includes an AI decoding formula for decoding the received plurality of encoded images into the plurality of decoded images; wherein the AI decoding formula, the first preset AI formula, the second preset AI formula, the plurality of first weighting parameters, and the plurality of second weighting parameters are all included in the same AI processing module of the client device, for processing the received plurality of encoded images in a single pass into a plurality of high-resolution images with high image quality and the second resolution.
The method of claim 4, wherein the training procedure of the artificial neural network module executed in the training server comprises the following steps:

enabling a training mode in the training server to generate a plurality of training original images, the training original images having the second resolution;

performing a resolution-reduction procedure to reduce the resolution of the training original images from the second resolution to the first resolution, thereby generating a plurality of training low-resolution images having the first resolution;

receiving, by the artificial neural network module, the training low-resolution images and processing them one by one with a first training formula to generate corresponding training output images having the second resolution, the first training formula having a plurality of first training weighting parameters; and

using a comparison module to compare, one by one, the training output images with the corresponding training original images, and adjusting the first training weighting parameters of the first training formula accordingly; the first training weighting parameters are adjusted so as to minimize the difference between each training output image and the corresponding training original image; each time the first training weighting parameters are adjusted, the adjusted first training weighting parameters are fed back to the first training formula for processing the next training low-resolution image;

wherein, after a predetermined number of comparisons between the training output images and the corresponding training original images, and a predetermined number of adjustments of the first training weighting parameters, the finally obtained first training weighting parameters are applied in the AI processing module of the client device as the plurality of weighting parameters of its at least one mathematical formula.

The method of claim 4, wherein the training procedure of the artificial neural network module executed in the training server comprises the following steps:

enabling a training mode in the training server to generate a plurality of training original images, the training original images having the second resolution;

performing a resolution-reduction procedure to reduce the resolution of the training original images from the second resolution to the first resolution, thereby generating a plurality of training low-resolution images having the first resolution;

performing an encoding procedure, in which an encoder in the training server encodes the training low-resolution images into a corresponding plurality of training encoded images;

performing a decoding procedure, in which a decoder in the training server decodes the training encoded images into a corresponding plurality of training decoded images, the training decoded images having the first resolution;

receiving, by the artificial neural network module, the training decoded images and processing them one by one with a first training formula and a second training formula to generate corresponding training output images having the second resolution, the first training formula having a plurality of first training weighting parameters and the second training formula having a plurality of second training weighting parameters; and

using a comparison module to compare, one by one, the training output images with the corresponding training original images, and adjusting the first training weighting parameters of the first training formula and the second training weighting parameters of the second training formula accordingly; the first and second training weighting parameters are adjusted so as to minimize the difference between each training output image and the corresponding training original image; each time the first and second training weighting parameters are adjusted, the adjusted parameters are fed back to the first and second training formulas for processing the next training low-resolution image;

wherein, after a predetermined number of comparisons between the training output images and the corresponding training original images, and a predetermined number of adjustments of the first and second training weighting parameters, the finally obtained first and second training weighting parameters are applied in the AI processing module of the client device as the weighting parameters of the first and second training formulas contained in its at least one mathematical formula.
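The training procedure claimed above — downscale the originals, upscale them back with a parameterized formula, compare against the originals, adjust the parameters, and feed the adjusted parameters back before the next image — can be sketched as a toy loop. Everything below is a hypothetical stand-in (a single scalar gain trained by gradient descent on mean squared error), not the patent's network:

```python
# Toy sketch of the claimed training loop: hi-res originals are reduced to
# low resolution, a training formula with one adjustable weighting parameter
# ("gain") upscales them back, and a comparison step adjusts the parameter
# to minimize the difference from the original before the next image.

def downscale2x(frame):
    # Resolution-reduction step: average each 2x2 block into one pixel.
    return [[(frame[r][c] + frame[r][c+1] + frame[r+1][c] + frame[r+1][c+1]) / 4
             for c in range(0, len(frame[0]), 2)]
            for r in range(0, len(frame), 2)]

def upscale2x(frame, gain):
    # Training formula: pixel replication scaled by the weighting parameter.
    out = []
    for row in frame:
        wide = [p * gain for p in row for _ in range(2)]
        out.extend([wide[:], wide[:]])
    return out

def train(originals, gain=0.5, lr=1e-5, epochs=200):
    for _ in range(epochs):
        for orig in originals:           # one hi-res training original
            low = downscale2x(orig)      # training low-resolution image
            out = upscale2x(low, gain)   # training output image
            # Comparison step: the mean-squared-error gradient w.r.t. the
            # gain drives the update, which is fed back before the next
            # image is processed (d out/d gain = out/gain for gain != 0).
            n = len(orig) * len(orig[0])
            grad = sum(2 * (o - t) * (o / gain)
                       for orow, trow in zip(out, orig)
                       for o, t in zip(orow, trow)) / n
            gain -= lr * grad
    return gain

flat = [[100] * 4 for _ in range(4)]  # constant frame: the ideal gain is 1.0
learned = train([flat])
```

With a constant training image the loop converges to a gain of 1.0, i.e. the parameter value that makes the training output image match the training original.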
The method of claim 3, wherein the training procedure of the artificial neural network module executed in the training server comprises the following steps:

enabling a training mode in the training server to generate a plurality of training original images, the training original images having the second resolution;

performing a resolution-reduction procedure to reduce the resolution of the training original images from the second resolution to the first resolution, thereby generating a plurality of training low-resolution images having the first resolution;

using a first artificial neural network module to receive the training low-resolution images and process them one by one with a training encoding formula to generate a corresponding plurality of training encoded images having the first resolution, the training encoding formula having a plurality of training encoding weighting parameters;

using a second artificial neural network module to receive the training encoded images and process them one by one with a training decoding formula to generate a corresponding plurality of training output images having the second resolution, the training decoding formula having a plurality of training decoding weighting parameters;

using a comparison module to compare, one by one, the training output images with the corresponding training original images, and adjusting the training encoding weighting parameters of the training encoding formula and the training decoding weighting parameters of the training decoding formula accordingly; the training encoding weighting parameters and the training decoding weighting parameters are adjusted so as to minimize the difference between each training output image and the corresponding training original image; each time these parameters are adjusted, the adjusted training encoding weighting parameters and training decoding weighting parameters are fed back to the training encoding formula and the training decoding formula for processing the next training low-resolution image;

wherein, after a predetermined number of comparisons between the training output images and the corresponding training original images, and a predetermined number of adjustments of the training encoding weighting parameters and the training decoding weighting parameters, the finally obtained training encoding weighting parameters are applied in the AI encoding formula of the AI encoding module of the server, and the finally obtained training decoding weighting parameters are applied in the at least one mathematical formula of the AI processing module of the client device; the AI processing module of the client device can thereby perform decoding, resolution upscaling, and image-quality enhancement on the received encoded images in a single pass.
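The claim above trains an encoder network and a decoder network jointly, then splits the result: the learned encoding parameters go to the server and the learned decoding parameters to the client. A toy sketch of that split — with each "network" reduced to a single hypothetical scalar gain, which is not how the patent's networks are built — might look like:

```python
# Hypothetical sketch of the claimed joint encoder/decoder training and the
# server/client deployment split.  The encoder scales pixel values by
# enc_gain (stand-in for the server's AI encoding formula); the decoder
# inverts that scaling while replicating pixels to raise resolution
# (stand-in for the client's single-pass decode + upscale).

def downscale2x(frame):
    return [[(frame[r][c] + frame[r][c+1] + frame[r+1][c] + frame[r+1][c+1]) / 4
             for c in range(0, len(frame[0]), 2)]
            for r in range(0, len(frame), 2)]

def encode(frame, enc_gain):
    # Server side: toy "AI encoding formula".
    return [[p * enc_gain for p in row] for row in frame]

def decode_upscale(frame, dec_gain):
    # Client side: decoding and resolution-raising done in one pass.
    out = []
    for row in frame:
        wide = [p * dec_gain for p in row for _ in range(2)]
        out.extend([wide[:], wide[:]])
    return out

def train_pair(orig, enc_gain=0.8, dec_gain=0.8, lr=1e-5, epochs=300):
    # Both parameter sets are adjusted from the same output/original
    # comparison, then fed back for the next iteration, as claimed.
    n = len(orig) * len(orig[0])
    for _ in range(epochs):
        low = downscale2x(orig)
        out = decode_upscale(encode(low, enc_gain), dec_gain)
        ge = sum(2 * (o - t) * (o / enc_gain)
                 for orow, trow in zip(out, orig)
                 for o, t in zip(orow, trow)) / n
        gd = sum(2 * (o - t) * (o / dec_gain)
                 for orow, trow in zip(out, orig)
                 for o, t in zip(orow, trow)) / n
        enc_gain -= lr * ge
        dec_gain -= lr * gd
    return enc_gain, dec_gain

orig = [[100] * 4 for _ in range(4)]
enc_gain, dec_gain = train_pair(orig)
# Deployment split: the server keeps enc_gain, the client keeps dec_gain.
restored = decode_upscale(encode(downscale2x(orig), enc_gain), dec_gain)
```

Training drives the product of the two gains toward 1, so the client-side single pass approximately restores the original pixel values.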
The method of claim 2, wherein the plurality of original images are three-dimensional (3D) images, each 3D image comprising a left-eye view and a right-eye view combined side by side in a single image frame.
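The side-by-side layout referred to in this claim packs both stereo views into one frame: the left half of the frame is the left-eye view and the right half is the right-eye view, so a single stream carries both views of each pair. A small illustration (not from the patent text) of splitting such a frame back into its two views:

```python
# Illustration of the side-by-side 3D frame layout: one image frame holds
# the left-eye view in its left half and the right-eye view in its right
# half; splitting at the horizontal midpoint recovers the two views.

def split_side_by_side(frame):
    """Split a side-by-side stereo frame into (left_view, right_view)."""
    half = len(frame[0]) // 2
    left = [row[:half] for row in frame]
    right = [row[half:] for row in frame]
    return left, right

# 2x4 frame: 'L' pixels form the left-eye view, 'R' pixels the right-eye view.
frame = [["L1", "L2", "R1", "R2"],
         ["L3", "L4", "R3", "R4"]]
left, right = split_side_by_side(frame)
```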
TW110148954A 2021-12-27 2021-12-27 Method for reducing network bandwidth required for image streaming by using artificial intelligence processing module TW202326526A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW110148954A TW202326526A (en) 2021-12-27 2021-12-27 Method for reducing network bandwidth required for image streaming by using artificial intelligence processing module

Publications (1)

Publication Number Publication Date
TW202326526A true TW202326526A (en) 2023-07-01

Family

ID=88147655

Country Status (1)

Country Link
TW (1) TW202326526A (en)
