TWI723547B - Style transfer method and computer program product thereof - Google Patents
- Publication number
- TWI723547B (application TW108133761A)
- Authority
- TW
- Taiwan
- Prior art keywords
- feature
- neural network
- map
- original image
- style
- Prior art date
Landscapes
- Image Analysis (AREA)
Abstract
Description
The present invention relates to a style transfer method, and more particularly to a fast style transfer method applied to a web interface, and to a computer program product thereof.
Style transfer is a technique that uses artificial intelligence (AI) visual algorithms to extract the content of a content image and the style of a style image, and then synthesize the two into a single result.
Web browsing is currently the most widely used medium in society. To make style transfer applications visible to more people, the first priority is enabling a large style transfer algorithm to produce its final synthesized result quickly in a web page. This task is all the more difficult now, when the various hardware AI accelerators have not yet taken web applications into account.
A style transfer method according to an embodiment of the present invention, applied in a web interface, includes: obtaining a first convolutional neural network (CNN) model, wherein the first CNN model is trained to extract features from a content image and includes a plurality of convolution layers, the convolution layers including a first specific number of feature filters and a second specific number of feature filters; reducing the first specific number of feature filters to one-eighth of its original count; reducing the second specific number of feature filters to one-eighth of its original count; inputting the content image into the first CNN model; and performing a convolution operation through the first CNN model to generate a plurality of first feature maps of the content image. Each feature filter extracts a different feature of the content image.
The style transfer method as described above further includes: obtaining a second convolutional neural network model; inputting the plurality of first feature maps of the content image and a style image into the second CNN model; extracting, through the convolution operation, the spatial feature maps associated with the content image from the plurality of first feature maps, and the non-spatial feature maps associated with the style image from the style image; merging, through a loss function operation, the spatial feature maps associated with the content image with the non-spatial feature maps associated with the style image to obtain a synthesized image after style transfer; and displaying the synthesized image in the web interface.
In the style transfer method as described above, the first CNN model convolves the content image with the first specific number and the second specific number of feature filters to extract and generate the plurality of first feature maps of the content image.
In the style transfer method as described above, the spatial feature maps include shape feature maps and boundary feature maps, and the non-spatial feature maps include color feature maps and texture feature maps.
In the style transfer method as described above, the first specific number is 32 and the second specific number is 64.
In the style transfer method as described above, the second convolutional neural network model is VGG19.
In the style transfer method as described above, the loss function operation includes: computing a first feature difference between the spatial feature maps associated with the content image and the synthesized image; computing a second feature difference between the non-spatial feature maps associated with the style image and the synthesized image; summing the first feature difference and the second feature difference to obtain the loss function; and minimizing the loss function through a gradient descent method to obtain the synthesized image.
A computer program product according to an embodiment of the present invention is applied in a web interface to perform style transfer on a content image and a style image. The computer program product is loaded by a computer to execute: a first call instruction, a filter setting instruction, a first input instruction, and a first feature extraction instruction. The first call instruction causes a processor of the computer to obtain a first convolutional neural network model from the computer's storage; the first CNN model is trained to extract the features of the content image and includes a plurality of convolution layers, the convolution layers including a first specific number of feature filters and a second specific number of feature filters. The filter setting instruction causes the processor to reduce the first specific number of feature filters to one-eighth of its original count, and to reduce the second specific number of feature filters to one-eighth of its original count. The first input instruction causes the processor to input the content image into the first CNN model. The first feature extraction instruction causes the processor to perform a convolution operation through the first CNN model to generate a plurality of first feature maps of the content image. Each feature filter extracts a different feature of the content image.
The computer program product as described above further includes: a second call instruction, a second input instruction, a second feature extraction instruction, a synthesis instruction, and a display instruction. The second call instruction causes the processor to obtain a second convolutional neural network model from the computer's storage. The second input instruction causes the processor to input the plurality of first feature maps of the content image and the style image into the second CNN model. The second feature extraction instruction causes the processor, through the convolution operation, to extract from the plurality of first feature maps the spatial feature maps associated with the content image, and to extract from the style image the non-spatial feature maps associated with the style image. The synthesis instruction causes the processor to merge, through a loss function operation, the spatial feature maps associated with the content image with the non-spatial feature maps associated with the style image to obtain a synthesized image after style transfer. The display instruction causes the processor to display the synthesized image in the web interface.
In the computer program product as described above, the first CNN model convolves the content image with the first specific number and the second specific number of feature filters to extract and generate the plurality of first feature maps of the content image.
In the computer program product as described above, the spatial feature maps include shape feature maps and boundary feature maps, and the non-spatial feature maps include color feature maps and texture feature maps.
In the computer program product as described above, the first specific number is 32 and the second specific number is 64.
In the computer program product as described above, the second convolutional neural network model is VGG19.
In the computer program product as described above, the loss function operation includes: computing a first feature difference between the spatial feature maps associated with the content image and the synthesized image; computing a second feature difference between the non-spatial feature maps associated with the style image and the synthesized image; summing the first feature difference and the second feature difference to obtain the loss function; and minimizing the loss function through a gradient descent method to obtain the synthesized image.
The style transfer method of the present invention is applied in a web interface. The web interface uses, for example, the Web Graphics Library (WebGL) to render interactive 3D or 2D graphics in any compatible web browser. In this embodiment, the user can upload a content image and a style image to the web interface, or upload the content image and select a style image provided by the web page instead. The style transfer method of the present invention then extracts, through convolution operations, the content from the content image and the style from the style image, synthesizes the two, and finally outputs a synthesized image in the web interface, achieving the effect of style transfer.
FIG. 1 is a schematic diagram of the convolution operation of an embodiment of the disclosure. As shown in FIG. 1, an input image 100 (for example the content image, represented here as a 7*7 matrix) is convolved with a feature filter 102 (represented here as a 3*3 matrix) to obtain a feature map 104 (a 5*5 matrix after the convolution operation). The more similar a feature in the input image 100 is to the feature filter 102, the larger the corresponding convolution value in the resulting feature map 104.
For example, as shown in FIG. 1, a partial feature 110 of the input image 100 (represented as a 3*3 matrix) is convolved with the feature filter 102. Because the partial feature 110 is identical to the feature filter 102, the convolution operation yields its corresponding convolution value of "4" in the feature map 104. By contrast, a partial feature 112 of the input image 100 is completely different from the feature filter 102, so the convolution operation yields its corresponding convolution value of "0" in the feature map 104. Through the convolution operation, the feature filter 102 thus extracts from the input image 100 the features most similar to the filter itself, producing the corresponding feature map 104. The values in the matrices of the input image 100, the feature filter 102, and the feature map 104 in FIG. 1 are merely illustrative and do not limit the invention.
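The multiply-and-sum sliding-window computation described above can be written in a few lines of Python. The following is a minimal sketch for illustration only; the example matrices are assumptions chosen to reproduce the values "4" and "0" of FIG. 1, not data from the disclosure.

import numpy as np

def conv2d_valid(image, kernel):
    # "Valid" 2D cross-correlation: slide the kernel over the image,
    # multiply element-wise, and sum. A 7*7 input and a 3*3 kernel
    # yield a 5*5 feature map, as in FIG. 1.
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(image[y:y + kh, x:x + kw] * kernel)
    return out

# Hypothetical binary filter with four 1s; the top-left patch of the input
# is made identical to it, so that position convolves to 4, while an
# all-zero patch convolves to 0.
kernel = np.array([[1, 0, 1],
                   [0, 0, 0],
                   [1, 0, 1]])
image = np.zeros((7, 7))
image[0:3, 0:3] = kernel
feature_map = conv2d_valid(image, kernel)
print(feature_map[0, 0], feature_map[4, 4])  # 4.0 0.0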
The style transfer method of the present invention extracts the features of the content image using a trained first convolutional neural network (CNN) model. Generally speaking, the first CNN model is an algorithm. Taking Python syntax as an example, the algorithm includes a plurality of convolution function calls, such as conv1 = _conv_layer(image, 32, 9, 1), where _conv_layer is the convolution function; depending on the application, it can be configured with a different input image, feature filter number (e.g. 32), feature filter size (e.g. 9), and convolution stride (e.g. 1).
FIG. 2 is a schematic diagram of a first convolutional neural network model 200 of an embodiment of the disclosure. As shown in FIG. 2, the trained first CNN model 200 includes a plurality of convolution layers (for example convolution layers 210, 212, 214, and 216) and a plurality of residual blocks (for example residual blocks 218, 220, 222, 224, 226, and 228). Each convolution layer and each residual block of the first CNN model 200 corresponds to a different convolution function call in its algorithm.
For example, the convolution layer 210 corresponds to the convolution function call conv1 = _conv_layer(image, 32, 9, 1), and the residual block 218 corresponds to the convolution function call conv3 = _conv_layer(conv2, 128, 3, 2). Therefore, through the configuration of these convolution function calls (for example, presetting the number or type of feature filters), the trained first CNN model 200 has a plurality of preset feature filters for extracting different features of the input content image.
For example, when the content image is input to the convolution layer 210, the content image is convolved with each of the 32 different feature filters in the convolution layer 210 (such as the feature filter 102 of FIG. 1) to capture 32 different features of the content image (such as shapes, boundaries, colors, textures, and so on), producing 32 different feature maps (such as the feature map 104 of FIG. 1). Similarly, the 32 feature maps produced by the convolution layer 210 are input in turn to the convolution layer 212, where the 64 different feature filters again capture features, producing 64 different feature maps from each input feature map.
Simply put, the content image is input to the convolution layer 210 of the first CNN model 200, passes in order through the convolution operations of the convolution layer 212, the residual blocks 218 to 228, the convolution layer 214, and the convolution layer 216, and the plurality of first feature maps of the content image are output from the convolution layer 216. In this embodiment, each feature filter in the convolution layers (convolution layers 210 to 216) performs one convolution operation on its input, while each feature filter in the residual blocks (residual blocks 218 to 228) performs two convolution operations on its input.
In this embodiment, the first convolutional neural network model 200 can, for example, be the following algorithm (programmed in Python syntax as an example):
def net(image):
    # Downsampling convolution layers: (input, filter number, filter size, stride)
    conv1 = _conv_layer(image, 32, 9, 1)
    conv2 = _conv_layer(conv1, 64, 3, 2)
    conv3 = _conv_layer(conv2, 128, 3, 2)
    # Five residual blocks, each carrying 128 feature filters of size 3
    resid1 = residual_block(conv3, 3)
    resid2 = residual_block(resid1, 3)
    resid3 = residual_block(resid2, 3)
    resid4 = residual_block(resid3, 3)
    resid5 = residual_block(resid4, 3)
    # Upsampling (transposed-convolution) layers back toward the image resolution
    conv_t1 = conv_transpose_layer(resid5, 64, 3, 2)
    conv_t2 = conv_transpose_layer(conv_t1, 32, 3, 2)
    …
The line conv1 = _conv_layer(image, 32, 9, 1) of the algorithm inputs the content image into the convolution layer 210, which has 32 feature filters.
The line conv2 = _conv_layer(conv1, 64, 3, 2) inputs the output feature maps computed by the convolution layer 210 into the convolution layer 212, which has 64 feature filters.
The line conv3 = _conv_layer(conv2, 128, 3, 2) inputs the output feature maps computed by the convolution layer 212 into the residual block 218, which has 128 feature filters.
The line resid1 = residual_block(conv3, 3) inputs the output feature maps computed by the residual block 218 into the residual block 220, which also has 128 feature filters.
The line resid2 = residual_block(resid1, 3) inputs the feature maps computed by the residual block 220 into the residual block 222, which also has 128 feature filters.
The line resid3 = residual_block(resid2, 3) inputs the feature maps computed by the residual block 222 into the residual block 224, which also has 128 feature filters.
The line resid4 = residual_block(resid3, 3) inputs the feature maps computed by the residual block 224 into the residual block 226, which also has 128 feature filters.
The line resid5 = residual_block(resid4, 3) inputs the feature maps computed by the residual block 226 into the residual block 228, which also has 128 feature filters.
The line conv_t1 = conv_transpose_layer(resid5, 64, 3, 2) inputs the feature maps computed by the residual block 228 into the convolution layer 214, which has 64 feature filters.
The line conv_t2 = conv_transpose_layer(conv_t1, 32, 3, 2) inputs the feature maps computed by the convolution layer 214 into the convolution layer 216, which has 32 feature filters.
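The listing above only calls _conv_layer, residual_block, and conv_transpose_layer; the disclosure does not give their bodies. Below is a minimal TensorFlow sketch of what such helpers might look like. The weight initialization, ReLU placement, and "SAME" padding are assumptions for illustration, not the patented implementation.

import tensorflow as tf

def _conv_layer(net, num_filters, filter_size, stride):
    # One convolution: num_filters feature filters of size
    # filter_size x filter_size, followed by a ReLU non-linearity.
    in_channels = int(net.shape[-1])
    weights = tf.Variable(tf.random.truncated_normal(
        [filter_size, filter_size, in_channels, num_filters], stddev=0.1))
    net = tf.nn.conv2d(net, weights, strides=[1, stride, stride, 1], padding='SAME')
    return tf.nn.relu(net)

def residual_block(net, filter_size):
    # Two convolutions plus a skip connection; the filter count
    # (128 in model 200) is inherited from the input's channel dimension,
    # which is why each filter here convolves its input twice.
    num_filters = int(net.shape[-1])
    tmp = _conv_layer(net, num_filters, filter_size, 1)
    return net + _conv_layer(tmp, num_filters, filter_size, 1)

def conv_transpose_layer(net, num_filters, filter_size, stride):
    # Upsampling counterpart of _conv_layer using a transposed convolution.
    in_channels = int(net.shape[-1])
    h, w = int(net.shape[1]), int(net.shape[2])
    batch = tf.shape(net)[0]
    weights = tf.Variable(tf.random.truncated_normal(
        [filter_size, filter_size, num_filters, in_channels], stddev=0.1))
    out_shape = tf.stack([batch, h * stride, w * stride, num_filters])
    net = tf.nn.conv2d_transpose(net, weights, out_shape,
                                 strides=[1, stride, stride, 1], padding='SAME')
    return tf.nn.relu(net)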
FIGS. 3A and 3B are flowcharts of the style transfer method of an embodiment of the disclosure. As shown in FIG. 3A, in step S300 the style transfer method of the present invention first obtains a first convolutional neural network model (for example, the first CNN model 200 of FIG. 2). Next, the method executes step S302, reducing a first specific number of feature filters to one-eighth of its original count, and in step S304 reduces a second specific number of feature filters to one-eighth of its original count.
FIG. 4 is a schematic diagram of a first convolutional neural network model 400 of an embodiment of the disclosure. In this embodiment, the style transfer method of the present invention reduces the 32 feature filters in the convolution layers 210 and 216 of the first CNN model 200 to 4 feature filters, correspondingly producing the convolution layers 410 and 416 of the first CNN model 400. The method likewise reduces the 64 feature filters in the convolution layers 212 and 214 of the first CNN model 200 to 8 feature filters, correspondingly producing the convolution layers 412 and 414 of the first CNN model 400.
Furthermore, the method reduces the 128 feature filters in each of the residual blocks 218, 220, 222, 224, 226, and 228 of the first CNN model 200 to 16 feature filters, correspondingly producing the residual blocks 418, 420, 422, 424, 426, and 428 of the first CNN model 400. As a result, the file size of the first CNN model (a TensorFlow file) is reduced from the original 6580 KB for model 200 to 148 KB for model 400.
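Assuming the same helper signatures as the listing above, the reduced model 400 could be defined as in the following sketch; only the filter numbers change (32 to 4, 64 to 8, 128 to 16), which is what shrinks the stored model file.

def net_reduced(image):
    # Same topology as model 200; every filter count is cut to one-eighth.
    conv1 = _conv_layer(image, 4, 9, 1)    # convolution layer 410 (was 32)
    conv2 = _conv_layer(conv1, 8, 3, 2)    # convolution layer 412 (was 64)
    conv3 = _conv_layer(conv2, 16, 3, 2)   # residual block 418 (was 128)
    resid1 = residual_block(conv3, 3)      # residual block 420
    resid2 = residual_block(resid1, 3)     # residual block 422
    resid3 = residual_block(resid2, 3)     # residual block 424
    resid4 = residual_block(resid3, 3)     # residual block 426
    resid5 = residual_block(resid4, 3)     # residual block 428
    conv_t1 = conv_transpose_layer(resid5, 8, 3, 2)   # convolution layer 414 (was 64)
    conv_t2 = conv_transpose_layer(conv_t1, 4, 3, 2)  # convolution layer 416 (was 32)
    return conv_t2  # any final output layer is elided in the original listing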
In step S306, the style transfer method of the present invention inputs the content image into the first CNN model 400. Next, in step S308, the method performs a convolution operation through the first CNN model 400 to generate the plurality of first feature maps of the content image. For example, the 4 feature filters in the convolution layer 410 of the first CNN model 400 capture 4 different features of the content image, and the 8 feature filters in the convolution layer 412 capture 8 different features of the content image.
After obtaining the plurality of first feature maps of the content image, the style transfer method of the present invention executes step S310 to obtain a second convolutional neural network model. In this embodiment, the second CNN model is also an algorithm, for example Visual Geometry Group 19 (VGG19); it likewise has a plurality of convolution layers and, through the principles of image recognition, classifies the images or pictures input to it.
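As a sketch of how a pretrained VGG19 might serve as the second, fixed feature extractor (the disclosure names VGG19 but neither a framework nor specific layers, so the Keras API and the layer choices below are assumptions):

import tensorflow as tf

# Pretrained VGG19 without its classifier head; used only to extract features.
vgg = tf.keras.applications.VGG19(include_top=False, weights='imagenet')
vgg.trainable = False

# Hypothetical layer selection: a deep layer for spatial (content) features,
# several shallow-to-deep layers for non-spatial (style) features.
content_layers = ['block4_conv2']
style_layers = ['block1_conv1', 'block2_conv1', 'block3_conv1', 'block4_conv1']
outputs = [vgg.get_layer(name).output for name in content_layers + style_layers]
feature_extractor = tf.keras.Model(vgg.input, outputs)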
In step S312, the style transfer method of the present invention inputs the plurality of first feature maps of the content image and the style image into the second CNN model, and in step S314 the second CNN model, through the convolution operation, extracts from the first feature maps the spatial feature maps associated with the content image (such as shape feature maps and boundary feature maps), and extracts from the style image the non-spatial feature maps associated with the style image (such as color feature maps and texture feature maps). The principle by which the second CNN model performs the convolution operation is the same as for the first CNN model 200 of FIG. 2 and the first CNN model 400 of FIG. 4, and is not repeated here.
Next, in step S316, the style transfer method of the present invention merges, through a loss function operation, the spatial feature maps associated with the content image with the non-spatial feature maps associated with the style image to obtain a synthesized image after style transfer. Finally, in step S318, the method displays the synthesized image in the web interface.
In step S316, the loss function operation performed by the style transfer method of the present invention includes: computing a first feature difference between the spatial feature maps associated with the content image and the synthesized image; computing a second feature difference between the non-spatial feature maps associated with the style image and the synthesized image; summing the first feature difference and the second feature difference to obtain the loss function; and minimizing the loss function through a gradient descent method to obtain the synthesized image. The gradient descent method is a first-order optimization algorithm; it finds a local minimum of the loss function, thereby minimizing it.
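A minimal sketch of such a loss function operation, assuming squared-error feature differences and Gram matrices as the non-spatial statistic (a common choice in the style transfer literature that the disclosure does not itself specify):

import tensorflow as tf

def gram_matrix(features):
    # Non-spatial statistic: channel-to-channel correlations with the
    # spatial dimensions summed out, so shape and position information is lost.
    result = tf.einsum('bijc,bijd->bcd', features, features)
    num_positions = tf.cast(tf.shape(features)[1] * tf.shape(features)[2], tf.float32)
    return result / num_positions

def total_loss(content_feats, synth_content_feats, style_feats, synth_style_feats):
    # First feature difference: spatial (content) features vs. the synthesized image.
    first_gap = tf.add_n([tf.reduce_mean(tf.square(c - s))
                          for c, s in zip(content_feats, synth_content_feats)])
    # Second feature difference: non-spatial (style) statistics vs. the synthesized image.
    second_gap = tf.add_n([tf.reduce_mean(tf.square(gram_matrix(a) - gram_matrix(b)))
                           for a, b in zip(style_feats, synth_style_feats)])
    # Summing the two differences gives the loss that gradient descent minimizes,
    # e.g. with tf.keras.optimizers.SGD and tf.GradientTape over the synthesized image.
    return first_gap + second_gap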
In this embodiment, when the resolution of the input content image or style image is 256*256 and the first CNN model 200 of FIG. 2 is used together with the second CNN model to perform style transfer (that is, executing steps S300, S306, S308, S310, S312, S314, S316, and S318 of FIGS. 3A and 3B, but not steps S302 and S304), the processing time for performing style transfer in the web interface is 1710 microseconds per image. In other words, from the moment the user uploads the content image and the style image to the web interface until the style transfer method of the present invention displays the final synthesized image in the web interface, the style transfer processing takes 1710 microseconds.
In another embodiment, when the resolution of the input content image or style image is 256*256 and the first CNN model 400 of FIG. 4 is used together with the second CNN model to perform style transfer (that is, executing all steps of FIGS. 3A and 3B, including steps S302 and S304), the processing time for performing style transfer in the web interface is only 130 microseconds per image. In other words, by reducing the number of feature filters in the first CNN model 200 to form the first CNN model 400, the style transfer method of the present invention lowers the total execution time of style transfer.
In another embodiment, when the resolution of the input content image or style image is 480*480 and the first CNN model 400 of FIG. 4 is used together with the second CNN model to perform style transfer (executing all steps of FIGS. 3A and 3B, including steps S302 and S304), the processing time is only 270 microseconds per image, still far faster than the 1710 microseconds per image taken by the first CNN model 200 together with the second CNN model on a 256*256 input.
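The disclosure does not describe its timing setup; a hypothetical way to measure the per-image processing time quoted above would be a simple wall-clock average:

import time

def microseconds_per_image(transfer_fn, images):
    # Average end-to-end wall-clock time per image, in microseconds,
    # for any callable that performs the full style transfer.
    start = time.perf_counter()
    for image in images:
        transfer_fn(image)
    return (time.perf_counter() - start) / len(images) * 1e6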
Although reducing the number of feature filters also reduces the number of feature points extracted from the content image or the style image, degrading the quality of the synthesized image after style transfer, the style transfer method of the present invention deliberately sacrifices a little synthesis quality: provided the user finds the visual result acceptable, the processing time of style transfer is effectively reduced, improving the user experience of performing style transfer in the web interface.
The present invention further discloses a computer program product, applied in a web interface, for performing style transfer on a content image and a style image. The computer program product is loaded by a computer to execute: a first call instruction, a filter setting instruction, a first input instruction, a first feature extraction instruction, a second call instruction, a second input instruction, a second feature extraction instruction, a synthesis instruction, and a display instruction. The first call instruction causes a processor of the computer to execute step S300 of FIG. 3A. The filter setting instruction causes the processor to execute steps S302 and S304 of FIG. 3A. The first input instruction causes the processor to execute step S306 of FIG. 3A. The first feature extraction instruction causes the processor to execute step S308 of FIG. 3A.
According to the computer program product disclosed by the present invention, the second call instruction causes the processor to execute step S310 of FIG. 3B. The second input instruction causes the processor to execute step S312 of FIG. 3B. The second feature extraction instruction causes the processor to execute step S314 of FIG. 3B. The synthesis instruction causes the processor to execute step S316 of FIG. 3B. Finally, the display instruction causes the processor to execute step S318 of FIG. 3B.
When the computer program product disclosed by the present invention executes the synthesis instruction, the processor performs a loss function operation that includes: computing a first feature difference between the spatial feature maps associated with the content image and the synthesized image; computing a second feature difference between the non-spatial feature maps associated with the style image and the synthesized image; summing the first feature difference and the second feature difference to obtain the loss function; and minimizing the loss function through a gradient descent method to obtain the synthesized image.
The style transfer method and computer program product disclosed by the present invention allow the style transfer function to be realized on low-end hardware devices through a web interface, so that users can smoothly experience the appeal of style transfer even on low-end hardware.
Although embodiments of the present invention are described above, it should be understood that they are presented as examples and not as limitations. Many changes to the above exemplary embodiments can be made without departing from the spirit and scope of the invention. Therefore, the breadth and scope of the present invention should not be limited by the embodiments described above; rather, the scope of the present invention should be defined by the following claims and their equivalents.
100 ~ input image
102 ~ feature filter
104 ~ feature map
110, 112 ~ partial features
200, 400 ~ first convolutional neural network models
210, 212, 214, 216 ~ convolution layers
410, 412, 414, 416 ~ convolution layers
218, 220, 222 ~ residual blocks
224, 226, 228 ~ residual blocks
418, 420, 422 ~ residual blocks
424, 426, 428 ~ residual blocks
S300, S302, S304, S306, S308 ~ steps
S310, S312, S314, S316, S318 ~ steps
FIG. 1 is a schematic diagram of the convolution operation of an embodiment of the disclosure.
FIG. 2 is a schematic diagram of a first convolutional neural network model 200 of an embodiment of the disclosure.
FIG. 3A is a flowchart of the style transfer method of an embodiment of the disclosure.
FIG. 3B is a flowchart of the style transfer method of an embodiment of the disclosure.
FIG. 4 is a schematic diagram of a first convolutional neural network model 400 of an embodiment of the disclosure.
Claims (12)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
TW108133761A TWI723547B (en) | 2019-09-19 | 2019-09-19 | Style transfer method and computer program product thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
TW202113687A TW202113687A (en) | 2021-04-01 |
TWI723547B (en) | 2021-04-01
Family
ID=76604234
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107180410A (en) * | 2017-04-11 | 2017-09-19 | 中国农业大学 | The stylized method for reconstructing and device of a kind of image |
US20180158224A1 (en) * | 2015-07-31 | 2018-06-07 | Eberhard Karls Universitaet Tuebingen | Method and device for image synthesis |
TWM569845U (en) * | 2018-07-12 | 2018-11-11 | 卓峰智慧生態有限公司 | Leather detection equipment and leather product production system based on artificial intelligence |
CN109933982A (en) * | 2017-12-15 | 2019-06-25 | 英特尔公司 | Use the malware detection and classification of artificial neural network |
US20190252073A1 (en) * | 2018-02-12 | 2019-08-15 | Ai.Skopy, Inc. | System and method for diagnosing gastrointestinal neoplasm |