TWI771250B - Device and method for reducing data dimension, and operating method of device for converting data dimension - Google Patents
- Publication number: TWI771250B
- Application number: TW110147292A
- Authority
- TW
- Taiwan
- Prior art keywords
- data
- base
- dimension
- reduced
- weight function
- Prior art date
Description
The present invention relates to a method and a device for reducing data dimensionality, and more particularly to a data-independent method and device for reducing data dimensionality.
As cloud services continue to grow, the data that must be transmitted and stored keeps becoming larger and more complex, more data centers are being built, and the market demand for data compression keeps rising. Data dimensionality reduction can be applied to the data storage of cloud services: in the many countries that host large numbers of data centers, there is strong demand for compressing stored data at scale. It can also be applied to audio and video streaming services, which flourish worldwide and require compressed transmission of large volumes of streaming data.
Patent Publication No. CN105678265A discloses a manifold-learning-based data dimensionality reduction method in which Y_i is obtained from the formula Y_i = W_i^T X_i, where Y_i is the low-dimensional representation of X_i and W_i is derived through the maximum margin criterion and the locality preserving projection algorithm. The disadvantages of this method are that the higher the original dimensionality of the data or the larger the data set, the longer it takes to compute the matrix W_i, and that the method applies only to data lying in the space spanned by the data set, so its generality is limited.
Prior art document 1 (Nonlinear Dimensionality Reduction by Locally Linear Embedding) discloses a nonlinear dimensionality reduction method, Locally Linear Embedding (LLE), which finds a low-dimensional projection of the data that preserves the distances between each data point and its locally neighboring points. The disadvantage of LLE is that the optimal number K of neighboring points required for the computation must be tuned manually or found by cross-validation, which is time-consuming; moreover, the method likewise applies only to data within the space spanned by the data set.
Prior art document 2 (Reducing the Dimensionality of Data with Neural Networks) discloses the use of a neural-network-based autoencoder to reduce data dimensionality. Its main drawback is that the target must lie within the space generated by the training data set, so the model usually works only on data of the same type as that data set; to improve the generality of the neural network model, training on a large data set is traditionally required, which multiplies the training time.
In view of the above limitations, the present disclosure proposes a data dimensionality reduction method based on spatial bases. The space is usually a vector space in which every data point has coordinates or coefficients, and the bases span the entire vector space, so every data point in the space can be composed from this comparatively small set of bases. This disclosure converts the bases into a training data set used to train a neural-network-based autoencoder. Once trained, the autoencoder applies universally to all data in the space spanned by these bases. In this way, the dimensionality reduction model can be trained with a small amount of data while remaining highly general: the model needs to be trained only once to cover most usage scenarios, saving a great deal of training time.
The neural-network-based data dimensionality reduction technique proposed in this disclosure can also be accelerated by the chips dedicated to neural network computation found in data centers and in consumer electronics, giving the disclosure a competitive advantage over traditional techniques.
Based on the above concept, the present disclosure provides a method for reducing the dimension of image data, comprising the following steps: generating a base image data matrix, wherein the base image data matrix contains N×N base image data, each base image datum has N×N dimensions, and applying a specific coefficient to each of the N×N base image data and combining them suffices to represent an image to be processed; operating on each base image datum with a first weight function and performing residual down-sampling on it to form a reduced-dimension base image datum whose dimension is smaller than N×N; performing residual up-sampling on the reduced-dimension base image datum with a second weight function to restore it to a restored base image datum of N×N dimensions, wherein there is an error between the restored base image datum and the base image datum; and adjusting the first and second weight functions according to that error.
Based on the above concept, the present disclosure provides an operating method of a device for converting data dimensions, wherein the device comprises a first feature extraction unit and a second feature extraction unit, the method comprising the following steps: providing a base data database containing a plurality of base data, each having a first dimension; operating on each base datum according to a first weight function to form a reduced-dimension base datum having a second dimension; operating on each reduced-dimension base datum according to a second weight function to restore it to a restored base datum having the first dimension, wherein there is an error between the restored base datum and the base datum; and adjusting the first and second weight functions according to that error, wherein in the base data database each base datum corresponds to a space vector and these space vectors are mutually orthogonal.
Based on the above concept, the present disclosure provides a device for reducing data dimension, comprising a base data database and an artificial intelligence module. The base data database includes a plurality of mutually orthogonal base data, each having a first dimension. The artificial intelligence module comprises an encoding unit and a decoding unit. The encoding unit performs residual down-sampling on each base datum according to a first weight function to form a reduced-dimension base datum having a second dimension. The decoding unit performs residual up-sampling on the reduced-dimension base datum according to a second weight function to restore it to a restored base datum having the first dimension, wherein there is an error between the restored base datum and the base datum, and the artificial intelligence module adjusts the first and second weight functions according to that error.
Based on the above concept, the present disclosure provides a method for reducing data dimension, comprising the following steps: providing a base data database including a plurality of base data, each having a first dimension; operating on each base datum according to a first weight function to form a reduced-dimension base datum having a second dimension; operating on each reduced-dimension base datum according to a second weight function to restore it to a restored base datum having the first dimension; and adjusting the first and second weight functions according to an error between the restored base datum and the base datum, wherein in the base data database each base datum corresponds to a space vector and these space vectors are mutually orthogonal.
The present disclosure uses a neural network model to perform nonlinear dimensionality reduction, which can achieve lower reconstruction error and can be used on any type of image data.
The method and device for reducing data dimension disclosed herein can be applied to general-purpose data, such as databases of different kinds, and in particular to signal analysis whose raw data are complex, such as spectral images and biomedical images from computed tomography. The disclosed neural network model converges to a very low reconstruction error with little manual parameter tuning and can be used on any data set.
The present disclosure uses spatial bases as training data, so a highly general model can be obtained from only a small amount of data, greatly reducing training time.
For further description and advantages of the present disclosure, refer to the subsequent drawings and embodiments for a clearer understanding of its technical solutions.
10: device for reducing data dimension
101: base data database
101D: plurality of base image data
101D11, 101D12, 101D1N, 101D21, 101DN1, 101DN2, 101DNN: base image data
102: artificial intelligence module
102A: autoencoder model
102AD: decoding unit
102AE: encoding unit
102ADC: deconvolution computation unit
102AEC: convolution computation unit
102ADR: second activation unit
102AER: first activation unit
D1: feature data
D2: feature data processed by the activation unit
D1', D2': restored feature data
102AO: optimization layer
102CNN: convolutional neural network
FIG. 1: schematic diagram of the plurality of base image data according to a preferred embodiment of the present disclosure.
FIG. 2: schematic diagram of the device for reducing data dimension according to a preferred embodiment of the present disclosure.
FIG. 3A: distribution of test image data in the low-dimensional space according to a preferred embodiment of the present disclosure.
FIG. 3B: enlarged distribution of the image data to be processed in the low-dimensional space according to a preferred embodiment of the present disclosure.
FIG. 4: schematic diagram of reconstructed image samples according to a preferred embodiment of the present disclosure.
FIG. 5: schematic diagram of the method for reducing the dimension of image data according to a preferred embodiment of the present disclosure.
FIG. 6: schematic diagram of the operating method of the device for converting data dimensions according to a preferred embodiment of the present disclosure.
FIG. 7: schematic diagram of a method for reducing data dimension according to another preferred embodiment of the present disclosure.
Please read the following detailed description with reference to the accompanying drawings, which introduce, by way of example, various embodiments of the present disclosure and show how it may be implemented. The embodiments provide sufficient detail for those skilled in the art to practice the disclosed embodiments or embodiments derived from them. Note that the embodiments are not mutually exclusive; some may be suitably combined with one or more others to form new embodiments, i.e., the practice of the present disclosure is not limited to the embodiments disclosed below. Furthermore, for brevity and clarity, related details are not exhaustively disclosed in each embodiment; even where specific details are given, they are merely illustrative for the reader's understanding and are not intended to limit the present disclosure.
Please refer to FIG. 1, a schematic diagram of the plurality of base image data 101D according to a preferred embodiment of the present disclosure; the plurality of base image data 101D includes base image data 101D11, 101D12, ..., 101D1N, 101D21, ..., 101D2N, ..., 101DNN. Please refer to FIG. 2, a schematic diagram of the device 10 for reducing data dimension according to a preferred embodiment of the present disclosure, which comprises the base data database 101 of FIG. 1 and an artificial intelligence module 102 as shown in FIG. 2. The artificial intelligence module 102 is, for example, an autoencoder model 102A based on physical bases and comprises an encoding unit 102AE and a decoding unit 102AD. Taking dimensionality reduction of grayscale image data as an example, the goal is to reduce raw data of dimension 32×32 = 1024 down to 2 dimensions, and the artificial intelligence module 102 can restore this 2-dimensional data to the original 1024 dimensions. Data dimensionality reduction with the physically-based autoencoder model 102A has two main parts: the generation of the plurality of base image data 101D used for training, and the training of the autoencoder model 102A based on the convolutional neural network 102CNN.
In FIG. 1, for dimensionality reduction of grayscale image data, one embodiment uses the plurality of base image data 101D of the discrete cosine transform (DCT) as the base image data for training. Any image F to be processed, of size or resolution N×N, can be expressed as a linear combination of the plurality of base image data 101D:

F = Σ_{u=1..N} Σ_{v=1..N} T(u,v)·S_u,v

where T(u,v) is the coefficient applied to S_u,v. Each base image datum 101D is defined as a base image matrix S_u,v = s_u^T·s_v, where s_u and s_v are vectors of dimension 1×N, each corresponding to one of the row vectors of the N×N discrete cosine transform (DCT) matrix, 1 ≤ u ≤ N and 1 ≤ v ≤ N, and the N×N matrices S_u,v represent basis functions of different spatial frequencies.
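The construction of the DCT basis images just described can be checked numerically. The sketch below is an illustration in Python/NumPy (not part of the patent): it builds the N×N DCT matrix, forms the basis images S_u,v = s_u^T·s_v as outer products of its rows, and verifies that an arbitrary image F is recovered exactly from the linear combination F = Σ_u Σ_v T(u,v)·S_u,v with coefficients T = C·F·C^T (the 2-D DCT of F).

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix; row u holds the length-n cosine basis vector s_u."""
    c = np.array([[np.cos(np.pi * (2 * j + 1) * u / (2 * n)) for j in range(n)]
                  for u in range(n)])
    c[0] *= np.sqrt(1.0 / n)
    c[1:] *= np.sqrt(2.0 / n)
    return c

n = 8
C = dct_matrix(n)

# The N*N basis images S_{u,v} = s_u^T s_v (outer products of DCT row vectors).
bases = np.array([np.outer(C[u], C[v]) for u in range(n) for v in range(n)])

# Any n-by-n image F is a linear combination of the basis images, with
# coefficients T(u,v) given by the 2-D DCT of F: T = C F C^T.
rng = np.random.default_rng(0)
F = rng.standard_normal((n, n))
T = C @ F @ C.T
F_rec = sum(T[u, v] * np.outer(C[u], C[v]) for u in range(n) for v in range(n))
assert np.allclose(F, F_rec)   # F is recovered exactly from its basis expansion
```

With n = 8 this yields 64 basis images of size 8×8, matching the structure of the base image data matrix (N×N bases, each of dimension N×N).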
In FIG. 2, the autoencoder model 102A performs the encoding function Encoder(x) = z and the decoding function Decoder(z) = x̂, with the encoding unit 102AE carrying out the encoding function and the decoding unit 102AD carrying out the decoding function. The input training image data of the autoencoder model 102A is x and the output reconstructed image data is x̂ = Decoder(Encoder(x)); the encoded image data (code) z_k is a 2-dimensional vector, i.e., the dimensionality-reduced image data. The autoencoder model 102A is trained by minimizing the model's reconstruction error (1/M)·Σ_{k=1..M} ||x_k − Decoder(Encoder(x_k))||², where M = 1024 is the number of training data. After the autoencoder model 102A has been trained, the image F to be processed can be input to the encoding unit 102AE, which performs the encoding function Encoder(x) to reduce the dimensionality of F and produce the low-dimensional encoded image data code z. When the original information needs to be reconstructed, code z is simply input to the decoding unit 102AD, which performs the decoding function Decoder(z) to rebuild the high-dimensional information.
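As a hedged illustration of the training objective above (not the patent's convolutional architecture), the following NumPy sketch trains a purely linear autoencoder, Encoder(x) = We·x and Decoder(z) = Wd·z with a 2-dimensional code, on the flattened 8×8 DCT basis images by gradient descent on the mean reconstruction error (1/M)·Σ ||x_k − Decoder(Encoder(x_k))||². It demonstrates the loop mechanics only: a 2-dimensional linear code cannot reconstruct 64 orthonormal bases accurately, so we merely check that the error decreases as the weight functions are adjusted.

```python
import numpy as np

# Build the 8x8 orthonormal DCT matrix and the M = 64 flattened basis images
# used as the training set (each row of X is one flattened S_{u,v}).
n = 8
j = np.arange(n)
C = np.cos(np.pi * (2 * j[None, :] + 1) * j[:, None] / (2 * n)) * np.sqrt(2.0 / n)
C[0] /= np.sqrt(2.0)
X = np.stack([np.outer(C[a], C[b]).ravel() for a in range(n) for b in range(n)])

M, D, K = X.shape[0], X.shape[1], 2      # 64 samples, 64 input dims, 2-dim code z
rng = np.random.default_rng(1)
We = rng.standard_normal((K, D)) * 0.1   # Encoder(x) = We @ x
Wd = rng.standard_normal((D, K)) * 0.1   # Decoder(z) = Wd @ z

def recon_error(We, Wd):
    Z = X @ We.T                         # codes z_k
    Xh = Z @ Wd.T                        # reconstructions x_hat_k
    return np.mean(np.sum((X - Xh) ** 2, axis=1))

err0 = recon_error(We, Wd)
lr = 0.5
for _ in range(200):                     # plain gradient descent on the error
    Z = X @ We.T
    R = Z @ Wd.T - X                     # residual x_hat_k - x_k
    gWd = 2.0 / M * R.T @ Z              # dL/dWd
    gWe = 2.0 / M * (R @ Wd).T @ X       # dL/dWe
    We -= lr * gWe
    Wd -= lr * gWd
assert recon_error(We, Wd) < err0        # error shrinks as weights are adjusted
```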
In FIG. 2, besides image base data, any type of base data can serve as the input base data during the training phase, for example audio base data, and the data may come from at least one of a cloud drive, an audio/video streaming service, a computed tomography scan, audio/video compression and storage, image preprocessing, and data mining. The base data may be basis vector data, i.e., spatial basis data that forms a vector space in which every data point has coordinates or coefficients; the spatial basis data spans the vector space, and all data points in the vector space are formed from the plurality of basis vector data, so each base datum can be converted into a training data set. The plurality of base image data 101D in FIG. 1 forms a DCT matrix and may alternatively form a Hadamard matrix.
The base data database 101 may include a plurality of mutually orthogonal base data, for example the plurality of base image data 101D formed with mutually orthogonal s_u and s_v and coefficients T(u,v), and each base datum has a first dimension, for example 32×32×1 in FIG. 2, denoting one base datum of dimension 32×32; the base image data 101D11, 101D12, ..., 101D1N, 101D21, ..., 101D2N, ..., 101DNN are input in a random order. The encoding unit 102AE performs residual down-sampling on each base datum according to a first weight function to form a reduced-dimension base datum code z having a second dimension, for example 1×1×2 in FIG. 2, denoting two base data of reduced dimension 1×1. The decoding unit 102AD performs residual up-sampling on the reduced-dimension base datum code z according to a second weight function to restore it to a restored base image datum 101D11' having the first dimension, wherein there is an error err between the restored base image datum 101D11' and the base image datum 101D11, and the artificial intelligence module 102 adjusts the first and second weight functions according to the error err.
In FIG. 2, the encoding unit may be a residual down-sampling block in the convolutional neural network 102CNN, comprising a convolution computation unit 102AEC, a first activation unit 102AER, and a fully connected layer (not shown). The convolution computation unit 102AEC extracts feature data D1 from each base datum 101D according to the first weight function, and the first activation unit 102AER processes the feature data D1 according to a first nonlinear combination rule ReLU. Each base datum 101D may pass through two or more layers of the convolution computation unit 102AEC and the first activation unit 102AER before entering the fully connected layer, which converts the activated feature data D2 into the reduced-dimension base datum code z. The decoding unit 102AD may be a residual up-sampling block in the convolutional neural network 102CNN, comprising a deconvolution computation unit 102ADC, a second activation unit 102ADR, and an optimization layer 102AO. The deconvolution computation unit 102ADC restores each reduced-dimension datum code z to restored feature data D1' according to the second weight function, and the second activation unit 102ADR processes the restored feature data D1' according to a second nonlinear combination rule ReLU' to form restored feature data D2'. The reduced-dimension base datum code z may pass through two or more layers of the deconvolution computation unit 102ADC and the second activation unit 102ADR before entering the optimization layer 102AO, which minimizes the error err to reconstruct the restored feature data D2' into the restored base datum 101D11'. The optimization layer 102AO may be a residual block whose structure is the same as that of the residual down-sampling block.
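The patent does not spell out the internal wiring of the residual down-sampling block, so the following NumPy sketch is only one plausible reading (an assumption, not the disclosed implementation): the learned path is a strided 3×3 convolution, the skip path is an average-pooled copy of the input, and ReLU is applied to their sum, halving a 32×32 input to 16×16 in one down-sampling stage.

```python
import numpy as np

def avg_pool2(x):
    """2x down-sampling by average pooling (the identity/skip path)."""
    return x.reshape(x.shape[0] // 2, 2, x.shape[1] // 2, 2).mean(axis=(1, 3))

def conv3x3_stride2(x, k):
    """Naive 3x3 convolution with stride 2 and zero padding (the learned path)."""
    p = np.pad(x, 1)
    h, w = x.shape
    out = np.empty((h // 2, w // 2))
    for i in range(0, h, 2):
        for j in range(0, w, 2):
            out[i // 2, j // 2] = np.sum(p[i:i + 3, j:j + 3] * k)
    return out

def residual_down(x, k):
    # Residual down-sampling: learned path plus skip path, then ReLU activation.
    y = conv3x3_stride2(x, k) + avg_pool2(x)
    return np.maximum(y, 0.0)

rng = np.random.default_rng(2)
x = rng.standard_normal((32, 32))        # one 32x32 base datum
k = rng.standard_normal((3, 3)) * 0.1    # a hypothetical learned 3x3 kernel
y = residual_down(x, k)
assert y.shape == (16, 16)               # spatial dimension is halved
```

Stacking two such stages and a fully connected layer down to 2 values would mirror the encoder path described above; the up-sampling block would invert each step with deconvolution.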
Please refer to FIG. 3A, the distribution of test image data in the low-dimensional space according to a preferred embodiment of the present disclosure, and to FIG. 3B, an enlarged distribution of the image data to be processed in the low-dimensional space. The horizontal axis represents dimension 1 of the image data and the vertical axis represents dimension 2. Circular points denote natural images, triangular points denote handwritten-digit images, and square points denote clothing images. As FIGS. 3A and 3B show, the image data contain data sets of different kinds; data of the same kind share the same point shape, and points of the same shape scattered at different positions represent finer classifications within that kind. Once the autoencoder model 102A has been trained, any image can be input to the encoding unit 102AE, which performs the encoding function Encoder(x) to reduce its dimensionality and produce the low-dimensional base image data code z; when the original information must be reconstructed, the low-dimensional code z is simply input to the decoding unit 102AD, which performs the decoding function Decoder(z) to rebuild the high-dimensional information. In this example the autoencoder model 102A was trained for 10 epochs. Using test sets of different kinds (natural images, handwritten-digit images, and clothing images) as test data, FIG. 3 shows the distribution of certain images in the low-dimensional space; FIGS. 3A and 3B show that every image occupies an independent coordinate point in the space, indicating that the entire high-dimensional space has been converted into the low-dimensional space.
Please refer to FIG. 4, a schematic diagram of reconstructed image samples according to a preferred embodiment of the present disclosure. The mean squared error (MSE) between the test image data and the reconstructed image samples lies in the range 3.8×10^-5 to 10.8×10^-5, showing good reconstruction accuracy.
Please refer to FIG. 5, a schematic diagram of the method S10 for reducing the dimension of image data according to a preferred embodiment of the present disclosure, which comprises the following steps. Step S101: generate a base image data matrix, wherein the base image data matrix contains N×N base image data, each base image datum has N×N dimensions, and applying a specific coefficient to each of the N×N base image data and combining them suffices to represent an image to be processed. Step S102: operate on each base image datum with a first weight function and perform residual down-sampling on it to form a reduced-dimension base image datum whose dimension is smaller than N×N. Step S103: perform residual up-sampling on the reduced-dimension base image datum with a second weight function to restore it to a restored base image datum of N×N dimensions, wherein there is an error between the restored base image datum and the base image datum. Step S104: adjust the first and second weight functions according to the error.
In any embodiment of the present disclosure, each base image datum is defined as a base image matrix S_u,v = s_u^T·s_v, where s_u and s_v are vectors of dimension 1×N, each corresponding to one of the row vectors of the N×N discrete cosine transform (DCT) matrix, 1 ≤ u ≤ N and 1 ≤ v ≤ N, and the N×N matrices S_u,v represent basis functions of different spatial frequencies. The image to be processed is expressed as F = Σ_{u=1..N} Σ_{v=1..N} T(u,v)·S_u,v, where T(u,v) is the coefficient applied to S_u,v. The method further includes operating on the image to be processed with the adjusted first weight function and performing residual down-sampling on it to form reduced-dimension image data whose dimension is smaller than N×N.
In any embodiment of the present disclosure, the step of operating on each base image datum with the first weight function processes the base image data in a random order. The N row vectors of the DCT matrix are obtained by sampling the values of cosine functions of different frequencies.
Please refer to FIG. 6, a schematic diagram of the operating method S20 of the device for converting data dimensions according to a preferred embodiment of the present disclosure, wherein the device comprises a first feature extraction unit and a second feature extraction unit, the method comprising the following steps. Step S201: provide a base data database containing a plurality of base data, each having a first dimension. Step S202: operate on each base datum according to a first weight function to form a reduced-dimension base datum having a second dimension. Step S203: operate on each reduced-dimension base datum according to a second weight function to restore it to a restored base datum having the first dimension, wherein there is an error between the restored base datum and the base datum, and the first and second weight functions are adjusted according to that error; in the base data database, each base datum corresponds to a space vector, and these space vectors are mutually orthogonal.
In any embodiment of the present disclosure, the method for reducing or converting data dimensions further comprises: inputting data to be processed, expressed as F = Σ_{u=1..N} Σ_{v=1..N} T(u,v)·S_u,v, where S_u,v = s_u^T·s_v is the base image matrix, s_u and s_v each correspond to one of the row vectors of the discrete cosine transform (DCT) matrix, the N×N matrices S_u,v represent basis functions of different spatial frequencies, T(u,v) is the coefficient applied to S_u,v, and 1 ≤ u ≤ N and 1 ≤ v ≤ N; operating on the data to be processed with the adjusted first weight function to form reduced-dimension data whose dimension equals the second dimension; and operating on the reduced-dimension data with the adjusted second weight function to restore it to restored data.
In any embodiment of the present disclosure, the first weight function includes L layers of first sub-weight functions, where L ≥ 1, each first sub-weight function corresponds to a first input datum, and the first input datum of the first-layer first sub-weight function is each base datum; the second weight function includes L layers of second sub-weight functions, where L ≥ 1, each second sub-weight function corresponds to a second input datum, and the second input datum of the first-layer second sub-weight function is each reduced-dimension base datum. The step of operating on each base datum according to the first weight function comprises: extracting, according to each first sub-weight function, feature data from the input data corresponding to that first sub-weight function; and processing the feature data according to a first nonlinear combination rule, wherein the feature data serve as the input data of the next layer's first sub-weight function, and the feature data corresponding to the L-th layer's first sub-weight function, after processing by the first nonlinear combination rule, are superimposed with the base datum in a first specific manner to obtain the reduced-dimension base datum. The step of operating on each reduced-dimension base datum according to the second weight function comprises: restoring, according to each second sub-weight function, the reduced-dimension base datum corresponding to that second sub-weight function into restored feature data; processing the restored feature data according to a second nonlinear combination rule, wherein the restored feature data serve as the second input data of the next layer's second sub-weight function, and the restored feature data corresponding to the L-th layer's second sub-weight function, after processing by the second nonlinear combination rule, are superimposed with the reduced-dimension base datum in a second specific manner to obtain the restored base datum; and minimizing the error to optimize the first and second weight functions.
In any embodiment of the present disclosure, each base datum is defined as a base matrix S_u,v = s_u^T·s_v, where s_u and s_v are vectors of dimensions 1×M and 1×N, respectively, each corresponding to one of the row vectors of an M×M or N×N orthogonal transform matrix, 1 ≤ u ≤ M and 1 ≤ v ≤ N, and the M×N matrices S_u,v represent basis functions of different spatial frequencies; the orthogonal transform matrix may be a Hadamard, DFT, DCT, Haar, or Slant matrix. The encoding unit operates on the data to be processed with the adjusted first weight function and performs residual down-sampling on it to form reduced-dimension data whose dimension is smaller than M×N.
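Any of the listed orthogonal transforms can supply the basis. As a brief check (illustrative Python, not from the patent), the sketch below builds a Hadamard matrix by the Sylvester construction, forms basis images from its rows exactly as with the DCT rows, and verifies that the flattened basis images are themselves mutually orthonormal, which is the orthogonality property the base data database relies on.

```python
import numpy as np

def hadamard(n):
    """Sylvester construction of the Hadamard matrix; n must be a power of two."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

n = 8
H = hadamard(n) / np.sqrt(n)              # rows are orthonormal
assert np.allclose(H @ H.T, np.eye(n))

# Basis images S_{u,v} = s_u^T s_v built from Hadamard rows, as with DCT rows.
bases = np.array([np.outer(H[u], H[v]) for u in range(n) for v in range(n)])

# The 64 flattened basis images are mutually orthonormal vectors in R^64.
G = bases.reshape(n * n, n * n)
assert np.allclose(G @ G.T, np.eye(n * n))
```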
Please refer to FIG. 7, a schematic diagram of a method S30 for reducing data dimension according to another preferred embodiment of the present disclosure, which comprises the following steps. Step S301: provide a base data database including a plurality of base data, each having a first dimension. Step S302: operate on each base datum according to a first weight function to form a reduced-dimension base datum having a second dimension. Step S303: operate on each reduced-dimension base datum according to a second weight function to restore it to a restored base datum having the first dimension. Step S304: adjust the first and second weight functions according to an error between the restored base datum and the base datum, wherein in the base data database each base datum corresponds to a space vector and these space vectors are mutually orthogonal.
In summary, the present disclosure proposes a data dimensionality reduction method based on a spatial basis. The space is typically a vector space in which every data point has coordinates (coefficients), and a basis spans the entire space, so every data point can be composed from this comparatively small set of basis vectors. The disclosure converts the basis into a training data set used to train a neural-network-based autoencoder. Once trained, the autoencoder applies universally to all data within the space spanned by that basis. In this way, a dimensionality reduction model can be trained with a small amount of data while remaining highly general: the model needs to be trained only once to cover most usage scenarios, saving a great deal of training time. The disclosed method for reducing the dimension of image data, operating method of a device for converting data dimension, device for reducing data dimension, and method for reducing data dimension can be applied in fields that must process complex data, such as audio/video compression and storage, image pre-processing, and data mining. Conventional autoencoder-based dimensionality reduction typically requires a large data set and lengthy model training, or applies only to a narrow range of data types, so the model must be retrained for each new data type, again at great cost in time. The method proposed in this disclosure therefore saves a significant amount of time.
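The generalization claim above rests on linearity: if the dimension-reducing map is linear, the code of any point in the spanned space is the same linear combination of the basis vectors' codes, so a model fitted on the basis alone covers the whole space. The sketch below demonstrates exactly this identity; the random orthonormal basis, the sizes, and the linear encoder are illustrative assumptions (a real neural autoencoder is nonlinear, for which this identity holds only approximately).

```python
import numpy as np

rng = np.random.default_rng(1)

dim, reduced_dim = 6, 3               # illustrative sizes, not from the patent

# An orthonormal basis spanning the data space (random orthogonal via QR;
# the disclosure does not prescribe a particular basis).
basis, _ = np.linalg.qr(rng.normal(size=(dim, dim)))   # columns = basis vectors

# A linear encoder standing in for the trained first weight function.
W_enc = rng.normal(size=(reduced_dim, dim))

def encode(v):
    """Apply the (linear) dimension-reducing map."""
    return W_enc @ v

# Any point of the spanned space is a combination of the basis vectors...
x = basis @ rng.normal(size=dim)

# ...and by linearity its code is the same combination of the basis
# vectors' codes: encoding the basis once suffices for every such point.
codes_of_basis = W_enc @ basis        # code of each basis vector (columns)
coeffs = basis.T @ x                  # coordinates of x in the basis
assert np.allclose(encode(x), codes_of_basis @ coeffs)
```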
Although the present invention is disclosed above by way of the foregoing embodiments and examples, they are not intended to limit the present invention. Those skilled in the art may make modifications and refinements without departing from the spirit and scope of the present invention; the scope of protection of the present invention shall therefore be defined by the appended claims.
S301~S304: steps of method S30
S30: method for reducing data dimension
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
TW110147292A TWI771250B (en) | 2021-12-16 | 2021-12-16 | Device and method for reducing data dimension, and operating method of device for converting data dimension |
Publications (2)
Publication Number | Publication Date |
---|---|
TWI771250B true TWI771250B (en) | 2022-07-11 |
TW202326595A TW202326595A (en) | 2023-07-01 |
Family
ID=83439521
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
TW110147292A TWI771250B (en) | 2021-12-16 | 2021-12-16 | Device and method for reducing data dimension, and operating method of device for converting data dimension |
Country Status (1)
Country | Link |
---|---|
TW (1) | TWI771250B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TWI419059B (en) * | 2010-06-14 | 2013-12-11 | Ind Tech Res Inst | Method and system for example-based face hallucination |
TWI467498B (en) * | 2011-12-19 | 2015-01-01 | Ind Tech Res Inst | Method and system for recognizing images |
CN109376591A (en) * | 2018-09-10 | 2019-02-22 | 武汉大学 | The ship object detection method of deep learning feature and visual signature joint training |
US10388044B2 (en) * | 2008-06-20 | 2019-08-20 | New Bis Safe Luxco S.À R.L | Dimension reducing visual representation method |
CN113343900A (en) * | 2021-06-28 | 2021-09-03 | 中国电子科技集团公司第二十八研究所 | Combined nuclear remote sensing image target detection method based on combination of CNN and superpixel |
CN113723255A (en) * | 2021-08-24 | 2021-11-30 | 中国地质大学(武汉) | Hyperspectral image classification method and storage medium |