TW202001692A - Framebuffer-less system and method of convolutional neural network - Google Patents
- Publication number
- TW202001692A (application TW107122430A)
- Authority
- TW
- Taiwan
- Prior art keywords
- neural network
- convolutional neural
- interest
- item
- region
- Prior art date
Landscapes
- Image Analysis (AREA)
Abstract
Description
The present invention relates to convolutional neural networks (CNNs), and in particular to a framebuffer-less convolutional neural network system.
A convolutional neural network (CNN) is a type of artificial neural network that can be used for machine learning. CNNs can be applied to signal processing, such as image processing and computer vision.
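FIG. 1 shows a block diagram of a conventional convolutional neural network 900, disclosed in Li Du et al., "A Reconfigurable Streaming Deep Convolutional Neural Network Accelerator for Internet of Things," IEEE Transactions on Circuits and Systems I: Regular Papers, August 2017, the content of which is regarded as part of this specification. The CNN 900 includes a buffer bank 91, comprising single-port static random-access memory (SRAM), which stores intermediate data and exchanges data with a frame buffer 92. The frame buffer 92 comprises dynamic random-access memory (DRAM), such as double data rate synchronous DRAM (DDR DRAM), and stores an entire image frame for CNN operation. The buffer bank 91 is divided into two parts: an input layer and an output layer. The CNN 900 includes a column buffer 93 that remaps the output of the buffer bank 91 to a convolution unit (CU) engine array 94. The CU engine array 94 includes a plurality of convolution units that perform highly parallel convolution operations. The CU engine array 94 includes a pre-fetch controller 941 that periodically fetches parameters from a direct memory access (DMA) controller (not shown) and updates the weights and bias values of the CU engine array 94. The CNN 900 further includes an accumulation buffer 95 with scratchpad memory, which stores partial convolution results from the CU engine array 94. The accumulation buffer 95 includes a max pool 951 that pools the output-layer data. The CNN 900 includes an instruction decoder 96 that stores commands pre-stored in the frame buffer 92.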
In a conventional CNN system such as the one shown in FIG. 1, the frame buffer comprises dynamic random-access memory (DRAM), such as double data rate synchronous DRAM (DDR DRAM), and stores an entire image frame for CNN operation. For example, an image frame with a resolution of 320x240 requires a frame buffer of 320x240x8 bits. However, DDR DRAM is not suitable for low-power applications such as wearable or Internet of Things (IoT) devices. A novel CNN system suited to low-power applications is therefore needed.
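The frame-buffer figure quoted above can be checked with a few lines of arithmetic. This is only an illustrative sketch; the constant names are hypothetical, and 8 bits per pixel is taken from the 320x240x8 example.

```python
# Frame-buffer size for the 320x240 example at 8 bits per pixel.
WIDTH, HEIGHT, BITS_PER_PIXEL = 320, 240, 8

frame_bits = WIDTH * HEIGHT * BITS_PER_PIXEL  # total bits for one frame
frame_bytes = frame_bits // 8                 # same size expressed in bytes

print(frame_bits)   # 614400
print(frame_bytes)  # 76800
```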
In view of the foregoing, one object of the embodiments of the present invention is to provide a framebuffer-less convolutional neural network system. The embodiment can use a simple system architecture to perform CNN operations on high-resolution image frames.
According to an embodiment of the present invention, a framebuffer-less CNN system includes a region of interest (ROI) unit, a CNN unit, and a tracking unit. The ROI unit extracts features and, based on them, generates a region of interest of an input image frame. The CNN unit processes the region of interest of the input image frame to detect an object. The tracking unit compares features extracted at different times, so that the CNN unit selectively processes the input image frame accordingly.
FIG. 2A shows a block diagram of a framebuffer-less convolutional neural network (CNN) system 100 according to an embodiment of the present invention, and FIG. 2B shows a flow diagram of a framebuffer-less CNN method 200 according to the embodiment.
In this embodiment, the framebuffer-less CNN system (hereinafter "the system") 100 may include a region of interest (ROI) unit 11 that generates a region of interest in an input image frame (step 21). Because the system 100 contains no frame buffer, the ROI unit 11 may adopt a scan-line-based technique and a block-based scheme to find the region of interest in the input image frame. The input image frame is divided into a plurality of image blocks arranged in matrix form, for example 4x6 image blocks.
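One way to picture the scan-line-based, block-based scheme is to map each incoming pixel to its block while keeping only per-block state, never the full frame. The sketch below is an assumption about how such a mapping could look; the function names and the tiny 12x8 frame are hypothetical and not taken from the specification.

```python
def block_of(x, y, width, height, grid_rows, grid_cols):
    """Return the (row, col) of the block that contains pixel (x, y)."""
    return (y * grid_rows // height, x * grid_cols // width)


def count_pixels_per_block(scan_lines, width, height, grid_rows=4, grid_cols=6):
    """Consume scan lines one at a time; only per-block counters are kept."""
    counts = [[0] * grid_cols for _ in range(grid_rows)]
    for y, line in enumerate(scan_lines):
        for x, _pixel in enumerate(line):
            r, c = block_of(x, y, width, height, grid_rows, grid_cols)
            counts[r][c] += 1
    return counts


# Hypothetical 12x8 frame delivered as a generator of scan lines (no frame stored).
lines = ([0] * 12 for _ in range(8))
counts = count_pixels_per_block(lines, width=12, height=8)
```

With a 4x6 grid over a 12x8 frame, each block covers 2x2 pixels, so every counter ends at 4.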
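In this embodiment, the ROI unit 11 generates block-based features, according to which it is decided whether a CNN operation is performed on each image block. FIG. 3 shows a detailed block diagram of the ROI unit 11 of FIG. 2A. In this embodiment, the ROI unit 11 may include a feature extractor 111, for example to extract shallow features from the input image frame. In one example, the feature extractor 111 generates the (shallow) features of a block from a block-based histogram. In another example, the feature extractor 111 generates the (shallow) features of a block from frequency analysis.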
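As a rough illustration of the histogram-based example, a coarse intensity histogram can be accumulated per block directly from scan lines. The bin count, the function name, and the test frame are hypothetical choices made for this sketch; the specification does not state them.

```python
def block_histograms(scan_lines, width, height, grid_rows=4, grid_cols=6, bins=4):
    """Accumulate a coarse intensity histogram per block from 8-bit scan lines."""
    hists = [[[0] * bins for _ in range(grid_cols)] for _ in range(grid_rows)]
    for y, line in enumerate(scan_lines):
        r = y * grid_rows // height          # block row of this scan line
        for x, pixel in enumerate(line):
            c = x * grid_cols // width       # block column of this pixel
            hists[r][c][pixel * bins // 256] += 1
    return hists


# Hypothetical 12x8 frame: left half dark (10), right half bright (200).
lines = ([10] * 6 + [200] * 6 for _ in range(8))
hists = block_histograms(lines, width=12, height=8)
```

Here each block's histogram serves as its shallow feature vector: dark blocks fill the lowest bin and bright blocks the highest.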
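The ROI unit 11 may further include a classifier 112, such as a support vector machine (SVM), that decides whether a CNN operation is performed on each block of the input image frame. A decision map 12 is thereby generated, comprising a plurality of blocks (which may be arranged in matrix form) that represent the input image frame. FIG. 4A illustrates a decision map 12 comprising 4x6 blocks, where X indicates that the block does not require a CNN operation, C indicates that the block requires a CNN operation, and D indicates that an object (for example, a dog) has been detected in the block. The region of interest can thus be determined and the CNN operation performed on it.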
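The decision map can be sketched with a linear decision function of the form w·x + b, which is what a trained linear SVM evaluates at inference time. The weights, bias, and feature values below are hand-picked and hypothetical; a real classifier 112 would be trained, and the D mark would only be set later, once the CNN has actually detected an object.

```python
def linear_decision(feature, weights, bias):
    """Linear SVM-style decision: request a CNN operation when w.x + b > 0."""
    score = sum(w * f for w, f in zip(weights, feature)) + bias
    return score > 0


def build_decision_map(features, weights, bias):
    """Map each block's feature vector to 'C' (run CNN) or 'X' (skip)."""
    return [
        ["C" if linear_decision(f, weights, bias) else "X" for f in row]
        for row in features
    ]


# Hypothetical 2x3 grid of 2-element block features and hand-picked weights.
features = [
    [[4, 0], [3, 1], [0, 4]],
    [[4, 0], [1, 3], [0, 4]],
]
decision_map = build_decision_map(features, weights=[-1.0, 1.0], bias=0.0)
```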
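Referring to FIG. 2B, the system 100 may include a temporary storage 13, such as static random-access memory (SRAM), that stores the (shallow) features generated by the feature extractor 111 (of the ROI unit 11) (step 22). FIG. 5 shows a detailed block diagram of the temporary storage 13 of FIG. 2A. In this embodiment, the temporary storage 13 may include two feature maps: a first feature map 131A, which stores the features of the previous image frame (at the previous time t-1), and a second feature map 131B, which stores the features of the current image frame (at the current time t). The temporary storage 13 may further include a sliding window 132, which may be 40x40x8 bits in size, for storing one block of the input image frame.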
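The two-feature-map arrangement behaves like a small double buffer. The class below is a hypothetical software analogy, not the hardware design; the class and attribute names are invented for illustration.

```python
class FeatureStore:
    """Holds block features for the previous (t-1) and current (t) frames."""

    def __init__(self):
        self.previous = None  # plays the role of the first feature map 131A
        self.current = None   # plays the role of the second feature map 131B

    def push(self, features):
        """Store a new frame's features; the old current becomes previous."""
        self.previous = self.current
        self.current = features


# Size of the 40x40x8-bit sliding window mentioned above.
SLIDING_WINDOW_BITS = 40 * 40 * 8
SLIDING_WINDOW_BYTES = SLIDING_WINDOW_BITS // 8

store = FeatureStore()
store.push([[1, 0]])  # hypothetical features at time t-1
store.push([[0, 1]])  # hypothetical features at time t
```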
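Referring to FIG. 2A, the system 100 of this embodiment may include a convolutional neural network (CNN) unit 14 that receives and processes the region of interest of the input image frame generated by the ROI unit 11 in order to detect an object (step 23). The CNN unit 14 of this embodiment operates only on the region of interest, rather than on the entire input image frame as in a conventional system with a frame buffer.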
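FIG. 6 shows a detailed block diagram of the CNN unit 14 of FIG. 2A. The CNN unit 14 may include a convolution unit 141, comprising a plurality of convolution engines, that performs convolution operations. The CNN unit 14 may include an activation unit 142 that may perform an activation function when a predefined feature is detected. The CNN unit 14 may further include a pooling unit 143 that performs down-sampling (or pooling) on the input image frame.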
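The three stages of the CNN unit can be sketched in plain Python. This is a minimal illustration only: ReLU and 2x2 max pooling are assumed here because the text names only "activation" and "pooling", and the convolution is the unflipped (cross-correlation) form that CNNs conventionally use. All function names and the sample data are hypothetical.

```python
def conv2d_valid(image, kernel):
    """2-D 'valid' convolution (cross-correlation form) of a single channel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(fmap):
    """Assumed activation: clamp negative responses to zero."""
    return [[max(0, v) for v in row] for row in fmap]


def max_pool_2x2(fmap):
    """Assumed pooling: non-overlapping 2x2 max pooling."""
    return [
        [max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
         for j in range(0, len(fmap[0]) - 1, 2)]
        for i in range(0, len(fmap) - 1, 2)
    ]


# Hypothetical 5x5 block with a bright vertical stripe, and a horizontal-gradient kernel.
image = [[0, 0, 1, 1, 0] for _ in range(5)]
kernel = [[-1, 1], [-1, 1]]
conv = conv2d_valid(image, kernel)
pooled = max_pool_2x2(relu(conv))
```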
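The system 100 of this embodiment may include a tracking unit 15 that compares the first feature map 131A (of the previous image frame) with the second feature map 131B (of the current image frame) and then updates the decision map 12 (step 24). The tracking unit 15 analyzes the content change between the first feature map 131A and the second feature map 131B. FIG. 4B illustrates another decision map 12, updated after the one in FIG. 4A. In this example, an object was detected at the previous time in the blocks at columns 5 to 6 of row 3 (marked D in FIG. 4A), but at the current time the object has disappeared (marked X in FIG. 4B). Accordingly, the CNN unit 14 need not perform CNN operations on blocks whose features have not changed. In other words, the CNN unit 14 selectively performs CNN operations on blocks whose features have changed. The system 100 can therefore be substantially accelerated.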
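The comparison performed by the tracking unit can be sketched as a per-block feature difference. The L1 distance and the threshold are assumptions made for illustration; the specification does not say how the content change is measured, and the function name and sample features are hypothetical.

```python
def changed_blocks(prev_features, curr_features, threshold=0):
    """Return (row, col) of blocks whose features differ between two frames."""
    changed = []
    for r, (prev_row, curr_row) in enumerate(zip(prev_features, curr_features)):
        for c, (p, q) in enumerate(zip(prev_row, curr_row)):
            # Assumed change measure: L1 distance between feature vectors.
            if sum(abs(a - b) for a, b in zip(p, q)) > threshold:
                changed.append((r, c))
    return changed


# Hypothetical 2x3 grids of block features at time t-1 and time t.
prev = [[[4, 0], [4, 0], [0, 4]], [[4, 0], [1, 3], [0, 4]]]
curr = [[[4, 0], [4, 0], [4, 0]], [[4, 0], [1, 3], [0, 4]]]
to_process = changed_blocks(prev, curr)
```

Only the blocks in `to_process` would be handed to the CNN unit; the remaining blocks keep their previous marks in the decision map.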
Compared with a conventional CNN system, the number of CNN operations in the above embodiment can be substantially reduced (and the processing accelerated). In addition, because the embodiment requires no frame buffer, it is well suited to low-power applications such as wearable or Internet of Things (IoT) devices. For an image frame with a resolution of 320x240 and a (non-overlapping) sliding window of 40x40, a conventional system with a frame buffer requires 8x6 sliding windows to perform the CNN operation. In contrast, the system 100 of this embodiment requires only a few (fewer than 10) sliding windows to perform the CNN operation.
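The window counts in this comparison follow from simple division. In the sketch below, `MAX_ROI_WINDOWS = 9` is only an upper bound implied by "fewer than 10"; the actual count depends on the scene, and the constant names are hypothetical.

```python
WIDTH, HEIGHT, WINDOW = 320, 240, 40

cols = WIDTH // WINDOW        # windows per row of the frame
rows = HEIGHT // WINDOW       # window rows in the frame
total_windows = cols * rows   # windows a full-frame pass must process

MAX_ROI_WINDOWS = 9           # "fewer than 10" per the text; scene-dependent
fraction = MAX_ROI_WINDOWS / total_windows

print(cols, rows, total_windows)  # 8 6 48
print(fraction)                   # 0.1875
```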
The foregoing are merely preferred embodiments of the present invention and are not intended to limit the scope of its claims; all equivalent changes or modifications made without departing from the spirit disclosed by the invention should be included within the scope of the following claims.
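100‧‧‧framebuffer-less convolutional neural network system
11‧‧‧region of interest unit
111‧‧‧feature extractor
112‧‧‧classifier
12‧‧‧decision map
13‧‧‧temporary storage
131A‧‧‧first feature map
131B‧‧‧second feature map
132‧‧‧sliding window
14‧‧‧convolutional neural network unit
141‧‧‧convolution unit
142‧‧‧activation unit
143‧‧‧pooling unit
15‧‧‧tracking unit
200‧‧‧framebuffer-less convolutional neural network method
21‧‧‧generate a region of interest in the input image frame
22‧‧‧store features in the feature maps
23‧‧‧process the region of interest to detect an object
24‧‧‧compare features and perform the CNN operation on blocks with feature changes
900‧‧‧convolutional neural network
91‧‧‧buffer bank
92‧‧‧frame buffer
93‧‧‧column buffer
94‧‧‧convolution unit engine array
941‧‧‧pre-fetch controller
95‧‧‧accumulation buffer
951‧‧‧max pool
96‧‧‧instruction decoder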
FIG. 1 shows a block diagram of a conventional convolutional neural network.
FIG. 2A shows a block diagram of a framebuffer-less convolutional neural network system according to an embodiment of the present invention.
FIG. 2B shows a flow diagram of a framebuffer-less convolutional neural network method according to an embodiment of the present invention.
FIG. 3 shows a detailed block diagram of the region of interest unit of FIG. 2A.
FIG. 4A illustrates a decision map comprising 4x6 blocks.
FIG. 4B illustrates another decision map, updated after the one in FIG. 4A.
FIG. 5 shows a detailed block diagram of the temporary storage of FIG. 2A.
FIG. 6 shows a detailed block diagram of the convolutional neural network unit of FIG. 2A.
100‧‧‧framebuffer-less convolutional neural network system
11‧‧‧region of interest unit
12‧‧‧decision map
13‧‧‧temporary storage
14‧‧‧convolutional neural network unit
15‧‧‧tracking unit
Claims (20)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
TW107122430A TWI696127B (en) | 2018-06-29 | 2018-06-29 | Framebuffer-less system and method of convolutional neural network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
TW107122430A TWI696127B (en) | 2018-06-29 | 2018-06-29 | Framebuffer-less system and method of convolutional neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
TW202001692A (en) | 2020-01-01 |
TWI696127B (en) | 2020-06-11 |
Family
ID=69942004
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
TW107122430A TWI696127B (en) | 2018-06-29 | 2018-06-29 | Framebuffer-less system and method of convolutional neural network |
Country Status (1)
Country | Link |
---|---|
TW (1) | TWI696127B (en) |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1656465B (en) * | 2002-03-22 | 2010-05-26 | 迈克尔·F·迪林 | Method and system for rendering graph by executing render computation by multiple interconnecting nodes |
ITRM20060110A1 (en) * | 2006-03-03 | 2007-09-04 | Cnr Consiglio Naz Delle Ricerche | METHOD AND SYSTEM FOR THE AUTOMATIC DETECTION OF EVENTS IN SPORTS ENVIRONMENT |
US9798972B2 (en) * | 2014-07-02 | 2017-10-24 | International Business Machines Corporation | Feature extraction using a neurosynaptic system for object classification |
TWI634436B (en) * | 2016-11-14 | 2018-09-01 | 耐能股份有限公司 | Buffer device and convolution operation device and method |
TWI645335B (en) * | 2016-11-14 | 2018-12-21 | 耐能股份有限公司 | Convolution operation device and convolution operation method |
TWI616813B (en) * | 2016-11-14 | 2018-03-01 | 耐能股份有限公司 | Convolution operation method |
-
2018
- 2018-06-29 TW TW107122430A patent/TWI696127B/en active
Also Published As
Publication number | Publication date |
---|---|
TWI696127B (en) | 2020-06-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10769485B2 (en) | Framebuffer-less system and method of convolutional neural network | |
CN108399406B (en) | Method and system for detecting weakly supervised salient object based on deep learning | |
CN110991311B (en) | Target detection method based on dense connection deep network | |
CN109583340B (en) | Video target detection method based on deep learning | |
US10936937B2 (en) | Convolution operation device and convolution operation method | |
CN110176001B (en) | Grad-CAM algorithm-based high-speed rail contact net insulator damage accurate positioning method | |
US20210319565A1 (en) | Target detection method, apparatus and device for continuous images, and storage medium | |
US20060222243A1 (en) | Extraction and scaled display of objects in an image | |
CN107784288B (en) | Iterative positioning type face detection method based on deep neural network | |
US20180268533A1 (en) | Digital Image Defect Identification and Correction | |
CN111222562B (en) | Target detection method based on space self-attention mechanism | |
US10504007B2 (en) | Determination of population density using convoluted neural networks | |
CN112381004B (en) | Dual-flow self-adaptive graph rolling network behavior recognition method based on framework | |
US20200257902A1 (en) | Extraction of spatial-temporal feature representation | |
CN110659658A (en) | Target detection method and device | |
CN111723660A (en) | Detection method for long ground target detection network | |
TWI803243B (en) | Method for expanding images, computer device and storage medium | |
CN110147724B (en) | Method, apparatus, device, and medium for detecting text region in video | |
Gutierrez et al. | Lip reading word classification | |
TWI696127B (en) | Framebuffer-less system and method of convolutional neural network | |
JPWO2019215904A1 (en) | Predictive model creation device, predictive model creation method, and predictive model creation program | |
CN111179212A (en) | Method for realizing micro target detection chip integrating distillation strategy and deconvolution | |
CN110717575B (en) | Frame buffer free convolutional neural network system and method | |
US11544523B2 (en) | Convolutional neural network method and system | |
CN115049546A (en) | Sample data processing method and device, electronic equipment and storage medium |