US10769485B2 - Framebuffer-less system and method of convolutional neural network

Info

Publication number
US10769485B2
Authority
US
United States
Prior art keywords
features
image frame
cnn
unit
input image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active, expires
Application number
US16/012,133
Other versions
US20190385005A1 (en)
Inventor
Der-Wei Yang
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Himax Technologies Ltd
Original Assignee
Himax Technologies Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Himax Technologies Ltd filed Critical Himax Technologies Ltd
Priority to US16/012,133
Assigned to HIMAX TECHNOLOGIES LIMITED (Assignors: YANG, DER-WEI)
Publication of US20190385005A1
Application granted
Publication of US10769485B2
Legal status: Active
Adjusted expiration

Classifications

    • G06T 1/00: General purpose image data processing
    • G06K 9/4642
    • G06N 3/045: Combinations of networks
    • G06N 3/048: Activation functions
    • G06N 3/063: Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G06N 5/046: Forward inferencing; production systems
    • G06N 20/10: Machine learning using kernel methods, e.g. support vector machines [SVM]
    • G06T 3/4046: Scaling the whole image or part thereof using neural networks
    • G06T 7/11: Region-based segmentation
    • G06T 2207/20104: Interactive definition of region of interest [ROI]
    • G06V 10/25: Determination of region of interest [ROI] or a volume of interest [VOI]
    • G06V 10/454: Integrating biologically inspired filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • G06V 10/50: Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; projection analysis
    • G06V 10/62: Extraction of image or video features relating to a temporal dimension, e.g. time-based feature extraction; pattern tracking
    • G06V 10/82: Image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G06V 10/955: Hardware or software architectures specially adapted for image or video understanding using specific electronic processors


Abstract

A framebuffer-less system of convolutional neural network (CNN) includes a region of interest (ROI) unit that extracts features, according to which a region of interest in an input image frame is generated; a convolutional neural network (CNN) unit that processes the region of interest of the input image frame to detect an object; and a tracking unit that compares the features extracted at different times, according to which the CNN unit selectively processes the input image frame.

Description

BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention generally relates to a convolutional neural network (CNN), and more particularly to a CNN system without framebuffer.
2. Description of Related Art
A convolutional neural network (CNN) is a class of artificial neural networks that may be adapted to machine learning. The CNN can be applied to signal processing such as image processing and computer vision.
FIG. 1 shows a block diagram illustrating a conventional CNN 900 as disclosed in "A Reconfigurable Streaming Deep Convolutional Neural Network Accelerator for Internet of Things" by Li Du et al., August 2017, IEEE Transactions on Circuits and Systems I: Regular Papers, the disclosure of which is incorporated herein by reference. The CNN 900 includes a single-port static random-access memory (SRAM) as a buffer bank 91 to store intermediate data and exchange data with a dynamic random-access memory (DRAM) (e.g., double data rate synchronous DRAM (DDR SDRAM)) serving as a framebuffer 92, which is required to store the whole image frame for CNN operation. The buffer bank 91 is separated into two sets: an input layer and an output layer. The CNN 900 includes a column (COL) buffer 93 that remaps the output of the buffer bank 91 to a convolution unit (CU) engine array 94. The CU engine array 94 is composed of a plurality of convolution units to enable highly parallel convolution computation. A pre-fetch controller 941 inside the CU engine array 94 periodically fetches parameters from a direct memory access (DMA) controller (not shown) and updates weights and bias values in the CU engine array 94. The CNN 900 also includes an accumulation (ACCU) buffer 95 with a scratchpad used to store partial convolution results from the CU engine array 94. A max pool 951 in the ACCU buffer 95 pools output-layer data. The CNN 900 further includes an instruction decoder 96 that decodes commands pre-stored in the framebuffer 92.
In the conventional CNN system exemplified in FIG. 1, a framebuffer composed of a dynamic random-access memory (DRAM) (e.g., double data rate synchronous DRAM (DDR SDRAM)) is commonly required to store the whole image frame for CNN operation. For example, the framebuffer occupies 320×240×8 bits for an image frame with a 320×240 resolution. However, DDR SDRAM is not available for most low-power applications such as wearables or Internet of Things (IoT) devices. A need has therefore arisen to propose a novel CNN system that is adaptable to low-power applications.
SUMMARY OF THE INVENTION
In view of the foregoing, it is an object of the embodiment of the present invention to provide a convolutional neural network (CNN) system without a framebuffer. The embodiment is capable of performing CNN operation on a high-resolution image frame with low system complexity.
According to one embodiment, a framebuffer-less system of convolutional neural network (CNN) includes a region of interest (ROI) unit, a convolutional neural network (CNN) unit and a tracking unit. The ROI unit extracts features, according to which a region of interest in an input image frame is generated. The CNN unit processes the region of interest of the input image frame to detect an object. The tracking unit compares the features extracted at different times, according to which the CNN unit selectively processes the input image frame.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 shows a block diagram illustrating a conventional CNN;
FIG. 2A shows a block diagram illustrating a framebuffer-less system of convolutional neural network (CNN) according to one embodiment of the present invention;
FIG. 2B shows a flow diagram illustrating a framebuffer-less method of convolutional neural network (CNN) according to one embodiment of the present invention;
FIG. 3 shows a detailed block diagram of the ROI unit of FIG. 2A;
FIG. 4A shows an exemplary decision map composed of 4×6 blocks;
FIG. 4B shows another exemplary decision map updated after that in FIG. 4A;
FIG. 5 shows a detailed block diagram of the temporary storage of FIG. 2A; and
FIG. 6 shows a detailed block diagram of the CNN unit of FIG. 2A.
DETAILED DESCRIPTION OF THE INVENTION
FIG. 2A shows a block diagram illustrating a framebuffer-less system 100 of convolutional neural network (CNN) according to one embodiment of the present invention, and FIG. 2B shows a flow diagram illustrating a framebuffer-less method 200 of convolutional neural network (CNN) according to one embodiment of the present invention.
In the embodiment, the system 100 may include a region of interest (ROI) unit 11 configured to generate a region of interest in an input image frame (step 21). Specifically, as the system 100 of the embodiment contains no framebuffer, the ROI unit 11 may adopt a scan-line based technique and a block-based scheme to find the region of interest in the input image frame, which is divided into a plurality of image blocks arranged in matrix form (for example, 4×6 blocks).
In the embodiment, the ROI unit 11 is configured to generate block-based features, according to which a decision of whether to perform CNN is made for each image block. FIG. 3 shows a detailed block diagram of the ROI unit 11 of FIG. 2A. In the embodiment, the ROI unit 11 may include a feature extractor 111 configured to extract, for example, shallow features from the input image frame. In one exemplary embodiment, the feature extractor 111 generates the (shallow) features of the blocks according to a block-based histogram; in another exemplary embodiment, it generates them according to frequency analysis.
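As a rough illustration of the block-based histogram variant, the following Python sketch divides a grayscale frame into a 4×6 grid and computes one normalized intensity histogram per block. The function name, grid size, and bin count are illustrative assumptions, not values fixed by the patent.

```python
import numpy as np

def extract_block_features(frame, grid=(4, 6), bins=16):
    """Shallow block-based features: one normalized intensity histogram
    per block of a grayscale (8-bit) frame. Illustrative sketch only."""
    rows, cols = grid
    h, w = frame.shape
    bh, bw = h // rows, w // cols
    features = np.empty((rows, cols, bins), dtype=np.float32)
    for r in range(rows):
        for c in range(cols):
            block = frame[r * bh:(r + 1) * bh, c * bw:(c + 1) * bw]
            hist, _ = np.histogram(block, bins=bins, range=(0, 256))
            features[r, c] = hist / hist.sum()  # normalize per block
    return features
```

For a 320×240 frame this produces a 4×6 grid of 16-bin histograms, i.e., a 4×6×16 feature tensor, which is small enough to keep in on-chip SRAM rather than a framebuffer.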
The ROI unit 11 may also include a classifier 112, such as a support vector machine (SVM), configured to decide whether to perform CNN for each block of the input image frame. Accordingly, a decision map 12 composed of a plurality of blocks (e.g., arranged in matrix form) representing the input image frame is generated. FIG. 4A shows an exemplary decision map 12 composed of 4×6 blocks, where X indicates that the associated block requires no CNN operation, C indicates that the associated block requires CNN operation, and D indicates that an object (e.g., a dog) is detected in the associated block. Accordingly, the ROI is determined and is thereafter subjected to CNN operation.
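A sketch of how such a classifier could populate the decision map, assuming a scikit-learn-style object whose predict() returns 1 for blocks that warrant CNN operation; the label characters mirror FIG. 4A, and the function and variable names are our own. The D mark would be written later by the CNN unit, once an object is actually detected.

```python
import numpy as np

SKIP, RUN_CNN = "X", "C"  # labels as in FIG. 4A; "D" is set after detection

def build_decision_map(block_features, classifier):
    """Decide per block whether to perform CNN, from its shallow features."""
    rows, cols, dim = block_features.shape
    labels = classifier.predict(block_features.reshape(rows * cols, dim))
    return np.where(labels.reshape(rows, cols) == 1, RUN_CNN, SKIP)
```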
Referring back to FIG. 2A, the system 100 may include temporary storage 13, such as static random-access memory (SRAM), configured to store the (shallow) features generated by the feature extractor 111 (of the ROI unit 11) (step 22). FIG. 5 shows a detailed block diagram of the temporary storage 13 of FIG. 2A. In the embodiment, the temporary storage 13 may include two feature maps 131: a first feature map 131A used to store features of a previous image frame (e.g., at time t−1) and a second feature map 131B used to store features of a current image frame (e.g., at time t). The temporary storage 13 may also include a sliding window 132 of a size of, for example, 40×40×8 bits for storing a block of the input image frame.
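A minimal data-structure sketch of this temporary storage, using the 40×40×8-bit window from the embodiment; the class and method names are our own, as is the frame-advance step that rotates the current map into the previous one.

```python
from dataclasses import dataclass, field
from typing import Optional
import numpy as np

@dataclass
class TemporaryStorage:
    prev_features: Optional[np.ndarray] = None  # feature map 131A (time t-1)
    curr_features: Optional[np.ndarray] = None  # feature map 131B (time t)
    window: np.ndarray = field(                 # sliding window 132: 40x40x8 bits
        default_factory=lambda: np.zeros((40, 40), dtype=np.uint8))

    def advance_frame(self, new_features: np.ndarray) -> None:
        # The current frame's features become the previous frame's features.
        self.prev_features, self.curr_features = self.curr_features, new_features
```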
Referring back to FIG. 2A, the system 100 of the embodiment may include a convolutional neural network (CNN) unit 14 that operatively receives and processes the generated ROI (from the ROI unit 11) of the input image frame to detect an object (step 23). Specifically, the CNN unit 14 of the embodiment performs operation only on the generated ROI, instead of on the entire input image frame as in a conventional system with a framebuffer.
FIG. 6 shows a detailed block diagram of the CNN unit 14 of FIG. 2A. Specifically, the CNN unit 14 may include a convolution unit 141 including a plurality of convolution engines configured to perform convolution operations. The CNN unit 14 may include an activation unit 142 configured to perform activation functions when predefined features are detected. The CNN unit 14 may also include a pooling unit 143 configured to perform down-sampling (or pooling) on the input image frame.
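To make the pipeline concrete, here is a naive single-channel sketch of one image block passing through the three units; the ReLU activation and 2×2 max pooling are assumptions, as the patent does not name particular activation or pooling functions.

```python
import numpy as np

def conv2d(block, kernel):
    """Naive valid convolution, standing in for the convolution engines 141."""
    kh, kw = kernel.shape
    oh, ow = block.shape[0] - kh + 1, block.shape[1] - kw + 1
    out = np.empty((oh, ow), dtype=np.float32)
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(block[i:i + kh, j:j + kw] * kernel)
    return out

def cnn_unit(block, kernel):
    fmap = conv2d(block.astype(np.float32), kernel)  # convolution unit 141
    fmap = np.maximum(fmap, 0.0)                     # activation unit 142 (ReLU assumed)
    h, w = (fmap.shape[0] // 2) * 2, (fmap.shape[1] // 2) * 2
    # pooling unit 143: 2x2 max pooling (down-sampling)
    return fmap[:h, :w].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))
```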
The system 100 of the embodiment may include a tracking unit 15 configured, in step 24, to compare the first feature map 131A (of the previous image frame) with the second feature map 131B (of the current image frame), and then to update the decision map 12 accordingly. Specifically, the tracking unit 15 analyzes content variation between the first feature map 131A and the second feature map 131B. FIG. 4B shows another exemplary decision map 12 updated after that in FIG. 4A. In this example, the object detected in the blocks located at columns 5-6 and row 3 at a previous time (designated D in FIG. 4A) disappears from the same blocks at a current time (designated X in FIG. 4B). According to the feature variation (or lack thereof), the CNN unit 14 need not perform CNN operation on blocks without feature variation. In other words, the CNN unit 14 selectively performs CNN operation only on blocks with feature variation. Operation of the system 100 can therefore be substantially accelerated.
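A sketch of the tracking step, under the assumption that "content variation" is measured as a per-block L1 distance between the two feature maps against a threshold; the metric and the threshold tau are our choices, since the patent only states that the variation is analyzed.

```python
import numpy as np

def update_decision_map(prev_features, curr_features, tau=0.1):
    """Mark a block "C" (re-run CNN) only if its features changed, else "X"."""
    delta = np.abs(curr_features - prev_features).sum(axis=-1)  # per-block L1
    return np.where(delta > tau, "C", "X")
```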
According to the embodiment proposed above, the amount of CNN operation may be substantially reduced (and the system thus accelerated) compared with a conventional CNN system. Moreover, as the embodiment of the present invention requires no framebuffer, it is well adaptable to low-power applications such as wearables or Internet of Things (IoT) devices. For an image frame of 320×240 resolution and a (non-overlapping) sliding window of size 40×40, the conventional system with a framebuffer requires 8×6 (i.e., 48) sliding window operations for CNN. By contrast, only a few (e.g., fewer than ten) sliding window operations for CNN are required in the system 100 of the embodiment.
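The window-count arithmetic can be checked directly; the "fewer than ten" figure on the embodiment's side depends on how many blocks the decision map marks C in a given frame.

```python
frame_w, frame_h, win = 320, 240, 40
windows_per_frame = (frame_w // win) * (frame_h // win)
print(windows_per_frame)  # 8 * 6 = 48 non-overlapping windows per frame, conventionally
```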
Although specific embodiments have been illustrated and described, it will be appreciated by those skilled in the art that various modifications may be made without departing from the scope of the present invention, which is intended to be limited solely by the appended claims.

Claims (15)

What is claimed is:
1. A framebuffer-less system of convolutional neural network (CNN), comprising:
a region of interest (ROI) unit that extracts features, according to which a region of interest in an input image frame is generated;
a convolutional neural network (CNN) unit that processes the region of interest of the input image frame to detect an object;
a tracking unit that compares features extracted at different times, according to which the CNN unit selectively processes the input image frame; and
a temporary storage for storing the features extracted by the ROI unit;
wherein the ROI unit adopts a scan-line based technique and a block-based scheme to find the region of interest in the input image frame, which is divided into a plurality of blocks of image;
wherein the ROI unit comprises:
a feature extractor that extracts the features from the input image frame; and
a classifier that makes a decision whether to perform CNN for each block of image, thus generating a decision map, according to which the region of interest is determined.
2. The system of claim 1, wherein the ROI unit generates block-based features, according to which a decision of whether to perform CNN is made for each block of image.
3. The system of claim 1, wherein the feature extractor generates shallow features of the blocks of image according to block-based histogram or frequency analysis.
4. The system of claim 1, wherein the temporary storage comprises a first feature map storing features of a previous image frame, and a second feature map storing features of a current image frame.
5. The system of claim 4, wherein the tracking unit compares the first feature map and the second feature map, and accordingly updates the decision map.
6. The system of claim 1, wherein the temporary storage comprises a sliding window storing a block of the input image frame.
7. The system of claim 1, wherein the CNN unit comprises:
a convolutional unit including a plurality of convolution engines to perform convolution operation on the region of interest;
an activation unit that performs activation function when predefined features are detected; and
a pooling unit that performs down-sampling on the input image frame.
8. A framebuffer-less method of convolutional neural network (CNN), comprising:
extracting features to generate a region of interest (ROI) in an input image frame;
performing convolutional neural network (CNN) on the region of interest of the input image frame to detect an object; and
comparing features extracted at different times and accordingly processing the input image frame selectively;
wherein the ROI is generated by adopting a scan-line based technique and a block-based scheme, the input image frame being divided into a plurality of blocks of image;
wherein the step of generating the ROI comprises:
extracting the features from the input image frame; and
making a decision by classification whether to perform CNN for each block of image, thus generating a decision map, according to which the region of interest is determined.
9. The method of claim 8, wherein the step of generating the ROI comprises:
generating block-based features, according to which a decision of whether to perform CNN is made for each block of image.
10. The method of claim 8, wherein the step of extracting the features comprises:
generating shallow features of the blocks of image according to block-based histogram or frequency analysis.
11. The method of claim 8, further comprising a step of temporarily storing the features that generate the ROI.
12. The method of claim 11, wherein the step of temporarily storing the features comprises:
generating a first feature map storing features of a previous image frame; and
generating a second feature map storing features of a current image frame.
13. The method of claim 12, wherein the step of comparing the features comprises:
comparing the first feature map and the second feature map, and accordingly updating the decision map.
14. The method of claim 11, wherein the step of temporarily storing the features comprises:
generating a sliding window storing a block of the input image frame.
15. The method of claim 8, wherein the step of performing convolutional neural network (CNN) comprises:
using a plurality of convolution engines to perform convolution operation on the region of interest;
performing activation function when predefined features are detected; and
performing down-sampling on the input image frame.

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
US16/012,133 (US10769485B2) | 2018-06-19 | 2018-06-19 | Framebuffer-less system and method of convolutional neural network

Publications (2)

Publication Number | Publication Date
US20190385005A1 (en) | 2019-12-19
US10769485B2 (en) | 2020-09-08

Family

ID=68840064

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
US16/012,133 (US10769485B2, Active, adjusted expiration 2038-12-07) | Framebuffer-less system and method of convolutional neural network | 2018-06-19 | 2018-06-19

Country Status (1)

Country Link
US (1) US10769485B2 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11055854B2 (en) * 2018-08-23 2021-07-06 Seoul National University R&Db Foundation Method and system for real-time target tracking based on deep learning
US11574200B2 (en) * 2019-12-18 2023-02-07 W.S.C. Sports Technologies Ltd. System and method of determining a region of interest in media
TWI745808B (en) * 2019-12-24 2021-11-11 亞達科技股份有限公司 Situation awareness system and method
CN111832493A (en) * 2020-07-17 2020-10-27 平安科技(深圳)有限公司 Image traffic signal lamp detection method and device, electronic equipment and storage medium
US20220108203A1 (en) * 2020-10-01 2022-04-07 Texas Instruments Incorporated Machine learning hardware accelerator

Patent Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160371546A1 (en) * 2015-06-16 2016-12-22 Adobe Systems Incorporated Generating a shoppable video
US20180165548A1 (en) * 2015-07-30 2018-06-14 Beijing Sensetime Technology Development Co., Ltd Systems and methods for object tracking
US10007860B1 (en) * 2015-12-21 2018-06-26 Amazon Technologies, Inc. Identifying items in images using regions-of-interest
US20180341872A1 (en) * 2016-02-02 2018-11-29 Beijing Sensetime Technology Development Co., Ltd Methods and systems for cnn network adaption and object online tracking
US20190251383A1 (en) * 2016-11-09 2019-08-15 Panasonic Intellectual Property Management Co., Ltd. Method for processing information, information processing apparatus, and non-transitory computer-readable recording medium
US20180276841A1 (en) * 2017-03-23 2018-09-27 Intel Corporation Method and system of determining object positions for image processing using wireless network angle of transmission
US20180300553A1 (en) * 2017-03-30 2018-10-18 Hrl Laboratories, Llc Neuromorphic system for real-time visual activity recognition
US20180293737A1 (en) * 2017-04-07 2018-10-11 Nvidia Corporation System and method for optical flow estimation
US10467763B1 (en) * 2017-04-07 2019-11-05 Nvidia Corporation System and method for optical flow estimation
US20190005657A1 (en) * 2017-06-30 2019-01-03 Baidu Online Network Technology (Beijing) Co., Ltd . Multiple targets-tracking method and apparatus, device and storage medium
US20190026538A1 (en) * 2017-07-21 2019-01-24 Altumview Systems Inc. Joint face-detection and head-pose-angle-estimation using small-scale convolutional neural network (cnn) modules for embedded systems
US20190050994A1 (en) * 2017-08-10 2019-02-14 Fujitsu Limited Control method, non-transitory computer-readable storage medium, and control apparatus
US20190073564A1 (en) * 2017-09-05 2019-03-07 Sentient Technologies (Barbados) Limited Automated and unsupervised generation of real-world training data
US20190138676A1 (en) * 2017-11-03 2019-05-09 Drishti Technologies Inc. Methods and systems for automatically creating statistically accurate ergonomics data

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210182589A1 (en) * 2018-08-01 2021-06-17 Kyungpook National University Industry-Academic Cooperation Foundation Object detection device and control method
US11715280B2 (en) * 2018-08-01 2023-08-01 Kyungpook National University Industry-Academic Cooperation Foundation Object detection device and control method
US11037031B2 (en) * 2019-03-06 2021-06-15 Beijing Horizon Robotics Technology Research And Development Co., Ltd. Image recognition method, electronic apparatus and readable storage medium

Also Published As

Publication number | Publication date
US20190385005A1 (en) | 2019-12-19

Similar Documents

Publication Publication Date Title
US10769485B2 (en) Framebuffer-less system and method of convolutional neural network
CN110826530B (en) Face detection using machine learning
US10936937B2 (en) Convolution operation device and convolution operation method
US11836931B2 (en) Target detection method, apparatus and device for continuous images, and storage medium
US9971959B2 (en) Performing object detection operations via a graphics processing unit
US20230229931A1 (en) Neural processing apparatus and method with neural network pool processing
US20190114742A1 (en) Image upscaling with controllable noise reduction using a neural network
US20220138454A1 (en) Training method and training apparatus for a neural network for object recognition
US10055672B2 (en) Methods and systems for low-energy image classification
CN109784372B (en) Target classification method based on convolutional neural network
US20200257902A1 (en) Extraction of spatial-temporal feature representation
US10268886B2 (en) Context-awareness through biased on-device image classifiers
US9208374B2 (en) Information processing apparatus, control method therefor, and electronic device
KR102070956B1 (en) Apparatus and method for processing image
CN110782430A (en) Small target detection method and device, electronic equipment and storage medium
CN109934239B (en) Image feature extraction method
KR102086042B1 (en) Apparatus and method for processing image
CN111179212A (en) Method for realizing micro target detection chip integrating distillation strategy and deconvolution
US11544523B2 (en) Convolutional neural network method and system
US11715216B2 (en) Method and apparatus with object tracking
CN115713769A (en) Training method and device of text detection model, computer equipment and storage medium
CN114943729A (en) Cell counting method and system for high-resolution cell image
TWI696127B (en) Framebuffer-less system and method of convolutional neural network
CN110717575B (en) Frame buffer free convolutional neural network system and method
CN111583292B (en) Self-adaptive image segmentation method for two-photon calcium imaging video data

Legal Events

Date Code Title Description
AS Assignment

Owner name: HIMAX TECHNOLOGIES LIMITED, TAIWAN

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:YANG, DER-WEI;REEL/FRAME:046130/0546

Effective date: 20180615

FEPP Fee payment procedure

Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS

STCF Information on status: patent grant

Free format text: PATENTED CASE

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 4