CN110110849B - Line fixed data stream mapping method based on graph segmentation - Google Patents


Info

Publication number
CN110110849B
Authority
CN
China
Prior art keywords
map
column
mapping
processing
width
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910353373.XA
Other languages
Chinese (zh)
Other versions
CN110110849A (en
Inventor
Zhang Bowen
Gu Huaxi
Wang Kun
Yang Yintang
Yao Xiyue
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University filed Critical Xidian University
Priority to CN201910353373.XA priority Critical patent/CN110110849B/en
Publication of CN110110849A publication Critical patent/CN110110849A/en
Application granted granted Critical
Publication of CN110110849B publication Critical patent/CN110110849B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/06Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a row-fixed data stream mapping method based on graph segmentation, which mainly addresses the limited application scenarios and low processing-array utilization of existing row-fixed data stream mapping methods. The implementation steps are: 1. obtain the relevant parameters of the convolutional neural network convolutional layer and of the processing array; 2. generate a mapping map from the convolutional-layer parameters and determine the map parameters; 3. segment the map according to the map parameters and the processing-array parameters; 4. generate the corresponding data stream mapping from the segmentation result. The invention segments and maps the row-fixed-data-stream map according to the processing-array scale, so that a convolutional layer of any scale can be mapped onto a processing array of any scale while the high data reusability of the row-fixed data stream is preserved. The method offers high flexibility, strong applicability, high processing-unit utilization, and strong processing performance, and can be used by convolutional neural networks to accelerate data processing.

Description

Line fixed data stream mapping method based on graph segmentation
Technical Field
The invention belongs to the technical field of communication, and particularly relates to a row-fixed data stream mapping method that can be used by convolutional neural networks to accelerate data processing.
Background
Neural networks (NN) are the basis of modern artificial-intelligence applications. Since neural networks achieved breakthrough results in speech recognition, image recognition, and similar tasks, the number of applications using them has increased sharply. They are now applied in fields as varied as autonomous driving, cancer detection, and complex games, and in many of these fields they have surpassed human accuracy while greatly improving execution efficiency. Their good performance stems from the ability to extract high-level features from raw data once an effective representation of the input space has been learned from large amounts of data by statistical learning.
As a further development of neural networks, the convolutional neural network (CNN) extracts data features automatically through multi-layer networks, multi-dimensional convolution operations, and the merging of different data paths. Compared with a traditional multilayer neural network, a CNN greatly simplifies data processing, which has made it one of the most important tools in deep learning. The original CNN comprises two structures: the convolutional layer (CONV) and the pooling layer (PL). Units in the convolutional layer act as feature detectors: units in different layers are connected by distinct weights, weights are shared among units, and an activation function provides the necessary nonlinearity. While the convolutional layer identifies features, the pooling layer merges many fine-grained features into the same feature class by sampling. These characteristics have made CNNs highly successful in machine learning. As CNNs developed, increasing the number of layers to obtain better network performance became an important research direction, leading to the classical deep convolutional neural network (DCNN) structures such as AlexNet, VGG, GoogLeNet, ResNet, and SENet.
However, existing deep convolutional neural networks suffer from high computational complexity, large volumes of computational data, heavy memory-access demands, and high parallelism requirements; in particular, processing the corresponding data generates a large number of memory accesses, which limits the efficiency of DCNN processing.
It is worth noting, however, that although deep convolutional neural networks have the above drawbacks, their processing contains a large amount of reusable data; convolutional-layer processing in particular reuses weight/convolution-kernel data, input image data, and partial-sum data extensively. The data processing of the convolutional layer can therefore be optimized according to the logical characteristics of the convolution operation and the dependency relationships among data groups. By fully exploiting the data-reuse opportunities in DCNN convolutional-layer processing and designing a dedicated data stream format and processing mechanism, the data-movement distance and memory-access demand can be reduced, accelerating DCNN processing and improving operational efficiency.
Based on this idea, Yu-Hsin Chen et al. proposed the row-fixed (row-stationary, RS) data stream in the paper "Eyeriss: An Energy-Efficient Reconfigurable Accelerator for Deep Convolutional Neural Networks". The row-fixed data stream maps a complete row of convolution-kernel data and a complete row of input image data to each processing unit; the processing unit selects the data elements required by the convolution operation, performs the operation, and generates partial-sum data, so that the kernel data and input image data are reused across multiple convolution operations. The partial sums produced by the convolution are transmitted between processing units along shortest paths according to their interdependence and are accumulated in the corresponding processing units. Convolution-kernel reuse, input-image reuse, and minimal partial-sum movement are thus achieved during row-fixed convolutional-layer processing, making the row-fixed data stream a very efficient way to handle the convolutional layers of DCNNs. However, the existing row-fixed mapping method imposes a minimum processing-array scale on the CNN processing system: it requires the processing-array width to be no less than the convolution-kernel width. Conversely, when the processing-array width exceeds the kernel width, the existing mapping method cannot exploit all processing units, leaving some idle and lowering system efficiency. These problems limit the further application of row-fixed data streams in convolutional neural networks.
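By way of illustration, the per-processing-unit primitive of the row-fixed data stream described above can be sketched as a 1-D strided convolution of a resident kernel row with a resident image row. This is an editorial sketch, not code from the patent or from Eyeriss; the function name and shapes are assumptions.

```python
def pe_row_primitive(kernel_row, image_row, stride):
    """One PE's operation under a row-fixed data stream: the PE holds one
    convolution-kernel row and one input-image row, and produces one row of
    partial sums via a 1-D strided convolution. Both rows stay resident in
    the PE, so they are reused across all output positions."""
    s_f = len(kernel_row)
    n_out = (len(image_row) - s_f) // stride + 1
    return [
        sum(kernel_row[i] * image_row[j * stride + i] for i in range(s_f))
        for j in range(n_out)
    ]

# A 3-tap kernel row sliding over a 7-element image row with stride 1
# yields 5 partial sums.
psums = pe_row_primitive([1, 0, -1], [1, 2, 3, 4, 5, 6, 7], stride=1)
```

These per-row partial sums are then forwarded between processing units and accumulated, which is the reuse pattern the mapping method below preserves.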
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a row-fixed data stream mapping method based on graph segmentation, so as to map the processing requirements of a convolutional layer of any scale efficiently onto a processing array of any scale, accelerate deep convolutional neural network processing, and improve system efficiency.
The technical idea of the invention is as follows: considering the convolution-kernel size and input-image size of the convolutional layer, generate the corresponding mapping map according to the processing pattern of the row-fixed data stream; then, comprehensively considering data reusability during CNN processing and the utilization of the processing units in the array, segment the map according to the processing-array scale. This achieves high processing-unit utilization when the mapped data stream is processed by the array, while fully exploiting data reusability to reduce memory-access pressure and data-movement overhead. The concrete implementation steps are:
(1) Obtain the convolutional neural network convolutional-layer parameters and the processing-array scale S_PE;
(2) Generate a data stream mapping map from the convolutional-layer parameters, the map having size S_M:
S_M = L_M * W_M, where L_M is the length of the generated map and W_M is its width;
(3) Segment the map according to the map parameters and the processing-array parameters:
(3a) Set the column start point of the current map to C = 1 and compare the map size S_M with the processing-array size S_PE: if S_M ≥ S_PE, execute (3b); if S_M < S_PE, divide off 1 complete column at a time, repeat L_M times, and finish the map segmentation;
(3b) Partition the map:
(3b1) In each cut, divide off C_F complete columns together with a remaining column of width R_PE and length 1, where C_F = S_PE/W_M is the number of complete columns the processing array can process at one time, R_PE = S_PE mod W_M is the number of processing units remaining after the complete columns are processed, and mod is the remainder operator of division;
(3b2) Repeat (3b1) a total of Count = L_M/(C_F + 1) times to obtain the complete-column and incomplete-column segmentation results;
(3b3) Judge whether segmentation can finish:
if R_PE > 0, first compute the remaining map size S_MN = L_MN * W_MN, where L_MN = Count is the remaining map length and W_MN = W_M - R_PE is the remaining map width; then execute (3b4);
if R_PE = 0, end the map segmentation;
(3b4) Compare the remaining map size S_MN with the processing-array size S_PE:
if S_MN ≥ S_PE, set the column start point to C_N = C_F*Count + C_R + C (where C_R = L_M mod (C_F + 1) is the number of complete columns remaining in the map after the Count cuts) and return to (3b1) to continue segmenting the remaining map;
if S_MN < S_PE, end the map segmentation;
(4) Generate the map elements according to the map segmentation result of step (3); then, on the principle that each group of processing units can bear its mapping requirement, first partition the processing units in the processing array and then generate the mapping data stream from the map elements to the corresponding processing units.
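The per-pass quantities of step (3b) reduce to integer divisions and remainders. The following is an editorial sketch (not part of the original disclosure; the function name is an assumption):

```python
def split_parameters(L_M, W_M, S_PE):
    """Per-pass segmentation parameters of step (3b):
    C_F and Count use integer division, R_PE and C_R are remainders."""
    C_F = S_PE // W_M         # complete columns the array processes per cut
    R_PE = S_PE % W_M         # PEs left over, i.e. width of the incomplete column
    Count = L_M // (C_F + 1)  # number of times cut (3b1) is repeated
    C_R = L_M % (C_F + 1)     # complete columns remaining after the Count cuts
    return C_F, R_PE, Count, C_R

# AlexNet CONV-1 on a 4 x 4 array (L_M = 55, W_M = 11, S_PE = 16)
# gives C_F = 1, R_PE = 5, Count = 27, C_R = 1.
params = split_parameters(55, 11, 16)
```

The same function applied to the remaining map of each pass reproduces the parameters of the worked example in the detailed description.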
Compared with the prior art, the invention has the following advantages:
First, the invention adopts a graph-segmentation-based row-fixed data stream mapping method for CONV processing acceleration, so that a convolutional layer of any scale can be mapped onto a processing array of any scale; the method therefore has high flexibility and strong applicability.
Second, the invention segments and maps the row-fixed-data-stream map according to the processing-array scale, fully utilizing the processing-unit resources in the array while keeping the high data reusability of the row-fixed data stream; it therefore achieves high processing-unit utilization and more efficient processing.
Drawings
FIG. 1 is a flow chart of an implementation of the present invention;
FIG. 2 is a row-fixed data flow map of the first convolutional layer CONV-1 of the convolutional neural network AlexNet of the present invention;
FIG. 3 is a schematic diagram of AlexNet CONV-1 to 4 x 4 processing array data stream map segmentation in accordance with the present invention;
FIG. 4 is a schematic of a first segmentation of the AlexNet CONV-1 data flow map into a 4 x 4 processing array map according to the present invention;
FIG. 5 is a schematic of a second segmentation of the AlexNet CONV-1 data flow map into a 4 x 4 processing array map according to the present invention;
FIG. 6 is a schematic of a third segmentation of the AlexNet CONV-1 data flow map into a 4 x 4 processing array map in accordance with the present invention.
Detailed Description
The invention is described in further detail below with reference to the figures and the specific embodiments.
Referring to fig. 1, the specific steps of this embodiment are as follows:
Step 1: obtain the relevant parameters of the convolutional neural network convolutional layer and of the processing array.
The convolutional-layer parameters comprise:
convolution kernel scale: S_F * S_F, where the kernel length and width are both S_F;
input image scale: S_I * S_I, where the input image length and width are both S_I;
convolution step size: L.
Taking the CONV-1 layer of the convolutional neural network AlexNet as an example, the convolutional-layer parameters are:
S_F = 11;
S_I = 227;
L = 4.
The processing-array parameters comprise:
processing-array length: L_PE;
processing-array width: W_PE;
processing-array scale: S_PE = L_PE * W_PE.
Taking a 4 × 4 grid processing array as an example, the parameters are:
L_PE = 4;
W_PE = 4;
S_PE = 16.
Step 2: generate the data stream mapping map from the convolutional-layer parameters and determine the map parameters.
From the convolution-kernel, input-image, and convolution-step-size parameters, the convolution operation yields a map of length L_M = (S_I - S_F)/L + 1 and width W_M = S_F.
The map size is S_M = L_M * W_M; the map contains S_M mapping elements, each comprising one row of convolution-kernel data and one row of input-image data.
Multiplying the kernel elements with the input-image elements and accumulating generates partial-sum data; that is, each mapping element in the map yields L_M partial sums.
Taking the CONV-1 layer of AlexNet as an example, L_M = (227 - 11)/4 + 1 = 55, W_M = 11, and S_M = 55 × 11 = 605. The map comprises 605 mapping elements, each generating 55 partial sums in total, as shown in FIG. 2.
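The step-2 arithmetic can be sketched directly (an editorial sketch; the function name is an assumption, the formulas are those of the description):

```python
def map_parameters(S_I, S_F, L):
    """Map dimensions from the convolutional-layer parameters (step 2):
    length L_M = (S_I - S_F)/L + 1, width W_M = S_F, size S_M = L_M * W_M."""
    L_M = (S_I - S_F) // L + 1
    W_M = S_F
    return L_M, W_M, L_M * W_M

# AlexNet CONV-1: S_I = 227, S_F = 11, L = 4 -> a 55 x 11 map of 605 elements.
L_M, W_M, S_M = map_parameters(227, 11, 4)
```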
Step 3: segment the map according to the map parameters and the processing-array parameters.
3a) Set the column start point of the current map to C = 1 and compare the map size S_M with the processing-array size S_PE: if S_M ≥ S_PE, the processing array cannot process the map at one time, so go to step 3b; if S_M < S_PE, divide off 1 complete column at a time, repeat L_M times, and finish the map segmentation.
3b) Partition the map:
3b1) Divide off two groups of column elements from the map simultaneously:
compute the number of complete map columns the processing array can process at one time, C_F = S_PE/W_M;
compute the number of processing units remaining after the complete columns are processed at one time, R_PE = S_PE mod W_M, where mod is the remainder operator of division;
according to the computed parameters, divide off C_F complete columns together with an incomplete column of width R_PE and length 1 simultaneously.
3b2) For a map of length L_M, repeat step 3b1) Count = L_M/(C_F + 1) times and compute the number of complete columns remaining in the map, C_R = L_M mod (C_F + 1), obtaining the complete-column and incomplete-column results:
the complete-column result is: column C to column C_F*Count + C_R + C - 1, of width W_M;
the incomplete-column result is: column L_M - Count + 1 to column L_M, of width R_PE.
3b3) Judge whether segmentation can finish:
if R_PE > 0, mapping elements remain unsegmented in the map; compute the remaining map length L_MN = Count, the remaining map width W_MN = W_M - R_PE, and the remaining size S_MN = L_MN * W_MN; then execute step 3b4);
if R_PE = 0, the map segmentation ends.
3b4) Compare the remaining map size S_MN with the processing-array size S_PE:
if S_MN ≥ S_PE, the processing array cannot process the remaining map at one time; set the map column start point to C_N = C_F*Count + C_R + C, the length to L_MN, the width to W_MN, and the scale to S_MN, and return to step 3b1) to continue segmenting the remaining map;
if S_MN < S_PE, the map segmentation ends.
Taking the mapping of the CONV-1 layer of the convolutional neural network AlexNet onto a 4 × 4 grid processing array as an example, the segmentation parameters and results are as follows.
First segmentation of the map:
map column start point C = 1; map length L_M = 55, width W_M = 11, scale S_M = 605; processing-array size S_PE = 16; each cut divides off 1 complete column and an incomplete column of width 5 and length 1 simultaneously;
after repeating the cut 27 times, the complete-column result is columns 1 to 28, of width 11, and the incomplete-column result is columns 29 to 55, of width 5;
this segmentation result is shown by the diagonal-line and cross-hatched rectangles in FIG. 3.
Second segmentation (of the remaining map):
map column start point C = 29; map length L_M = 27, width W_M = 6, scale S_M = 162; processing-array size S_PE = 16; each cut divides off 2 complete columns and an incomplete column of width 4 and length 1 simultaneously;
after repeating the cut 9 times, the complete-column result is columns 29 to 46, of width 6, and the incomplete-column result is columns 47 to 55, of width 4;
this segmentation result is shown by the horizontal- and vertical-line rectangles in FIG. 3.
Third segmentation (of the map remaining after the second segmentation):
map column start point C = 47; map length L_M = 9, width W_M = 2, scale S_M = 18; processing-array size S_PE = 16; each cut divides off 8 complete columns and an incomplete column of width 0 and length 1 simultaneously;
after dividing the remaining map once, the complete-column result is columns 47 to 55, of width 2, and the incomplete-column result is column 55, of width 0;
this segmentation result is shown by the dotted rectangle in FIG. 3.
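The full step-3 loop over the worked example can be sketched as follows. This is an editorial sketch (function name and return structure are assumptions); it applies the stated formulas per pass and, for the final pass with R_PE = 0, reports only the pass parameters.

```python
def segment_map(L_M, W_M, S_PE):
    """Sketch of the step-3 segmentation loop: each pass cuts C_F complete
    columns plus one incomplete column of width R_PE, Count times, then
    recurses on the remaining Count x (W_M - R_PE) map until R_PE == 0
    or the remainder fits the array in a single pass."""
    passes = []
    C = 1  # global start column of the current (remaining) map
    while L_M * W_M >= S_PE:
        C_F = S_PE // W_M
        R_PE = S_PE % W_M
        Count = L_M // (C_F + 1)
        C_R = L_M % (C_F + 1)
        passes.append({
            "start": C, "C_F": C_F, "R_PE": R_PE, "Count": Count, "C_R": C_R,
            # (first column, last column, column width), per the (3b2) formulas
            "complete": (C, C_F * Count + C_R + C - 1, W_M),
            "incomplete": (C + L_M - Count, C + L_M - 1, R_PE),
        })
        if R_PE == 0:
            break
        C = C_F * Count + C_R + C          # new start column C_N (3b4)
        L_M, W_M = Count, W_M - R_PE       # remaining map dimensions (3b3)
    return passes

# AlexNet CONV-1 onto a 4 x 4 array reproduces the three passes above.
passes = segment_map(55, 11, 16)
```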
Step 4: generate the data stream from the map to the processing units.
Generate the map elements according to the map segmentation result of step 3;
then partition the processing units in the processing array on the principle that each group of processing units can bear its mapping requirement, and generate the data stream from the map to the processing units.
Take the grid-like processing array with the CONV-1 layer of the convolutional neural network AlexNet mapped to 4 x 4 as an example:
When the map is segmented for the first time, 1 complete column of width 11 and 1 incomplete-column element of width 5 are divided off simultaneously; accordingly, the processing units are partitioned so that 11 processing units receive the mapping of the 1 complete column and the remaining 5 processing units receive the mapping of the 1 incomplete column, as shown in FIG. 4.
When the remaining map is segmented for the second time, 2 complete columns of width 6 and 1 incomplete-column element of width 4 are divided off simultaneously; accordingly, the processing units are partitioned so that 12 processing units receive the mappings of the 2 complete columns (6 per column) and the remaining 4 processing units receive the mapping of the 1 incomplete column, as shown in FIG. 5.
When the map remaining after the second segmentation is segmented for the third time, 8 complete columns of width 2 and 1 incomplete-column element of width 0 are divided off simultaneously; accordingly, the processing units are partitioned so that the 16 processing units receive the mappings of the 8 complete columns (2 per column), as shown in FIG. 6.
After the processing units are partitioned, the data in the map elements are sent to the corresponding processing units according to the correspondence between the segmented map elements and the processing units, generating the mapping data stream from the map to the corresponding processing units.
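The processing-unit partition of each pass can be sketched as follows (an editorial sketch; the function name and the flat PE indexing are assumptions):

```python
def partition_pes(W_M, C_F, R_PE):
    """PE grouping for one segmentation pass: C_F groups of W_M processing
    units each receive one complete column, and the remaining R_PE processing
    units (if any) receive the incomplete column."""
    groups = [list(range(g * W_M, (g + 1) * W_M)) for g in range(C_F)]
    if R_PE:
        groups.append(list(range(C_F * W_M, C_F * W_M + R_PE)))
    return groups

# First AlexNet CONV-1 pass on 16 PEs: one group of 11 and one of 5 (FIG. 4);
# second pass: two groups of 6 and one of 4 (FIG. 5);
# third pass: eight groups of 2 (FIG. 6).
first_pass = partition_pes(11, 1, 5)
```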
The above description is only a specific example of the present invention and does not limit it. It will be apparent to persons skilled in the relevant art that, given the content and principles of the invention, various modifications and variations in form and detail can be made without departing from the principles and structures described herein, but such modifications and variations remain within the scope of the invention as defined by the appended claims.

Claims (5)

1. A line fixed data stream mapping method based on graph segmentation is characterized by comprising the following steps:
(1) Obtain the convolutional neural network convolutional-layer parameters and the processing-array scale S_PE;
(2) Generate a map of size S_M from the convolutional-layer parameters:
S_M = L_M * W_M, where L_M is the length of the generated map and W_M is its width;
(3) Segment the map according to the map parameters and the processing-array parameters:
(3a) set the column start point of the current map to C = 1 and compare the map size S_M with the processing-array size S_PE: if S_M ≥ S_PE, execute (3b); if S_M < S_PE, divide off 1 complete column at a time, repeat L_M times, and finish the map segmentation;
(3b) partition the map:
(3b1) in each cut, divide off C_F complete columns together with a remaining column of width R_PE and length 1, where C_F = S_PE/W_M is the number of complete columns the processing array can process at one time, R_PE = S_PE mod W_M is the number of processing units remaining after the complete columns are processed, and mod is the remainder operator of division;
(3b2) repeat (3b1) a total of Count = L_M/(C_F + 1) times to obtain the complete-column and incomplete-column segmentation results;
(3b3) judge whether segmentation can finish:
if R_PE > 0, first compute the remaining map size S_MN = L_MN * W_MN, where L_MN = Count is the remaining map length and W_MN = W_M - R_PE is the remaining map width; then execute (3b4);
if R_PE = 0, end the map segmentation;
(3b4) compare the remaining map size S_MN with the processing-array size S_PE:
if S_MN ≥ S_PE, set the column start point to C_N = C_F*Count + C_R + C and return to (3b1) to continue segmenting the remaining map, where C_R = L_M mod (C_F + 1) is the number of complete columns remaining in the map after the Count cuts;
if S_MN < S_PE, end the map segmentation;
(4) Generate the map elements according to the map segmentation result of step (3); partition the processing units in the processing array on the principle that each group of processing units can bear its mapping requirement, and generate the data stream from the map to the processing units.
2. The method of claim 1, wherein the convolutional neural network convolutional-layer parameters in (1) comprise:
convolution kernel scale: S_F * S_F, with length and width both S_F;
input image scale: S_I * S_I, with length and width both S_I;
convolution step size: L.
3. The method of claim 1, wherein the processing-array scale S_PE in (1) is expressed as follows:
S_PE = L_PE * W_PE,
where L_PE is the processing-array length and W_PE is the processing-array width.
4. The method of claim 1, wherein the data stream map in (2) is generated from the convolutional-layer parameters as follows:
generated map length: L_M = (S_I - S_F)/L + 1, where S_I is the input-image length in the convolutional layer, S_F is the convolution-kernel length, and L is the convolution step size;
generated map width: W_M = S_F, where S_F is the convolution-kernel width.
5. The method of claim 1, wherein the complete-column and incomplete-column segmentation results in (3b2) are obtained as follows:
the complete-column result is: column C to column C_F*Count + C_R + C - 1, of width W_M, where C_R = L_M mod (C_F + 1) is the number of complete columns remaining in the map after the Count cuts;
the incomplete-column result is: column L_M - Count + 1 to column L_M, of width R_PE.
CN201910353373.XA 2019-04-29 2019-04-29 Line fixed data stream mapping method based on graph segmentation Active CN110110849B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910353373.XA CN110110849B (en) 2019-04-29 2019-04-29 Line fixed data stream mapping method based on graph segmentation

Publications (2)

Publication Number Publication Date
CN110110849A CN110110849A (en) 2019-08-09
CN110110849B true CN110110849B (en) 2023-04-07

Family

ID=67487450

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910353373.XA Active CN110110849B (en) 2019-04-29 2019-04-29 Line fixed data stream mapping method based on graph segmentation

Country Status (1)

Country Link
CN (1) CN110110849B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11861485B2 (en) * 2019-11-22 2024-01-02 Baidu Usa Llc Data format transform method to improve AI engine MAC utilization
CN114781634B (en) * 2022-06-21 2022-11-04 之江实验室 Automatic mapping method and device of neural network array based on memristor

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108241890A (en) * 2018-01-29 2018-07-03 清华大学 A kind of restructural neural network accelerated method and framework
WO2021069211A1 (en) * 2019-10-11 2021-04-15 Robert Bosch Gmbh Method of and apparatus for processing data of a deep neural network

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008077100A2 (en) * 2006-12-19 2008-06-26 Kla-Tencor Corporation Systems and methods for creating inspection recipes
US10614354B2 (en) * 2015-10-07 2020-04-07 Altera Corporation Method and apparatus for implementing layers on a convolutional neural network accelerator

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
A Survey of Deep Neural Network Compression and Acceleration; Ji Rongrong et al.; Journal of Computer Research and Development; 2018-09-15 (No. 09); full text *

Also Published As

Publication number Publication date
CN110110849A (en) 2019-08-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant