CN109543836A - Operation method, device and related product - Google Patents
- Publication number: CN109543836A (application number CN201811456746.8A)
- Authority: CN (China)
- Prior art keywords: data, segment, matrix, expression, algorithm
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored program computers
- G06F15/78—Architectures of general purpose stored program computers comprising a single central processing unit
- G06F15/7807—System on chip, i.e. computer system on a single chip; System in package, i.e. computer system on one or more chips in a single package
- G06F15/781—On-chip cache; Off-chip memory
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
- G06F9/30036—Instructions to perform operations on packed data, e.g. vector, tile or matrix operations
Abstract
This disclosure relates to an operation method, a device, and related products. The method comprises: segmenting the data of a matrix operation algorithm according to segment information to obtain segment data of the data, the data including the input data and output data of the matrix operation algorithm; determining a first intermediate representation of the data according to the segment data of the data; determining the storage space in the on-chip cache corresponding to the first intermediate representation of the data; generating a second intermediate representation of the data according to the storage space; and generating a second intermediate representation of the matrix operation algorithm according to the second intermediate representation of the data and the second intermediate representation of the operator of the matrix operation algorithm. By reaching the algorithm's intermediate representation through the intermediate representations of its data and of its operator, the embodiments of the disclosure allow different algorithms to be implemented on different systems on chip without dedicated interfaces, improving compatibility between algorithms and systems on chip and reducing the difficulty of algorithm development.
Description
Technical field
This disclosure relates to the technical field of information processing, and in particular to an operation method, a device, and related products.
Background
Different systems on chip can be configured with different language expressions; that is, different code can be used when a system on chip is set up for execution. Moreover, neural network algorithms are themselves highly complex: a single algorithm may need to be split before it can execute on a system on chip, and the programming languages used to implement neural network algorithms are likewise varied. When a neural network algorithm implemented in one programming language is executed on a different type of system on chip, a dedicated interface must be provided for that specific language and that specific system on chip, so implementing neural network algorithms on systems on chip is highly complex.
Summary of the invention
In view of this, the present disclosure proposes an operation method, a device, and related products, to reduce the development complexity of implementing neural network algorithms on chip.
According to one aspect of the disclosure, an operation method is provided, the method comprising:
segmenting the data of a matrix operation algorithm according to segment information to obtain segment data of the data, the data including the input data and output data of the matrix operation algorithm;
determining a first intermediate representation of the data according to the segment data of the data;
determining the storage space in the on-chip cache corresponding to the first intermediate representation of the data;
generating a second intermediate representation of the data according to the storage space;
generating a second intermediate representation of the matrix operation algorithm according to the second intermediate representation of the data and the second intermediate representation of the operator of the matrix operation algorithm.
In one possible implementation, the method further comprises:
generating a first executable instruction of the matrix operation algorithm according to a first intermediate representation of the matrix operation algorithm; or
generating a second executable instruction of the matrix operation algorithm according to the second intermediate representation of the matrix operation algorithm.
In one possible implementation, the segment information is determined according to the size of the on-chip cache.
In one possible implementation, the input data of the matrix operation algorithm includes third matrix data of N rows and C columns and fourth matrix data of N rows and C columns, and segmenting the data of the matrix operation algorithm according to segment information to obtain the segment data of the data comprises:
dividing the third matrix data and the fourth matrix data into N segments each according to the segment information, obtaining segment data of the third matrix data and segment data of the fourth matrix data, wherein each segment of the third matrix data has length C and each segment of the fourth matrix data has length C.
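As a minimal sketch of the row-wise segmentation just described (the function name and the list-of-rows data layout are illustrative assumptions, not from the patent), each N x C matrix is cut into N segments of length C, one per row:

```python
def segment_matrix(matrix):
    """Split an N x C matrix (a list of N rows, each of length C)
    into N segments of length C, one segment per row."""
    return [list(row) for row in matrix]

# Third and fourth matrix data, here with N = 2 rows and C = 3 columns.
third_matrix = [[1, 2, 3], [4, 5, 6]]
fourth_matrix = [[7, 8, 9], [10, 11, 12]]

third_segments = segment_matrix(third_matrix)
fourth_segments = segment_matrix(fourth_matrix)
```

Both matrices then yield N segments whose length equals the column count C, matching the segment lengths stated in the claim.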
According to one aspect of the disclosure, an operation device is provided, the device comprising:
a segment data obtaining module, configured to segment the data of a matrix operation algorithm according to segment information to obtain segment data of the data, the data including the input data and output data of the matrix operation algorithm;
a data intermediate representation determining module, configured to determine a first intermediate representation of the data according to the segment data of the data;
a storage space determining module, configured to determine the storage space in the on-chip cache corresponding to the first intermediate representation of the data;
a second intermediate representation determining module, configured to generate a second intermediate representation of the data according to the storage space;
an algorithm intermediate representation determining module, configured to generate a second intermediate representation of the matrix operation algorithm according to the second intermediate representation of the data and the second intermediate representation of the operator of the matrix operation algorithm.
In one possible implementation, the device further comprises:
a first executable instruction generating module, configured to generate a first executable instruction of the matrix operation algorithm according to a first intermediate representation of the matrix operation algorithm; or
a second executable instruction generating module, configured to generate a second executable instruction of the matrix operation algorithm according to the second intermediate representation of the matrix operation algorithm.
In one possible implementation, the segment information is determined according to the size of the on-chip cache.
In one possible implementation, the input data of the matrix operation algorithm includes third matrix data of N rows and C columns and fourth matrix data of N rows and C columns, and segmenting the data of the matrix operation algorithm according to segment information to obtain the segment data of the data comprises:
dividing the third matrix data and the fourth matrix data into N segments each according to the segment information, obtaining segment data of the third matrix data and segment data of the fourth matrix data, wherein each segment of the third matrix data has length C and each segment of the fourth matrix data has length C.
According to one aspect of the disclosure, a neural network computing device is provided. The neural network computing device includes one or more of the operation devices described above and is used to complete a specified neural network operation.
According to one aspect of the disclosure, a combined processing device is provided. The combined processing device includes one or more of the above neural network computing devices, a universal interconnection interface, and other processing devices; the neural network computing device interacts with the other processing devices to jointly complete the computing operation specified by the user.
According to one aspect of the disclosure, a neural network chip is provided, the neural network chip including:
the above operation device; or
the above neural network computing device; or
the above combined processing device.
According to one aspect of the disclosure, an electronic device is provided, the electronic device including:
any of the operation devices described above; or
the above neural network computing device; or
the above combined processing device; or
the above neural network chip.
In the embodiments of the disclosure, the data of an algorithm are segmented according to segment information to obtain segment data; a first intermediate representation of the data can be determined according to the segment data, and a first intermediate representation of the algorithm can be generated according to the first intermediate representation of the data and the first intermediate representation of the algorithm's operator. Because the first intermediate representation of the algorithm is obtained from the first intermediate representations of its data and its operator, different algorithms need no dedicated interfaces when implemented on different systems on chip, which improves compatibility between algorithms and systems on chip and reduces the difficulty of algorithm development.
In some embodiments, the electronic device includes a data processing device, robot, computer, printer, scanner, tablet computer, intelligent terminal, mobile phone, driving recorder, navigator, sensor, camera, server, cloud server, webcam, video camera, projector, watch, earphone, mobile storage, wearable device, vehicle, household appliance, and/or medical device.
In some embodiments, the vehicle includes an aircraft, a ship, and/or a car; the household appliance includes a television, an air conditioner, a microwave oven, a refrigerator, a rice cooker, a humidifier, a washing machine, an electric lamp, a gas stove, and a range hood; the medical device includes a nuclear magnetic resonance instrument, a B-mode ultrasound scanner, and/or an electrocardiograph.
Other features and aspects of the disclosure will become clear from the following detailed description of exemplary embodiments with reference to the accompanying drawings.
Brief description of the drawings
The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate exemplary embodiments, features, and aspects of the disclosure together with the specification, and serve to explain the principles of the disclosure.
Fig. 1 shows a flowchart of an operation method according to an embodiment of the disclosure;
Fig. 2 shows a flowchart of an operation method according to an embodiment of the disclosure;
Fig. 3 shows a flowchart of an operation method according to an embodiment of the disclosure;
Fig. 4 shows a block diagram of an operation device according to an embodiment of the disclosure;
Fig. 5 shows a block diagram of a combined processing device according to an embodiment of the disclosure.
Detailed description
Various exemplary embodiments, features, and aspects of the disclosure are described in detail below with reference to the drawings. Identical reference numerals in the drawings denote elements with identical or similar functions. Although various aspects of the embodiments are shown in the drawings, the drawings need not be drawn to scale unless specifically noted.
The word "exemplary" here means "serving as an example, embodiment, or illustration". Any embodiment described here as "exemplary" is not necessarily to be construed as preferred over or advantageous to other embodiments.
In addition, numerous specific details are given in the following detailed description to better illustrate the disclosure. Those skilled in the art will appreciate that the disclosure can likewise be practiced without certain of these details. In some instances, methods, means, elements, and circuits well known to those skilled in the art are not described in detail, in order to highlight the gist of the disclosure.
Fig. 1 shows a flowchart of an operation method according to an embodiment of the disclosure. As shown in Fig. 1, the operation method includes:
Step S10: segment the data of an algorithm according to segment information to obtain segment data of the data, the data including the input data and output data of the algorithm.
In one possible implementation, a neural network algorithm may include multiple operation granularities, such as fine-grained operations (operation instructions with limited dimensions), coarse-grained operations (for example, a convolution operation), and network-level operations. The algorithms corresponding to different operation granularities involve data of different sizes. Since the cache resources of the system on chip that executes the neural network algorithm are limited, the input data and output data of the algorithm at each operation granularity can be stored in the on-chip cache in segments.
In one possible implementation, for each input datum and output datum, segment information corresponding to that datum can be preset according to the length of the datum itself and the size of the on-chip cache. Each datum may correspond to the same segment information or to different segment information. After the input data and output data are segmented according to the segment information, the segment data of the input data and the segment data of the output data are obtained.
In one possible implementation, the segment information is determined according to the size of the on-chip cache. The segment information includes segment length and/or segment count. For example, according to the size of the on-chip cache, the segment length may be determined to be B bytes. The input data of a convolution algorithm includes input neurons; if the length of the input neurons is A bytes (A > B), the input neurons can be divided into two segments, one of B bytes and one of (A - B) bytes. The segment data corresponding to the input neurons then consists of a first input neuron segment of B bytes and a second input neuron segment of (A - B) bytes.
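The splitting rule above can be sketched as a greedy cut into cache-sized pieces. This is only an illustration of the stated example (the function name and the greedy strategy are assumptions; the patent only fixes the two-segment B / (A - B) outcome):

```python
def segment_lengths(data_len, seg_len):
    """Split a datum of data_len bytes into segments no longer than
    seg_len bytes (the segment length derived from the cache size)."""
    segments = []
    remaining = data_len
    while remaining > 0:
        take = min(remaining, seg_len)
        segments.append(take)
        remaining -= take
    return segments

# A = 10-byte input neurons with a B = 8-byte segment length give one
# segment of B bytes and one of (A - B) bytes, as in the example above.
print(segment_lengths(10, 8))  # [8, 2]
```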
Step S20: generate a first intermediate representation of the data according to the segment data of the data.
In one possible implementation, the intermediate representation and a default intermediate code can be preset as needed; for example, a C-language intermediate representation can be set. The language of the intermediate representation may differ from the languages of the algorithm and the system on chip, or may be the same as either of them; the disclosure does not limit this.
In one possible implementation, a first intermediate representation can be written for each input datum and each output datum according to its segment data. When the algorithm is executed, each datum can be processed segment by segment: the segment data of each datum are fetched and executed in turn, and all segment data of one datum can share the same first intermediate representation. For example, from the first and second segment data of the input neurons, the first intermediate representation of the neuron data can be determined to be Neuron input; the first and second segments of the input neurons then share the first intermediate representation Neuron input.
Step S30: generate a first intermediate representation of the algorithm according to the first intermediate representation of the data and the first intermediate representation of the operator of the algorithm.
In one possible implementation, the first intermediate representation of each algorithm's operator can be preset. For example, the first intermediate representation of the operator of the convolution algorithm may be ConvForward, and the first intermediate representation of the pooling algorithm may be MaxPoolForward.
In one possible implementation, the first intermediate representations of the algorithm's input data, of its output data, and of its operator can be combined in a set manner to obtain the first intermediate representation of the algorithm. For example, the data of the convolution algorithm may include input neurons (first intermediate representation Neuron input), weights (first intermediate representation weight), and a bias (first intermediate representation bias), while the output data may include convolution output neurons (first intermediate representation Neuron output); the operator of the convolution algorithm is the convolution operator (first intermediate representation ConvForward). The intermediate representation of the convolution algorithm can then be ConvForward(Neuron output, Neuron input, weight, bias).
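The "set manner" of combining the representations is not specified in the text; as one illustrative sketch (the call-style format and the output-first argument order are assumptions), the algorithm's first intermediate representation could be assembled like this:

```python
def algorithm_first_ir(op_ir, output_irs, input_irs):
    """Combine the operator's first intermediate representation with the
    first intermediate representations of the output and input data."""
    args = ", ".join(output_irs + input_irs)
    return f"{op_ir}({args})"

# Convolution example from the text: output neurons, then input
# neurons, weights, and bias, under the ConvForward operator IR.
ir = algorithm_first_ir(
    "ConvForward",
    ["Neuron output"],
    ["Neuron input", "weight", "bias"],
)
print(ir)  # ConvForward(Neuron output, Neuron input, weight, bias)
```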
In this embodiment, the data of the algorithm are segmented according to segment information to obtain segment data; the first intermediate representation of the data can be determined according to the segment data, and the first intermediate representation of the algorithm can be generated according to the first intermediate representations of the data and of the algorithm's operator. Because the first intermediate representation of the algorithm is obtained from the first intermediate representations of its data and its operator, different algorithms need no dedicated interfaces when implemented on different systems on chip, which improves compatibility between algorithms and systems on chip and reduces the difficulty of algorithm development.
Fig. 2 shows a flowchart of an operation method according to an embodiment of the disclosure. As shown in Fig. 2, the data are multidimensional, and step S10 of the operation method includes:
Step S11: segment each dimension of the data separately according to the segment information, obtaining dimension segment data for each dimension.
Step S12: obtain the segment data of the data according to the dimension segment data of each dimension of the data.
In one possible implementation, the input data and output data of the algorithm can be multidimensional, and each dimension of the input data and each dimension of the output data can be segmented separately. For example, the input data of the convolution algorithm includes input neurons Neuron input, whose dimensions include the feature input channel input_channal, the input feature map height input_spatial_H, and the input feature map width input_spatial_W. The data of the feature input channel, the input feature map height, and the input feature map width can each be segmented separately.
In one possible implementation, each dimension can correspond to different segment information. The dimension segment data of each dimension can be obtained according to that dimension's segment information, and the segment lengths of different dimensions need not be equal. For example, if the length of the feature input channel input_channal is C and the corresponding segment length is A1, the dimension segment data of the feature input channel can be input_channal(C, A1).
In one possible implementation, the dimension segment data of all dimensions can be combined in a set manner to obtain the segment data of the data. For example, the segment data of the input neurons may be Neuron input(input_channal, input_spatial_H, input_spatial_W).
In this embodiment, each dimension of the data can be segmented separately according to segment information to obtain the dimension segment data of each dimension, and the segment data of the data are obtained according to the dimension segment data of each dimension. Segmenting the data along different dimensions gives the intermediate representation of the data a wider range of use and greater robustness.
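A small sketch of the per-dimension segmentation, under the assumption that each dimension's segment information is simply its own segment length; the pairing input_channal(C, A1) from the example above is modeled as a (size, segment_length) tuple (the dictionary layout and all sizes are illustrative):

```python
def dimension_segment_data(dim_sizes, seg_lengths):
    """Pair each dimension's size with its own segment length, as in
    input_channal(C, A1); dimensions may use different lengths."""
    return {name: (size, seg_lengths[name])
            for name, size in dim_sizes.items()}

# Input neuron dimensions from the text, with assumed sizes and
# per-dimension segment lengths.
neuron_input = dimension_segment_data(
    {"input_channal": 64, "input_spatial_H": 32, "input_spatial_W": 32},
    {"input_channal": 16, "input_spatial_H": 8, "input_spatial_W": 8},
)
print(neuron_input["input_channal"])  # (64, 16)
```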
Fig. 3 shows a flowchart of an operation method according to an embodiment of the disclosure. As shown in Fig. 3, step S30 of the operation method includes:
Step S31: determine the storage space in the on-chip cache corresponding to the first intermediate representation of the data.
Step S32: generate a second intermediate representation of the data according to the storage space.
Step S33: generate a second intermediate representation of the algorithm according to the second intermediate representation of the data and the second intermediate representation of the operator of the algorithm.
In one possible implementation, in algorithms for different operations, different data can be stored in different memories. For example, in a neural network algorithm, the input neuron data, the output neuron data, and the weight data can be stored in separate blocks of on-chip memory.
In one possible implementation, the first intermediate representations of different data can correspond to different storage spaces in the on-chip cache. For example, the first intermediate representation of the input neurons can correspond to address 1 of the on-chip cache, the first intermediate representation of the weights to address 2, the first intermediate representation of the bias to address 3, and the first intermediate representation of the output neurons to address 4. Storage spaces of different sizes in the on-chip cache can be determined for the first intermediate representations of different data.
In one possible implementation, the second intermediate representation of a datum can be generated according to the address of the on-chip cache storage space corresponding to its first intermediate representation. For example, the second intermediate representation of the input neurons can be address 1, that of the weights address 2, that of the bias address 3, and that of the output neurons address 4.
In one possible implementation, the second intermediate representation of each operator can be preset. For example, the second intermediate representation of the convolution operator may be LConvForwar, and the second intermediate representation of the pooling operator may be LMaxPoolForward. The second intermediate representation of an operator can directly reference the data in the on-chip cache to carry out the corresponding operation. The second intermediate representation of the algorithm can be generated according to the second intermediate representations of its data and of its operator. For example, the second intermediate representation of the convolution algorithm can be LConvForwar(address 1, address 2, address 3, address 4).
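As a toy sketch of this address step (the fixed-size allocator, the numeric addresses, and the helper names are assumptions; only the operator and data names come from the text), the second intermediate representation substitutes cache addresses for data names:

```python
def allocate_addresses(names, base=1):
    """Assign each datum a distinct numbered storage region in the
    on-chip cache, like addresses 1..4 in the example above (toy)."""
    return {name: base + i for i, name in enumerate(names)}

def algorithm_second_ir(op_ir, names, addresses):
    """Replace each datum's name with its cache address in the call form."""
    args = ", ".join(f"address {addresses[n]}" for n in names)
    return f"{op_ir}({args})"

names = ["Neuron input", "weight", "bias", "Neuron output"]
addrs = allocate_addresses(names)
print(algorithm_second_ir("LConvForwar", names, addrs))
# LConvForwar(address 1, address 2, address 3, address 4)
```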
In this embodiment, the storage space in the on-chip cache corresponding to the first intermediate representation of the data is determined, the second intermediate representation of the data is generated according to that storage space, and the second intermediate representation of the algorithm is generated according to the second intermediate representations of the data and of the algorithm's operator. Obtaining the algorithm's second intermediate representation from the storage spaces allocated for the data in the on-chip cache allows the data in the on-chip cache to be referenced directly during computation, which can improve the execution efficiency of the algorithm and the compatibility of the algorithm's intermediate representation.
In one possible implementation, the operation method further includes:
generating a first executable instruction of the algorithm according to the first intermediate representation of the algorithm; or
generating a second executable instruction of the algorithm according to the second intermediate representation of the algorithm.
In one possible implementation, a translation library between the code executed by the system on chip and the intermediate representations of the algorithm can be preset; such a translation library can, for example, be implemented in assembly language. Different translation libraries can be set for the first and second intermediate representations of the algorithm. The translation library can be used to convert the first intermediate representation of the algorithm into a first executable instruction, or to convert the second intermediate representation of the algorithm into a second executable instruction.
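The translation library itself is described only as assembly-level. As a hypothetical illustration of the lowering step (the table, the mnemonics, and the string format are all invented; a real library would target the chip's actual instruction set), converting an operator intermediate representation into an instruction could be a table lookup:

```python
# Hypothetical per-target translation table; real mnemonics would come
# from the instruction set of the system on chip.
TRANSLATION_LIBRARY = {
    "ConvForward": "CONV.FWD",
    "MaxPoolForward": "POOL.MAX",
}

def to_executable(ir):
    """Lower an algorithm IR like 'ConvForward(a, b)' into a single
    instruction string using the translation table."""
    op, _, rest = ir.partition("(")
    return TRANSLATION_LIBRARY[op] + " " + rest.rstrip(")")

print(to_executable("ConvForward(Neuron output, Neuron input, weight, bias)"))
# CONV.FWD Neuron output, Neuron input, weight, bias
```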
In this embodiment, a first executable instruction of the algorithm can be generated according to the first intermediate representation of the algorithm, or a second executable instruction of the algorithm can be generated according to the second intermediate representation of the algorithm. The executable instructions generated from the algorithm's intermediate representations can be executed on the system on chip, which widens the scope of application of the algorithm's intermediate representations.
In one possible implementation, the algorithm includes one or any combination of a convolution algorithm, a pooling algorithm, a matrix multiplication algorithm, and a matrix operation algorithm.
In one possible implementation, the algorithms used in various computing devices such as neural network models can include convolution algorithms, pooling algorithms, matrix multiplication algorithms, matrix operation algorithms, and the like. One of these algorithms, or any combination of them, can be converted into executable instructions according to the operation method in the embodiments of the disclosure and then executed by the system on chip. The disclosure places no limitation on the type or content of the algorithm.
In this embodiment, the algorithm includes one or any combination of a convolution algorithm, a pooling algorithm, a matrix multiplication algorithm, and a matrix operation algorithm. Converting the algorithm into executable instructions with the operation method in the embodiments of the disclosure and executing them on the system on chip improves compatibility between algorithms and systems on chip and reduces the difficulty of algorithm development.
Embodiment 1:
In one possible implementation, the data of a convolution algorithm are segmented according to segment information to obtain segment data of the data, the data including the input data and output data of the convolution algorithm;
a first intermediate representation of the data is determined according to the segment data of the data;
the storage space in the on-chip cache corresponding to the first intermediate representation of the data is determined;
a second intermediate representation of the data is generated according to the storage space;
a second intermediate representation of the convolution algorithm is generated according to the second intermediate representation of the data and the second intermediate representation of the operator of the convolution algorithm.
In one possible implementation, the input data of the convolution algorithm includes input neurons, weights and biases, and the output data of the convolution algorithm includes convolution output neurons.
In one possible implementation, the dimensions of the input neurons include: feature input channel, input feature map height and input feature map width;
the dimensions of the convolution output neurons include: convolution channel, convolution feature map height and convolution feature map width;
the dimensions of the weights include: feature input channel, convolution channel, convolution kernel height and convolution kernel width;
the dimension of the biases includes: convolution channel.
In one possible implementation, the input data of the convolution algorithm can be segmented according to the segment information to obtain segment data of the input data. The input neurons, the weights and the biases can each be segmented according to the segment information to obtain input neuron segment data, weight segment data and bias segment data. The output data of the convolution algorithm can be segmented according to the segment information to obtain segment data of the output data. The convolution output neurons can be segmented according to the segment information to obtain convolution output neuron segment data. In the segment information, the segment length and the number of segments may differ for different input data and output data; the present disclosure does not limit this.
In one possible implementation, the data of the convolution algorithm is multidimensional data, and segmenting the data of the convolution algorithm according to the segment information to obtain the segment data of the data includes:
segmenting each dimension of the data separately according to the segment information to obtain dimension segment data of each dimension of the data;
obtaining the segment data of the data according to the dimension segment data of each dimension of the data.
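The two steps above can be sketched in Python as follows. This is a minimal illustration; the function names and the shape of `segment_info` (a mapping from dimension index to segment length) are assumptions, not from the disclosure. Each dimension is split separately, and the Cartesian product of the per-dimension segments yields the segment data of the multidimensional data:

```python
from itertools import product

def segment_dimension(size, seg_len):
    """Split one dimension of length `size` into (start, end) segments
    of at most `seg_len` elements each."""
    return [(i, min(i + seg_len, size)) for i in range(0, size, seg_len)]

def segment_data(shape, segment_info):
    """Segment each dimension separately, then combine the per-dimension
    segments into the segment blocks of the multidimensional data."""
    per_dim = [segment_dimension(size, segment_info[d])
               for d, size in enumerate(shape)]
    return list(product(*per_dim))

# Example: input neurons with 4 feature input channels and an 8x8 feature
# map, segmented into 2 channels and 4x4 tiles per segment.
blocks = segment_data((4, 8, 8), {0: 2, 1: 4, 2: 4})
```

Each element of `blocks` describes one segment as a tuple of per-dimension (start, end) ranges, so the segments can be fetched and executed in turn.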
In one possible implementation, the input data and the output data of the convolution algorithm are multidimensional data. Each dimension of the input data and the output data of the convolution algorithm can be segmented separately according to the segment information to obtain the dimension segment data of each dimension of the input data and the output data. The dimensions of the input neurons include: feature input channel, input feature map height and input feature map width. The feature input channel data can be segmented according to the segment information to obtain the dimension segment data of the feature input channel. The input feature map height data can be segmented according to the segment information to obtain the dimension segment data of the input feature map height. The input feature map width data can be segmented according to the segment information to obtain the dimension segment data of the input feature map width. Similarly, each dimension of the convolution output neurons can be segmented according to the segment information to obtain the dimension segment data of the convolution channel, of the convolution feature map height and of the convolution feature map width. Each dimension of the weights is segmented according to the segment information to obtain the dimension segment data of the feature input channel, of the convolution channel, of the convolution kernel height and of the convolution kernel width. Each dimension of the biases is segmented according to the segment information to obtain the dimension segment data of the convolution channel.
In one possible implementation, the intermediate expression and a default intermediate code can be preset as required. For example, a C-language intermediate expression can be set. The language of the intermediate expression may differ from the languages of the algorithm and the system on chip, or may be the same as the language of the algorithm or the system on chip. The present disclosure does not limit this.
In one possible implementation, the first intermediate expression of each input data and the first intermediate expression of each output data of the convolution algorithm can be written according to the segment data of each input data and each output data. When the algorithm is executed, each data can be executed segment by segment, and the segment data of each data can be fetched and executed in turn.
In one possible implementation, the segment data of one data can share the same first intermediate expression. For example, according to the first segment data and the second segment data of the input neurons, the first intermediate expression of the input neuron data can be determined as Neuron_input. The first segment data and the second segment data of the input neurons can then share the first intermediate expression Neuron_input.
In one possible implementation, the first intermediate expression of the convolution algorithm is generated according to the first intermediate expression of the data and the first intermediate expression of the convolution algorithm operator.
In one possible implementation, the first intermediate expression of the operator of each algorithm can be preset. For example, the first intermediate expression of the operator of the convolution algorithm is ConvForward.
In one possible implementation, the first intermediate expressions of the input data of the convolution algorithm, the first intermediate expressions of the output data, and the first intermediate expression of the convolution algorithm operator can be combined in a set manner to obtain the first intermediate expression of the algorithm. For example, the input data of the convolution algorithm may include input neurons (first intermediate expression: Neuron_input), weights (first intermediate expression: weight) and biases (first intermediate expression: bias), the output data may include convolution output neurons (first intermediate expression: Neuron_output), and the operator of the convolution algorithm is the convolution operator (first intermediate expression: ConvForward). The first intermediate expression of the convolution algorithm can then be ConvForward(Neuron_output, Neuron_input, weight, bias).
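The combination just described can be sketched as a simple string build. The helper name and the exact "set manner" operator(outputs..., inputs...) are assumptions for illustration:

```python
def first_intermediate_expression(operator_expr, output_exprs, input_exprs):
    """Combine the operator's first intermediate expression with the first
    intermediate expressions of the output data and the input data, in the
    set manner operator(outputs..., inputs...)."""
    args = list(output_exprs) + list(input_exprs)
    return f"{operator_expr}({', '.join(args)})"

# Convolution: operator ConvForward, output Neuron_output,
# inputs Neuron_input, weight and bias.
expr = first_intermediate_expression(
    "ConvForward", ["Neuron_output"], ["Neuron_input", "weight", "bias"])
# expr == "ConvForward(Neuron_output, Neuron_input, weight, bias)"
```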
In one possible implementation, the first executable instruction of the convolution algorithm can be generated according to the first intermediate expression of the convolution algorithm, or the second executable instruction of the convolution algorithm can be generated according to the second intermediate expression of the convolution algorithm.
In one possible implementation, a conversion library between the execution code of the system on chip and the intermediate expression of the convolution algorithm can be preset. For example, the conversion library can be implemented in assembly language. Different conversion libraries can be provided for the first intermediate expression and the second intermediate expression of the convolution algorithm. The conversion library can be used to convert the first intermediate expression of the convolution algorithm into the first executable instruction, or to convert the second intermediate expression of the convolution algorithm into the second executable instruction.
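A conversion library of this kind can be modeled as a mapping from operator intermediate expressions to templates of device instructions. The instruction mnemonics below are invented placeholders, since the actual library is implemented in assembly for a concrete system on chip:

```python
# Hypothetical conversion library: maps the operator part of a first
# intermediate expression to a template of system-on-chip instructions.
# The mnemonics LOAD/CONV/STORE are placeholders, not real device opcodes.
CONVERSION_LIBRARY = {
    "ConvForward": ["LOAD {1}", "LOAD {2}", "LOAD {3}", "CONV", "STORE {0}"],
}

def to_executable(intermediate_expr):
    """Convert a first intermediate expression such as
    'ConvForward(Neuron_output, Neuron_input, weight, bias)' into a list
    of (placeholder) executable instructions."""
    op, _, rest = intermediate_expr.partition("(")
    args = [a.strip() for a in rest.rstrip(")").split(",")]
    return [line.format(*args) for line in CONVERSION_LIBRARY[op]]

instrs = to_executable("ConvForward(Neuron_output, Neuron_input, weight, bias)")
```

A second conversion library, keyed by the second intermediate expressions, could be structured the same way but emit instructions that address the storage spaces in the on-chip cache.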
In the present embodiment, the data of the convolution algorithm is segmented according to the segment information to obtain the segment data of the data, the first intermediate expression of the data can be determined according to the segment data of the data, and the first intermediate expression of the convolution algorithm can be generated according to the first intermediate expression of the data and the first intermediate expression of the convolution algorithm operator. Because the convolution algorithm is expressed through the first intermediate expressions of the data and the operator, no algorithm-specific interface needs to be provided when the convolution algorithm is implemented on different systems on chip, which improves the compatibility between the convolution algorithm and the system on chip and reduces the difficulty of algorithm development.
Embodiment 2:
In one possible implementation, the data of the convolution algorithm is segmented according to segment information to obtain segment data of the data, the data including input data and output data of the convolution algorithm;
a first intermediate expression of the data is determined according to the segment data of the data;
a first intermediate expression of the convolution algorithm is generated according to the first intermediate expression of the data and the first intermediate expression of the convolution algorithm operator.
In one possible implementation, the data is multidimensional data, and segmenting the data of the convolution algorithm according to the segment information to obtain the segment data of the data includes:
segmenting each dimension of the data separately according to the segment information to obtain the dimension segment data of each dimension of the data;
obtaining the segment data of the data according to the dimension segment data of each dimension of the data.
In one possible implementation, the method further includes:
generating the first executable instruction of the convolution algorithm according to the first intermediate expression of the convolution algorithm.
The present embodiment differs from Embodiment 1 in that the first intermediate expression of the convolution algorithm is generated only according to the first intermediate expression of the data of the convolution algorithm and the first intermediate expression of the convolution algorithm operator, and the first executable instruction of the convolution algorithm is generated according to the first intermediate expression of the convolution algorithm. The second intermediate expression of the convolution algorithm is not generated according to the storage space in the on-chip cache corresponding to the first intermediate expression of the convolution algorithm, nor is the second executable instruction of the convolution algorithm generated according to the second intermediate expression of the convolution algorithm.
Embodiment 3:
In one possible implementation, the data of the pooling algorithm can be segmented according to segment information to obtain segment data of the data, the data including input data and output data of the pooling algorithm;
a first intermediate expression of the data is determined according to the segment data of the data;
a storage space in the on-chip cache corresponding to the first intermediate expression of the data is determined;
a second intermediate expression of the data is generated according to the storage space;
a second intermediate expression of the pooling algorithm is generated according to the second intermediate expression of the data and a second intermediate expression of the pooling algorithm operator.
In one possible implementation, the data of the pooling algorithm is multidimensional data, and segmenting the data of the pooling algorithm according to the segment information to obtain the segment data of the data includes:
segmenting each dimension separately according to the segment information to obtain the dimension segment data of each dimension;
obtaining the segment data of the data according to the dimension segment data of each dimension of the data.
In one possible implementation, the input data of the pooling algorithm includes convolution output neurons, and the output data of the pooling algorithm includes pooling output neurons.
In one possible implementation, the input data of the pooling algorithm can be segmented according to the segment information to obtain segment data of the input data of the pooling algorithm. The convolution output neurons can be segmented according to the segment information to obtain convolution output neuron segment data. The output data of the pooling algorithm can be segmented according to the segment information to obtain segment data of the output data of the pooling algorithm. The pooling output neurons can be segmented according to the segment information to obtain pooling output neuron segment data. In the segment information, the segment length and the number of segments may differ for the input data and the output data of the pooling algorithm; the present disclosure does not limit this.
In one possible implementation, the dimensions of the convolution output neurons include: convolution channel, convolution feature map height and convolution feature map width; the dimensions of the pooling output neurons include: convolution channel, pooling feature map height and pooling feature map width.
In one possible implementation, each dimension of the convolution output neurons in the pooling algorithm can be segmented according to the segment information to obtain the dimension segment data of each dimension. The convolution channel, the convolution feature map height and the convolution feature map width can each be segmented according to the segment information to obtain the dimension segment data of the convolution channel, of the convolution feature map height and of the convolution feature map width. Each dimension of the pooling output neurons in the pooling algorithm can be segmented according to the segment information to obtain the dimension segment data of each dimension. The convolution channel, the pooling feature map height and the pooling feature map width can be segmented according to the segment information to obtain the dimension segment data of the convolution channel, of the pooling feature map height and of the pooling feature map width.
In one possible implementation, the intermediate expression and a default intermediate code can be preset as required. For example, a C-language intermediate expression can be set. The language of the intermediate expression may differ from the languages of the algorithm and the system on chip, or may be the same as the language of the algorithm or the system on chip. The present disclosure does not limit this.
In one possible implementation, the first intermediate expression of each input data and of each output data of the pooling algorithm can be written according to the segment data of each input data and each output data. When the pooling algorithm is executed, each data can be executed segment by segment, the segment data of each data can be fetched and executed in turn, and the segment data of one data can share the same first intermediate expression.
In one possible implementation, the method further includes:
generating the first executable instruction of the pooling algorithm according to the first intermediate expression of the pooling algorithm, or
generating the second executable instruction of the pooling algorithm according to the second intermediate expression of the pooling algorithm.
In one possible implementation, a conversion library between the execution code of the system on chip and the intermediate expression of the pooling algorithm can be preset. For example, the conversion library can be implemented in assembly language. Different conversion libraries can be provided for the first intermediate expression and the second intermediate expression of the pooling algorithm. The conversion library can be used to convert the first intermediate expression of the pooling algorithm into the first executable instruction, or to convert the second intermediate expression of the pooling algorithm into the second executable instruction.
In the present embodiment, the data of the pooling algorithm is segmented according to the segment information to obtain the segment data of the data, the first intermediate expression of the data can be determined according to the segment data of the data, and the first intermediate expression of the pooling algorithm can be generated according to the first intermediate expression of the data and the first intermediate expression of the pooling algorithm operator. Because the pooling algorithm is expressed through the first intermediate expressions of the data and the operator, no algorithm-specific interface needs to be provided when the pooling algorithm is implemented on different systems on chip, which improves the compatibility between the pooling algorithm and the system on chip and reduces the difficulty of algorithm development.
Embodiment 4:
In one possible implementation, the data of the pooling algorithm is segmented according to segment information to obtain segment data of the data, the data including input data and output data of the pooling algorithm;
a first intermediate expression of the data is determined according to the segment data of the data;
a first intermediate expression of the pooling algorithm is generated according to the first intermediate expression of the data and the first intermediate expression of the pooling algorithm operator.
In one possible implementation, the data is multidimensional data, and segmenting the data of the pooling algorithm according to the segment information to obtain the segment data of the data includes:
segmenting each dimension separately according to the segment information to obtain the dimension segment data of each dimension;
obtaining the segment data of the data according to the dimension segment data of each dimension of the data.
In one possible implementation, the method further includes:
generating the first executable instruction of the pooling algorithm according to the first intermediate expression of the pooling algorithm.
The present embodiment differs from Embodiment 3 in that the first intermediate expression of the pooling algorithm is generated only according to the first intermediate expression of the data of the pooling algorithm and the first intermediate expression of the pooling algorithm operator, and the first executable instruction of the pooling algorithm is generated according to the first intermediate expression of the pooling algorithm. The second intermediate expression of the pooling algorithm is not generated according to the storage space in the on-chip cache corresponding to the first intermediate expression of the pooling algorithm, nor is the second executable instruction of the pooling algorithm generated according to the second intermediate expression of the pooling algorithm.
Embodiment 5:
In one possible implementation, the data of the matrix multiplication algorithm is segmented according to segment information to obtain segment data of the data, the data including input data and output data of the matrix multiplication algorithm;
a first intermediate expression of the data is determined according to the segment data of the data;
a storage space in the on-chip cache corresponding to the first intermediate expression of the data is determined;
a second intermediate expression of the data is generated according to the storage space;
a second intermediate expression of the matrix multiplication algorithm is generated according to the second intermediate expression of the data and a second intermediate expression of the matrix multiplication algorithm operator.
In one possible implementation, the input data of the matrix multiplication algorithm includes first matrix data of N rows and C columns, the output data of the matrix multiplication algorithm includes second matrix data of N rows and M columns, and segmenting the data of the algorithm according to the segment information to obtain the segment data of the data includes:
dividing the first matrix data and the second matrix data into N segments each according to the segment information to obtain first matrix segment data and second matrix segment data, wherein each segment of the first matrix segment data has length C, and each segment of the second matrix segment data has length M.
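For matrix data, the segmentation just described reduces to a row split: N rows yield N segments whose length equals the column count. A minimal Python sketch (the function name and plain nested lists are illustrative assumptions):

```python
def segment_matrix_rows(matrix, n_segments):
    """Divide a matrix (list of rows) into n_segments row segments; with
    one row per segment, each segment's length equals the column count."""
    rows_per_seg = len(matrix) // n_segments
    return [matrix[i * rows_per_seg:(i + 1) * rows_per_seg]
            for i in range(n_segments)]

# First matrix data: N=3 rows, C=2 columns -> 3 segments of length 2 each.
first_matrix = [[1, 2], [3, 4], [5, 6]]
segments = segment_matrix_rows(first_matrix, 3)
```

The second matrix data of N rows and M columns would be split the same way, giving N segments of length M.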
In one possible implementation, the input data of the matrix multiplication algorithm can be segmented according to the segment information to obtain segment data of the input data of the matrix multiplication algorithm. The first matrix data can be segmented according to the segment information to obtain first matrix segment data. The output data of the matrix multiplication algorithm can be segmented according to the segment information to obtain segment data of the output data of the matrix multiplication algorithm. The second matrix data can be segmented according to the segment information to obtain second matrix segment data. In the segment information, the segment length and the number of segments may differ for the input data and the output data of the matrix multiplication algorithm; the present disclosure does not limit this.
In one possible implementation, the intermediate expression and a default intermediate code can be preset as required. For example, a C-language intermediate expression can be set. The language of the intermediate expression may differ from the languages of the algorithm and the system on chip, or may be the same as the language of the algorithm or the system on chip. The present disclosure does not limit this.
In one possible implementation, the first intermediate expression of each input data and of each output data of the matrix multiplication algorithm can be written according to the segment data of each input data and each output data. When the matrix multiplication algorithm is executed, each data can be executed segment by segment, the segment data of each data can be fetched and executed in turn, and the segment data of one data can share the same first intermediate expression.
In one possible implementation, the method further includes:
generating the first executable instruction of the matrix multiplication algorithm according to the first intermediate expression of the matrix multiplication algorithm, or
generating the second executable instruction of the matrix multiplication algorithm according to the second intermediate expression of the matrix multiplication algorithm.
In one possible implementation, a conversion library between the execution code of the system on chip and the intermediate expression of the matrix multiplication algorithm can be preset. For example, the conversion library can be implemented in assembly language. Different conversion libraries can be provided for the first intermediate expression and the second intermediate expression of the matrix multiplication algorithm. The conversion library can be used to convert the first intermediate expression of the matrix multiplication algorithm into the first executable instruction, or to convert the second intermediate expression of the matrix multiplication algorithm into the second executable instruction.
In the present embodiment, the data of the matrix multiplication algorithm is segmented according to the segment information to obtain the segment data of the data, the first intermediate expression of the data can be determined according to the segment data of the data, and the first intermediate expression of the matrix multiplication algorithm can be generated according to the first intermediate expression of the data and the first intermediate expression of the matrix multiplication algorithm operator. Because the matrix multiplication algorithm is expressed through the first intermediate expressions of the data and the operator, no algorithm-specific interface needs to be provided when the matrix multiplication algorithm is implemented on different systems on chip, which improves the compatibility between the matrix multiplication algorithm and the system on chip and reduces the difficulty of algorithm development.
Embodiment 6:
In one possible implementation, the data of the matrix multiplication algorithm is segmented according to segment information to obtain segment data of the data, the data including input data and output data of the matrix multiplication algorithm;
a first intermediate expression of the data is determined according to the segment data of the data;
a first intermediate expression of the matrix multiplication algorithm is generated according to the first intermediate expression of the data and the first intermediate expression of the matrix multiplication algorithm operator.
In one possible implementation, the method further includes:
generating the first executable instruction of the matrix multiplication algorithm according to the first intermediate expression of the matrix multiplication algorithm.
The present embodiment differs from Embodiment 5 in that the first intermediate expression of the matrix multiplication algorithm is generated only according to the first intermediate expression of the data of the matrix multiplication algorithm and the first intermediate expression of the matrix multiplication algorithm operator, and the first executable instruction of the matrix multiplication algorithm is generated according to the first intermediate expression of the matrix multiplication algorithm. The second intermediate expression of the matrix multiplication algorithm is not generated according to the storage space in the on-chip cache corresponding to the first intermediate expression of the matrix multiplication algorithm, nor is the second executable instruction of the matrix multiplication algorithm generated according to the second intermediate expression of the matrix multiplication algorithm.
Embodiment 7:
In one possible implementation, the data of the matrix addition algorithm is segmented according to segment information to obtain segment data of the data, the data including input data and output data of the matrix addition algorithm;
a first intermediate expression of the data is determined according to the segment data of the data;
a storage space in the on-chip cache corresponding to the first intermediate expression of the data is determined;
a second intermediate expression of the data is generated according to the storage space;
a second intermediate expression of the matrix addition algorithm is generated according to the second intermediate expression of the data and a second intermediate expression of the matrix addition algorithm operator.
In one possible implementation, the input data of the matrix addition algorithm includes third matrix data of N rows and C columns and fourth matrix data of N rows and C columns, and segmenting the data of the algorithm according to the segment information to obtain the segment data of the data includes:
dividing the third matrix data and the fourth matrix data into N segments each according to the segment information to obtain third matrix segment data and fourth matrix segment data, wherein each segment of the third matrix segment data has length C, and each segment of the fourth matrix segment data has length C.
In one possible implementation, the input data of the matrix addition algorithm can be segmented according to the segment information to obtain segment data of the input data of the matrix addition algorithm. The third matrix data can be segmented according to the segment information to obtain third matrix segment data. The output data of the matrix addition algorithm can be segmented according to the segment information to obtain segment data of the output data of the matrix addition algorithm. The fourth matrix data can be segmented according to the segment information to obtain fourth matrix segment data. In the segment information, the segment length and the number of segments may differ for the input data and the output data of the matrix addition algorithm; the present disclosure does not limit this.
In one possible implementation, the method further includes:
generating the first executable instruction of the matrix addition algorithm according to the first intermediate expression of the matrix addition algorithm, or
generating the second executable instruction of the matrix addition algorithm according to the second intermediate expression of the matrix addition algorithm.
In one possible implementation, a conversion library between the execution code of the system on chip and the intermediate expression of the matrix addition algorithm can be preset. For example, the conversion library can be implemented in assembly language. Different conversion libraries can be provided for the first intermediate expression and the second intermediate expression of the matrix addition algorithm. The conversion library can be used to convert the first intermediate expression of the matrix addition algorithm into the first executable instruction, or to convert the second intermediate expression of the matrix addition algorithm into the second executable instruction.
In the present embodiment, the data of the matrix addition algorithm is segmented according to the segment information to obtain the segment data of the data, the first intermediate expression of the data can be determined according to the segment data of the data, and the first intermediate expression of the matrix addition algorithm can be generated according to the first intermediate expression of the data and the first intermediate expression of the matrix addition algorithm operator. Because the matrix addition algorithm is expressed through the first intermediate expressions of the data and the operator, no algorithm-specific interface needs to be provided when the matrix addition algorithm is implemented on different systems on chip, which improves the compatibility between the matrix addition algorithm and the system on chip and reduces the difficulty of algorithm development.
Embodiment 8:
In one possible implementation, the data of the matrix addition algorithm is segmented according to segment information to obtain segment data of the data, the data including input data and output data of the matrix addition algorithm;
a first intermediate expression of the data is determined according to the segment data of the data;
a first intermediate expression of the matrix addition algorithm is generated according to the first intermediate expression of the data and the first intermediate expression of the matrix addition algorithm operator.
In one possible implementation, the method further includes:
generating the first executable instruction of the matrix addition algorithm according to the first intermediate expression of the matrix addition algorithm.
The present embodiment differs from Embodiment 7 in that the first intermediate expression of the matrix addition algorithm is generated only according to the first intermediate expression of the data of the matrix addition algorithm and the first intermediate expression of the matrix addition algorithm operator, and the first executable instruction of the matrix addition algorithm is generated according to the first intermediate expression of the matrix addition algorithm. The second intermediate expression of the matrix addition algorithm is not generated according to the storage space in the on-chip cache corresponding to the first intermediate expression of the matrix addition algorithm, nor is the second executable instruction of the matrix addition algorithm generated according to the second intermediate expression of the matrix addition algorithm.
Fig. 4 shows a block diagram of an arithmetic device according to an embodiment of the present disclosure. As shown in Fig. 4, the arithmetic device includes:
a segment data obtaining module 10, configured to segment the data of an algorithm according to segment information to obtain segment data of the data, the data including input data and output data of the algorithm;
a data intermediate expression determining module 20, configured to generate a first intermediate expression of the data according to the segment data of the data;
an algorithm intermediate expression determining module 30, configured to generate a first intermediate expression of the algorithm according to the first intermediate expression of the data and a first intermediate expression of the algorithm operator.
In one possible implementation, the data is multidimensional data, and the segment data obtaining module includes:
a dimension segment data obtaining submodule, configured to segment each dimension separately according to the segment information to obtain the dimension segment data of each dimension;
a segment data obtaining submodule, configured to obtain the segment data of the data according to the dimension segment data of each dimension of the data.
In one possible implementation, the algorithm intermediate expression determining module includes:
a storage space determining submodule, configured to determine a storage space in the on-chip cache corresponding to the first intermediate expression of the data;
a second intermediate expression determining submodule, configured to generate a second intermediate expression of the data according to the storage space;
an algorithm intermediate expression determining submodule, configured to generate a second intermediate expression of the algorithm according to the second intermediate expression of the data and a second intermediate expression of the algorithm operator.
In one possible implementation, the device further includes:
a first executable instruction generation module, configured to generate a first executable instruction of the algorithm according to the first intermediate representation of the algorithm; or
a second executable instruction generation module, configured to generate a second executable instruction of the algorithm according to the second intermediate representation of the algorithm.
In one possible implementation, the segment information is determined according to the size of the on-chip cache.
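One plausible way to derive segment information from the on-chip cache size is to pick the largest row segment such that all operands of one segment fit on chip at once. The three-buffer assumption (two inputs plus one output) and every name below are illustrative, not taken from the disclosure:

```python
def segment_length(cache_bytes, row_bytes, buffers_needed=3):
    """Largest number of rows per segment such that `buffers_needed`
    row segments (e.g. two inputs and one output) fit in the cache."""
    rows = cache_bytes // (row_bytes * buffers_needed)
    if rows == 0:
        raise ValueError("one row does not fit in the on-chip cache")
    return rows

# e.g. a 4 KB cache, rows of 64 floats (256 bytes), two inputs + one output:
seg = segment_length(cache_bytes=4096, row_bytes=256, buffers_needed=3)
```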
In one possible implementation, the algorithm includes one or any combination of a convolution algorithm, a pooling algorithm, a matrix multiplication algorithm, and a matrix addition algorithm.
In one possible implementation, the input data of the convolution algorithm includes input neurons, weights, and biases, and the output data of the convolution algorithm includes convolution output neurons.
In one possible implementation, the dimensions of the input neurons include: feature input channel, input feature map height, and input feature map width;
the dimensions of the convolution output neurons include: convolution channel, convolution feature map height, and convolution feature map width;
the dimensions of the weights include: feature input channel, convolution channel, feature map convolution kernel height, and feature map convolution kernel width; and
the dimension of the biases includes: convolution channel.
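The operand dimensions listed above can be collected into a small shape table. The output-size formula below is the standard convolution arithmetic (stride and padding are not specified by the disclosure), and all names are illustrative:

```python
def conv_shapes(cin, h, w, cout, kh, kw, stride=1, pad=0):
    """Shapes of the convolution operands named in the text, using the
    standard output-size formula (an assumption, not from the disclosure)."""
    out_h = (h + 2 * pad - kh) // stride + 1
    out_w = (w + 2 * pad - kw) // stride + 1
    return {
        "input":  (cin, h, w),          # feature input channel, height, width
        "weight": (cin, cout, kh, kw),  # input channel, conv channel, kernel h, kernel w
        "bias":   (cout,),              # one bias per convolution channel
        "output": (cout, out_h, out_w), # conv channel, conv feature map h, w
    }

shapes = conv_shapes(cin=3, h=32, w=32, cout=16, kh=3, kw=3, stride=1, pad=1)
```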
In one possible implementation, the input data of the pooling algorithm includes convolution output neurons, and the output data of the pooling algorithm includes pooling output neurons.
In one possible implementation, the dimensions of the convolution output neurons include: convolution channel, convolution feature map height, and convolution feature map width; and
the dimensions of the pooling output neurons include: convolution channel, pooling feature map height, and pooling feature map width.
In one possible implementation, the input data of the matrix multiplication algorithm includes first matrix data of N rows and C columns, the output data of the matrix multiplication algorithm includes second matrix data of N rows and M columns, and the segment data obtaining module includes:
a matrix multiplication segment data obtaining submodule, configured to divide the first matrix data and the second matrix data into N segments each according to the segment information, obtaining the segment data of the first matrix data and the segment data of the second matrix data, where each segment of the first matrix data has length C and each segment of the second matrix data has length M.
In one possible implementation, the input data of the matrix addition algorithm includes third matrix data of N rows and C columns and fourth matrix data of N rows and C columns, and the segment data obtaining module includes:
a matrix addition segment data obtaining submodule, configured to divide the third matrix data and the fourth matrix data into N segments each according to the segment information, obtaining the segment data of the third matrix data and the segment data of the fourth matrix data, where each segment of the third matrix data has length C and each segment of the fourth matrix data has length C.
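The matrix addition segmentation pairs the i-th length-C segment of the third matrix with the i-th length-C segment of the fourth matrix. A minimal sketch with plain Python lists (names illustrative):

```python
def add_by_segments(a, b):
    """Add two N x C matrices one length-C row segment at a time,
    mirroring the segmentation in the text: each matrix is split into
    N segments of length C, and matching segments are added."""
    assert len(a) == len(b) and all(len(x) == len(y) for x, y in zip(a, b))
    return [[x + y for x, y in zip(seg_a, seg_b)] for seg_a, seg_b in zip(a, b)]

a = [[1, 2, 3], [4, 5, 6]]        # N=2 segments of length C=3
b = [[10, 20, 30], [40, 50, 60]]
s = add_by_segments(a, b)
```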
Fig. 5 shows a block diagram of a combined processing device according to an embodiment of the disclosure. As shown in Fig. 5, the combined processing device includes the above neural network arithmetic device, a general interconnection interface, and other processing devices.
The neural network arithmetic device interacts with the other processing devices to jointly complete the operation specified by the user. The other processing devices include one or more types of general-purpose or special-purpose processors, such as a central processing unit (CPU), a graphics processing unit (GPU), or a neural network processor. The number of processors included in the other processing devices is not limited. The other processing devices serve as the interface between the neural network arithmetic device and external data and control, performing data transfers and basic control of the neural network arithmetic device such as starting and stopping; the other processing devices may also cooperate with the neural network arithmetic device to jointly complete operation tasks. The general interconnection interface is configured to transmit data and control instructions between the neural network arithmetic device and the other processing devices. The neural network arithmetic device obtains the required input data from the other processing devices and writes it to an on-chip storage device of the neural network arithmetic device; it may obtain control instructions from the other processing devices and write them to an on-chip control cache of the neural network arithmetic device; and it may also read data from a storage module of the neural network arithmetic device and transmit that data to the other processing devices.
The combined processing device may further include a storage device, connected to the neural network arithmetic device and the other processing devices, respectively. The storage device is configured to store data of the neural network arithmetic device and the other processing devices, and is particularly suitable for data that cannot be entirely held in the internal storage of the neural network arithmetic device or the other processing devices.
The combined processing device can serve as the system-on-chip (SoC) of devices such as mobile phones, robots, unmanned aerial vehicles, and video surveillance equipment, effectively reducing the die area of the control portion, increasing processing speed, and lowering overall power consumption. In this case, the general interconnection interface of the combined processing device is connected to certain components of the device, such as a camera, a display, a mouse, a keyboard, a network card, or a Wi-Fi interface.
In one possible implementation, the disclosure also provides a neural network chip, which includes the above neural network arithmetic device or combined processing device.
In one possible implementation, the disclosure also provides a chip packaging structure, which includes the above chip.
In one possible implementation, the disclosure also provides a board card, which includes the above chip packaging structure.
In one possible implementation, the disclosure also provides an electronic device, which includes the above board card.
The electronic device includes a data processing device, robot, computer, printer, scanner, tablet computer, intelligent terminal, mobile phone, driving recorder, navigator, sensor, webcam, server, cloud server, camera, video camera, projector, watch, earphone, mobile storage, wearable device, vehicle, household appliance, and/or medical device.
The vehicle includes an aircraft, a ship, and/or a car; the household appliance includes a television, an air conditioner, a microwave oven, a refrigerator, an electric rice cooker, a humidifier, a washing machine, an electric lamp, a gas stove, and a range hood; the medical device includes a nuclear magnetic resonance instrument, a B-mode ultrasound scanner, and/or an electrocardiograph.
It should be noted that, for ease of description, the foregoing method embodiments are presented as a series of combined actions. However, those skilled in the art should understand that the disclosure is not limited by the described order of actions, because according to the disclosure, some steps may be performed in other orders or simultaneously. Further, those skilled in the art should also understand that the embodiments described in this specification are optional embodiments, and the actions and modules involved are not necessarily required by the disclosure.
In the above embodiments, each embodiment has its own emphasis; for parts not detailed in one embodiment, reference may be made to the related descriptions of other embodiments.
In the several embodiments provided by this disclosure, it should be understood that the disclosed device may be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division into units, for instance, is only a division by logical function; other divisions are possible in actual implementation. For example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. Moreover, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, devices, or units, and may be electrical or take other forms.
The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the embodiment.
In addition, the functional units in the embodiments of the disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software program module.
If the integrated unit is implemented in the form of a software program module and sold or used as an independent product, it may be stored in a computer-readable memory. Based on this understanding, the technical solution of the disclosure, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a memory and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods of the embodiments of the disclosure. The aforementioned memory includes various media that can store program code, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.
Those of ordinary skill in the art can understand that all or some of the steps of the methods of the above embodiments may be completed by a program instructing related hardware. The program may be stored in a computer-readable memory, and the memory may include a flash drive, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or the like.
The embodiments of the disclosure have been described in detail above, and specific examples are used herein to explain the principles and implementations of the disclosure. The above description of the embodiments is intended only to help understand the method of the disclosure and its core ideas. At the same time, those skilled in the art may make changes to the specific implementations and the scope of application according to the ideas of the disclosure. In summary, the content of this specification should not be construed as limiting the disclosure.
Aspects of the disclosure are described herein with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the disclosure. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer-readable program instructions.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to multiple embodiments of the disclosure. In this regard, each block in the flowcharts or block diagrams may represent a module, program segment, or portion of instructions that contains one or more executable instructions for implementing the specified logical functions. In some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two consecutive blocks may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by special-purpose hardware-based systems that perform the specified functions or actions, or by combinations of special-purpose hardware and computer instructions.
The embodiments of the disclosure have been described above. The foregoing description is exemplary rather than exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terms used herein were chosen to best explain the principles of the embodiments, their practical application, or the technical improvement over technologies in the market, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.
Claims (12)
1. An operation method, characterized in that the method includes:
segmenting data of a matrix addition algorithm according to segment information to obtain segment data of the data, where the data includes input data and output data of the matrix addition algorithm;
determining a first intermediate representation of the data according to the segment data of the data;
determining a storage space in the on-chip cache corresponding to the first intermediate representation of the data;
generating a second intermediate representation of the data according to the storage space; and
generating a second intermediate representation of the matrix addition algorithm according to the second intermediate representation of the data and a second intermediate representation of the matrix addition operator.
2. the method according to claim 1, wherein the method also includes:
The first executable instruction of the matrix computation system is generated according to the first centre expression of the matrix computation system, or
The second executable instruction of the matrix computation system is generated according to the second of the matrix computation system the centre expression.
3. the method according to claim 1, wherein the segment information is determined according to the size that on piece caches.
4. the method according to claim 1, wherein the input data of the matrix computation system includes that N row C is arranged
4th matrix data of third matrix data and N row C column obtains institute according to segment information by the data sectional of matrix computation system
State the segment data of data, comprising:
Third matrix data and the 4th matrix data are respectively divided into N sections according to segment information, obtain point of third matrix data
The segment data of segment data and the 4th matrix data, wherein the every segment length of the segment data of third matrix data is C, the 4th square
The every segment length of segment data of battle array data is C.
5. An arithmetic device, characterized in that the device includes:
a segment data obtaining module, configured to segment data of a matrix addition algorithm according to segment information to obtain segment data of the data, where the data includes input data and output data of the matrix addition algorithm;
a data intermediate representation determining module, configured to determine a first intermediate representation of the data according to the segment data of the data;
a storage space determining module, configured to determine a storage space in the on-chip cache corresponding to the first intermediate representation of the data;
a second intermediate representation determining module, configured to generate a second intermediate representation of the data according to the storage space; and
an algorithm intermediate representation determining module, configured to generate a second intermediate representation of the matrix addition algorithm according to the second intermediate representation of the data and a second intermediate representation of the matrix addition operator.
6. The device according to claim 5, characterized in that the device further includes:
a first executable instruction generation module, configured to generate a first executable instruction of the matrix addition algorithm according to a first intermediate representation of the matrix addition algorithm; or
a second executable instruction generation module, configured to generate a second executable instruction of the matrix addition algorithm according to the second intermediate representation of the matrix addition algorithm.
7. The device according to claim 5, characterized in that the segment information is determined according to the size of the on-chip cache.
8. The device according to claim 5, characterized in that the input data of the matrix addition algorithm includes third matrix data of N rows and C columns and fourth matrix data of N rows and C columns, and segmenting the data of the matrix addition algorithm according to the segment information to obtain the segment data of the data includes:
dividing the third matrix data and the fourth matrix data into N segments each according to the segment information to obtain the segment data of the third matrix data and the segment data of the fourth matrix data, where each segment of the third matrix data has length C and each segment of the fourth matrix data has length C.
9. A neural network arithmetic device, characterized in that the neural network arithmetic device includes one or more arithmetic devices according to any one of claims 5 to 8, and the neural network arithmetic device is configured to complete a specified neural network operation.
10. A combined processing device, characterized in that the combined processing device includes one or more neural network arithmetic devices according to claim 9, a general interconnection interface, and other processing devices;
the neural network arithmetic device interacts with the other processing devices to jointly complete the computing operation specified by the user.
11. A neural network chip, characterized in that the neural network chip includes:
the arithmetic device according to any one of claims 5 to 8; or
the neural network arithmetic device according to claim 9; or
the combined processing device according to claim 10.
12. An electronic device, characterized in that the electronic device includes:
the arithmetic device according to any one of claims 5 to 8; or
the neural network arithmetic device according to claim 9; or
the combined processing device according to claim 10; or
the neural network chip according to claim 11.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811456746.8A CN109543836B (en) | 2018-11-30 | 2018-11-30 | Operation method, device and related product |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109543836A true CN109543836A (en) | 2019-03-29 |
CN109543836B CN109543836B (en) | 2021-08-03 |
Family
ID=65852539
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811456746.8A Active CN109543836B (en) | 2018-11-30 | 2018-11-30 | Operation method, device and related product |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109543836B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105975639A (en) * | 2016-07-04 | 2016-09-28 | 北京百度网讯科技有限公司 | Search result ordering method and device |
CN106529565A (en) * | 2016-09-23 | 2017-03-22 | 北京市商汤科技开发有限公司 | Target identification model training and target identification method and device, and computing equipment |
CN106611216A (en) * | 2016-12-29 | 2017-05-03 | 北京旷视科技有限公司 | Computing method and device based on neural network |
CN107168955A (en) * | 2017-05-23 | 2017-09-15 | 南京大学 | Word insertion and the Chinese word cutting method of neutral net using word-based context |
US20180246853A1 (en) * | 2017-02-28 | 2018-08-30 | Microsoft Technology Licensing, Llc | Hardware node with matrix-vector multiply tiles for neural network processing |
CN108875482A (en) * | 2017-09-14 | 2018-11-23 | 北京旷视科技有限公司 | Object detecting method and device, neural network training method and device |
Non-Patent Citations (1)
Title |
---|
NADAV ROTEM et al.: "Glow: Graph Lowering Compiler Techniques for Neural Networks", arXiv:1805.00907v2 [cs.PL] * 
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109104876B (en) | Arithmetic device and related product | |
US20210117810A1 (en) | On-chip code breakpoint debugging method, on-chip processor, and chip breakpoint debugging system | |
CN107832845A (en) | A kind of information processing method and Related product | |
CN109219821A (en) | Arithmetic unit and method | |
CN110502330A (en) | Processor and processing method | |
CN109284815A (en) | Neural network model algorithm Compilation Method, device and Related product | |
CN109543825A (en) | Neural network model algorithm Compilation Method, device and Related product | |
CN111860807B (en) | Fractal calculation device, fractal calculation method, integrated circuit and board card | |
CN114580606A (en) | Data processing method, data processing device, computer equipment and storage medium | |
CN113469336A (en) | Compiling method and execution method for optimizing neural network model and related products | |
CN110163349A (en) | A kind of calculation method and device of network model | |
CN109583579A (en) | Computing device and Related product | |
CN109543835A (en) | Operation method, device and Related product | |
CN109558565A (en) | Operation method, device and Related product | |
CN109542837A (en) | Operation method, device and Related product | |
US11710031B2 (en) | Parallel processing circuits for neural networks | |
CN109582277A (en) | Data processing method, device and Related product | |
CN109543836A (en) | Operation method, device and Related product | |
CN109583580A (en) | Operation method, device and Related product | |
CN109543834A (en) | Operation method, device and Related product | |
CN109543833A (en) | Operation method, device and Related product | |
CN109558564A (en) | Operation method, device and Related product | |
CN109558943A (en) | Operation method, device and Related product | |
CN110472734A (en) | A kind of computing device and Related product | |
CN116185377A (en) | Optimization method and device for calculation graph and related product |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||