CN109543836A - Operation method, device and related product - Google Patents
- Publication number: CN109543836A (application number CN201811456746.8A)
- Authority: CN (China)
- Prior art keywords: data, segment, matrix, expression, algorithm
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored program computers
- G06F15/78—Architectures of general purpose stored program computers comprising a single central processing unit
- G06F15/7807—System on chip, i.e. computer system on a single chip; System in package, i.e. computer system on one or more chips in a single package
- G06F15/781—On-chip cache; Off-chip memory
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
- G06F9/30036—Instructions to perform operations on packed data, e.g. vector, tile or matrix operations
Abstract
This disclosure relates to an operation method, a device, and related products. The method comprises: segmenting the data of a matrix operation algorithm according to segment information to obtain segment data of the data, the data including the input data and output data of the matrix operation algorithm; determining a first intermediate representation of the data according to the segment data of the data; determining the storage space in the on-chip cache corresponding to the first intermediate representation of the data; generating a second intermediate representation of the data according to the storage space; and generating a second intermediate representation of the matrix operation algorithm according to the second intermediate representation of the data and the second intermediate representation of the operator of the matrix operation algorithm. By reaching the algorithm's intermediate representation through the intermediate representations of its data and of its operator, the embodiments of the disclosure allow different algorithms to be implemented on different systems on chip without dedicated interfaces, improving compatibility between algorithms and systems on chip and reducing the difficulty of algorithm development.
Description
Technical field
This disclosure relates to the technical field of information processing, and in particular to an operation method, a device, and related products.
Background
Different systems on chip can be configured with different language expressions; that is, different code can be used when a system on chip is set up for execution. Moreover, neural network algorithms are themselves highly complex: a single algorithm may need to be split before it can execute on a system on chip, and the programming languages used to implement neural network algorithms are likewise varied. When a neural network algorithm implemented in one programming language is executed on a different type of system on chip, a dedicated interface must be provided for that specific language and that specific system on chip, so implementing neural network algorithms on systems on chip is highly complex.
Summary of the invention
In view of this, the present disclosure proposes an operation method, a device, and related products, to reduce the development complexity of implementing neural network algorithms on chip.
According to one aspect of the disclosure, an operation method is provided, the method comprising:
segmenting the data of a matrix operation algorithm according to segment information to obtain segment data of the data, the data including the input data and output data of the matrix operation algorithm;
determining a first intermediate representation of the data according to the segment data of the data;
determining the storage space in the on-chip cache corresponding to the first intermediate representation of the data;
generating a second intermediate representation of the data according to the storage space;
generating a second intermediate representation of the matrix operation algorithm according to the second intermediate representation of the data and the second intermediate representation of the operator of the matrix operation algorithm.
In one possible implementation, the method further comprises:
generating a first executable instruction of the matrix operation algorithm according to a first intermediate representation of the matrix operation algorithm; or
generating a second executable instruction of the matrix operation algorithm according to the second intermediate representation of the matrix operation algorithm.
In one possible implementation, the segment information is determined according to the size of the on-chip cache.
In one possible implementation, the input data of the matrix operation algorithm includes third matrix data of N rows and C columns and fourth matrix data of N rows and C columns, and segmenting the data of the matrix operation algorithm according to segment information to obtain the segment data of the data comprises:
dividing the third matrix data and the fourth matrix data into N segments each according to the segment information, obtaining segment data of the third matrix data and segment data of the fourth matrix data, wherein each segment of the third matrix data has length C and each segment of the fourth matrix data has length C.
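As a minimal sketch of the row-wise segmentation just described (the function name and the list-of-rows data layout are illustrative assumptions, not from the patent), each N x C matrix is cut into N segments of length C, one per row:

```python
def segment_matrix(matrix):
    """Split an N x C matrix (a list of N rows, each of length C)
    into N segments of length C, one segment per row."""
    return [list(row) for row in matrix]

# Third and fourth matrix data, here with N = 2 rows and C = 3 columns.
third_matrix = [[1, 2, 3], [4, 5, 6]]
fourth_matrix = [[7, 8, 9], [10, 11, 12]]

third_segments = segment_matrix(third_matrix)
fourth_segments = segment_matrix(fourth_matrix)
```

Both matrices then yield N segments whose length equals the column count C, matching the segment lengths stated in the claim.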
According to one aspect of the disclosure, an operation device is provided, the device comprising:
a segment data obtaining module, configured to segment the data of a matrix operation algorithm according to segment information to obtain segment data of the data, the data including the input data and output data of the matrix operation algorithm;
a data intermediate representation determining module, configured to determine a first intermediate representation of the data according to the segment data of the data;
a storage space determining module, configured to determine the storage space in the on-chip cache corresponding to the first intermediate representation of the data;
a second intermediate representation determining module, configured to generate a second intermediate representation of the data according to the storage space;
an algorithm intermediate representation determining module, configured to generate a second intermediate representation of the matrix operation algorithm according to the second intermediate representation of the data and the second intermediate representation of the operator of the matrix operation algorithm.
In one possible implementation, the device further comprises:
a first executable instruction generating module, configured to generate a first executable instruction of the matrix operation algorithm according to a first intermediate representation of the matrix operation algorithm; or
a second executable instruction generating module, configured to generate a second executable instruction of the matrix operation algorithm according to the second intermediate representation of the matrix operation algorithm.
In one possible implementation, the segment information is determined according to the size of the on-chip cache.
In one possible implementation, the input data of the matrix operation algorithm includes third matrix data of N rows and C columns and fourth matrix data of N rows and C columns, and segmenting the data of the matrix operation algorithm according to segment information to obtain the segment data of the data comprises:
dividing the third matrix data and the fourth matrix data into N segments each according to the segment information, obtaining segment data of the third matrix data and segment data of the fourth matrix data, wherein each segment of the third matrix data has length C and each segment of the fourth matrix data has length C.
According to one aspect of the disclosure, a neural network computing device is provided. The neural network computing device includes one or more of the operation devices described above and is used to complete a specified neural network operation.
According to one aspect of the disclosure, a combined processing device is provided. The combined processing device includes one or more of the above neural network computing devices, a universal interconnection interface, and other processing devices; the neural network computing device interacts with the other processing devices to jointly complete the computing operation specified by the user.
According to one aspect of the disclosure, a neural network chip is provided, the neural network chip including:
the above operation device; or
the above neural network computing device; or
the above combined processing device.
According to one aspect of the disclosure, an electronic device is provided, the electronic device including:
any of the operation devices described above; or
the above neural network computing device; or
the above combined processing device; or
the above neural network chip.
In the embodiments of the disclosure, the data of an algorithm are segmented according to segment information to obtain segment data; a first intermediate representation of the data can be determined according to the segment data, and a first intermediate representation of the algorithm can be generated according to the first intermediate representation of the data and the first intermediate representation of the algorithm's operator. Because the first intermediate representation of the algorithm is obtained from the first intermediate representations of its data and its operator, different algorithms need no dedicated interfaces when implemented on different systems on chip, which improves compatibility between algorithms and systems on chip and reduces the difficulty of algorithm development.
In some embodiments, the electronic device includes a data processing device, robot, computer, printer, scanner, tablet computer, intelligent terminal, mobile phone, driving recorder, navigator, sensor, camera, server, cloud server, webcam, video camera, projector, watch, earphone, mobile storage, wearable device, vehicle, household appliance, and/or medical device.
In some embodiments, the vehicle includes an aircraft, a ship, and/or a car; the household appliance includes a television, an air conditioner, a microwave oven, a refrigerator, a rice cooker, a humidifier, a washing machine, an electric lamp, a gas stove, and a range hood; the medical device includes a nuclear magnetic resonance instrument, a B-mode ultrasound scanner, and/or an electrocardiograph.
Other features and aspects of the disclosure will become clear from the following detailed description of exemplary embodiments with reference to the accompanying drawings.
Brief description of the drawings
The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate exemplary embodiments, features, and aspects of the disclosure together with the specification, and serve to explain the principles of the disclosure.
Fig. 1 shows a flowchart of an operation method according to an embodiment of the disclosure;
Fig. 2 shows a flowchart of an operation method according to an embodiment of the disclosure;
Fig. 3 shows a flowchart of an operation method according to an embodiment of the disclosure;
Fig. 4 shows a block diagram of an operation device according to an embodiment of the disclosure;
Fig. 5 shows a block diagram of a combined processing device according to an embodiment of the disclosure.
Detailed description
Various exemplary embodiments, features, and aspects of the disclosure are described in detail below with reference to the drawings. Identical reference numerals in the drawings denote elements with identical or similar functions. Although various aspects of the embodiments are shown in the drawings, the drawings need not be drawn to scale unless specifically noted.
The word "exemplary" here means "serving as an example, embodiment, or illustration". Any embodiment described here as "exemplary" is not necessarily to be construed as preferred over or advantageous to other embodiments.
In addition, numerous specific details are given in the following detailed description to better illustrate the disclosure. Those skilled in the art will appreciate that the disclosure can likewise be practiced without certain of these details. In some instances, methods, means, elements, and circuits well known to those skilled in the art are not described in detail, in order to highlight the gist of the disclosure.
Fig. 1 shows a flowchart of an operation method according to an embodiment of the disclosure. As shown in Fig. 1, the operation method includes:
Step S10: segment the data of an algorithm according to segment information to obtain segment data of the data, the data including the input data and output data of the algorithm.
In one possible implementation, a neural network algorithm may include multiple operation granularities, such as fine-grained operations (operation instructions with limited dimensions), coarse-grained operations (for example, a convolution operation), and network-level operations. The algorithms corresponding to different operation granularities involve data of different sizes. Since the cache resources of the system on chip that executes the neural network algorithm are limited, the input data and output data of the algorithm at each operation granularity can be stored in the on-chip cache in segments.
In one possible implementation, for each input datum and output datum, segment information corresponding to that datum can be preset according to the length of the datum itself and the size of the on-chip cache. Each datum may correspond to the same segment information or to different segment information. After the input data and output data are segmented according to the segment information, the segment data of the input data and the segment data of the output data are obtained.
In one possible implementation, the segment information is determined according to the size of the on-chip cache. The segment information includes segment length and/or segment count. For example, according to the size of the on-chip cache, the segment length may be determined to be B bytes. The input data of a convolution algorithm includes input neurons; if the length of the input neurons is A bytes (A > B), the input neurons can be divided into two segments, one of B bytes and one of (A - B) bytes. The segment data corresponding to the input neurons then consists of a first input neuron segment of B bytes and a second input neuron segment of (A - B) bytes.
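The splitting rule above can be sketched as a greedy cut into cache-sized pieces. This is only an illustration of the stated example (the function name and the greedy strategy are assumptions; the patent only fixes the two-segment B / (A - B) outcome):

```python
def segment_lengths(data_len, seg_len):
    """Split a datum of data_len bytes into segments no longer than
    seg_len bytes (the segment length derived from the cache size)."""
    segments = []
    remaining = data_len
    while remaining > 0:
        take = min(remaining, seg_len)
        segments.append(take)
        remaining -= take
    return segments

# A = 10-byte input neurons with a B = 8-byte segment length give one
# segment of B bytes and one of (A - B) bytes, as in the example above.
print(segment_lengths(10, 8))  # [8, 2]
```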
Step S20: generate a first intermediate representation of the data according to the segment data of the data.
In one possible implementation, the intermediate representation and a default intermediate code can be preset as needed; for example, a C-language intermediate representation can be set. The language of the intermediate representation may differ from the languages of the algorithm and the system on chip, or may be the same as either of them; the disclosure does not limit this.
In one possible implementation, a first intermediate representation can be written for each input datum and each output datum according to its segment data. When the algorithm is executed, each datum can be processed segment by segment: the segment data of each datum are fetched and executed in turn, and all segment data of one datum can share the same first intermediate representation. For example, from the first and second segment data of the input neurons, the first intermediate representation of the neuron data can be determined to be Neuron input; the first and second segments of the input neurons then share the first intermediate representation Neuron input.
Step S30: generate a first intermediate representation of the algorithm according to the first intermediate representation of the data and the first intermediate representation of the operator of the algorithm.
In one possible implementation, the first intermediate representation of each algorithm's operator can be preset. For example, the first intermediate representation of the operator of the convolution algorithm may be ConvForward, and the first intermediate representation of the pooling algorithm may be MaxPoolForward.
In one possible implementation, the first intermediate representations of the algorithm's input data, of its output data, and of its operator can be combined in a set manner to obtain the first intermediate representation of the algorithm. For example, the data of the convolution algorithm may include input neurons (first intermediate representation Neuron input), weights (first intermediate representation weight), and a bias (first intermediate representation bias), while the output data may include convolution output neurons (first intermediate representation Neuron output); the operator of the convolution algorithm is the convolution operator (first intermediate representation ConvForward). The intermediate representation of the convolution algorithm can then be ConvForward(Neuron output, Neuron input, weight, bias).
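The "set manner" of combining the representations is not specified in the text; as one illustrative sketch (the call-style format and the output-first argument order are assumptions), the algorithm's first intermediate representation could be assembled like this:

```python
def algorithm_first_ir(op_ir, output_irs, input_irs):
    """Combine the operator's first intermediate representation with the
    first intermediate representations of the output and input data."""
    args = ", ".join(output_irs + input_irs)
    return f"{op_ir}({args})"

# Convolution example from the text: output neurons, then input
# neurons, weights, and bias, under the ConvForward operator IR.
ir = algorithm_first_ir(
    "ConvForward",
    ["Neuron output"],
    ["Neuron input", "weight", "bias"],
)
print(ir)  # ConvForward(Neuron output, Neuron input, weight, bias)
```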
In this embodiment, the data of the algorithm are segmented according to segment information to obtain segment data; the first intermediate representation of the data can be determined according to the segment data, and the first intermediate representation of the algorithm can be generated according to the first intermediate representations of the data and of the algorithm's operator. Because the first intermediate representation of the algorithm is obtained from the first intermediate representations of its data and its operator, different algorithms need no dedicated interfaces when implemented on different systems on chip, which improves compatibility between algorithms and systems on chip and reduces the difficulty of algorithm development.
Fig. 2 shows a flowchart of an operation method according to an embodiment of the disclosure. As shown in Fig. 2, the data are multidimensional, and step S10 of the operation method includes:
Step S11: segment each dimension of the data separately according to the segment information, obtaining dimension segment data for each dimension.
Step S12: obtain the segment data of the data according to the dimension segment data of each dimension of the data.
In one possible implementation, the input data and output data of the algorithm can be multidimensional, and each dimension of the input data and each dimension of the output data can be segmented separately. For example, the input data of the convolution algorithm includes input neurons Neuron input, whose dimensions include the feature input channel input_channal, the input feature map height input_spatial_H, and the input feature map width input_spatial_W. The data of the feature input channel, the input feature map height, and the input feature map width can each be segmented separately.
In one possible implementation, each dimension can correspond to different segment information. The dimension segment data of each dimension can be obtained according to that dimension's segment information, and the segment lengths of different dimensions need not be equal. For example, if the length of the feature input channel input_channal is C and the corresponding segment length is A1, the dimension segment data of the feature input channel can be input_channal(C, A1).
In one possible implementation, the dimension segment data of all dimensions can be combined in a set manner to obtain the segment data of the data. For example, the segment data of the input neurons may be Neuron input(input_channal, input_spatial_H, input_spatial_W).
In this embodiment, each dimension of the data can be segmented separately according to segment information to obtain the dimension segment data of each dimension, and the segment data of the data are obtained according to the dimension segment data of each dimension. Segmenting the data along different dimensions gives the intermediate representation of the data a wider range of use and greater robustness.
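A small sketch of the per-dimension segmentation, under the assumption that each dimension's segment information is simply its own segment length; the pairing input_channal(C, A1) from the example above is modeled as a (size, segment_length) tuple (the dictionary layout and all sizes are illustrative):

```python
def dimension_segment_data(dim_sizes, seg_lengths):
    """Pair each dimension's size with its own segment length, as in
    input_channal(C, A1); dimensions may use different lengths."""
    return {name: (size, seg_lengths[name])
            for name, size in dim_sizes.items()}

# Input neuron dimensions from the text, with assumed sizes and
# per-dimension segment lengths.
neuron_input = dimension_segment_data(
    {"input_channal": 64, "input_spatial_H": 32, "input_spatial_W": 32},
    {"input_channal": 16, "input_spatial_H": 8, "input_spatial_W": 8},
)
print(neuron_input["input_channal"])  # (64, 16)
```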
Fig. 3 shows a flowchart of an operation method according to an embodiment of the disclosure. As shown in Fig. 3, step S30 of the operation method includes:
Step S31: determine the storage space in the on-chip cache corresponding to the first intermediate representation of the data.
Step S32: generate a second intermediate representation of the data according to the storage space.
Step S33: generate a second intermediate representation of the algorithm according to the second intermediate representation of the data and the second intermediate representation of the operator of the algorithm.
In one possible implementation, in algorithms for different operations, different data can be stored in different memories. For example, in a neural network algorithm, the input neuron data, the output neuron data, and the weight data can be stored in separate blocks of on-chip memory.
In one possible implementation, the first intermediate representations of different data can correspond to different storage spaces in the on-chip cache. For example, the first intermediate representation of the input neurons can correspond to address 1 of the on-chip cache, the first intermediate representation of the weights to address 2, the first intermediate representation of the bias to address 3, and the first intermediate representation of the output neurons to address 4. Storage spaces of different sizes in the on-chip cache can be determined for the first intermediate representations of different data.
In one possible implementation, the second intermediate representation of a datum can be generated according to the address of the on-chip cache storage space corresponding to its first intermediate representation. For example, the second intermediate representation of the input neurons can be address 1, that of the weights address 2, that of the bias address 3, and that of the output neurons address 4.
In one possible implementation, the second intermediate representation of each operator can be preset. For example, the second intermediate representation of the convolution operator may be LConvForwar, and the second intermediate representation of the pooling operator may be LMaxPoolForward. The second intermediate representation of an operator can directly reference the data in the on-chip cache to carry out the corresponding operation. The second intermediate representation of the algorithm can be generated according to the second intermediate representations of its data and of its operator. For example, the second intermediate representation of the convolution algorithm can be LConvForwar(address 1, address 2, address 3, address 4).
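As a toy sketch of this address step (the fixed-size allocator, the numeric addresses, and the helper names are assumptions; only the operator and data names come from the text), the second intermediate representation substitutes cache addresses for data names:

```python
def allocate_addresses(names, base=1):
    """Assign each datum a distinct numbered storage region in the
    on-chip cache, like addresses 1..4 in the example above (toy)."""
    return {name: base + i for i, name in enumerate(names)}

def algorithm_second_ir(op_ir, names, addresses):
    """Replace each datum's name with its cache address in the call form."""
    args = ", ".join(f"address {addresses[n]}" for n in names)
    return f"{op_ir}({args})"

names = ["Neuron input", "weight", "bias", "Neuron output"]
addrs = allocate_addresses(names)
print(algorithm_second_ir("LConvForwar", names, addrs))
# LConvForwar(address 1, address 2, address 3, address 4)
```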
In this embodiment, the storage space in the on-chip cache corresponding to the first intermediate representation of the data is determined, the second intermediate representation of the data is generated according to that storage space, and the second intermediate representation of the algorithm is generated according to the second intermediate representations of the data and of the algorithm's operator. Obtaining the algorithm's second intermediate representation from the storage spaces allocated for the data in the on-chip cache allows the data in the on-chip cache to be referenced directly during computation, which can improve the execution efficiency of the algorithm and the compatibility of the algorithm's intermediate representation.
In one possible implementation, the operation method further includes:
generating a first executable instruction of the algorithm according to the first intermediate representation of the algorithm; or
generating a second executable instruction of the algorithm according to the second intermediate representation of the algorithm.
In one possible implementation, a translation library between the code executed by the system on chip and the intermediate representations of the algorithm can be preset; such a translation library can, for example, be implemented in assembly language. Different translation libraries can be set for the first and second intermediate representations of the algorithm. The translation library can be used to convert the first intermediate representation of the algorithm into a first executable instruction, or to convert the second intermediate representation of the algorithm into a second executable instruction.
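The translation library itself is described only as assembly-level. As a hypothetical illustration of the lowering step (the table, the mnemonics, and the string format are all invented; a real library would target the chip's actual instruction set), converting an operator intermediate representation into an instruction could be a table lookup:

```python
# Hypothetical per-target translation table; real mnemonics would come
# from the instruction set of the system on chip.
TRANSLATION_LIBRARY = {
    "ConvForward": "CONV.FWD",
    "MaxPoolForward": "POOL.MAX",
}

def to_executable(ir):
    """Lower an algorithm IR like 'ConvForward(a, b)' into a single
    instruction string using the translation table."""
    op, _, rest = ir.partition("(")
    return TRANSLATION_LIBRARY[op] + " " + rest.rstrip(")")

print(to_executable("ConvForward(Neuron output, Neuron input, weight, bias)"))
# CONV.FWD Neuron output, Neuron input, weight, bias
```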
In this embodiment, a first executable instruction of the algorithm can be generated according to the first intermediate representation of the algorithm, or a second executable instruction of the algorithm can be generated according to the second intermediate representation of the algorithm. The executable instructions generated from the algorithm's intermediate representations can be executed on the system on chip, which widens the scope of application of the algorithm's intermediate representations.
In one possible implementation, the algorithm includes one or any combination of a convolution algorithm, a pooling algorithm, a matrix multiplication algorithm, and a matrix operation algorithm.
In one possible implementation, the algorithms used in various computing devices such as neural network models can include convolution algorithms, pooling algorithms, matrix multiplication algorithms, matrix operation algorithms, and the like. One of these algorithms, or any combination of them, can be converted into executable instructions according to the operation method in the embodiments of the disclosure and then executed by the system on chip. The disclosure places no limitation on the type or content of the algorithm.
In this embodiment, the algorithm includes one or any combination of a convolution algorithm, a pooling algorithm, a matrix multiplication algorithm, and a matrix operation algorithm. Converting the algorithm into executable instructions with the operation method in the embodiments of the disclosure and executing them on the system on chip improves compatibility between algorithms and systems on chip and reduces the difficulty of algorithm development.
Embodiment 1:
In one possible implementation, the data of a convolution algorithm are segmented according to segment information to obtain segment data of the data, the data including the input data and output data of the convolution algorithm;
a first intermediate representation of the data is determined according to the segment data of the data;
the storage space in the on-chip cache corresponding to the first intermediate representation of the data is determined;
a second intermediate representation of the data is generated according to the storage space;
a second intermediate representation of the convolution algorithm is generated according to the second intermediate representation of the data and the second intermediate representation of the operator of the convolution algorithm.
In one possible implementation, the input data of the convolution algorithm includes input neurons, weights and biases, and the output data of the convolution algorithm includes convolution output neurons.
In one possible implementation, the dimensions of the input neurons include: feature input channel, input feature map height and input feature map width;
the dimensions of the convolution output neurons include: convolution channel, convolution feature map height and convolution feature map width;
the dimensions of the weights include: feature input channel, convolution channel, convolution kernel height and convolution kernel width;
the dimension of the biases includes: convolution channel.
In one possible implementation, the input data of the convolution algorithm can be segmented according to the segment information to obtain segment data of the input data. The input neurons, the weights and the biases can each be segmented according to the segment information to obtain input neuron segment data, weight segment data and bias segment data. The output data of the convolution algorithm can be segmented according to the segment information to obtain segment data of the output data. The convolution output neurons can be segmented according to the segment information to obtain convolution output neuron segment data. In the segment information, the segment length and the number of segments may differ for different input data and output data; the present disclosure does not limit this.
In one possible implementation, the data of the convolution algorithm is multidimensional data, and segmenting the data of the convolution algorithm according to the segment information to obtain the segment data of the data includes:
segmenting each dimension of the data separately according to the segment information to obtain dimension segment data of each dimension of the data;
obtaining the segment data of the data according to the dimension segment data of each dimension of the data.
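The two steps above can be sketched in Python as follows. This is a minimal illustration; the function names and the shape of `segment_info` (a mapping from dimension index to segment length) are assumptions, not from the disclosure. Each dimension is split separately, and the Cartesian product of the per-dimension segments yields the segment data of the multidimensional data:

```python
from itertools import product

def segment_dimension(size, seg_len):
    """Split one dimension of length `size` into (start, end) segments
    of at most `seg_len` elements each."""
    return [(i, min(i + seg_len, size)) for i in range(0, size, seg_len)]

def segment_data(shape, segment_info):
    """Segment each dimension separately, then combine the per-dimension
    segments into the segment blocks of the multidimensional data."""
    per_dim = [segment_dimension(size, segment_info[d])
               for d, size in enumerate(shape)]
    return list(product(*per_dim))

# Example: input neurons with 4 feature input channels and an 8x8 feature
# map, segmented into 2 channels and 4x4 tiles per segment.
blocks = segment_data((4, 8, 8), {0: 2, 1: 4, 2: 4})
```

Each element of `blocks` describes one segment as a tuple of per-dimension (start, end) ranges, so the segments can be fetched and executed in turn.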
In one possible implementation, the input data and the output data of the convolution algorithm are multidimensional data. Each dimension of the input data and the output data of the convolution algorithm can be segmented separately according to the segment information to obtain the dimension segment data of each dimension of the input data and the output data. The dimensions of the input neurons include: feature input channel, input feature map height and input feature map width. The feature input channel data can be segmented according to the segment information to obtain the dimension segment data of the feature input channel. The input feature map height data can be segmented according to the segment information to obtain the dimension segment data of the input feature map height. The input feature map width data can be segmented according to the segment information to obtain the dimension segment data of the input feature map width. Similarly, each dimension of the convolution output neurons can be segmented according to the segment information to obtain the dimension segment data of the convolution channel, of the convolution feature map height and of the convolution feature map width. Each dimension of the weights is segmented according to the segment information to obtain the dimension segment data of the feature input channel, of the convolution channel, of the convolution kernel height and of the convolution kernel width. Each dimension of the biases is segmented according to the segment information to obtain the dimension segment data of the convolution channel.
In one possible implementation, the intermediate expression and a default intermediate code can be preset as required. For example, a C-language intermediate expression can be set. The language of the intermediate expression may differ from the languages of the algorithm and the system on chip, or may be the same as the language of the algorithm or the system on chip. The present disclosure does not limit this.
In one possible implementation, the first intermediate expression of each input data and the first intermediate expression of each output data of the convolution algorithm can be written according to the segment data of each input data and each output data. When the algorithm is executed, each data can be executed segment by segment, and the segment data of each data can be fetched and executed in turn.
In one possible implementation, the segment data of one data can share the same first intermediate expression. For example, according to the first segment data and the second segment data of the input neurons, the first intermediate expression of the input neuron data can be determined as Neuron_input. The first segment data and the second segment data of the input neurons can then share the first intermediate expression Neuron_input.
In one possible implementation, the first intermediate expression of the convolution algorithm is generated according to the first intermediate expression of the data and the first intermediate expression of the convolution algorithm operator.
In one possible implementation, the first intermediate expression of the operator of each algorithm can be preset. For example, the first intermediate expression of the operator of the convolution algorithm is ConvForward.
In one possible implementation, the first intermediate expressions of the input data of the convolution algorithm, the first intermediate expressions of the output data, and the first intermediate expression of the convolution algorithm operator can be combined in a set manner to obtain the first intermediate expression of the algorithm. For example, the input data of the convolution algorithm may include input neurons (first intermediate expression: Neuron_input), weights (first intermediate expression: weight) and biases (first intermediate expression: bias), the output data may include convolution output neurons (first intermediate expression: Neuron_output), and the operator of the convolution algorithm is the convolution operator (first intermediate expression: ConvForward). The first intermediate expression of the convolution algorithm can then be ConvForward(Neuron_output, Neuron_input, weight, bias).
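The combination just described can be sketched as a simple string build. The helper name and the exact "set manner" operator(outputs..., inputs...) are assumptions for illustration:

```python
def first_intermediate_expression(operator_expr, output_exprs, input_exprs):
    """Combine the operator's first intermediate expression with the first
    intermediate expressions of the output data and the input data, in the
    set manner operator(outputs..., inputs...)."""
    args = list(output_exprs) + list(input_exprs)
    return f"{operator_expr}({', '.join(args)})"

# Convolution: operator ConvForward, output Neuron_output,
# inputs Neuron_input, weight and bias.
expr = first_intermediate_expression(
    "ConvForward", ["Neuron_output"], ["Neuron_input", "weight", "bias"])
# expr == "ConvForward(Neuron_output, Neuron_input, weight, bias)"
```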
In one possible implementation, the first executable instruction of the convolution algorithm can be generated according to the first intermediate expression of the convolution algorithm, or the second executable instruction of the convolution algorithm can be generated according to the second intermediate expression of the convolution algorithm.
In one possible implementation, a conversion library between the execution code of the system on chip and the intermediate expression of the convolution algorithm can be preset. For example, the conversion library can be implemented in assembly language. Different conversion libraries can be provided for the first intermediate expression and the second intermediate expression of the convolution algorithm. The conversion library can be used to convert the first intermediate expression of the convolution algorithm into the first executable instruction, or to convert the second intermediate expression of the convolution algorithm into the second executable instruction.
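A conversion library of this kind can be modeled as a mapping from operator intermediate expressions to templates of device instructions. The instruction mnemonics below are invented placeholders, since the actual library is implemented in assembly for a concrete system on chip:

```python
# Hypothetical conversion library: maps the operator part of a first
# intermediate expression to a template of system-on-chip instructions.
# The mnemonics LOAD/CONV/STORE are placeholders, not real device opcodes.
CONVERSION_LIBRARY = {
    "ConvForward": ["LOAD {1}", "LOAD {2}", "LOAD {3}", "CONV", "STORE {0}"],
}

def to_executable(intermediate_expr):
    """Convert a first intermediate expression such as
    'ConvForward(Neuron_output, Neuron_input, weight, bias)' into a list
    of (placeholder) executable instructions."""
    op, _, rest = intermediate_expr.partition("(")
    args = [a.strip() for a in rest.rstrip(")").split(",")]
    return [line.format(*args) for line in CONVERSION_LIBRARY[op]]

instrs = to_executable("ConvForward(Neuron_output, Neuron_input, weight, bias)")
```

A second conversion library, keyed by the second intermediate expressions, could be structured the same way but emit instructions that address the storage spaces in the on-chip cache.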
In the present embodiment, the data of the convolution algorithm is segmented according to the segment information to obtain the segment data of the data, the first intermediate expression of the data can be determined according to the segment data of the data, and the first intermediate expression of the convolution algorithm can be generated according to the first intermediate expression of the data and the first intermediate expression of the convolution algorithm operator. Because the convolution algorithm is expressed through the first intermediate expressions of the data and the operator, no algorithm-specific interface needs to be provided when the convolution algorithm is implemented on different systems on chip, which improves the compatibility between the convolution algorithm and the system on chip and reduces the difficulty of algorithm development.
Embodiment 2:
In one possible implementation, the data of the convolution algorithm is segmented according to segment information to obtain segment data of the data, the data including input data and output data of the convolution algorithm;
a first intermediate expression of the data is determined according to the segment data of the data;
a first intermediate expression of the convolution algorithm is generated according to the first intermediate expression of the data and the first intermediate expression of the convolution algorithm operator.
In one possible implementation, the data is multidimensional data, and segmenting the data of the convolution algorithm according to the segment information to obtain the segment data of the data includes:
segmenting each dimension of the data separately according to the segment information to obtain the dimension segment data of each dimension of the data;
obtaining the segment data of the data according to the dimension segment data of each dimension of the data.
In one possible implementation, the method further includes:
generating the first executable instruction of the convolution algorithm according to the first intermediate expression of the convolution algorithm.
The present embodiment differs from Embodiment 1 in that the first intermediate expression of the convolution algorithm is generated only according to the first intermediate expression of the data of the convolution algorithm and the first intermediate expression of the convolution algorithm operator, and the first executable instruction of the convolution algorithm is generated according to the first intermediate expression of the convolution algorithm. The second intermediate expression of the convolution algorithm is not generated according to the storage space in the on-chip cache corresponding to the first intermediate expression of the convolution algorithm, nor is the second executable instruction of the convolution algorithm generated according to the second intermediate expression of the convolution algorithm.
Embodiment 3:
In one possible implementation, the data of the pooling algorithm can be segmented according to segment information to obtain segment data of the data, the data including input data and output data of the pooling algorithm;
a first intermediate expression of the data is determined according to the segment data of the data;
a storage space in the on-chip cache corresponding to the first intermediate expression of the data is determined;
a second intermediate expression of the data is generated according to the storage space;
a second intermediate expression of the pooling algorithm is generated according to the second intermediate expression of the data and a second intermediate expression of the pooling algorithm operator.
In one possible implementation, the data of the pooling algorithm is multidimensional data, and segmenting the data of the pooling algorithm according to the segment information to obtain the segment data of the data includes:
segmenting each dimension separately according to the segment information to obtain the dimension segment data of each dimension;
obtaining the segment data of the data according to the dimension segment data of each dimension of the data.
In one possible implementation, the input data of the pooling algorithm includes convolution output neurons, and the output data of the pooling algorithm includes pooling output neurons.
In one possible implementation, the input data of the pooling algorithm can be segmented according to the segment information to obtain segment data of the input data of the pooling algorithm. The convolution output neurons can be segmented according to the segment information to obtain convolution output neuron segment data. The output data of the pooling algorithm can be segmented according to the segment information to obtain segment data of the output data of the pooling algorithm. The pooling output neurons can be segmented according to the segment information to obtain pooling output neuron segment data. In the segment information, the segment length and the number of segments may differ for the input data and the output data of the pooling algorithm; the present disclosure does not limit this.
In one possible implementation, the dimensions of the convolution output neurons include: convolution channel, convolution feature map height and convolution feature map width; the dimensions of the pooling output neurons include: convolution channel, pooling feature map height and pooling feature map width.
In one possible implementation, each dimension of the convolution output neurons in the pooling algorithm can be segmented according to the segment information to obtain the dimension segment data of each dimension. The convolution channel, the convolution feature map height and the convolution feature map width can each be segmented according to the segment information to obtain the dimension segment data of the convolution channel, of the convolution feature map height and of the convolution feature map width. Each dimension of the pooling output neurons in the pooling algorithm can be segmented according to the segment information to obtain the dimension segment data of each dimension. The convolution channel, the pooling feature map height and the pooling feature map width can be segmented according to the segment information to obtain the dimension segment data of the convolution channel, of the pooling feature map height and of the pooling feature map width.
In one possible implementation, the intermediate expression and a default intermediate code can be preset as required. For example, a C-language intermediate expression can be set. The language of the intermediate expression may differ from the languages of the algorithm and the system on chip, or may be the same as the language of the algorithm or the system on chip. The present disclosure does not limit this.
In one possible implementation, the first intermediate expression of each input data and of each output data of the pooling algorithm can be written according to the segment data of each input data and each output data. When the pooling algorithm is executed, each data can be executed segment by segment, the segment data of each data can be fetched and executed in turn, and the segment data of one data can share the same first intermediate expression.
In one possible implementation, the method further includes:
generating the first executable instruction of the pooling algorithm according to the first intermediate expression of the pooling algorithm, or
generating the second executable instruction of the pooling algorithm according to the second intermediate expression of the pooling algorithm.
In one possible implementation, a conversion library between the execution code of the system on chip and the intermediate expression of the pooling algorithm can be preset. For example, the conversion library can be implemented in assembly language. Different conversion libraries can be provided for the first intermediate expression and the second intermediate expression of the pooling algorithm. The conversion library can be used to convert the first intermediate expression of the pooling algorithm into the first executable instruction, or to convert the second intermediate expression of the pooling algorithm into the second executable instruction.
In the present embodiment, the data of the pooling algorithm is segmented according to the segment information to obtain the segment data of the data, the first intermediate expression of the data can be determined according to the segment data of the data, and the first intermediate expression of the pooling algorithm can be generated according to the first intermediate expression of the data and the first intermediate expression of the pooling algorithm operator. Because the pooling algorithm is expressed through the first intermediate expressions of the data and the operator, no algorithm-specific interface needs to be provided when the pooling algorithm is implemented on different systems on chip, which improves the compatibility between the pooling algorithm and the system on chip and reduces the difficulty of algorithm development.
Embodiment 4:
In one possible implementation, the data of the pooling algorithm is segmented according to segment information to obtain segment data of the data, the data including input data and output data of the pooling algorithm;
a first intermediate expression of the data is determined according to the segment data of the data;
a first intermediate expression of the pooling algorithm is generated according to the first intermediate expression of the data and the first intermediate expression of the pooling algorithm operator.
In one possible implementation, the data is multidimensional data, and segmenting the data of the pooling algorithm according to the segment information to obtain the segment data of the data includes:
segmenting each dimension separately according to the segment information to obtain the dimension segment data of each dimension;
obtaining the segment data of the data according to the dimension segment data of each dimension of the data.
In one possible implementation, the method further includes:
generating the first executable instruction of the pooling algorithm according to the first intermediate expression of the pooling algorithm.
The present embodiment differs from Embodiment 3 in that the first intermediate expression of the pooling algorithm is generated only according to the first intermediate expression of the data of the pooling algorithm and the first intermediate expression of the pooling algorithm operator, and the first executable instruction of the pooling algorithm is generated according to the first intermediate expression of the pooling algorithm. The second intermediate expression of the pooling algorithm is not generated according to the storage space in the on-chip cache corresponding to the first intermediate expression of the pooling algorithm, nor is the second executable instruction of the pooling algorithm generated according to the second intermediate expression of the pooling algorithm.
Embodiment 5:
In one possible implementation, the data of the matrix multiplication algorithm is segmented according to segment information to obtain segment data of the data, the data including input data and output data of the matrix multiplication algorithm;
a first intermediate expression of the data is determined according to the segment data of the data;
a storage space in the on-chip cache corresponding to the first intermediate expression of the data is determined;
a second intermediate expression of the data is generated according to the storage space;
a second intermediate expression of the matrix multiplication algorithm is generated according to the second intermediate expression of the data and a second intermediate expression of the matrix multiplication algorithm operator.
In one possible implementation, the input data of the matrix multiplication algorithm includes first matrix data of N rows and C columns, the output data of the matrix multiplication algorithm includes second matrix data of N rows and M columns, and segmenting the data of the algorithm according to the segment information to obtain the segment data of the data includes:
dividing the first matrix data and the second matrix data into N segments each according to the segment information to obtain first matrix segment data and second matrix segment data, wherein each segment of the first matrix segment data has length C, and each segment of the second matrix segment data has length M.
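For matrix data, the segmentation just described reduces to a row split: N rows yield N segments whose length equals the column count. A minimal Python sketch (the function name and plain nested lists are illustrative assumptions):

```python
def segment_matrix_rows(matrix, n_segments):
    """Divide a matrix (list of rows) into n_segments row segments; with
    one row per segment, each segment's length equals the column count."""
    rows_per_seg = len(matrix) // n_segments
    return [matrix[i * rows_per_seg:(i + 1) * rows_per_seg]
            for i in range(n_segments)]

# First matrix data: N=3 rows, C=2 columns -> 3 segments of length 2 each.
first_matrix = [[1, 2], [3, 4], [5, 6]]
segments = segment_matrix_rows(first_matrix, 3)
```

The second matrix data of N rows and M columns would be split the same way, giving N segments of length M.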
In one possible implementation, the input data of the matrix multiplication algorithm can be segmented according to the segment information to obtain segment data of the input data of the matrix multiplication algorithm. The first matrix data can be segmented according to the segment information to obtain first matrix segment data. The output data of the matrix multiplication algorithm can be segmented according to the segment information to obtain segment data of the output data of the matrix multiplication algorithm. The second matrix data can be segmented according to the segment information to obtain second matrix segment data. In the segment information, the segment length and the number of segments may differ for the input data and the output data of the matrix multiplication algorithm; the present disclosure does not limit this.
In one possible implementation, the intermediate expression and a default intermediate code can be preset as required. For example, a C-language intermediate expression can be set. The language of the intermediate expression may differ from the languages of the algorithm and the system on chip, or may be the same as the language of the algorithm or the system on chip. The present disclosure does not limit this.
In one possible implementation, the first intermediate expression of each input data and of each output data of the matrix multiplication algorithm can be written according to the segment data of each input data and each output data. When the matrix multiplication algorithm is executed, each data can be executed segment by segment, the segment data of each data can be fetched and executed in turn, and the segment data of one data can share the same first intermediate expression.
In one possible implementation, the method further includes:
generating the first executable instruction of the matrix multiplication algorithm according to the first intermediate expression of the matrix multiplication algorithm, or
generating the second executable instruction of the matrix multiplication algorithm according to the second intermediate expression of the matrix multiplication algorithm.
In one possible implementation, a conversion library between the execution code of the system on chip and the intermediate expression of the matrix multiplication algorithm can be preset. For example, the conversion library can be implemented in assembly language. Different conversion libraries can be provided for the first intermediate expression and the second intermediate expression of the matrix multiplication algorithm. The conversion library can be used to convert the first intermediate expression of the matrix multiplication algorithm into the first executable instruction, or to convert the second intermediate expression of the matrix multiplication algorithm into the second executable instruction.
In the present embodiment, the data of the matrix multiplication algorithm is segmented according to the segment information to obtain the segment data of the data, the first intermediate expression of the data can be determined according to the segment data of the data, and the first intermediate expression of the matrix multiplication algorithm can be generated according to the first intermediate expression of the data and the first intermediate expression of the matrix multiplication algorithm operator. Because the matrix multiplication algorithm is expressed through the first intermediate expressions of the data and the operator, no algorithm-specific interface needs to be provided when the matrix multiplication algorithm is implemented on different systems on chip, which improves the compatibility between the matrix multiplication algorithm and the system on chip and reduces the difficulty of algorithm development.
Embodiment 6:
In one possible implementation, the data of the matrix multiplication algorithm is segmented according to segment information to obtain segment data of the data, the data including input data and output data of the matrix multiplication algorithm;
a first intermediate expression of the data is determined according to the segment data of the data;
a first intermediate expression of the matrix multiplication algorithm is generated according to the first intermediate expression of the data and the first intermediate expression of the matrix multiplication algorithm operator.
In one possible implementation, the method further includes:
generating the first executable instruction of the matrix multiplication algorithm according to the first intermediate expression of the matrix multiplication algorithm.
The present embodiment differs from Embodiment 5 in that the first intermediate expression of the matrix multiplication algorithm is generated only according to the first intermediate expression of the data of the matrix multiplication algorithm and the first intermediate expression of the matrix multiplication algorithm operator, and the first executable instruction of the matrix multiplication algorithm is generated according to the first intermediate expression of the matrix multiplication algorithm. The second intermediate expression of the matrix multiplication algorithm is not generated according to the storage space in the on-chip cache corresponding to the first intermediate expression of the matrix multiplication algorithm, nor is the second executable instruction of the matrix multiplication algorithm generated according to the second intermediate expression of the matrix multiplication algorithm.
Embodiment 7:
In one possible implementation, the data of the matrix addition algorithm is segmented according to segment information to obtain segment data of the data, the data including input data and output data of the matrix addition algorithm;
a first intermediate expression of the data is determined according to the segment data of the data;
a storage space in the on-chip cache corresponding to the first intermediate expression of the data is determined;
a second intermediate expression of the data is generated according to the storage space;
a second intermediate expression of the matrix addition algorithm is generated according to the second intermediate expression of the data and a second intermediate expression of the matrix addition algorithm operator.
In one possible implementation, the input data of the matrix addition algorithm includes third matrix data of N rows and C columns and fourth matrix data of N rows and C columns, and segmenting the data of the algorithm according to the segment information to obtain the segment data of the data includes:
dividing the third matrix data and the fourth matrix data into N segments each according to the segment information to obtain third matrix segment data and fourth matrix segment data, wherein each segment of the third matrix segment data has length C, and each segment of the fourth matrix segment data has length C.
In one possible implementation, the input data of the matrix addition algorithm can be segmented according to the segment information to obtain segment data of the input data of the matrix addition algorithm. The third matrix data can be segmented according to the segment information to obtain third matrix segment data. The output data of the matrix addition algorithm can be segmented according to the segment information to obtain segment data of the output data of the matrix addition algorithm. The fourth matrix data can be segmented according to the segment information to obtain fourth matrix segment data. In the segment information, the segment length and the number of segments may differ for the input data and the output data of the matrix addition algorithm; the present disclosure does not limit this.
In one possible implementation, the method further includes:
generating the first executable instruction of the matrix addition algorithm according to the first intermediate expression of the matrix addition algorithm, or
generating the second executable instruction of the matrix addition algorithm according to the second intermediate expression of the matrix addition algorithm.
In one possible implementation, a conversion library between the execution code of the system on chip and the intermediate expression of the matrix addition algorithm can be preset. For example, the conversion library can be implemented in assembly language. Different conversion libraries can be provided for the first intermediate expression and the second intermediate expression of the matrix addition algorithm. The conversion library can be used to convert the first intermediate expression of the matrix addition algorithm into the first executable instruction, or to convert the second intermediate expression of the matrix addition algorithm into the second executable instruction.
In the present embodiment, the data of the matrix addition algorithm is segmented according to the segment information to obtain the segment data of the data, the first intermediate expression of the data can be determined according to the segment data of the data, and the first intermediate expression of the matrix addition algorithm can be generated according to the first intermediate expression of the data and the first intermediate expression of the matrix addition algorithm operator. Because the matrix addition algorithm is expressed through the first intermediate expressions of the data and the operator, no algorithm-specific interface needs to be provided when the matrix addition algorithm is implemented on different systems on chip, which improves the compatibility between the matrix addition algorithm and the system on chip and reduces the difficulty of algorithm development.
Embodiment 8:
In one possible implementation, the data of the matrix addition algorithm is segmented according to segment information to obtain segment data of the data, the data including input data and output data of the matrix addition algorithm;
a first intermediate expression of the data is determined according to the segment data of the data;
a first intermediate expression of the matrix addition algorithm is generated according to the first intermediate expression of the data and the first intermediate expression of the matrix addition algorithm operator.
In one possible implementation, the method further includes:
generating the first executable instruction of the matrix addition algorithm according to the first intermediate expression of the matrix addition algorithm.
The present embodiment differs from Embodiment 7 in that the first intermediate expression of the matrix addition algorithm is generated only according to the first intermediate expression of the data of the matrix addition algorithm and the first intermediate expression of the matrix addition algorithm operator, and the first executable instruction of the matrix addition algorithm is generated according to the first intermediate expression of the matrix addition algorithm. The second intermediate expression of the matrix addition algorithm is not generated according to the storage space in the on-chip cache corresponding to the first intermediate expression of the matrix addition algorithm, nor is the second executable instruction of the matrix addition algorithm generated according to the second intermediate expression of the matrix addition algorithm.
Fig. 4 shows a block diagram of an arithmetic device according to an embodiment of the present disclosure. As shown in Fig. 4, the arithmetic device includes:
a segment data obtaining module 10, configured to segment the data of an algorithm according to segment information to obtain segment data of the data, the data including input data and output data of the algorithm;
a data intermediate expression determining module 20, configured to generate a first intermediate expression of the data according to the segment data of the data;
an algorithm intermediate expression determining module 30, configured to generate a first intermediate expression of the algorithm according to the first intermediate expression of the data and a first intermediate expression of the algorithm operator.
In one possible implementation, the data is multidimensional data, and the segment data obtaining module includes:
a dimension segment data obtaining submodule, configured to segment each dimension separately according to the segment information to obtain the dimension segment data of each dimension;
a segment data obtaining submodule, configured to obtain the segment data of the data according to the dimension segment data of each dimension of the data.
In one possible implementation, the algorithm intermediate expression determining module includes:
a storage space determining submodule, configured to determine a storage space in the on-chip cache corresponding to the first intermediate expression of the data;
a second intermediate expression determining submodule, configured to generate a second intermediate expression of the data according to the storage space;
an algorithm intermediate expression determining submodule, configured to generate a second intermediate expression of the algorithm according to the second intermediate expression of the data and a second intermediate expression of the algorithm operator.
In one possible implementation, the device further includes:
a first executable instruction generation module, configured to generate a first executable instruction of the algorithm according to the first intermediate representation of the algorithm; or
a second executable instruction generation module, configured to generate a second executable instruction of the algorithm according to the second intermediate representation of the algorithm.
In one possible implementation, the segment information is determined according to the size of the on-chip cache.
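One plausible way to derive segment information from the on-chip cache size is to pick the largest row segment such that all operands of one segment fit on chip at once. The three-buffer assumption (two inputs plus one output) and every name below are illustrative, not taken from the disclosure:

```python
def segment_length(cache_bytes, row_bytes, buffers_needed=3):
    """Largest number of rows per segment such that `buffers_needed`
    row segments (e.g. two inputs and one output) fit in the cache."""
    rows = cache_bytes // (row_bytes * buffers_needed)
    if rows == 0:
        raise ValueError("one row does not fit in the on-chip cache")
    return rows

# e.g. a 4 KB cache, rows of 64 floats (256 bytes), two inputs + one output:
seg = segment_length(cache_bytes=4096, row_bytes=256, buffers_needed=3)
```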
In one possible implementation, the algorithm includes one or any combination of a convolution algorithm, a pooling algorithm, a matrix multiplication algorithm, and a matrix addition algorithm.
In one possible implementation, the input data of the convolution algorithm includes input neurons, weights, and biases, and the output data of the convolution algorithm includes convolution output neurons.
In one possible implementation, the dimensions of the input neurons include: feature input channel, input feature map height, and input feature map width;
the dimensions of the convolution output neurons include: convolution channel, convolution feature map height, and convolution feature map width;
the dimensions of the weights include: feature input channel, convolution channel, feature map convolution kernel height, and feature map convolution kernel width; and
the dimension of the biases includes: convolution channel.
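The operand dimensions listed above can be collected into a small shape table. The output-size formula below is the standard convolution arithmetic (stride and padding are not specified by the disclosure), and all names are illustrative:

```python
def conv_shapes(cin, h, w, cout, kh, kw, stride=1, pad=0):
    """Shapes of the convolution operands named in the text, using the
    standard output-size formula (an assumption, not from the disclosure)."""
    out_h = (h + 2 * pad - kh) // stride + 1
    out_w = (w + 2 * pad - kw) // stride + 1
    return {
        "input":  (cin, h, w),          # feature input channel, height, width
        "weight": (cin, cout, kh, kw),  # input channel, conv channel, kernel h, kernel w
        "bias":   (cout,),              # one bias per convolution channel
        "output": (cout, out_h, out_w), # conv channel, conv feature map h, w
    }

shapes = conv_shapes(cin=3, h=32, w=32, cout=16, kh=3, kw=3, stride=1, pad=1)
```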
In one possible implementation, the input data of the pooling algorithm includes convolution output neurons, and the output data of the pooling algorithm includes pooling output neurons.
In one possible implementation, the dimensions of the convolution output neurons include: convolution channel, convolution feature map height, and convolution feature map width; and
the dimensions of the pooling output neurons include: convolution channel, pooling feature map height, and pooling feature map width.
In one possible implementation, the input data of the matrix multiplication algorithm includes first matrix data of N rows and C columns, the output data of the matrix multiplication algorithm includes second matrix data of N rows and M columns, and the segment data obtaining module includes:
a matrix multiplication segment data obtaining submodule, configured to divide the first matrix data and the second matrix data into N segments each according to the segment information, obtaining the segment data of the first matrix data and the segment data of the second matrix data, where each segment of the first matrix data has length C and each segment of the second matrix data has length M.
In one possible implementation, the input data of the matrix addition algorithm includes third matrix data of N rows and C columns and fourth matrix data of N rows and C columns, and the segment data obtaining module includes:
a matrix addition segment data obtaining submodule, configured to divide the third matrix data and the fourth matrix data into N segments each according to the segment information, obtaining the segment data of the third matrix data and the segment data of the fourth matrix data, where each segment of the third matrix data has length C and each segment of the fourth matrix data has length C.
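The matrix addition segmentation pairs the i-th length-C segment of the third matrix with the i-th length-C segment of the fourth matrix. A minimal sketch with plain Python lists (names illustrative):

```python
def add_by_segments(a, b):
    """Add two N x C matrices one length-C row segment at a time,
    mirroring the segmentation in the text: each matrix is split into
    N segments of length C, and matching segments are added."""
    assert len(a) == len(b) and all(len(x) == len(y) for x, y in zip(a, b))
    return [[x + y for x, y in zip(seg_a, seg_b)] for seg_a, seg_b in zip(a, b)]

a = [[1, 2, 3], [4, 5, 6]]        # N=2 segments of length C=3
b = [[10, 20, 30], [40, 50, 60]]
s = add_by_segments(a, b)
```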
Fig. 5 shows a block diagram of a combined processing device according to an embodiment of the disclosure. As shown in Fig. 5, the combined processing device includes the above neural network arithmetic device, a general interconnection interface, and other processing devices.
The neural network arithmetic device interacts with the other processing devices to jointly complete the operation specified by the user. The other processing devices include one or more types of general-purpose or special-purpose processors, such as a central processing unit (CPU), a graphics processing unit (GPU), or a neural network processor. The number of processors included in the other processing devices is not limited. The other processing devices serve as the interface between the neural network arithmetic device and external data and control, performing data transfers and basic control of the neural network arithmetic device such as starting and stopping; the other processing devices may also cooperate with the neural network arithmetic device to jointly complete operation tasks. The general interconnection interface is configured to transmit data and control instructions between the neural network arithmetic device and the other processing devices. The neural network arithmetic device obtains the required input data from the other processing devices and writes it to an on-chip storage device of the neural network arithmetic device; it may obtain control instructions from the other processing devices and write them to an on-chip control cache of the neural network arithmetic device; and it may also read data from a storage module of the neural network arithmetic device and transmit that data to the other processing devices.
The combined processing device may further include a storage device, connected to the neural network arithmetic device and the other processing devices, respectively. The storage device is configured to store data of the neural network arithmetic device and the other processing devices, and is particularly suitable for data that cannot be entirely held in the internal storage of the neural network arithmetic device or the other processing devices.
The combined processing device can serve as the system-on-chip (SoC) of devices such as mobile phones, robots, unmanned aerial vehicles, and video surveillance equipment, effectively reducing the die area of the control portion, increasing processing speed, and lowering overall power consumption. In this case, the general interconnection interface of the combined processing device is connected to certain components of the device, such as a camera, a display, a mouse, a keyboard, a network card, or a Wi-Fi interface.
In one possible implementation, the disclosure also provides a neural network chip, which includes the above neural network arithmetic device or combined processing device.
In one possible implementation, the disclosure also provides a chip packaging structure, which includes the above chip.
In one possible implementation, the disclosure also provides a board card, which includes the above chip packaging structure.
In one possible implementation, the disclosure also provides an electronic device, which includes the above board card.
The electronic device includes a data processing device, robot, computer, printer, scanner, tablet computer, intelligent terminal, mobile phone, driving recorder, navigator, sensor, webcam, server, cloud server, camera, video camera, projector, watch, earphone, mobile storage, wearable device, vehicle, household appliance, and/or medical device.
The vehicle includes an aircraft, a ship, and/or a car; the household appliance includes a television, an air conditioner, a microwave oven, a refrigerator, an electric rice cooker, a humidifier, a washing machine, an electric lamp, a gas stove, and a range hood; the medical device includes a nuclear magnetic resonance instrument, a B-mode ultrasound scanner, and/or an electrocardiograph.
It should be noted that, for ease of description, the foregoing method embodiments are presented as a series of combined actions. However, those skilled in the art should understand that the disclosure is not limited by the described order of actions, because according to the disclosure, some steps may be performed in other orders or simultaneously. Further, those skilled in the art should also understand that the embodiments described in this specification are optional embodiments, and the actions and modules involved are not necessarily required by the disclosure.
In the above embodiments, each embodiment has its own emphasis; for parts not detailed in one embodiment, reference may be made to the related descriptions of other embodiments.
In the several embodiments provided by this disclosure, it should be understood that the disclosed device may be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division into units, for instance, is only a division by logical function; other divisions are possible in actual implementation. For example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. Moreover, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, devices, or units, and may be electrical or take other forms.
The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the embodiment.
In addition, the functional units in the embodiments of the disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software program module.
If the integrated unit is implemented in the form of a software program module and sold or used as an independent product, it may be stored in a computer-readable memory. Based on this understanding, the technical solution of the disclosure, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a memory and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods of the embodiments of the disclosure. The aforementioned memory includes various media that can store program code, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.
Those of ordinary skill in the art can understand that all or some of the steps of the methods of the above embodiments may be completed by a program instructing related hardware. The program may be stored in a computer-readable memory, and the memory may include a flash drive, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or the like.
The embodiments of the disclosure have been described in detail above, and specific examples are used herein to explain the principles and implementations of the disclosure. The above description of the embodiments is intended only to help understand the method of the disclosure and its core ideas. At the same time, those skilled in the art may make changes to the specific implementations and the scope of application according to the ideas of the disclosure. In summary, the content of this specification should not be construed as limiting the disclosure.
Aspects of the disclosure are described herein with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the disclosure. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer-readable program instructions.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to multiple embodiments of the disclosure. In this regard, each block in the flowcharts or block diagrams may represent a module, program segment, or portion of instructions that contains one or more executable instructions for implementing the specified logical functions. In some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two consecutive blocks may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by special-purpose hardware-based systems that perform the specified functions or actions, or by combinations of special-purpose hardware and computer instructions.
The embodiments of the disclosure have been described above. The foregoing description is exemplary rather than exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terms used herein were chosen to best explain the principles of the embodiments, their practical application, or the technical improvement over technologies in the market, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.
Claims (12)
1. An operation method, characterized in that the method includes:
segmenting data of a matrix addition algorithm according to segment information to obtain segment data of the data, where the data includes input data and output data of the matrix addition algorithm;
determining a first intermediate representation of the data according to the segment data of the data;
determining a storage space in the on-chip cache corresponding to the first intermediate representation of the data;
generating a second intermediate representation of the data according to the storage space; and
generating a second intermediate representation of the matrix addition algorithm according to the second intermediate representation of the data and a second intermediate representation of the matrix addition operator.
2. the method according to claim 1, wherein the method also includes:
The first executable instruction of the matrix computation system is generated according to the first centre expression of the matrix computation system, or
The second executable instruction of the matrix computation system is generated according to the second of the matrix computation system the centre expression.
3. the method according to claim 1, wherein the segment information is determined according to the size that on piece caches.
4. the method according to claim 1, wherein the input data of the matrix computation system includes that N row C is arranged
4th matrix data of third matrix data and N row C column obtains institute according to segment information by the data sectional of matrix computation system
State the segment data of data, comprising:
Third matrix data and the 4th matrix data are respectively divided into N sections according to segment information, obtain point of third matrix data
The segment data of segment data and the 4th matrix data, wherein the every segment length of the segment data of third matrix data is C, the 4th square
The every segment length of segment data of battle array data is C.
5. An arithmetic device, characterized in that the device includes:
a segment data obtaining module, configured to segment data of a matrix addition algorithm according to segment information to obtain segment data of the data, where the data includes input data and output data of the matrix addition algorithm;
a data intermediate representation determining module, configured to determine a first intermediate representation of the data according to the segment data of the data;
a storage space determining module, configured to determine a storage space in the on-chip cache corresponding to the first intermediate representation of the data;
a second intermediate representation determining module, configured to generate a second intermediate representation of the data according to the storage space; and
an algorithm intermediate representation determining module, configured to generate a second intermediate representation of the matrix addition algorithm according to the second intermediate representation of the data and a second intermediate representation of the matrix addition operator.
6. The device according to claim 5, characterized in that the device further includes:
a first executable instruction generation module, configured to generate a first executable instruction of the matrix addition algorithm according to a first intermediate representation of the matrix addition algorithm; or
a second executable instruction generation module, configured to generate a second executable instruction of the matrix addition algorithm according to the second intermediate representation of the matrix addition algorithm.
7. The device according to claim 5, characterized in that the segment information is determined according to the size of the on-chip cache.
8. The device according to claim 5, characterized in that the input data of the matrix addition algorithm includes third matrix data of N rows and C columns and fourth matrix data of N rows and C columns, and segmenting the data of the matrix addition algorithm according to the segment information to obtain the segment data of the data includes:
dividing the third matrix data and the fourth matrix data into N segments each according to the segment information to obtain the segment data of the third matrix data and the segment data of the fourth matrix data, where each segment of the third matrix data has length C and each segment of the fourth matrix data has length C.
9. A neural network arithmetic device, characterized in that the neural network arithmetic device includes one or more arithmetic devices according to any one of claims 5 to 8, and the neural network arithmetic device is configured to complete a specified neural network operation.
10. A combined processing device, characterized in that the combined processing device includes one or more neural network arithmetic devices according to claim 9, a general interconnection interface, and other processing devices;
the neural network arithmetic device interacts with the other processing devices to jointly complete the computing operation specified by the user.
11. A neural network chip, characterized in that the neural network chip includes:
the arithmetic device according to any one of claims 5 to 8; or
the neural network arithmetic device according to claim 9; or
the combined processing device according to claim 10.
12. An electronic device, characterized in that the electronic device includes:
the arithmetic device according to any one of claims 5 to 8; or
the neural network arithmetic device according to claim 9; or
the combined processing device according to claim 10; or
the neural network chip according to claim 11.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811456746.8A CN109543836B (en) | 2018-11-30 | 2018-11-30 | Operation method, device and related product |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109543836A true CN109543836A (en) | 2019-03-29 |
CN109543836B CN109543836B (en) | 2021-08-03 |
Family
ID=65852539
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811456746.8A Active CN109543836B (en) | 2018-11-30 | 2018-11-30 | Operation method, device and related product |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109543836B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105975639A (en) * | 2016-07-04 | 2016-09-28 | 北京百度网讯科技有限公司 | Search result ordering method and device |
CN106529565A (en) * | 2016-09-23 | 2017-03-22 | 北京市商汤科技开发有限公司 | Target identification model training and target identification method and device, and computing equipment |
CN106611216A (en) * | 2016-12-29 | 2017-05-03 | 北京旷视科技有限公司 | Computing method and device based on neural network |
CN107168955A (en) * | 2017-05-23 | 2017-09-15 | 南京大学 | Word insertion and the Chinese word cutting method of neutral net using word-based context |
US20180246853A1 (en) * | 2017-02-28 | 2018-08-30 | Microsoft Technology Licensing, Llc | Hardware node with matrix-vector multiply tiles for neural network processing |
CN108875482A (en) * | 2017-09-14 | 2018-11-23 | 北京旷视科技有限公司 | Object detecting method and device, neural network training method and device |
Non-Patent Citations (1)
Title |
---|
NADAV ROTEM et al.: "Glow: Graph Lowering Compiler Techniques for Neural Networks", arXiv:1805.00907v2 [cs.PL] * 
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109104876B (en) | Arithmetic device and related product | |
US20210117810A1 (en) | On-chip code breakpoint debugging method, on-chip processor, and chip breakpoint debugging system | |
CN107832845A (en) | A kind of information processing method and Related product | |
CN109219821A (en) | Arithmetic unit and method | |
CN110502330A (en) | Processor and processing method | |
CN109284815A (en) | Neural network model algorithm Compilation Method, device and Related product | |
CN109543825A (en) | Neural network model algorithm Compilation Method, device and Related product | |
CN111860807B (en) | Fractal calculation device, fractal calculation method, integrated circuit and board card | |
CN114580606A (en) | Data processing method, data processing device, computer equipment and storage medium | |
CN113469336A (en) | Compiling method and execution method for optimizing neural network model and related products | |
CN110163349A (en) | A kind of calculation method and device of network model | |
CN109583579A (en) | Computing device and Related product | |
CN109543835A (en) | Operation method, device and Related product | |
CN109558565A (en) | Operation method, device and Related product | |
CN109542837A (en) | Operation method, device and Related product | |
US11710031B2 (en) | Parallel processing circuits for neural networks | |
CN109582277A (en) | Data processing method, device and Related product | |
CN109543836A (en) | Operation method, device and Related product | |
CN109583580A (en) | Operation method, device and Related product | |
CN109543834A (en) | Operation method, device and Related product | |
CN109543833A (en) | Operation method, device and Related product | |
CN109558564A (en) | Operation method, device and Related product | |
CN109558943A (en) | Operation method, device and Related product | |
CN110472734A (en) | A kind of computing device and Related product | |
CN116185377A (en) | Optimization method and device for calculation graph and related product |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||