CN108734288A - Operation method and device - Google Patents

Operation method and device

Info

Publication number
CN108734288A
CN108734288A (application CN201710269049.0A)
Authority
CN
China
Prior art keywords
data
model
operation instruction
input
data to be processed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710269049.0A
Other languages
Chinese (zh)
Other versions
CN108734288B (en)
Inventor
Inventor not disclosed
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Cambricon Information Technology Co Ltd
Original Assignee
Shanghai Cambricon Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority to CN201710269049.0A (CN108734288B)
Application filed by Shanghai Cambricon Information Technology Co Ltd
Priority to KR1020197025307A (KR102292349B1)
Priority to EP19214371.7A (EP3786786B1)
Priority to CN202410405915.4A (CN118690805A)
Priority to PCT/CN2018/083415 (WO2018192500A1)
Priority to CN201880000923.3A (CN109121435A)
Priority to US16/476,262 (US11531540B2)
Priority to KR1020197038135A (KR102258414B1)
Priority to JP2019549467A (JP6865847B2)
Priority to EP18788355.8A (EP3614259A4)
Priority to EP19214320.4A (EP3654172A1)
Priority to CN201811097653.0A (CN109376852B)
Publication of CN108734288A
Priority to US16/697,533 (US11531541B2)
Priority to US16/697,727 (US11698786B2)
Priority to US16/697,637 (US11720353B2)
Priority to US16/697,687 (US11734002B2)
Priority to JP2019228383A (JP6821002B2)
Application granted
Publication of CN108734288B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N 3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 8/00 Arrangements for software engineering
    • G06F 8/30 Creation or generation of source code
    • G06F 8/35 Creation or generation of source code model driven
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Neurology (AREA)
  • Image Analysis (AREA)
  • Devices For Executing Special Programs (AREA)
  • Machine Translation (AREA)

Abstract

An operation method and device are provided. The operation device includes: an input module for inputting data; a model generation module for constructing a model from the input data; a neural network operation module for generating and caching operation instructions based on the model, and performing operations on data to be processed according to the operation instructions to obtain an operation result; and an output module for outputting the operation result. The device and method of the present disclosure avoid the overhead incurred by running an entire software architecture, as conventional methods do.

Description

Operation method and device
Technical field
The present disclosure relates to the fields of computer architecture, deep learning, and neural networks, and more particularly to an operation method and device.
Background
Deep learning is a branch of machine learning. It attempts to use algorithms built from complex structures, or from multiple nonlinear transformations composed of several processing layers, to model high-level abstractions in data.
Deep learning is a representation-learning-based method within machine learning. An observation (for example, an image) can be represented in many ways, such as a vector of per-pixel intensity values, or more abstractly as a set of edges, regions of particular shapes, and so on. Certain representations make it easier to learn tasks from examples (for example, face recognition or facial expression recognition).
To date, several deep learning architectures, such as deep neural networks, convolutional neural networks, deep belief networks, and recurrent neural networks, have been applied to fields including computer vision, speech recognition, natural language processing, audio recognition, and bioinformatics, with excellent results. In addition, "deep learning" has become a near-synonym for, or a rebranding of, neural networks.
With the surge of deep learning (neural networks), neural network accelerators have emerged. Through dedicated memory and operation-module designs, a neural network accelerator can achieve speedups of tens or even hundreds of times over a general-purpose processor when performing deep learning operations, with smaller area and lower power consumption.
To make it convenient to apply neural network accelerators to various network structures for accelerated operation, programming software libraries and programming frameworks based on them have also emerged and continue to develop. In previous application scenarios, the neural network accelerator programming framework usually sits at the top layer; commonly used programming frameworks include Caffe, TensorFlow, Torch, etc. As shown in Fig. 1, from the bottom layer upward the stack consists of: the neural network accelerator (dedicated hardware for neural network operations), the hardware driver (allowing software to invoke the neural network accelerator), the neural network accelerator programming library (providing interfaces for calling the neural network accelerator), the neural network accelerator programming framework, and the high-level application requiring neural network operations. In some low-memory, real-time application scenarios, running this entire software architecture would consume excessive computing resources. Therefore, for specific application scenarios, how to optimize the computation process is one of the problems to be solved.
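By way of illustration only, the following minimal Python sketch mimics the layered call chain of Fig. 1, in which every inference traverses the whole stack; all class and method names here are hypothetical stand-ins, not any real accelerator SDK:

```python
class Driver:
    """Stands in for the hardware driver layer."""
    def submit(self, instructions, data):
        # A real driver would DMA the data to the accelerator and launch
        # the instruction stream; here we only fake a result string.
        return f"ran {len(instructions)} instruction(s) on {data!r}"

class ProgrammingLibrary:
    """Stands in for the neural network accelerator programming library."""
    def __init__(self):
        self.driver = Driver()
    def run_layer(self, layer, data):
        instructions = [("op", layer)]  # compile a single layer
        return self.driver.submit(instructions, data)

class Framework:
    """Stands in for a top-layer framework such as Caffe/TensorFlow/Torch."""
    def __init__(self, network_structure):
        self.layers = network_structure
        self.lib = ProgrammingLibrary()
    def infer(self, data):
        # The entire software stack stays resident for every call,
        # which is the overhead the disclosure aims to avoid.
        for layer in self.layers:
            data = self.lib.run_layer(layer, data)
        return data

print(Framework(["conv1", "relu1", "fc1"]).infer("input picture"))
```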
Summary of the invention
In view of the above problems, the purpose of the present disclosure is to provide an operation method and device to solve at least one of the above technical problems.
To achieve the above object, as one aspect of the present disclosure, an operation method is provided, including the following steps (a minimal code sketch of this three-branch flow is given after the steps below):
When the input data includes data to be processed, a network structure, and weight data, the following steps are executed:
Step 11: input and read the input data;
Step 12: construct an offline model from the network structure and the weight data;
Step 13: parse the offline model to obtain operation instructions, and cache them for subsequent calls;
Step 14: perform operations on the data to be processed according to the operation instructions, obtaining an operation result for output.
When the input data includes data to be processed and an offline model, the following steps are executed:
Step 21: input and read the input data;
Step 22: parse the offline model to obtain operation instructions, and cache them for subsequent calls;
Step 23: perform operations on the data to be processed according to the operation instructions, obtaining an operation result for output.
When the input data includes only data to be processed, the following steps are executed:
Step 31: input and read the input data;
Step 32: call the cached operation instructions and perform operations on the data to be processed, obtaining an operation result for output.
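The three branches above can be summarized by the following minimal sketch, assuming hypothetical stand-ins build_offline_model, parse_to_instructions, and execute for the model generation, model parsing, and neural network processing stages (none of these names a real API):

```python
# Hypothetical placeholders; a real system would target accelerator hardware.
def build_offline_model(network_structure, weight_data):
    return {"structure": network_structure, "weights": weight_data}   # step 12

def parse_to_instructions(offline_model):
    return [("op", layer) for layer in offline_model["structure"]]    # steps 13/22

def execute(instructions, data):
    return f"ran {len(instructions)} instruction(s) on {data!r}"      # steps 14/23/32

_instruction_cache = None  # operation instructions kept between calls

def operate(data, network_structure=None, weight_data=None, offline_model=None):
    """Three-branch dispatch mirroring steps 11-14, 21-23 and 31-32."""
    global _instruction_cache
    if network_structure is not None and weight_data is not None:
        offline_model = build_offline_model(network_structure, weight_data)
    if offline_model is not None:
        _instruction_cache = parse_to_instructions(offline_model)
    if _instruction_cache is None:
        raise RuntimeError("no cached instructions; a model must be supplied first")
    return execute(_instruction_cache, data)
```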
Further, the step of performing operations on the data to be processed according to the operation instructions to obtain an operation result is implemented by a neural network processing unit.
Further, the neural network processing unit has an instruction cache unit for caching the operation instructions for subsequent calls.
Further, the offline model may be any of various neural network models, including Cambricon_model, AlexNet_model, GoogleNet_model, VGG_model, R-CNN_model, GAN_model, LSTM_model, RNN_model, ResNet_model, etc.
Further, the data to be processed is input that a neural network can process.
Further, the data to be processed includes consecutive single pictures, voice, or a video stream.
Further, the network structure includes AlexNet, GoogleNet, ResNet, VGG, R-CNN, GAN, LSTM, RNN, and other possible neural network structures.
To achieve the above object, as another aspect of the present disclosure, an operation device is provided, including:
an input module for inputting data, the data including data to be processed, a network structure and weight data, and/or offline model data;
a model generation module for constructing an offline model from the input network structure and weight data;
a neural network operation module for generating and caching operation instructions based on the offline model, and performing operations on the data to be processed based on the operation instructions to obtain an operation result;
an output module for outputting the operation result; and
a control module for detecting the type of the input data and performing the following operations:
when the input data includes data to be processed, a network structure, and weight data, controlling the input module to feed the network structure and weight data into the model generation module to construct an offline model, and controlling the neural network operation module to perform operations, based on the offline model produced by the model generation module, on the data to be processed input by the input module;
when the input data includes data to be processed and an offline model, controlling the input module to feed the data to be processed and the offline model into the neural network operation module, and controlling the neural network operation module to generate and cache operation instructions based on the offline model and to perform operations on the data to be processed based on the operation instructions;
when the input data includes only data to be processed, controlling the input module to feed the data to be processed into the neural network operation module, and controlling the neural network operation module to call the cached operation instructions and perform operations on the data to be processed.
Further, the neural network operation module includes a model parsing unit and a neural network processing unit, wherein:
the model parsing unit is configured to generate operation instructions based on the offline model; and
the neural network processing unit is configured to cache the operation instructions for subsequent calls, or, when the input data includes only data to be processed, to call the cached operation instructions and perform operations on the data to be processed based on them to obtain an operation result.
Further, the neural network processing unit has an instruction cache unit for caching the operation instructions for subsequent calls.
The operation method and device proposed by the present disclosure have the following beneficial effects:
1. With the method and device of the present disclosure, once an offline model has been generated, operations can be performed directly from that offline model, avoiding the overhead of running the entire software architecture, including the deep learning framework.
2. The device and method of the present disclosure retarget the neural network processor more efficiently, so that it can deliver its full performance in low-memory, real-time application environments, with a simpler and faster computation process.
Brief description of the drawings
Fig. 1 is a programming framework in the prior art;
Fig. 2 is a flowchart of the operation method proposed by an embodiment of the present disclosure;
Fig. 3 is a structural block diagram of the operation device proposed by another embodiment of the present disclosure.
Detailed description
To make the purpose, technical solutions, and advantages of the present disclosure clearer, the present disclosure is described in further detail below with reference to specific embodiments and the accompanying drawings.
In this specification, the various embodiments described below to explain the principles of the present disclosure are illustrative only and should not be construed in any way as limiting its scope. The following description with reference to the accompanying drawings is provided to assist in a comprehensive understanding of exemplary embodiments of the disclosure as defined by the claims and their equivalents. It includes various details to aid understanding, but these are to be regarded as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the disclosure. In addition, descriptions of well-known functions and constructions are omitted for clarity and brevity. Throughout the drawings, the same reference numerals denote the same functions and operations.
The present disclosure discloses an operation method, including the following steps:
When the input data includes data to be processed, a network structure, and weight data, the following steps are executed:
Step 11: input and read the input data;
Step 12: construct an offline model from the network structure and the weight data;
Step 13: parse the offline model to obtain operation instructions, and cache them for subsequent calls;
Step 14: perform operations on the data to be processed according to the operation instructions, obtaining an operation result for output.
When the input data includes data to be processed and an offline model, the following steps are executed:
Step 21: input and read the input data;
Step 22: parse the offline model to obtain operation instructions, and cache them for subsequent calls;
Step 23: perform operations on the data to be processed according to the operation instructions, obtaining an operation result for output.
When the input data includes only data to be processed, the following steps are executed:
Step 31: input and read the input data;
Step 32: call the cached operation instructions and perform operations on the data to be processed, obtaining an operation result for output.
In some embodiments of the present disclosure, a neural network processing unit performs the operations on the data to be processed according to the operation instructions to obtain the operation result. Preferably, the neural network processing unit has an instruction cache unit for caching the received operation instructions; the pre-cached operation instructions referred to above are the operation instructions of the previous operation cached by the instruction cache unit.
In some embodiments of the present disclosure, the neural network processing unit also has a data cache unit for caching the data to be processed.
Based on the above operation method, the present disclosure also discloses an operation device, including:
an input module for inputting data, the data including data to be processed, a network structure and weight data, and/or offline model data;
a model generation module for constructing an offline model from the input network structure and weight data;
a neural network operation module for generating and caching operation instructions based on the offline model, and performing operations on the data to be processed based on the operation instructions to obtain an operation result;
an output module for outputting the operation result; and
a control module for detecting the type of the input data and performing the following operations:
when the input data includes data to be processed, a network structure, and weight data, controlling the input module to feed the network structure and weight data into the model generation module to construct an offline model, and controlling the neural network operation module to perform operations, based on the offline model produced by the model generation module, on the data to be processed input by the input module;
when the input data includes data to be processed and an offline model, controlling the input module to feed the data to be processed and the offline model into the neural network operation module, and controlling the neural network operation module to generate and cache operation instructions based on the offline model and to perform operations on the data to be processed based on the operation instructions;
when the input data includes only data to be processed, controlling the input module to feed the data to be processed into the neural network operation module, and controlling the neural network operation module to call the cached operation instructions and perform operations on the data to be processed.
The neural network operation module includes a model parsing unit and a neural network processing unit, wherein:
the model parsing unit is configured to generate operation instructions based on the offline model; and
the neural network processing unit is configured to cache the operation instructions for subsequent calls, or, when the input data includes only data to be processed, to call the cached operation instructions and perform operations on the data to be processed based on them to obtain the operation result.
In some embodiments of the present disclosure, the neural network processing unit has an instruction cache unit for caching the operation instructions for subsequent calls.
In some embodiments of the present disclosure, the offline model is a text file defined according to a special structure and may be any of various neural network models, such as Cambricon_model, AlexNet_model, GoogleNet_model, VGG_model, R-CNN_model, GAN_model, LSTM_model, RNN_model, ResNet_model, etc., without being limited to the models listed in this embodiment.
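As an illustration only, the sketch below writes out and reloads such a structured text file; the JSON layout used here is an assumption for demonstration, not the actual Cambricon_model format, which the disclosure does not reproduce:

```python
import json

# Hypothetical offline model content standing in for the real format.
offline_model = {
    "structure": ["conv1", "relu1", "pool1", "fc1"],
    "weights": {"conv1": [0.1, 0.2], "fc1": [0.3]},
}

# Persist the offline model so a later run can supply it on its own,
# skipping the model generation step entirely.
with open("model.cambricon.json", "w") as f:
    json.dump(offline_model, f)

with open("model.cambricon.json") as f:
    reloaded = json.load(f)

assert reloaded == offline_model  # the model round-trips intact
```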
In some embodiments of the present disclosure, the data to be processed is input that a neural network can process, for example, any of consecutive single pictures, voice, or a video stream.
In some embodiments of the present disclosure, the network structure may be any of various neural network structures, for example AlexNet, GoogleNet, ResNet, VGG, R-CNN, GAN, LSTM, RNN, etc., without being limited to the structures listed in this embodiment.
Specifically, depending on the data input by the input module, the operation device of the present disclosure has the following three working principles:
1. When the data input by the input module are a network structure, weight data, and data to be processed, the control module controls the input module to transmit the network structure and weight data to the model generation module, and the data to be processed to the model parsing unit. The control module controls the model generation module to generate an offline model from the network structure and weight data, and to transmit the generated offline model to the model parsing unit. The control module controls the model parsing unit to parse the received offline model into operation instructions recognizable by the neural network processing unit, and to transmit the operation instructions and the data to be processed to the neural network processing unit it comprises. The neural network processing unit performs operations on the data to be processed according to the received operation instructions, obtains a determined operation result, and transmits the operation result to the output module for output.
2. When the data input by the input module are an offline model and data to be processed, the control module controls the input module to transmit the offline model and the data to be processed directly to the model parsing unit; the subsequent working principle is the same as in the first case.
3. When the data input by the input module include only data to be processed, the control module controls the input module to transmit the data to be processed to the neural network processing unit via the model parsing unit, and the neural network processing unit performs operations on the data to be processed according to the cached operation instructions to obtain the operation result. This case normally does not occur on the first use of the neural network processor, which ensures that determined operation instructions already exist in the instruction cache.
Therefore, when the offline model of the current network operation differs from that of the previous network operation, the data input by the input module should include the network structure, the weight data, and the data to be processed, and subsequent network operations are performed after the model generation module generates a new offline model. When the current network operation is the first one and a corresponding offline model has been obtained in advance, the input should include the offline model and the data to be processed. When the current network operation is not the first one and its offline model is the same as that of the previous network operation, the input need only include the data to be processed. A module-level sketch of these three cases is given below.
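The following minimal Python sketch uses hypothetical module classes to illustrate the dispatch logic only, not the patented hardware implementation:

```python
class ModelGenerationModule:
    """Builds an offline model from a network structure and weight data."""
    def generate(self, network_structure, weight_data):
        return {"structure": network_structure, "weights": weight_data}

class NeuralNetworkProcessor:
    """Runs instructions; holds an instruction cache unit and a data cache unit."""
    def __init__(self):
        self.instruction_cache = None
        self.data_cache = None
    def run(self, instructions, data):
        if instructions is not None:
            self.instruction_cache = instructions   # cache for later calls
        if self.instruction_cache is None:
            raise RuntimeError("first use needs a model (precondition of case 3)")
        self.data_cache = data
        return f"ran {len(self.instruction_cache)} instruction(s) on {data!r}"

class ModelParsingUnit:
    """Parses an offline model into processor-recognizable instructions."""
    def __init__(self, processor):
        self.processor = processor
    def forward(self, offline_model, data):
        instructions = None
        if offline_model is not None:
            instructions = [("op", layer) for layer in offline_model["structure"]]
        return self.processor.run(instructions, data)

class OperationDevice:
    """Control-module dispatch over the three input cases."""
    def __init__(self):
        self.generator = ModelGenerationModule()
        self.parser = ModelParsingUnit(NeuralNetworkProcessor())
    def __call__(self, data, network_structure=None, weight_data=None,
                 offline_model=None):
        if network_structure is not None and weight_data is not None:  # case 1
            offline_model = self.generator.generate(network_structure, weight_data)
        return self.parser.forward(offline_model, data)                # cases 2 and 3
```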
In some embodiments of the present disclosure, the operation device described herein is integrated as a sub-module into the central processing unit module of a whole computer system. The central processing unit controls the transfer of the data to be processed and the offline model into the operation device. The model parsing unit parses the incoming neural network offline model and generates operation instructions. The operation instructions and the data to be processed are then passed into the neural network processing unit, the operation result is obtained through the computation, and the result is returned to the main memory unit. In the subsequent computation process the network structure no longer changes, so the neural network computation can be completed by simply feeding in data to be processed continuously, yielding operation results, as the short usage example below illustrates.
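Continuing the sketch above with the same hypothetical classes, repeated inference then reduces to calls that pass in only new data:

```python
device = OperationDevice()

# Case 1: structure and weights supplied; an offline model is generated,
# parsed, and its instructions cached.
print(device("frame_0", network_structure=["conv1", "fc1"],
             weight_data={"conv1": [0.1]}))

# Case 3: the network no longer changes, so subsequent calls pass only
# data to be processed and reuse the cached instructions.
for frame in ["frame_1", "frame_2", "frame_3"]:
    print(device(frame))
```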
The operation device and method proposed by the present disclosure are described in detail below through specific embodiments.
Embodiment 1
As shown in Fig. 2, this embodiment proposes an operation method including the following steps:
When the input data includes data to be processed, a network structure, and weight data, the following steps are executed:
Step 11: input and read the input data;
Step 12: construct an offline model from the network structure and the weight data;
Step 13: parse the offline model to obtain operation instructions, and cache them for subsequent calls;
Step 14: perform operations on the data to be processed according to the operation instructions, obtaining a neural network operation result for output.
When the input data includes data to be processed and an offline model, the following steps are executed:
Step 21: input and read the input data;
Step 22: parse the offline model to obtain operation instructions, and cache them for subsequent calls;
Step 23: perform operations on the data to be processed according to the operation instructions, obtaining a neural network operation result for output.
When the input data includes only data to be processed, the following steps are executed:
Step 31: input and read the input data;
Step 32: call the cached operation instructions and perform operations on the data to be processed, obtaining a neural network operation result for output.
A neural network processing unit processes the data to be processed according to the operation instructions to obtain the operation result; this neural network processing unit has an instruction cache unit and a data cache unit for caching the received operation instructions and the data to be processed, respectively.
In this embodiment, the input network structure is AlexNet, the weight data is bvlc_alexnet.caffemodel, the data to be processed are consecutive single pictures, and the offline model is Cambricon_model.
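In terms of the dispatch sketch given earlier in the summary, the calls in this embodiment would take roughly the following shape; the layer list merely stands in for the real AlexNet structure, and the actual loading of bvlc_alexnet.caffemodel is not shown:

```python
# First picture: steps 11-14 build the offline model from the AlexNet
# structure and weights, parse it, cache the instructions, and run.
result = operate("picture_0",
                 network_structure=["conv1", "conv2", "conv3", "fc8"],
                 weight_data="bvlc_alexnet.caffemodel")

# Remaining pictures: steps 31-32 reuse the cached instructions.
for picture in ["picture_1", "picture_2"]:
    result = operate(picture)
```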
In summary, the method proposed in this embodiment greatly simplifies the flow of performing operations with a neural network processor, avoiding the extra memory and I/O overhead incurred by invoking a whole traditional programming framework. With this method, a neural network accelerator can deliver its full operational performance in low-memory, real-time environments.
Embodiment 2
As shown in Fig. 3, this embodiment proposes an operation device including an input module 101, a model generation module 102, a neural network operation module 103, an output module 104, and a control module 105, where the neural network operation module 103 includes a model parsing unit 106 and a neural network processor 107.
The key feature of this device is offline execution: once an offline model has been generated, the relevant operation instructions are generated directly from the offline model, the weight data is passed in, and the data to be processed is processed. More specifically:
The input module 101 is configured to input either the combination of a network structure, weight data, and data to be processed, or the combination of an offline model and data to be processed. When the input is a network structure, weight data, and data to be processed, the network structure and weight data are passed to the model generation module 102 to generate an offline model, and the following operations are then executed. When the input is an offline model and data to be processed, the offline model and the data to be processed are passed directly to the model parsing unit 106, and the following operations are then executed.
The output module 104 is configured to output the determined operation data generated from the specific network structure and a set of data to be processed, where the output data is obtained by the operations of the neural network processor 107.
The model generation module 102 is configured to generate, from the input network structure parameters and weight data, an offline model for use by the layers below.
The model parsing unit 106 is configured to parse the incoming offline model, generate operation instructions that can be passed directly into the neural network processor 107, and forward the data to be processed received from the input module 101 to the neural network processor 107.
The neural network processor 107 is configured to perform operations according to the incoming operation instructions and data to be processed, obtain a determined operation result, and pass it to the output module 104; it has an instruction cache unit and a data cache unit.
The control module 105 is configured to detect the type of the input data and perform the following operations:
when the input data includes data to be processed, a network structure, and weight data, controlling the input module 101 to feed the network structure and weight data into the model generation module 102 to construct an offline model, and controlling the neural network operation module 103 to perform neural network operations, based on the offline model produced by the model generation module 102, on the data to be processed input by the input module 101;
when the input data includes data to be processed and an offline model, controlling the input module 101 to feed the data to be processed and the offline model into the neural network operation module 103, and controlling the neural network operation module 103 to generate and cache operation instructions based on the offline model and to perform neural network operations on the data to be processed based on the operation instructions;
when the input data includes only data to be processed, controlling the input module 101 to feed the data to be processed into the neural network operation module 103, and controlling the neural network operation module 103 to call the cached operation instructions and perform neural network operations on the data to be processed.
In this embodiment, the input network structure is AlexNet, the weight data is bvlc_alexnet.caffemodel, and the data to be processed are consecutive single pictures. The model generation module 102 generates a new offline model Cambricon_model from the input network structure and weight data, and the generated offline model Cambricon_model can also be used alone as the input of a next run. The model parsing unit 106 parses the offline model Cambricon_model to generate a series of operation instructions, transfers the generated operation instructions into the instruction cache unit on the neural network processor 107, and transfers the input picture received from the input module 101 into the data cache unit on the neural network processor 107.
The processes or methods depicted in the preceding figures may be performed by processing logic comprising hardware, software, or a combination of both. Although the processes or methods are described above in a certain order, it should be understood that certain of the described operations may be executed in a different order. Moreover, some operations may be performed in parallel rather than sequentially.
The specific embodiments described above further explain in detail the purpose, technical solutions, and beneficial effects of the present disclosure. It should be understood that the above are merely specific embodiments of the present disclosure and are not intended to limit it; any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present disclosure shall fall within its scope of protection.

Claims (10)

1. An operation method, including the following steps:
when the input data includes data to be processed, a network structure, and weight data, executing the following steps:
step 11: inputting and reading the input data;
step 12: constructing an offline model from the network structure and the weight data;
step 13: parsing the offline model to obtain operation instructions, and caching them for subsequent calls;
step 14: performing operations on the data to be processed according to the operation instructions, obtaining an operation result for output;
when the input data includes data to be processed and an offline model, executing the following steps:
step 21: inputting and reading the input data;
step 22: parsing the offline model to obtain operation instructions, and caching them for subsequent calls;
step 23: performing operations on the data to be processed according to the operation instructions, obtaining an operation result for output;
when the input data includes only data to be processed, executing the following steps:
step 31: inputting and reading the input data;
step 32: calling the cached operation instructions and performing operations on the data to be processed, obtaining an operation result for output.
2. The operation method according to claim 1, wherein the step of performing operations on the data to be processed according to the operation instructions to obtain the operation result is implemented by a neural network processing unit.
3. The operation method according to claim 2, wherein the neural network processing unit has an instruction cache unit for caching the operation instructions for subsequent calls.
4. The operation method according to any one of claims 1 to 3, wherein the offline model is a neural network model, and the neural network model includes Cambricon_model, AlexNet_model, GoogleNet_model, VGG_model, R-CNN_model, GAN_model, LSTM_model, RNN_model, and ResNet_model.
5. The operation method according to any one of claims 1 to 4, wherein the data to be processed is input that a neural network can process.
6. The operation method according to claim 5, wherein the data to be processed includes consecutive single pictures, voice, or a video stream.
7. The operation method according to any one of claims 1 to 6, wherein the network structure is a neural network structure, and the neural network structure includes AlexNet, GoogleNet, ResNet, VGG, R-CNN, GAN, LSTM, and RNN.
8. An operation device, including:
an input module for inputting data, the data including data to be processed, a network structure and weight data, and/or offline model data;
a model generation module for constructing an offline model from the input network structure and weight data;
a neural network operation module for generating and caching operation instructions based on the offline model, and performing operations on the data to be processed based on the operation instructions to obtain an operation result;
an output module for outputting the operation result; and
a control module for detecting the type of the input data and performing the following operations:
when the input data includes data to be processed, a network structure, and weight data, controlling the input module to feed the network structure and weight data into the model generation module to construct an offline model, and controlling the neural network operation module to perform operations, based on the offline model produced by the model generation module, on the data to be processed input by the input module;
when the input data includes data to be processed and an offline model, controlling the input module to feed the data to be processed and the offline model into the neural network operation module, and controlling the neural network operation module to generate and cache operation instructions based on the offline model and to perform operations on the data to be processed based on the operation instructions;
when the input data includes only data to be processed, controlling the input module to feed the data to be processed into the neural network operation module, and controlling the neural network operation module to call the cached operation instructions and perform operations on the data to be processed.
9. The operation device according to claim 8, wherein the neural network operation module includes a model parsing unit and a neural network processing unit, wherein:
the model parsing unit is configured to generate operation instructions based on the offline model; and
the neural network processing unit is configured to cache the operation instructions for subsequent calls, or, when the input data includes only data to be processed, to call the cached operation instructions and perform operations on the data to be processed based on the operation instructions to obtain the operation result.
10. The operation device according to claim 8 or 9, wherein the neural network processing unit has an instruction cache unit for caching the operation instructions for subsequent calls.
CN201710269049.0A 2017-04-19 2017-04-21 Operation method and device Active CN108734288B (en)

Priority Applications (17)

Application Number Priority Date Filing Date Title
CN201710269049.0A CN108734288B (en) 2017-04-21 2017-04-21 Operation method and device
EP18788355.8A EP3614259A4 (en) 2017-04-19 2018-04-17 Processing apparatus and processing method
CN202410405915.4A CN118690805A (en) 2017-04-19 2018-04-17 Processing apparatus and processing method
PCT/CN2018/083415 WO2018192500A1 (en) 2017-04-19 2018-04-17 Processing apparatus and processing method
CN201880000923.3A CN109121435A (en) 2017-04-19 2018-04-17 Processing unit and processing method
US16/476,262 US11531540B2 (en) 2017-04-19 2018-04-17 Processing apparatus and processing method with dynamically configurable operation bit width
KR1020197038135A KR102258414B1 (en) 2017-04-19 2018-04-17 Processing apparatus and processing method
JP2019549467A JP6865847B2 (en) 2017-04-19 2018-04-17 Processing equipment, chips, electronic equipment and methods
KR1020197025307A KR102292349B1 (en) 2017-04-19 2018-04-17 Processing device and processing method
EP19214320.4A EP3654172A1 (en) 2017-04-19 2018-04-17 Fused vector multiplier and method using the same
CN201811097653.0A CN109376852B (en) 2017-04-21 2018-04-17 Arithmetic device and arithmetic method
EP19214371.7A EP3786786B1 (en) 2017-04-19 2018-04-17 Processing device, processing method, chip, and electronic apparatus
US16/697,533 US11531541B2 (en) 2017-04-19 2019-11-27 Processing apparatus and processing method
US16/697,727 US11698786B2 (en) 2017-04-19 2019-11-27 Processing apparatus and processing method
US16/697,637 US11720353B2 (en) 2017-04-19 2019-11-27 Processing apparatus and processing method
US16/697,687 US11734002B2 (en) 2017-04-19 2019-11-27 Counting elements in neural network input data
JP2019228383A JP6821002B2 (en) 2017-04-19 2019-12-18 Processing equipment and processing method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710269049.0A CN108734288B (en) 2017-04-21 2017-04-21 Operation method and device

Publications (2)

Publication Number Publication Date
CN108734288A 2018-11-02
CN108734288B CN108734288B (en) 2021-01-29

Family

ID=63934137

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201710269049.0A Active CN108734288B (en) 2017-04-19 2017-04-21 Operation method and device
CN201811097653.0A Active CN109376852B (en) 2017-04-19 2018-04-17 Arithmetic device and arithmetic method

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN201811097653.0A Active CN109376852B (en) 2017-04-19 2018-04-17 Arithmetic device and arithmetic method

Country Status (1)

Country Link
CN (2) CN108734288B (en)


Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112613597B (en) * 2020-11-30 2023-06-30 河南汇祥通信设备有限公司 Comprehensive pipe rack risk automatic identification convolutional neural network model and construction method
CN112947935B (en) * 2021-02-26 2024-08-13 上海商汤智能科技有限公司 Operation method and device, electronic equipment and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105005911A (en) * 2015-06-26 2015-10-28 深圳市腾讯计算机系统有限公司 Operating system for deep neural network and operating method
CN105512723A (en) * 2016-01-20 2016-04-20 南京艾溪信息科技有限公司 Artificial neural network calculating device and method for sparse connection
WO2016099779A1 (en) * 2014-12-19 2016-06-23 Intel Corporation Method and apparatus for distributed and cooperative computation in artificial neural networks
CN105930902A (en) * 2016-04-18 2016-09-07 中国科学院计算技术研究所 Neural network processing method and system
CN106228238A (en) * 2016-07-27 2016-12-14 中国科学技术大学苏州研究院 The method and system of degree of depth learning algorithm is accelerated on field programmable gate array platform
CN106355246A (en) * 2015-10-08 2017-01-25 上海兆芯集成电路有限公司 Tri-configuration neural network element
CN106529670A (en) * 2016-10-27 2017-03-22 中国科学院计算技术研究所 Neural network processor based on weight compression, design method, and chip

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20130090147A (en) * 2012-02-03 2013-08-13 안병익 Neural network computing apparatus and system, and method thereof
US9378455B2 (en) * 2012-05-10 2016-06-28 Yan M. Yufik Systems and methods for a computer understanding multi modal data streams
US20160162779A1 (en) * 2014-12-05 2016-06-09 RealMatch, Inc. Device, system and method for generating a predictive model by machine learning
CN106557332A (en) * 2016-11-30 2017-04-05 上海寒武纪信息科技有限公司 A kind of multiplexing method and device of instruction generating process


Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109685203A (en) * 2018-12-21 2019-04-26 北京中科寒武纪科技有限公司 Data processing method, device, computer system and storage medium
CN109726797A (en) * 2018-12-21 2019-05-07 北京中科寒武纪科技有限公司 Data processing method, device, computer system and storage medium
CN109697500A (en) * 2018-12-29 2019-04-30 北京中科寒武纪科技有限公司 Data processing method, device, electronic equipment and storage medium
US11983535B2 (en) 2019-03-22 2024-05-14 Cambricon Technologies Corporation Limited Artificial intelligence computing device and related product
CN110070176A (en) * 2019-04-18 2019-07-30 北京中科寒武纪科技有限公司 The processing method of off-line model, the processing unit of off-line model and Related product
CN111242321A (en) * 2019-04-18 2020-06-05 中科寒武纪科技股份有限公司 Data processing method and related product
US11762690B2 (en) 2019-04-18 2023-09-19 Cambricon Technologies Corporation Limited Data processing method and related products
CN111242321B (en) * 2019-04-18 2023-09-26 中科寒武纪科技股份有限公司 Data processing method and related product
CN110309917A (en) * 2019-07-05 2019-10-08 北京中科寒武纪科技有限公司 The verification method and relevant apparatus of off-line model
CN113490943A (en) * 2019-07-31 2021-10-08 华为技术有限公司 Integrated chip and method for processing sensor data
WO2021232958A1 (en) * 2020-05-18 2021-11-25 Oppo广东移动通信有限公司 Method and apparatus for executing operation, electronic device, and storage medium

Also Published As

Publication number Publication date
CN108734288B (en) 2021-01-29
CN109376852A (en) 2019-02-22
CN109376852B (en) 2021-01-29

Similar Documents

Publication Publication Date Title
CN108734288A (en) A kind of operation method and device
RU2771008C1 (en) Method and apparatus for processing tasks based on a neural network
US20230048031A1 (en) Data processing method and apparatus
WO2022170997A1 (en) Data processing method and system based on risc-v instruction set, and device and medium
TWI731373B (en) Chip, data processing method and computing equipment based on it
CN108416440A (en) A kind of training method of neural network, object identification method and device
CN110458280B (en) Convolutional neural network acceleration method and system suitable for mobile terminal
CN109242094A (en) Device and method for executing artificial neural network forward operation
CN106845631B (en) Stream execution method and device
CN108320018B (en) Artificial neural network operation device and method
US20220004858A1 (en) Method for processing artificial neural network, and electronic device therefor
CN109597965A (en) Data processing method, system, terminal and medium based on deep neural network
CN110633785B (en) Method and system for calculating convolutional neural network
CN108122031A (en) A kind of neutral net accelerator architecture of low-power consumption
CN111191789B (en) Model optimization deployment system, chip, electronic equipment and medium
CN109885406B (en) Operator calculation optimization method, device, equipment and storage medium
US20240129236A1 (en) Dqn-based distributed computing network coordinate flow scheduling system and method
CN112835712A (en) Multithreading special effect drawing method, device, system and medium
CN117032807A (en) AI acceleration processor architecture based on RISC-V instruction set
CN113190352B (en) General CPU-oriented deep learning calculation acceleration method and system
CN112862083A (en) Deep neural network inference method and device under edge environment
CN109189570B (en) MEC-based moving edge pre-calculation method
CN111738432B (en) Neural network processing circuit supporting self-adaptive parallel computation
Youlve et al. Asynchronous Distributed Proximal Policy Optimization Training Framework Based on GPU
XUANLEI et al. HeteGen: Efficient Heterogeneous Parallel Inference for Large Language Models on Resource-Constrained Devices

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant