CN108734288A - Operation method and device - Google Patents

Operation method and device

Info

Publication number
CN108734288A
CN108734288A (application CN201710269049.0A)
Authority
CN
China
Prior art keywords
data
model
operation instruction
input
data to be processed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710269049.0A
Other languages
Chinese (zh)
Other versions
CN108734288B (en)
Inventor
Inventor not disclosed
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Cambricon Information Technology Co Ltd
Original Assignee
Shanghai Cambricon Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority to CN201710269049.0A (CN108734288B)
Application filed by Shanghai Cambricon Information Technology Co Ltd
Priority to KR1020197025307A (KR102292349B1)
Priority to EP19214371.7A (EP3786786B1)
Priority to CN202410405915.4A (CN118690805A)
Priority to PCT/CN2018/083415 (WO2018192500A1)
Priority to CN201880000923.3A (CN109121435A)
Priority to US16/476,262 (US11531540B2)
Priority to KR1020197038135A (KR102258414B1)
Priority to JP2019549467A (JP6865847B2)
Priority to EP18788355.8A (EP3614259A4)
Priority to EP19214320.4A (EP3654172A1)
Priority to CN201811097653.0A (CN109376852B)
Publication of CN108734288A
Priority to US16/697,533 (US11531541B2)
Priority to US16/697,727 (US11698786B2)
Priority to US16/697,637 (US11720353B2)
Priority to US16/697,687 (US11734002B2)
Priority to JP2019228383A (JP6821002B2)
Application granted
Publication of CN108734288B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N 3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 8/00 Arrangements for software engineering
    • G06F 8/30 Creation or generation of source code
    • G06F 8/35 Creation or generation of source code model driven
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Neurology (AREA)
  • Image Analysis (AREA)
  • Devices For Executing Special Programs (AREA)
  • Machine Translation (AREA)

Abstract

An operation method and device are provided. The operation device includes: an input module for inputting data; a model generation module for constructing a model from the input data; a neural network operation module for generating and caching operation instructions based on the model, and performing operations on data to be processed according to the operation instructions to obtain an operation result; and an output module for outputting the operation result. The device and method of the present disclosure avoid the overhead incurred by running an entire software architecture, as conventional methods do.

Description

Operation method and device
Technical field
The present disclosure relates to the fields of computer architecture, deep learning, and neural networks, and more particularly to an operation method and device.
Background
Deep learning is a branch of machine learning. It attempts to use algorithms built from complex structures, or from multiple nonlinear transformations composed of several processing layers, to model high-level abstractions in data.
Deep learning is a representation-learning-based method within machine learning. An observation (for example, an image) can be represented in many ways, such as a vector of per-pixel intensity values, or more abstractly as a set of edges, regions of particular shapes, and so on. Certain representations make it easier to learn tasks from examples (for example, face recognition or facial expression recognition).
To date, several deep learning architectures, such as deep neural networks, convolutional neural networks, deep belief networks, and recurrent neural networks, have been applied to fields including computer vision, speech recognition, natural language processing, audio recognition, and bioinformatics, with excellent results. In addition, "deep learning" has become a near-synonym for, or a rebranding of, neural networks.
With the surge of deep learning (neural networks), neural network accelerators have emerged. Through dedicated memory and operation-module designs, a neural network accelerator can achieve speedups of tens or even hundreds of times over a general-purpose processor when performing deep learning operations, with smaller area and lower power consumption.
To make it convenient to apply neural network accelerators to various network structures for accelerated operation, programming software libraries and programming frameworks based on them have also emerged and continue to develop. In previous application scenarios, the neural network accelerator programming framework usually sits at the top layer; commonly used programming frameworks include Caffe, TensorFlow, Torch, etc. As shown in Fig. 1, from the bottom layer upward the stack consists of: the neural network accelerator (dedicated hardware for neural network operations), the hardware driver (allowing software to invoke the neural network accelerator), the neural network accelerator programming library (providing interfaces for calling the neural network accelerator), the neural network accelerator programming framework, and the high-level application requiring neural network operations. In some low-memory, real-time application scenarios, running this entire software architecture would consume excessive computing resources. Therefore, for specific application scenarios, how to optimize the computation process is one of the problems to be solved.
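By way of illustration only, the following minimal Python sketch mimics the layered call chain of Fig. 1, in which every inference traverses the whole stack; all class and method names here are hypothetical stand-ins, not any real accelerator SDK:

```python
class Driver:
    """Stands in for the hardware driver layer."""
    def submit(self, instructions, data):
        # A real driver would DMA the data to the accelerator and launch
        # the instruction stream; here we only fake a result string.
        return f"ran {len(instructions)} instruction(s) on {data!r}"

class ProgrammingLibrary:
    """Stands in for the neural network accelerator programming library."""
    def __init__(self):
        self.driver = Driver()
    def run_layer(self, layer, data):
        instructions = [("op", layer)]  # compile a single layer
        return self.driver.submit(instructions, data)

class Framework:
    """Stands in for a top-layer framework such as Caffe/TensorFlow/Torch."""
    def __init__(self, network_structure):
        self.layers = network_structure
        self.lib = ProgrammingLibrary()
    def infer(self, data):
        # The entire software stack stays resident for every call,
        # which is the overhead the disclosure aims to avoid.
        for layer in self.layers:
            data = self.lib.run_layer(layer, data)
        return data

print(Framework(["conv1", "relu1", "fc1"]).infer("input picture"))
```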
Summary of the invention
In view of the above problems, the purpose of the present disclosure is to provide an operation method and device to solve at least one of the above technical problems.
To achieve the above object, as one aspect of the present disclosure, an operation method is provided, including the following steps (a minimal code sketch of this three-branch flow is given after the steps below):
When the input data includes data to be processed, a network structure, and weight data, the following steps are executed:
Step 11: input and read the input data;
Step 12: construct an offline model from the network structure and the weight data;
Step 13: parse the offline model to obtain operation instructions, and cache them for subsequent calls;
Step 14: perform operations on the data to be processed according to the operation instructions, obtaining an operation result for output.
When the input data includes data to be processed and an offline model, the following steps are executed:
Step 21: input and read the input data;
Step 22: parse the offline model to obtain operation instructions, and cache them for subsequent calls;
Step 23: perform operations on the data to be processed according to the operation instructions, obtaining an operation result for output.
When the input data includes only data to be processed, the following steps are executed:
Step 31: input and read the input data;
Step 32: call the cached operation instructions and perform operations on the data to be processed, obtaining an operation result for output.
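The three branches above can be summarized by the following minimal sketch, assuming hypothetical stand-ins build_offline_model, parse_to_instructions, and execute for the model generation, model parsing, and neural network processing stages (none of these names a real API):

```python
# Hypothetical placeholders; a real system would target accelerator hardware.
def build_offline_model(network_structure, weight_data):
    return {"structure": network_structure, "weights": weight_data}   # step 12

def parse_to_instructions(offline_model):
    return [("op", layer) for layer in offline_model["structure"]]    # steps 13/22

def execute(instructions, data):
    return f"ran {len(instructions)} instruction(s) on {data!r}"      # steps 14/23/32

_instruction_cache = None  # operation instructions kept between calls

def operate(data, network_structure=None, weight_data=None, offline_model=None):
    """Three-branch dispatch mirroring steps 11-14, 21-23 and 31-32."""
    global _instruction_cache
    if network_structure is not None and weight_data is not None:
        offline_model = build_offline_model(network_structure, weight_data)
    if offline_model is not None:
        _instruction_cache = parse_to_instructions(offline_model)
    if _instruction_cache is None:
        raise RuntimeError("no cached instructions; a model must be supplied first")
    return execute(_instruction_cache, data)
```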
Further, the step of performing operations on the data to be processed according to the operation instructions to obtain an operation result is implemented by a neural network processing unit.
Further, the neural network processing unit has an instruction cache unit for caching the operation instructions for subsequent calls.
Further, the offline model may be any of various neural network models, including Cambricon_model, AlexNet_model, GoogleNet_model, VGG_model, R-CNN_model, GAN_model, LSTM_model, RNN_model, ResNet_model, etc.
Further, the data to be processed is input that a neural network can process.
Further, the data to be processed includes consecutive single pictures, voice, or a video stream.
Further, the network structure includes AlexNet, GoogleNet, ResNet, VGG, R-CNN, GAN, LSTM, RNN, and other possible neural network structures.
To achieve the above object, as another aspect of the present disclosure, an operation device is provided, including:
an input module for inputting data, the data including data to be processed, a network structure and weight data, and/or offline model data;
a model generation module for constructing an offline model from the input network structure and weight data;
a neural network operation module for generating and caching operation instructions based on the offline model, and performing operations on the data to be processed based on the operation instructions to obtain an operation result;
an output module for outputting the operation result; and
a control module for detecting the type of the input data and performing the following operations:
when the input data includes data to be processed, a network structure, and weight data, controlling the input module to feed the network structure and weight data into the model generation module to construct an offline model, and controlling the neural network operation module to perform operations, based on the offline model produced by the model generation module, on the data to be processed input by the input module;
when the input data includes data to be processed and an offline model, controlling the input module to feed the data to be processed and the offline model into the neural network operation module, and controlling the neural network operation module to generate and cache operation instructions based on the offline model and to perform operations on the data to be processed based on the operation instructions;
when the input data includes only data to be processed, controlling the input module to feed the data to be processed into the neural network operation module, and controlling the neural network operation module to call the cached operation instructions and perform operations on the data to be processed.
Further, the neural network operation module includes a model parsing unit and a neural network processing unit, wherein:
the model parsing unit is configured to generate operation instructions based on the offline model; and
the neural network processing unit is configured to cache the operation instructions for subsequent calls, or, when the input data includes only data to be processed, to call the cached operation instructions and perform operations on the data to be processed based on them to obtain an operation result.
Further, the neural network processing unit has an instruction cache unit for caching the operation instructions for subsequent calls.
The operation method and device proposed by the present disclosure have the following beneficial effects:
1. With the method and device of the present disclosure, once an offline model has been generated, operations can be performed directly from that offline model, avoiding the overhead of running the entire software architecture, including the deep learning framework.
2. The device and method of the present disclosure retarget the neural network processor more efficiently, so that it can deliver its full performance in low-memory, real-time application environments, with a simpler and faster computation process.
Brief description of the drawings
Fig. 1 is a programming framework in the prior art;
Fig. 2 is a flowchart of the operation method proposed by an embodiment of the present disclosure;
Fig. 3 is a structural block diagram of the operation device proposed by another embodiment of the present disclosure.
Detailed description
To make the purpose, technical solutions, and advantages of the present disclosure clearer, the present disclosure is described in further detail below with reference to specific embodiments and the accompanying drawings.
In this specification, the various embodiments described below to explain the principles of the present disclosure are illustrative only and should not be construed in any way as limiting its scope. The following description with reference to the accompanying drawings is provided to assist in a comprehensive understanding of exemplary embodiments of the disclosure as defined by the claims and their equivalents. It includes various details to aid understanding, but these are to be regarded as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the disclosure. In addition, descriptions of well-known functions and constructions are omitted for clarity and brevity. Throughout the drawings, the same reference numerals denote the same functions and operations.
The present disclosure discloses an operation method, including the following steps:
When the input data includes data to be processed, a network structure, and weight data, the following steps are executed:
Step 11: input and read the input data;
Step 12: construct an offline model from the network structure and the weight data;
Step 13: parse the offline model to obtain operation instructions, and cache them for subsequent calls;
Step 14: perform operations on the data to be processed according to the operation instructions, obtaining an operation result for output.
When the input data includes data to be processed and an offline model, the following steps are executed:
Step 21: input and read the input data;
Step 22: parse the offline model to obtain operation instructions, and cache them for subsequent calls;
Step 23: perform operations on the data to be processed according to the operation instructions, obtaining an operation result for output.
When the input data includes only data to be processed, the following steps are executed:
Step 31: input and read the input data;
Step 32: call the cached operation instructions and perform operations on the data to be processed, obtaining an operation result for output.
In some embodiments of the present disclosure, a neural network processing unit performs the operations on the data to be processed according to the operation instructions to obtain the operation result. Preferably, the neural network processing unit has an instruction cache unit for caching the received operation instructions; the pre-cached operation instructions referred to above are the operation instructions of the previous operation cached by the instruction cache unit.
In some embodiments of the present disclosure, the neural network processing unit also has a data cache unit for caching the data to be processed.
Based on the above operation method, the present disclosure also discloses an operation device, including:
an input module for inputting data, the data including data to be processed, a network structure and weight data, and/or offline model data;
a model generation module for constructing an offline model from the input network structure and weight data;
a neural network operation module for generating and caching operation instructions based on the offline model, and performing operations on the data to be processed based on the operation instructions to obtain an operation result;
an output module for outputting the operation result; and
a control module for detecting the type of the input data and performing the following operations:
when the input data includes data to be processed, a network structure, and weight data, controlling the input module to feed the network structure and weight data into the model generation module to construct an offline model, and controlling the neural network operation module to perform operations, based on the offline model produced by the model generation module, on the data to be processed input by the input module;
when the input data includes data to be processed and an offline model, controlling the input module to feed the data to be processed and the offline model into the neural network operation module, and controlling the neural network operation module to generate and cache operation instructions based on the offline model and to perform operations on the data to be processed based on the operation instructions;
when the input data includes only data to be processed, controlling the input module to feed the data to be processed into the neural network operation module, and controlling the neural network operation module to call the cached operation instructions and perform operations on the data to be processed.
The neural network operation module includes a model parsing unit and a neural network processing unit, wherein:
the model parsing unit is configured to generate operation instructions based on the offline model; and
the neural network processing unit is configured to cache the operation instructions for subsequent calls, or, when the input data includes only data to be processed, to call the cached operation instructions and perform operations on the data to be processed based on them to obtain the operation result.
In some embodiments of the present disclosure, the neural network processing unit has an instruction cache unit for caching the operation instructions for subsequent calls.
In some embodiments of the present disclosure, the offline model is a text file defined according to a special structure and may be any of various neural network models, such as Cambricon_model, AlexNet_model, GoogleNet_model, VGG_model, R-CNN_model, GAN_model, LSTM_model, RNN_model, ResNet_model, etc., without being limited to the models listed in this embodiment.
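As an illustration only, the sketch below writes out and reloads such a structured text file; the JSON layout used here is an assumption for demonstration, not the actual Cambricon_model format, which the disclosure does not reproduce:

```python
import json

# Hypothetical offline model content standing in for the real format.
offline_model = {
    "structure": ["conv1", "relu1", "pool1", "fc1"],
    "weights": {"conv1": [0.1, 0.2], "fc1": [0.3]},
}

# Persist the offline model so a later run can supply it on its own,
# skipping the model generation step entirely.
with open("model.cambricon.json", "w") as f:
    json.dump(offline_model, f)

with open("model.cambricon.json") as f:
    reloaded = json.load(f)

assert reloaded == offline_model  # the model round-trips intact
```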
In some embodiments of the present disclosure, the data to be processed is input that a neural network can process, for example, any of consecutive single pictures, voice, or a video stream.
In some embodiments of the present disclosure, the network structure may be any of various neural network structures, for example AlexNet, GoogleNet, ResNet, VGG, R-CNN, GAN, LSTM, RNN, etc., without being limited to the structures listed in this embodiment.
Specifically, depending on the data input by the input module, the operation device of the present disclosure has the following three working principles:
1. When the data input by the input module are a network structure, weight data, and data to be processed, the control module controls the input module to transmit the network structure and weight data to the model generation module, and the data to be processed to the model parsing unit. The control module controls the model generation module to generate an offline model from the network structure and weight data, and to transmit the generated offline model to the model parsing unit. The control module controls the model parsing unit to parse the received offline model into operation instructions recognizable by the neural network processing unit, and to transmit the operation instructions and the data to be processed to the neural network processing unit it comprises. The neural network processing unit performs operations on the data to be processed according to the received operation instructions, obtains a determined operation result, and transmits the operation result to the output module for output.
2. When the data input by the input module are an offline model and data to be processed, the control module controls the input module to transmit the offline model and the data to be processed directly to the model parsing unit; the subsequent working principle is the same as in the first case.
3. When the data input by the input module include only data to be processed, the control module controls the input module to transmit the data to be processed to the neural network processing unit via the model parsing unit, and the neural network processing unit performs operations on the data to be processed according to the cached operation instructions to obtain the operation result. This case normally does not occur on the first use of the neural network processor, which ensures that determined operation instructions already exist in the instruction cache.
Therefore, when the offline model of the current network operation differs from that of the previous network operation, the data input by the input module should include the network structure, the weight data, and the data to be processed, and subsequent network operations are performed after the model generation module generates a new offline model. When the current network operation is the first one and a corresponding offline model has been obtained in advance, the input should include the offline model and the data to be processed. When the current network operation is not the first one and its offline model is the same as that of the previous network operation, the input need only include the data to be processed. A module-level sketch of these three cases is given below.
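The following minimal Python sketch uses hypothetical module classes to illustrate the dispatch logic only, not the patented hardware implementation:

```python
class ModelGenerationModule:
    """Builds an offline model from a network structure and weight data."""
    def generate(self, network_structure, weight_data):
        return {"structure": network_structure, "weights": weight_data}

class NeuralNetworkProcessor:
    """Runs instructions; holds an instruction cache unit and a data cache unit."""
    def __init__(self):
        self.instruction_cache = None
        self.data_cache = None
    def run(self, instructions, data):
        if instructions is not None:
            self.instruction_cache = instructions   # cache for later calls
        if self.instruction_cache is None:
            raise RuntimeError("first use needs a model (precondition of case 3)")
        self.data_cache = data
        return f"ran {len(self.instruction_cache)} instruction(s) on {data!r}"

class ModelParsingUnit:
    """Parses an offline model into processor-recognizable instructions."""
    def __init__(self, processor):
        self.processor = processor
    def forward(self, offline_model, data):
        instructions = None
        if offline_model is not None:
            instructions = [("op", layer) for layer in offline_model["structure"]]
        return self.processor.run(instructions, data)

class OperationDevice:
    """Control-module dispatch over the three input cases."""
    def __init__(self):
        self.generator = ModelGenerationModule()
        self.parser = ModelParsingUnit(NeuralNetworkProcessor())
    def __call__(self, data, network_structure=None, weight_data=None,
                 offline_model=None):
        if network_structure is not None and weight_data is not None:  # case 1
            offline_model = self.generator.generate(network_structure, weight_data)
        return self.parser.forward(offline_model, data)                # cases 2 and 3
```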
In some embodiments of the present disclosure, the operation device described herein is integrated as a sub-module into the central processing unit module of a whole computer system. The central processing unit controls the transfer of the data to be processed and the offline model into the operation device. The model parsing unit parses the incoming neural network offline model and generates operation instructions. The operation instructions and the data to be processed are then passed into the neural network processing unit, the operation result is obtained through the computation, and the result is returned to the main memory unit. In the subsequent computation process the network structure no longer changes, so the neural network computation can be completed by simply feeding in data to be processed continuously, yielding operation results, as the short usage example below illustrates.
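Continuing the sketch above with the same hypothetical classes, repeated inference then reduces to calls that pass in only new data:

```python
device = OperationDevice()

# Case 1: structure and weights supplied; an offline model is generated,
# parsed, and its instructions cached.
print(device("frame_0", network_structure=["conv1", "fc1"],
             weight_data={"conv1": [0.1]}))

# Case 3: the network no longer changes, so subsequent calls pass only
# data to be processed and reuse the cached instructions.
for frame in ["frame_1", "frame_2", "frame_3"]:
    print(device(frame))
```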
The operation device and method proposed by the present disclosure are described in detail below through specific embodiments.
Embodiment 1
As shown in Fig. 2, this embodiment proposes an operation method including the following steps:
When the input data includes data to be processed, a network structure, and weight data, the following steps are executed:
Step 11: input and read the input data;
Step 12: construct an offline model from the network structure and the weight data;
Step 13: parse the offline model to obtain operation instructions, and cache them for subsequent calls;
Step 14: perform operations on the data to be processed according to the operation instructions, obtaining a neural network operation result for output.
When the input data includes data to be processed and an offline model, the following steps are executed:
Step 21: input and read the input data;
Step 22: parse the offline model to obtain operation instructions, and cache them for subsequent calls;
Step 23: perform operations on the data to be processed according to the operation instructions, obtaining a neural network operation result for output.
When the input data includes only data to be processed, the following steps are executed:
Step 31: input and read the input data;
Step 32: call the cached operation instructions and perform operations on the data to be processed, obtaining a neural network operation result for output.
A neural network processing unit processes the data to be processed according to the operation instructions to obtain the operation result; this neural network processing unit has an instruction cache unit and a data cache unit for caching the received operation instructions and the data to be processed, respectively.
In this embodiment, the input network structure is AlexNet, the weight data is bvlc_alexnet.caffemodel, the data to be processed are consecutive single pictures, and the offline model is Cambricon_model.
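In terms of the dispatch sketch given earlier in the summary, the calls in this embodiment would take roughly the following shape; the layer list merely stands in for the real AlexNet structure, and the actual loading of bvlc_alexnet.caffemodel is not shown:

```python
# First picture: steps 11-14 build the offline model from the AlexNet
# structure and weights, parse it, cache the instructions, and run.
result = operate("picture_0",
                 network_structure=["conv1", "conv2", "conv3", "fc8"],
                 weight_data="bvlc_alexnet.caffemodel")

# Remaining pictures: steps 31-32 reuse the cached instructions.
for picture in ["picture_1", "picture_2"]:
    result = operate(picture)
```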
In summary, the method proposed in this embodiment greatly simplifies the flow of performing operations with a neural network processor, avoiding the extra memory and I/O overhead incurred by invoking a whole traditional programming framework. With this method, a neural network accelerator can deliver its full operational performance in low-memory, real-time environments.
Embodiment 2
As shown in Fig. 3, this embodiment proposes an operation device including an input module 101, a model generation module 102, a neural network operation module 103, an output module 104, and a control module 105, where the neural network operation module 103 includes a model parsing unit 106 and a neural network processor 107.
The key feature of this device is offline execution: once an offline model has been generated, the relevant operation instructions are generated directly from the offline model, the weight data is passed in, and the data to be processed is processed. More specifically:
The input module 101 is configured to input either the combination of a network structure, weight data, and data to be processed, or the combination of an offline model and data to be processed. When the input is a network structure, weight data, and data to be processed, the network structure and weight data are passed to the model generation module 102 to generate an offline model, and the following operations are then executed. When the input is an offline model and data to be processed, the offline model and the data to be processed are passed directly to the model parsing unit 106, and the following operations are then executed.
The output module 104 is configured to output the determined operation data generated from the specific network structure and a set of data to be processed, where the output data is obtained by the operations of the neural network processor 107.
The model generation module 102 is configured to generate, from the input network structure parameters and weight data, an offline model for use by the layers below.
The model parsing unit 106 is configured to parse the incoming offline model, generate operation instructions that can be passed directly into the neural network processor 107, and forward the data to be processed received from the input module 101 to the neural network processor 107.
The neural network processor 107 is configured to perform operations according to the incoming operation instructions and data to be processed, obtain a determined operation result, and pass it to the output module 104; it has an instruction cache unit and a data cache unit.
The control module 105 is configured to detect the type of the input data and perform the following operations:
when the input data includes data to be processed, a network structure, and weight data, controlling the input module 101 to feed the network structure and weight data into the model generation module 102 to construct an offline model, and controlling the neural network operation module 103 to perform neural network operations, based on the offline model produced by the model generation module 102, on the data to be processed input by the input module 101;
when the input data includes data to be processed and an offline model, controlling the input module 101 to feed the data to be processed and the offline model into the neural network operation module 103, and controlling the neural network operation module 103 to generate and cache operation instructions based on the offline model and to perform neural network operations on the data to be processed based on the operation instructions;
when the input data includes only data to be processed, controlling the input module 101 to feed the data to be processed into the neural network operation module 103, and controlling the neural network operation module 103 to call the cached operation instructions and perform neural network operations on the data to be processed.
In this embodiment, the input network structure is AlexNet, the weight data is bvlc_alexnet.caffemodel, and the data to be processed are consecutive single pictures. The model generation module 102 generates a new offline model Cambricon_model from the input network structure and weight data, and the generated offline model Cambricon_model can also be used alone as the input of a next run. The model parsing unit 106 parses the offline model Cambricon_model to generate a series of operation instructions, transfers the generated operation instructions into the instruction cache unit on the neural network processor 107, and transfers the input picture received from the input module 101 into the data cache unit on the neural network processor 107.
The processes or methods depicted in the preceding figures may be performed by processing logic comprising hardware, software, or a combination of both. Although the processes or methods are described above in a certain order, it should be understood that certain of the described operations may be executed in a different order. Moreover, some operations may be performed in parallel rather than sequentially.
The specific embodiments described above further explain in detail the purpose, technical solutions, and beneficial effects of the present disclosure. It should be understood that the above are merely specific embodiments of the present disclosure and are not intended to limit it; any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present disclosure shall fall within its scope of protection.

Claims (10)

1. An operation method, including the following steps:
when the input data includes data to be processed, a network structure, and weight data, executing the following steps:
step 11: inputting and reading the input data;
step 12: constructing an offline model from the network structure and the weight data;
step 13: parsing the offline model to obtain operation instructions, and caching them for subsequent calls;
step 14: performing operations on the data to be processed according to the operation instructions, obtaining an operation result for output;
when the input data includes data to be processed and an offline model, executing the following steps:
step 21: inputting and reading the input data;
step 22: parsing the offline model to obtain operation instructions, and caching them for subsequent calls;
step 23: performing operations on the data to be processed according to the operation instructions, obtaining an operation result for output;
when the input data includes only data to be processed, executing the following steps:
step 31: inputting and reading the input data;
step 32: calling the cached operation instructions and performing operations on the data to be processed, obtaining an operation result for output.
2. The operation method according to claim 1, wherein the step of performing operations on the data to be processed according to the operation instructions to obtain the operation result is implemented by a neural network processing unit.
3. The operation method according to claim 2, wherein the neural network processing unit has an instruction cache unit for caching the operation instructions for subsequent calls.
4. The operation method according to any one of claims 1 to 3, wherein the offline model is a neural network model, and the neural network model includes Cambricon_model, AlexNet_model, GoogleNet_model, VGG_model, R-CNN_model, GAN_model, LSTM_model, RNN_model, and ResNet_model.
5. The operation method according to any one of claims 1 to 4, wherein the data to be processed is input that a neural network can process.
6. The operation method according to claim 5, wherein the data to be processed includes consecutive single pictures, voice, or a video stream.
7. The operation method according to any one of claims 1 to 6, wherein the network structure is a neural network structure, and the neural network structure includes AlexNet, GoogleNet, ResNet, VGG, R-CNN, GAN, LSTM, and RNN.
8. An operation device, including:
an input module for inputting data, the data including data to be processed, a network structure and weight data, and/or offline model data;
a model generation module for constructing an offline model from the input network structure and weight data;
a neural network operation module for generating and caching operation instructions based on the offline model, and performing operations on the data to be processed based on the operation instructions to obtain an operation result;
an output module for outputting the operation result; and
a control module for detecting the type of the input data and performing the following operations:
when the input data includes data to be processed, a network structure, and weight data, controlling the input module to feed the network structure and weight data into the model generation module to construct an offline model, and controlling the neural network operation module to perform operations, based on the offline model produced by the model generation module, on the data to be processed input by the input module;
when the input data includes data to be processed and an offline model, controlling the input module to feed the data to be processed and the offline model into the neural network operation module, and controlling the neural network operation module to generate and cache operation instructions based on the offline model and to perform operations on the data to be processed based on the operation instructions;
when the input data includes only data to be processed, controlling the input module to feed the data to be processed into the neural network operation module, and controlling the neural network operation module to call the cached operation instructions and perform operations on the data to be processed.
9. The operation device according to claim 8, wherein the neural network operation module includes a model parsing unit and a neural network processing unit, wherein:
the model parsing unit is configured to generate operation instructions based on the offline model; and
the neural network processing unit is configured to cache the operation instructions for subsequent calls, or, when the input data includes only data to be processed, to call the cached operation instructions and perform operations on the data to be processed based on the operation instructions to obtain the operation result.
10. The operation device according to claim 8 or 9, wherein the neural network processing unit has an instruction cache unit for caching the operation instructions for subsequent calls.
CN201710269049.0A 2017-04-19 2017-04-21 Operation method and device Active CN108734288B (en)

Priority Applications (17)

Application Number Priority Date Filing Date Title
CN201710269049.0A CN108734288B (en) 2017-04-21 2017-04-21 Operation method and device
EP18788355.8A EP3614259A4 (en) 2017-04-19 2018-04-17 Processing apparatus and processing method
CN202410405915.4A CN118690805A (en) 2017-04-19 2018-04-17 Processing apparatus and processing method
PCT/CN2018/083415 WO2018192500A1 (en) 2017-04-19 2018-04-17 Processing apparatus and processing method
CN201880000923.3A CN109121435A (en) 2017-04-19 2018-04-17 Processing unit and processing method
US16/476,262 US11531540B2 (en) 2017-04-19 2018-04-17 Processing apparatus and processing method with dynamically configurable operation bit width
KR1020197038135A KR102258414B1 (en) 2017-04-19 2018-04-17 Processing apparatus and processing method
JP2019549467A JP6865847B2 (en) 2017-04-19 2018-04-17 Processing equipment, chips, electronic equipment and methods
KR1020197025307A KR102292349B1 (en) 2017-04-19 2018-04-17 Processing device and processing method
EP19214320.4A EP3654172A1 (en) 2017-04-19 2018-04-17 Fused vector multiplier and method using the same
CN201811097653.0A CN109376852B (en) 2017-04-21 2018-04-17 Arithmetic device and arithmetic method
EP19214371.7A EP3786786B1 (en) 2017-04-19 2018-04-17 Processing device, processing method, chip, and electronic apparatus
US16/697,533 US11531541B2 (en) 2017-04-19 2019-11-27 Processing apparatus and processing method
US16/697,727 US11698786B2 (en) 2017-04-19 2019-11-27 Processing apparatus and processing method
US16/697,637 US11720353B2 (en) 2017-04-19 2019-11-27 Processing apparatus and processing method
US16/697,687 US11734002B2 (en) 2017-04-19 2019-11-27 Counting elements in neural network input data
JP2019228383A JP6821002B2 (en) 2017-04-19 2019-12-18 Processing equipment and processing method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710269049.0A CN108734288B (en) 2017-04-21 2017-04-21 Operation method and device

Publications (2)

Publication Number Publication Date
CN108734288A 2018-11-02
CN108734288B CN108734288B (en) 2021-01-29

Family

ID=63934137

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201710269049.0A Active CN108734288B (en) 2017-04-19 2017-04-21 Operation method and device
CN201811097653.0A Active CN109376852B (en) 2017-04-19 2018-04-17 Arithmetic device and arithmetic method

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN201811097653.0A Active CN109376852B (en) 2017-04-19 2018-04-17 Arithmetic device and arithmetic method

Country Status (1)

Country Link
CN (2) CN108734288B (en)


Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112613597B (en) * 2020-11-30 2023-06-30 河南汇祥通信设备有限公司 Comprehensive pipe rack risk automatic identification convolutional neural network model and construction method
CN112947935B (en) * 2021-02-26 2024-08-13 上海商汤智能科技有限公司 Operation method and device, electronic equipment and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105005911A (en) * 2015-06-26 2015-10-28 深圳市腾讯计算机系统有限公司 Operating system for deep neural network and operating method
CN105512723A (en) * 2016-01-20 2016-04-20 南京艾溪信息科技有限公司 Artificial neural network calculating device and method for sparse connection
WO2016099779A1 (en) * 2014-12-19 2016-06-23 Intel Corporation Method and apparatus for distributed and cooperative computation in artificial neural networks
CN105930902A (en) * 2016-04-18 2016-09-07 中国科学院计算技术研究所 Neural network processing method and system
CN106228238A (en) * 2016-07-27 2016-12-14 中国科学技术大学苏州研究院 The method and system of degree of depth learning algorithm is accelerated on field programmable gate array platform
CN106355246A (en) * 2015-10-08 2017-01-25 上海兆芯集成电路有限公司 Tri-configuration neural network element
CN106529670A (en) * 2016-10-27 2017-03-22 中国科学院计算技术研究所 Neural network processor based on weight compression, design method, and chip

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20130090147A (en) * 2012-02-03 2013-08-13 안병익 Neural network computing apparatus and system, and method thereof
US9378455B2 (en) * 2012-05-10 2016-06-28 Yan M. Yufik Systems and methods for a computer understanding multi modal data streams
US20160162779A1 (en) * 2014-12-05 2016-06-09 RealMatch, Inc. Device, system and method for generating a predictive model by machine learning
CN106557332A (en) * 2016-11-30 2017-04-05 上海寒武纪信息科技有限公司 A kind of multiplexing method and device of instruction generating process


Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109685203A (en) * 2018-12-21 2019-04-26 北京中科寒武纪科技有限公司 Data processing method, device, computer system and storage medium
CN109726797A (en) * 2018-12-21 2019-05-07 北京中科寒武纪科技有限公司 Data processing method, device, computer system and storage medium
CN109697500A (en) * 2018-12-29 2019-04-30 北京中科寒武纪科技有限公司 Data processing method, device, electronic equipment and storage medium
US11983535B2 (en) 2019-03-22 2024-05-14 Cambricon Technologies Corporation Limited Artificial intelligence computing device and related product
CN110070176A (en) * 2019-04-18 2019-07-30 北京中科寒武纪科技有限公司 The processing method of off-line model, the processing unit of off-line model and Related product
CN111242321A (en) * 2019-04-18 2020-06-05 中科寒武纪科技股份有限公司 Data processing method and related product
US11762690B2 (en) 2019-04-18 2023-09-19 Cambricon Technologies Corporation Limited Data processing method and related products
CN111242321B (en) * 2019-04-18 2023-09-26 中科寒武纪科技股份有限公司 Data processing method and related product
CN110309917A (en) * 2019-07-05 2019-10-08 北京中科寒武纪科技有限公司 The verification method and relevant apparatus of off-line model
CN113490943A (en) * 2019-07-31 2021-10-08 华为技术有限公司 Integrated chip and method for processing sensor data
WO2021232958A1 (en) * 2020-05-18 2021-11-25 Oppo广东移动通信有限公司 Method and apparatus for executing operation, electronic device, and storage medium

Also Published As

Publication number Publication date
CN108734288B (en) 2021-01-29
CN109376852A (en) 2019-02-22
CN109376852B (en) 2021-01-29

Similar Documents

Publication Publication Date Title
CN108734288A (en) A kind of operation method and device
RU2771008C1 (en) Method and apparatus for processing tasks based on a neural network
US20230048031A1 (en) Data processing method and apparatus
WO2022170997A1 (en) Data processing method and system based on risc-v instruction set, and device and medium
TWI731373B (en) Chip, data processing method and computing equipment based on it
CN108416440A (en) A kind of training method of neural network, object identification method and device
CN110458280B (en) Convolutional neural network acceleration method and system suitable for mobile terminal
CN109242094A (en) Device and method for executing artificial neural network forward operation
CN106845631B (en) Stream execution method and device
CN108320018B (en) Artificial neural network operation device and method
US20220004858A1 (en) Method for processing artificial neural network, and electronic device therefor
CN109597965A (en) Data processing method, system, terminal and medium based on deep neural network
CN110633785B (en) Method and system for calculating convolutional neural network
CN108122031A (en) A kind of neutral net accelerator architecture of low-power consumption
CN111191789B (en) Model optimization deployment system, chip, electronic equipment and medium
CN109885406B (en) Operator calculation optimization method, device, equipment and storage medium
US20240129236A1 (en) Dqn-based distributed computing network coordinate flow scheduling system and method
CN112835712A (en) Multithreading special effect drawing method, device, system and medium
CN117032807A (en) AI acceleration processor architecture based on RISC-V instruction set
CN113190352B (en) General CPU-oriented deep learning calculation acceleration method and system
CN112862083A (en) Deep neural network inference method and device under edge environment
CN109189570B (en) MEC-based moving edge pre-calculation method
CN111738432B (en) Neural network processing circuit supporting self-adaptive parallel computation
Youlve et al. Asynchronous Distributed Proximal Policy Optimization Training Framework Based on GPU
XUANLEI et al. HeteGen: Efficient Heterogeneous Parallel Inference for Large Language Models on Resource-Constrained Devices

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant