CN108734288A - Operation method and device - Google Patents
Operation method and device
- Publication number
- CN108734288A (application CN201710269049.0A)
- Authority
- CN
- China
- Prior art keywords
- data
- model
- operation instruction
- input
- data to be processed
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/30—Creation or generation of source code
- G06F8/35—Creation or generation of source code model driven
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- General Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Evolutionary Computation (AREA)
- Data Mining & Analysis (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- Artificial Intelligence (AREA)
- Neurology (AREA)
- Image Analysis (AREA)
- Devices For Executing Special Programs (AREA)
- Machine Translation (AREA)
Abstract
An operation method and device are provided. The operation device includes: an input module for inputting data; a model generation module for constructing a model according to the input data; a neural network operation module for generating and caching operation instructions based on the model, and performing operations on data to be processed according to the operation instructions to obtain an operation result; and an output module for outputting the operation result. The device and method of the disclosure avoid the overhead incurred by running an entire software stack, as in conventional methods.
Description
Technical field
The disclosure belongs to the fields of computer architecture, deep learning and neural networks, and more particularly relates to an operation method and device.
Background art
Deep learning is a branch of machine learning that attempts to model high-level abstractions in data using algorithms composed of multiple processing layers built from complex structures or multiple nonlinear transformations.
Deep learning is a class of machine learning methods based on representation learning of data. An observation (such as an image) can be represented in many ways, for example as a vector of per-pixel intensity values, or more abstractly as a set of edges, regions of particular shapes, and so on. Some representations make it easier to learn tasks from examples (for example, face recognition or facial expression recognition).
To date, several deep learning architectures, such as deep neural networks, convolutional neural networks, deep belief networks and recurrent neural networks, have been applied to fields including computer vision, speech recognition, natural language processing, audio recognition and bioinformatics, with excellent results. Indeed, "deep learning" has become a near-synonym for, or a rebranding of, neural networks.
With the boom in deep learning (neural networks), neural network accelerators have emerged. Through dedicated memory and operation-module designs, a neural network accelerator can achieve a speedup of tens or even hundreds of times over a general-purpose processor when performing deep learning operations, with smaller area and lower power consumption.
To make it convenient to apply neural network accelerators to various network structures for accelerated operations, programming libraries and programming frameworks built on top of them have also emerged and continue to develop. In typical application scenarios, the neural network accelerator programming framework sits at the top layer; commonly used programming frameworks include Caffe, TensorFlow, Torch, etc. As shown in Fig. 1, the layers from bottom to top are: the neural network accelerator (dedicated hardware for neural network operations), the hardware driver (allowing software to invoke the neural network accelerator), the neural network accelerator programming library (providing interfaces for calling the accelerator), the neural network accelerator programming framework, and the high-level application that needs to perform neural network operations. In low-memory, real-time application scenarios, however, running this whole software stack consumes excessive computing resources. Hence, for specific application scenarios, how to streamline the computation flow is one of the problems to be solved.
Summary of the disclosure
In view of the above problems, the purpose of the disclosure is to provide an operation method and device that solve at least one of the above technical problems.
To achieve the above purpose, as one aspect of the disclosure, an operation method is provided, comprising the following steps:
When the input data includes data to be processed, a network structure and weight data, the following steps are executed:
Step 11: input and read the input data;
Step 12: construct an offline model according to the network structure and the weight data;
Step 13: parse the offline model to obtain operation instructions, and cache them for subsequent calls;
Step 14: perform operations on the data to be processed according to the operation instructions, obtaining an operation result for output.
When the input data includes data to be processed and an offline model, the following steps are executed:
Step 21: input and read the input data;
Step 22: parse the offline model to obtain operation instructions, and cache them for subsequent calls;
Step 23: perform operations on the data to be processed according to the operation instructions, obtaining an operation result for output.
When the input data includes only data to be processed, the following steps are executed:
Step 31: input and read the input data;
Step 32: call the cached operation instructions and perform operations on the data to be processed, obtaining an operation result for output.
Further, the step of performing operations on the data to be processed according to the operation instructions to obtain an operation result is implemented by a neural network processing unit.
Further, the neural network processing unit has an instruction caching unit for caching the operation instructions for subsequent calls.
Further, the offline model includes various neural network models, including Cambricon_model, AlexNet_model, GoogleNet_model, VGG_model, R-CNN_model, GAN_model, LSTM_model, RNN_model, ResNet_model, etc.
Further, the data to be processed is input that can be processed by a neural network.
Further, the data to be processed includes a continuous stream of single pictures, voice, or video.
Further, the network structure includes AlexNet, GoogleNet, ResNet, VGG, R-CNN, GAN, LSTM, RNN and other possible neural network structures.
To achieve the above purpose, as another aspect of the disclosure, an operation device is provided, comprising:
an input module for inputting data, where the data includes data to be processed, a network structure and weight data, and/or offline model data;
a model generation module for constructing an offline model according to the input network structure and weight data;
a neural network operation module for generating and caching operation instructions based on the offline model, and performing operations on the data to be processed based on the operation instructions to obtain an operation result;
an output module for outputting the operation result;
a control module for detecting the type of the input data and performing the following operations:
when the input data includes data to be processed, a network structure and weight data, controlling the input module to send the network structure and the weight data to the model generation module to construct an offline model, and controlling the neural network operation module to perform operations on the data to be processed from the input module, based on the offline model provided by the model generation module;
when the input data includes data to be processed and an offline model, controlling the input module to send the data to be processed and the offline model to the neural network operation module, and controlling the neural network operation module to generate and cache operation instructions based on the offline model and to perform operations on the data to be processed based on the operation instructions;
when the input data includes only data to be processed, controlling the input module to send the data to be processed to the neural network operation module, and controlling the neural network operation module to call the cached operation instructions and perform operations on the data to be processed.
Further, the neural network operation module includes a model parsing unit and a neural network processing unit, wherein:
the model parsing unit is configured to generate operation instructions based on the offline model;
the neural network processing unit is configured to cache the operation instructions for subsequent calls, or, when the input data includes only data to be processed, to call the cached operation instructions and perform operations on the data to be processed based on them to obtain an operation result.
Further, the neural network processing unit has an instruction caching unit for caching the operation instructions for subsequent calls.
The operation method and device proposed by the disclosure have the following beneficial effects:
1. With the method and device of the disclosure, once an offline model has been generated, operations can be performed directly from that offline model, avoiding the overhead of running an entire software stack including a deep learning framework.
2. The device and method of the disclosure enable more efficient repurposing of a neural network processor, so that it can deliver full performance in low-memory, real-time application environments, with a simpler and faster computation flow.
Description of the drawings
Fig. 1 shows a programming framework in the prior art;
Fig. 2 is a flowchart of the operation method proposed by one embodiment of the disclosure;
Fig. 3 is a structural block diagram of the operation device proposed by another embodiment of the disclosure.
Detailed description of the embodiments
To make the purpose, technical solutions and advantages of the disclosure clearer, the disclosure is described in further detail below with reference to specific embodiments and the accompanying drawings.
In this specification, the various embodiments described below to illustrate the principles of the disclosure are illustrative only and should in no way be construed as limiting the scope of the disclosure. The following description, which refers to the accompanying drawings, is intended to aid a comprehensive understanding of exemplary embodiments of the disclosure as defined by the claims and their equivalents. The description includes a variety of details to aid understanding, but these details are to be regarded as merely exemplary. Accordingly, those of ordinary skill in the art will appreciate that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the disclosure. In addition, descriptions of well-known functions and structures are omitted for clarity and brevity. Throughout the drawings, the same reference numerals denote the same functions and operations.
The disclosure provides an operation method, comprising the following steps:
When the input data includes data to be processed, a network structure and weight data, the following steps are executed:
Step 11: input and read the input data;
Step 12: construct an offline model according to the network structure and the weight data;
Step 13: parse the offline model to obtain operation instructions, and cache them for subsequent calls;
Step 14: perform operations on the data to be processed according to the operation instructions, obtaining an operation result for output.
When the input data includes data to be processed and an offline model, the following steps are executed:
Step 21: input and read the input data;
Step 22: parse the offline model to obtain operation instructions, and cache them for subsequent calls;
Step 23: perform operations on the data to be processed according to the operation instructions, obtaining an operation result for output.
When the input data includes only data to be processed, the following steps are executed:
Step 31: input and read the input data;
Step 32: call the cached operation instructions and perform operations on the data to be processed, obtaining an operation result for output.
In some embodiments of the disclosure, a neural network processing unit performs operations on the data to be processed according to the operation instructions to obtain an operation result. Preferably, the neural network processing unit has an instruction caching unit for caching received operation instructions; the pre-cached operation instructions referred to above are the operation instructions of the previous operation, cached by the instruction caching unit.
In some embodiments of the disclosure, the neural network processing unit also has a data caching unit for caching the data to be processed.
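To make the three cases concrete, the following is a minimal sketch of the dispatch and caching logic described above. All names in it (build_offline_model, parse_offline_model, NNProcessingUnit, compute) are illustrative assumptions for this write-up, not interfaces defined by the disclosure:

```python
from typing import Any, Callable, List, Optional

def build_offline_model(network_structure: Any, weight_data: Any) -> dict:
    # Hypothetical stand-in for the model generation step (step 12).
    return {"structure": network_structure, "weights": weight_data}

def parse_offline_model(offline_model: dict) -> List[Callable[[Any], Any]]:
    # Hypothetical stand-in for offline-model parsing (steps 13/22): turns the
    # model into a list of operation instructions executable by the processor.
    return [lambda data: (offline_model["structure"], data)]

class NNProcessingUnit:
    """Stand-in for the neural network processing unit and its instruction cache."""
    def __init__(self) -> None:
        self.instruction_cache: Optional[List[Callable[[Any], Any]]] = None

    def cache_instructions(self, instructions: List[Callable[[Any], Any]]) -> None:
        self.instruction_cache = instructions  # instruction caching unit

    def run(self, data: Any) -> List[Any]:
        # Steps 14/23/32: execute the cached operation instructions on the data.
        if self.instruction_cache is None:
            raise RuntimeError("no cached operation instructions (case 3 needs a prior run)")
        return [instruction(data) for instruction in self.instruction_cache]

def compute(unit: NNProcessingUnit, data: Any,
            network_structure: Any = None, weight_data: Any = None,
            offline_model: Optional[dict] = None) -> List[Any]:
    if network_structure is not None and weight_data is not None:
        # Case 1 (steps 11-14): build the offline model first.
        offline_model = build_offline_model(network_structure, weight_data)
    if offline_model is not None:
        # Cases 1 and 2 (steps 13/22): parse the model and cache the instructions.
        unit.cache_instructions(parse_offline_model(offline_model))
    # Case 3 (steps 31-32) falls through: only data arrived, so reuse the cache.
    return unit.run(data)
```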
Based on the above operation method, the disclosure also provides an operation device, comprising:
an input module for inputting data, where the data includes data to be processed, a network structure and weight data, and/or offline model data;
a model generation module for constructing an offline model according to the input network structure and weight data;
a neural network operation module for generating and caching operation instructions based on the offline model, and performing operations on the data to be processed based on the operation instructions to obtain an operation result;
an output module for outputting the operation result;
a control module for detecting the type of the input data and performing the following operations:
when the input data includes data to be processed, a network structure and weight data, controlling the input module to send the network structure and the weight data to the model generation module to construct an offline model, and controlling the neural network operation module to perform operations on the data to be processed from the input module, based on the offline model provided by the model generation module;
when the input data includes data to be processed and an offline model, controlling the input module to send the data to be processed and the offline model to the neural network operation module, and controlling the neural network operation module to generate and cache operation instructions based on the offline model and to perform operations on the data to be processed based on the operation instructions;
when the input data includes only data to be processed, controlling the input module to send the data to be processed to the neural network operation module, and controlling the neural network operation module to call the cached operation instructions and perform operations on the data to be processed.
The neural network operation module includes a model parsing unit and a neural network processing unit, wherein:
the model parsing unit is configured to generate operation instructions based on the offline model;
the neural network processing unit is configured to cache the operation instructions for subsequent calls, or, when the input data includes only data to be processed, to call the cached operation instructions and perform operations on the data to be processed based on them to obtain an operation result.
In some embodiments of the disclosure, the neural network processing unit has an instruction caching unit for caching the operation instructions for subsequent calls.
In some embodiments of the disclosure, the offline model is a text file defined according to a special structure, and may be any of various neural network models, such as Cambricon_model, AlexNet_model, GoogleNet_model, VGG_model, R-CNN_model, GAN_model, LSTM_model, RNN_model, ResNet_model, etc., without being limited to the models proposed in this embodiment.
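The disclosure characterizes the offline model only as a file with a special structure and does not specify its layout. As one hedged illustration of what such a serialization could look like (the layout below is an assumption for this write-up, not the Cambricon_model format), a network structure description can be stored alongside its weights:

```python
import json
import struct

def save_offline_model(path: str, structure: dict, weights: bytes) -> None:
    # Assumed layout: a 4-byte little-endian header length, a JSON-encoded
    # network structure, then raw weight bytes. Purely illustrative.
    header = json.dumps(structure).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<I", len(header)))
        f.write(header)
        f.write(weights)

def load_offline_model(path: str) -> tuple:
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<I", f.read(4))
        structure = json.loads(f.read(header_len).decode("utf-8"))
        weights = f.read()
    return structure, weights
```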
In some embodiments of the disclosure, the data to be processed is input that can be processed by a neural network, for example any of a continuous stream of single pictures, voice, or video.
In some embodiments of the disclosure, the network structure may be any of various neural network structures, such as AlexNet, GoogleNet, ResNet, VGG, R-CNN, GAN, LSTM, RNN, etc., without being limited to the structures proposed in this embodiment.
Specifically, depending on the data input through the input module, the operation device of the disclosure has the following three working principles:
1. When the data input through the input module comprises a network structure, weight data and data to be processed, the control module controls the input module to send the network structure and the weight data to the model generation module, and to send the data to be processed to the model parsing unit; the control module controls the model generation module to generate an offline model according to the network structure and the weight data, and to send the generated offline model to the model parsing unit; the control module controls the model parsing unit to parse the received offline model into operation instructions recognizable by the neural network processing unit, and to send the operation instructions and the data to be processed to the neural network processing unit it comprises; the neural network processing unit performs operations on the data to be processed according to the received operation instructions, obtains a determined operation result, and sends the operation result to the output module for output.
2. When the data input through the input module comprises an offline model and data to be processed, the control module controls the input module to send the offline model and the data to be processed directly to the model parsing unit; the subsequent working principle is the same as in the first case.
3. When the data input through the input module includes only data to be processed, the control module controls the input module to send the data to be processed directly to the neural network processing unit via the model parsing unit, and the neural network processing unit performs operations on the data to be processed according to the cached operation instructions to obtain an operation result. This case normally does not occur on the first use of the neural network processor, so as to ensure that determined operation instructions already exist in the instruction cache.
Accordingly, when the offline model of the current network operation differs from that of the previous one, the data input through the input module should include a network structure, weight data and data to be processed, and subsequent network operations proceed after the model generation module has generated a new offline model; when the current network operation is the first one and a corresponding offline model has been obtained in advance, the data input through the input module should include the offline model and the data to be processed; when the current network operation is not the first one and its offline model is the same as that of the previous network operation, the data input through the input module may include only the data to be processed.
In some embodiments of the disclosure, the operation device described herein is integrated as a sub-module into the central processing unit (CPU) module of a whole computer system. Data to be processed and an offline model are transferred into the operation device under CPU control. The model parsing unit parses the incoming neural network offline model and generates operation instructions. The operation instructions and the data to be processed are then passed into the neural network processing unit, an operation result is obtained through the operation processing, and the operation result is returned to main memory. In the subsequent computation process, the network structure no longer changes, so neural network computations can be completed and operation results obtained simply by continuously feeding in data to be processed.
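Continuing the hypothetical sketch given earlier (the names compute and NNProcessingUnit are illustrative assumptions, not part of the disclosure), repeated inference from the host side then reduces to one model-carrying call followed by data-only calls that hit the instruction cache:

```python
# Hypothetical usage, reusing compute() and NNProcessingUnit from the sketch above.
unit = NNProcessingUnit()

# First call: network structure and weights accompany the data (case 1).
compute(unit, data="frame_0", network_structure="AlexNet", weight_data=b"...")

# Subsequent calls: the network no longer changes, so only data is passed in
# and the cached operation instructions are reused (case 3).
for frame in ["frame_1", "frame_2", "frame_3"]:
    result = compute(unit, data=frame)
```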
The operation device and method proposed by the disclosure are described in detail below through specific embodiments.
Embodiment 1
As shown in Fig. 2, this embodiment proposes an operation method comprising the following steps:
When the input data includes data to be processed, a network structure and weight data, the following steps are executed:
Step 11: input and read the input data;
Step 12: construct an offline model according to the network structure and the weight data;
Step 13: parse the offline model to obtain operation instructions, and cache them for subsequent calls;
Step 14: perform operations on the data to be processed according to the operation instructions, obtaining a neural network operation result for output.
When the input data includes data to be processed and an offline model, the following steps are executed:
Step 21: input and read the input data;
Step 22: parse the offline model to obtain operation instructions, and cache them for subsequent calls;
Step 23: perform operations on the data to be processed according to the operation instructions, obtaining a neural network operation result for output.
When the input data includes only data to be processed, the following steps are executed:
Step 31: input and read the input data;
Step 32: call the cached operation instructions and perform operations on the data to be processed, obtaining a neural network operation result for output.
A neural network processing unit processes the data to be processed according to the operation instructions to obtain the operation result; the neural network processing unit has an instruction caching unit and a data caching unit for caching the received operation instructions and the data to be processed, respectively.
In this embodiment, the input network structure is AlexNet, the weight data is bvlc_alexnet.caffemodel, the data to be processed is a continuous stream of single pictures, and the offline model is Cambricon_model.
In conclusion the method proposed with the present embodiment, can largely simplify and use neural network processor
The flow for carrying out operation avoids the extra memory for calling traditional a whole set of programming framework to arrive and IO expenses.With this method, can allow
Neural network accelerator gives full play to operational performance under low memory, real-time environment.
Embodiment 2
As shown in Fig. 3, this embodiment proposes an operation device comprising an input module 101, a model generation module 102, a neural network operation module 103, an output module 104 and a control module 105, where the neural network operation module 103 includes a model parsing unit 106 and a neural network processor 107.
The key feature of the device is offline execution: after an offline model has been generated, the offline model is used directly to generate the relevant operation instructions and pass in the weight data, and the data to be processed is then processed. More specifically:
The input module 101 is configured to input either the combination of a network structure, weight data and data to be processed, or the combination of an offline model and data to be processed. When the input is a network structure, weight data and data to be processed, the network structure and the weight data are passed to the model generation module 102, which performs the subsequent operation of generating an offline model. When the input is an offline model and data to be processed, the offline model and the data to be processed are passed directly to the model parsing unit 106 for the subsequent operations.
The output module 104 is configured to output the determined operation data generated from the specific network structure and a set of data to be processed, the output data being obtained through the operations of the neural network processor 107.
The model generation module 102 is configured to generate, according to the input network structure parameters and weight data, an offline model for use by the layers below.
The model parsing unit 106 is configured to parse the incoming offline model, to generate operation instructions that can be passed directly to the neural network processor 107, and to pass the data to be processed from the input module 101 into the neural network processor 107.
The neural network processor 107 is configured to perform operations according to the incoming operation instructions and data to be processed, and to pass the determined operation result to the output module 104; it has an instruction caching unit and a data caching unit.
The control module 105 is configured to detect the type of the input data and to perform the following operations:
when the input data includes data to be processed, a network structure and weight data, controlling the input module 101 to send the network structure and the weight data to the model generation module 102 to construct an offline model, and controlling the neural network operation module 103 to perform neural network operations on the data to be processed from the input module 101, based on the offline model provided by the model generation module 102;
when the input data includes data to be processed and an offline model, controlling the input module 101 to send the data to be processed and the offline model to the neural network operation module 103, and controlling the neural network operation module 103 to generate and cache operation instructions based on the offline model and to perform neural network operations on the data to be processed based on the operation instructions;
when the input data includes only data to be processed, controlling the input module 101 to send the data to be processed to the neural network operation module 103, and controlling the neural network operation module 103 to call the cached operation instructions and perform neural network operations on the data to be processed.
In this embodiment, the input network structure is AlexNet, the weight data is bvlc_alexnet.caffemodel, and the data to be processed is a continuous stream of single pictures. The model generation module 102 generates a new offline model Cambricon_model according to the input network structure and weight data; the generated offline model Cambricon_model can also be used on its own as the input of a subsequent run. The model parsing unit 106 parses the offline model Cambricon_model to generate a series of operation instructions; it transfers the generated operation instructions to the instruction caching unit on the neural network processor 107, and transfers the input picture passed in by the input module 101 to the data caching unit on the neural network processor 107.
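Under the same hypothetical names as the earlier sketch, the module structure of Fig. 3 can be rendered as a thin wrapper in which the control module (105) routes input through model generation (102), model parsing (106) and the neural network processor (107). This is an illustrative assumption, not an interface of the disclosure:

```python
# Structural sketch of the device in Fig. 3; reuses build_offline_model,
# parse_offline_model and NNProcessingUnit from the earlier sketch.
class OperationDevice:
    def __init__(self) -> None:
        self.processor = NNProcessingUnit()  # 107, with instruction/data caches

    def run(self, data, network_structure=None, weight_data=None, offline_model=None):
        # Control module (105): inspect the input type and route accordingly.
        if network_structure is not None and weight_data is not None:
            # Model generation module (102) builds the offline model.
            offline_model = build_offline_model(network_structure, weight_data)
        if offline_model is not None:
            # Model parsing unit (106) turns the model into operation instructions.
            self.processor.cache_instructions(parse_offline_model(offline_model))
        # Neural network processor (107) runs on cached instructions; the result
        # goes to the output module (104), represented here by the return value.
        return self.processor.run(data)
```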
The processes or methods depicted in the preceding figures may be executed by processing logic comprising hardware, software, or a combination of both. Although the processes or methods are described above in a certain order, it should be understood that certain of the described operations may be executed in a different order. In addition, some operations may be executed in parallel rather than sequentially.
The specific embodiments described above further explain in detail the purpose, technical solutions and beneficial effects of the disclosure. It should be understood that the foregoing are merely specific embodiments of the disclosure and are not intended to limit the disclosure; any modification, equivalent substitution, improvement, etc., made within the spirit and principles of the disclosure shall be included within the protection scope of the disclosure.
Claims (10)
1. An operation method, comprising the following steps:
when input data includes data to be processed, a network structure and weight data, executing the following steps:
step 11: inputting and reading the input data;
step 12: constructing an offline model according to the network structure and the weight data;
step 13: parsing the offline model to obtain operation instructions, and caching them for subsequent calls;
step 14: performing operations on the data to be processed according to the operation instructions to obtain an operation result for output;
when input data includes data to be processed and an offline model, executing the following steps:
step 21: inputting and reading the input data;
step 22: parsing the offline model to obtain operation instructions, and caching them for subsequent calls;
step 23: performing operations on the data to be processed according to the operation instructions to obtain an operation result for output;
when input data includes only data to be processed, executing the following steps:
step 31: inputting and reading the input data;
step 32: calling the cached operation instructions and performing operations on the data to be processed to obtain an operation result for output.
2. The operation method of claim 1, wherein the step of performing neural network operations on the data to be processed according to the operation instructions to obtain an operation result is implemented by a neural network processing unit.
3. The operation method of claim 2, wherein the neural network processing unit has an instruction caching unit for caching the operation instructions for subsequent calls.
4. The operation method of any one of claims 1 to 3, wherein the offline model is a neural network model, the neural network model including Cambricon_model, AlexNet_model, GoogleNet_model, VGG_model, R-CNN_model, GAN_model, LSTM_model, RNN_model, ResNet_model.
5. The operation method of any one of claims 1 to 4, wherein the data to be processed is input that can be processed by a neural network.
6. The operation method of claim 5, wherein the data to be processed includes a continuous stream of single pictures, voice, or video.
7. The operation method of any one of claims 1 to 6, wherein the network structure is a neural network structure, the neural network structure including AlexNet, GoogleNet, ResNet, VGG, R-CNN, GAN, LSTM, RNN.
8. An operation device, comprising:
an input module for inputting data, the data including data to be processed, a network structure and weight data, and/or offline model data;
a model generation module for constructing an offline model according to the input network structure and weight data;
a neural network operation module for generating and caching operation instructions based on the offline model, and performing operations on the data to be processed based on the operation instructions to obtain an operation result;
an output module for outputting the operation result;
a control module for detecting the type of the input data and performing the following operations:
when the input data includes data to be processed, a network structure and weight data, controlling the input module to send the network structure and the weight data to the model generation module to construct an offline model, and controlling the neural network operation module to perform operations on the data to be processed from the input module based on the offline model provided by the model generation module;
when the input data includes data to be processed and an offline model, controlling the input module to send the data to be processed and the offline model to the neural network operation module, and controlling the neural network operation module to generate and cache operation instructions based on the offline model and to perform operations on the data to be processed based on the operation instructions;
when the input data includes only data to be processed, controlling the input module to send the data to be processed to the neural network operation module, and controlling the neural network operation module to call the cached operation instructions and perform operations on the data to be processed.
9. The operation device of claim 8, wherein the neural network operation module includes a model parsing unit and a neural network processing unit, wherein:
the model parsing unit is configured to generate operation instructions based on the offline model;
the neural network processing unit is configured to cache the operation instructions for subsequent calls, or, when the input data includes only data to be processed, to call the cached operation instructions and perform operations on the data to be processed based on the operation instructions to obtain the operation result.
10. The operation device of claim 8 or 9, wherein the neural network processing unit has an instruction caching unit for caching the operation instructions for subsequent calls.
Priority Applications (17)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710269049.0A CN108734288B (en) | 2017-04-21 | 2017-04-21 | Operation method and device |
EP18788355.8A EP3614259A4 (en) | 2017-04-19 | 2018-04-17 | Processing apparatus and processing method |
CN202410405915.4A CN118690805A (en) | 2017-04-19 | 2018-04-17 | Processing apparatus and processing method |
PCT/CN2018/083415 WO2018192500A1 (en) | 2017-04-19 | 2018-04-17 | Processing apparatus and processing method |
CN201880000923.3A CN109121435A (en) | 2017-04-19 | 2018-04-17 | Processing unit and processing method |
US16/476,262 US11531540B2 (en) | 2017-04-19 | 2018-04-17 | Processing apparatus and processing method with dynamically configurable operation bit width |
KR1020197038135A KR102258414B1 (en) | 2017-04-19 | 2018-04-17 | Processing apparatus and processing method |
JP2019549467A JP6865847B2 (en) | 2017-04-19 | 2018-04-17 | Processing equipment, chips, electronic equipment and methods |
KR1020197025307A KR102292349B1 (en) | 2017-04-19 | 2018-04-17 | Processing device and processing method |
EP19214320.4A EP3654172A1 (en) | 2017-04-19 | 2018-04-17 | Fused vector multiplier and method using the same |
CN201811097653.0A CN109376852B (en) | 2017-04-21 | 2018-04-17 | Arithmetic device and arithmetic method |
EP19214371.7A EP3786786B1 (en) | 2017-04-19 | 2018-04-17 | Processing device, processing method, chip, and electronic apparatus |
US16/697,533 US11531541B2 (en) | 2017-04-19 | 2019-11-27 | Processing apparatus and processing method |
US16/697,727 US11698786B2 (en) | 2017-04-19 | 2019-11-27 | Processing apparatus and processing method |
US16/697,637 US11720353B2 (en) | 2017-04-19 | 2019-11-27 | Processing apparatus and processing method |
US16/697,687 US11734002B2 (en) | 2017-04-19 | 2019-11-27 | Counting elements in neural network input data |
JP2019228383A JP6821002B2 (en) | 2017-04-19 | 2019-12-18 | Processing equipment and processing method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710269049.0A CN108734288B (en) | 2017-04-21 | 2017-04-21 | Operation method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108734288A true CN108734288A (en) | 2018-11-02 |
CN108734288B CN108734288B (en) | 2021-01-29 |
Family
ID=63934137
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710269049.0A Active CN108734288B (en) | 2017-04-19 | 2017-04-21 | Operation method and device |
CN201811097653.0A Active CN109376852B (en) | 2017-04-19 | 2018-04-17 | Arithmetic device and arithmetic method |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811097653.0A Active CN109376852B (en) | 2017-04-19 | 2018-04-17 | Arithmetic device and arithmetic method |
Country Status (1)
Country | Link |
---|---|
CN (2) | CN108734288B (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109685203A (en) * | 2018-12-21 | 2019-04-26 | 北京中科寒武纪科技有限公司 | Data processing method, device, computer system and storage medium |
CN109697500A (en) * | 2018-12-29 | 2019-04-30 | 北京中科寒武纪科技有限公司 | Data processing method, device, electronic equipment and storage medium |
CN109726797A (en) * | 2018-12-21 | 2019-05-07 | 北京中科寒武纪科技有限公司 | Data processing method, device, computer system and storage medium |
CN110070176A (en) * | 2019-04-18 | 2019-07-30 | 北京中科寒武纪科技有限公司 | The processing method of off-line model, the processing unit of off-line model and Related product |
CN110309917A (en) * | 2019-07-05 | 2019-10-08 | 北京中科寒武纪科技有限公司 | The verification method and relevant apparatus of off-line model |
CN111242321A (en) * | 2019-04-18 | 2020-06-05 | 中科寒武纪科技股份有限公司 | Data processing method and related product |
CN113490943A (en) * | 2019-07-31 | 2021-10-08 | 华为技术有限公司 | Integrated chip and method for processing sensor data |
WO2021232958A1 (en) * | 2020-05-18 | 2021-11-25 | Oppo广东移动通信有限公司 | Method and apparatus for executing operation, electronic device, and storage medium |
US11983535B2 (en) | 2019-03-22 | 2024-05-14 | Cambricon Technologies Corporation Limited | Artificial intelligence computing device and related product |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112613597B (en) * | 2020-11-30 | 2023-06-30 | 河南汇祥通信设备有限公司 | Comprehensive pipe rack risk automatic identification convolutional neural network model and construction method |
CN112947935B (en) * | 2021-02-26 | 2024-08-13 | 上海商汤智能科技有限公司 | Operation method and device, electronic equipment and storage medium |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105005911A (en) * | 2015-06-26 | 2015-10-28 | 深圳市腾讯计算机系统有限公司 | Operating system for deep neural network and operating method |
CN105512723A (en) * | 2016-01-20 | 2016-04-20 | 南京艾溪信息科技有限公司 | Artificial neural network calculating device and method for sparse connection |
WO2016099779A1 (en) * | 2014-12-19 | 2016-06-23 | Intel Corporation | Method and apparatus for distributed and cooperative computation in artificial neural networks |
CN105930902A (en) * | 2016-04-18 | 2016-09-07 | 中国科学院计算技术研究所 | Neural network processing method and system |
CN106228238A (en) * | 2016-07-27 | 2016-12-14 | 中国科学技术大学苏州研究院 | The method and system of degree of depth learning algorithm is accelerated on field programmable gate array platform |
CN106355246A (en) * | 2015-10-08 | 2017-01-25 | 上海兆芯集成电路有限公司 | Tri-configuration neural network element |
CN106529670A (en) * | 2016-10-27 | 2017-03-22 | 中国科学院计算技术研究所 | Neural network processor based on weight compression, design method, and chip |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20130090147A (en) * | 2012-02-03 | 2013-08-13 | 안병익 | Neural network computing apparatus and system, and method thereof |
US9378455B2 (en) * | 2012-05-10 | 2016-06-28 | Yan M. Yufik | Systems and methods for a computer understanding multi modal data streams |
US20160162779A1 (en) * | 2014-12-05 | 2016-06-09 | RealMatch, Inc. | Device, system and method for generating a predictive model by machine learning |
CN106557332A (en) * | 2016-11-30 | 2017-04-05 | 上海寒武纪信息科技有限公司 | A kind of multiplexing method and device of instruction generating process |
- 2017-04-21: CN CN201710269049.0A patent/CN108734288B/en active Active
- 2018-04-17: CN CN201811097653.0A patent/CN109376852B/en active Active
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2016099779A1 (en) * | 2014-12-19 | 2016-06-23 | Intel Corporation | Method and apparatus for distributed and cooperative computation in artificial neural networks |
CN105005911A (en) * | 2015-06-26 | 2015-10-28 | 深圳市腾讯计算机系统有限公司 | Operating system for deep neural network and operating method |
CN106355246A (en) * | 2015-10-08 | 2017-01-25 | 上海兆芯集成电路有限公司 | Tri-configuration neural network element |
CN105512723A (en) * | 2016-01-20 | 2016-04-20 | 南京艾溪信息科技有限公司 | Artificial neural network calculating device and method for sparse connection |
CN105930902A (en) * | 2016-04-18 | 2016-09-07 | 中国科学院计算技术研究所 | Neural network processing method and system |
CN106228238A (en) * | 2016-07-27 | 2016-12-14 | 中国科学技术大学苏州研究院 | The method and system of degree of depth learning algorithm is accelerated on field programmable gate array platform |
CN106529670A (en) * | 2016-10-27 | 2017-03-22 | 中国科学院计算技术研究所 | Neural network processor based on weight compression, design method, and chip |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109685203A (en) * | 2018-12-21 | 2019-04-26 | 北京中科寒武纪科技有限公司 | Data processing method, device, computer system and storage medium |
CN109726797A (en) * | 2018-12-21 | 2019-05-07 | 北京中科寒武纪科技有限公司 | Data processing method, device, computer system and storage medium |
CN109697500A (en) * | 2018-12-29 | 2019-04-30 | 北京中科寒武纪科技有限公司 | Data processing method, device, electronic equipment and storage medium |
US11983535B2 (en) | 2019-03-22 | 2024-05-14 | Cambricon Technologies Corporation Limited | Artificial intelligence computing device and related product |
CN110070176A (en) * | 2019-04-18 | 2019-07-30 | 北京中科寒武纪科技有限公司 | The processing method of off-line model, the processing unit of off-line model and Related product |
CN111242321A (en) * | 2019-04-18 | 2020-06-05 | 中科寒武纪科技股份有限公司 | Data processing method and related product |
US11762690B2 (en) | 2019-04-18 | 2023-09-19 | Cambricon Technologies Corporation Limited | Data processing method and related products |
CN111242321B (en) * | 2019-04-18 | 2023-09-26 | 中科寒武纪科技股份有限公司 | Data processing method and related product |
CN110309917A (en) * | 2019-07-05 | 2019-10-08 | 北京中科寒武纪科技有限公司 | The verification method and relevant apparatus of off-line model |
CN113490943A (en) * | 2019-07-31 | 2021-10-08 | 华为技术有限公司 | Integrated chip and method for processing sensor data |
WO2021232958A1 (en) * | 2020-05-18 | 2021-11-25 | Oppo广东移动通信有限公司 | Method and apparatus for executing operation, electronic device, and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN108734288B (en) | 2021-01-29 |
CN109376852A (en) | 2019-02-22 |
CN109376852B (en) | 2021-01-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108734288A (en) | A kind of operation method and device | |
RU2771008C1 (en) | Method and apparatus for processing tasks based on a neural network | |
US20230048031A1 (en) | Data processing method and apparatus | |
WO2022170997A1 (en) | Data processing method and system based on risc-v instruction set, and device and medium | |
TWI731373B (en) | Chip, data processing method and computing equipment based on it | |
CN108416440A (en) | A kind of training method of neural network, object identification method and device | |
CN110458280B (en) | Convolutional neural network acceleration method and system suitable for mobile terminal | |
CN109242094A (en) | Device and method for executing artificial neural network forward operation | |
CN106845631B (en) | Stream execution method and device | |
CN108320018B (en) | Artificial neural network operation device and method | |
US20220004858A1 (en) | Method for processing artificial neural network, and electronic device therefor | |
CN109597965A (en) | Data processing method, system, terminal and medium based on deep neural network | |
CN110633785B (en) | Method and system for calculating convolutional neural network | |
CN108122031A (en) | A kind of neutral net accelerator architecture of low-power consumption | |
CN111191789B (en) | Model optimization deployment system, chip, electronic equipment and medium | |
CN109885406B (en) | Operator calculation optimization method, device, equipment and storage medium | |
US20240129236A1 (en) | Dqn-based distributed computing network coordinate flow scheduling system and method | |
CN112835712A (en) | Multithreading special effect drawing method, device, system and medium | |
CN117032807A (en) | AI acceleration processor architecture based on RISC-V instruction set | |
CN113190352B (en) | General CPU-oriented deep learning calculation acceleration method and system | |
CN112862083A (en) | Deep neural network inference method and device under edge environment | |
CN109189570B (en) | MEC-based moving edge pre-calculation method | |
CN111738432B (en) | Neural network processing circuit supporting self-adaptive parallel computation | |
Youlve et al. | Asynchronous Distributed Proximal Policy Optimization Training Framework Based on GPU | |
XUANLEI et al. | HeteGen: Efficient Heterogeneous Parallel Inference for Large Language Models on Resource-Constrained Devices |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |