CN110308909A - Executable program generating device and method for a neural network processor - Google Patents
Executable program generating device and method for a neural network processor
- Publication number
- CN110308909A (application CN201810257595.7A)
- Authority
- CN
- China
- Prior art keywords
- data
- neural network
- module
- program
- block
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS › G06—COMPUTING; CALCULATING OR COUNTING › G06F—ELECTRIC DIGITAL DATA PROCESSING › G06F8/00—Arrangements for software engineering › G06F8/40—Transformation of program code › G06F8/41—Compilation
- G06F8/41—Compilation › G06F8/42—Syntactic analysis
- G—PHYSICS › G06—COMPUTING; CALCULATING OR COUNTING › G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS › G06N3/00—Computing arrangements based on biological models › G06N3/02—Neural networks › G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons › G06N3/063—Physical realisation using electronic means
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS › Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE › Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES › Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
An executable program generating device and method for a neural network processor. The generating device includes: a source program segmentation module, which receives a source file as input, identifies and extracts the positions of the code segment and the data segment according to the format of the source file, and generates an intermediate file containing the code segment and an intermediate file containing the data segment; a data processing module, which takes the intermediate file containing the data segment as input, handles the placement of data, and outputs memory allocation information and the data placement scheme; and a neural network algorithm mapping module, which takes the intermediate file containing the code segment as input, maps the neural network algorithms expressed as blocks in the code into an algorithm flow composed of macro statements, and then maps that flow into hardware-specific instructions. The device offers users a convenient way to use a neural network processor.
Description
Technical field
The present disclosure relates to the field of computing, and more particularly to the field of artificial intelligence.
Background technique
Deep neural network algorithms are a recently popular class of machine learning algorithms, widely used in fields such as image recognition, speech recognition, and natural language processing. Because deep neural networks achieve good results across many tasks, new network structures and algorithms appear one after another, which poses a challenge for programming and development. For a neural network processor in particular, its unique hardware structure, together with the compute-intensive and memory-access-intensive character of the neural network algorithms that run on it, makes programming even more complex and difficult. Neural network processors proposed so far are all programmed by hand-writing instructions, which is time-consuming, labor-intensive, error-prone, and hard to debug.
In the course of implementing the present disclosure, the applicant found the following problems in the prior art: neural network processors lack effective programming means and efficient code generation facilities, which makes them extremely difficult to program; the generated code is inefficient and cannot fully exploit the advantages of a neural network processor.
Summary of the invention
(1) Technical problem to be solved
In view of this, the purpose of the present disclosure is to provide an executable program generating device and method for a neural network processor, to solve at least some of the technical problems described above.
(2) technical solution
According to one aspect of the present disclosure, an executable program generating device for a neural network processor is provided, comprising:
a source program segmentation module, which receives a source file as input, identifies and extracts the positions of the code segment and the data segment according to the format of the source file, and generates an intermediate file containing the code segment and an intermediate file containing the data segment;
a data processing module, which takes the intermediate file containing the data segment as input, handles the placement of data, and outputs memory allocation information and the data placement scheme;
a neural network algorithm mapping module, which takes the intermediate file containing the code segment as input, maps the neural network algorithms expressed as blocks in the code into an algorithm flow composed of macro statements, and then maps that flow into hardware-specific instructions.
In a further embodiment, the device also includes a parallel code generation module, which takes the hardware-specific instructions as input, performs parallelization processing and optimization on them, and outputs the optimized program.
In a further embodiment, the device also includes a relocation module, which takes the data placement scheme, the memory allocation information, and the optimized program as input, and replaces the relative addresses in the optimized program with absolute addresses.
In a further embodiment, the device also includes a machine code generation module, which translates the program relocated by the relocation module into a character string that the neural network processor can recognize.
In a further embodiment, the data processing module is also used to perform data partitioning: the input and output data of each layer of the neural network are partitioned so that, after partitioning, they fit into the on-chip memory cells of the neural network processor.
In a further embodiment, the neural network algorithm mapping module includes:
a computation division module, for dividing a large-scale computation into relatively small sub-computations;
an instruction mapping module, for mapping the algorithm into instructions from the neural network processor's instruction set.
In a further embodiment, the code segment contains the statements and blocks of the neural network algorithm, together with their corresponding definitions.
According to another aspect of the present disclosure, a method for generating an executable program using the above device is also provided, comprising:
using the source program segmentation module, receiving a source file as input, identifying and extracting the positions of the code segment and the data segment according to the format of the source file, and generating an intermediate file containing the code segment and an intermediate file containing the data segment;
using the data processing module, taking the intermediate file containing the data segment as input, handling the placement of data, and outputting memory allocation information and the data placement scheme;
using the neural network algorithm mapping module, taking the intermediate file containing the code segment as input, mapping the neural network algorithms expressed as blocks in the code into an algorithm flow composed of macro statements, and then mapping that flow into hardware-specific instructions.
In a further embodiment, the method further includes: using the parallel code generation module, taking the hardware-specific instructions as input, performing parallelization processing and optimization on them, and outputting the optimized program.
In a further embodiment, the method further includes: using the relocation module, taking the data placement scheme, the memory allocation information, and the optimized program as input, and replacing the relative addresses in the optimized program with absolute addresses.
In a further embodiment, the method further includes: using the machine code generation module to translate the program relocated by the relocation module into a character string that the neural network processor can recognize.
In a further embodiment, the method further includes: using the data processing module to perform data partitioning, whereby the input and output data of each layer of the neural network are partitioned and, after partitioning, placed into the on-chip memory cells of the neural network processor.
In a further embodiment, the code segment contains the statements and blocks of the neural network algorithm, together with their corresponding definitions.
In a further embodiment, the parallelization processing and optimization in the parallel code generation module include: adjusting the statement order through simulation and/or inference to improve parallel efficiency.
In a further embodiment, the data processing module divides a neuron into multiple data blocks and stores the data blocks in order into the storage unit; during computation, a data block can be loaded into on-chip memory for further computation.
(3) Beneficial effects
The executable file generating device proposed in the present disclosure is designed specifically for the neural network accelerator and offers users a convenient way to use a neural network processor. Because the programming stays close to the neural network algorithm's design and is independent of the hardware itself, users can generate efficient neural network executables without having to consider the characteristics of the hardware. The device in the present disclosure can generate executable code that runs efficiently, guaranteeing the efficiency of the neural network accelerator.
Detailed description of the invention
Fig. 1 is a block diagram of the executable program generating device in an embodiment of the present disclosure.
Fig. 2 is a schematic diagram of the overall structure containing the executable program generating device in an embodiment of the present disclosure.
Fig. 3 shows the data partitioning performed by the data processing module in an embodiment of the present disclosure.
Fig. 4 illustrates the data placement function performed by the data processing module in an embodiment of the present disclosure.
Fig. 5 is a flow diagram of source file processing using an embodiment of the present disclosure.
Fig. 6 depicts the forward computation process of a network structure with two fully connected layers.
Fig. 7 shows the concrete expansion of the forward computation block in Fig. 6.
Specific embodiment
To make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below in conjunction with specific embodiments and with reference to the accompanying drawings.
In the present disclosure, because the neural network processor has a special hardware structure, a special programming method is needed to program it; that is, a programming language is used, and the language is mapped into a source file that the generating device can understand. Fig. 1 shows, for an embodiment of the present disclosure, the generating process from a neural network algorithm to an executable file for the neural network processor. The programming method in this embodiment consists of two steps. First, the concepts in the neural network algorithm are mapped to abstract concepts in the programming language; for example, the neurons and synapses in the neural network are mapped to data structures in the programming language. Second, the abstract concepts in the programming language are mapped to code in a concrete source program; for example, a block of neuron data can be declared in the data segment of the source file.
A programming language is composed of a set of established rules (syntax and semantics). In the present disclosure, the assembly language is constrained in the following respects: data types, statements, and blocks.
Data types organize and express the various data structures in a neural network algorithm; they are used to store these data structures and map them onto the specific hardware of the neural network processor. The present disclosure includes three data types: neuron, synapse, and parameter. A neuron is a multidimensional array used to store and express the input and output values of each layer of the neural network algorithm. The synapse data structure is a multidimensional array used to express and store the weights connecting the inputs and outputs of certain layers (such as convolutional layers and fully connected layers) of the neural network algorithm. The parameter data structure is a scalar structure used to represent scalar data in the neural network algorithm, especially training parameters such as the learning rate. Data structures are divided into two kinds, dynamic and static. Dynamic data is data whose size becomes known only at run time, so it must be allocated at run time; static data is data whose size is already known during program generation, before run time, so this part of the data is fully allocated during program generation.
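As an illustration of the paragraph above, the following sketch shows how a program generator might model the three data types (neuron, synapse, parameter) and the static/dynamic distinction. The class and field names are assumptions for illustration, not part of the patent's language.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class TensorDecl:
    """Hypothetical declaration record for one data structure."""
    name: str
    kind: str                   # "neuron", "synapse", or "parameter"
    shape: Optional[List[int]]  # None => size unknown until run time (dynamic)
    is_static: bool = True

    @property
    def num_elements(self) -> Optional[int]:
        if self.shape is None:
            return None         # dynamic: cannot size it at generation time
        n = 1
        for d in self.shape:
            n *= d
        return n

# Static data: size known at generation time, so it can be allocated now.
fc1_inp = TensorDecl("fc1_inp", "neuron", [1024])
# Dynamic data: size only known at run time, so allocation is deferred.
batch_out = TensorDecl("batch_out", "neuron", None, is_static=False)

assert fc1_inp.num_elements == 1024
assert batch_out.num_elements is None
```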
Statements express the concrete execution process of the algorithm; the user implements a neural network algorithm by writing particular statements in a particular order. Statements include basic statements and macro statements. Basic statements correspond to the instruction set of the neural network processor and represent the most basic functions it can support. Macro statements provide language abstraction and are realized through macro definitions and macro calls. A macro definition defines the concrete execution process of the macro statement and is composed of basic statements. A macro call is a specific use of the macro statement.
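The macro mechanism just described can be sketched minimally as follows: a macro statement is defined in terms of basic statements, and a macro call expands to its definition. The macro name `fc_forward` and the basic statement names are invented for the example; they do not come from the patent.

```python
# Hypothetical macro table: each macro statement is defined by a sequence
# of basic statements supported by the neural network processor.
MACRO_DEFS = {
    "fc_forward": ["load_matrix", "load_vector", "matmul_vec", "store_vector"],
}

def expand(program):
    """Replace each macro call with the basic statements of its definition;
    basic statements pass through unchanged."""
    out = []
    for stmt in program:
        out.extend(MACRO_DEFS.get(stmt, [stmt]))
    return out

expanded = expand(["fc_forward", "sync"])
assert expanded == ["load_matrix", "load_vector", "matmul_vec",
                    "store_vector", "sync"]
```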
A block expresses a specific neural network algorithm and is composed of a sequence of macro statements and basic statements. The use of a block is likewise divided into definition and call: a block definition describes the computation process of a neural network algorithm through an assembled arrangement of macro statements and basic statements, and a block call is a use of that algorithm.
The above describes how a source file is generated and what it consists of. The executable program generating device for a neural network processor of the disclosed embodiment is illustrated below in conjunction with the drawings. It processes the source file: by parsing, optimizing, and so on, the source file is turned into an executable program that can run on the neural network processor.
The executable program generating device of the present disclosure is used to generate an executable program that can run on a neural network processor. Its input is a source file (a character string) stored in a storage unit and written in a fixed format; its output is an instruction sequence that the target processor can recognize and run continuously. The instruction sequence can be stored in any radix (binary, decimal, octal, hexadecimal, and so on) as a character string file in a storage unit. The generating device may include the following modules: a source program segmentation module, a data processing module, and a neural network algorithm mapping module.
The source program segmentation module parses the sections with different meanings in the source file and hands them to the different processing modules. Specifically, the segmentation module receives the source file as input, identifies the positions of the code segment and the data segment according to the format of the source file, extracts them, and generates two intermediate files, one containing the code segment and one containing the data segment. The code segment contains all statements and blocks together with their corresponding definitions; this part of the code is sent to the neural network algorithm mapping module. The other part is sent to the data processing module, which allocates and places the data. The data segment contains the declarations and definitions of all data (static and dynamic, of all types).
The data processing module handles memory allocation and data placement. Its input is the declarations and definitions of all data; its output is memory allocation information (for example, each datum's start address and size) and the placement scheme. Fig. 3 shows the data partitioning function performed by the data processing module in this disclosed embodiment: the input and output of each layer may be very large, so these data must be partitioned so that they fit into the on-chip memory cells of the neural network processor.
Fig. 4 illustrates the data placement function performed by the data processing module in this embodiment. A complete piece of data is divided into several data blocks; in the figure, one neuron data block has been divided into three data blocks, which are stored in order into the storage unit. Before data placement, data is handled in units of one complete block; the data placement module divides it into multiple data blocks, each of which is treated and placed as an independent neuron data block. During computation, such a data block can be loaded into on-chip memory for further computation.
The neural network algorithm mapping module maps the blocks in the source program (which state the different neural network algorithms) into the corresponding basic statements (which are closely tied to the hardware instruction set). It specifically includes a computation division module and an instruction mapping module. The computation division module divides a large-scale computation into relatively small sub-computations. For example, the output of a fully connected layer algorithm may be divided into three segments, in which case the algorithm is divided into three sub-computations, each computing one segment of the output. The instruction mapping module then maps the concrete algorithm into instructions from the hardware instruction set (the neural network processor instruction set). For example, one fully connected computation can be mapped to a basic statement for a matrix-vector multiplication. The mapped code is then sent to the parallel optimization module.
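The computation division just described can be sketched as splitting a fully connected layer's output into segments, each becoming one sub-computation mapped to a matrix-vector-multiply style instruction. The instruction name `MAT_VEC_MUL` is an assumption standing in for the processor's real instruction set.

```python
def divide_fc(out_size, num_parts):
    """Divide a fully connected layer's output into num_parts segments;
    each segment is one sub-computation mapped to a hypothetical
    matrix-vector instruction covering [start, end) of the output."""
    step = (out_size + num_parts - 1) // num_parts
    subs = []
    for start in range(0, out_size, step):
        end = min(start + step, out_size)
        subs.append(("MAT_VEC_MUL", start, end))
    return subs

# A 512-wide output divided into three sub-computations, each computing
# one segment of the output, as in the example above.
subs = divide_fc(512, 3)
assert len(subs) == 3
assert subs[-1][2] == 512   # the segments together cover the whole output
```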
In some embodiments, the generating device may also include a parallel code generation module, whose input is the program mapped into basic statements and macro statements and whose output is the optimized code. This module adjusts the statement order through methods such as simulation and inference to achieve the best parallel efficiency.
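One simple statement-reordering optimization of the kind such a module might apply is sketched below: a load is hoisted above an unrelated compute so that, on hardware with separate load and compute units, the two can overlap. The three-tuple instruction shape and the minimal dependence check are assumptions for illustration only.

```python
def reads_and_writes(stmt):
    """Assumed statement shape: (opcode, destination, *sources)."""
    op, dst, *srcs = stmt
    return set(srcs), {dst}

def hoist_independent_loads(program):
    """Swap a LOAD upward past its predecessor when neither statement
    touches the other's operands, so load and compute can overlap."""
    prog = list(program)
    for i in range(1, len(prog)):
        if prog[i][0] == "LOAD":
            reads, writes = reads_and_writes(prog[i])
            prev_reads, prev_writes = reads_and_writes(prog[i - 1])
            if not (reads & prev_writes or writes & prev_reads
                    or writes & prev_writes):
                prog[i - 1], prog[i] = prog[i], prog[i - 1]
    return prog

prog = [
    ("MUL", "t0", "a", "b"),   # compute on a, b
    ("LOAD", "t1", "mem1"),    # unrelated load: may move up
    ("ADD", "t2", "t0", "t1"), # depends on both, must stay last
]
opt = hoist_independent_loads(prog)
assert opt[0][0] == "LOAD" and opt[-1][0] == "ADD"
```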
In some embodiments, the generating device may also include a relocation module, which replaces the start addresses of the corresponding data in the program according to the address information from memory allocation.
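Relocation can be sketched as a table lookup: given the allocation table from the data processing module (symbol to relative start address), each symbolic operand in the program is replaced by an absolute address. The instruction shape and the base address are assumptions for the example.

```python
def relocate(program, alloc_table, base_address):
    """Replace each symbolic operand found in alloc_table with
    base_address + its relative start address."""
    relocated = []
    for op, operand in program:
        if operand in alloc_table:
            operand = base_address + alloc_table[operand]
        relocated.append((op, operand))
    return relocated

alloc = {"fc1_inp": 0, "fc1_weight": 2048}   # relative start addresses
prog = [("LOAD", "fc1_inp"), ("LOAD", "fc1_weight"), ("SYNC", None)]
out = relocate(prog, alloc, base_address=0x1000)
assert out[0] == ("LOAD", 0x1000)
assert out[1] == ("LOAD", 0x1000 + 2048)
```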
In some embodiments, the generating device may also include a machine code generation module, which translates the relocated code into a binary character string that the machine can recognize.
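The final translation step can be sketched as encoding each mnemonic statement into a fixed-width binary string. The opcode values and the 4-bit/12-bit field widths are invented for the example; a real neural network processor defines its own encoding.

```python
OPCODES = {"LOAD": 0b0001, "STORE": 0b0010, "MAT_VEC_MUL": 0b0011}

def encode(op, operand):
    """Pack a 4-bit opcode and a 12-bit operand into a 16-character
    binary string."""
    word = (OPCODES[op] << 12) | (operand & 0xFFF)
    return format(word, "016b")

def assemble(program):
    """Translate a mnemonic program into newline-separated bit strings."""
    return "\n".join(encode(op, operand) for op, operand in program)

binary = assemble([("LOAD", 8), ("MAT_VEC_MUL", 0)])
assert binary.split("\n")[0] == "0001000000001000"
```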
Fig. 2 is a schematic diagram of the overall structure containing the specific executable program generating device in a disclosed embodiment. First, the source file is fed into the source program segmentation module and split into a data segment and a code segment. The data segment is then fed into the data processing module, which allocates memory and places the data; the code segment is fed into the neural network algorithm mapping module, which maps the neural network algorithms expressed as blocks in the code into an algorithm flow composed of macro statements and then into hardware-specific instructions. The output of the neural network algorithm mapping module can be fed into the parallel code generation module for parallelization processing and optimization. In this step, the parallel code generation module can adjust the order of instructions in the code to suit the hardware's specific parallel mechanisms so as to achieve the best result. Afterwards, the address information output by the data processing module and the optimized program output by the parallel code module are fed into the relocation module, which replaces the relative addresses in the program with absolute addresses. Finally, the program is fed into the machine code generation module, which translates the code expressed in mnemonics into a form the machine can recognize (for example, a binary file).
For further illustration, a specific embodiment is used below to describe the entire executable program generation process. It should be understood, however, that the specific implementation details in the embodiment serve only to illustrate the present disclosure and are not to be construed as limiting it; moreover, the embodiment may simplify or omit components or steps well known in the art so as not to obscure the characteristics of the present disclosure.
Fig. 5 is a flow diagram of source file processing using the disclosed embodiment, and Fig. 6 depicts the forward computation of a network structure with two fully connected layers. The concrete executable file generation process is as follows:
(1) The source file is fed into the source file segmentation module, which divides the file into four parts according to its section markers: .code (the code segment), .static_rw, .static_ro, and .dynamic (the static read-write, static read-only, and dynamic segments, respectively).
(2) The .code part is fed into the neural network algorithm mapping module, where @block_fw_fc is mapped into concrete computation macros and memory-access macro statements. The result after mapping is shown in Fig. 7. The macro parameters fc1_out, fc1_inp, and fc1_weight replace the parameters in the block definition.
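The parameter replacement in step (2) can be sketched as follows: the @block_fw_fc call is expanded into its block definition, with the actual macro parameters fc1_out, fc1_inp, and fc1_weight substituted for the formal parameters. The body statements below are assumptions standing in for the computation and memory-access macros of Fig. 7.

```python
# Hypothetical definition of the fw_fc block: formal parameters plus a
# body of macro statements that reference them.
BLOCK_FW_FC_PARAMS = ["out", "inp", "weight"]
BLOCK_FW_FC_BODY = [
    "load_neuron {inp}",
    "load_synapse {weight}",
    "mlp_forward",
    "store_neuron {out}",
]

def expand_block(args):
    """Bind actual arguments to the formal parameters and substitute
    them into every body statement."""
    binding = dict(zip(BLOCK_FW_FC_PARAMS, args))
    return [stmt.format(**binding) for stmt in BLOCK_FW_FC_BODY]

expanded = expand_block(["fc1_out", "fc1_inp", "fc1_weight"])
assert expanded[0] == "load_neuron fc1_inp"
assert expanded[3] == "store_neuron fc1_out"
```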
(3) The program output in step (2) is fed into the parallel optimization module and optimized.
(4) The data segment (.static_rw, .static_ro, .dynamic) is fed into the data processing module. First, the data processing module computes the size of each datum; for example, fc1_inp contains 1024 numbers, i.e., 2048 bytes. It therefore allocates 2048 bytes to fc1_inp starting from relative address 0, advances the start address to 2048, and then allocates an address for the second datum, continuing until all data have been allocated. This address information is the output of the module.
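Step (4) can be sketched as a running-offset allocator. The 2 bytes per number matches the text's 1024 numbers = 2048 bytes; the second entry's name and size are invented for the example.

```python
BYTES_PER_NUMBER = 2   # matches 1024 numbers = 2048 bytes in the text

def allocate(declarations):
    """declarations: list of (name, num_elements).
    Returns name -> (relative start address, size in bytes)."""
    table = {}
    offset = 0
    for name, count in declarations:
        size = count * BYTES_PER_NUMBER
        table[name] = (offset, size)
        offset += size   # advance the start address past this datum
    return table

table = allocate([("fc1_inp", 1024), ("fc1_weight", 1024 * 512)])
assert table["fc1_inp"] == (0, 2048)   # starts at relative address 0
assert table["fc1_weight"][0] == 2048  # next datum starts at 2048
```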
(5) The data in the data segment are placed according to the segment information and the like in the data declarations.
(6) The address information output in step (4) and the optimized program output in step (3) are fed into the relocation module, which modifies the relative addresses into absolute addresses.
(7) The output program of step (6) is fed into the machine code generation module, which translates the program into a document form that the machine can understand, i.e., the final executable program.
In summary, the disclosed embodiments propose an executable program generating device and method for a neural network processor. Using this method and device, programmers can program neural network processors more efficiently.
In addition, the functional units in the embodiments of the present disclosure may be integrated into one processing unit, may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be realized in the form of hardware or in the form of a software program module.
If the integrated unit is realized in the form of a software program module and sold or used as an independent product, it can be stored in a computer-readable memory. Based on this understanding, the technical solution of the present disclosure, in essence, or the part of it that contributes to the existing technology, or all or part of the technical solution, can be embodied in the form of a software product. The software product is stored in a memory and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods in the embodiments of the present invention. The aforementioned memory includes media that can store program code, such as a USB flash drive, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), a removable hard disk, a magnetic disk, or an optical disc.
Each functional unit/module may be hardware; for example, the hardware may be a circuit, including digital circuits, analog circuits, and so on. Physical realizations of the hardware structure include, but are not limited to, physical devices, and the physical devices include, but are not limited to, transistors, memristors, and so on. The computation module in the computing device may be any appropriate hardware processor, such as a CPU, GPU, FPGA, DSP, or ASIC. The storage unit may be any appropriate magnetic or magneto-optical storage medium, such as RRAM, DRAM, SRAM, EDRAM, HBM, or HMC.
It should also be understood that in the present disclosure, references to, for example, "some embodiments", "an embodiment", or "one or more embodiments" indicate that a particular feature may be included in an implementation of the present disclosure. Similarly, it should be appreciated that in the description various features are sometimes grouped together in a single embodiment, figure, or description thereof to streamline the explanation and aid understanding of the various aspects of the present disclosure. This method of disclosure, however, is not to be interpreted as reflecting an intention that the present disclosure requires more features than are expressly recited in each claim.
The specific embodiments described above further explain in detail the purpose, technical solution, and beneficial effects of the present invention. It should be understood that the above are only specific embodiments of the present invention and are not intended to limit it; any modification, equivalent substitution, improvement, and the like made within the spirit and principles of the present invention shall be included within the scope of protection of the present invention.
Claims (15)
1. An executable program generating device for a neural network processor, characterized by comprising:
a source program segmentation module, which receives a source file as input, identifies and extracts the positions of the code segment and the data segment according to the format of the source file, and generates an intermediate file containing the code segment and an intermediate file containing the data segment;
a data processing module, which takes the intermediate file containing the data segment as input, handles the placement of data, and outputs memory allocation information and the data placement scheme;
a neural network algorithm mapping module, which takes the intermediate file containing the code segment as input, maps the neural network algorithms expressed as blocks in the code into an algorithm flow composed of macro statements, and then maps that flow into hardware-specific instructions.
2. The device according to claim 1, characterized in that it further comprises a parallel code generation module, which takes the hardware-specific instructions as input, performs parallelization processing and optimization on them, and outputs the optimized program.
3. The device according to claim 2, characterized in that it further comprises a relocation module, which takes the data placement scheme, the memory allocation information, and the optimized program as input, and replaces the relative addresses in the optimized program with absolute addresses.
4. The device according to claim 3, characterized in that it further comprises a machine code generation module, for translating the program relocated by the relocation module into a character string that the neural network processor can recognize.
5. The device according to claim 1, characterized in that the data processing module is also used to perform data partitioning, whereby the input and output data of each layer of the neural network are partitioned so that, after partitioning, they fit into the on-chip memory cells of the neural network processor.
6. The device according to claim 1, characterized in that the neural network algorithm mapping module comprises:
a computation division module, for dividing a large-scale computation into relatively small sub-computations;
an instruction mapping module, for mapping the algorithm into instructions from the neural network processor's instruction set.
7. The device according to claim 1, characterized in that the code segment contains the statements and blocks of the neural network algorithm, together with their corresponding definitions.
8. A method for generating an executable program using the device of any one of claims 1-7, characterized by comprising:
using the source program segmentation module, receiving a source file as input, identifying and extracting the positions of the code segment and the data segment according to the format of the source file, and generating an intermediate file containing the code segment and an intermediate file containing the data segment;
using the data processing module, taking the intermediate file containing the data segment as input, handling the placement of data, and outputting memory allocation information and the data placement scheme;
using the neural network algorithm mapping module, taking the intermediate file containing the code segment as input, mapping the neural network algorithms expressed as blocks in the code into an algorithm flow composed of macro statements, and then mapping that flow into hardware-specific instructions.
9. The method according to claim 8, characterized by further comprising:
using the parallel code generation module, taking the hardware-specific instructions as input, performing parallelization processing and optimization on them, and outputting the optimized program.
10. The method according to claim 9, further comprising:
using the relocation module, which takes the data placement scheme, the memory allocation information and the optimized program as input, and replaces the relative addresses in the optimized program with absolute addresses.
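A minimal sketch of the relocation step in claim 10: symbolic (relative) references in the optimized program are resolved to absolute addresses using the memory allocation information. The `@symbol+offset` operand syntax and the allocation table are illustrative assumptions.

```python
import re

def relocate(program, alloc):
    """Replace each @symbol+offset reference with its absolute address."""
    def resolve(match):
        symbol, offset = match.group(1), int(match.group(2) or 0)
        return str(alloc[symbol] + offset)   # absolute = base + relative offset
    return [re.sub(r"@(\w+)(?:\+(\d+))?", resolve, line) for line in program]

alloc = {"weights": 0x1000, "bias": 0x2000}    # memory allocation information
program = ["LOAD @weights+16", "LOAD @bias"]   # optimized program, relative form
absolute = relocate(program, alloc)
# absolute == ["LOAD 4112", "LOAD 8192"]
```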
11. The method according to claim 10, further comprising:
using the machine code generation module to translate the program relocated by the relocation module into character strings that the neural network processor can recognize.
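The machine-code generation step of claim 11 can be sketched as encoding each relocated instruction into a fixed-width bit string that a processor front end could recognize. The opcode table and the 4-bit-opcode / 12-bit-address format are illustrative assumptions.

```python
OPCODES = {"LOAD": 0b0001, "MACC": 0b0010, "STORE": 0b0011}

def encode(instr):
    """Encode 'OP address' into a 4-bit opcode + 12-bit address bit string."""
    op, addr = instr.split()
    return f"{OPCODES[op]:04b}{int(addr):012b}"   # 16-character '0'/'1' string

machine_code = [encode(i) for i in ["LOAD 272", "STORE 8"]]
# each entry is a 16-character binary string
```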
12. The method according to claim 10, further comprising:
performing data division using the data processing module, whereby each layer's input/output data of the neural network is partitioned and, after partitioning, placed into the on-chip storage unit of the neural network processor.
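The data-division step of claim 12 amounts to splitting each layer's data into pieces no larger than the on-chip storage unit. The 4 KB capacity, the float32 element size, and the layer size below are illustrative assumptions.

```python
ON_CHIP_BYTES = 4096  # assumed on-chip storage unit capacity

def divide_layer(num_values, bytes_per_value=4, capacity=ON_CHIP_BYTES):
    """Return (values_per_block, number_of_blocks) so each block fits on chip."""
    per_block = capacity // bytes_per_value   # values per on-chip block
    blocks = -(-num_values // per_block)      # ceiling division
    return per_block, blocks

# a layer with 10,000 float32 activations -> 1024-value blocks, 10 blocks
per_block, blocks = divide_layer(10_000)
```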
13. The method according to claim 8, wherein the code segment contains statements, blocks and corresponding definitions of the neural network algorithm.
14. The method according to claim 9, wherein the parallelization processing and optimization in the parallel code generation module comprise: adjusting the statement order by simulation and/or inference methods to improve the degree of parallelism.
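The reordering idea of claim 14 can be sketched with a simple dependency analysis (standing in for the patent's simulation and/or inference): statements are regrouped into waves of mutually independent statements that could issue in parallel. Statement and register names are illustrative.

```python
def schedule(statements):
    """Greedy list scheduling: emit waves of mutually independent statements."""
    remaining = list(statements)
    order = []
    done = set()                       # registers already produced
    while remaining:
        # a statement is ready once everything it reads has been written
        wave = [s for s in remaining if set(s["reads"]) <= done]
        if not wave:
            raise ValueError("cyclic dependency")
        for s in wave:
            remaining.remove(s)
            done.add(s["writes"])
        order.append([s["name"] for s in wave])
    return order

stmts = [
    {"name": "s1", "writes": "a", "reads": []},
    {"name": "s2", "writes": "b", "reads": ["a"]},
    {"name": "s3", "writes": "c", "reads": []},        # independent of s1/s2
    {"name": "s4", "writes": "d", "reads": ["b", "c"]},
]
waves = schedule(stmts)
# waves == [["s1", "s3"], ["s2"], ["s4"]]
```

Moving s3 next to s1 is exactly the kind of statement-order adjustment the claim describes: the program's result is unchanged, but independent work is grouped for parallel issue.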
15. The method according to claim 9, further comprising:
using the data processing module to divide neurons into multiple data blocks, storing the multiple data blocks sequentially in the storage unit, and, during computation, loading the data blocks into on-chip memory for further calculation.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810257595.7A CN110308909B (en) | 2018-03-27 | 2018-03-27 | Executable program generating device and method for neural network processor |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110308909A true CN110308909A (en) | 2019-10-08 |
CN110308909B CN110308909B (en) | 2023-08-01 |
Family
ID=68074163
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810257595.7A Active CN110308909B (en) | 2018-03-27 | 2018-03-27 | Executable program generating device and method for neural network processor |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110308909B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113031952A (en) * | 2019-12-25 | 2021-06-25 | 上海高德威智能交通系统有限公司 | Method and device for determining execution code of deep learning model and storage medium |
CN115098107A (en) * | 2022-06-21 | 2022-09-23 | 清华大学 | Code generation method and device of neural network task |
WO2022227869A1 (en) * | 2021-04-30 | 2022-11-03 | International Business Machines Corporation | Locate neural network performance hot spots |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6578020B1 (en) * | 1999-12-07 | 2003-06-10 | International Business Machines Corporation | Method and system for converting code to executable code using neural networks implemented in a very large scale integration (VLSI) integrated circuit |
US20040122785A1 (en) * | 2000-12-15 | 2004-06-24 | International Business Machines Corporation | Method, system, and program for converting application program code to executable code using neural networks based on characteristics of the inputs |
US6832214B1 (en) * | 1999-12-07 | 2004-12-14 | International Business Machines Corporation | Method, system, and program for converting code to executable code using neural networks implemented in a software program |
CN103282891A (en) * | 2010-08-16 | 2013-09-04 | 甲骨文国际公司 | System and method for effective caching using neural networks |
CN105740946A (en) * | 2015-07-29 | 2016-07-06 | 上海磁宇信息科技有限公司 | Method for realizing neural network calculation by using cell array computing system |
CN105989408A (en) * | 2015-03-18 | 2016-10-05 | 国际商业机器公司 | A system and a method for mapping a neural network onto a neurosynaptic substrate |
CN107239315A (en) * | 2017-04-11 | 2017-10-10 | 北京深鉴智能科技有限公司 | Towards the programming model of neutral net heterogeneous computing platforms |
US20170323224A1 (en) * | 2016-05-07 | 2017-11-09 | 1026 Labs, Inc. | Apparatus for hardware accelerated machine learning |
US20180018167A1 (en) * | 2016-07-15 | 2018-01-18 | Microsoft Technology Licensing, Llc | Transforming data manipulation code into data workflow |
Non-Patent Citations (0)
Non-Patent Citations (1)
Title |
---|
ZHANG JIYU ET AL.: "A basic block reordering method based on artificial neural networks", Acta Scientiarum Naturalium Universitatis Pekinensis (Journal of Peking University, Natural Science Edition) * |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113031952A (en) * | 2019-12-25 | 2021-06-25 | 上海高德威智能交通系统有限公司 | Method and device for determining execution code of deep learning model and storage medium |
WO2022227869A1 (en) * | 2021-04-30 | 2022-11-03 | International Business Machines Corporation | Locate neural network performance hot spots |
US11775317B2 (en) | 2021-04-30 | 2023-10-03 | International Business Machines Corporation | Locate neural network performance hot spots |
GB2622153A (en) * | 2021-04-30 | 2024-03-06 | Ibm | Locate neural network performance hot spots |
GB2622153B (en) * | 2021-04-30 | 2024-07-17 | Ibm | Locate neural network performance hot spots |
CN115098107A (en) * | 2022-06-21 | 2022-09-23 | 清华大学 | Code generation method and device of neural network task |
CN115098107B (en) * | 2022-06-21 | 2024-04-19 | 清华大学 | Code generation method and device for neural network task |
Also Published As
Publication number | Publication date |
---|---|
CN110308909B (en) | 2023-08-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111368996B (en) | Retraining projection network capable of transmitting natural language representation | |
Cuomo et al. | A GPU-accelerated parallel K-means algorithm | |
Misra et al. | Neural shift-reduce CCG semantic parsing | |
KR20180062321A (en) | Method for drawing word related keyword based on deep learning and computerprogram | |
CN110308909A (en) | Executable program generating device and method for neural network processor | |
CN105706092B (en) | The method and system of four values simulation | |
Sánchez-Karhunen et al. | Modelling complex market interactions using PDP systems | |
US11720788B2 (en) | Calculation scheme decision system, calculation scheme decision device, calculation scheme decision method, and storage medium | |
Antonelli et al. | Learning concurrently data and rule bases of Mamdani fuzzy rule-based systems by exploiting a novel interpretability index | |
CN114168154B (en) | Model data processing method and device, electronic equipment and storage medium | |
Glauner | Comparison of training methods for deep neural networks | |
CN114398899A (en) | Training method and device for pre-training language model, computer equipment and medium | |
CN116401502A (en) | Method and device for optimizing Winograd convolution based on NUMA system characteristics | |
Kokhazadeh et al. | A Design space exploration methodology for enabling tensor train decomposition in edge devices | |
Gibaja et al. | An ensemble-based approach for multi-view multi-label classification | |
Lima et al. | A grammar-based GP approach applied to the design of deep neural networks | |
CN114611714B (en) | Model processing method, device, system, electronic equipment and storage medium | |
CN109858027A (en) | A method for identifying and classifying four categories of internet e-commerce merchandise information | |
CN109241322A (en) | Code generating method, code generating unit and electronic equipment | |
Mezher | GFLibPy: an open-source python toolbox for genetic folding algorithm | |
CN115422357A (en) | Text classification method and device, computer equipment and storage medium | |
CN110308899B (en) | Language source program generation method and device for neural network processor | |
Rudi et al. | CodeFlow: A code generation system for Flash-X orchestration runtime | |
Chichin et al. | Capability to embed deep neural networks: Study on cpu processor in avionics context | |
Gabryel et al. | The bag-of-words method with dictionary analysis by evolutionary algorithm |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||