CN114428761A - Neural network music composition method and device based on FPGA - Google Patents

Neural network music composition method and device based on FPGA

Info

Publication number
CN114428761A
Authority
CN
China
Prior art keywords
neural network
module
fpga
hardware accelerator
chip
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210052457.1A
Other languages
Chinese (zh)
Inventor
凌味未
相博镪
赵良平
胡双
邹金成
李蠡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu University of Information Technology
Original Assignee
Chengdu University of Information Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu University of Information Technology
Priority to CN202210052457.1A
Publication of CN114428761A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00: Digital computers in general; Data processing equipment in general
    • G06F15/76: Architectures of general purpose stored program computers
    • G06F15/78: Architectures of general purpose stored program computers comprising a single central processing unit
    • G06F15/7807: System on chip, i.e. computer system on a single chip; System in package, i.e. computer system on one or more chips in a single package
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/06: Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063: Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H: ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00: Details of electrophonic musical instruments
    • G10H1/0033: Recording/reproducing or transmission of music for electrophonic musical instruments
    • G10H1/0041: Recording/reproducing or transmission of music for electrophonic musical instruments in coded form
    • G10H2210/00: Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/101: Music composition or musical creation; Tools or processes therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Acoustics & Sound (AREA)
  • Neurology (AREA)
  • Microelectronics & Electronic Packaging (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Stored Programmes (AREA)

Abstract

The invention discloses an FPGA-based neural network music composition method and device. The device comprises a display module, a control key module, an audio decoding module, and an FPGA module carrying a neural network hardware accelerator soft core. The FPGA module controls the other modules and performs the artificial intelligence data operations, and carries a system on chip comprising an instruction-set-architecture-based neural network hardware accelerator, a data scheduling module, and a memory. The neural network hardware accelerator performs operations according to a built neural network model; during operation, the data scheduling module moves model weights from the memory to the neural network hardware accelerator, and after an operation result is obtained, it moves the audio waveform data corresponding to the inferred musical notes from the memory to the audio decoding module for playback. The invention addresses the limitations of existing neural network music composition approaches in computing power and reconfigurability.

Description

Neural network music composition method and device based on FPGA
Technical Field
The invention relates to the technical field of neural network hardware acceleration, and in particular to a neural network music composition method and device based on an FPGA (field programmable gate array).
Background
To improve the efficiency of music creation and achieve more novel musical effects, algorithmic composition has been applied to computer-aided composition systems to various degrees; genetic algorithms, artificial neural networks, Markov chains, and hybrid algorithms are the most widely used approaches. With the development of artificial intelligence technology in recent years, artificial neural networks have become common in music application systems. At present, companies such as AIVA, Google, and NetEase in China can achieve high-quality artificial intelligence audio processing and music creation on the server side; on the edge computing side, given cost and application-scenario constraints, trade-offs are often required among computing power, power consumption, and reconfigurability, leaving considerable room for improvement. Meanwhile, artificial intelligence composition models usually take a recurrent neural network as their core, whereas most existing neural network accelerators target convolutional neural networks and lack targeted optimization for recurrent neural networks.
Disclosure of Invention
To solve these problems, the invention provides an FPGA-based neural network music composition method and device that address the limitations of existing neural network composition approaches in computing power and reconfigurability. The following technical scheme is adopted:
an FPGA-based neural network warping device, comprising:
the display module is used for displaying playback status information;
the control key module is used for selecting different playback modes;
the audio decoding module is used for playing music automatically generated through artificial intelligence computation;
the FPGA module carries a neural network hardware accelerator soft core and is used for controlling the display module, the control key module, and the audio decoding module, and for performing the artificial intelligence data operations. The FPGA module carries a system on chip comprising an instruction-set-architecture-based neural network hardware accelerator, a data scheduling module, and a memory. The neural network hardware accelerator performs operations according to a built neural network model; during operation, the data scheduling module moves model weights from the memory to the neural network hardware accelerator, and after an operation result is obtained, the audio waveform data corresponding to the inferred musical notes are moved from the memory to the audio decoding module for playback.
Furthermore, the system on chip uses a soft-core CPU as the controller, uses a DDR3 SDRAM and a TF card as memories, and implements audio decoding through an audio CODEC chip; it carries a UART, a serial peripheral interface (SPI), I2C, I2S, and a DDR3 SDRAM controller, all connected by an AHB bus.
Further, the system on chip communicates with an upper computer and prints log information through the UART interface, reads and writes the TF card through the SPI interface, and configures and transmits data to the audio CODEC chip through the I2C and I2S interfaces.
Further, the system on chip obtains from the TF card the data produced by training the neural network model and the audio waveform data corresponding to the musical notes; these are read into the DDR3 SDRAM when the system on chip starts. The different types of weights in the neural network model are stored at addresses in the TF card and the DDR3 SDRAM according to fixed rules, and the user program controls the neural network hardware accelerator according to these addresses to move the weights.
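As an illustration of this address-rule scheme, the sketch below shows how a user program might derive weight addresses; the patent publishes no memory map, so every base address, stride, offset, and name here is a hypothetical assumption:

    /* Hypothetical weight layout: each GRU layer's weight blocks sit at
     * rule-derived offsets from an assumed DDR3 SDRAM base address. */
    #include <stdint.h>

    #define DDR3_WEIGHT_BASE 0x02000000u   /* assumed start of weight region */
    #define GRU_LAYER_STRIDE 0x00100000u   /* assumed spacing between layers */

    typedef struct {
        uint32_t w_ih;   /* input-to-hidden weights (r, z, h gates) */
        uint32_t w_hh;   /* hidden-to-hidden weights                */
        uint32_t bias;   /* gate biases                             */
    } gru_layer_addr_t;

    /* Derive one layer's weight addresses from the fixed layout rule. */
    static gru_layer_addr_t gru_layer_addresses(uint32_t layer)
    {
        uint32_t base = DDR3_WEIGHT_BASE + layer * GRU_LAYER_STRIDE;
        gru_layer_addr_t a = { base, base + 0x40000u, base + 0x80000u };
        return a;
    }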
Furthermore, the neural network hardware accelerator can receive instructions from the user program through CPU writes to bus-mapped registers, and according to these instructions perform memory read/write access operations, on-chip dedicated cache operations, and compute resource module operations.
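A minimal sketch of this register-driven control path follows; the register offsets, command codes, and polling protocol are assumptions for illustration, not the patent's actual interface:

    #include <stdint.h>

    #define NNA_BASE     0x40010000u      /* assumed AHB-mapped base address */
    #define NNA_REG_CMD  (*(volatile uint32_t *)(NNA_BASE + 0x0))
    #define NNA_REG_ARG  (*(volatile uint32_t *)(NNA_BASE + 0x4))
    #define NNA_REG_STAT (*(volatile uint32_t *)(NNA_BASE + 0x8))

    enum { NNA_CMD_LOAD_WEIGHT = 1, NNA_CMD_MATVEC = 2, NNA_CMD_ACTIVATE = 3 };

    /* Write one instruction into the accelerator's bus registers and
     * busy-wait on the status register until the done bit is set. */
    static void nna_issue(uint32_t cmd, uint32_t arg)
    {
        NNA_REG_ARG = arg;                /* e.g. a DDR3 SDRAM source address */
        NNA_REG_CMD = cmd;
        while ((NNA_REG_STAT & 0x1u) == 0) { /* poll until done */ }
    }

The data flow scheduling module described next can be driven in the same way, through its own block of bus-mapped registers.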
Further, the data flow scheduling module can receive instructions from the user program through CPU writes to bus-mapped registers, and according to these instructions configure data scheduling among the off-chip memory, the on-chip modules, and the on-chip caches.
Furthermore, the neural network hardware accelerator takes multi-path parallel multipliers as its compute core, with multiple distributed buffers temporarily storing intermediate operation data; the activation functions are implemented in hardware with piecewise function fitting, exploiting the symmetry of sigmoid and tanh.
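A sketch of this activation strategy, using the well-known PLAN piecewise-linear fit as a stand-in (the patent's own segment boundaries are not published): the function is evaluated on |x| only, negative inputs are recovered from the symmetry sigmoid(-x) = 1 - sigmoid(x), and tanh reuses the same fit via tanh(x) = 2*sigmoid(2x) - 1.

    /* Piecewise-linear sigmoid (PLAN approximation) exploiting symmetry. */
    float sigmoid_pw(float x)
    {
        float ax = x < 0.0f ? -x : x;            /* evaluate on |x| only */
        float y;
        if      (ax < 1.0f)   y = 0.25f    * ax + 0.5f;
        else if (ax < 2.375f) y = 0.125f   * ax + 0.625f;
        else if (ax < 5.0f)   y = 0.03125f * ax + 0.84375f;
        else                  y = 1.0f;          /* saturated region     */
        return x < 0.0f ? 1.0f - y : y;          /* apply the symmetry   */
    }

    /* tanh from the same fit via its relation to sigmoid. */
    float tanh_pw(float x) { return 2.0f * sigmoid_pw(2.0f * x) - 1.0f; }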
Further, the neural network hardware accelerator generates pseudo-random numbers with a linear feedback shift register (LFSR) and implements the exponential operation with a lookup table over multiple intervals; together these provide a hardware implementation of SOFTMAX.
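For the pseudo-random source, a 16-bit Fibonacci LFSR is one representative choice; the patent states neither the register width nor the taps, so the maximal-length polynomial below (x^16 + x^14 + x^13 + x^11 + 1) is an assumption:

    #include <stdint.h>

    static uint16_t lfsr_state = 0xACE1u;        /* any nonzero seed works */

    /* Advance the LFSR one step: XOR the tap bits into a new MSB. */
    static uint16_t lfsr_next(void)
    {
        uint16_t bit = ((lfsr_state >> 0) ^ (lfsr_state >> 2) ^
                        (lfsr_state >> 3) ^ (lfsr_state >> 5)) & 1u;
        lfsr_state = (uint16_t)((lfsr_state >> 1) | (bit << 15));
        return lfsr_state;
    }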
Further, the neural network model is built as follows: a word embedding model is configured through software programming to encode the musical notes, and the network is built from three GRU layers, one fully connected layer, and SOFTMAX.
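The patent does not write out its GRU gate equations; for reference, one standard formulation is given below (a common convention, assumed here). The reset gate r_t and update gate z_t are what the dedicated buffers Data_r and Data_z of the detailed description hold:

    r_t = \sigma(W_r x_t + U_r h_{t-1} + b_r)
    z_t = \sigma(W_z x_t + U_z h_{t-1} + b_z)
    \tilde{h}_t = \tanh\big(W_h x_t + U_h (r_t \odot h_{t-1}) + b_h\big)
    h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t

Here \sigma is the sigmoid and \odot is element-wise multiplication, which is why the accelerator's activation hardware covers exactly sigmoid and tanh.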
An FPGA-based neural network music composition method comprises the following steps:
s1, the neural network hardware accelerator performs operation according to a built neural network model, and in the operation process, the data scheduling module moves model weight from the memory to the neural network hardware accelerator for operation;
and S2, after the operation result is obtained, the data scheduling module moves the audio waveform data corresponding to the musical note obtained by inference from the memory to the audio decoding module for playing.
The invention has the beneficial effects that:
(1) The real-time neural network music composition method and device can accommodate, at the edge computing side, various artificial intelligence neural networks built around a recurrent neural network core, and the integrated hardware accelerator improves operation performance over a general-purpose processor.
(2) To deploy at the edge computing side a neural network composition device with low power consumption that still guarantees computing power and reconfigurability, the invention is designed on an FPGA hardware platform: a soft-core CPU serves as the main controller, a neural network hardware accelerator is responsible for computation, and several peripheral controller modules are integrated for user interaction.
(3) To accelerate neural network composition more efficiently, the neural network hardware accelerator is designed as the core. The accelerator is built on a single-instruction multiple-data instruction set optimized for recurrent neural networks, so that while guaranteeing operation speed it can adapt to recurrent neural networks of different layer counts and dimensions and to their derivative models.
Drawings
FIG. 1 is a schematic structural diagram of a neural network music composition device according to the present invention;
FIG. 2 is a schematic diagram of a system-on-chip architecture of the FPGA platform of the present invention;
FIG. 3 is a functional diagram of the data flow scheduling of the present invention;
FIG. 4 is a schematic diagram of real-time playback audio waveform data according to the present invention;
FIG. 5 is a schematic diagram of a user program deployment of the present invention;
FIG. 6 is a diagram of a neural network hardware accelerator architecture of the present invention.
Detailed Description
In order to more clearly understand the technical features, objects, and effects of the present invention, specific embodiments of the present invention will now be described. It should be understood that the detailed description and specific examples, while indicating the preferred embodiment of the invention, are intended for purposes of illustration only and are not intended to limit the scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, the present embodiment provides an FPGA-based neural network music composition method and device. The composition device includes an upper computer interface 1, a control key module 2, a display module 3, a TF card 4 (Trans-flash Card), an FPGA module 5 carrying a neural network hardware accelerator soft core, an audio decoding module 6, a DDR3 SDRAM 7, a debugging module 8, and a power management module 9. In addition, the display module 3 is connected to a display screen 10, the audio decoding module 6 is connected to a loudspeaker 11, and the power management module 9 is connected to a lithium battery.
In this embodiment, the system on chip is implemented on an FPGA hardware platform. Referring to fig. 2, a soft-core CPU 13 serves as the controller core, and AHB and APB buses carry the peripheral modules. The parameters obtained by neural network pre-training and the audio waveform data corresponding to the tones are stored in the TF card 4; when the system starts, the data scheduling module moves the data from the TF card 4 to the DDR3 SDRAM 7. During the operation of the neural network hardware accelerator 27, the data scheduling module moves the model weights from the DDR3 SDRAM 7 to the neural network hardware accelerator 27, and after an operation result is obtained, moves the audio data corresponding to the inferred musical notes from the DDR3 SDRAM 7 to the audio decoding module 6, with reference to figs. 3 and 4.
In this embodiment, developers are supported in adapting the neural network hardware accelerator 27 to recurrent neural network models of different scales through software programming. Referring to fig. 5, when the whole system is reset, the program first declares the variables required for post-processing, then initializes the UART (universal asynchronous receiver-transmitter) 26, the TIMER 25, the GPIO (general-purpose input/output port) 15, the SPI (serial peripheral interface) 16, the audio decoding module 6, the DDR3 SDRAM controller module 19, and the neural network hardware accelerator 27, and finally prints a welcome message over the serial port. After every module is initialized, the preparation stage before operation begins: the CPU 13 reads the neural network model parameter data and the sound source corresponding to each note from the TF card 4 and writes them into the DDR3 SDRAM 7, so that the data can be read at high speed in the subsequent operating state. Once the data are prepared, the program gives the neural network an initial value and operation begins. In this program the neural network hardware accelerator 27 does not run continuously; it serves the data playback function. The program polls the status register of the playback module, and when the data in the play buffer fall below a threshold it starts the neural network hardware accelerator 27 and, once the result is obtained, places the data to be played into the buffer of the audio decoding module 6, so that the whole system plays in real time.
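Condensed into C, the run loop just described might look as follows; the status register address and the threshold are hypothetical, and nna_run_one_step / audio_enqueue_note stand in for the accelerator-start and buffer-fill routines:

    #include <stdint.h>

    #define PLAY_REG_LEVEL (*(volatile uint32_t *)0x40020008u) /* assumed */
    #define PLAY_THRESHOLD 256u                                /* assumed */

    extern uint32_t nna_run_one_step(void);            /* returns inferred note */
    extern void     audio_enqueue_note(uint32_t note); /* DDR3 -> CODEC buffer  */

    void compose_and_play_forever(void)
    {
        for (;;) {
            if (PLAY_REG_LEVEL < PLAY_THRESHOLD) {
                uint32_t note = nna_run_one_step(); /* accelerator runs on demand */
                audio_enqueue_note(note);
            }
            /* otherwise keep polling; the accelerator stays idle */
        }
    }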
The architecture of the neural network hardware accelerator 27 of this embodiment is shown in fig. 6, and its workflow is as follows. First, the CPU 13 notifies the neural network hardware accelerator 27 of the SDRAM address of the word embedding vector corresponding to this round's input value, and the accelerator fetches it into the buffer Data_in 33. The biases and weights required for the current computation are extracted in turn into their corresponding buffers, and the controller performs parallel multiply-add operations in the mode selected by the instruction. All operation results are gathered into the buffer Data_out 34; in certain special operations, the buffers Data_tmp0 and Data_tmp1 automatically copy the operation results of Data_out 34. The buffers Data_r 28 and Data_z 29 are dedicated buffers required by the GRU computation flow, and Data_h 32 is the dedicated hidden-layer buffer, whose contents are updated after each GRU layer finishes computing. After both the GRU and the output layer have been computed for the current time step, the final hardware stage begins: each element of the result vector undergoes the exponential operation, the results are treated as the probabilities of the corresponding indices, a sample is drawn from the resulting multinomial distribution, the index of the drawn sample is taken as the hardware-side final result for the current time step, and the CPU 13 is notified by an interrupt that this time step's operation is complete.
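The final sampling stage can be pictured with the floating-point sketch below; the hardware replaces expf with the interval lookup table and rand with the LFSR from the earlier sketches, and all names here are illustrative:

    #include <math.h>
    #include <stddef.h>
    #include <stdlib.h>

    /* Exponentiate the output vector, treat it as an (unnormalized)
     * multinomial distribution, and draw one index by inverse CDF. */
    size_t sample_note(const float *logits, size_t n)
    {
        float probs[128];                /* assumes n <= 128 candidate notes */
        float sum = 0.0f;
        for (size_t i = 0; i < n; i++) {
            probs[i] = expf(logits[i]);  /* hardware: interval LUT           */
            sum += probs[i];
        }
        float r = sum * ((float)rand() / (float)RAND_MAX); /* hardware: LFSR */
        float acc = 0.0f;
        for (size_t i = 0; i < n; i++) {
            acc += probs[i];
            if (r <= acc) return i;
        }
        return n - 1;                    /* guard against rounding           */
    }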
The foregoing is illustrative of the preferred embodiments of this invention, and the invention is not limited to the precise form disclosed herein; various other combinations, modifications, and environments may be resorted to within the scope of the concept disclosed herein, as described above or as apparent to those skilled in the relevant art. Modifications and variations may be effected by those skilled in the art without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. An FPGA-based neural network music composition device, characterized in that it comprises:
the display module is used for displaying playback status information;
the control key module is used for selecting different playback modes;
the audio decoding module is used for playing music automatically generated through artificial intelligence computation;
the FPGA module carries a neural network hardware accelerator soft core and is used for controlling the display module, the control key module, and the audio decoding module, and for performing the artificial intelligence data operations; the FPGA module carries a system on chip comprising an instruction-set-architecture-based neural network hardware accelerator, a data scheduling module, and a memory; the neural network hardware accelerator performs operations according to a built neural network model; during operation, the data scheduling module moves model weights from the memory to the neural network hardware accelerator, and after an operation result is obtained, the audio waveform data corresponding to the inferred musical notes are moved from the memory to the audio decoding module for playback.
2. The FPGA-based neural network music composition device according to claim 1, wherein the system on chip uses a soft-core CPU as the controller, uses a DDR3 SDRAM and a TF card as memories, implements audio decoding through an audio CODEC chip, and carries a UART, a serial peripheral interface (SPI), I2C, I2S, and a DDR3 SDRAM controller, connected by an AHB bus.
3. The FPGA-based neural network music composition device according to claim 2, wherein the system on chip communicates with an upper computer and prints log information through the UART interface, reads and writes the TF card through the SPI interface, and configures and transmits data to the audio CODEC chip through the I2C and I2S interfaces.
4. The FPGA-based neural network music composition device according to claim 2, wherein the system on chip obtains from the TF card the data produced by training the neural network model and the audio waveform data corresponding to the musical notes, which are read into the DDR3 SDRAM when the system on chip starts; the different types of weights in the neural network model are stored at addresses in the TF card and the DDR3 SDRAM according to fixed rules, and the user program controls the neural network hardware accelerator according to these addresses to move the weights.
5. The FPGA-based neural network music composition device according to claim 2, wherein the neural network hardware accelerator is capable of receiving instructions from the user program through CPU writes to bus-mapped registers, and performing memory read/write access operations, on-chip dedicated cache operations, and compute resource module operations according to the instructions.
6. The FPGA-based neural network music composition device according to claim 2, wherein the data flow scheduling module is capable of receiving instructions from the user program through CPU writes to bus-mapped registers, and configuring data scheduling among the off-chip memory, the on-chip modules, and the on-chip caches according to the instructions.
7. The FPGA-based neural network music composition device according to claim 2, wherein the neural network hardware accelerator takes multi-path parallel multipliers as its compute core, with multiple distributed buffers temporarily storing intermediate operation data; the activation functions are implemented in hardware with piecewise function fitting, exploiting the symmetry of sigmoid and tanh.
8. The FPGA-based neural network music composition device according to claim 2, wherein the neural network hardware accelerator generates pseudo-random numbers with a linear feedback shift register (LFSR) and implements the exponential operation with a lookup table over multiple intervals, together providing a hardware implementation of SOFTMAX.
9. The FPGA-based neural network music composition device according to claim 1, wherein the neural network model is built as follows: a word embedding model is configured through software programming to encode the musical notes, and the network is built from three GRU layers, one fully connected layer, and SOFTMAX.
10. An FPGA-based neural network music composition method, applied to the FPGA-based neural network music composition device of any one of claims 1-9, characterized by comprising the following steps:
S1, the neural network hardware accelerator performs operations according to a built neural network model; during operation, the data scheduling module moves model weights from the memory to the neural network hardware accelerator;
S2, after the operation result is obtained, the data scheduling module moves the audio waveform data corresponding to the inferred musical notes from the memory to the audio decoding module for playback.
CN202210052457.1A 2022-01-18 2022-01-18 Neural network music composition method and device based on FPGA Pending CN114428761A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210052457.1A CN114428761A (en) 2022-01-18 2022-01-18 Neural network music composition method and device based on FPGA

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210052457.1A CN114428761A (en) 2022-01-18 2022-01-18 Neural network music composition method and device based on FPGA

Publications (1)

Publication Number Publication Date
CN114428761A true CN114428761A (en) 2022-05-03

Family

ID=81313015

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210052457.1A Pending CN114428761A (en) 2022-01-18 2022-01-18 Neural network music composition method and device based on FPGA

Country Status (1)

Country Link
CN (1) CN114428761A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117271434A (en) * 2023-11-15 2023-12-22 成都维德青云电子有限公司 On-site programmable system-in-chip
CN117271434B (en) * 2023-11-15 2024-02-09 成都维德青云电子有限公司 On-site programmable system-in-chip

Similar Documents

Publication Publication Date Title
WO2018171717A1 (en) Automated design method and system for neural network processor
CN110310628B (en) Method, device and equipment for optimizing wake-up model and storage medium
CN107346351A (en) For designing FPGA method and system based on the hardware requirement defined in source code
CN101027633B (en) An apparatus and method for address generation using a hybrid adder
CN101231589B (en) System and method for developing embedded software in-situ
US8725486B2 (en) Apparatus and method for simulating a reconfigurable processor
CN103870335B (en) System and method for efficient resource management of signal flow programmed digital signal processor code
CN109146067A (en) A kind of Policy convolutional neural networks accelerator based on FPGA
CN113313247B (en) Operation method of sparse neural network based on data flow architecture
CN104915427B (en) A kind of figure processing optimization method based on breadth first traversal
CN113222133A (en) FPGA-based compressed LSTM accelerator and acceleration method
CN111563582A (en) Method for realizing and optimizing accelerated convolution neural network on FPGA (field programmable Gate array)
Fujii et al. A threshold neuron pruning for a binarized deep neural network on an FPGA
CN114428761A (en) Neural network music composition method and device based on FPGA
Vipin ZyNet: Automating deep neural network implementation on low-cost reconfigurable edge computing platforms
CN108805277A (en) Depth belief network based on more FPGA accelerates platform and its design method
CN116126354A (en) Model deployment method, device, electronic equipment and storage medium
WO2021031137A1 (en) Artificial intelligence application development system, computer device and storage medium
Hou et al. Architecting efficient multi-modal aiot systems
He et al. Design and Implementation of Embedded Real‐Time English Speech Recognition System Based on Big Data Analysis
Sen et al. Dataflow-based mapping of computer vision algorithms onto FPGAs
EP2956874A2 (en) Device and method for accelerating the update phase of a simulation kernel
Bai et al. An OpenCL-based FPGA accelerator with the Winograd’s minimal filtering algorithm for convolution neuron networks
CN113806431A (en) Method for transmitting simulation data, electronic system and storage medium
El-Shafei et al. Implementation of harmony search on embedded platform

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination