CN111401541A - Data transmission control method and device

Data transmission control method and device

Info

Publication number
CN111401541A
CN111401541A (application CN202010162763.1A)
Authority
CN
China
Prior art keywords
processor
data
neural network
network acceleration
external
Prior art date
Legal status
Withdrawn
Application number
CN202010162763.1A
Other languages
Chinese (zh)
Inventor
陈子荷
唐明华
袁涛
赵修齐
马爱永
王宏利
Current Assignee
Hunan Goke Microelectronics Co Ltd
Original Assignee
Hunan Goke Microelectronics Co Ltd
Priority date
2020-03-10
Filing date
2020-03-10
Publication date
2020-07-10
Application filed by Hunan Goke Microelectronics Co Ltd
Priority to CN202010162763.1A
Publication of CN111401541A

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/06 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N 3/063 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 - Arrangements for program control, e.g. control units
    • G06F 9/06 - Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/30 - Arrangements for executing machine instructions, e.g. instruction decode
    • G06F 9/38 - Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F 9/3877 - Concurrent instruction execution, e.g. pipeline or look ahead using a slave processor, e.g. coprocessor

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Neurology (AREA)
  • Multi Processors (AREA)

Abstract

The invention discloses a data transmission control method and device, wherein the method comprises the following steps: a processor sends a data request command; a direct memory access controller receives and processes the data request command sent by the processor and determines a data reading rule; the direct memory access controller reads the data requested by the processor from a static random access memory according to the data reading rule and sends that data to the processor; the direct memory access controller is configured to receive and process data request commands sent by a plurality of processors at the same time and to determine the corresponding data reading rules; the processor receives the requested data and begins its operation. The invention enables different processors to read data in parallel from the static random access memory inside the neural network acceleration processor, saving data-transfer wait time during operation.

Description

Data transmission control method and device
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a data transmission control method and apparatus.
Background
A neural network is a mathematical model that simulates the behavioral characteristics of biological neural networks and performs distributed, parallel information processing; depending on the complexity of the system, it processes information by adjusting the interconnections among a large number of internal nodes. A neural network accelerator (NNA) is a module that performs the computing tasks arising in artificial-intelligence application scenarios. The computational complexity of a neural network model is proportional to the size of the input data, and as artificial-intelligence applications grow broader, the amount of data to be processed grows ever larger.
Existing neural network accelerators do not support parallel operation: after the NNA processor finishes its computation, the data must be moved outside to be computed by other HOST processors, and each data transfer can only proceed after the result of the previous computation step is available. Once the data volume is large, the data-transfer wait time during operation becomes too long and computing efficiency is low.
Disclosure of Invention
In order to solve the above technical problems, the present invention provides a data transmission control method and apparatus that enable different processors to read data in parallel from the static random access memory inside a neural network acceleration processor, saving data-transfer wait time during operation.
One aspect of the present invention provides a data transmission control method, including:
the processor sends a data request command;
the direct memory access controller receives and processes the data request command sent by the processor, and determines a data reading rule; the direct memory access controller reads the data requested by the processor from a static random access memory according to the data reading rule and sends the data requested by the processor to the processor; the direct memory access controller is configured to receive and process data request commands sent by a plurality of processors at the same time, and determine corresponding data reading rules;
the processor receives the requested data and begins its operation.
Preferably, the processor comprises a neural network acceleration processor and/or an external processor.
Preferably, wherein:
when the processor comprises a neural network acceleration processor, the direct memory access controller receives and processes a data request command sent by the neural network acceleration processor, and determines a data reading address of the neural network acceleration processor; the direct memory access controller reads the data requested by the neural network acceleration processor from a static random access memory according to the data reading address of the neural network acceleration processor and sends the data requested by the neural network acceleration processor to the neural network acceleration processor;
when the processor comprises an external processor, the direct memory access controller receives and processes a data request command sent by the external processor, and determines a data reading address of the external processor; the direct memory access controller reads the data requested by the external processor from a static random access memory according to the data reading address of the external processor and sends the data requested by the external processor to the external processor;
when the processor comprises a neural network acceleration processor and an external processor, the direct memory access controller receives and processes a data request command sent by the neural network acceleration processor and a data request command sent by the external processor, and determines a data reading address of the neural network acceleration processor and a data reading address of the external processor; and the direct memory access controller reads the data requested by the neural network acceleration processor from a static random access memory according to the data reading address of the neural network acceleration processor and reads the data requested by the external processor from the static random access memory according to the data reading address of the external processor, and sends the data requested by the neural network acceleration processor to the neural network acceleration processor and sends the data requested by the external processor to the external processor.
Preferably, when the processor includes a neural network acceleration processor and an external processor, the method further includes:
the direct memory access controller judges whether the data reading address of the neural network acceleration processor is the same as the data reading address of the external processor; wherein,
when the data reading address of the neural network acceleration processor is different from the data reading address of the external processor, the direct memory access controller simultaneously reads data from the static random access memory;
and when the data reading address of the neural network acceleration processor is the same as the data reading address of the external processor, the direct memory access controller determines the priority of the data requested by the neural network acceleration processor and the priority of the data requested by the external processor, and sequentially reads the data from the static random access memory according to the priority.
Preferably, when the processor includes a neural network acceleration processor and an external processor, the processor receives the requested data and starts to operate, specifically:
the neural network acceleration processor and the external processor respectively receive the requested data and start parallel operation.
Preferably, the sending of the data requested by the external processor to the external processor specifically includes:
and sending the data requested by the external processor to the external processor through a network-on-chip bus.
Another aspect of the invention provides a data transmission control device, which comprises a processor, a direct memory access controller, and a static random access memory;
the processor is used for sending a data request command;
the direct memory access controller is used for receiving and processing a data request command sent by the processor and determining a data reading rule; the direct memory access controller is also used for reading the data requested by the processor from a static random access memory according to the data reading rule and sending the data requested by the processor to the processor; the direct memory access controller is configured to receive and process data request commands sent by a plurality of processors at the same time, and determine corresponding data reading rules;
the processor is also configured to receive the requested data and begin its operation.
Preferably, the processor comprises a neural network acceleration processor and/or an external processor.
Preferably, the direct memory access controller is specifically configured to:
when the processor comprises a neural network acceleration processor, receiving and processing a data request command sent by the neural network acceleration processor, and determining a data reading address of the neural network acceleration processor; reading the data requested by the neural network acceleration processor from a static random access memory according to the data reading address of the neural network acceleration processor, and sending the data requested by the neural network acceleration processor to the neural network acceleration processor;
when the processor comprises an external processor, receiving and processing a data request command sent by the external processor, and determining a data reading address of the external processor; reading the data requested by the external processor from a static random access memory according to the data reading address of the external processor, and sending the data requested by the external processor to the external processor;
when the processor comprises a neural network acceleration processor and an external processor, receiving and processing a data request command sent by the neural network acceleration processor and a data request command sent by the external processor, and determining a data reading address of the neural network acceleration processor and a data reading address of the external processor; and reading the data requested by the neural network acceleration processor from a static random access memory according to the data reading address of the neural network acceleration processor and the data requested by the external processor from the static random access memory according to the data reading address of the external processor, and sending the data requested by the neural network acceleration processor to the neural network acceleration processor and sending the data requested by the external processor to the external processor.
Preferably, when the processor comprises a neural network acceleration processor and an external processor, the direct memory access controller is further configured to determine whether a data read address of the neural network acceleration processor is the same as a data read address of the external processor; wherein,
when the data reading address of the neural network acceleration processor is different from the data reading address of the external processor, the direct memory access controller simultaneously reads data from the static random access memory;
and when the data reading address of the neural network acceleration processor is the same as the data reading address of the external processor, the direct memory access controller determines the priority of the data requested by the neural network acceleration processor and the priority of the data requested by the external processor, and sequentially reads the data from the static random access memory according to the priority.
Preferably, when the processor includes a neural network acceleration processor and an external processor, the processor receives the requested data and starts to operate, specifically:
the neural network acceleration processor and the external processor respectively receive the requested data and start parallel operation.
Preferably, the direct memory access controller sends the data requested by the external processor to the external processor, specifically:
and the direct memory access controller sends the data requested by the external processor to the external processor through an on-chip network bus.
The invention has at least the following beneficial effects:
the direct memory access controller embedded in the neural network acceleration processor is configured to simultaneously receive and process data request commands sent by a plurality of processors and determine corresponding data reading rules, so that different processors can read requested data from the static random access memory in the neural network acceleration processor in parallel through the direct memory access controller, thereby accelerating parallel operation of different processors and saving the waiting time of data transmission during operation.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description show only some of the embodiments described in the present application, and that other drawings can be obtained from them by those skilled in the art without creative effort.
Fig. 1 is a schematic flow chart of a data transmission control method according to an embodiment of the present invention;
fig. 2 is a schematic structural diagram of a data transmission control apparatus according to an embodiment of the present invention.
Detailed Description
The core of the invention is to provide a data transmission control method and device in which the direct memory access controller is configured to receive and process data request commands sent by a plurality of processors at the same time and to determine the corresponding data reading rules. Different processors can then read the requested data in parallel from the static random access memory through the direct memory access controller, which accelerates their parallel operation.
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
An embodiment of the present invention provides a data transmission control method in one aspect, please refer to fig. 1, where the data transmission control method includes:
step S110, the processor sends a data request command.
In the embodiment of the present invention, when different processors need to access the Static Random Access Memory (SRAM) inside the Neural Network Accelerator (NNA), each may issue its own data request command.
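For illustration only (this sketch is not part of the original disclosure), such a data request command can be pictured in C roughly as follows; the type and field names are assumptions introduced here for clarity:

    #include <stdint.h>

    /* Hypothetical layout of a data request command; the field names are
     * illustrative assumptions, not taken from the patent text. */
    typedef enum { REQ_SRC_NNA = 0, REQ_SRC_HOST = 1 } req_source_t;

    typedef struct {
        req_source_t source; /* which processor issued the request      */
        uint32_t     addr;   /* read address inside the SRAM of the NNA */
        uint32_t     len;    /* number of bytes requested               */
    } data_request_cmd_t;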
Step S120, the direct memory access controller receives and processes the data request command sent by the processor, and determines a data reading rule; the direct memory access controller reads the data requested by the processor from the static random access memory according to the data reading rule and sends the data requested by the processor to the processor; the direct memory access controller is configured to receive and process data request commands sent by a plurality of processors at the same time, and determine corresponding data reading rules.
In the embodiment of the invention, the NNA's Direct Memory Access (DMA) controller is configured to receive and process data request commands sent by a plurality of processors simultaneously and to determine the corresponding data reading rules, so that different processors can read the requested data from the SRAM in parallel through the DMA controller. The DMA controller in the NNA can be configured through software to accept data access requests from several processors at once, which also makes the whole process convenient to debug.
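As a rough software model of this behavior (an assumption-laden sketch, not the patented hardware), the DMA front end can be pictured as latching one outstanding command per processor port at the same time, reusing the data_request_cmd_t type from the sketch above:

    /* Sketch of a DMA front end holding one pending request per
     * processor port simultaneously; NUM_PORTS and all names are
     * assumptions made for illustration. */
    #define NUM_PORTS 2 /* e.g. port 0 = NNA processor, port 1 = Host */

    typedef struct {
        data_request_cmd_t cmd[NUM_PORTS];     /* latched commands    */
        int                pending[NUM_PORTS]; /* request-valid flags */
    } dma_frontend_t;

    /* Accept a command on a port; a data reading rule is then derived
     * for each latched command. */
    void dma_accept(dma_frontend_t *fe, int port, data_request_cmd_t c)
    {
        fe->cmd[port]     = c;
        fe->pending[port] = 1;
    }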
In step S130, the processor receives the requested data and starts its operation.
In the embodiment of the invention, each processor can start its operation as soon as it receives the requested data, so the parallel operation of different processors is accelerated and the data-transfer wait time during processor operation is saved.
As can be seen from the above, in the data transmission control method provided by the embodiment of the present invention, the DMA controller inside the NNA is configured to receive and process the data request commands sent by multiple processors at the same time and to determine the corresponding data reading rules, so that different processors can read the requested data from the SRAM inside the NNA in parallel through the DMA controller.
In particular implementations, the processor may include a Neural Network Accelerator (NNA) processor and/or an external Host processor. Specifically, the NNA processor can access the internal SRAM on its own, in which case the data request commands received and processed by the DMA controller come only from the NNA processor; the external Host processor can likewise access the SRAM inside the NNA on its own, in which case the data request commands received and processed by the DMA controller come only from the external Host processor; the NNA processor and the external Host processor can also access the SRAM inside the NNA at the same time, in which case the data request commands received and processed by the DMA controller include both those sent by the NNA processor and those sent by the external Host processor.
The data reading rules determined by the DMA controller vary with the data request commands it receives and processes. The specific flow of step S120 in the embodiment of the present invention is described next.
When the processor includes an NNA processor, step S120 specifically includes: the DMA controller receives and processes the data request command sent by the NNA processor and determines the NNA processor's data read address; the DMA controller reads the data requested by the NNA processor from the SRAM according to that address and sends it to the NNA processor. In this embodiment, when only the NNA processor accesses the SRAM, the data request commands received and processed by the DMA controller come only from the NNA processor; only the NNA processor's data read address needs to be determined, the requested data is read from the SRAM and sent to the NNA processor, and the NNA processor can start its operation once it receives the data.
When the processor includes an external Host processor, step S120 specifically includes: the DMA controller receives and processes the data request command sent by the external Host processor and determines the external Host processor's data read address; the DMA controller reads the data requested by the external Host processor from the SRAM according to that address and sends it to the external Host processor. In this embodiment, when only the external Host processor accesses the NNA's SRAM, the data request commands received and processed by the DMA controller come only from the external Host processor; only the external Host processor's data read address needs to be determined, the requested data is read from the SRAM and sent to the external Host processor, and the external Host processor can start its operation once it receives the data.
When the processors include both an NNA processor and an external Host processor, step S120 specifically includes: the DMA controller receives and processes the data request command sent by the NNA processor and the one sent by the external Host processor, and determines the NNA processor's data read address and the external Host processor's data read address; the DMA controller reads the data requested by the NNA processor from the SRAM according to the NNA processor's read address, reads the data requested by the external Host processor according to the external Host processor's read address, and sends each processor the data it requested. In this embodiment, when the NNA processor and the external Host processor access the NNA's SRAM at the same time, the data request commands received and processed by the DMA controller include both commands; the two data read addresses are determined separately, the data requested by each processor is read from the SRAM and sent to it, and both processors start their operations after receiving the requested data.
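Continuing the illustrative sketch (the flat-memory model of the SRAM, its size, and all names are assumptions), the per-source data reading rule then amounts to resolving each pending command's read address and returning the data to the port that requested it:

    #include <string.h>

    #define SRAM_SIZE 0x20000       /* assumed capacity for the sketch */
    extern uint8_t sram[SRAM_SIZE]; /* software model of the NNA SRAM  */

    /* Serve every pending request: resolve its read address, copy the
     * requested bytes out of the SRAM model, and clear the request.
     * Bounds checking is omitted to keep the sketch short. */
    void dma_service(dma_frontend_t *fe, uint8_t *out[NUM_PORTS])
    {
        for (int p = 0; p < NUM_PORTS; p++) {
            if (!fe->pending[p])
                continue;                 /* no request on this port */
            const data_request_cmd_t *c = &fe->cmd[p];
            memcpy(out[p], &sram[c->addr], c->len); /* read, forward */
            fe->pending[p] = 0;
        }
    }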
When the NNA processor and the external Host processor access different physical BANKs of the SRAM inside the NNA at the same time, the DMA controller can read the data requested by the NNA processor and the data requested by the external Host processor from the respective physical BANKs and send each to the corresponding processor. But when the two processors access the same physical BANK of the NNA's internal SRAM at the same time, the DMA controller cannot perform two read operations in that physical BANK simultaneously.
To solve the above problem, in some preferred embodiments of the present invention, when the processor includes an NNA processor and an external Host processor, the data transmission control method further includes:
the DMA controller judges whether the NNA processor data reading address is the same as the external Host processor data reading address or not; wherein,
when the NNA processor data reading address is different from the external Host processor data reading address, the DMA controller simultaneously reads data from the SRAM;
when the NNA processor data reading address is the same as the external Host processor data reading address, the DMA controller determines the priority of the data requested by the NNA processor and the data requested by the external Host processor, and reads the data from the SRAM in sequence according to the priority.
In the embodiment of the invention, when the NNA processor and the external Host processor access different physical BANKs of the SRAM inside the NNA at the same time, the DMA controller can simultaneously read the data requested by the NNA processor and the data requested by the external Host processor from the respective BANKs and send each to its processor. When the two processors access the same physical BANK of the SRAM at the same time, the DMA controller determines the priority of the data requested by the NNA processor and of the data requested by the external Host processor, reads the higher-priority data from the SRAM first according to those priorities, and then sends the data to the corresponding processors; the NNA processor and the external Host processor each start their operation after receiving the requested data. The priorities can be configured by software, which facilitates debugging.
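The BANK rule described above can be sketched as follows; the bank size, the priority encoding, and the function names are assumptions introduced here for illustration, not values from the patent:

    #define BANK_SIZE 0x4000 /* assumed bytes per physical BANK */

    static int bank_of(uint32_t addr) { return (int)(addr / BANK_SIZE); }

    /* Decide how two simultaneous requests are served. Returns -1 if
     * they target different BANKs and may proceed in parallel;
     * otherwise returns the port to serve first, taken from a
     * software-configured priority (mirroring the description that
     * priorities are software-configurable). */
    int dma_arbitrate(const data_request_cmd_t *a,
                      const data_request_cmd_t *b,
                      int high_priority_port)
    {
        if (bank_of(a->addr) != bank_of(b->addr))
            return -1;             /* different BANKs: read in parallel */
        return high_priority_port; /* same BANK: serialize by priority  */
    }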
Optionally, in the foregoing embodiment, when the processor includes an NNA processor and an external Host processor, the processor receives the requested data and starts to perform the operation, specifically:
the NNA processor and the external Host processor each receive the requested data and begin parallel operations.
In this embodiment, the NNA processor and the external Host processor can read the requested data from the SRAM in parallel through the DMA controller, accelerating parallel computation. The external Host processor therefore does not have to wait for the NNA processor to finish computing before reading data from the SRAM inside the NNA for subsequent calculation, nor does any step have to wait for the result of the previous step during the computation, which saves data-transfer wait time while the processors operate.
In a specific implementation, the data requested by the external Host processor is sent to the external Host processor through an on-chip network bus. In this embodiment, the external Host processor may issue a data request command to the SRAM inside the NNA via a Network-on-Chip (NoC) bus, and the data in that SRAM is likewise returned to the external Host processor via the NoC bus. Optionally, the NNA may be mounted on the NoC bus; since the NoC bus bandwidth is large enough, the external Host processor reads data from the SRAM inside the NNA very quickly, further shortening the data transmission latency.
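From the external Host processor's side, one plausible picture (purely illustrative; the base address below is a made-up placeholder, not from the patent) is that the NNA's SRAM appears as a memory-mapped window behind the NoC bus:

    #include <stdint.h>

    #define NNA_SRAM_NOC_BASE 0x40000000u /* hypothetical NoC mapping */

    /* One 32-bit read issued by the Host; the NoC bus carries the
     * request to the NNA and returns the SRAM data. */
    static inline uint32_t host_read_nna_sram(uint32_t offset)
    {
        volatile uint32_t *p = (volatile uint32_t *)
            (uintptr_t)(NNA_SRAM_NOC_BASE + offset);
        return *p;
    }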
As can be seen from the above, the data transmission control method provided by the embodiment of the present invention enables different processors to read data in parallel from the static random access memory inside the neural network acceleration processor, saving data-transfer wait time during operation.
Another aspect of the present invention provides a data transmission control apparatus; the apparatus described below and the method described above may be cross-referenced with each other. Referring to fig. 2, the data transmission control apparatus includes: a processor, a Direct Memory Access (DMA) controller 100, and a Static Random Access Memory (SRAM) 200;
the processor is used for sending a data request command;
the DMA controller 100 is configured to receive and process a data request command sent by a processor and determine a data reading rule; the DMA controller 100 is further configured to read the data requested by the processor from the SRAM 200 according to the data reading rule and send the data requested by the processor to the processor; the DMA controller 100 is configured to receive and process data request commands sent by multiple processors at the same time and determine the corresponding data reading rules;
the processor is also configured to receive the requested data and begin its operation.
As a preferred embodiment of the present invention, the processor includes a neural network acceleration processor 300 and/or an external processor 400. It is understood that, in this embodiment, the data request commands received and processed by the DMA controller 100 include a data request command sent by the Neural Network Accelerator (NNA) processor 300 and/or a data request command sent by the external (Host) processor 400.
As a preferred embodiment of the present invention, the DMA controller 100 is specifically configured to:
when the processor comprises the neural network acceleration processor 300, receiving and processing a data request command sent by the neural network acceleration processor 300, and determining a data reading address of the neural network acceleration processor; reading the data requested by the neural network acceleration processor 300 from the static random access memory 200 according to the neural network acceleration processor data reading address, and sending the data requested by the neural network acceleration processor to the neural network acceleration processor 300;
when the processor comprises the external processor 400, receiving and processing a data request command sent by the external processor 400, and determining an external processor data reading address; and reads the data requested by the external processor 400 from the static random access memory 200 according to the external processor data read address and transmits the data requested by the external processor to the external processor 400;
when the processor comprises the neural network acceleration processor 300 and the external processor 400, receiving and processing a data request command sent by the neural network acceleration processor 300 and a data request command sent by the external processor 400, and determining a data reading address of the neural network acceleration processor and a data reading address of the external processor; and reading data requested by the neural network acceleration processor 300 from the static random access memory 200 according to the neural network acceleration processor data read address and data requested by the external processor 400 from the static random access memory 200 according to the external processor data read address, and transmitting the data requested by the neural network acceleration processor to the neural network acceleration processor 300 and transmitting the data requested by the external processor to the external processor 400.
As a preferred embodiment of the present invention, when the processor includes the neural network acceleration processor 300 and the external processor 400, the dma controller 100 is further configured to determine whether the neural network acceleration processor data read address and the external processor data read address are the same; wherein,
when the neural network accelerated processor data reading address is different from the external processor data reading address, the direct memory access controller 100 reads data from the static random access memory 200 at the same time;
when the neural network acceleration processor data read address and the external processor data read address are the same, the DMA controller 100 determines the priority of the data requested by the neural network acceleration processor 300 and of the data requested by the external processor 400, and sequentially reads the data from the SRAM 200 according to the priority.
As a preferred embodiment of the present invention, when the processor includes the neural network acceleration processor 300 and the external processor 400, the processor receives the requested data and starts to operate, specifically:
the neural network acceleration processor 300 and the external processor 400 respectively receive the requested data and start parallel operations.
As a preferred embodiment of the present invention, the DMA controller 100 sends the data requested by the external processor 400 to the external processor 400, specifically:
the dma controller 100 transmits data requested by the external processor 400 to the external processor 400 through a Network On Chip (NOC) bus.
As can be seen from the above, the data transmission control device provided by the embodiment of the present invention enables different processors to read data in parallel from the static random access memory inside the neural network acceleration processor, thereby saving data-transfer wait time during operation.
The embodiments in the present description are described in a progressive manner, each embodiment focuses on its differences from the other embodiments, and the same or similar parts among the embodiments may be referred to each other. It is further noted that, in the present specification, relational terms such as first and second are used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element introduced by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (12)

1. A data transmission control method, comprising:
the processor sends a data request command;
the direct memory access controller receives and processes the data request command sent by the processor, and determines a data reading rule; the direct memory access controller reads the data requested by the processor from a static random access memory according to the data reading rule and sends the data requested by the processor to the processor; the direct memory access controller is configured to receive and process data request commands sent by a plurality of processors at the same time, and determine corresponding data reading rules;
the processor receives the requested data and begins its operation.
2. The data transmission control method according to claim 1, wherein the processor comprises a neural network acceleration processor and/or an external processor.
3. The data transmission control method according to claim 2, wherein,
when the processor comprises a neural network acceleration processor, the direct memory access controller receives and processes a data request command sent by the neural network acceleration processor, and determines a data reading address of the neural network acceleration processor; the direct memory access controller reads the data requested by the neural network acceleration processor from a static random access memory according to the data reading address of the neural network acceleration processor and sends the data requested by the neural network acceleration processor to the neural network acceleration processor;
when the processor comprises an external processor, the direct memory access controller receives and processes a data request command sent by the external processor, and determines a data reading address of the external processor; the direct memory access controller reads the data requested by the external processor from a static random access memory according to the data reading address of the external processor and sends the data requested by the external processor to the external processor;
when the processor comprises a neural network acceleration processor and an external processor, the direct memory access controller receives and processes a data request command sent by the neural network acceleration processor and a data request command sent by the external processor, and determines a data reading address of the neural network acceleration processor and a data reading address of the external processor; and the direct memory access controller reads the data requested by the neural network acceleration processor from a static random access memory according to the data reading address of the neural network acceleration processor and reads the data requested by the external processor from the static random access memory according to the data reading address of the external processor, and sends the data requested by the neural network acceleration processor to the neural network acceleration processor and sends the data requested by the external processor to the external processor.
4. The data transmission control method of claim 3, wherein when the processor includes a neural network acceleration processor and an external processor, the method further comprises:
the direct memory access controller judges whether the data reading address of the neural network acceleration processor is the same as the data reading address of the external processor; wherein,
when the data reading address of the neural network acceleration processor is different from the data reading address of the external processor, the direct memory access controller simultaneously reads data from the static random access memory;
and when the data reading address of the neural network acceleration processor is the same as the data reading address of the external processor, the direct memory access controller determines the priority of the data requested by the neural network acceleration processor and the priority of the data requested by the external processor, and sequentially reads the data from the static random access memory according to the priority.
5. The data transmission control method according to claim 4, wherein when the processor includes a neural network acceleration processor and an external processor, the processor receives the requested data and starts to perform operations, specifically:
the neural network acceleration processor and the external processor respectively receive the requested data and start parallel operation.
6. The data transmission control method according to any one of claims 3 to 5, wherein the sending the data requested by the external processor to the external processor specifically includes:
and sending the data requested by the external processor to the external processor through a network-on-chip bus.
7. A data transmission control device is characterized by comprising a processor, a direct memory access controller and a static random access memory;
the processor is used for sending a data request command;
the direct memory access controller is used for receiving and processing a data request command sent by the processor and determining a data reading rule; the direct memory access controller is also used for reading the data requested by the processor from a static random access memory according to the data reading rule and sending the data requested by the processor to the processor; the direct memory access controller is configured to receive and process data request commands sent by a plurality of processors at the same time, and determine corresponding data reading rules;
the processor is also configured to receive the requested data and begin its operation.
8. The data transmission control device of claim 7, wherein the processor comprises a neural network acceleration processor and/or an external processor.
9. The data transmission control device of claim 8, wherein the direct memory access controller is specifically configured to:
when the processor comprises a neural network acceleration processor, receiving and processing a data request command sent by the neural network acceleration processor, and determining a data reading address of the neural network acceleration processor; reading the data requested by the neural network acceleration processor from a static random access memory according to the data reading address of the neural network acceleration processor, and sending the data requested by the neural network acceleration processor to the neural network acceleration processor;
when the processor comprises an external processor, receiving and processing a data request command sent by the external processor, and determining a data reading address of the external processor; reading the data requested by the external processor from a static random access memory according to the data reading address of the external processor, and sending the data requested by the external processor to the external processor;
when the processor comprises a neural network acceleration processor and an external processor, receiving and processing a data request command sent by the neural network acceleration processor and a data request command sent by the external processor, and determining a data reading address of the neural network acceleration processor and a data reading address of the external processor; and reading the data requested by the neural network acceleration processor from a static random access memory according to the data reading address of the neural network acceleration processor and the data requested by the external processor from the static random access memory according to the data reading address of the external processor, and sending the data requested by the neural network acceleration processor to the neural network acceleration processor and sending the data requested by the external processor to the external processor.
10. The data transmission control device of claim 9, wherein when the processor comprises a neural network acceleration processor and an external processor, the dma controller is further configured to determine whether the neural network acceleration processor data read address and the external processor data read address are the same; wherein,
when the data reading address of the neural network acceleration processor is different from the data reading address of the external processor, the direct memory access controller simultaneously reads data from the static random access memory;
and when the data reading address of the neural network acceleration processor is the same as the data reading address of the external processor, the direct memory access controller determines the priority of the data requested by the neural network acceleration processor and the priority of the data requested by the external processor, and sequentially reads the data from the static random access memory according to the priority.
11. The data transmission control device according to claim 10, wherein when the processor includes a neural network acceleration processor and an external processor, the processor receives the requested data and starts to perform operations, specifically:
the neural network acceleration processor and the external processor respectively receive the requested data and start parallel operation.
12. The data transmission control device according to any one of claims 9 to 11, wherein the direct memory access controller sends the data requested by the external processor to the external processor, specifically:
and the direct memory access controller sends the data requested by the external processor to the external processor through an on-chip network bus.
CN202010162763.1A (filed 2020-03-10, priority 2020-03-10): Data transmission control method and device. Published as CN111401541A. Status: Withdrawn.

Priority Applications (1)

Application Number: CN202010162763.1A
Priority Date: 2020-03-10
Filing Date: 2020-03-10
Title: Data transmission control method and device

Applications Claiming Priority (1)

Application Number: CN202010162763.1A
Priority Date: 2020-03-10
Filing Date: 2020-03-10
Title: Data transmission control method and device

Publications (1)

Publication Number: CN111401541A
Publication Date: 2020-07-10

Family

ID=71436122

Family Applications (1)

Application Number: CN202010162763.1A
Title: Data transmission control method and device
Priority Date: 2020-03-10
Filing Date: 2020-03-10
Status: Withdrawn

Country Status (1)

CN: CN111401541A

Patent Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040225760A1 (en) * 2003-05-11 2004-11-11 Samsung Electronics Co., Ltd. Method and apparatus for transferring data at high speed using direct memory access in multi-processor environments
CN102521201A (en) * 2011-11-16 2012-06-27 刘大可 Multi-core DSP (digital signal processor) system-on-chip and data transmission method
CN103714027A (en) * 2014-01-10 2014-04-09 浪潮(北京)电子信息产业有限公司 Data transmission method and device for direct memory access controller
CN105207794A (en) * 2014-06-05 2015-12-30 中兴通讯股份有限公司 Statistics counting equipment and realization method thereof, and system with statistics counting equipment
CN104572519A (en) * 2014-12-22 2015-04-29 中国电子科技集团公司第三十八研究所 Multiport access and storage controller for multiprocessor and control method thereof
CN108363670A (en) * 2017-01-26 2018-08-03 华为技术有限公司 A kind of method, apparatus of data transmission, equipment and system
CN107392309A (en) * 2017-09-11 2017-11-24 东南大学—无锡集成电路技术研究所 A kind of general fixed-point number neutral net convolution accelerator hardware structure based on FPGA
CN109961392A (en) * 2017-12-22 2019-07-02 英特尔公司 The compression of deep learning is directed in the case where sparse value is mapped to nonzero value
CN110633576A (en) * 2018-06-22 2019-12-31 顶级公司 Data processing
CN109491938A (en) * 2018-11-27 2019-03-19 济南浪潮高新科技投资发展有限公司 A kind of multi-channel DMA controller and convolutional neural networks accelerated method accelerated towards convolutional neural networks
CN110008156A (en) * 2019-03-27 2019-07-12 无锡海斯凯尔医学技术有限公司 Device, method and the readable storage medium storing program for executing of data transmission
CN110309088A (en) * 2019-06-19 2019-10-08 北京百度网讯科技有限公司 ZYNQ fpga chip and its data processing method, storage medium
CN110852428A (en) * 2019-09-08 2020-02-28 天津大学 Neural network acceleration method and accelerator based on FPGA
CN110738308A (en) * 2019-09-23 2020-01-31 陈小柏 neural network accelerators

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
王洪利 et al.: "3D DMA controller for convolutional neural network hardware accelerators" *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112712167A (en) * 2020-12-31 2021-04-27 北京清微智能科技有限公司 Memory access method and system supporting acceleration of multiple convolutional neural networks

Similar Documents

Publication Publication Date Title
US10877766B2 (en) Embedded scheduling of hardware resources for hardware acceleration
CN108229687B (en) Data processing method, data processing device and electronic equipment
CN114003392B (en) Data accelerated computing method and related device
US11941528B2 (en) Neural network training in a distributed system
WO2023124304A1 (en) Chip cache system, data processing method, device, storage medium, and chip
KR20210080009A (en) Accelerator, method for operating the same and device including the same
JPH05274252A (en) Transaction execution method for computer system
WO2020106482A1 (en) Programming and controlling compute units in an integrated circuit
CN111401541A (en) Data transmission control method and device
KR102303424B1 (en) Direct memory access control device for at least one processing unit having a random access memory
CN117155802A (en) Out-of-order transmission simulation method, device and system and electronic equipment
JP2821345B2 (en) Asynchronous I / O control method
US6567908B1 (en) Method of and apparatus for processing information, and providing medium
CN108062224B (en) Data reading and writing method and device based on file handle and computing equipment
US20040059563A1 (en) Emulatd atomic instruction sequences in a multiprocessor system
CN116243983A (en) Processor, integrated circuit chip, instruction processing method, electronic device, and medium
CN114021715A (en) Deep learning training method based on Tensorflow framework
CN115712486A (en) Method and device for controlling live migration of virtual machine, medium and computer equipment
KR20050080704A (en) Apparatus and method of inter processor communication
CN115563053A (en) High-performance on-chip memory controller and execution method thereof
CN111913812B (en) Data processing method, device, equipment and storage medium
WO2020221161A1 (en) Computing job processing method and system, mobile device and acceleration device
US20190179778A1 (en) System memory controller with client preemption
CN111506518B (en) Data storage control method and device
JP3110024B2 (en) Memory control system

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
WW01: Invention patent application withdrawn after publication
Application publication date: 2020-07-10