CN111401541A - Data transmission control method and device - Google Patents
- Publication number
- CN111401541A CN111401541A CN202010162763.1A CN202010162763A CN111401541A CN 111401541 A CN111401541 A CN 111401541A CN 202010162763 A CN202010162763 A CN 202010162763A CN 111401541 A CN111401541 A CN 111401541A
- Authority
- CN
- China
- Prior art keywords
- processor
- data
- neural network
- network acceleration
- external
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3877—Concurrent instruction execution, e.g. pipeline or look ahead using a slave processor, e.g. coprocessor
Abstract
The invention discloses a data transmission control method and device, wherein the method comprises the following steps: a processor sends a data request command; a direct memory access controller receives and processes the data request command sent by the processor and determines a data reading rule; the direct memory access controller reads the data requested by the processor from a static random access memory according to the data reading rule and sends the requested data to the processor; the direct memory access controller is configured to receive and process data request commands sent by a plurality of processors at the same time and to determine the corresponding data reading rules; the processor receives the requested data and begins computation. The invention enables different processors to read data in parallel from the static random access memory inside the neural network acceleration processor, reducing the wait time for data transmission during computation.
Description
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a data transmission control method and apparatus.
Background
A neural network is a computational mathematical model that imitates the behavioral characteristics of animal neural networks and performs distributed parallel information processing; depending on the complexity of the system, it achieves information processing by adjusting the interconnections among a large number of internal nodes. A Neural Network Accelerator (NNA) is a module that performs the computation tasks involved in artificial-intelligence application scenarios. The computational complexity of a neural network model is proportional to the size of its input data, and as artificial-intelligence applications broaden, the volume of data to be computed grows ever larger.
Existing neural network accelerators do not support parallel operation: after the NNA processor finishes its computation, the data must be moved out so that another host processor can operate on it, and during computation the next data transfer can only proceed after the result of the previous step is available. Once the data volume is large, the wait time for data transmission during computation becomes too long and computational efficiency is low.
Disclosure of Invention
In order to solve the above technical problems, the present invention provides a data transmission control method and apparatus that enable different processors to read data in parallel from a static random access memory inside a neural network acceleration processor, reducing the wait time for data transmission during computation.
One aspect of the present invention provides a data transmission control method, including:
the processor sends a data request command;
the direct memory access controller receives and processes the data request command sent by the processor, and determines a data reading rule; the direct memory access controller reads the data requested by the processor from a static random access memory according to the data reading rule and sends the data requested by the processor to the processor; the direct memory access controller is configured to receive and process data request commands sent by a plurality of processors at the same time, and determine corresponding data reading rules;
the processor receives the requested data and begins computation.
Preferably, the processor comprises a neural network acceleration processor and/or an external processor.
Preferably, wherein:
when the processor comprises a neural network acceleration processor, the direct memory access controller receives and processes a data request command sent by the neural network acceleration processor, and determines a data reading address of the neural network acceleration processor; the direct memory access controller reads the data requested by the neural network acceleration processor from a static random access memory according to the data reading address of the neural network acceleration processor and sends the data requested by the neural network acceleration processor to the neural network acceleration processor;
when the processor comprises an external processor, the direct memory access controller receives and processes a data request command sent by the external processor, and determines a data reading address of the external processor; the direct memory access controller reads the data requested by the external processor from a static random access memory according to the data reading address of the external processor and sends the data requested by the external processor to the external processor;
when the processor comprises a neural network acceleration processor and an external processor, the direct memory access controller receives and processes a data request command sent by the neural network acceleration processor and a data request command sent by the external processor, and determines a data reading address of the neural network acceleration processor and a data reading address of the external processor; and the direct memory access controller reads the data requested by the neural network acceleration processor from a static random access memory according to the data reading address of the neural network acceleration processor and reads the data requested by the external processor from the static random access memory according to the data reading address of the external processor, and sends the data requested by the neural network acceleration processor to the neural network acceleration processor and sends the data requested by the external processor to the external processor.
Preferably, when the processor includes a neural network acceleration processor and an external processor, the method further includes:
the direct memory access controller judges whether the data reading address of the neural network acceleration processor is the same as the data reading address of the external processor; wherein,
when the data reading address of the neural network acceleration processor is different from the data reading address of the external processor, the direct memory access controller simultaneously reads data from the static random access memory;
and when the data reading address of the neural network acceleration processor is the same as the data reading address of the external processor, the direct memory access controller determines the priority of the data requested by the neural network acceleration processor and the priority of the data requested by the external processor, and sequentially reads the data from the static random access memory according to the priority.
Preferably, when the processor includes a neural network acceleration processor and an external processor, the processor receives the requested data and starts to operate, specifically:
the neural network acceleration processor and the external processor respectively receive the requested data and start parallel operation.
Preferably, the sending the data requested by the external processor to the external processor specifically includes:
and sending the data requested by the external processor to the external processor through a network-on-chip bus.
Another aspect of the invention provides a data transmission control device, which comprises a processor, a direct memory access controller and a static random access memory;
the processor is used for sending a data request command;
the direct memory access controller is used for receiving and processing a data request command sent by the processor and determining a data reading rule; the direct memory access controller is also used for reading the data requested by the processor from a static random access memory according to the data reading rule and sending the data requested by the processor to the processor; the direct memory access controller is configured to receive and process data request commands sent by a plurality of processors at the same time, and determine corresponding data reading rules;
the processor is also configured to receive the requested data and begin the operation.
Preferably, the processor comprises a neural network acceleration processor and/or an external processor.
Preferably, the direct memory access controller is specifically configured to:
when the processor comprises a neural network acceleration processor, receiving and processing a data request command sent by the neural network acceleration processor, and determining a data reading address of the neural network acceleration processor; reading the data requested by the neural network acceleration processor from a static random access memory according to the data reading address of the neural network acceleration processor, and sending the data requested by the neural network acceleration processor to the neural network acceleration processor;
when the processor comprises an external processor, receiving and processing a data request command sent by the external processor, and determining a data reading address of the external processor; reading the data requested by the external processor from a static random access memory according to the data reading address of the external processor, and sending the data requested by the external processor to the external processor;
when the processor comprises a neural network acceleration processor and an external processor, receiving and processing a data request command sent by the neural network acceleration processor and a data request command sent by the external processor, and determining a data reading address of the neural network acceleration processor and a data reading address of the external processor; and reading the data requested by the neural network acceleration processor from a static random access memory according to the data reading address of the neural network acceleration processor and the data requested by the external processor from the static random access memory according to the data reading address of the external processor, and sending the data requested by the neural network acceleration processor to the neural network acceleration processor and sending the data requested by the external processor to the external processor.
Preferably, when the processor comprises a neural network acceleration processor and an external processor, the direct memory access controller is further configured to determine whether a data read address of the neural network acceleration processor is the same as a data read address of the external processor; wherein,
when the data reading address of the neural network acceleration processor is different from the data reading address of the external processor, the direct memory access controller simultaneously reads data from the static random access memory;
and when the data reading address of the neural network acceleration processor is the same as the data reading address of the external processor, the direct memory access controller determines the priority of the data requested by the neural network acceleration processor and the priority of the data requested by the external processor, and sequentially reads the data from the static random access memory according to the priority.
Preferably, when the processor includes a neural network acceleration processor and an external processor, the processor receives the requested data and starts to operate, specifically:
the neural network acceleration processor and the external processor respectively receive the requested data and start parallel operation.
Preferably, the direct memory access controller sends the data requested by the external processor to the external processor, specifically:
and the direct memory access controller sends the data requested by the external processor to the external processor through a network-on-chip bus.
The invention has at least the following beneficial effects:
the direct memory access controller embedded in the neural network acceleration processor is configured to simultaneously receive and process data request commands sent by a plurality of processors and determine corresponding data reading rules, so that different processors can read requested data from the static random access memory in the neural network acceleration processor in parallel through the direct memory access controller, thereby accelerating parallel operation of different processors and saving the waiting time of data transmission during operation.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description show only some embodiments described in the present application, and that other drawings can be derived from them by those skilled in the art without creative effort.
Fig. 1 is a schematic flow chart of a data transmission control method according to an embodiment of the present invention;
fig. 2 is a schematic structural diagram of a data transmission control apparatus according to an embodiment of the present invention.
Detailed Description
The core of the invention is to provide a data transmission control method and apparatus in which the direct memory access controller is configured to receive and process data request commands sent by a plurality of processors at the same time and to determine the corresponding data reading rules, so that different processors can read the requested data from the static random access memory in parallel through the direct memory access controller, thereby accelerating parallel operation across the processors.
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
An embodiment of the present invention provides a data transmission control method in one aspect, please refer to fig. 1, where the data transmission control method includes:
step S110, the processor sends a data request command.
In the embodiment of the present invention, when different processors need to Access a Static Random Access Memory (SRAM) inside a Neural Network Accelerator (NNA), data request commands may be respectively issued.
Step S120, the direct memory access controller receives and processes the data request command sent by the processor, and determines a data reading rule; the direct memory access controller reads the data requested by the processor from the static random access memory according to the data reading rule and sends the data requested by the processor to the processor; the direct memory access controller is configured to receive and process data request commands sent by a plurality of processors at the same time, and determine corresponding data reading rules.
In the embodiment of the invention, the Direct Memory Access (DMA) controller inside the NNA is configured to receive and process data request commands sent by a plurality of processors at the same time and to determine the corresponding data reading rules, so that different processors can read the requested data from the SRAM in parallel through the DMA controller. The DMA controller in the NNA can be configured through software so that it can receive and process data access requests from multiple processors simultaneously, which makes the whole flow convenient to debug.
In step S130, the processor receives the requested data and starts the operation.
In the embodiment of the invention, different processors can respectively start operation after receiving the requested data, so that the parallel operation of the different processors can be accelerated, and the waiting time of data transmission during the operation of the processors is saved.
As can be seen from the above, in the data transfer control method provided in the embodiment of the present invention, the DMA controller inside the NNA is configured to receive and process the data request commands sent by the multiple processors at the same time, and determine the corresponding data reading rule, so that different processors can read the requested data from the SRAM inside the NNA in parallel through the DMA controller.
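The flow of steps S110-S130 can be sketched as a minimal simulation. The class and method names below, and the use of an address/length pair as the "data reading rule", are illustrative assumptions rather than the patent's concrete interface:

```python
# Hypothetical sketch of steps S110-S130: a processor issues a data request
# command, the DMA controller decodes it into a read rule (here simply an
# address and a length), fetches the data from SRAM, and returns it to the
# requesting processor. All names and the command format are assumptions.

class SRAM:
    def __init__(self, size):
        self.mem = [0] * size

    def read(self, addr, length):
        return self.mem[addr:addr + length]

class DMAController:
    def __init__(self, sram):
        self.sram = sram

    def handle_request(self, command):
        # "Determine the data reading rule": decode address and length.
        addr, length = command["addr"], command["len"]
        # Read the requested data from SRAM and return it to the processor.
        return self.sram.read(addr, length)

sram = SRAM(1024)
sram.mem[16:20] = [1, 2, 3, 4]
dma = DMAController(sram)
# Step S110: the processor sends a data request command;
# steps S120-S130: the DMA controller serves it and the data comes back.
data = dma.handle_request({"addr": 16, "len": 4})
```

With a multi-port or multi-channel DMA controller, several such requests can be outstanding at once, which is the configuration the claims describe.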
In particular implementations, the processor may include a Neural Network Acceleration (NNA) processor and/or an external Host processor. Specifically, the NNA processor can access the internal SRAM independently, and only the data request commands sent by the NNA processor are received and processed by the DMA controller; the external Host processor can also access the SRAM in the NNA independently, and the data request command sent by the processor and received and processed by the DMA controller is only the data request command sent by the external Host processor; the NNA processor and the external Host processor can also simultaneously access the SRAM in the NNA, and the data request command sent by the processor and received and processed by the DMA controller simultaneously comprises the data request command sent by the NNA processor and the data request command sent by the external Host processor.
The data reading rules determined by the DMA controller vary according to the received and processed data request commands sent by the processor. Next, a specific flow of step S120 in the embodiment of the present invention is specifically described.
When the processor includes an NNA processor, step S120 specifically includes: the DMA controller receives and processes the data request command sent by the NNA processor and determines the NNA processor data read address; the DMA controller reads the data requested by the NNA processor from the SRAM according to the NNA processor data read address and sends it to the NNA processor. In this embodiment, when only the NNA processor accesses the SRAM, the data request command received and processed by the DMA controller is only the one sent by the NNA processor; in this case only the NNA processor's data read address needs to be determined, the data requested by the NNA processor is read from the SRAM and sent to the NNA processor, and the NNA processor can start computing after receiving the requested data.
When the processor includes an external Host processor, the step S120 specifically includes: the DMA controller receives and processes a data request command sent by an external Host processor, and determines a data reading address of the external Host processor; the DMA controller reads the data requested by the external Host processor from the SRAM according to the external Host processor data read address, and transmits the data requested by the external Host processor to the external Host processor. In this embodiment, when only the external Host processor accesses the SRAM of the NNA alone, the data request command sent by the processor and received and processed by the DMA controller is only the data request command sent by the external Host processor, at this time, only the data read address of the external Host processor needs to be determined, the data requested by the external Host processor is read from the SRAM and then sent to the external Host processor, and the external Host processor can start to operate after receiving the requested data.
When the processors include an NNA processor and an external Host processor, step S120 specifically includes: the DMA controller receives and processes the data request command sent by the NNA processor and the data request command sent by the external Host processor, and determines the NNA processor data read address and the external Host processor data read address; the DMA controller reads the data requested by the NNA processor from the SRAM according to the NNA processor data read address, reads the data requested by the external Host processor from the SRAM according to the external Host processor data read address, sends the data requested by the NNA processor to the NNA processor, and sends the data requested by the external Host processor to the external Host processor. In this embodiment, when the NNA processor and the external Host processor access the SRAM of the NNA simultaneously, the data request commands received and processed by the DMA controller include both the command sent by the NNA processor and the command sent by the external Host processor. In this case the data read address of each requester must be determined, the data requested by the NNA processor is read from the SRAM and sent to the NNA processor, the data requested by the external Host processor is read from the SRAM and sent to the external Host processor, and both processors start computing after receiving the requested data.
When the NNA processor and the external Host processor access different physical banks of the SRAM inside the NNA at the same time, the DMA controller can read the data requested by the NNA processor and the data requested by the external Host processor from the different physical banks in parallel and send each to its requesting processor. However, when the NNA processor and the external Host processor access the same physical bank of the NNA-internal SRAM at the same time, the DMA controller cannot perform two read operations in the same physical bank simultaneously.
To solve the above problem, in some preferred embodiments of the present invention, when the processor includes an NNA processor and an external Host processor, the data transmission control method further includes:
the DMA controller judges whether the NNA processor data reading address is the same as the external Host processor data reading address or not; wherein,
when the NNA processor data reading address is different from the external Host processor data reading address, the DMA controller simultaneously reads data from the SRAM;
when the NNA processor data reading address is the same as the external Host processor data reading address, the DMA controller determines the priority of the data requested by the NNA processor and the data requested by the external Host processor, and reads the data from the SRAM in sequence according to the priority.
In the embodiment of the invention, when the NNA processor and the external Host processor access different physical BANKs of the SRAM in the NNA at the same time, the DMA controller can simultaneously read data requested by the NNA processor from the different physical BANKs of the SRAM, send the data to the NNA processor and read data requested by the external Host processor, and send the data to the external Host processor; when the NNA processor and the external Host processor simultaneously access the same physical BANK of the SRAM in the NNA, the DMA controller determines the priority of the data requested by the NNA processor and the data requested by the external Host processor, the DMA controller preferentially reads the data with high priority from the SRAM according to the priority of the requested data and then respectively sends the data to the corresponding processors, and the NNA processor and the external Host processor respectively start to operate after receiving the requested data. The priority can be configured by software, and debugging is facilitated.
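The bank-conflict rule above can be sketched as a small scheduler: requests that target different physical banks are served in the same cycle, while requests to the same bank are serialized in priority order. The bank width and the boolean priority flag below are illustrative assumptions; the patent only states that the priority is software-configurable:

```python
# Sketch of the same-bank arbitration rule. Different banks -> both reads in
# parallel (one "cycle"); same bank -> serialized by priority. The 1 KiB bank
# size and the nna_first priority flag are assumptions for illustration.

BANK_BITS = 10  # assume 1 KiB physical banks

def bank_of(addr):
    return addr >> BANK_BITS

def schedule(nna_req, host_req, nna_first=True):
    """Return the order in which the two requests are served.

    Each request is a (name, addr) tuple. The result is a list of cycles,
    each cycle holding the requests served during it.
    """
    if bank_of(nna_req[1]) != bank_of(host_req[1]):
        return [[nna_req, host_req]]          # different banks: one parallel cycle
    ordered = [nna_req, host_req] if nna_first else [host_req, nna_req]
    return [[ordered[0]], [ordered[1]]]       # same bank: two serialized cycles

parallel = schedule(("nna", 0x000), ("host", 0x800))  # different banks
serial = schedule(("nna", 0x100), ("host", 0x180))    # same bank, NNA first
```

Making `nna_first` a writable register bit would correspond to the software-configurable priority the embodiment describes.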
Optionally, in the foregoing embodiment, when the processor includes an NNA processor and an external Host processor, the processor receives the requested data and starts to perform the operation, specifically:
the NNA processor and the external Host processor each receive the requested data and begin parallel operations.
In this embodiment, the NNA processor and the external Host processor can read requested data from the SRAM in parallel through the DMA controller to perform parallel computation acceleration. Therefore, the external Host processor does not need to wait for the NNA processor to finish the calculation and then read the data in the SRAM in the NNA for subsequent calculation, and does not need to wait for the calculation result of the previous step in the calculation process, thereby saving the waiting time of data transmission in the calculation of the processor.
In a specific implementation, the data requested by the external Host processor is sent to the external Host processor through a network-on-chip bus. In this embodiment, the external Host processor may issue a data request command to the SRAM inside the NNA via a Network-on-Chip (NoC) bus, and the data in the SRAM inside the NNA may likewise be returned to the external Host processor via the NoC bus. Optionally, the NNA may be mounted on the NoC bus; since the bandwidth of the NoC bus is large enough, the external Host processor reads data in the SRAM inside the NNA very quickly, further shortening the data transmission wait time.
As can be seen from the above, the data transmission control method provided in the embodiment of the present invention can realize that different processors read data from the static random access memory in the neural network acceleration processor in parallel, and save the waiting time of data transmission during operation.
Another aspect of the present invention provides a data transmission control apparatus, which is described below; the apparatus and the method described above may be referred to in correspondence with each other. Referring to fig. 2, the data transmission control apparatus includes: a processor, a Direct Memory Access (DMA) controller 100, and a Static Random Access Memory (SRAM) 200;
the processor is used for sending a data request command;
the DMA controller 100 is configured to receive and process a data request command sent by the processor and determine a data reading rule; the DMA controller 100 is further configured to read the data requested by the processor from the static random access memory 200 according to the data reading rule and send the data requested by the processor to the processor; the DMA controller 100 is configured to receive and process data request commands sent by a plurality of processors at the same time, and determine corresponding data reading rules;
the processor is also configured to receive the requested data and begin the operation.
As a preferred embodiment of the present invention, the processor includes a neural network acceleration processor 300 and/or an external processor 400. It is understood that, in this embodiment, the data request commands sent by the processor and received and processed by the DMA controller 100 include a data request command sent by the Neural Network Accelerator (NNA) processor 300 and/or a data request command sent by the external (Host) processor 400.
As a preferred embodiment of the present invention, the DMA controller 100 is specifically configured to:
when the processor comprises the neural network acceleration processor 300, receiving and processing a data request command sent by the neural network acceleration processor 300, and determining a data reading address of the neural network acceleration processor; reading the data requested by the neural network acceleration processor 300 from the static random access memory 200 according to the neural network acceleration processor data reading address, and sending the data requested by the neural network acceleration processor to the neural network acceleration processor 300;
when the processor comprises the external processor 400, receiving and processing a data request command sent by the external processor 400, and determining an external processor data reading address; and reads the data requested by the external processor 400 from the static random access memory 200 according to the external processor data read address and transmits the data requested by the external processor to the external processor 400;
when the processor comprises the neural network acceleration processor 300 and the external processor 400, receiving and processing a data request command sent by the neural network acceleration processor 300 and a data request command sent by the external processor 400, and determining a data reading address of the neural network acceleration processor and a data reading address of the external processor; and reading data requested by the neural network acceleration processor 300 from the static random access memory 200 according to the neural network acceleration processor data read address and data requested by the external processor 400 from the static random access memory 200 according to the external processor data read address, and transmitting the data requested by the neural network acceleration processor to the neural network acceleration processor 300 and transmitting the data requested by the external processor to the external processor 400.
As a preferred embodiment of the present invention, when the processor includes the neural network acceleration processor 300 and the external processor 400, the DMA controller 100 is further configured to determine whether the neural network acceleration processor data read address and the external processor data read address are the same; wherein,
when the neural network acceleration processor data read address is different from the external processor data read address, the DMA controller 100 reads both data items from the SRAM 200 simultaneously;
when the neural network acceleration processor data read address and the external processor data read address are the same, the DMA controller 100 determines the priorities of the data requested by the neural network acceleration processor 300 and the data requested by the external processor 400, and reads the data from the SRAM 200 sequentially according to those priorities.
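The arbitration rule in this embodiment — parallel service for distinct addresses, priority-ordered sequential service for the same address — can be sketched as below. The function name, the priority encoding (lower value served first), and the default priorities are all our assumptions for illustration; the patent does not specify how priorities are assigned.

```python
# Hedged sketch of the arbitration rule: requests to different SRAM
# addresses are served in parallel; requests to the same address are
# serialized by priority. Names and priority encoding are hypothetical.

def arbitrate(nna_addr, host_addr, nna_priority=0, host_priority=1):
    """Decide how the NNA and host reads are issued.

    A lower priority value means served first; equal addresses force
    sequential service, distinct addresses allow parallel service.
    """
    if nna_addr != host_addr:
        # Different addresses: both reads proceed simultaneously.
        return {"mode": "parallel",
                "order": [("nna", nna_addr), ("host", host_addr)]}
    # Same address: serve the two requests one after the other,
    # higher-priority (lower value) request first.
    first, second = ("nna", nna_addr), ("host", host_addr)
    if host_priority < nna_priority:
        first, second = second, first
    return {"mode": "sequential", "order": [first, second]}

assert arbitrate(0x10, 0x20)["mode"] == "parallel"
assert arbitrate(0x10, 0x10)["order"][0][0] == "nna"
```

A real controller would apply this decision per transaction on the SRAM's read ports; the sketch captures only the decision itself.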
As a preferred embodiment of the present invention, when the processor includes the neural network acceleration processor 300 and the external processor 400, the processor receives the requested data and starts to operate, specifically:
the neural network acceleration processor 300 and the external processor 400 respectively receive the requested data and start parallel operations.
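Once each processor holds its requested data, the two proceed independently. A minimal sketch of this parallel operation, using two worker threads as stand-ins for the two processors (the operation bodies are placeholders of our own, not the patent's computations):

```python
# Hedged sketch: after receiving their data, the NNA and the host
# processor operate in parallel. Modeled with two threads; the
# operations themselves are illustrative placeholders.
from concurrent.futures import ThreadPoolExecutor

def nna_operation(data):
    # Placeholder for the neural-network computation.
    return f"nna processed {data}"

def host_operation(data):
    # Placeholder for the external (host) processor's computation.
    return f"host processed {data}"

with ThreadPoolExecutor(max_workers=2) as pool:
    nna_future = pool.submit(nna_operation, "weights")
    host_future = pool.submit(host_operation, "activations")
    results = [nna_future.result(), host_future.result()]

assert results == ["nna processed weights", "host processed activations"]
```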
As a preferred embodiment of the present invention, the DMA controller 100 sends the data requested by the external processor 400 to the external processor 400, specifically:
the DMA controller 100 transmits the data requested by the external processor 400 to the external processor 400 through a Network-on-Chip (NoC) bus.
As can be seen from the above, the data transmission control apparatus provided in the embodiment of the present invention enables different processors to read data in parallel from the static random access memory in the neural network acceleration processor, thereby reducing the data-transmission wait time during operation.
The embodiments in the present description are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the same or similar parts among the embodiments may be referred to one another. It is further noted that, in the present specification, relational terms such as first and second are used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a/an" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (12)
1. A data transmission control method, comprising:
the processor sends a data request command;
the direct memory access controller receives and processes the data request command sent by the processor, and determines a data reading rule; the direct memory access controller reads the data requested by the processor from a static random access memory according to the data reading rule and sends the data requested by the processor to the processor; the direct memory access controller is configured to receive and process data request commands sent by a plurality of processors at the same time, and determine corresponding data reading rules;
the processor receives the requested data and begins the operation.
2. The data transmission control method according to claim 1, wherein the processor comprises a neural network acceleration processor and/or an external processor.
3. The data transmission control method according to claim 2, wherein,
when the processor comprises a neural network acceleration processor, the direct memory access controller receives and processes a data request command sent by the neural network acceleration processor, and determines a data reading address of the neural network acceleration processor; the direct memory access controller reads the data requested by the neural network acceleration processor from a static random access memory according to the data reading address of the neural network acceleration processor and sends the data requested by the neural network acceleration processor to the neural network acceleration processor;
when the processor comprises an external processor, the direct memory access controller receives and processes a data request command sent by the external processor, and determines a data reading address of the external processor; the direct memory access controller reads the data requested by the external processor from a static random access memory according to the data reading address of the external processor and sends the data requested by the external processor to the external processor;
when the processor comprises a neural network acceleration processor and an external processor, the direct memory access controller receives and processes a data request command sent by the neural network acceleration processor and a data request command sent by the external processor, and determines a data reading address of the neural network acceleration processor and a data reading address of the external processor; and the direct memory access controller reads the data requested by the neural network acceleration processor from a static random access memory according to the data reading address of the neural network acceleration processor and reads the data requested by the external processor from the static random access memory according to the data reading address of the external processor, and sends the data requested by the neural network acceleration processor to the neural network acceleration processor and sends the data requested by the external processor to the external processor.
4. The data transmission control method of claim 3, wherein when the processor includes a neural network acceleration processor and an external processor, the method further comprises:
the direct memory access controller judges whether the data reading address of the neural network acceleration processor is the same as the data reading address of the external processor; wherein,
when the data reading address of the neural network acceleration processor is different from the data reading address of the external processor, the direct memory access controller simultaneously reads data from the static random access memory;
and when the data reading address of the neural network acceleration processor is the same as the data reading address of the external processor, the direct memory access controller determines the priority of the data requested by the neural network acceleration processor and the priority of the data requested by the external processor, and sequentially reads the data from the static random access memory according to the priority.
5. The data transmission control method according to claim 4, wherein when the processor includes a neural network acceleration processor and an external processor, the processor receives the requested data and starts to perform operations, specifically:
the neural network acceleration processor and the external processor respectively receive the requested data and start parallel operation.
6. The data transmission control method according to any one of claims 3 to 5, wherein the sending the data requested by the external processor to the external processor specifically includes:
and sending the data requested by the external processor to the external processor through a network-on-chip bus.
7. A data transmission control device is characterized by comprising a processor, a direct memory access controller and a static random access memory;
the processor is used for sending a data request command;
the direct memory access controller is used for receiving and processing a data request command sent by the processor and determining a data reading rule; the direct memory access controller is also used for reading the data requested by the processor from a static random access memory according to the data reading rule and sending the data requested by the processor to the processor; the direct memory access controller is configured to receive and process data request commands sent by a plurality of processors at the same time, and determine corresponding data reading rules;
the processor is also configured to receive the requested data and begin the operation.
8. The data transmission control device of claim 7, wherein the processor comprises a neural network acceleration processor and/or an external processor.
9. The data transmission control device of claim 8, wherein the direct memory access controller is specifically configured to:
when the processor comprises a neural network acceleration processor, receiving and processing a data request command sent by the neural network acceleration processor, and determining a data reading address of the neural network acceleration processor; reading the data requested by the neural network acceleration processor from a static random access memory according to the data reading address of the neural network acceleration processor, and sending the data requested by the neural network acceleration processor to the neural network acceleration processor;
when the processor comprises an external processor, receiving and processing a data request command sent by the external processor, and determining a data reading address of the external processor; reading the data requested by the external processor from a static random access memory according to the data reading address of the external processor, and sending the data requested by the external processor to the external processor;
when the processor comprises a neural network acceleration processor and an external processor, receiving and processing a data request command sent by the neural network acceleration processor and a data request command sent by the external processor, and determining a data reading address of the neural network acceleration processor and a data reading address of the external processor; and reading the data requested by the neural network acceleration processor from a static random access memory according to the data reading address of the neural network acceleration processor and the data requested by the external processor from the static random access memory according to the data reading address of the external processor, and sending the data requested by the neural network acceleration processor to the neural network acceleration processor and sending the data requested by the external processor to the external processor.
10. The data transmission control device of claim 9, wherein when the processor comprises a neural network acceleration processor and an external processor, the direct memory access controller is further configured to determine whether the neural network acceleration processor data read address and the external processor data read address are the same; wherein,
when the data reading address of the neural network acceleration processor is different from the data reading address of the external processor, the direct memory access controller simultaneously reads data from the static random access memory;
and when the data reading address of the neural network acceleration processor is the same as the data reading address of the external processor, the direct memory access controller determines the priority of the data requested by the neural network acceleration processor and the priority of the data requested by the external processor, and sequentially reads the data from the static random access memory according to the priority.
11. The data transmission control device according to claim 10, wherein when the processor includes a neural network acceleration processor and an external processor, the processor receives the requested data and starts to perform operations, specifically:
the neural network acceleration processor and the external processor respectively receive the requested data and start parallel operation.
12. The data transmission control device according to any one of claims 9 to 11, wherein the direct memory access controller sends the data requested by the external processor to the external processor, specifically:
and the direct memory access controller sends the data requested by the external processor to the external processor through an on-chip network bus.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010162763.1A CN111401541A (en) | 2020-03-10 | 2020-03-10 | Data transmission control method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111401541A true CN111401541A (en) | 2020-07-10 |
Family
ID=71436122
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010162763.1A Withdrawn CN111401541A (en) | 2020-03-10 | 2020-03-10 | Data transmission control method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111401541A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112712167A (en) * | 2020-12-31 | 2021-04-27 | 北京清微智能科技有限公司 | Memory access method and system supporting acceleration of multiple convolutional neural networks |
Citations (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040225760A1 (en) * | 2003-05-11 | 2004-11-11 | Samsung Electronics Co., Ltd. | Method and apparatus for transferring data at high speed using direct memory access in multi-processor environments |
CN102521201A (en) * | 2011-11-16 | 2012-06-27 | 刘大可 | Multi-core DSP (digital signal processor) system-on-chip and data transmission method |
CN103714027A (en) * | 2014-01-10 | 2014-04-09 | 浪潮(北京)电子信息产业有限公司 | Data transmission method and device for direct memory access controller |
CN104572519A (en) * | 2014-12-22 | 2015-04-29 | 中国电子科技集团公司第三十八研究所 | Multiport access and storage controller for multiprocessor and control method thereof |
CN105207794A (en) * | 2014-06-05 | 2015-12-30 | 中兴通讯股份有限公司 | Statistics counting equipment and realization method thereof, and system with statistics counting equipment |
CN107392309A (en) * | 2017-09-11 | 2017-11-24 | 东南大学—无锡集成电路技术研究所 | A general fixed-point neural network convolution accelerator hardware architecture based on FPGA |
CN108363670A (en) * | 2017-01-26 | 2018-08-03 | 华为技术有限公司 | A data transmission method, apparatus, device and system |
CN109491938A (en) * | 2018-11-27 | 2019-03-19 | 济南浪潮高新科技投资发展有限公司 | A multi-channel DMA controller and acceleration method oriented to convolutional neural network acceleration |
CN109961392A (en) * | 2017-12-22 | 2019-07-02 | 英特尔公司 | Compression for deep learning where sparse values are mapped to nonzero values |
CN110008156A (en) * | 2019-03-27 | 2019-07-12 | 无锡海斯凯尔医学技术有限公司 | Data transmission apparatus, method, and readable storage medium |
CN110309088A (en) * | 2019-06-19 | 2019-10-08 | 北京百度网讯科技有限公司 | ZYNQ fpga chip and its data processing method, storage medium |
CN110633576A (en) * | 2018-06-22 | 2019-12-31 | 顶级公司 | Data processing |
CN110738308A (en) * | 2019-09-23 | 2020-01-31 | 陈小柏 | Neural network accelerator |
CN110852428A (en) * | 2019-09-08 | 2020-02-28 | 天津大学 | Neural network acceleration method and accelerator based on FPGA |
Non-Patent Citations (1)
Title |
---|
王洪利 (Wang Hongli) et al.: "3D DMA Controller for Convolutional Neural Network Hardware Accelerators" * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| WW01 | Invention patent application withdrawn after publication | Application publication date: 20200710 |