WO2019084789A1 - Direct memory access controller, data reading method, and data writing method - Google Patents

Direct memory access controller, data reading method, and data writing method


Publication number
WO2019084789A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
read
path
unit
write
Prior art date
Application number
PCT/CN2017/108644
Other languages
English (en)
Chinese (zh)
Inventor
任子木
韩彬
Original Assignee
深圳市大疆创新科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市大疆创新科技有限公司
Priority to CN201780010826.8A (published as CN108701102A)
Priority to PCT/CN2017/108644 (published as WO2019084789A1)
Publication of WO2019084789A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14 Handling requests for interconnection or transfer
    • G06F13/20 Handling requests for interconnection or transfer for access to input/output bus
    • G06F13/28 Handling requests for interconnection or transfer for access to input/output bus using burst mode transfer, e.g. direct memory access DMA, cycle steal

Definitions

  • the present application relates to the field of data processing, and in particular, to a direct memory access (DMA) controller, a data reading method, and a data writing method.
  • the chip typically includes a DMA controller and a data processing unit for data processing.
  • the functions of the DMA controller on the market are relatively simple, and the main function is to complete the data movement operation.
  • the chip usually sets a preprocessing module in the data processing unit to complete the unpacking operation of the image data.
  • the unpacking operation is to unpack the compactly stored image data in the memory into a format that is easy to use by the data processing unit.
  • for example, image data stored compactly in DDR at a bit width of 8, 10, 12, or 16 bits is unpacked and stored in a regular layout at a width of 16 or 32 bits.
  • the unpacking operation is performed by the preprocessing module of the data processing unit, so the data movement by the DMA controller and the unpacking by the preprocessing module run in series. In other words, to process a piece of image data, the DMA controller must first complete the data transfer, and only then can the preprocessing module perform the unpacking operation; the two cannot run in parallel. This slows down data processing on the chip.
  • the present application provides a direct memory access controller, a data reading method, and a data writing method, which can improve data processing speed and improve chip performance.
  • a direct memory access (DMA) controller is provided, comprising a control path and a read data path, the control path comprising a read control unit, the read control unit being configured to generate a read control signal and to control, by the read control signal, the read data path to read first data from an external memory via an external bus; the read data path includes an unpacking unit, the unpacking unit is configured to unpack the first data, and the read data path writes the unpacked first data to an internal storage unit via an internal bus.
  • the DMA controller of the first aspect has a control path and a read data path; the read data path unpacks the data through the unpacking unit while reading it from the external memory, so that the DMA controller performs the unpacking operation while moving the data. Processing the two in parallel can improve the data processing speed and the performance of the chip.
  • a direct memory access (DMA) controller is provided, comprising a control path and a write data path, the control path comprising a write control unit, the write control unit being configured to generate a write control signal and to control, by the write control signal, the write data path to read second data from an internal storage unit via an internal bus; the write data path includes a packing unit, the packing unit is configured to pack the second data, and the write data path writes the packed second data to an external memory via an external bus.
  • the DMA controller of the second aspect has a control path and a write data path; the write data path packs the data through the packing unit while writing it to the external memory, so that the DMA controller performs the packing operation while moving the data. Processing the two in parallel can improve the data processing speed and the performance of the chip.
  • a data reading method is provided, the method being performed by a direct memory access (DMA) controller, the DMA controller comprising a control path and a read data path, the control path comprising a read control unit, and the read data path comprising an unpacking unit, the method comprising: the read control unit generating a read control signal and controlling, by the read control signal, the read data path to read first data from an external memory via an external bus; the unpacking unit unpacking the first data; and the read data path writing the unpacked first data to an internal storage unit via an internal bus.
  • a data writing method is provided, the method being performed by a direct memory access (DMA) controller, the DMA controller comprising a control path and a write data path, the control path comprising a write control unit, and the write data path comprising a packing unit, the method comprising: the write control unit generating a write control signal and controlling, by the write control signal, the write data path to read second data from an internal storage unit via an internal bus; the packing unit packing the second data; and the write data path writing the packed second data to an external memory via an external bus.
  • an integrated circuit comprising the DMA controller of the first aspect of the invention and/or the DMA controller of the second aspect.
  • a mobile device comprising the DMA controller of the first aspect of the invention and/or the DMA controller of the second aspect.
  • FIG. 1 is a schematic block diagram of a DMA controller of one embodiment of the present application.
  • FIG. 2 is a schematic diagram of a data unpacking operation of an embodiment of the present application.
  • FIG. 3 is a schematic diagram of an implementation of a descriptor cache in accordance with an embodiment of the present application.
  • FIG. 4 is a schematic diagram of a descriptor format of an embodiment of the present application.
  • FIG. 5 is a schematic block diagram of a DMA controller of another embodiment of the present application.
  • FIG. 6 is a schematic block diagram of a DMA controller of another embodiment of the present application.
  • FIG. 7 is a schematic flowchart of a data reading method according to an embodiment of the present application.
  • FIG. 8 is a schematic flowchart of a data writing method according to an embodiment of the present application.
  • Embodiments of the present application provide a DMA controller having a read data function.
  • FIG. 1 is a schematic block diagram of a DMA controller 100 in accordance with one embodiment of the present application.
  • DMA controller 100 includes a control path 110 and a read data path 120.
  • the control path 110 can include a read control unit 112 for generating a read control signal and controlling the read data path 120 to read the first data from the external memory 220 via the external bus 210 by a read control signal.
  • the read data path 120 can include an unpacking unit 122 for unpacking the first data, and the read data path 120 writes the unpacked first data to the internal storage unit 240 via the internal bus 230.
  • the external bus 210 may be an Advanced eXtensible Interface (AXI) bus, or may be a bus specified by other protocols.
  • the external memory 220 can be double data rate SDRAM (Double Data Rate SDRAM, DDR SDRAM), or simply DDR memory, where SDRAM is a Synchronous Dynamic Random Access Memory.
  • the internal bus 230 can be a crossbar switch (CROSSBAR) bus.
  • specifically, the embodiment of the present application can read the first data from the external memory 220 via the external bus 210 through the external read unit 126 of the read data path 120, which is not limited in this embodiment of the present application.
  • the DMA controller of the embodiment of the present application has a control path and a read data path; the read data path unpacks the data through the unpacking unit while reading it from the external memory, so that the DMA controller performs the unpacking operation while moving the data. Processing the two in parallel can improve the data processing speed and the performance of the chip.
  • the downstream unit of the DMA controller does not need to unpack the original data, which increases the flexibility of the downstream unit, so that the DMA controller can be used in a variety of application scenarios, such as big-endian storage, little-endian storage, and so on.
  • the read data path 120 may further include a first clock processing unit 124 for performing asynchronous first-in first-out (FIFO) cross-clock-domain processing on the first data read by the read data path 120; the unpacking unit 122 is specifically configured to unpack the first data after the FIFO cross-clock-domain processing.
  • the first clock processing unit 124 can use FIFO technology to synchronize data originating from different clock domains.
  • the first clock processing unit 124 may perform FIFO cross-clock-domain processing on the first data read from the external memory 220 (for example, DDR memory) and then send it to the unpacking unit 122.
  • the data especially the pixel value of the image, is generally stored compactly in the DDR memory. Images of different bit width pixels are stored in different formats in DDR memory.
  • the unpacking unit 122 of the embodiment of the present application pads the high-order bits of the original pixel value with 0 and expands the pixel bit width to 16 bits or 32 bits. That is, the unpacking unit 122 needs to convert the data format.
  • the unpacking unit 122 may unpack the first data according to the unpacking mode preset by the system.
  • the unpacking modes supported by the embodiments of the present application may include at least one of a direct mode, a low word mode, a low halfword mode, a parity column split mode, a one-bank copy mode, a high word mode, a high halfword mode, and a two-bank copy mode, and the embodiments of the present application are not limited thereto.
  • the direct mode is to output the data as it is without formatting.
  • the low word mode stores one pixel value in the low-order bits of a word (32 bits) and sets the high-order bits to zero. In this way, 8-bit, 10-bit, 12-bit, and 16-bit wide pixels can be expanded to 32 bits.
  • the low halfword mode stores one pixel value in the low-order bits of a halfword (16 bits) and sets the high-order bits to zero. In this way, 8-bit, 10-bit, and 12-bit wide pixels can be expanded to 16 bits.
  • the high word mode stores one pixel value in the high-order bits of a word and sets the low-order bits to zero.
  • the high halfword mode stores one pixel value in the high-order bits of a halfword and sets the low-order bits to zero.
  • in the one-bank copy mode, each bank forms a group; the data is copied into 16 copies, and each copy is placed in one bank.
  • in the two-bank copy mode, every two banks form a group; the data is copied into 8 copies, and each copy is placed in two banks.
  • the parity column split mode divides the data into two parts, which are stored in the even bank and the odd bank respectively. For example, the lower 8 bits of the data are stored in the even bank and the upper 8 bits are stored in the odd bank. The embodiments of the present application are not limited thereto.
  • the pixel width that can be supported by the embodiment of the present application may include 8 bits, 10 bits, 12 bits, and 16 bits, and is not limited thereto.
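  • As an illustration of the word and halfword modes and of the parity column split described above, the following is a minimal Python sketch, not the controller's actual logic. The function names, the mode strings, and the choice to left-align the pixel in the high modes (so that the most significant pixel bit lands on the most significant container bit) are assumptions made for this example.

```python
# Container widths for the four placement modes (hypothetical mode names).
CONTAINER_BITS = {
    "low_word": 32,
    "high_word": 32,
    "low_halfword": 16,
    "high_halfword": 16,
}


def place_pixel(pixel: int, pixel_bits: int, mode: str) -> int:
    """Place one pixel in a word or halfword and zero the remaining bits."""
    width = CONTAINER_BITS[mode]
    if pixel_bits > width or not 0 <= pixel < (1 << pixel_bits):
        raise ValueError("pixel does not fit the selected mode")
    if mode.startswith("low"):
        # Low modes: pixel in the low-order bits, high-order bits are zero.
        return pixel
    # High modes (assumed left-aligned): pixel in the high-order bits,
    # low-order bits are zero.
    return pixel << (width - pixel_bits)


def split_even_odd(value: int) -> tuple[int, int]:
    """Split a 16-bit value: lower 8 bits for the even bank, upper 8 for the odd bank."""
    if not 0 <= value < (1 << 16):
        raise ValueError("expected a 16-bit value")
    return value & 0xFF, (value >> 8) & 0xFF
```

  • For a 12-bit pixel 0xABC, the low halfword mode yields 0x0ABC and, under the left-alignment assumption, the high halfword mode yields 0xABC0.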
  • FIG. 2 is a schematic diagram of a data unpacking operation of an embodiment of the present application.
  • the data is stored in the external memory in a 12-bit pixel format and includes the data of pixel 0 through pixel 9.
  • the low halfword mode is used for unpacking.
  • after unpacking, each pixel occupies 16 bits, with the high-order bits filled with 0.
  • the Most Significant Bit (MSB) and the Least Significant Bit (LSB) of the data before and after unpacking are shown in FIG. 2. Therefore, the embodiment of the present application does not need to store image data containing a large amount of redundant information (for example, the unpacked data in FIG. 2) in the external memory; it stores the data as it is before unpacking in FIG. 2, thereby saving memory bandwidth.
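  • The unpacking of compact 12-bit pixels into 16-bit halfwords can be sketched in Python as follows. This is a hedged illustration only: the function name is hypothetical, and the bit ordering (pixel 0 in the lowest-order bits of a little-endian bit stream) is an assumption, since the exact layout is defined by FIG. 2.

```python
def unpack_low_halfword(packed: bytes, pixel_bits: int, count: int) -> list[int]:
    """Unpack `count` compactly stored pixels into 16-bit values.

    Assumption: pixels are packed LSB-first, so pixel i occupies bits
    [i * pixel_bits, (i + 1) * pixel_bits) of the little-endian stream.
    Each result fits in 16 bits with its high-order bits equal to zero.
    """
    if not 1 <= pixel_bits <= 16:
        raise ValueError("low halfword mode holds at most 16 bits per pixel")
    if count * pixel_bits > len(packed) * 8:
        raise ValueError("not enough packed data for the requested pixel count")
    stream = int.from_bytes(packed, "little")
    mask = (1 << pixel_bits) - 1
    return [(stream >> (i * pixel_bits)) & mask for i in range(count)]
```

  • Ten 12-bit pixels occupy 15 bytes in packed form and 20 bytes once expanded to halfwords, which is the redundancy that compact storage in external memory avoids.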
  • the unpacking unit 122 may include a 3-stage pipeline.
  • the first level pipeline can unpack data of different bit width pixels (for example, the first data) into a certain bit width, for example, 16 bits.
  • the input to the first level pipeline is 128 bits of data.
  • for 8-bit pixels, one beat of data includes 16 pixels; for 10-bit pixels, one beat of data includes 12 pixels; for 12-bit pixels, one beat of data includes 10 pixels. After the first-level pipeline unpacks the data, the data width is 16 bits.
  • the second-level pipeline can pad the data.
  • the padding operation expands a tile whose width and height are less than 256×256 into a 256×256 tile, in which the extended region is filled with zero-valued pixels.
  • the third-level pipeline can complete halfword-to-word conversion and low-to-high conversion. Because the data unpacked by the first-level pipeline is in the low halfword mode, the data format needs to be converted if the current unpacking mode is a different mode.
  • the 3-stage pipeline of the unpacking unit is merely exemplary and is not intended to limit the embodiments of the present application.
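  • The pixels-per-beat figures and the padding step of the pipeline can be checked with a small Python sketch. The names below are hypothetical, the tile is modeled as a list of rows, and placing the valid data in the top-left corner of the padded tile is an assumption of this example.

```python
BEAT_BITS = 128  # width of one input beat to the first-level pipeline


def pixels_per_beat(pixel_bits: int) -> int:
    """Number of whole pixels carried by one 128-bit beat."""
    return BEAT_BITS // pixel_bits


def pad_tile(tile: list[list[int]], size: int = 256) -> list[list[int]]:
    """Expand a tile to size x size, filling the extended region with zeros.

    Assumption: the valid data stays in the top-left corner.
    """
    if len(tile) > size or any(len(row) > size for row in tile):
        raise ValueError("tile is larger than the padded size")
    padded = [list(row) + [0] * (size - len(row)) for row in tile]
    padded += [[0] * size for _ in range(size - len(tile))]
    return padded
```

  • With these definitions, 8-bit, 10-bit, and 12-bit pixels give 16, 12, and 10 pixels per beat respectively, matching the figures above.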
  • control path 110 may further include a data control unit 114.
  • the data control unit 114 can be configured to read a first descriptor of the first data, and the read control unit 112 is specifically configured to generate a read control signal according to the first descriptor.
  • the data control unit 114 may read the first descriptor of the first data from the core 260 via the core bus 250.
  • the descriptor of the embodiment of the present application may be used to indicate a priority, where there are multiple situations to be considered.
  • the first descriptor may include a first field, and the first field is used to indicate that the type of the first data is immediate data, 1D data, or 2D data.
  • the immediate priority is greater than the priority of one-dimensional (1D) data and two-dimensional (2D) data.
  • the amount of data of 1D data and 2D data transmitted by the 1D task and the 2D task are relatively large.
  • the DMA controller of the embodiment of the present application can improve the execution efficiency of the immediate data.
  • the DMA controller of the embodiments of the present application can support an efficient data receiving structure based on outstanding transactions.
  • the data receiving structure can make full use of the outstanding resources to receive data efficiently with only a small cache.
  • the data receiving structure has 8 IDentifiers (IDs) 0 to 7 that can be utilized.
  • the data receiving structure maintains a register file. Each row of the register file corresponds to an ID, and stores a cache storage address corresponding to the current ID.
  • when the DMA controller reads back valid data from the external memory via the external bus, it queries the register file row corresponding to the ID of the data to obtain the cache address of the data, and then buffers the data at that address. At the same time, the address stored in the register file is incremented by one. If the data currently read back is the last data of its burst, the register corresponding to the ID is set to 0 so that the ID can be used by other transfers.
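  • The per-ID register file can be modeled with a short Python sketch. The class and method names, the cache size, and the `start_burst` helper that seeds the base address are hypothetical; only the lookup, increment, and reset-to-0 behavior follows the description above.

```python
class ReceiveRegisterFile:
    """Toy model: one register row per ID holding the next cache address."""

    def __init__(self, num_ids: int = 8, cache_size: int = 64) -> None:
        self.addr = [0] * num_ids          # row i: cache address for ID i
        self.cache = [None] * cache_size   # receive cache

    def start_burst(self, txn_id: int, base_addr: int) -> None:
        # Hypothetical helper: record where the burst for this ID is buffered.
        self.addr[txn_id] = base_addr

    def on_data(self, txn_id: int, data, last: bool) -> int:
        """Buffer one returned beat and update the row of its ID."""
        address = self.addr[txn_id]        # query the register file by ID
        self.cache[address] = data         # buffer the data at that address
        # Increment the address, or reset to 0 after the last beat of the
        # burst so the ID can be used by other transfers.
        self.addr[txn_id] = 0 if last else address + 1
        return address
```

  • Because each ID keeps its own address, beats of different bursts may return interleaved and still land in the correct cache locations.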
  • the DMA controller supports 8 outstanding transactions; if an ID is currently free, that ID can be used to read or write immediate data.
  • the amount of immediate data is usually small, for example 128 bits, and only a single floating point operation is required to complete the transfer.
  • the DMA controller of the embodiment of the present application can greatly improve the utilization of the outstanding resources while consuming only a small amount of cache, thereby improving the reading efficiency of the DMA controller.
  • the first descriptor may include a second field, and the second field is used to indicate the priority of the task corresponding to the first data.
  • the descriptor of each 1D or 2D data item may be configured with a 3-bit second field for indicating the priority of the data task.
  • the priority can be divided into 8 levels. The control path selects the task with the highest priority from the descriptor cache for execution. If multiple tasks have the same priority, the task that is first pushed into the descriptor cache is selected for execution.
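  • The selection rule (highest priority first, first-in first-out among equal priorities) can be sketched in Python as below. This is a behavioral model under stated assumptions, not the register-level implementation: the class name is hypothetical, and a larger numeric value is assumed to mean a higher priority, which the text does not specify.

```python
class DescriptorCacheModel:
    """Behavioral model of priority selection with FIFO tie-breaking."""

    def __init__(self) -> None:
        self._tasks: list[tuple[int, int, str]] = []  # (priority, order, name)
        self._order = 0  # push counter used to break ties

    def push(self, name: str, priority: int) -> None:
        if not 0 <= priority <= 7:  # 3-bit field, 8 levels
            raise ValueError("priority must fit in 3 bits")
        self._tasks.append((priority, self._order, name))
        self._order += 1

    def pop_next(self) -> str:
        """Return the highest-priority task; the earliest push wins ties."""
        if not self._tasks:
            raise IndexError("descriptor cache is empty")
        # Assumption: larger value = higher priority.
        best = max(self._tasks, key=lambda t: (t[0], -t[1]))
        self._tasks.remove(best)
        return best[2]
```

  • Pushing tasks a (priority 3), b (7), c (7), and d (1) and then popping four times yields b, c, a, d under this assumption.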
  • the data control unit 114 may include a register for registering the priority of the task corresponding to the first data in the first descriptor, and the read control unit 112 is configured to generate the read control signal according to the priority in the register.
  • a specific implementation of the descriptor cache can be as follows. FIG. 3 is a schematic diagram of one implementation of a descriptor cache. As shown in FIG. 3, a register bank is set up. The register bank has a total of 7 rows, and each row corresponds to a descriptor in the descriptor cache.
  • row 0 corresponds to the descriptor of the task for the data at addresses 0 to 9, row 1 corresponds to the descriptor of the task for the data at addresses 10 to 19, and so on.
  • each row of the register bank may include 3 fields (field A, field B, and field C) totaling 7 bits. Field A (the used field) occupies 1 bit and indicates whether the storage space of that row holds a descriptor; a value of 1 indicates that it does.
  • field B occupies 3 bits and indicates the absolute value of the priority carried in the descriptor of the data (i.e., the value of the pri1 field is assigned from the pri field of the descriptor header).
  • field C occupies 3 bits and indicates the order in which tasks are pushed into the descriptor cache, ensuring that tasks of the same priority are handled first-in, first-out.
  • the value of the pri2 field of the descriptor is set to the register variable main, and the value of the pri2 field is decremented by 1.
  • the value of the pri2 field of each remaining descriptor is incremented by 1.
  • the first descriptor of the embodiment of the present application may include a third field, where the third field is used to indicate an unpacking mode of the first data.
  • FIG. 4 is a schematic diagram of a descriptor format of an embodiment of the present application.
  • This descriptor is a descriptor of 2D data.
  • the first halfword (16bit) of the descriptor is the header dscrp_head of the descriptor, which indicates the mode mode[1:0] of the data (dscrp_head[15:14]), the priority of the task corresponding to the data.
  • the mode of the data may be immediate data, 1D data, or 2D data.
  • the mode of the data in this example is 2D data.
  • the unpacking mode unpack_mode may be one of the aforementioned various unpacking modes.
  • the transfer direction direc is the read direction or the write direction.
  • dscrp_id[11:0] is used to indicate the ID of this descriptor.
  • ext_addr (including ext_addr[15:0] and ext_addr[31:16]) is used to indicate the data address of the external memory.
  • trans_len[15:0] is used to indicate the transmission length.
  • toggle_num[15:0] indicates the number of times the current operation needs to be repeated.
  • trans_width[15:0] and trans_stride[15:0] are used to indicate the 2D area. These two parameters are used to calculate the next start address when wrapping.
  • port_cfg1[15:0] and port_cfg2[15:0] correspond to the numbers of two internal memory locations, used for storing the upper 8 bits and the lower 8 bits of the data, respectively.
  • the three parameters padding_en[0], vld_data_height[6:0], and vld_data_width[6:0] are used for the padding operation.
  • a padding_en value of 1 enables padding, and a value of 0 disables padding.
  • vld_data_height[6:0] and vld_data_width[6:0] are used to define the height and width of the valid (unpadded) area.
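  • A few of the descriptor fields above can be decoded with simple bit operations, sketched here in Python. Only the positions stated in the text are used (mode in dscrp_head[15:14], ext_addr split into two halfwords); the function names, the `rows` parameter, and the interpretation of trans_stride as the address distance between consecutive row starts are assumptions of this example.

```python
def mode_from_head(dscrp_head: int) -> int:
    """Extract mode[1:0] from dscrp_head[15:14]; the value encoding is not assumed."""
    return (dscrp_head >> 14) & 0x3


def join_ext_addr(ext_addr_lo: int, ext_addr_hi: int) -> int:
    """Combine ext_addr[15:0] and ext_addr[31:16] into one 32-bit address."""
    return ((ext_addr_hi & 0xFFFF) << 16) | (ext_addr_lo & 0xFFFF)


def row_start_addresses(ext_addr: int, trans_stride: int, rows: int) -> list[int]:
    """Start address of each row of a 2D area.

    Assumption: when a row wraps, the next start address is the previous
    start address plus trans_stride.
    """
    return [ext_addr + row * trans_stride for row in range(rows)]
```

  • For example, halfwords 0x5678 (low) and 0x1234 (high) combine into the external address 0x12345678.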
  • the DMA controller of the embodiment of the present application may further include a write data path, and the control path may further include a write control unit, where the write control unit is configured to generate a write control signal and to control, by the write control signal, the write data path to read the second data from the internal storage unit via the internal bus; the write data path may include a packing unit, the packing unit is configured to pack the second data, and the write data path writes the packed second data to the external memory via the external bus.
  • the write data path may further include a second clock processing unit, configured to perform asynchronous FIFO cross-clock-domain processing on the packed second data after the packing unit packs the second data.
  • FIG. 5 is a schematic block diagram of a DMA controller 500 of another embodiment of the present application.
  • the DMA controller 500 includes a control path 510 and a write data path 520.
  • the control path 510 can include a write control unit 512 for generating a write control signal and controlling the write data path 520 to read the second data from the internal memory unit 240 via the internal bus 230 via a write control signal.
  • the write data path 520 can include a packing unit 522 for packaging the second data, and the write data path 520 writing the packed second data to the external memory 220 via the external bus 210.
  • the embodiment of the present application may specifically write the packed second data to the external memory 220 via the external bus 210 through the external write unit 526 of the write data path 520, which is not limited in this embodiment of the present application.
  • the DMA controller of the embodiment of the present application has a control path and a write data path; the write data path packs the data through the packing unit while writing it to the external memory, so that the DMA controller performs the packing operation while moving the data. Processing the two in parallel can improve the data processing speed and the performance of the chip.
  • the upstream unit of the DMA controller does not need to pack the original data, which increases the flexibility of the upstream unit, so that the DMA controller can be used in many different application scenarios.
  • the write data path 520 may further include a second clock processing unit 524, configured to perform asynchronous FIFO cross-clock-domain processing on the packed second data after the packing unit 522 packs the second data.
  • the packing unit 522 is specifically configured to pack the second data according to the preset packing mode.
  • the packing modes supported by the embodiments of the present application may include a direct mode, a low word mode, a low halfword mode, a parity column split mode, a one-bank copy mode, a high word mode, a high halfword mode, and At least one of the two-bank copy modes, and the embodiment of the present application is not limited thereto.
  • the working principle of the packing unit 522 of the embodiment of the present application may be similar to that of the unpacking unit, with the process reversed, and details are not described herein.
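  • Packing as the reverse of unpacking can be illustrated with the following Python sketch, which strips the zero padding from halfword values and stores the pixels compactly. The function name and the LSB-first bit ordering (pixel 0 in the lowest-order bits of a little-endian stream) are assumptions of this example, not details given in the text.

```python
def pack_pixels(pixels: list[int], pixel_bits: int) -> bytes:
    """Pack pixel values compactly, `pixel_bits` bits per pixel.

    Assumption: pixel i occupies bits [i * pixel_bits, (i + 1) * pixel_bits)
    of a little-endian bit stream. Bits above `pixel_bits` in each input
    value (the padding added by unpacking) are discarded.
    """
    if not 1 <= pixel_bits <= 16:
        raise ValueError("pixel width must be between 1 and 16 bits")
    mask = (1 << pixel_bits) - 1
    stream = 0
    for index, pixel in enumerate(pixels):
        stream |= (pixel & mask) << (index * pixel_bits)
    total_bytes = (len(pixels) * pixel_bits + 7) // 8  # round up to whole bytes
    return stream.to_bytes(total_bytes, "little")
```

  • Ten 16-bit halfwords holding 12-bit pixels shrink from 20 bytes to 15 bytes when packed this way.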
  • control path 510 may further include a data control unit 514.
  • the data control unit 514 is configured to read a second descriptor of the second data, and the write control unit 512 is specifically configured to generate a write control signal according to the second descriptor.
  • the data control unit 514 can read the second descriptor of the second data from the core 260 via the core bus 250.
  • the second descriptor may include a first field, where the first field is used to indicate that the type of the second data is immediate, one-dimensional 1D data, or two-dimensional 2D data.
  • the second descriptor may include a second field, where the second field is used to indicate a priority of the task corresponding to the second data.
  • the second descriptor may include a third field, where the third field is used to indicate a packing mode of the second data (the third field corresponding to the first descriptor is used to indicate an unpacking mode of the first data).
  • the format of the second descriptor of the embodiment of the present application may be the same as or similar to the format of the first descriptor.
  • the data control unit 514 may include a register for registering the priority of the task corresponding to the second data in the second descriptor, and the write control unit 512 is configured to generate the write control signal according to the priority in the register.
  • the DMA controller 500 may further include a read data path, and the control path may further include a read control unit; the read control unit may be configured to generate a read control signal and to control, by the read control signal, the read data path to read the first data from the external memory via the external bus; the read data path may include an unpacking unit, the unpacking unit may be configured to unpack the first data, and the read data path writes the unpacked first data to the internal storage unit via the internal bus.
  • the read data path may further include a first clock processing unit, configured to perform asynchronous FIFO cross-clock-domain processing on the first data read by the read data path, where the unpacking unit is specifically configured to unpack the first data after the FIFO cross-clock-domain processing.
  • the registers in the data control unit may register the priority of the task corresponding to the read data (e.g., the first data) and the priority of the task corresponding to the data to be written (e.g., the second data).
  • the read data and the write data can share the same register.
  • the read data and the write data may also use separate registers, which is not limited in this embodiment of the present application.
  • FIG. 6 is a schematic block diagram of a DMA controller 600 of another embodiment of the present application. As shown in FIG. 6, DMA controller 600 includes control path 610, read data path 620, and write data path 630.
  • Control path 610 can include read control unit 612 and write control unit 614.
  • the read control unit 612 is configured to generate a read control signal and control the read data path 620 to read data from the external memory 220 via the external bus 210 by a read control signal.
  • the write control unit 614 is configured to generate a write control signal and control the write data path 630 to read data from the internal memory unit 240 via the internal bus 230 by a write control signal.
  • Control path 610 can also include a data control unit 616.
  • Data control unit 616 can be used to read descriptors of the data.
  • the data control unit 616 can include an interface (crf_if) module, which is an interface module of the data control unit 616 and the core bus 250.
  • the data control unit 616 may also include a read immediate data cache (im_cache) module, a descriptor format check (dscrp_check) module, and a descriptor distribution (dscrp_distribute) module.
  • the im_cache module is a module for caching immediate data.
  • the dscrp_check module is a module for checking the descriptor format.
  • the dscrp_distribute module distributes the descriptor to the read immediate task cache (fifo_im_r) module, the write immediate task cache (fifo_im_w) module, the read 1D2D task cache (pri_1d2d_r) module, and the write 1D2D task cache (pri_1d2d_w) module according to the contents of the descriptor.
  • the fifo_im_r module is used to cache read immediate tasks.
  • the fifo_im_w module is used to cache write immediate tasks.
  • the pri_1d2d_r module is used to cache read 1D tasks and 2D tasks, and has a task priority function.
  • the pri_1d2d_w module is used to cache write 1D tasks and 2D tasks, and has a task priority function.
  • the read control unit 612 generates a read control signal based on the descriptors in the fifo_im_r module or the pri_1d2d_r module.
  • the write control unit 614 generates a write control signal based on the descriptors in the fifo_im_w module or the pri_1d2d_w module.
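  • The routing performed by the dscrp_distribute module can be sketched in Python as follows. The four queue names come from the text; the dictionary-based descriptor representation, the field values ("immediate", "1d", "2d", "read", "write"), and the use of plain FIFO queues for all four caches (ignoring the priority function of the pri_1d2d modules) are simplifying assumptions.

```python
from collections import deque

QUEUE_NAMES = ("fifo_im_r", "fifo_im_w", "pri_1d2d_r", "pri_1d2d_w")


def select_queue(mode: str, direc: str) -> str:
    """Choose the task cache for a descriptor from its mode and direction."""
    if mode not in ("immediate", "1d", "2d"):
        raise ValueError(f"unknown mode: {mode}")
    if direc not in ("read", "write"):
        raise ValueError(f"unknown direction: {direc}")
    suffix = "r" if direc == "read" else "w"
    if mode == "immediate":
        return f"fifo_im_{suffix}"
    return f"pri_1d2d_{suffix}"


def distribute(descriptors: list[dict]) -> dict[str, deque]:
    """Route each descriptor ID into one of the four task caches."""
    queues = {name: deque() for name in QUEUE_NAMES}
    for descriptor in descriptors:
        target = select_queue(descriptor["mode"], descriptor["direc"])
        queues[target].append(descriptor["id"])
    return queues
```

  • The read control unit would then draw from fifo_im_r or pri_1d2d_r, and the write control unit from fifo_im_w or pri_1d2d_w, as described above.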
  • the read data path 620 can include an external read unit 626 for reading data from the external memory 220 via the external bus 210; the read data path 620 can also include a first clock processing unit 624 for performing asynchronous FIFO cross-clock-domain processing on the data read by the read data path 620; the read data path 620 further includes an unpacking unit 622 for unpacking the data, after which the data is written to the internal storage unit 240 via the internal bus 230.
  • the write data path 630 can include a packing unit 632 for packing the data that the write data path 630 reads from the internal storage unit 240 via the internal bus 230; the write data path 630 can also include a second clock processing unit 634 for performing asynchronous FIFO cross-clock-domain processing on the data packed by the packing unit 632; the write data path 630 may further include an external write unit 626 for writing the packed and cross-clock-domain processed data to the external memory 220 via the external bus 210.
  • DMA controller 600 shown in FIG. 6 is only an example of the embodiment of the present application and is not limited thereto.
  • the DMA controller of the embodiment of the present application has been described above, and the data reading method and the data writing method corresponding thereto are respectively described in detail below.
  • FIG. 7 is a schematic flowchart of a data reading method 700 according to an embodiment of the present application.
  • Method 700 can be performed by a DMA controller that includes a control path and a read data path, the control path including a read control unit, and the read data path including an unpacking unit. As shown in FIG. 7, method 700 can include the following steps.
  • the read control unit generates a read control signal, and controls the read data path to read the first data from the external memory via the external bus through the read control signal.
  • the unpacking unit unpacks the first data.
  • the read data path writes the unpacked first data to the internal storage unit via the internal bus.
  • the read data path reads data from the external memory while the unpacking unit unpacks the data, so that the DMA controller performs the unpacking operation while the data is being moved; processing the two in parallel can improve the data processing speed and the performance of the chip.
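The three steps of method 700 (read from external memory, unpack, write to the internal storage unit) can be illustrated with a small software model. The unpacking mode used here, splitting each byte into two 4-bit values with the low nibble first, is a hypothetical example and is not taken from the application; `dma_read`, `unpack_nibbles` and the `burst` parameter are likewise illustrative names. Software executes the steps one after another per burst, whereas the hardware described above overlaps them.

```python
def unpack_nibbles(packed: bytes) -> bytes:
    """Hypothetical unpacking mode: one packed byte becomes two 4-bit values."""
    out = bytearray()
    for b in packed:
        out.append(b & 0x0F)  # low nibble first (assumed order)
        out.append(b >> 4)    # then high nibble
    return bytes(out)


def dma_read(external_memory: bytes, src: int, length: int,
             internal: bytearray, dst: int, burst: int = 4) -> int:
    """Move `length` packed bytes from external memory, unpacking per burst.

    Returns the number of unpacked bytes written to the internal buffer.
    """
    written = 0
    for off in range(0, length, burst):
        # read one burst from external memory (external bus in the description)
        chunk = external_memory[src + off: src + min(off + burst, length)]
        # unpack the burst before it reaches internal storage
        unpacked = unpack_nibbles(chunk)
        # write the unpacked burst to internal storage (internal bus)
        internal[dst + written: dst + written + len(unpacked)] = unpacked
        written += len(unpacked)
    return written
```

Because unpacking happens burst by burst inside the transfer loop, no separate pass over the data is needed after the move completes, which is the benefit the description attributes to the unpacking unit.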
  • in S720, the unpacking unit unpacking the first data may include: the unpacking unit unpacks the first data according to a preset unpacking mode.
  • the read data path may further include a first clock processing unit, and the method 700 may further include: the first clock processing unit performs asynchronous first-in first-out (FIFO) cross-clock-domain processing on the first data read by the read data path; in S720, the unpacking unit unpacking the first data may include: the unpacking unit unpacks the first data after the FIFO cross-clock-domain processing has been performed.
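The application does not describe how the asynchronous FIFO is built. A common technique for such FIFOs, shown here purely as background and as an assumption, is to exchange Gray-coded read and write pointers between the two clock domains, because consecutive Gray codes differ in exactly one bit. The sketch below models the pointer arithmetic only; `ADDR_BITS` and the function names are illustrative.

```python
ADDR_BITS = 2               # FIFO depth is 2**ADDR_BITS entries (illustrative)
PTR_BITS = ADDR_BITS + 1    # one extra wrap bit distinguishes full from empty
PTR_MASK = (1 << PTR_BITS) - 1


def bin_to_gray(n: int) -> int:
    """Convert a binary pointer to Gray code."""
    return n ^ (n >> 1)


def gray_to_bin(g: int) -> int:
    """Convert a Gray-coded pointer back to binary."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def is_empty(wr_ptr: int, rd_ptr: int) -> bool:
    """Empty when both Gray-coded pointers are identical."""
    return bin_to_gray(wr_ptr & PTR_MASK) == bin_to_gray(rd_ptr & PTR_MASK)


def is_full(wr_ptr: int, rd_ptr: int) -> bool:
    """Full when the Gray pointers match except for the two top bits."""
    top_two = 0b11 << (ADDR_BITS - 1)
    return bin_to_gray(wr_ptr & PTR_MASK) == bin_to_gray(rd_ptr & PTR_MASK) ^ top_two
```

The full test works because adding the FIFO depth to a pointer flips its top binary bit, which flips exactly the two top bits of its Gray code.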
  • the control path may further include a data control unit, and the method 700 may further include: the data control unit reads the first descriptor of the first data; in S710, the read control unit generating the read control signal may include: the read control unit generates the read control signal based on the first descriptor.
  • the first descriptor includes a first field, where the first field is used to indicate that the type of the first data is immediate, one-dimensional 1D data, or two-dimensional 2D data.
  • the first descriptor includes a second field, where the second field is used to indicate a priority of the task corresponding to the first data.
  • the first descriptor includes a third field, where the third field is used to indicate an unpacking mode of the first data.
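A descriptor carrying these three fields could be decoded as sketched below. The bit positions and widths (2 bits for the type, 3 for the priority, 3 for the unpacking mode) and the numeric type codes are hypothetical; the application only states that the fields exist, not how they are laid out.

```python
from enum import IntEnum
from typing import NamedTuple


class DataType(IntEnum):
    # numeric codes are assumptions made for this example
    IMMEDIATE = 0
    DATA_1D = 1
    DATA_2D = 2


class DescriptorFields(NamedTuple):
    data_type: DataType  # first field: immediate, 1D or 2D data
    priority: int        # second field: task priority
    unpack_mode: int     # third field: unpacking mode


# hypothetical layout: bits [1:0] type, bits [4:2] priority, bits [7:5] mode
def encode_descriptor(fields: DescriptorFields) -> int:
    """Pack the three fields into a single descriptor word."""
    return (int(fields.data_type) & 0x3) \
        | ((fields.priority & 0x7) << 2) \
        | ((fields.unpack_mode & 0x7) << 5)


def decode_descriptor(word: int) -> DescriptorFields:
    """Extract the three fields; raises ValueError on an unknown type code."""
    return DescriptorFields(
        data_type=DataType(word & 0x3),
        priority=(word >> 2) & 0x7,
        unpack_mode=(word >> 5) & 0x7,
    )
```

With such a layout, the control path can route on `data_type`, order tasks by `priority`, and hand `unpack_mode` to the unpacking unit without parsing anything else in the descriptor.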
  • the data control unit may include a register, and the method 700 may further include: registering, in the register, the priority of the task corresponding to the first data that is indicated in the first descriptor; the read control unit generating the read control signal according to the first descriptor may include: the read control unit generates the read control signal according to the priority in the register.
  • the DMA controller may further include a write data path, the control path may further include a write control unit, and the write data path may include a packing unit; the method 700 may further include: the write control unit generates a write control signal and, through the write control signal, controls the write data path to read the second data from the internal storage unit via the internal bus; the packing unit packs the second data; and the write data path writes the packed second data to the external memory via the external bus.
  • the write data path may further include a second clock processing unit, and the method 700 may further include: after the packing unit packs the second data, the second clock processing unit performs asynchronous FIFO cross-clock-domain processing on the packed second data.
  • FIG. 8 is a schematic flowchart of a data writing method 800 according to an embodiment of the present application.
  • Method 800 can be performed by a DMA controller that includes a control path including a write control unit and a write data path including a packing unit, and method 800 can include the following steps.
  • the write control unit generates a write control signal, and controls the write data path to read the second data from the internal storage unit via the internal bus through the write control signal.
  • the packing unit packs the second data.
  • the write data path writes the packed second data to the external memory via an external bus.
  • the write data path writes data to the external memory while the packing unit packs the data, so that the DMA controller performs the packing operation while the data is being moved; processing the two in parallel can improve the data processing speed and the performance of the chip.
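The write direction of method 800 can be modelled as a small streaming pipeline in which each burst read from internal storage is packed before it is written out. The packing mode (two 4-bit values per byte, low nibble first), the zero padding of an odd-length tail, and the names `read_internal`, `pack_nibbles` and `dma_write` are assumptions for illustration only.

```python
from typing import Iterable, Iterator


def read_internal(internal: bytes, burst: int) -> Iterator[bytes]:
    """Yield the internal buffer one burst at a time (internal bus side)."""
    for off in range(0, len(internal), burst):
        yield internal[off:off + burst]


def pack_nibbles(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Hypothetical packing mode: two 4-bit values share one output byte."""
    for chunk in chunks:
        if len(chunk) % 2:
            chunk = chunk + b"\x00"  # pad an odd tail with a zero value (assumed)
        yield bytes((chunk[i] & 0x0F) | ((chunk[i + 1] & 0x0F) << 4)
                    for i in range(0, len(chunk), 2))


def dma_write(internal: bytes, external: bytearray, dst: int, burst: int = 4) -> int:
    """Pack and move the internal buffer to external memory burst by burst.

    Returns the number of packed bytes written to the external buffer.
    """
    if burst <= 0 or burst % 2:
        raise ValueError("burst must be a positive even number of values")
    written = 0
    for packed in pack_nibbles(read_internal(internal, burst)):
        external[dst + written: dst + written + len(packed)] = packed
        written += len(packed)
    return written
```

The generators are chained so that a burst is packed as soon as it has been read, mirroring the way the description has packing take place during the transfer rather than as a separate pass.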
  • in S820, the packing unit packing the second data may include: the packing unit packs the second data according to a preset packing mode.
  • the write data path may further include a second clock processing unit, and the method 800 may further include: after the packing unit packs the second data, the second clock processing unit performs asynchronous FIFO cross-clock-domain processing on the packed second data.
  • the control path may further include a data control unit, and the method 800 may further include: the data control unit reads the second descriptor of the second data; in S810, the write control unit generating the write control signal may include: the write control unit generates the write control signal based on the second descriptor.
  • the second descriptor includes a first field, where the first field is used to indicate that the type of the second data is immediate, one-dimensional 1D data, or two-dimensional 2D data.
  • the second descriptor includes a second field, where the second field is used to indicate a priority of the task corresponding to the second data.
  • the second descriptor includes a third field, where the third field is used to indicate a packing mode of the second data.
  • the data control unit may include a register, and the method 800 may further include: registering, in the register, the priority of the task corresponding to the second data that is indicated in the second descriptor; the write control unit generating the write control signal according to the second descriptor may include: the write control unit generates the write control signal according to the priority in the register.
  • the DMA controller may further include a read data path, the control path may further include a read control unit, and the read data path may include an unpacking unit; the method 800 may further include: the read control unit generates a read control signal and, through the read control signal, controls the read data path to read the first data from the external memory via the external bus; the unpacking unit unpacks the first data; and the read data path writes the unpacked first data to the internal storage unit via the internal bus.
  • the read data path may further include a first clock processing unit, and the method 800 may further include: the first clock processing unit performs asynchronous first-in first-out (FIFO) cross-clock-domain processing on the first data read by the read data path; the unpacking unit unpacking the first data may include: the unpacking unit unpacks the first data after the FIFO cross-clock-domain processing has been performed.
  • the embodiment of the present application further provides an integrated circuit including at least one of the DMA controller 100, the DMA controller 500, and the DMA controller 600 of the embodiment of the present application.
  • the integrated circuit of the embodiment of the present application may be an Application Specific Integrated Circuit (ASIC) or a Field-Programmable Gate Array (FPGA).
  • the embodiment of the present application further provides a computing chip, which includes at least one of the DMA controller 100, the DMA controller 500, and the DMA controller 600 of the embodiment of the present application.
  • the embodiment of the present application further provides a mobile device, which includes at least one of the DMA controller 100, the DMA controller 500, and the DMA controller 600 of the embodiment of the present application.
  • the mobile device of the embodiment of the present application may be an aircraft, and in particular may be a drone.
  • the division of circuits, sub-circuits, and sub-units in the various embodiments of the present application is merely illustrative. Those of ordinary skill in the art will appreciate that the circuits, sub-circuits, and sub-units of the various examples described in the embodiments disclosed herein can be further separated or combined.
  • the computer program product includes one or more computer instructions.
  • the computer can be a general purpose computer, a special purpose computer, a computer network, or another programmable device.
  • the computer instructions can be stored in a computer readable storage medium or transmitted from one computer readable storage medium to another computer readable storage medium; for example, the computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wirelessly (e.g., infrared, wireless, microwave).
  • the computer readable storage medium can be any available media that can be accessed by a computer or a data storage device such as a server, data center, or the like that includes one or more available media.
  • the usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, a magnetic tape), an optical medium (for example, a high-density digital video disc (DVD)), or a semiconductor medium (for example, a solid state hard disk (Solid State Disk, SSD)) and so on.
  • the magnitude of the sequence numbers of the foregoing processes does not imply an order of execution; the order of execution of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present application.
  • B corresponding to A means that B is associated with A, and B can be determined according to A.
  • determining B from A does not mean that B is determined based on A alone; B can also be determined based on A and/or other information.
  • the disclosed systems, devices, and methods may be implemented in other manners.
  • the device embodiments described above are merely illustrative.
  • the division of the unit is only a logical function division.
  • there may be other division manners; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in an electrical, mechanical or other form.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.

Abstract

Disclosed are a DMA controller, a data reading method, and a data writing method. The DMA controller comprises a control path and a read data path. The control path comprises a read control unit configured to generate a read control signal and to control, by means of the read control signal, the read data path to read first data from an external memory via an external bus. The read data path comprises an unpacking unit configured to unpack the first data, and the read data path writes the unpacked first data to an internal storage unit via an internal bus. By providing the control path and the read data path in the DMA controller, the DMA controller unpacks the data by means of the unpacking unit while the read data path reads the data from the external memory, so that the DMA controller performs the unpacking operation during data transfer; processing the two in parallel improves the data processing speed and the performance of the chip.
PCT/CN2017/108644 2017-10-31 2017-10-31 Contrôleur d'accès direct à la mémoire, procédé de lecture de données et procédé d'écriture de données WO2019084789A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201780010826.8A CN108701102A (zh) 2017-10-31 2017-10-31 直接存储器访问控制器、数据读取方法和数据写入方法
PCT/CN2017/108644 WO2019084789A1 (fr) 2017-10-31 2017-10-31 Contrôleur d'accès direct à la mémoire, procédé de lecture de données et procédé d'écriture de données

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/108644 WO2019084789A1 (fr) 2017-10-31 2017-10-31 Contrôleur d'accès direct à la mémoire, procédé de lecture de données et procédé d'écriture de données

Publications (1)

Publication Number Publication Date
WO2019084789A1 true WO2019084789A1 (fr) 2019-05-09

Family

ID=63844132

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/108644 WO2019084789A1 (fr) 2017-10-31 2017-10-31 Contrôleur d'accès direct à la mémoire, procédé de lecture de données et procédé d'écriture de données

Country Status (2)

Country Link
CN (1) CN108701102A (fr)
WO (1) WO2019084789A1 (fr)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112823343A (zh) * 2020-03-11 2021-05-18 深圳市大疆创新科技有限公司 直接内存存取单元、处理器、设备、处理方法及存储介质
CN115794722B (zh) * 2023-02-10 2023-05-16 智绘微电子科技(南京)有限公司 一种AXI outstanding接口转换实现方法
CN117033270B (zh) * 2023-10-08 2024-01-26 腾讯科技(深圳)有限公司 一种芯片、设备以及数据处理方法
CN117971746A (zh) * 2024-03-28 2024-05-03 深圳鲲云信息科技有限公司 用于控制直接内存访问控制器的方法及计算设备

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6584513B1 (en) * 2000-03-31 2003-06-24 Emc Corporation Direct memory access (DMA) transmitter
US20070162644A1 (en) * 2005-12-13 2007-07-12 Poseidon Design Systems, Inc. Data packing in A 32-bit DMA architecture
CN102135946A (zh) * 2010-01-27 2011-07-27 中兴通讯股份有限公司 一种数据处理方法和装置
CN103714026A (zh) * 2014-01-14 2014-04-09 中国人民解放军国防科学技术大学 一种支持原址数据交换的存储器访问方法及装置
CN106951388A (zh) * 2017-03-16 2017-07-14 湖南博匠信息科技有限公司 一种基于PCIe的DMA数据传输方法及系统

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100370436C (zh) * 2005-09-07 2008-02-20 深圳市海思半导体有限公司 一种提高存储器访问效率的方法及存储器控制器
TW200921395A (en) * 2007-11-14 2009-05-16 Sonix Technology Co Ltd System and method of direct memory access
CN103885919B (zh) * 2014-03-20 2017-01-04 北京航空航天大学 一种多dsp和fpga并行处理系统及实现方法

Also Published As

Publication number Publication date
CN108701102A (zh) 2018-10-23

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17930645

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17930645

Country of ref document: EP

Kind code of ref document: A1