WO2023178859A1 - DMA data transfer method and device - Google Patents

Info

Publication number
WO2023178859A1
Authority
WO
WIPO (PCT)
Prior art keywords
address
data transfer
discrete
dma
instruction set
Prior art date
Application number
PCT/CN2022/100634
Other languages
French (fr)
Chinese (zh)
Inventor
马成勇
秦旋
袁峰
Original Assignee
奥比中光科技集团股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 奥比中光科技集团股份有限公司
Publication of WO2023178859A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00: Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14: Handling requests for interconnection or transfer
    • G06F13/20: Handling requests for interconnection or transfer for access to input/output bus
    • G06F13/28: Handling requests for interconnection or transfer for access to input/output bus using burst mode transfer, e.g. direct memory access DMA, cycle steal

Definitions

  • This application belongs to the field of computer technology, and in particular relates to a direct memory access (Direct Memory Access, DMA) data transfer method and device.
  • DMA transfer refers to copying data from one address space to another. DMA transfer plays an important role in fields such as high-performance embedded algorithms and networking. In the existing technology, a DMA can only move the data in one address region to another address region. For example, the data in the address range [a, b] can be moved to the address range [h, j], where "the address range [a, b]" can be understood as the region of memory between location a and location b.
  • When the data to be moved is spread over several such address regions, the DMA needs to perform multiple operations to complete the transfer of all the data. This frequently generates DMA interrupts, which disturb the central processing unit (CPU) or the coprocessors that assist the CPU with processing tasks it cannot perform or performs inefficiently.
  • Embodiments of the present application provide a DMA data transfer method and device that can solve one or more technical problems in the related art.
  • In a first aspect, an embodiment of the present application provides a DMA data transfer method applied to a DMA controller. The method includes: obtaining a read instruction set, where the read instruction set includes first configuration information, and the first configuration information includes first discrete address information and first continuous address information; and, according to the first configuration information in the read instruction set, transferring the data in the discrete address range corresponding to the first discrete address information to the continuous address range corresponding to the first continuous address information.
  • In a second aspect, an embodiment of the present application provides a DMA data transfer method applied to a DMA controller. The method includes: obtaining a write instruction set, where the write instruction set includes second configuration information, and the second configuration information includes second discrete address information and second continuous address information; and, according to the second configuration information in the write instruction set, transferring the data in the continuous address range corresponding to the second continuous address information to the discrete address range corresponding to each piece of second discrete address information.
  • In a third aspect, an embodiment of the present application provides a DMA data transfer device configured in a DMA controller. The device includes: a first acquisition module for acquiring a read instruction set, where the read instruction set includes first configuration information, and the first configuration information includes first discrete address information and first continuous address information; and a data reading module for transferring, according to the first configuration information in the read instruction set, the data in the discrete address range corresponding to the first discrete address information to the continuous address range corresponding to the first continuous address information.
  • In a fourth aspect, an embodiment of the present application provides a DMA data transfer device configured in a DMA controller. The device includes: a second acquisition module for acquiring a write instruction set, where the write instruction set includes second configuration information, and the second configuration information includes second discrete address information and second continuous address information; and a data writing module for transferring, according to the second configuration information in the write instruction set, the data in the continuous address range corresponding to the second continuous address information to the discrete address range corresponding to each piece of second discrete address information.
  • In a fifth aspect, an embodiment of the present application provides a coprocessor. The coprocessor is coupled with a DMA controller, or the DMA controller is integrated into the coprocessor, and the DMA controller includes the DMA data transfer device according to the third aspect and/or the DMA data transfer device according to the fourth aspect.
  • In a sixth aspect, an embodiment of the present application provides an electronic device, including a memory, a coprocessor, and a computer program stored in the memory and executable on the coprocessor. When the coprocessor executes the computer program, it implements the DMA data transfer method described in any embodiment of the first aspect and/or the DMA data transfer method described in any embodiment of the second aspect.
  • In a seventh aspect, an embodiment of the present application provides a computer-readable storage medium. The computer-readable storage medium stores a computer program. When the computer program is executed by a coprocessor, it implements the method described in any embodiment of the first aspect.
  • In an eighth aspect, an embodiment of the present application provides a computer program product. When the computer program product runs on an electronic device, the electronic device can implement the DMA data transfer method described in any embodiment of the first aspect and/or the DMA data transfer method described in any embodiment of the second aspect.
  • The embodiments of the present application control the DMA data transfer operation through an instruction set, transferring data in discrete address ranges to a continuous address range, or data in a continuous address range to individual discrete address ranges, so that the normal operation of the CPU is not affected by DMA interrupts during the DMA data transfer.
  • Figure 1 is a schematic flow chart of the implementation of a DMA data transfer method provided by an embodiment of the present application
  • Figure 2 is a schematic structural diagram of a read instruction set provided by an embodiment of the present application.
  • Figure 3 is a schematic diagram of the meaning of configuration information provided by an embodiment of the present application.
  • Figure 4 is a schematic diagram of the meaning of configuration information provided by an embodiment of the present application.
  • Figure 5 is a schematic diagram of an image and its sub-images provided by an embodiment of the present application.
  • Figure 6 is a schematic diagram of the meaning of configuration information in an image processing application scenario provided by an embodiment of the present application.
  • Figure 7 is a schematic flow chart of the implementation of a DMA data transfer method provided by an embodiment of the present application.
  • Figure 8 is a schematic structural diagram of a write instruction set provided by an embodiment of the present application.
  • Figure 9 is a schematic process diagram of a DMA data transfer method provided by an embodiment of the present application.
  • Figure 10 is a schematic diagram of an image processing application scenario provided by an embodiment of the present application.
  • Figure 11 is a schematic diagram of the meaning of configuration information in an image processing application scenario provided by an embodiment of the present application.
  • Figure 12 is a schematic structural diagram of a DMA data transfer device provided by an embodiment of the present application.
  • Figure 13 is a schematic structural diagram of a DMA data transfer device provided by an embodiment of the present application.
  • FIG. 14 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • The term "connection" should be understood in a broad sense. For example, it can be a fixed connection, a detachable connection, or an integral connection; it can be a direct connection or an indirect connection through an intermediate medium; and it can be an internal connection between two components or an interaction between two components.
  • The DMA data transfer in the embodiments of this application is divided into two parts.
  • One is DMA data transfer based on the read instruction set, which moves the data at a number of discrete addresses into a continuous address range.
  • The other is DMA data transfer based on the write instruction set, which moves the data in a continuous address range out to individual discrete address regions. Data transfer based on the write instruction set can therefore be regarded as the reverse of data transfer based on the read instruction set.
  • DMA data transfer based on the read instruction set also has a locking function.
  • The DMA data transfer method provided by this application can be executed by a DMA controller, which can be coupled with a coprocessor or integrated into a coprocessor. In either case, the DMA data transfer method is executed under the control of the coprocessor; no limitation is imposed here.
  • Figure 1 is a schematic flowchart of a DMA data transfer method based on a read instruction set provided by an embodiment of the present application.
  • The method may include steps S110 to S120.
  • The read instruction set includes configuration information, and the configuration information includes discrete address information, continuous address information and other information.
  • There may be one or more pieces of discrete address information, which serve as the source address information; the continuous address information is the target address information. The data at one or more source addresses is moved to a single target address.
  • The configuration information is determined based on user operations and/or system settings, and the read instruction set is generated from the configuration information.
  • The discrete address information in the read instruction set includes the starting offset address of the source memory, the address space size of a single discrete address, the address distance between two adjacent discrete addresses of the same data source, the number of discrete addresses of the same data source, the address distance between two adjacent discrete addresses of different data sources, the number of data sources, and so on. The continuous address information includes the starting offset address of the destination memory, and so on.
  • Figure 2 shows a schematic structural diagram of the read instruction set. As shown in Figure 2, the read instruction set has a total of 256 bits. The configuration information included in the read instruction set is shown in Table 1 below.
  • Regions a1, a3 and a5 are regions from which data needs to be read, and the three have address spaces of equal size; a2 and a4 are regions from which no data needs to be read, and the two likewise have address spaces of equal size. The total number of small intervals a1, a3 and a5 is L2_sector_num, and the address space size of a2 and a4 is L2_sector_gap.
  • Within each small interval a1, a3 and a5, the gap between adjacent gray areas has a space size of L1_sector_gap, and the number of gray areas is L1_sector_num.
  • Based on this configuration information, an address schedule can be established inside the DMA. After moving the data in one gray-area address space, the DMA automatically jumps to the next gray-area address space and continues, until all the gray-area data in the source memory has been moved to the destination memory; that is, the data at each discrete address is moved into a continuous address range, as shown in Figures 3 and 4. With this configuration information, the DMA can move the data in multiple discrete address ranges into a continuous address range in a single operation.
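The address schedule described above can be sketched in Python as follows. This is a minimal illustration rather than the hardware described in the application: the function names, the byte-granular addressing and the flat argument list are assumptions, and so is the convention that each gap counts the addresses skipped between the end of one region and the start of the next. Only the field names sector_length, L1_sector_gap, L1_sector_num, L2_sector_gap and L2_sector_num come from the text.

```python
def sector_offsets(start, sector_length, l1_gap, l1_num, l2_gap, l2_num):
    """Yield the start offset of each discrete region, in transfer order.

    Assumed convention (not stated in the source): a gap is the number of
    addresses skipped between the end of one region and the start of the next.
    """
    addr = start
    for _ in range(l2_num):              # one pass per small interval (a1, a3, ...)
        for i in range(l1_num):          # one pass per gray area in the interval
            yield addr
            addr += sector_length
            if i < l1_num - 1:
                addr += l1_gap           # hop over the gap inside the interval
        addr += l2_gap                   # hop over the unread region (a2, a4, ...)


def dma_read(src, dst, dst_start, start, sector_length, l1_gap, l1_num, l2_gap, l2_num):
    """Gather every discrete region of src into one continuous run of dst."""
    out = dst_start
    for off in sector_offsets(start, sector_length, l1_gap, l1_num, l2_gap, l2_num):
        dst[out:out + sector_length] = src[off:off + sector_length]
        out += sector_length
    return out - dst_start               # number of bytes moved


# Hypothetical layout: 2 intervals, each holding 2 regions of 2 bytes.
src = bytearray(range(32))
dst = bytearray(8)
offsets = list(sector_offsets(2, 2, 1, 2, 4, 2))
moved = dma_read(src, dst, 0, 2, 2, 1, 2, 4, 2)
```

In this sketch a single call to dma_read walks all four regions, which mirrors the point of the text: one instruction set describes the whole gather, so no per-region intervention is needed.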
  • The memory stores the images to be processed in the order in which they are arranged. As shown in Figure 5, the memory stores the first image, followed by the second image, then the third image, until all images have been stored. Note that the image data of a single image is stored row by row in sequence.
  • By design, a single image that is too large is split into multiple smaller images, that is, sub-images, for processing, as shown in Figure 5.
  • A single image can be viewed as consisting of many sub-images, such as A0, B0, A1, B1, A2, B2, A3, and B3.
  • During processing, the sub-images at the same position and of the same size are taken out of all the images. For example, as shown in Figure 5, sub-images A0, A1, A2, and A3 are taken out first and processed in parallel.
  • Once sub-images A0, A1, A2, and A3 have been processed, sub-images B0, B1, B2, and B3 are taken out for processing. The same operation is repeated until all sub-images have been processed in sequence, at which point the processing of all images is complete.
  • The sub-image data needs to be transferred to the corresponding coprocessor for processing.
  • With the conventional method, A0 is moved first, then A1, then A2, then A3, then B0, then B1, and so on until all sub-images have been moved. The DMA therefore generates an interrupt signal to the CPU every time it moves a sub-image, so the CPU is interrupted frequently and its normal operation is affected.
  • Embodiments of the present application therefore provide a DMA data transfer method based on a read instruction set.
  • The configuration information is determined through user operations and/or system settings, a read instruction set is generated from the configuration information, and the read instruction set is then used to complete the data transfer, ensuring the normal operation of the CPU.
  • sector_length can be set to the image width of the sub-image; L1_sector_gap to the distance from the end of the first row to the beginning of the second row in two adjacent rows of the sub-image; L1_sector_num to the image height of the sub-image; and L2_sector_num to the total number of images (L2_sector_num is 4 in Figure 6).
  • A read instruction set is generated according to the structure shown in Figure 2, and it is used to transfer the four sub-images A0, A1, A2, and A3 from the four images at discrete addresses into a contiguous address range.
  • The DMA controller then directly notifies the coprocessor to carry out subsequent processing. With the above configuration, all sub-images obtained by splitting the images can be transferred to the coprocessor in sequence through read instruction sets, avoiding frequent interrupt signals that would affect the CPU.
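As a worked example of the sub-image configuration above, the following Python sketch derives the fields from the image geometry and then gathers one sub-image from every image. It rests on several assumptions that are not in the source: one byte per pixel, images stored back to back with no padding, gaps counted as the bytes skipped between the end of one row segment and the start of the next, and hypothetical helper names (sub_image_read_config, gather). The L2_sector_gap formula is derived from those assumptions.

```python
def sub_image_read_config(img_w, img_h, n_images, x0, y0, sub_w, sub_h):
    """Derive the read-instruction fields for the sub-image at (x0, y0)."""
    return {
        "start": y0 * img_w + x0,          # first pixel of the sub-image in image 0
        "sector_length": sub_w,            # one row of the sub-image
        "L1_sector_gap": img_w - sub_w,    # rest of the image row to skip
        "L1_sector_num": sub_h,            # rows per sub-image
        # from the end of the last sub-image row to the same sub-image in the next image
        "L2_sector_gap": img_w * img_h - ((sub_h - 1) * img_w + sub_w),
        "L2_sector_num": n_images,
    }


def gather(memory, cfg):
    """Collect the configured sub-image of every image into one continuous buffer."""
    out = bytearray()
    addr = cfg["start"]
    for _ in range(cfg["L2_sector_num"]):
        for row in range(cfg["L1_sector_num"]):
            out += memory[addr:addr + cfg["sector_length"]]
            addr += cfg["sector_length"]
            if row < cfg["L1_sector_num"] - 1:
                addr += cfg["L1_sector_gap"]
        addr += cfg["L2_sector_gap"]
    return out


# Four 4x2 images stored one after another; A is the left half, B the right half.
IMG_W, IMG_H, N = 4, 2, 4
memory = bytes(range(IMG_W * IMG_H * N))
cfg_a = sub_image_read_config(IMG_W, IMG_H, N, 0, 0, 2, 2)
cfg_b = sub_image_read_config(IMG_W, IMG_H, N, 2, 0, 2, 2)
a_tiles = gather(memory, cfg_a)            # A0, A1, A2, A3 back to back
b_tiles = gather(memory, cfg_b)            # B0, B1, B2, B3 back to back
```

Each gather call corresponds to one read instruction set, so the two sub-image groups need two instruction sets rather than eight separate transfers.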
  • The DMA based on the read instruction set also has a locking function: the address space between lock_bn_begin (the starting position for the locking function) and lock_bn_end (the end position for the locking function) in the configuration information of the instruction set is locked. Once the space is locked, the DMA controller cannot transfer data into it. If the next read instruction set is configured to transfer data into the locked space, the DMA transfer stops until the space is unlocked. Unlocking is controlled by the coprocessor.
  • Locking is needed when the data in the address space has not yet been used by the coprocessor, or will be needed by the coprocessor later.
  • Locking prevents the data in the address space from being overwritten, and can reduce the number of times a sub-image has to be read repeatedly.
  • To unlock, the coprocessor sends an unlock signal to the DMA controller. After receiving the unlock signal, the DMA controller unlocks the address space so that data can again be transferred into it.
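The locking behaviour can be modelled with a small Python sketch. It is only a toy under stated assumptions: the class and method names are hypothetical, the locked range is treated as half-open [lock_bn_begin, lock_bn_end) in destination byte offsets, only one range is locked at a time, the lock takes effect after the instruction's own data has been written, and a stalled transfer is represented by returning False instead of blocking.

```python
class LockingDma:
    """Toy model of the locking rule; addresses are destination offsets."""

    def __init__(self, dst_size):
        self.dst = bytearray(dst_size)
        self.locked = None                       # (lock_bn_begin, lock_bn_end) or None

    def execute(self, dst_start, data, lock_bn_begin=None, lock_bn_end=None):
        """Run one read instruction; return False if it stalls on a locked range."""
        end = dst_start + len(data)
        if self.locked is not None:
            lo, hi = self.locked
            if dst_start < hi and lo < end:      # destination overlaps the locked range
                return False
        self.dst[dst_start:end] = data
        if lock_bn_begin is not None and lock_bn_end is not None:
            self.locked = (lock_bn_begin, lock_bn_end)
        return True

    def unlock(self):
        """Models the unlock signal sent by the coprocessor."""
        self.locked = None


dma = LockingDma(8)
first = dma.execute(0, b"\x01\x02\x03\x04", lock_bn_begin=0, lock_bn_end=4)
stalled = dma.execute(2, b"\x09\x09")            # targets the locked range
elsewhere = dma.execute(4, b"\x05\x06")          # outside the locked range
dma.unlock()
retried = dma.execute(2, b"\x09\x09")            # succeeds once unlocked
```

The stalled transfer leaves the locked bytes untouched, which is the property the text relies on to keep data the coprocessor still needs from being overwritten.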
  • the embodiment of the present application also provides a DMA data transfer method based on a write instruction set.
  • the DMA data transfer method based on the write instruction set can be understood as the reverse operation of the DMA data transfer method based on the read instruction set, which transfers data in the continuous address range to each discrete address range.
  • the DMA data transfer method based on the write instruction set may include steps S210 to S220.
  • The write instruction set includes configuration information, and the configuration information includes at least discrete address information and continuous address information.
  • The continuous address information is the source address information; there may be one or more pieces of discrete address information, which serve as the target address information. The data at one source address is moved to one or more discrete destination addresses.
  • The configuration information is determined based on user operations and/or system settings, and the write instruction set is generated from the configuration information.
  • the continuous address information of the write instruction set includes the starting offset address information of the source memory, etc.; the discrete address information includes the starting offset address of the destination memory, the address space size of a single discrete address, and the same data source.
  • S220 According to the configuration information in the write instruction set, transfer the data in the continuous address range corresponding to the continuous address information to the discrete address range corresponding to each discrete address information.
  • FIG. 8 shows a schematic structural diagram of the write instruction set. As shown in FIG. 8 , the write instruction set has a total of 256 bits. The configuration information included in the write instruction set is shown in Table 2 below.
  • DMA data transfer based on the write instruction set can be viewed as the operation shown in Figure 9: continuous data is read out and written into one or more discrete address ranges. With the configuration information in the write instruction set, the DMA can move the data in a continuous address range into multiple discrete address ranges in a single operation.
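The write direction can be sketched in Python as a scatter over the same two-level layout. As before, this is an illustration under assumptions rather than the described hardware: the function name dma_write, the byte-granular addressing and the gap convention (bytes skipped between the end of one region and the start of the next) are not taken from the source.

```python
def dma_write(src, src_start, dst, start, sector_length, l1_gap, l1_num, l2_gap, l2_num):
    """Scatter one continuous run of src into the discrete regions of dst."""
    addr = start                             # next discrete destination offset
    pos = src_start                          # next continuous source offset
    for _ in range(l2_num):
        for i in range(l1_num):
            dst[addr:addr + sector_length] = src[pos:pos + sector_length]
            pos += sector_length
            addr += sector_length
            if i < l1_num - 1:
                addr += l1_gap               # skip the gap inside the interval
        addr += l2_gap                       # skip the region between intervals
    return pos - src_start                   # number of bytes moved


# Hypothetical layout: 2 intervals, each holding 2 regions of 2 bytes.
src = bytes([10, 11, 20, 21, 30, 31, 40, 41])
dst = bytearray(16)
moved = dma_write(src, 0, dst, 2, 2, 1, 2, 4, 2)
```

Bytes of dst that fall in the gaps are left unchanged, so the scatter only touches the configured discrete regions.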
  • After each sub-image is processed, it needs to be written to its corresponding location; that is, the data at continuous addresses is moved to discrete address spaces.
  • This operation can be completed with the DMA data transfer method based on the write instruction set provided by the embodiments of the present application.
  • The sector_length, L1_sector_gap, L1_sector_num, L2_sector_gap, and L2_sector_num values configured in the write instruction set describe the corresponding layout of the final output image, as shown in Figure 11.
  • sector_length can be set to the image width of the sub-image; L1_sector_gap to the distance from the end of the first row to the beginning of the second row in two adjacent rows of the sub-image; L1_sector_num to the image height of the sub-image; and L2_sector_num to the total number of images (4 in Figure 11).
  • The last pixel of sub-image A'0 in the first image in Figure 11 is C', and the first pixel of sub-image A'1 in the second image is D'. The address distance between these two pixels is recorded as L2_sector_gap.
  • The sequence numbers of the steps in the above embodiments do not imply an order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.
  • An embodiment of the present application also provides a DMA data transfer device.
  • The DMA data transfer device is configured in a DMA controller.
  • For details of the DMA data transfer device that are not described here, refer to the relevant descriptions of the foregoing method embodiments; they are not repeated here.
  • Figure 12 is a schematic structural diagram of a DMA data transfer device provided by an embodiment of the present application. As shown in Figure 12, the DMA data transfer device includes: a first acquisition module 1210 and a data reading module 1220.
  • the first acquisition module 1210 is used to acquire the read instruction set.
  • the data reading module 1220 is used to transfer data in the discrete address range corresponding to the discrete address information to the continuous address range corresponding to the continuous address information according to the configuration information in the read instruction set.
  • FIG 13 is a schematic structural diagram of a DMA data transfer device provided by an embodiment of the present application.
  • the DMA data transfer device includes: a second acquisition module 1310 and a data writing module 1320.
  • the second acquisition module 1310 is used to acquire the write instruction set.
  • the data writing module 1320 is used to transfer data in the continuous address range corresponding to the continuous address information to the discrete address range corresponding to each discrete address information according to the configuration information in the write instruction set.
  • An embodiment of the present application also provides a DMA controller.
  • the DMA controller includes the DMA data transfer device of the embodiment shown in FIG. 12 and/or the DMA data transfer device of the embodiment shown in FIG. 13 .
  • An embodiment of the present application also provides a coprocessor.
  • the coprocessor is integrated with the foregoing DMA controller, or the coprocessor is coupled with the foregoing DMA controller.
  • An embodiment of the present application also provides an electronic device.
  • The electronic device may include one or more coprocessors 1400 (only one is shown in FIG. 14), a memory 1410, and a computer program 1420 stored in the memory 1410 and executable on the one or more coprocessors 1400, for example, a program for DMA data transfer.
  • When the one or more coprocessors 1400 execute the computer program 1420, they may implement the steps of the DMA data transfer method of the embodiment shown in FIG. 1 and/or of the embodiment shown in FIG. 7.
  • Alternatively, when the one or more coprocessors 1400 execute the computer program 1420, they may implement the functions of the modules/units of the DMA data transfer device of the embodiment shown in FIG. 12 and/or of the embodiment shown in FIG. 13; no limitation is imposed here.
  • the computer program 1420 may be divided into one or more modules/units, and the one or more modules/units are stored in the memory 1410 and executed by the coprocessor 1400 to complete the present application.
  • One or more modules/units may be a series of computer program instruction segments capable of completing specific functions. The instruction segments are used to describe the execution process of the computer program 1420 in the processing unit.
  • the computer program 1420 can be divided into several modules as follows.
  • the specific functions of each module are as follows:
  • the first acquisition module is used to acquire the read instruction set
  • the data reading module is used to transfer data in the discrete address range corresponding to the discrete address information to the continuous address range corresponding to the continuous address information according to the configuration information in the read instruction set.
  • the second acquisition module is used to acquire the write instruction set
  • the data writing module is used to transfer data in the continuous address range corresponding to the continuous address information to the discrete address range corresponding to each discrete address information according to the configuration information in the write instruction set.
  • FIG. 14 is only an example of an electronic device and does not constitute a limitation on the electronic device.
  • the electronic device may include more or less components than shown in the figures, or a combination of certain components, or different components.
  • the electronic device may also include input and output devices, network access devices, buses, etc.
  • The coprocessor 1400 can be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on.
  • a general-purpose processor may be a microprocessor or the processor may be any conventional processor, etc.
  • The memory 1410 may be an internal storage unit of the electronic device, such as a hard disk or internal memory of the electronic device.
  • The memory 1410 can also be an external storage device of the electronic device, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card fitted to the electronic device.
  • the memory 1410 may also include both an internal storage unit of the electronic device and an external storage device.
  • Memory 1410 is used to store computer programs and other programs and data required by the electronic device.
  • the memory 1410 may also be used to temporarily store data that has been output or is to be output.
  • an embodiment of the present application also provides another preferred embodiment of an electronic device.
  • the electronic device includes one or more co-processors.
  • One or more coprocessors are used to execute the following program modules stored in memory:
  • the first acquisition module is used to acquire the read instruction set
  • the data reading module is used to transfer data in the discrete address range corresponding to the discrete address information to the continuous address range corresponding to the continuous address information according to the configuration information in the read instruction set.
  • the second acquisition module is used to acquire the write instruction set
  • the data writing module is used to transfer data in the continuous address range corresponding to the continuous address information to the discrete address range corresponding to each discrete address information according to the configuration information in the write instruction set.
  • The internal structure of the device may be divided into different functional units or modules to complete all or part of the functions described above.
  • Each functional unit and module in the embodiments can be integrated into one processing unit, or each unit can exist physically alone, or two or more units can be integrated into one unit.
  • The above-mentioned integrated unit can be implemented in the form of hardware or in the form of software functional units.
  • the specific names of each functional unit and module are only for the convenience of distinguishing each other and are not used to limit the scope of protection of the present application.
  • For the specific working processes of the units and modules in the above system please refer to the corresponding processes in the foregoing method embodiments, and will not be described again here.
  • An embodiment of the present application also provides a computer-readable storage medium.
  • The computer-readable storage medium stores a computer program.
  • When the computer program is executed by a processor, the DMA data transfer method of the embodiment shown in Figure 1 and/or the DMA data transfer method of the embodiment shown in FIG. 7 can be implemented.
  • An embodiment of the present application provides a computer program product.
  • When the computer program product runs on an electronic device, the electronic device can implement the DMA data transfer method of the embodiment shown in Figure 1 and/or the DMA data transfer method of the embodiment shown in Figure 7.
  • The above-mentioned first acquisition module, data reading module, second acquisition module, and data writing module can be implemented in the form of hardware, firmware, software (that is, a computer program), or a combination of two or more of these three.
  • the above-mentioned first acquisition module, data reading module, second acquisition module, data writing module, etc. may be logic circuits implemented on an integrated circuit.
  • The related functions of the above-mentioned first acquisition module, data reading module, second acquisition module, and data writing module can be implemented as hardware using hardware description languages (such as Verilog HDL or VHDL) or other suitable programming languages.
  • The related functions of the above-mentioned first acquisition module, data reading module, second acquisition module, and data writing module can be implemented in one or more controllers, microcontrollers, microprocessors, application-specific integrated circuits (ASICs), digital signal processors (DSPs), field-programmable gate arrays, and/or various logic blocks, modules and circuits in other processing units.
  • the related functions of the above-mentioned first acquisition module, data reading module, second acquisition module, data writing module, etc. can be implemented as computer programs.
  • the computer program includes computer program code
  • the computer program code can be in the form of source code, object code, executable file or some intermediate form, etc.
  • Computer program code may be recorded/stored in computer-readable media.
  • Computer-readable media may include: any entity or device capable of carrying computer program code, recording media, USB flash drives, mobile hard drives, magnetic disks, optical disks, computer memory, read-only memory (ROM), RAM, electrical carrier signals, telecommunications signals, software distribution media, etc.
  • a central processing unit (Central Processing Unit, CPU), controller, microcontroller, or microprocessor can read and execute the computer program code from the computer-readable medium, thereby realizing the related functions of the above-mentioned first acquisition module, data reading module, second acquisition module, and data writing module.
  • the content contained in the computer-readable medium can be appropriately increased or decreased according to the requirements of legislation and patent practice in the jurisdiction.
  • the computer-readable medium does not include electrical carrier signals and telecommunications signals.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Bus Control (AREA)

Abstract

The present application relates to the technical field of computers and provides a DMA data transfer method and device. The data transfer method is applied to a DMA controller and comprises: obtaining a read instruction set, wherein the read instruction set comprises first configuration information, and the first configuration information comprises first discrete address information and first continuous address information; and, according to the first configuration information in the read instruction set, transferring the data within the discrete address range corresponding to the first discrete address information to the continuous address range corresponding to the first continuous address information. In the embodiments of the present application, the DMA data transfer operation is controlled by means of the instruction set, so that the normal operation of the CPU is not disturbed by DMA interrupts during the DMA data transfer.

Description

A DMA data transfer method and device
This application claims priority to Chinese patent application No. 202210276686.1, filed with the China Patent Office on March 21, 2022 and entitled "A DMA data transfer method and device", the entire content of which is incorporated into this application by reference.
Technical field
This application belongs to the field of computer technology, and in particular relates to a direct memory access (Direct Memory Access, DMA) data transfer method and device.
Background
DMA transfer refers to copying data from one address space to another address space. DMA transfer plays an important role in fields such as high-performance embedded system algorithms and networking. In the existing technology, DMA can only move the data in one address region to another address region. For example, data in the address range [a, b] can be moved to the address range [h, j], where "in the address range [a, b]" can be understood as the region of memory between location a and location b.
However, when data in multiple discrete address ranges needs to be moved into one continuous address range, multiple DMA transfer operations must be initiated. For example, moving the data in the address ranges [a1, b1], [a2, b2], and [a3, b3] in order into the address range [h, j] requires the DMA to initiate three transfer operations. Similarly, moving the data in the address range [a, b] in order into three discrete address ranges [h1, j1], [h2, j2], and [h3, j3] also requires the DMA to initiate three transfer operations.
It can be seen that, in the related art, when such a transfer is required, the DMA must perform multiple operations to move all the data. This frequently generates DMA interrupts, which affects the central processing unit (Central Processing Unit, CPU), or other coprocessors that assist the CPU with processing work that the CPU cannot perform or performs inefficiently or poorly.
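The cost described above can be made concrete with a small software model. The following is a minimal sketch, not the behavior of any particular DMA hardware: it assumes memory is a flat byte array, that ranges are inclusive as in the [a, b] notation, and that every conventional transfer ends with one interrupt. The function name `conventional_gather` is hypothetical.

```python
def conventional_gather(memory, src_ranges, dst_start):
    """Copy each inclusive [start, end] range with its own transfer.

    Returns the number of transfers issued; in this model each transfer
    stands for one DMA operation that would end with an interrupt.
    """
    transfers = 0
    dst = dst_start
    for start, end in src_ranges:
        length = end - start + 1
        memory[dst:dst + length] = memory[start:start + length]
        dst += length
        transfers += 1
    return transfers


# Three discrete source ranges gathered into one continuous destination range.
memory = bytearray(range(64))
count = conventional_gather(memory, [(0, 3), (8, 11), (16, 19)], 40)
gathered = bytes(memory[40:52])
```

In this model three source ranges cost three separate transfers, which is the overhead the instruction-set approach aims to avoid.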
Summary of the invention
In view of this, embodiments of the present application provide a DMA data transfer method and device that can solve one or more technical problems in the related art.
In a first aspect, an embodiment of the present application provides a DMA data transfer method applied to a DMA controller. The DMA data transfer method includes: obtaining a read instruction set, where the read instruction set includes first configuration information, and the first configuration information includes first discrete address information and first continuous address information; and, according to the first configuration information in the read instruction set, transferring the data in the discrete address range corresponding to the first discrete address information to the continuous address range corresponding to the first continuous address information.
In a second aspect, an embodiment of the present application provides a DMA data transfer method applied to a DMA controller. The DMA data transfer method includes: obtaining a write instruction set, where the write instruction set includes second configuration information, and the second configuration information includes second discrete address information and second continuous address information; and, according to the second configuration information in the write instruction set, transferring the data in the continuous address range corresponding to the second continuous address information to the discrete address ranges corresponding to each piece of second discrete address information.
In a third aspect, an embodiment of the present application provides a DMA data transfer device configured in a DMA controller, including: a first acquisition module, configured to obtain a read instruction set, where the read instruction set includes first configuration information, and the first configuration information includes first discrete address information and first continuous address information; and a data reading module, configured to transfer, according to the first configuration information in the read instruction set, the data in the discrete address range corresponding to the first discrete address information to the continuous address range corresponding to the first continuous address information.
In a fourth aspect, an embodiment of the present application provides a DMA data transfer device configured in a DMA controller, including: a second acquisition module, configured to obtain a write instruction set, where the write instruction set includes second configuration information, and the second configuration information includes second discrete address information and second continuous address information; and a data writing module, configured to transfer, according to the second configuration information in the write instruction set, the data in the continuous address range corresponding to the second continuous address information to the discrete address ranges corresponding to each piece of second discrete address information.
In a fifth aspect, an embodiment of the present application provides a coprocessor. The coprocessor is coupled with a DMA controller, or the DMA controller is integrated in the coprocessor. The DMA controller includes the DMA data transfer device described in the third aspect and/or the DMA data transfer device described in the fourth aspect.
In a sixth aspect, an embodiment of the present application provides an electronic device, including a memory, a coprocessor, and a computer program stored in the memory and executable on the coprocessor. When the coprocessor executes the computer program, it implements the DMA data transfer method described in any embodiment of the first aspect and/or the DMA data transfer method described in any embodiment of the second aspect.
In a seventh aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer program. When the computer program is executed by a coprocessor, it implements the DMA data transfer method described in any embodiment of the first aspect and/or the DMA data transfer method described in any embodiment of the second aspect.
In an eighth aspect, an embodiment of the present application provides a computer program product. When the computer program product runs on an electronic device, the electronic device can implement the DMA data transfer method described in any embodiment of the first aspect and/or the DMA data transfer method described in any embodiment of the second aspect.
The embodiments of the present application control the DMA data transfer operation through an instruction set, so that data in discrete address ranges can be moved into a continuous address range, or data in a continuous address range can be moved into discrete address ranges, and the normal operation of the CPU is not affected by DMA interrupts generated during the DMA data transfer.
Description of the drawings
To explain the technical solutions in the embodiments of the present application more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Figure 1 is a schematic flowchart of a DMA data transfer method provided by an embodiment of the present application;
Figure 2 is a schematic structural diagram of a read instruction set provided by an embodiment of the present application;
Figure 3 is a schematic diagram of the meaning of configuration information provided by an embodiment of the present application;
Figure 4 is a schematic diagram of the meaning of configuration information provided by an embodiment of the present application;
Figure 5 is a schematic diagram of an image and its sub-images provided by an embodiment of the present application;
Figure 6 is a schematic diagram of the meaning of configuration information in an image processing application scenario provided by an embodiment of the present application;
Figure 7 is a schematic flowchart of a DMA data transfer method provided by an embodiment of the present application;
Figure 8 is a schematic structural diagram of a write instruction set provided by an embodiment of the present application;
Figure 9 is a schematic process diagram of a DMA data transfer method provided by an embodiment of the present application;
Figure 10 is a schematic diagram of an image processing application scenario provided by an embodiment of the present application;
Figure 11 is a schematic diagram of the meaning of configuration information in an image processing application scenario provided by an embodiment of the present application;
Figure 12 is a schematic structural diagram of a DMA data transfer device provided by an embodiment of the present application;
Figure 13 is a schematic structural diagram of a DMA data transfer device provided by an embodiment of the present application;
Figure 14 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
Detailed description
In the following description, specific details such as particular system structures and technologies are set forth for the purpose of explanation rather than limitation, to provide a thorough understanding of the embodiments of the present application. However, it will be apparent to those skilled in the art that the present application may be practiced in other embodiments without these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so that unnecessary detail does not obscure the description of the present application.
As used in this specification and the appended claims, the term "and/or" refers to and includes any and all possible combinations of one or more of the associated listed items.
Reference in this specification to "one embodiment" or "some embodiments" means that a particular feature, structure, or characteristic described in connection with that embodiment is included in one or more embodiments of the application. Therefore, the phrases "in one embodiment", "in some embodiments", "in some other embodiments", "in still other embodiments", and the like appearing in different places in this specification do not necessarily all refer to the same embodiment, but mean "one or more but not all embodiments", unless specifically emphasized otherwise. The terms "including", "comprising", "having", and their variations all mean "including but not limited to", unless specifically emphasized otherwise.
In addition, in the description of this application, "multiple" means two or more. The terms "first", "second", and the like are used only to distinguish descriptions and are not to be construed as indicating or implying relative importance.
It should also be understood that, unless otherwise expressly specified or limited, the term "connection" is to be understood broadly. For example, it may be a fixed connection, a detachable connection, or an integral connection; it may be a direct connection or an indirect connection through an intermediate medium; and it may be internal communication between two components or an interaction between two components. Those of ordinary skill in the art can understand the specific meaning of the term in this application according to the specific situation.
The DMA data transfer of the embodiments of this application has two parts. The first is DMA data transfer based on a read instruction set, which moves the data at a number of discrete addresses into one continuous address range. The second is DMA data transfer based on a write instruction set, which moves the data in one continuous address range into a number of discrete address regions; data transfer based on the write instruction set can therefore be regarded as the inverse of data transfer based on the read instruction set. In addition, DMA data transfer based on the read instruction set has a locking function.
It should be noted that the DMA data transfer method provided by this application can be executed by a DMA controller, and the DMA controller can be coupled with a coprocessor or integrated in a coprocessor. If the DMA controller is coupled with or integrated in a coprocessor, the DMA data transfer method is executed under the control of the coprocessor. No limitation is imposed here.
Figure 1 is a schematic flowchart of a DMA data transfer method based on a read instruction set provided by an embodiment of the present application. The method may include steps S110 to S120.
S110: obtain the read instruction set.
The read instruction set includes configuration information, and the configuration information includes discrete address information, continuous address information, and other information. There may be one or more pieces of discrete address information, which is the source address information; the continuous address information is the destination address information. The data at one or more source addresses is moved to one destination address.
As a possible implementation, the configuration information is determined according to user operations and/or system settings, and the read instruction set is generated from that configuration information.
In some embodiments, the discrete address information in the read instruction set includes the starting offset address of the source memory, the address space size of a single discrete address, the address distance between two adjacent discrete addresses of the same data source, the number of discrete addresses of the same data source, the address distance between two adjacent discrete addresses of different data sources, the number of data sources, and so on. The continuous address information includes the starting offset address of the destination memory and so on.
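The fields listed above can be collected in a small record for experimentation. The following is a minimal sketch, not the 256-bit layout of the actual instruction set. The names `sector_length`, `L1_sector_gap`, `L1_sector_num`, `L2_sector_gap`, and `L2_sector_num` come from the description; their pairing with the items in the paragraph above is an inference from the later figures, and `src_offset`, `dst_offset`, and the class name `ReadConfig` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadConfig:
    src_offset: int      # hypothetical name: starting offset address in source memory
    sector_length: int   # address space size of a single discrete address
    l1_sector_gap: int   # distance between adjacent discrete addresses, same data source
    l1_sector_num: int   # number of discrete addresses per data source
    l2_sector_gap: int   # distance between adjacent discrete addresses, different sources
    l2_sector_num: int   # number of data sources
    dst_offset: int      # hypothetical name: starting offset address in destination memory

    def sector_count(self) -> int:
        """Total number of discrete address ranges covered by one instruction."""
        return self.l1_sector_num * self.l2_sector_num

    def total_length(self) -> int:
        """Size of the continuous destination range that receives the data."""
        return self.sector_length * self.sector_count()


cfg = ReadConfig(src_offset=0, sector_length=4, l1_sector_gap=4,
                 l1_sector_num=2, l2_sector_gap=16, l2_sector_num=4,
                 dst_offset=0)
```

With these example values one instruction describes 8 discrete ranges and fills 32 address units in the destination.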
S120: according to the configuration information in the read instruction set, transfer the data in the discrete address ranges corresponding to the discrete address information to the continuous address range corresponding to the continuous address information.
In some embodiments, Figure 2 shows the structure of the read instruction set. As shown in Figure 2, the read instruction set is 256 bits in total. The configuration information included in the read instruction set is listed in Table 1 below.
Table 1
[Table 1 appears in the original publication as images PCTCN2022100634-appb-000001 and PCTCN2022100634-appb-000002.]
The meanings of sector_length, L1_sector_gap, L1_sector_num, L2_sector_gap, and L2_sector_num in the configuration information of Table 1 are illustrated in Figures 3 and 4. Suppose the discrete addresses holding the data in the source memory lie in one large interval A. A can be divided into several sub-intervals, for example the five sub-intervals a1, a2, a3, a4, and a5, where a1, a3, and a5 are regions from which data must be read and have equal address space sizes, and a2 and a4 are regions from which no data is read and have equal address space sizes. The total number of sub-intervals such as a1, a3, and a5 is L2_sector_num, and the address space size of each of a2 and a4 is L2_sector_gap. Further, suppose the gray regions in the figures are all the same size and hold data that must be transferred, and the blank regions are all the same size and hold data that is not transferred. Then the address space size of a gray region is sector_length, the address space size of a blank region is L1_sector_gap, and the number of gray regions within each of the sub-intervals a1, a3, and a5 is L1_sector_num. Based on this configuration information, an address schedule can be established inside the DMA: after the data in one gray region has been moved, the DMA automatically jumps to the next gray region and moves its data, until the data of all gray regions in the source memory has been moved to the destination memory, that is, the data at every discrete address has been moved into one continuous address range, as shown in Figures 3 and 4. With this configuration information, the DMA can move the data in multiple discrete address ranges into one continuous address range in a single operation.
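The address schedule can be modeled in software as two nested loops. The following is a minimal sketch under stated assumptions, not the hardware implementation: memory is a flat byte array, the address advances by L1_sector_gap after every gray region (including the last one in a sub-interval), and L2_sector_gap is skipped in addition at the end of each sub-interval, which is consistent with the relation L2_sector_gap = |CD| - L1_sector_gap used in the image example. The function names `sector_addresses` and `gather` are hypothetical.

```python
def sector_addresses(src_offset, sector_length, l1_gap, l1_num, l2_gap, l2_num):
    """Yield the start address of every gray region in transfer order."""
    addr = src_offset
    for _ in range(l2_num):          # one pass per sub-interval that is read (a1, a3, a5)
        for _ in range(l1_num):      # gray regions inside that sub-interval
            yield addr
            addr += sector_length + l1_gap
        addr += l2_gap               # skip the sub-interval that is not read (a2, a4)


def gather(src, dst, dst_offset, src_offset, sector_length,
           l1_gap, l1_num, l2_gap, l2_num):
    """Pack all gray regions of src back to back into dst; return bytes moved."""
    write = dst_offset
    for addr in sector_addresses(src_offset, sector_length,
                                 l1_gap, l1_num, l2_gap, l2_num):
        dst[write:write + sector_length] = src[addr:addr + sector_length]
        write += sector_length
    return write - dst_offset


src = bytes(range(32))
dst = bytearray(8)
addresses = list(sector_addresses(0, 2, 2, 2, 4, 2))
moved = gather(src, dst, 0, 0, 2, 2, 2, 4, 2)
```

Here a single call covers four discrete ranges, so the model issues one logical transfer instead of four.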
Taking image data as an example, the DMA data transfer method provided by the embodiments of the present application is described in detail below.
In image processing there are often many images to be processed. In general, the memory stores the images in the order in which they are arranged. As shown in Figure 5, the memory first stores the first image, then the second image, then the third image, and so on until all images are stored. It should be noted that the image data of a single image is stored row by row, with each row following directly after the previous one.
However, to limit the cost of hardware resources, a single image that is too large is split at design time into multiple small images, that is, sub-images, for processing, as shown in Figure 5. A single image can be regarded as consisting of many sub-images, such as A0, B0, A1, B1, A2, B2, A3, and B3. In each processing pass, the sub-images at the same position and of the same size are taken from all images and processed. For example, as shown in Figure 5, sub-images A0, A1, A2, and A3 are first taken out and processed in parallel; once they have been processed, sub-images B0, B1, B2, and B3 are taken out and processed. The same operation is repeated until all sub-images, and therefore all images, have been processed. During this process, the sub-image data must be moved to the corresponding coprocessor. With a conventional DMA data transfer method, A0 must be moved first, then A1, then A2, then A3, then B0, then B1, and so on until all sub-images have been moved. The conventional method therefore causes the DMA to send an interrupt signal to the CPU every time a sub-image has been moved, so that the CPU is interrupted frequently, which affects its normal operation.
To address this problem, the embodiments of the present application provide a DMA data transfer method based on a read instruction set: the configuration information is first determined through user operations and/or system settings, the read instruction set is then generated from the configuration information, and the data transfer is completed using the read instruction set so that the CPU can work normally.
In one embodiment, as shown in Figure 6, sector_length can be set to the image width of the sub-image, L1_sector_gap to the distance from the end of the first row to the start of the second row in two adjacent rows of the sub-image, L1_sector_num to the image height of the sub-image, and L2_sector_num to the total number of images (L2_sector_num is 4 in Figure 6). With reference to Figures 5 and 6, suppose the last pixel of sub-image A0 in the first image is C and the first pixel of sub-image A1 in the second image is D, and denote the address distance between these two pixels by |CD|. Then L2_sector_gap = |CD| - L1_sector_gap, where L2_sector_gap is the address distance between two images, that is, the length of the data that does not need to be read.
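The mapping from image geometry to configuration values can be worked through numerically. The following is a minimal sketch under several assumptions that the description does not state: one address unit per pixel, images stored back to back in row-major order, and |CD| taken as the number of address units skipped between pixel C and pixel D. The function name `read_config_for_subimage` and its parameters are hypothetical.

```python
def read_config_for_subimage(img_w, img_h, sub_w, sub_h, x0, y0, num_images, base=0):
    """Derive read-instruction fields for the sub-image at column x0, row y0."""
    # C: last pixel of the sub-image in the first image.
    c = base + (y0 + sub_h - 1) * img_w + x0 + sub_w - 1
    # D: first pixel of the same sub-image in the second image.
    d = base + img_w * img_h + y0 * img_w + x0
    cd = d - c - 1                    # address units skipped between C and D (assumed)
    l1_gap = img_w - sub_w            # end of one sub-image row to start of the next
    return {
        "src_offset": base + y0 * img_w + x0,
        "sector_length": sub_w,
        "L1_sector_gap": l1_gap,
        "L1_sector_num": sub_h,
        "L2_sector_gap": cd - l1_gap,  # L2_sector_gap = |CD| - L1_sector_gap
        "L2_sector_num": num_images,
    }


# Four 8x4 images, each split into 4x2 sub-images; configure the top-left one.
cfg = read_config_for_subimage(img_w=8, img_h=4, sub_w=4, sub_h=2,
                               x0=0, y0=0, num_images=4)
```

Under these assumptions L2_sector_gap simplifies to img_w * (img_h - sub_h), so the walk over one image advances exactly img_w * img_h address units, which is the size of one stored image.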
Further, a read instruction set is generated from the configuration information obtained above according to the structure shown in Figure 2, and the read instruction set is used to move the four sub-images A0, A1, A2, and A3, which belong to four images located at discrete addresses, into one continuous address range. When the data transfer is complete, the DMA controller directly notifies the coprocessor to carry out subsequent processing. With this configuration, the sub-images obtained by splitting all the images can be moved to the coprocessor in turn through read instruction sets, avoiding frequent interrupt signals that affect the CPU.
In some embodiments of this application, DMA based on the read instruction set also has a locking function: the address space between lock_bn_begin (the start position of the locking function) and lock_bn_end (the end position of the locking function) in the instruction set's configuration information is locked. Once this space is locked, the DMA controller cannot move data into it. If the next read instruction set is configured to move data into the locked space, the DMA transfer stops until the space is unlocked; unlocking is controlled by the coprocessor. Locking is typically used when the data in the address space has not yet been fully used by the coprocessor or will be needed by the coprocessor later. Locking prevents the data in the address space from being overwritten and can reduce the number of times a sub-image is read repeatedly. When the coprocessor no longer needs the data in the address space, it sends an unlock signal to the DMA controller. After receiving the unlock signal, the DMA controller unlocks the address space, and data can again be moved into it.
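The locking behavior can be illustrated with a small state model. The following is a minimal sketch, not the actual hardware mechanism: it assumes lock_bn_begin and lock_bn_end can be treated as destination addresses bounding a half-open range, which the description does not specify, and it models "the DMA transfer stops" as a transfer attempt that reports failure. The class name `LockingDmaModel` and its methods are hypothetical.

```python
class LockingDmaModel:
    """Tracks one locked destination range and rejects transfers that touch it."""

    def __init__(self):
        self.locked_range = None          # (begin, end) with end exclusive, or None

    def lock(self, lock_bn_begin, lock_bn_end):
        """Lock the space between lock_bn_begin and lock_bn_end."""
        self.locked_range = (lock_bn_begin, lock_bn_end)

    def unlock(self):
        """Model the unlock signal sent by the coprocessor."""
        self.locked_range = None

    def try_transfer(self, dst_start, length):
        """Return True if a transfer into [dst_start, dst_start + length) may proceed."""
        if self.locked_range is not None:
            begin, end = self.locked_range
            if dst_start < end and dst_start + length > begin:
                return False              # stalled until the range is unlocked
        return True


dma = LockingDmaModel()
dma.lock(0, 16)
stalled = dma.try_transfer(8, 4)          # overlaps the locked range
allowed = dma.try_transfer(16, 4)         # starts right after the locked range
dma.unlock()
after_unlock = dma.try_transfer(8, 4)     # the same transfer proceeds once unlocked
```

A real controller would hold the pending instruction and resume it on unlock; the boolean result only marks where that stall would occur.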
The embodiments of the present application also provide a DMA data transfer method based on a write instruction set. It should be noted that this method can be understood as the inverse of DMA data transfer based on the read instruction set: it moves the data in a continuous address range into multiple discrete address ranges.
In one embodiment, as shown in Figure 7, the DMA data transfer method based on the write instruction set may include steps S210 to S220.
S210: obtain the write instruction set.
The write instruction set includes configuration information, and the configuration information includes at least discrete address information and continuous address information. The continuous address information is the source address information; there may be one or more pieces of discrete address information, which is the destination address information. The data at one source address is moved to one or more discrete destination addresses.
As a possible implementation, the configuration information is determined according to user operations and/or system settings, and the write instruction set is generated from that configuration information.
In some embodiments, the continuous address information of the write instruction set includes the starting offset address of the source memory and so on. The discrete address information includes the starting offset address of the destination memory, the address space size of a single discrete address, the address distance between two adjacent discrete addresses of the same data source, the number of discrete addresses of the same data source, the address distance between two adjacent discrete addresses of different data sources, the number of data sources, and so on.
S220: according to the configuration information in the write instruction set, transfer the data in the continuous address range corresponding to the continuous address information to the discrete address ranges corresponding to each piece of discrete address information.
在一些实施例中,图8所示为写指令集的结构示意图,如图8所示,写指令集总共256bit。写指令集包括的配置信息如下表2所示。In some embodiments, FIG. 8 shows a schematic structural diagram of the write instruction set. As shown in FIG. 8 , the write instruction set has a total of 256 bits. The configuration information included in the write instruction set is shown in Table 2 below.
Table 2
(Table 2 is reproduced only as images in the original publication: Figure PCTCN2022100634-appb-000003 and Figure PCTCN2022100634-appb-000004.)
The fields sector_length, L1_sector_gap, L1_sector_num, L2_sector_gap, and L2_sector_num in the configuration information of Table 2 have the same meanings as in Table 1; see also FIG. 3 and FIG. 4. They are not described again here.
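As a reading aid, the configuration fields named above can be collected in a small record. The sketch below is an assumption-laden illustration: the field names `src_offset` and `dst_offset` are hypothetical stand-ins for the source and destination starting offset addresses, and no bit widths or field ordering from the 256-bit instruction are modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WriteConfig:
    src_offset: int      # hypothetical name: start offset in the source memory
    dst_offset: int      # hypothetical name: start offset in the destination memory
    sector_length: int   # size of a single discrete address range
    L1_sector_gap: int   # distance between adjacent sectors of one data source
    L1_sector_num: int   # number of sectors per data source
    L2_sector_gap: int   # extra distance between sectors of different sources
    L2_sector_num: int   # number of data sources

    def __post_init__(self) -> None:
        # All fields are sizes, counts, or offsets, so none may be negative.
        for name, value in vars(self).items():
            if value < 0:
                raise ValueError(f"{name} must be non-negative, got {value}")

    def total_length(self) -> int:
        """Amount of continuous source data consumed by one write instruction."""
        return self.sector_length * self.L1_sector_num * self.L2_sector_num
```

`total_length` reflects the fact that one instruction moves L2_sector_num groups of L1_sector_num sectors, each sector_length long.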
DMA data transfer based on the write instruction set can be viewed as the operation shown in FIG. 9: continuous data is read out and written into one or more discrete address ranges. With the configuration information in the write instruction set, the DMA can move the data of one continuous address range into multiple discrete address ranges in a single operation.
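The continuous-to-discrete transfer can be sketched in software as follows. This is a minimal model, not the hardware implementation. It assumes that each gap is measured from just past the end of one sector to the start of the next, and that L2_sector_gap is added on top of L1_sector_gap when crossing from one data source to the next, which is one reading consistent with the relation L2_sector_gap = |C'D'| - L1_sector_gap given in this description. The function and parameter names are hypothetical.

```python
from typing import List


def sector_starts(dst_offset: int, sector_length: int, l1_gap: int,
                  l1_num: int, l2_gap: int, l2_num: int) -> List[int]:
    """Start address of every destination sector, in transfer order."""
    l1_stride = sector_length + l1_gap      # sector to sector, same source
    l2_stride = l1_num * l1_stride + l2_gap  # source to source
    return [dst_offset + j * l2_stride + k * l1_stride
            for j in range(l2_num) for k in range(l1_num)]


def scatter_write(src: bytes, src_offset: int, dst: bytearray, dst_offset: int,
                  sector_length: int, l1_gap: int, l1_num: int,
                  l2_gap: int, l2_num: int) -> None:
    """Copy continuous source data into the discrete destination sectors."""
    cursor = src_offset
    for start in sector_starts(dst_offset, sector_length, l1_gap,
                               l1_num, l2_gap, l2_num):
        if start + sector_length > len(dst):
            raise IndexError("destination sector lies outside the buffer")
        if cursor + sector_length > len(src):
            raise IndexError("source data is shorter than the transfer")
        dst[start:start + sector_length] = src[cursor:cursor + sector_length]
        cursor += sector_length
```

For example, with sector_length 2, L1 gap 1, two sectors per source, L2 gap 3, and two sources starting at offset 1, the four sectors begin at addresses 1, 4, 10, and 13.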
The following continues the image data embodiment of FIG. 5 and FIG. 6 and applies the DMA data transfer method to image processing. During image processing, still referring to FIG. 5 and FIG. 6, a large image is split into multiple small sub-images for processing. The sub-images A0, A1, A2, and A3, which have the same position and the same size in every image, must be moved out; that is, data in discrete address spaces is moved into a continuous address space. As shown in FIG. 10, once the coprocessor has finished processing, that is, once the data of sub-images A0, A1, A2, and A3 has been processed into A'0, A'1, A'2, and A'3, the processed data must be moved back out. Each processed sub-image has to be written to its corresponding position; that is, data at continuous addresses is moved into discrete address spaces. The DMA data transfer method based on the write instruction set provided by the embodiments of this application can perform this operation. In this case, sector_length, L1_sector_gap, L1_sector_num, L2_sector_gap, and L2_sector_num in the write instruction set describe the corresponding layout in the final output images, as shown in FIG. 11. Specifically, sector_length can be set to the width of the sub-image, L1_sector_gap to the distance from the end of one row of the sub-image to the start of the next row, L1_sector_num to the height of the sub-image, and L2_sector_num to the total number of images (4 in FIG. 11). Suppose the last pixel of sub-image A'0 in the first image of FIG. 11 is C' and the first pixel of sub-image A'1 in the second image is D', and denote the address distance between these two pixels by |C'D'|. Then L2_sector_gap = |C'D'| - L1_sector_gap.
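The parameter choices for the image case can be worked through numerically. The sketch below makes several assumptions that the description does not state: the output images are stored back to back in row-major order, every image has the same size, the sub-image sits at the same position in each image, and distances are measured from just past the last pixel of a row (if they are instead measured from the last pixel itself, each distance changes by one pixel). The function name and its parameters are hypothetical.

```python
from typing import Dict


def write_params_for_subimage(img_w: int, img_h: int, sub_w: int, sub_h: int,
                              num_images: int,
                              bytes_per_pixel: int = 1) -> Dict[str, int]:
    """Derive the five write-instruction fields from image geometry."""
    sector_length = sub_w * bytes_per_pixel
    # From the end of one sub-image row to the start of the next row.
    l1_gap = (img_w - sub_w) * bytes_per_pixel
    # |C'D'|: from the end of the sub-image's last row in one image to the
    # first pixel of the sub-image in the next image.
    cd = (img_w * img_h - (sub_h - 1) * img_w - sub_w) * bytes_per_pixel
    return {
        "sector_length": sector_length,
        "L1_sector_gap": l1_gap,
        "L1_sector_num": sub_h,
        "L2_sector_gap": cd - l1_gap,  # relation given in the description
        "L2_sector_num": num_images,
    }
```

For 8x6 images with a 4x3 sub-image and one byte per pixel, this gives sector_length 4, L1_sector_gap 4, L1_sector_num 3, and L2_sector_gap 24. Under these assumptions, L1_sector_num * (sector_length + L1_sector_gap) + L2_sector_gap equals 48, the size of one whole image, so each data source starts exactly one image further on.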
It should be understood that the step numbers in the above embodiments do not imply an order of execution. The execution order of each process is determined by its function and internal logic, and the numbering does not limit the implementation of the embodiments of this application in any way.
An embodiment of this application also provides a DMA data transfer device, which is configured in a DMA controller. For aspects of the device not described in detail here, refer to the description of the foregoing method embodiments.
FIG. 12 is a schematic structural diagram of a DMA data transfer device provided by an embodiment of this application. As shown in FIG. 12, the DMA data transfer device includes a first acquisition module 1210 and a data reading module 1220.
The first acquisition module 1210 is configured to obtain the read instruction set.
The data reading module 1220 is configured to move, according to the configuration information in the read instruction set, the data in the discrete address ranges corresponding to the discrete address information into the continuous address range corresponding to the continuous address information.
FIG. 13 is a schematic structural diagram of a DMA data transfer device provided by another embodiment of this application. As shown in FIG. 13, the DMA data transfer device includes a second acquisition module 1310 and a data writing module 1320.
The second acquisition module 1310 is configured to obtain the write instruction set.
The data writing module 1320 is configured to move, according to the configuration information in the write instruction set, the data in the continuous address range corresponding to the continuous address information into the discrete address ranges corresponding to each piece of discrete address information.
An embodiment of this application also provides a DMA controller. The DMA controller includes the DMA data transfer device of the embodiment shown in FIG. 12 and/or the DMA data transfer device of the embodiment shown in FIG. 13.
An embodiment of this application also provides a coprocessor. The coprocessor either integrates the foregoing DMA controller or is coupled to it.
An embodiment of this application also provides an electronic device. As shown in FIG. 14, the electronic device may include one or more coprocessors 1400 (only one is shown in FIG. 14), a memory 1410, and a computer program 1420 that is stored in the memory 1410 and can run on the one or more coprocessors 1400, for example a DMA data transfer program. When executing the computer program 1420, the one or more coprocessors 1400 may implement the steps of the DMA data transfer method of the embodiment shown in FIG. 1 and/or of the embodiment shown in FIG. 7. Alternatively, when executing the computer program 1420, the one or more coprocessors 1400 may implement the functions of the modules/units of the DMA data transfer device of the embodiment shown in FIG. 12 and/or of the embodiment shown in FIG. 13. This is not limited here.
By way of example, the computer program 1420 may be divided into one or more modules/units, which are stored in the memory 1410 and executed by the coprocessor 1400 to carry out this application. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions; the instruction segments describe the execution of the computer program 1420 in the processing unit.
For example, the computer program 1420 may be divided into the following modules, whose functions are as follows:
a first acquisition module, configured to obtain the read instruction set;
a data reading module, configured to move, according to the configuration information in the read instruction set, the data in the discrete address ranges corresponding to the discrete address information into the continuous address range corresponding to the continuous address information;
and/or,
a second acquisition module, configured to obtain the write instruction set;
a data writing module, configured to move, according to the configuration information in the write instruction set, the data in the continuous address range corresponding to the continuous address information into the discrete address ranges corresponding to each piece of discrete address information.
Those skilled in the art will understand that FIG. 14 is only an example of an electronic device and does not limit the electronic device. The electronic device may include more or fewer components than shown, may combine certain components, or may use different components. For example, the electronic device may also include input/output devices, network access devices, buses, and so on.
In one embodiment, the coprocessor 1400 may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. The general-purpose processor may be a microprocessor or any conventional processor.
In one embodiment, the memory 1410 may be an internal storage unit of the electronic device, such as its hard disk or main memory. The memory 1410 may also be an external storage device of the electronic device, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card fitted to the electronic device. Further, the memory 1410 may include both an internal storage unit and an external storage device of the electronic device. The memory 1410 is used to store the computer program and other programs and data required by the electronic device. The memory 1410 may also be used to temporarily store data that has been output or is about to be output.
An embodiment of this application also provides another preferred embodiment of the electronic device. In this embodiment, the electronic device includes one or more coprocessors, which are used to execute the following program modules stored in the memory:
a first acquisition module, configured to obtain the read instruction set;
a data reading module, configured to move, according to the configuration information in the read instruction set, the data in the discrete address ranges corresponding to the discrete address information into the continuous address range corresponding to the continuous address information;
and/or,
a second acquisition module, configured to obtain the write instruction set;
a data writing module, configured to move, according to the configuration information in the write instruction set, the data in the continuous address range corresponding to the continuous address information into the discrete address ranges corresponding to each piece of discrete address information.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the above division into functional units and modules is only an example. In practice, the above functions may be assigned to different functional units and modules as needed; that is, the internal structure of the device may be divided into different functional units or modules to perform all or part of the functions described above. The functional units and modules in the embodiments may be integrated into one processing unit, may each exist physically on their own, or two or more units may be integrated into one unit. The integrated unit may be implemented in hardware or as a software functional unit. In addition, the specific names of the functional units and modules serve only to distinguish them from one another and do not limit the scope of protection of this application. For the specific working process of the units and modules in the above system, refer to the corresponding process in the foregoing method embodiments, which is not repeated here.
An embodiment of this application also provides a computer-readable storage medium storing a computer program. When executed by a processor, the computer program can implement the DMA data transfer method of the embodiment shown in FIG. 1 and/or the DMA data transfer method of the embodiment shown in FIG. 7.
An embodiment of this application provides a computer program product. When the computer program product runs on an electronic device, it enables the electronic device to implement the DMA data transfer method of the embodiment shown in FIG. 1 and/or the DMA data transfer method of the embodiment shown in FIG. 7.
In the above embodiments, the description of each embodiment has its own emphasis. For parts not detailed or recorded in one embodiment, refer to the related descriptions of the other embodiments.
Depending on design requirements, the first acquisition module, data reading module, second acquisition module, data writing module, and so on may be implemented in hardware, firmware, software (that is, a computer program), or a combination of several of these. In hardware form, these modules may be logic circuits implemented on an integrated circuit. Their functions may be implemented as hardware using a hardware description language (for example, Verilog HDL or VHDL) or another suitable programming language. For example, their functions may be implemented in the logic blocks, modules, and circuits of one or more controllers, microcontrollers, microprocessors, application-specific integrated circuits (ASICs), digital signal processors (DSPs), field-programmable gate arrays (FPGAs), and/or other processing units. In software and/or firmware form, the functions of the first acquisition module, data reading module, second acquisition module, data writing module, and so on may be implemented as a computer program. The computer program includes computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer program code may be recorded or stored on a computer-readable medium. The computer-readable medium may include any entity or device capable of carrying computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, computer memory, read-only memory (ROM), RAM, an electrical carrier signal, a telecommunications signal, a software distribution medium, and so on. A central processing unit (CPU), controller, microcontroller, or microprocessor may read the computer program code from the computer-readable medium and execute it, thereby implementing the functions of the first acquisition module, data reading module, second acquisition module, data writing module, and so on. Note that what a computer-readable medium may contain can be expanded or restricted as required by legislation and patent practice in a given jurisdiction. For example, in some jurisdictions, according to legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunications signals.
The above embodiments are intended only to illustrate the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions recorded in those embodiments may still be modified, or some of their technical features replaced with equivalents. Such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of this application, and shall all fall within the scope of protection of this application.

Claims (14)

  1. A DMA data transfer method, applied to a DMA controller, characterized in that the DMA data transfer method comprises:
    obtaining a read instruction set, the read instruction set including first configuration information, the first configuration information including first discrete address information and first continuous address information;
    according to the first configuration information in the read instruction set, moving the data in the discrete address ranges corresponding to the first discrete address information into the continuous address range corresponding to the first continuous address information.
  2. The DMA data transfer method of claim 1, characterized in that the first configuration information further includes indication information indicating whether to perform a locking function.
  3. The DMA data transfer method of claim 1, characterized in that the first configuration information further includes the start and end positions for performing the locking function,
    the DMA data transfer method further comprising: when it is determined that the indication information in the first configuration information indicating whether to perform the locking function specifies that the locking function is to be performed, locking the address space between the start and end positions for performing the locking function.
  4. The DMA data transfer method of claim 3, characterized in that locking the address space between the start and end positions for performing the locking function comprises: stopping the movement of data into the address space between the start and end positions for performing the locking function until the address space is unlocked.
  5. The DMA data transfer method of any one of claims 1 to 4, characterized in that the first discrete address information includes the starting offset address of the source memory, the address space size of a single discrete address, the address distance between two adjacent discrete addresses of the same data source, the number of discrete addresses of the same data source, the address distance between two adjacent discrete addresses of different data sources, and the number of data sources;
    the first continuous address information includes the starting offset address information of the destination memory.
  6. The DMA data transfer method of claim 5, characterized in that it is applied to image processing and can move multiple sub-images corresponding to multiple images from discrete address spaces into one continuous address range; wherein the address space size of a single discrete address is the image width of the sub-image; the address distance between two adjacent discrete addresses of the same data source is the distance from the end of the first row to the start of the second row in two adjacent rows of the sub-image; the number of discrete addresses of the same data source is the image height of the sub-image; the address distance between two adjacent discrete addresses of different data sources is the address distance between the last pixel of the earlier of two adjacent sub-images and the first pixel of the later sub-image; and the number of data sources is the number of sub-images.
  7. The DMA data transfer method of any one of claims 1 to 4, characterized in that it further comprises:
    obtaining a write instruction set, the write instruction set including second configuration information, the second configuration information including second discrete address information and second continuous address information;
    according to the second configuration information in the write instruction set, moving the data in the continuous address range corresponding to the second continuous address information into the discrete address ranges corresponding to each piece of the second discrete address information.
  8. A DMA data transfer method, applied to a DMA controller, characterized in that the DMA data transfer method comprises:
    obtaining a write instruction set, the write instruction set including second configuration information, the second configuration information including second discrete address information and second continuous address information;
    according to the second configuration information in the write instruction set, moving the data in the continuous address range corresponding to the second continuous address information into the discrete address ranges corresponding to each piece of the second discrete address information.
  9. The DMA data transfer method of claim 8, characterized in that it is applied to image processing and can move multiple sub-images from a continuous address range into individual discrete address spaces.
  10. A DMA data transfer device, configured in a DMA controller, characterized in that the DMA data transfer device comprises:
    a first acquisition module, configured to obtain a read instruction set, the read instruction set including first configuration information, the first configuration information including first discrete address information and first continuous address information;
    a data reading module, configured to move, according to the first configuration information in the read instruction set, the data in the discrete address ranges corresponding to the first discrete address information into the continuous address range corresponding to the first continuous address information.
  11. A DMA data transfer device, configured in a DMA controller, characterized in that the DMA data transfer device comprises:
    a second acquisition module, configured to obtain a write instruction set, the write instruction set including second configuration information, the second configuration information including second discrete address information and second continuous address information;
    a data writing module, configured to move, according to the second configuration information in the write instruction set, the data in the continuous address range corresponding to the second continuous address information into the discrete address ranges corresponding to each piece of the second discrete address information.
  12. A coprocessor, characterized in that the coprocessor is coupled to a DMA controller, or the DMA controller is integrated in the coprocessor, the DMA controller including the DMA data transfer device of claim 10 and/or the DMA data transfer device of claim 11.
  13. An electronic device, characterized in that it comprises a memory, a coprocessor, and a computer program stored in the memory and executable on the coprocessor, wherein the coprocessor, when executing the computer program, implements the DMA data transfer method of any one of claims 1 to 7 and/or the DMA data transfer method of any one of claims 8 to 9.
  14. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a coprocessor, implements the DMA data transfer method of any one of claims 1 to 7 and/or the DMA data transfer method of any one of claims 8 to 9.
PCT/CN2022/100634 2022-03-21 2022-06-23 Dma data transfer method and device WO2023178859A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210276686.1 2022-03-21
CN202210276686.1A CN114756490A (en) 2022-03-21 2022-03-21 DMA data carrying method and device

Publications (1)

Publication Number Publication Date
WO2023178859A1 true WO2023178859A1 (en) 2023-09-28

Family

ID=82327525

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/100634 WO2023178859A1 (en) 2022-03-21 2022-06-23 Dma data transfer method and device

Country Status (2)

Country Link
CN (1) CN114756490A (en)
WO (1) WO2023178859A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115660941B (en) * 2022-12-27 2023-03-14 北京象帝先计算技术有限公司 Image moving method and device, electronic equipment and computer readable storage medium
CN116166583B (en) * 2023-04-26 2023-07-11 太初(无锡)电子科技有限公司 Data precision conversion method and device, DMA controller and medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2006126938A (en) * 2004-10-26 2006-05-18 Canon Inc Data transfer system and its data transfer method
CN101876956A (en) * 2009-12-15 2010-11-03 北京中星微电子有限公司 File access method and device of SD (Secure Digital) card
CN111190842A (en) * 2019-12-30 2020-05-22 Oppo广东移动通信有限公司 Direct memory access, processor, electronic device, and data transfer method
CN111615692A (en) * 2019-05-23 2020-09-01 深圳市大疆创新科技有限公司 Data transfer method, calculation processing device, and storage medium
CN112835828A (en) * 2019-11-25 2021-05-25 美光科技公司 Direct Memory Access (DMA) commands for non-sequential source and destination memory addresses

Also Published As

Publication number Publication date
CN114756490A (en) 2022-07-15

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 22932911

Country of ref document: EP

Kind code of ref document: A1