WO2018223302A1 - Data reorganization method and apparatus - Google Patents

Data reorganization method and apparatus

Info

Publication number
WO2018223302A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
reassembled
dma
read
address
Prior art date
Application number
PCT/CN2017/087393
Other languages
English (en)
French (fr)
Inventor
贝琰
刘帅
许太火
沈伟
郑明恩
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to PCT/CN2017/087393
Publication of WO2018223302A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00: Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14: Handling requests for interconnection or transfer
    • G06F13/20: Handling requests for interconnection or transfer for access to input/output bus
    • G06F13/28: Handling requests for interconnection or transfer for access to input/output bus using burst mode transfer, e.g. direct memory access DMA, cycle steal

Definitions

  • The embodiments of the present application relate to computer technologies, and in particular to a data reorganization method and apparatus.
  • Figure 1a shows the data format expected by the processor, and Figure 1b shows the data format in the memory. In Figures 1a and 1b, each square represents one piece of data of 32 bits, and squares with the same pattern represent data with the same attribute.
  • The processor expects data with the same attribute to be stored together, so that such data can be processed in the same way and the processor can process data more efficiently. The data actually stored in the memory, however, is fragmented. Therefore, the terminal device usually needs to reorganize the data format in the memory.
  • FIG. 2 is a schematic diagram of data reorganization in the prior art.
  • The usual method is to use direct memory access (DMA) to read data sequentially from the memory according to the original data format in the memory and to write it into the target memory, so that the data format in the target memory is the data format desired by the processor.
  • However, although the bus is wide (256 bits or more), each DMA operation moves only 32 bits of contiguous data. This makes DMA data movement inefficient, which in turn makes data reorganization inefficient.
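  • The inefficiency described above can be put into numbers. The following minimal sketch only restates the arithmetic from the text (32 useful bits on a 256-bit bus); the function name `bus_utilization` is a hypothetical helper, not something defined in the application.

```python
def bus_utilization(useful_bits: int, bus_width_bits: int) -> float:
    """Fraction of one bus transfer that carries useful data."""
    if not 0 < useful_bits <= bus_width_bits:
        raise ValueError("useful_bits must be in the range 1..bus_width_bits")
    return useful_bits / bus_width_bits


# Prior-art case from the text: 32 bits of contiguous data on a 256-bit bus.
print(bus_utilization(32, 256))   # 0.125, i.e. 12.5% of the bus is used
# Reading a full bus width per operation uses the whole bus.
print(bus_utilization(256, 256))  # 1.0
```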
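  • The embodiments of the present application provide a data reorganization method and apparatus to improve the efficiency of data reorganization.
  • In a first aspect, an embodiment of the present application provides a data reorganization method applied to a DMA, where the method includes:
  • acquiring a source address, a target address, and a target number of data to be reassembled;
  • reading the target number of data to be reassembled from the storage medium corresponding to the source address according to an original data format, and storing the data in the DMA to obtain recombined data, where the original data format is the storage format of the data to be reassembled in the storage medium corresponding to the source address; and
  • writing the recombined data in the DMA into a storage medium corresponding to the target address according to a preset data format.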
  • In this solution, the DMA reads the target number of data to be reassembled from the storage medium corresponding to the source address, following the storage format of that data in the storage medium, and stores the data it has read in the DMA, for example in the internal cache of the DMA.
  • Because the DMA reads the data to be reassembled into the DMA in the original data format and at the bus bit width, and then writes the recombined data stored in the DMA into the storage medium corresponding to the target address according to the preset data format, it avoids the prior-art practice of reorganizing the data format by reading one piece of data at a time. This improves both the utilization of bandwidth resources and the efficiency of data format reorganization.
  • In a possible implementation, reading the target number of data to be reassembled from the storage medium corresponding to the source address according to the original data format and storing the data in the DMA includes:
  • Step A: reading a pieces of data to be reassembled from the storage medium corresponding to the source address according to the original data format, and storing the a pieces read in a cache unit of the DMA, where a is a positive integer greater than or equal to 2 and less than or equal to the target number;
  • Step B: repeating Step A until the number of data to be reassembled that has been read equals the target number.
  • In this solution, when the DMA reads the target number of data to be reassembled from the storage medium corresponding to the source address, it can size each read according to the bus bit width. For example, it reads a pieces of data to be reassembled from the storage medium corresponding to the source address according to the original data format and stores them in the cache unit of the DMA. After the storage is complete, it reads another a pieces and stores them. Reading and storing are repeated until the number of data to be reassembled that has been read equals the target number, which yields the recombined data stored in the DMA.
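  • Step A and Step B can be sketched as a simple loop. This is a software model under stated assumptions, not the hardware described in the application: the source storage medium is modeled as a Python list, the DMA cache unit as another list, and the names `read_in_bursts`, `source`, and `passes` are hypothetical. The shorter final pass when the target number is not a multiple of a is also an assumption.

```python
def read_in_bursts(source, target_number, a):
    """Model Step A / Step B: read `a` items per pass until `target_number` items are cached."""
    if not 2 <= a <= target_number:
        raise ValueError("a must satisfy 2 <= a <= target_number")
    if len(source) < target_number:
        raise ValueError("source holds fewer items than target_number")
    cache = []   # stands in for the DMA cache unit
    passes = 0
    while len(cache) < target_number:              # Step B: repeat Step A
        start = len(cache)
        count = min(a, target_number - start)      # assumption: last pass may be shorter
        cache.extend(source[start:start + count])  # Step A: read a items, then store them
        passes += 1
    return cache, passes


cache, passes = read_in_bursts(list(range(100)), target_number=64, a=8)
print(len(cache), passes)  # 64 8
```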
  • In a possible implementation, the cache unit of the DMA is a D flip-flop array;
  • storing the a pieces of data to be reassembled in the cache unit of the DMA includes:
  • storing the a pieces of data to be reassembled in the D flip-flop array by row, where the storage address of the first of the a pieces stored in the p-th row of the D flip-flop array is (p, q), and the storage address of the first of the a pieces stored in the (p+1)-th row of the D flip-flop array is (p+1, q), where p and q are positive integers;
  • writing the recombined data in the DMA into the storage medium corresponding to the target address according to the preset data format includes:
  • reading the recombined data stored in the D flip-flop array by column, and writing the recombined data read by column into the storage medium corresponding to the target address by row.
  • In this solution, multiple data format reorganization buffers are added to the DMA, with a D flip-flop array as the basic storage unit. Each time, the DMA reads one or more bus widths of data to be reassembled from the storage medium corresponding to the source address according to the original data format and stores it in the D flip-flop array by row. Once an entire data format reorganization buffer has been filled, the D flip-flop array is processed in a manner similar to a matrix transpose: the input direction writes to the internal buffer in units of "rows", and the output direction reads from the internal buffer in units of "columns". The DMA then starts output-direction processing and writes data to the storage medium corresponding to the target address in the data format desired by the processor. Each write carries one or more bus widths of data, and the write operation is repeated until the DMA has output all the data in the data format reorganization buffer.
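  • The row-write, column-read behaviour can be illustrated with a small software model. The sketch below assumes the buffer is a list of rows and that each row holds one read of `width` items; the function name `reorganize_rows_to_columns` and the sample attribute labels are hypothetical and only stand in for the D flip-flop array and its contents.

```python
def reorganize_rows_to_columns(data, width):
    """Write `data` into a buffer row by row, then read it out column by column."""
    if width <= 0 or len(data) % width != 0:
        raise ValueError("len(data) must be a positive multiple of width")
    # Input direction: each read of `width` items fills one row of the buffer.
    rows = [data[i:i + width] for i in range(0, len(data), width)]
    # Output direction: walk the buffer one column at a time.
    out = []
    for c in range(width):
        out.extend(row[c] for row in rows)
    return out


# Interleaved attributes x, y, z in the source become grouped by attribute.
fragmented = ["x0", "y0", "z0", "x1", "y1", "z1"]
print(reorganize_rows_to_columns(fragmented, width=3))
# ['x0', 'x1', 'y0', 'y1', 'z0', 'z1']
```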
  • In a possible implementation, the cache unit of the DMA is a D flip-flop array;
  • storing the a pieces of data to be reassembled in the cache unit of the DMA includes:
  • storing the a pieces of data to be reassembled in the D flip-flop array by column, where the storage address of the first of the a pieces stored in the k-th column of the D flip-flop array is (g, k), and the storage address of the first of the a pieces stored in the (k+1)-th column of the D flip-flop array is (g, k+1), where g and k are positive integers;
  • writing the recombined data in the DMA into the storage medium corresponding to the target address according to the preset data format includes:
  • reading the recombined data stored in the D flip-flop array by row, and writing the recombined data read by row into the storage medium corresponding to the target address by column.
  • In this solution, multiple data format reorganization buffers are added to the DMA, with a D flip-flop array as the basic storage unit. Each time, the DMA reads one or more bus widths of data to be reassembled from the storage medium corresponding to the source address according to the original data format and stores it in the D flip-flop array by column.
  • The D flip-flop array is then processed in a manner similar to a matrix transpose: the input direction writes to the internal buffer in units of "columns", and the output direction reads from the internal buffer in units of "rows". The DMA starts output-direction processing and writes data to the storage medium in the data format desired by the processor.
  • Each write carries one or more bus widths of data, and the write operation is repeated until the DMA has output all the data in the data format reorganization buffer.
  • In a possible implementation, the cache unit of the DMA is a static random access memory (SRAM);
  • storing the a pieces of data to be reassembled in the cache unit of the DMA includes:
  • storing the a pieces of data to be reassembled in the SRAM by row, where the storage address of the first of the a pieces stored in the m-th row of the SRAM is (m, n), and the storage address of the first of the a pieces stored in the (m+1)-th row of the SRAM is (m+b, n+b), where m, n, and b are positive integers;
  • writing the recombined data in the DMA into the storage medium corresponding to the target address according to the preset data format includes:
  • reading a pieces of recombined data from the SRAM, one every b columns, and writing the recombined data read into the storage medium corresponding to the target address by row or by column, where the pieces read from the individual columns have different row addresses in the SRAM and the a pieces read have the same attribute.
  • In this solution, the DMA reads one or more bus widths of data to be reassembled from the storage medium corresponding to the source address each time according to the original data format, and writes data to the storage medium corresponding to the target address in the data format desired by the processor.
  • In a possible implementation, the cache unit of the DMA is a static random access memory (SRAM);
  • storing the a pieces of data to be reassembled in the cache unit of the DMA includes:
  • storing the a pieces of data to be reassembled in one row of the SRAM, where the storage address of the first of the a pieces read the i-th time is (i, j), and the storage address of the first of the a pieces read the (i+1)-th time is (i, j+a);
  • writing the recombined data in the DMA into the storage medium corresponding to the target address according to the preset data format includes:
  • In this solution, multiple data format reorganization buffers are added to the DMA, with a large-bit-width SRAM as the basic storage unit. Each time, the DMA reads data to be reassembled from the storage medium corresponding to the source address according to the original data format.
  • After the DMA has stored the data to be reassembled in the SRAM, it starts reading data from the internal buffer and starts output-direction processing.
  • The DMA writes data to the storage medium corresponding to the target address in the data format desired by the processor, and each write carries one or more bus widths of data.
  • In a second aspect, an embodiment of the present application provides a data reorganization apparatus, including:
  • an obtaining module, configured to acquire a source address, a target address, and a target number of data to be reassembled;
  • a reading module, configured to read the target number of data to be reassembled from the storage medium corresponding to the source address according to an original data format, and to store the data in the DMA to obtain recombined data, where the original data format is the storage format of the data to be reassembled in the storage medium corresponding to the source address; and
  • a writing module, configured to write the recombined data in the DMA into a storage medium corresponding to the target address according to a preset data format.
  • In a possible implementation, the reading module is specifically configured to perform the following steps:
  • Step A: reading a pieces of data to be reassembled from the storage medium corresponding to the source address according to the original data format, and storing the a pieces read in a cache unit of the DMA, where a is a positive integer greater than or equal to 2 and less than or equal to the target number;
  • Step B: repeating Step A until the number of data to be reassembled that has been read equals the target number.
  • In a possible implementation, the cache unit of the DMA is a D flip-flop array;
  • the reading module is specifically configured to:
  • store the a pieces of data to be reassembled in the D flip-flop array by row, where the storage address of the first of the a pieces stored in the p-th row of the D flip-flop array is (p, q), and the storage address of the first of the a pieces stored in the (p+1)-th row of the D flip-flop array is (p+1, q), where p and q are positive integers;
  • the writing module is specifically configured to:
  • read the recombined data stored in the D flip-flop array by column, and write the recombined data read by column into the storage medium corresponding to the target address by row.
  • In a possible implementation, the cache unit of the DMA is a D flip-flop array;
  • the reading module is specifically configured to:
  • store the a pieces of data to be reassembled in the D flip-flop array by column, where the storage address of the first of the a pieces stored in the k-th column of the D flip-flop array is (g, k), and the storage address of the first of the a pieces stored in the (k+1)-th column of the D flip-flop array is (g, k+1), where g and k are positive integers;
  • the writing module is specifically configured to:
  • read the recombined data stored in the D flip-flop array by row, and write the recombined data read by row into the storage medium corresponding to the target address by column.
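  • In a possible implementation, the cache unit of the DMA is a static random access memory (SRAM);
  • the reading module is specifically configured to:
  • store the a pieces of data to be reassembled in the SRAM by row, where the storage address of the first of the a pieces stored in the m-th row of the SRAM is (m, n), and the storage address of the first of the a pieces stored in the (m+1)-th row of the SRAM is (m+b, n+b), where m, n, and b are positive integers;
  • the writing module is specifically configured to:
  • read a pieces of recombined data from the SRAM, one every b columns, and write the recombined data read into the storage medium corresponding to the target address by row or by column, where the pieces read from the individual columns have different row addresses in the SRAM and the a pieces read have the same attribute.
  • In a possible implementation, the cache unit of the DMA is a static random access memory (SRAM);
  • the reading module is specifically configured to:
  • store the a pieces of data to be reassembled in one row of the SRAM, where the storage address of the first of the a pieces read the i-th time is (i, j), and the storage address of the first of the a pieces read the (i+1)-th time is (i, j+a);
  • the writing module is specifically configured to: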
  • In a third aspect, an embodiment of the present application provides a DMA, including a memory and a processor;
  • the memory is configured to store program instructions;
  • the processor is configured to invoke the program instructions in the memory to perform the method of the first aspect and the various possible implementations of the first aspect.
  • A fourth aspect of the present application provides a data reorganization apparatus comprising at least one processing element (or chip) for performing the method of the first aspect.
  • A fifth aspect of the present application provides a program that, when executed by a processor, performs the method of the first aspect.
  • A sixth aspect of the present application provides a program product, such as a computer-readable storage medium, comprising the program of the fifth aspect.
  • A seventh aspect of the present application provides a computer-readable storage medium storing instructions that, when run on a computer, cause the computer to perform the method of the first aspect.
  • The embodiments of the present application provide a data reorganization method and apparatus. A source address, a target address, and a target number of data to be reassembled are acquired; the target number of data to be reassembled is read from the storage medium corresponding to the source address according to the original data format and stored in the DMA to obtain recombined data; and the recombined data in the DMA is written into the storage medium corresponding to the target address according to a preset data format.
  • Because the DMA reads the data to be reassembled into the DMA in the original data format and at the bus bit width, and then writes the recombined data stored in the DMA into the storage medium corresponding to the target address according to the preset data format, it avoids the prior-art practice of reorganizing the data format by reading one piece of data at a time. This improves both the utilization of bandwidth resources and the efficiency of data format reorganization.
  • Figure 1a shows the data format expected by the processor;
  • Figure 1b shows the data format in the memory;
  • FIG. 3 is a schematic flowchart of Embodiment 1 of a data reorganization method according to the present application;
  • FIG. 5 is another schematic diagram of data reorganization by the DMA;
  • FIG. 8 is a schematic structural diagram of Embodiment 1 of a data reorganization apparatus according to an embodiment of the present application;
  • FIG. 9 is a schematic structural diagram of a DMA embodiment according to an embodiment of the present application.
  • The data reorganization method according to the embodiments of the present application can be applied to a terminal device that reorganizes data formats by using DMA. It mainly addresses the following problem in data format reorganization: the bus is wide, but each DMA operation moves only a small amount of contiguous data, which wastes bus bandwidth resources and makes DMA data movement inefficient.
  • To achieve higher data parallelism, the processor requires data with the same attribute to be stored together so that it can process the data more efficiently. The data actually stored in the memory, however, is fragmented, so the terminal device needs to reorganize the data format in the memory.
  • When the data format in the memory is reorganized, the DMA reads data sequentially from the memory according to the original data format in the memory and writes it into the target memory, so that the data format in the target memory is the data format expected by the processor.
  • When the DMA reads data in this way, it reads only one data block from the memory at a time. For example, each operation moves only 32 bits of contiguous data, whereas the bus is usually wide, for example 256 bits or more. This wastes a large part of the bus width and makes DMA data movement inefficient.
  • The data reorganization method and apparatus provided by the embodiments of the present application are intended to solve the prior-art technical problems of heavily wasted bus bit width and inefficient DMA data movement during data format reorganization.
  • FIG. 3 is a schematic flowchart diagram of Embodiment 1 of a method for reorganizing data according to the present application.
  • This embodiment of the present application provides a data reorganization method. The method may be performed by any apparatus that performs a data reorganization method, and the apparatus may be implemented by software and/or hardware.
  • In this embodiment, the apparatus can be integrated in the DMA.
  • The method in this embodiment may include:
  • Step 301: Acquire a source address, a target address, and a target number of data to be reassembled.
  • When the processor processes data, the format of the data usually needs to be reorganized to achieve a higher degree of data parallelism.
  • The processor may send to the DMA the source address at which the data to be reassembled is stored, the target address at which the data is to be stored, and the target number of data to be reassembled.
  • Step 302: Read the target number of data to be reassembled from the storage medium corresponding to the source address according to the original data format, and store the data in the DMA to obtain recombined data, where the original data format is the storage format of the data to be reassembled in the storage medium corresponding to the source address.
  • FIG. 4 is a schematic diagram of data reorganization by the DMA.
  • The DMA reads the target number of data to be reassembled from the storage medium corresponding to the source address, following the storage format of that data in the storage medium, and stores the data it has read in the DMA, for example in the internal cache of the DMA.
  • The storage medium includes any carrier capable of storing data, and may include, for example, a hard disk, an optical disc, or a memory.
  • Reading the target number of data to be reassembled from the storage medium corresponding to the source address according to the original data format and storing the data in the DMA may specifically include the following steps:
  • Step A: According to the original data format, read a pieces of data to be reassembled from the storage medium corresponding to the source address, and store the a pieces read in the cache unit of the DMA, where a is a positive integer greater than or equal to 2 and less than or equal to the target number.
  • Step B: Repeat Step A until the number of data to be reassembled that has been read equals the target number.
  • When the DMA reads the target number of data to be reassembled from the storage medium corresponding to the source address, it may size each read according to the bus bit width. For example, it reads a pieces of data to be reassembled from the storage medium corresponding to the source address according to the original data format and stores them in the cache unit of the DMA. After the storage is complete, it reads another a pieces and stores them. Reading and storing are repeated until the number of data to be reassembled that has been read equals the target number, which yields the recombined data stored in the DMA.
  • For example, as shown in the figure, suppose the bus bit width is 256 bits and one piece of data is 32 bits. When the DMA reads the data to be reassembled from the storage medium corresponding to the source address, it can read eight pieces of data at a time. After the read completes, the eight pieces are stored in the cache unit of the DMA, and the DMA continues to read and store eight pieces at a time from the storage medium corresponding to the source address until 64 pieces have been read. Because the bus is 256 bits wide, the DMA can read 8 pieces at a time, which both reduces the waste of bus bandwidth and greatly improves the efficiency with which the DMA reads data.
  • The DMA can also read 7 or 6 pieces of data each time, as long as the number of bits read each time does not exceed the bus bit width.
  • Reading a full bus width of data each time maximizes the data reading efficiency.
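  • The effect of the per-read count a on the number of read operations can be checked with a few lines of arithmetic. The sketch below uses the figures from the text (256-bit bus, 32-bit data, 64 pieces in total); the helper names `max_items_per_read` and `reads_needed` are hypothetical, and rounding the read count up when 64 is not a multiple of a is an assumption.

```python
def max_items_per_read(bus_width_bits, item_bits):
    """Largest number of whole items that fit in one bus-width read."""
    return bus_width_bits // item_bits


def reads_needed(target_number, items_per_read):
    """Number of reads to fetch target_number items (rounded up)."""
    return (target_number + items_per_read - 1) // items_per_read


print(max_items_per_read(256, 32))  # 8
for a in (8, 7, 6, 1):
    print(a, reads_needed(64, a))
# 8 8
# 7 10
# 6 11
# 1 64
```

  • Under these assumptions, reading a full bus width (a = 8) needs 8 reads for 64 pieces, whereas reading one piece at a time, as in the prior art, needs 64.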
  • Step 303: Write the recombined data in the DMA into a storage medium corresponding to the target address according to a preset data format.
  • The preset data format is the storage format desired by the processor, for example storing data with the same attribute together. Data with the same attribute refers to data that solves the same problem, data that has the same function, or data that the processor processes in the same way.
  • After the DMA has read the data to be reassembled into the cache unit of the DMA, the recombined data stored in the cache unit is written into the storage medium corresponding to the target address according to the preset data format.
  • Note that the DMA may perform step 303 after step 302 has completed, or may perform step 303 while step 302 is still in progress.
  • In the former case, the DMA may first store all of the target number of data to be reassembled in the cache unit and then write the recombined data in the DMA into the storage medium corresponding to the target address. For example, if the target number is 64 and the cache unit of the DMA also holds 64 pieces of data, the DMA can store all 64 pieces of data to be reassembled and then write the recombined data in the DMA into the storage medium corresponding to the target address in multiple writes.
  • In the latter case, after storing some data to be reassembled in the cache unit, the DMA may write the recombined data already stored in the cache unit to the target address in storage order while it is storing other data to be reassembled.
  • For example, after storing 64 pieces of data to be reassembled in the cache unit, the DMA may start writing the recombined data in the cache unit to the target address while reading new data to be reassembled from the storage medium corresponding to the source address. Therefore, this embodiment does not limit the order in which steps 302 and 303 are performed.
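  • One way to picture step 303 running alongside step 302 is to process the data one buffer at a time and hand each reorganized buffer to the output before the next one is read. The generator below is a sequential software model only; real hardware would overlap the two directions in time. The names `reorganize_stream`, `width`, and `rows_per_buffer` are hypothetical, and the row-write, column-read layout inside each buffer is just one of the variants the text describes.

```python
def reorganize_stream(source, target_number, width, rows_per_buffer):
    """Yield one reorganized buffer at a time so output can start before all input is read."""
    capacity = width * rows_per_buffer
    if target_number % capacity != 0 or len(source) < target_number:
        raise ValueError("target_number must be a multiple of the buffer capacity "
                         "and must not exceed len(source)")
    for start in range(0, target_number, capacity):
        block = source[start:start + capacity]                 # step 302 for this buffer
        rows = [block[i:i + width] for i in range(0, capacity, width)]
        yield [row[c] for c in range(width) for row in rows]   # step 303 for this buffer


for out_block in reorganize_stream(list(range(8)), target_number=8, width=2, rows_per_buffer=2):
    print(out_block)
# [0, 2, 1, 3]
# [4, 6, 5, 7]
```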
  • In the data reorganization method provided by this embodiment of the present application, a source address, a target address, and a target number of data to be reassembled are acquired; the target number of data to be reassembled is read from the storage medium corresponding to the source address according to the original data format and stored in the DMA to obtain recombined data; and the recombined data in the DMA is written into the storage medium corresponding to the target address according to a preset data format. Because the DMA reads the data to be reassembled into the DMA in the original data format and at the bus bit width, and then writes the recombined data stored in the DMA into the storage medium corresponding to the target address according to the preset data format, it avoids the prior-art practice of reorganizing the data format by reading one piece of data at a time. This improves both the utilization of bandwidth resources and the efficiency of data format reorganization.
  • The recombined data may be stored in the DMA in different formats, and the manner in which the DMA reads the recombined data from the cache unit differs accordingly. The following describes in detail the storage formats of the recombined data in the DMA and the corresponding ways of reading the recombined data.
  • The first type: if the cache unit of the DMA is a D flip-flop array, storing the read data to be reassembled in the cache unit of the DMA includes: storing the a pieces of data to be reassembled in the D flip-flop array by row, where the storage address of the first of the a pieces stored in the p-th row of the D flip-flop array is (p, q), and the storage address of the first of the a pieces stored in the (p+1)-th row of the D flip-flop array is (p+1, q), where p and q are positive integers.
  • Writing the recombined data in the DMA into the storage medium corresponding to the target address according to the preset data format includes: reading the recombined data stored in the D flip-flop array by column, and writing the recombined data read by column into the storage medium corresponding to the target address by row.
  • Each time, the DMA reads one or more bus widths of data to be reassembled from the storage medium corresponding to the source address according to the original data format and stores it in the D flip-flop array by row. For example, if a pieces of data to be reassembled are read and stored each time, the storage address of the first of the a pieces stored in the p-th row is (p, q), and the storage address of the first of the a pieces stored in the (p+1)-th row is (p+1, q).
  • For example, the DMA stores the 8 pieces of data to be reassembled that it reads the first time in the first row, where the storage address of the first of these 8 pieces is (1, 1); it stores the 8 pieces read the second time in the second row, where the storage address of the first of these 8 pieces is (2, 1); and so on, until the internal cache of the DMA is filled. The DMA then reads the recombined data stored in the D flip-flop array by column and writes the data read into the storage medium corresponding to the target address by row.
  • The second type: if the cache unit of the DMA is a D flip-flop array, storing the read data to be reassembled in the cache unit of the DMA includes: storing the a pieces of data to be reassembled in the D flip-flop array by column, where the storage address of the first of the a pieces stored in the k-th column of the D flip-flop array is (g, k), and the storage address of the first of the a pieces stored in the (k+1)-th column of the D flip-flop array is (g, k+1), where g and k are positive integers.
  • Writing the recombined data in the DMA into the storage medium corresponding to the target address according to the preset data format includes: reading the recombined data stored in the D flip-flop array by row, and writing the recombined data read by row into the storage medium corresponding to the target address by column.
  • That is, the input direction writes to the internal buffer in units of "columns", and the output direction reads from the internal cache in units of "rows".
  • Each time, the DMA reads one or more bus widths of data to be reassembled from the storage medium corresponding to the source address according to the original data format and stores it in the D flip-flop array by column. For example, if a pieces of data to be reassembled are read and stored each time, the storage address of the first of the a pieces stored in the k-th column is (g, k), and the storage address of the first of the a pieces stored in the (k+1)-th column is (g, k+1).
  • FIG. 5 is another schematic diagram of data reorganization by the DMA.
  • As shown in FIG. 5, the DMA stores the 8 pieces of data to be reassembled that it reads the first time in the first column, where the storage address of the first of these 8 pieces is (1, 1); it stores the 8 pieces read the second time in the second column, where the storage address of the first of these 8 pieces is (1, 2); and so on, until the internal cache of the DMA is filled.
  • The DMA then reads the recombined data stored in the D flip-flop array by row and writes the data read into the storage medium corresponding to the target address by column.
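  • The first and second types are mirror images of each other, and in a simple software model they produce the same output sequence. The sketch below assumes a square n x n buffer, 0-based indices (the text uses 1-based addresses such as (1, 1) and (1, 2)), and hypothetical function names; it is an illustration of the two access patterns, not the circuit itself.

```python
def write_rows_read_columns(data, n):
    """First type: store each read in a row, then read the buffer out by column."""
    buf = [[None] * n for _ in range(n)]
    for i, value in enumerate(data):
        buf[i // n][i % n] = value          # i-th item goes to row i//n, column i%n
    return [buf[r][c] for c in range(n) for r in range(n)]


def write_columns_read_rows(data, n):
    """Second type: store each read in a column, then read the buffer out by row."""
    buf = [[None] * n for _ in range(n)]
    for i, value in enumerate(data):
        buf[i % n][i // n] = value          # i-th item goes to row i%n, column i//n
    return [buf[r][c] for r in range(n) for c in range(n)]


data = list(range(64))                       # 8 reads of 8 items, as in the example
first = write_rows_read_columns(data, 8)
second = write_columns_read_rows(data, 8)
print(first[:8])        # [0, 8, 16, 24, 32, 40, 48, 56]
print(first == second)  # True
```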
  • the third type if the cache unit of the DMA is a static random access memory (SRAM), the read data to be reassembled is stored in the cache unit of the DMA, including: a to be read to be reassembled
  • the data is stored in the SRAM in rows, wherein the storage address of the first data to be recombined among the a to be reassembled data stored in the mth row in the SRAM is (m, n), and is stored in the m+th in the SRAM.
  • the storage address of the first data to be recombined in a row of data to be reassembled is (m+b, n+b), wherein m, n and b are positive integers.
  • the recombined data in the DMA is written into the storage medium corresponding to the target address according to the preset data format, including: reading a recombined data from the SRAM every other b columns, and reading the recombined data by row Write or write to the storage medium corresponding to the target address in a column, wherein the rewritten data read from each column has different row addresses stored in the SRAM, and the attributes of the read recombined data are the same.
  • Specifically, when the cache unit of the DMA is an SRAM, the DMA can read only one piece of data from each column when reading the recombined data stored in the SRAM, so the recombined data is stored in the SRAM differently from the way it is stored in the D flip-flop array.
  • In a specific implementation, when several data format reforming buffers are added to the DMA with the SRAM as the basic storage unit, the DMA reads one or more bus widths of data to be reassembled at a time from the storage medium corresponding to the source address according to the original data format and stores it by row in the SRAM. For example, assuming that a pieces of data to be reassembled have been read and stored, the storage address of the first of the a pieces stored in the m-th row is (m, n), and the storage address of the first of the a pieces stored in the (m+1)-th row is (m+b, n+b).
  • After the DMA has read data until a whole data format reforming buffer is full, it reads data from the internal cache, starts DMA output-direction processing, and writes data to the storage medium corresponding to the target address in the data format expected by the processor, one or more bus widths at a time. Specifically, the DMA reads one piece of recombined data from the SRAM every b columns, the recombined data read from each column has a different row address in the SRAM, and the a pieces of recombined data read have the same attribute.
  • The a pieces of recombined data having the same attribute may mean that they are data that solve the same problem, data that have the same function, or data that the processor processes in the same way, and so on.
  • the DMA repeats the above read operation until all the data in the whole data format reforming buffer has been output. It should be noted that the data input operation for the next block of data can be performed while the DMA is outputting data, so that the data is pipelined.
  • FIG. 6 is yet another schematic diagram of data reorganization performed by the DMA. As shown in FIG. 6, if b is 1, the DMA stores the 8 pieces of data to be reassembled read the first time in row 1, where the storage address of the first of these 8 pieces is (1, 1), and stores the 8 pieces of data to be reassembled read the second time in row 2, where the storage address of the first of these 8 pieces is (2, 2), and so on, until the internal cache of the DMA is full.
  • the DMA then reads one piece of recombined data from each column of the SRAM in turn and writes the 8 pieces of recombined data read by row or by column into the storage medium corresponding to the target address.
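The skewed layout described for FIG. 6 can be sketched in Python as follows. This is only an illustrative model, not the circuit itself: it assumes b = 1, a square buffer of a rows by a columns, 0-based indices, and wrap-around at the end of each shifted row (the text does not say how a shifted row wraps). The names `skewed_store` and `skewed_read` are hypothetical.

```python
def skewed_store(bursts):
    """Store burst r in row r, shifted right by r columns (b = 1).

    Wrap-around at the end of a row is an assumption of this sketch.
    """
    a = len(bursts)
    assert all(len(burst) == a for burst in bursts), "sketch assumes a square buffer"
    sram = [[None] * a for _ in range(a)]
    for r, burst in enumerate(bursts):
        for c, value in enumerate(burst):
            sram[r][(c + r) % a] = value
    return sram


def skewed_read(sram, c):
    """Gather the c-th item of every burst.

    Exactly one word is taken from each physical column, and every
    column is accessed at a different row address.
    """
    a = len(sram)
    out = [None] * a
    for col in range(a):
        row = (col - c) % a  # the row whose c-th item landed in this column
        out[row] = sram[row][col]
    return out


# 8 bursts of 8 items; item (r, c) is the c-th item of the r-th burst.
bursts = [[(r, c) for c in range(8)] for r in range(8)]
sram = skewed_store(bursts)
same_attribute = skewed_read(sram, 3)  # the items with c == 3, in burst order
```

Because the items sharing a position index end up in distinct columns, one read per column is enough to collect all of them, which is the constraint the text gives for an SRAM cache unit.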
  • Fourth type: if the cache unit of the DMA is an SRAM, storing the read a pieces of data to be reassembled in the cache unit of the DMA includes: storing the read a pieces of data to be reassembled by row in the SRAM, wherein the storage address of the first of the a pieces of data to be reassembled read the i-th time is (i, j), and the storage address of the first of the a pieces of data to be reassembled read the (i+1)-th time is (i, j+a).
  • Correspondingly, writing the recombined data in the DMA into the storage medium corresponding to the target address according to the preset data format includes: reading one piece of recombined data every a columns from the same row of the SRAM, and writing the a pieces of recombined data read by row or by column into the storage medium corresponding to the target address, wherein the a pieces of recombined data read have the same attribute.
  • Specifically, compared with the third type, the SRAM in this type may be a large-bit-width SRAM, in which case all of the target number of data to be reassembled that is read can be stored in the same row of the SRAM.
  • When several data format reforming buffers are added to the DMA with the large-bit-width SRAM as the basic storage unit, the DMA reads one or more bus widths of data to be reassembled at a time from the storage medium corresponding to the source address according to the original data format and stores it by row in the SRAM. For example, assuming that a pieces of data to be reassembled are read the i-th time, the first of these a pieces is stored at address (i, j), and when another a pieces are read the (i+1)-th time, the first of those a pieces is stored at address (i, j+a).
  • After the DMA stores the data to be reassembled in the SRAM, it starts reading data from the internal cache, starts DMA output-direction processing, and writes data to the storage medium corresponding to the target address in the data format expected by the processor, one or more bus widths at a time. Specifically, one piece of recombined data is read every a columns from the same row of the SRAM, and the a pieces of recombined data read are written by row or by column into the storage medium corresponding to the target address, wherein the a pieces of recombined data read have the same attribute.
  • The a pieces of recombined data having the same attribute may mean that they are data that solve the same problem, data that have the same function, or data that the processor processes in the same way, and so on.
  • the DMA repeats the above read operation until all the data in the whole data format reforming buffer has been output. It should be noted that the data input operation for the next block of data can be performed while the DMA is outputting data, so that the data is pipelined.
  • FIG. 7 is yet another schematic diagram of data reorganization performed by the DMA. As shown in FIG. 7, the DMA stores the 8 pieces of data to be reassembled read the first time in row 1, where the storage address of the first of these 8 pieces is (1, 1), and also stores the 8 pieces of data to be reassembled read the second time in row 1, where the storage address of the first of these 8 pieces is (1, 9), and so on, until the first row of the internal cache of the DMA is full.
  • the DMA then reads one piece of recombined data every 8 columns from the same row of the SRAM and writes the 8 pieces of recombined data read by row or by column into the storage medium corresponding to the target address.
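The single-wide-row layout of FIG. 7 amounts to a strided read, which can be sketched in Python as follows. The sketch uses 0-based indices, so the burst that the text places at (1, 9) starts at list index 8; `wide_row_store` and `wide_row_read` are hypothetical names introduced only for illustration.

```python
def wide_row_store(bursts):
    """Append every burst of a items to one wide row: burst i starts at column i * a."""
    row = []
    for burst in bursts:
        row.extend(burst)
    return row


def wide_row_read(row, a, c):
    """Read one item every a columns, starting at column c."""
    return row[c::a]


# 8 bursts of 8 items; item (i, c) is the c-th item of the i-th burst.
bursts = [[(i, c) for c in range(8)] for i in range(8)]
row = wide_row_store(bursts)
group = wide_row_read(row, 8, 2)  # the items with c == 2, one from each burst
```

Here the stride equals the burst length a, so each read picks up the same position from every burst, which is how items with the same attribute are gathered in this mode.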
  • Optionally, the cache unit of the DMA may also be assembled from multiple small-bit-width SRAMs. In this case, all of the target number of data to be reassembled that is read can likewise be stored in the same row of the SRAM, and when the recombined data in the DMA is written into the storage medium corresponding to the target address, one piece of recombined data can be read every a columns from the same row of the SRAM, and the a pieces of recombined data read can be written by row or by column into the storage medium corresponding to the target address.
  • It is also worth noting that the way in which the data to be reassembled is stored in the DMA is not limited to the above four types. The data to be reassembled may be stored in the DMA in other ways, as long as the DMA stores the data to be reassembled from the storage medium corresponding to the source address into the DMA according to the original data format to obtain the recombined data, and then writes the recombined data into the storage medium corresponding to the target address according to the preset data format. The specific way in which the data to be reassembled is stored in the DMA is not limited in this embodiment.
  • In the data reorganization method provided by the embodiments of the present application, the source address and the target address of the data to be reassembled and the target number of the data to be reassembled are acquired; the target number of data to be reassembled is read from the storage medium corresponding to the source address according to the original data format and stored in the DMA to obtain recombined data; and the recombined data in the DMA is written into the storage medium corresponding to the target address according to a preset data format.
  • Because the DMA reads the data to be reassembled into the DMA from the storage medium corresponding to the source address according to the original data format and the bus bit width, and then writes the recombined data stored in the DMA into the storage medium corresponding to the target address according to the preset data format, reorganizing the data format by reading one piece of data at a time, as in the prior art, is avoided. This improves both the utilization of bandwidth resources and the efficiency of data format reorganization.
  • In addition, depending on the cache unit of the DMA, the data to be reassembled can be stored in the DMA in different formats, which improves the flexibility with which the data to be reassembled is handled.
  • FIG. 8 is a schematic structural diagram of Embodiment 1 of a data reorganization apparatus according to an embodiment of the present disclosure.
  • the reorganization device may be a stand-alone DMA, or may be a device integrated in the DMA, and the device may be implemented by software, hardware or a combination of software and hardware. As shown in FIG. 8, the reorganization device includes:
  • the obtaining module 11 is configured to obtain a source address and a target address of the data to be reassembled and a target number of the data to be reassembled;
  • the reading module 12 is configured to read the target number of data to be reassembled according to the original data format from the storage medium corresponding to the source address, and store the data in the DMA to obtain recombined data;
  • the original data format is a storage format of the data to be reassembled in a storage medium corresponding to the source address;
  • the writing module 13 is configured to write the recombined data in the DMA into a storage medium corresponding to the target address according to a preset data format.
  • the apparatus for reorganizing the data provided by the embodiment of the present invention may perform the foregoing method embodiments, and the implementation principles and technical effects thereof are similar, and details are not described herein again.
  • the reading module 12 is specifically configured to perform:
  • Step A: read a pieces of data to be reassembled from the storage medium corresponding to the source address according to the original data format, and store the read a pieces of data to be reassembled in a cache unit of the DMA, where a is a positive integer greater than or equal to 2 and less than or equal to the target number;
  • Step B: repeat Step A until the number of pieces of data to be reassembled that have been read equals the target number.
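Steps A and B describe a simple read loop, which can be sketched in Python as below. The function name `read_in_bursts` and the use of a list as the source storage medium are assumptions made for illustration; allowing a shorter final burst when the target number is not a multiple of a is also an assumption, since the text does not cover that case.

```python
def read_in_bursts(source, target_count, a):
    """Repeat Step A (read a items, store them) until target_count items are cached."""
    if not 2 <= a <= target_count:
        raise ValueError("a must satisfy 2 <= a <= target_count")
    if len(source) < target_count:
        raise ValueError("source holds fewer items than target_count")
    cache = []
    read = 0
    while read < target_count:                # Step B: repeat until the target number is reached
        burst = source[read:read + a]         # Step A: read up to a items ...
        burst = burst[:target_count - read]   # (a shorter last burst is an assumption)
        cache.append(burst)                   # ... and store them in the cache unit
        read += len(burst)
    return cache


cache = read_in_bursts(list(range(100)), target_count=64, a=8)
```

With a target number of 64 and a = 8, the loop runs eight times, matching the 256-bit bus and 32-bit data example used for the method embodiment.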
  • if the cache unit of the DMA is a D flip-flop array,
  • the reading module 12 is specifically configured to:
  • store the read a pieces of data to be reassembled by row in the D flip-flop array, wherein the storage address of the first of the a pieces of data to be reassembled stored in the p-th row of the D flip-flop array is (p, q), and the storage address of the first of the a pieces of data to be reassembled stored in the (p+1)-th row of the D flip-flop array is (p+1, q), where p and q are positive integers;
  • the writing module 13 is specifically configured to:
  • read by column the recombined data stored in the D flip-flop array, and write the recombined data read by column into the storage medium corresponding to the target address by row.
  • if the cache unit of the DMA is a D flip-flop array,
  • the reading module 12 is specifically configured to:
  • store the read a pieces of data to be reassembled by column in the D flip-flop array, wherein the storage address of the first of the a pieces of data to be reassembled stored in the k-th column of the D flip-flop array is (g, k), and the storage address of the first of the a pieces of data to be reassembled stored in the (k+1)-th column of the D flip-flop array is (g, k+1), where g and k are positive integers;
  • the writing module 13 is specifically configured to:
  • read by row the recombined data stored in the D flip-flop array, and write the recombined data read by row into the storage medium corresponding to the target address by column.
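  • if the cache unit of the DMA is a static random access memory (SRAM),
  • the reading module 12 is specifically configured to:
  • store the read a pieces of data to be reassembled by row in the SRAM, wherein the storage address of the first of the a pieces of data to be reassembled stored in the m-th row of the SRAM is (m, n), and the storage address of the first of the a pieces of data to be reassembled stored in the (m+1)-th row of the SRAM is (m+b, n+b), where m, n and b are all positive integers;
  • the writing module 13 is specifically configured to:
  • read one piece of recombined data from the SRAM every b columns, and write the a pieces of recombined data read by row or by column into the storage medium corresponding to the target address, wherein the recombined data read from each column has a different row address in the SRAM, and the a pieces of recombined data read have the same attribute.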
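  • if the cache unit of the DMA is a static random access memory (SRAM),
  • the reading module 12 is specifically configured to:
  • store the read a pieces of data to be reassembled by row in the SRAM, wherein the storage address of the first of the a pieces of data to be reassembled read the i-th time is (i, j), and the storage address of the first of the a pieces of data to be reassembled read the (i+1)-th time is (i, j+a);
  • the writing module 13 is specifically configured to:
  • read one piece of recombined data every a columns from the same row of the SRAM, and write the a pieces of recombined data read by row or by column into the storage medium corresponding to the target address, wherein the a pieces of recombined data read have the same attribute.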
  • the apparatus for reorganizing the data provided by the embodiment of the present invention may perform the foregoing method embodiments, and the implementation principles and technical effects thereof are similar, and details are not described herein again.
  • FIG. 9 is a schematic structural diagram of a DMA embodiment according to an embodiment of the present application.
  • the DMA may include a processor 21 and a memory 22.
  • the memory 22 may include a high-speed RAM and may also include a non-volatile memory (NVM), such as at least one disk memory. Various programs may be stored in the memory 22 for performing various processing functions and implementing the method steps of the present embodiments.
  • the processor 21 is configured to acquire a source address, a target address, and a target number of the data to be reassembled;
  • the processor 21 is further configured to read the target number of data to be reassembled according to the original data format from the storage medium corresponding to the source address, and store the data in the DMA to obtain recombined data;
  • the original data format is the storage format of the data to be reassembled in the storage medium corresponding to the source address;
  • the processor 21 is further configured to write the recombined data in the DMA into a storage medium corresponding to the target address according to a preset data format.
  • the processor 21 is specifically configured to:
  • Step A: read a pieces of data to be reassembled from the storage medium corresponding to the source address according to the original data format, and store the read a pieces of data to be reassembled in a cache unit of the DMA, where a is a positive integer greater than or equal to 2 and less than or equal to the target number;
  • Step B: repeat Step A until the number of pieces of data to be reassembled that have been read equals the target number.
  • if the cache unit of the DMA is a D flip-flop array,
  • the processor 21 is further configured to store the read a pieces of data to be reassembled by row in the D flip-flop array, wherein the storage address of the first of the a pieces of data to be reassembled stored in the p-th row of the D flip-flop array is (p, q), and the storage address of the first of the a pieces of data to be reassembled stored in the (p+1)-th row of the D flip-flop array is (p+1, q), where p and q are positive integers;
  • the processor 21 is further configured to read by column the recombined data stored in the D flip-flop array, and write the recombined data read by column into the storage medium corresponding to the target address by row.
  • if the cache unit of the DMA is a D flip-flop array,
  • the processor 21 is further configured to store the read a pieces of data to be reassembled by column in the D flip-flop array, wherein the storage address of the first of the a pieces of data to be reassembled stored in the k-th column of the D flip-flop array is (g, k), and the storage address of the first of the a pieces of data to be reassembled stored in the (k+1)-th column of the D flip-flop array is (g, k+1), where g and k are positive integers;
  • the processor 21 is further configured to read by row the recombined data stored in the D flip-flop array, and write the recombined data read by row into the storage medium corresponding to the target address by column.
  • if the cache unit of the DMA is a static random access memory (SRAM),
  • the processor 21 is further configured to store the read a pieces of data to be reassembled by row in the SRAM, wherein the storage address of the first of the a pieces of data to be reassembled stored in the m-th row of the SRAM is (m, n), and the storage address of the first of the a pieces of data to be reassembled stored in the (m+1)-th row of the SRAM is (m+b, n+b), where m, n and b are all positive integers;
  • the processor 21 is further configured to read one piece of recombined data from the SRAM every b columns, and write the a pieces of recombined data read by row or by column into the storage medium corresponding to the target address, wherein the recombined data read from each column has a different row address in the SRAM, and the a pieces of recombined data read have the same attribute.
  • if the cache unit of the DMA is a static random access memory (SRAM),
  • the processor 21 is further configured to store the read a pieces of data to be reassembled by row in the SRAM, wherein the storage address of the first of the a pieces of data to be reassembled read the i-th time is (i, j), and the storage address of the first of the a pieces of data to be reassembled read the (i+1)-th time is (i, j+a);
  • the processor 21 is further configured to read one piece of recombined data every a columns from the same row of the SRAM, and write the a pieces of recombined data read by row or by column into the storage medium corresponding to the target address, wherein the a pieces of recombined data read have the same attribute.
  • the DMA provided by the embodiment of the present application may be used to implement the foregoing method embodiments, and the implementation principles and technical effects thereof are similar, and details are not described herein again.
  • the present application also provides a readable storage medium in which instructions are stored, and when at least one processor of the DMA executes the instruction, the DMA performs a recombination method of the data provided in any of the foregoing method embodiments.
  • the application also provides a program product comprising instructions stored in a readable storage medium.
  • At least one processor of the DMA can read the instruction from a readable storage medium and execute the instruction such that the DMA implements a method of recombining the data provided in any of the method embodiments.
  • the processor may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or the like.
  • the general-purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in connection with the present application may be performed directly by a hardware processor or by a combination of hardware and software modules in a processor.
  • All or part of the steps of the above method embodiments may be performed by hardware controlled by program instructions. The program can be stored in a readable memory, and when the program is executed, the steps of the foregoing method embodiments are performed. The memory (storage medium) includes read-only memory (ROM), RAM, flash memory, hard disk, solid state drive, magnetic tape, floppy disk, optical disc, and any combination thereof.
  • the disclosed system, apparatus, and method may be implemented in other manners.
  • the device embodiments described above are merely illustrative.
  • the division of the modules or units is only a division by logical function; in actual implementation there may be other ways of dividing them. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in an electrical, mechanical or other form.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
  • the integrated unit if implemented in the form of a software functional unit and sold or used as a standalone product, may be stored in a computer readable storage medium.
  • the computer-readable storage medium includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform all or part of the steps of the methods described in the embodiments of the present application.
  • the foregoing storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or the like.

Abstract

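A data reorganization method and apparatus. The method includes: acquiring a source address and a target address of data to be reassembled and a target number of the data to be reassembled (301); reading the target number of data to be reassembled from the storage medium corresponding to the source address according to an original data format and storing it in a DMA to obtain recombined data, where the original data format is the storage format of the data to be reassembled in the storage medium corresponding to the source address (302); and writing the recombined data in the DMA into the storage medium corresponding to the target address according to a preset data format (303). This improves both the utilization of bandwidth resources and the efficiency of data format reorganization.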

Description

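Data Reorganization Method and Apparatus
Technical Field
The embodiments of the present application relate to computer technologies, and in particular, to a data reorganization method and apparatus.
Background
With the development of wireless technology, digital signal processors are required to have ever greater data processing capability, and vector bit widths have grown accordingly. To achieve a high degree of data parallelism, processors also place requirements on the storage format of data. FIG. 1a shows the data format expected by the processor, and FIG. 1b shows the data format in memory. As shown in FIG. 1a and FIG. 1b, each square represents one piece of data of 32 bits, and squares with the same pattern represent data with the same attribute. The processor expects data with the same attribute, that is, data processed in the same way, to be stored together so that it can process the data more efficiently, but data is actually stored in memory in a scattered manner. Therefore, a terminal device usually needs to reform the data format in memory.
FIG. 2 is a schematic diagram of data reorganization in the prior art. As shown in FIG. 2, when the data format in memory is reformed, the usual practice is to use direct memory access (DMA) to read data from the memory in sequence according to the original data format in the memory and write it into a target memory, so that the data format in the target memory is the data format expected by the processor.
However, the bus is wide (256 bits or more), whereas each DMA operation handles only 32 bits of contiguous data, so the DMA moves data very inefficiently, which makes data reorganization inefficient.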
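Summary
The embodiments of the present application provide a data reorganization method and apparatus to improve the efficiency of data reorganization.
In a first aspect, an embodiment of the present application provides a data reorganization method applied to a DMA, the method including:
acquiring a source address and a target address of data to be reassembled and a target number of the data to be reassembled;
reading the target number of data to be reassembled from the storage medium corresponding to the source address according to an original data format and storing it in the DMA to obtain recombined data, where the original data format is the storage format of the data to be reassembled in the storage medium corresponding to the source address; and
writing the recombined data in the DMA into the storage medium corresponding to the target address according to a preset data format.
In this solution, the DMA reads the target number of data to be reassembled from the storage medium corresponding to the source address according to the storage format of the data to be reassembled in that storage medium, and stores the read data in the DMA, for example in the internal cache of the DMA.
In addition, because the DMA reads the data to be reassembled into the DMA from the storage medium corresponding to the source address according to the original data format and the bus bit width, and then writes the recombined data stored in the DMA into the storage medium corresponding to the target address according to the preset data format, reorganizing the data format by reading one piece of data at a time, as in the prior art, is avoided. This improves both the utilization of bandwidth resources and the efficiency of data format reorganization.
In a possible implementation, the reading the target number of data to be reassembled from the storage medium corresponding to the source address according to the original data format and storing it in the DMA includes:
Step A: reading a pieces of data to be reassembled from the storage medium corresponding to the source address according to the original data format, and storing the read a pieces of data to be reassembled in a cache unit of the DMA, where a is a positive integer greater than or equal to 2 and less than or equal to the target number;
Step B: repeating Step A until the number of pieces of data to be reassembled that have been read equals the target number.
In this solution, when reading the target number of data to be reassembled from the storage medium corresponding to the source address, the DMA may read in several passes according to the bus bit width. For example, it may read a pieces of data to be reassembled from the storage medium corresponding to the source address according to the original data format and store them in the cache unit of the DMA; after storing them, it reads and stores another a pieces, and repeats the read and store operations until the number of pieces of data to be reassembled that have been read equals the target number, thereby obtaining the recombined data stored in the DMA.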
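In a possible implementation, if the cache unit of the DMA is a D flip-flop array,
the storing the read a pieces of data to be reassembled in the cache unit of the DMA includes:
storing the read a pieces of data to be reassembled by row in the D flip-flop array, where the storage address of the first of the a pieces of data to be reassembled stored in the p-th row of the D flip-flop array is (p, q), and the storage address of the first of the a pieces of data to be reassembled stored in the (p+1)-th row of the D flip-flop array is (p+1, q), where p and q are positive integers;
the writing the recombined data in the DMA into the storage medium corresponding to the target address according to the preset data format includes:
reading by column the recombined data stored in the D flip-flop array; and
writing the recombined data read by column into the storage medium corresponding to the target address by row.
In the above solution, when several data format reforming buffers are added to the DMA with the D flip-flop array as the basic storage unit, the DMA reads one or more bus widths of data to be reassembled at a time from the storage medium corresponding to the source address according to the original data format and stores it by row in the D flip-flop array. After the DMA has read data until a whole data format reforming buffer is full, processing similar to a matrix transpose is performed on the D flip-flop array: in the input direction, data is written into the internal cache in units of rows, and in the output direction, data is read from the internal cache in units of columns. DMA output-direction processing is then started, and data is written to the storage medium corresponding to the target address in the data format expected by the processor, one or more bus widths at a time. The write operation is repeated until all the data in the whole data format reforming buffer has been read out.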
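In a possible implementation, if the cache unit of the DMA is a D flip-flop array,
the storing the read a pieces of data to be reassembled in the cache unit of the DMA includes:
storing the read a pieces of data to be reassembled by column in the D flip-flop array, where the storage address of the first of the a pieces of data to be reassembled stored in the k-th column of the D flip-flop array is (g, k), and the storage address of the first of the a pieces of data to be reassembled stored in the (k+1)-th column of the D flip-flop array is (g, k+1), where g and k are positive integers;
the writing the recombined data in the DMA into the storage medium corresponding to the target address according to the preset data format includes:
reading by row the recombined data stored in the D flip-flop array; and
writing the recombined data read by row into the storage medium corresponding to the target address by column.
In the above solution, when several data format reforming buffers are added to the DMA with the D flip-flop array as the basic storage unit, the DMA reads one or more bus widths of data to be reassembled at a time from the storage medium corresponding to the source address according to the original data format and stores it by column in the D flip-flop array. After the DMA has read data until a whole data format reforming buffer is full, processing similar to a matrix transpose is performed on the D flip-flop array: in the input direction, data is written into the internal cache in units of columns, and in the output direction, data is read from the internal cache in units of rows. DMA output-direction processing is then started, and data is written to the storage medium corresponding to the target address in the data format expected by the processor, one or more bus widths at a time. The write operation is repeated until all the data in the whole data format reforming buffer has been read out.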
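In a possible implementation, if the cache unit of the DMA is a static random access memory (SRAM),
the storing the read a pieces of data to be reassembled in the cache unit of the DMA includes:
storing the read a pieces of data to be reassembled by row in the SRAM, where the storage address of the first of the a pieces of data to be reassembled stored in the m-th row of the SRAM is (m, n), and the storage address of the first of the a pieces of data to be reassembled stored in the (m+1)-th row of the SRAM is (m+b, n+b), where m, n and b are all positive integers;
the writing the recombined data in the DMA into the storage medium corresponding to the target address according to the preset data format includes:
reading one piece of recombined data from the SRAM every b columns, and writing the a pieces of recombined data read by row or by column into the storage medium corresponding to the target address, where the recombined data read from each column has a different row address in the SRAM, and the a pieces of recombined data read have the same attribute.
In the above solution, when several data format reforming buffers are added to the DMA with the SRAM as the basic storage unit, the DMA reads one or more bus widths of data to be reassembled at a time from the storage medium corresponding to the source address according to the original data format and stores it by row in the SRAM. After the DMA has read data until a whole data format reforming buffer is full, it reads data from the internal cache, starts DMA output-direction processing, and writes data to the storage medium corresponding to the target address in the data format expected by the processor.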
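In a possible implementation, if the cache unit of the DMA is a static random access memory (SRAM),
the storing the read a pieces of data to be reassembled in the cache unit of the DMA includes:
storing the read a pieces of data to be reassembled by row in the SRAM, where the storage address of the first of the a pieces of data to be reassembled read the i-th time is (i, j), and the storage address of the first of the a pieces of data to be reassembled read the (i+1)-th time is (i, j+a);
the writing the recombined data in the DMA into the storage medium corresponding to the target address according to the preset data format includes:
reading one piece of recombined data every a columns from the same row of the SRAM, and writing the a pieces of recombined data read by row or by column into the storage medium corresponding to the target address, where the a pieces of recombined data read have the same attribute.
In the above solution, when several data format reforming buffers are added to the DMA with a large-bit-width SRAM as the basic storage unit, the DMA reads one or more bus widths of data to be reassembled at a time from the storage medium corresponding to the source address according to the original data format and stores it by row in the SRAM. After the DMA stores the data to be reassembled in the SRAM, it starts reading data from the internal cache, starts DMA output-direction processing, and writes data to the storage medium corresponding to the target address in the data format expected by the processor, one or more bus widths at a time.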
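In a second aspect, an embodiment of the present application provides a data reorganization apparatus, including:
an obtaining module, configured to obtain a source address and a target address of data to be reassembled and a target number of the data to be reassembled;
a reading module, configured to read the target number of data to be reassembled from the storage medium corresponding to the source address according to an original data format and store it in a DMA to obtain recombined data, where the original data format is the storage format of the data to be reassembled in the storage medium corresponding to the source address; and
a writing module, configured to write the recombined data in the DMA into the storage medium corresponding to the target address according to a preset data format.
In a possible implementation, the reading module is specifically configured to perform:
Step A: read a pieces of data to be reassembled from the storage medium corresponding to the source address according to the original data format, and store the read a pieces of data to be reassembled in a cache unit of the DMA, where a is a positive integer greater than or equal to 2 and less than or equal to the target number;
Step B: repeat Step A until the number of pieces of data to be reassembled that have been read equals the target number.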
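In a possible implementation, if the cache unit of the DMA is a D flip-flop array,
the reading module is specifically configured to:
store the read a pieces of data to be reassembled by row in the D flip-flop array, where the storage address of the first of the a pieces of data to be reassembled stored in the p-th row of the D flip-flop array is (p, q), and the storage address of the first of the a pieces of data to be reassembled stored in the (p+1)-th row of the D flip-flop array is (p+1, q), where p and q are positive integers;
the writing module is specifically configured to:
read by column the recombined data stored in the D flip-flop array; and
write the recombined data read by column into the storage medium corresponding to the target address by row.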
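In a possible implementation, if the cache unit of the DMA is a D flip-flop array,
the reading module is specifically configured to:
store the read a pieces of data to be reassembled by column in the D flip-flop array, where the storage address of the first of the a pieces of data to be reassembled stored in the k-th column of the D flip-flop array is (g, k), and the storage address of the first of the a pieces of data to be reassembled stored in the (k+1)-th column of the D flip-flop array is (g, k+1), where g and k are positive integers;
the writing module is specifically configured to:
read by row the recombined data stored in the D flip-flop array; and
write the recombined data read by row into the storage medium corresponding to the target address by column.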
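In a possible implementation, if the cache unit of the DMA is a static random access memory (SRAM),
the reading module is specifically configured to:
store the read a pieces of data to be reassembled by row in the SRAM, where the storage address of the first of the a pieces of data to be reassembled stored in the m-th row of the SRAM is (m, n), and the storage address of the first of the a pieces of data to be reassembled stored in the (m+1)-th row of the SRAM is (m+b, n+b), where m, n and b are all positive integers;
the writing module is specifically configured to:
read one piece of recombined data from the SRAM every b columns, and write the a pieces of recombined data read by row or by column into the storage medium corresponding to the target address, where the recombined data read from each column has a different row address in the SRAM, and the a pieces of recombined data read have the same attribute.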
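In a possible implementation, if the cache unit of the DMA is a static random access memory (SRAM),
the reading module is specifically configured to:
store the read a pieces of data to be reassembled by row in the SRAM, where the storage address of the first of the a pieces of data to be reassembled read the i-th time is (i, j), and the storage address of the first of the a pieces of data to be reassembled read the (i+1)-th time is (i, j+a);
the writing module is specifically configured to:
read one piece of recombined data every a columns from the same row of the SRAM, and write the a pieces of recombined data read by row or by column into the storage medium corresponding to the target address, where the a pieces of recombined data read have the same attribute.
For the beneficial effects of the data reorganization apparatus provided by the second aspect and its possible implementations, refer to the beneficial effects of the first aspect and its possible implementations; details are not repeated here.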
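In a third aspect, an embodiment of the present application provides a DMA, including a memory and a processor;
the memory is configured to store program instructions;
the processor is configured to invoke the program instructions in the memory to perform the method of the first aspect and its possible implementations.
For the beneficial effects of the DMA provided by the third aspect and its possible implementations, refer to the beneficial effects of the first aspect and its possible implementations; details are not repeated here.
A fourth aspect of the present application provides a data reorganization apparatus, including at least one processing element (or chip) for performing the method of the first aspect.
A fifth aspect of the present application provides a program which, when executed by a processor, is used to perform the method of the first aspect.
A sixth aspect of the present application provides a program product, such as a computer-readable storage medium, including the program of the fifth aspect.
A seventh aspect of the present application provides a computer-readable storage medium storing instructions which, when run on a computer, cause the computer to perform the method of the first aspect.
The embodiments of the present application provide a data reorganization method and apparatus. The source address and the target address of the data to be reassembled and the target number of the data to be reassembled are acquired; the target number of data to be reassembled is read from the storage medium corresponding to the source address according to the original data format and stored in the DMA to obtain recombined data; and the recombined data in the DMA is written into the storage medium corresponding to the target address according to a preset data format. Because the DMA reads the data to be reassembled into the DMA from the storage medium corresponding to the source address according to the original data format and the bus bit width, and then writes the recombined data stored in the DMA into the storage medium corresponding to the target address according to the preset data format, reorganizing the data format by reading one piece of data at a time, as in the prior art, is avoided. This improves both the utilization of bandwidth resources and the efficiency of data format reorganization.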
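Brief Description of the Drawings
FIG. 1a shows the data format expected by the processor;
FIG. 1b shows the data format in memory;
FIG. 2 is a schematic diagram of data reorganization in the prior art;
FIG. 3 is a schematic flowchart of Embodiment 1 of the data reorganization method of the present application;
FIG. 4 is a schematic diagram of data reorganization performed by the DMA;
FIG. 5 is another schematic diagram of data reorganization performed by the DMA;
FIG. 6 is yet another schematic diagram of data reorganization performed by the DMA;
FIG. 7 is yet another schematic diagram of data reorganization performed by the DMA;
FIG. 8 is a schematic structural diagram of Embodiment 1 of the data reorganization apparatus provided by an embodiment of the present application;
FIG. 9 is a schematic structural diagram of a DMA embodiment provided by an embodiment of the present application.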
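Detailed Description
The data reorganization method in the embodiments of the present application is applicable to terminal devices that use a DMA to reform data formats. It mainly addresses the problem that, during data format reforming, the bus is wide but each DMA operation handles little contiguous data, which both wastes bus bandwidth and makes the DMA move data inefficiently. In the prior art, to achieve a high degree of data parallelism, the processor requires data with the same attribute to be stored together so that it can process the data more efficiently, but data is actually stored in memory in a scattered manner, so the terminal device needs to reform the data format in memory. When the data format in memory is reformed, a DMA is usually used to read data from the memory in sequence according to the original data format in the memory and write it into a target memory, so that the data format in the target memory is the data format expected by the processor. When reading data, the DMA reads only one data block from the memory at a time, for example only 32 bits of contiguous data per operation, whereas the bus is usually wide, for example 256 bits or more. This wastes a large part of the bus bit width and makes the DMA move data inefficiently.
Therefore, the data reorganization method and device provided by the embodiments of the present invention aim to solve the technical problems in the prior art that a large part of the bus bit width is wasted and the DMA moves data inefficiently during data format reorganization.
The technical solutions of the present application are described in detail below with specific embodiments. The following specific embodiments may be combined with each other, and the same or similar concepts or processes may not be repeated in some embodiments.
FIG. 3 is a schematic flowchart of Embodiment 1 of the data reorganization method of the present application. This embodiment provides a data reorganization method, which may be performed by any apparatus that performs the data reorganization method, and the apparatus may be implemented by software and/or hardware. In this embodiment, the apparatus may be integrated in a DMA. As shown in FIG. 3, the method of this embodiment may include: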
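Step 301: Acquire a source address and a target address of the data to be reassembled and a target number of the data to be reassembled.
In this embodiment, when processing data, the processor usually needs the data format to be reformed in order to achieve a high degree of data parallelism. In practice, the processor may send to the DMA the source address at which the data to be reassembled is stored, the target address at which the reformed data is to be stored, and the target number of the data to be reassembled.
Step 302: Read the target number of data to be reassembled from the storage medium corresponding to the source address according to the original data format and store it in the DMA to obtain recombined data, where the original data format is the storage format of the data to be reassembled in the storage medium corresponding to the source address.
In this embodiment, FIG. 4 is a schematic diagram of data reorganization performed by the DMA. As shown in FIG. 4, the DMA reads the target number of data to be reassembled from the storage medium corresponding to the source address according to the storage format of the data to be reassembled in that storage medium, and stores the read data in the DMA, for example in the internal cache of the DMA. The storage medium includes any carrier capable of storing data, such as a hard disk, an optical disc, or memory.
Optionally, reading the target number of data to be reassembled from the storage medium corresponding to the source address according to the original data format and storing it in the DMA may specifically include the following steps:
Step A: Read a pieces of data to be reassembled from the storage medium corresponding to the source address according to the original data format, and store the read a pieces of data to be reassembled in a cache unit of the DMA, where a is a positive integer greater than or equal to 2 and less than or equal to the target number.
Step B: Repeat Step A until the number of pieces of data to be reassembled that have been read equals the target number.
Specifically, when reading the target number of data to be reassembled from the storage medium corresponding to the source address, the DMA may read in several passes according to the bus bit width. For example, it may read a pieces of data to be reassembled from the storage medium corresponding to the source address according to the original data format and store them in the cache unit of the DMA; after storing them, it reads and stores another a pieces, and repeats the read and store operations until the number of pieces of data to be reassembled that have been read equals the target number, thereby obtaining the recombined data stored in the DMA. For example, as shown in FIG. 4, if the target number is 64, the bus bit width is 256 bits, and one piece of data is 32 bits, the DMA can read 8 pieces of data at a time from the storage medium corresponding to the source address. After reading them, it stores the 8 pieces in the cache unit of the DMA, then reads and stores another 8 pieces from the storage medium corresponding to the source address, and so on until 64 pieces have been read. Because the bus bit width is 256 bits and the DMA can read 8 pieces of data at a time, this not only reduces bus bandwidth consumption but also greatly improves the efficiency with which the DMA reads data. It should be noted that the DMA may also read 7 or 6 pieces of data at a time, as long as the number of bits read each time does not exceed the bus bit width. A person skilled in the art will understand that data is read most efficiently when the total number of bits the DMA reads each time equals the bus bit width.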
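The arithmetic of the FIG. 4 example (256-bit bus, 32-bit data, target number 64) can be checked with a short Python sketch. The helper name `burst_plan` and the notion of "utilization" as the fraction of bus bits carrying data in each read are assumptions introduced for illustration.

```python
import math


def burst_plan(bus_width_bits, item_bits, target_count):
    """Return (items per read, number of reads, fraction of the bus used per read)."""
    items_per_read = bus_width_bits // item_bits
    reads = math.ceil(target_count / items_per_read)
    utilization = items_per_read * item_bits / bus_width_bits
    return items_per_read, reads, utilization


full = burst_plan(256, 32, 64)      # (8, 8, 1.0): 8 items per read, 8 reads
single_item_utilization = 32 / 256  # 0.125 when only one 32-bit item moves per operation
```

Under these assumptions, moving a single 32-bit item per operation uses one eighth of a 256-bit bus, whereas reading 8 items at a time fills it.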
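Step 303: Write the recombined data in the DMA into the storage medium corresponding to the target address according to a preset data format.
In this embodiment, the preset data format is the storage format expected by the processor, for example storing data with the same attribute together. Data with the same attribute refers to data that solve the same problem, data that have the same function, or data that the processor processes in the same way. Referring again to FIG. 4, after the DMA reads the data to be reassembled into its cache unit, it writes the recombined data stored in the cache unit into the storage medium corresponding to the target address according to the preset data format.
It should be noted that the DMA may perform step 303 after step 302 is completed, or may perform step 303 while step 302 is in progress. Specifically, the DMA may write the recombined data in the DMA into the storage medium corresponding to the target address only after all of the target number of data to be reassembled has been stored in the cache unit of the DMA. For example, if the target number is 64 and the cache unit of the DMA also holds 64 pieces, the DMA may store all 64 pieces of data to be reassembled in the cache unit and then write the recombined data into the storage medium corresponding to the target address in several passes. Alternatively, after part of the data to be reassembled has been stored in the cache unit, the DMA may, while storing the remaining data to be reassembled, write the recombined data already stored in the cache unit into the storage medium corresponding to the target address in the order in which it was stored. For example, if the target number is 6000 and the cache unit of the DMA holds 64 pieces, the DMA may start writing the recombined data in the cache unit into the storage medium corresponding to the target address once 64 pieces of data to be reassembled have been stored, and while the recombined data is being read out of the cache unit, new data to be reassembled is read in from the storage medium corresponding to the source address. Therefore, the order in which step 302 and step 303 are performed is not limited in the embodiments of the present application.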
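The overlapped case (target number 6000 with a 64-entry cache unit) can be modelled with two alternating buffers, as in the Python sketch below. This is a sequential model only: the loop fills and drains one buffer per iteration, whereas in hardware the idle buffer would be filled while the other is being drained. The names `reorganize_stream` and `by_column`, the two-buffer arrangement, and the handling of a short final block are all assumptions of this sketch.

```python
def reorganize_stream(source, capacity, regroup):
    """Process source in blocks of at most `capacity` items using two buffers.

    Sequential model: real hardware would overlap filling one buffer
    with draining the other (assumption of this sketch).
    """
    buffers = [[], []]
    active = 0
    output = []
    blocks = 0
    for start in range(0, len(source), capacity):
        buffers[active] = source[start:start + capacity]  # input direction
        output.extend(regroup(buffers[active]))           # output direction
        active ^= 1                                       # swap buffers
        blocks += 1
    return output, blocks


def by_column(block, a=8):
    """Regroup a row-major block of a-item bursts so same-position items are adjacent."""
    rows = [block[i:i + a] for i in range(0, len(block), a)]
    return [row[c] for c in range(a) for row in rows]


output, blocks = reorganize_stream(list(range(6000)), 64, by_column)
```

With 6000 items and 64 items per block, the model processes 93 full blocks and one final block of 48 items.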
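In the data reorganization method provided by the embodiments of the present application, the source address and the target address of the data to be reassembled and the target number of the data to be reassembled are acquired; the target number of data to be reassembled is read from the storage medium corresponding to the source address according to the original data format and stored in the DMA to obtain recombined data; and the recombined data in the DMA is written into the storage medium corresponding to the target address according to a preset data format. Because the DMA reads the data to be reassembled into the DMA from the storage medium corresponding to the source address according to the original data format and the bus bit width, and then writes the recombined data stored in the DMA into the storage medium corresponding to the target address according to the preset data format, reorganizing the data format by reading one piece of data at a time, as in the prior art, is avoided. This improves both the utilization of bandwidth resources and the efficiency of data format reorganization.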
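Optionally, on the basis of the above embodiment, when the cache unit of the DMA differs, the format in which the recombined data is stored in the DMA differs, and so does the way in which the DMA reads the recombined data from the cache unit. Several storage formats of the recombined data in the DMA and the corresponding ways of reading the recombined data are described in detail below.
First type: if the cache unit of the DMA is a D flip-flop array, storing the read a pieces of data to be reassembled in the cache unit of the DMA includes storing the read a pieces of data to be reassembled by row in the D flip-flop array, where the storage address of the first of the a pieces of data to be reassembled stored in the p-th row of the D flip-flop array is (p, q), and the storage address of the first of the a pieces of data to be reassembled stored in the (p+1)-th row of the D flip-flop array is (p+1, q), where p and q are positive integers.
Correspondingly, writing the recombined data in the DMA into the storage medium corresponding to the target address according to the preset data format includes: reading by column the recombined data stored in the D flip-flop array, and writing the recombined data read by column into the storage medium corresponding to the target address by row.
Specifically, when several data format reforming buffers are added to the DMA with the D flip-flop array as the basic storage unit, the DMA reads one or more bus widths of data to be reassembled at a time from the storage medium corresponding to the source address according to the original data format and stores it by row in the D flip-flop array. For example, assuming that a pieces of data to be reassembled have been read and stored, when the storage address of the first of the a pieces stored in the p-th row is (p, q), the storage address of the first of the a pieces stored in the (p+1)-th row is (p+1, q). After the DMA has read data until a whole data format reforming buffer is full, processing similar to a matrix transpose is performed on the D flip-flop array: in the input direction, data is written into the internal cache in units of rows, and in the output direction, data is read from the internal cache in units of columns. DMA output-direction processing is then started, and data is written to the storage medium corresponding to the target address in the data format expected by the processor, one or more bus widths at a time. The write operation is repeated until all the data in the whole data format reforming buffer has been read out. It should be noted that the data input operation for the next block of data can be performed while the DMA is outputting data, so that the data is pipelined.
Referring again to FIG. 4, the DMA stores the 8 pieces of data to be reassembled read the first time in row 1, where the storage address of the first of these 8 pieces is (1, 1), and stores the 8 pieces of data to be reassembled read the second time in row 2, where the storage address of the first of these 8 pieces is (2, 1), and so on, until the internal cache of the DMA is full. The DMA then reads by column the recombined data stored in the D flip-flop array and writes the read data by row into the storage medium corresponding to the target address.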
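The row-in, column-out behaviour described for FIG. 4 is a matrix transpose, which can be sketched in Python as follows. Nested lists stand in for the D flip-flop array, and the names `store_by_row` and `read_by_column` are hypothetical; the sketch assumes an 8 by 8 array and 0-based indices.

```python
def store_by_row(bursts):
    """Each burst of a items fills one row of the flip-flop array."""
    return [list(burst) for burst in bursts]


def read_by_column(array):
    """Read the array column by column; each column becomes one output row."""
    return [list(column) for column in zip(*array)]


# Item "r5c3" is the item at position 3 of the burst read in pass 5.
bursts = [[f"r{r}c{c}" for c in range(8)] for r in range(8)]
array = store_by_row(bursts)
target = read_by_column(array)
```

Each output row collects the items that occupied the same position in every burst, which is the grouping the text associates with items of the same attribute.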
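Second type: if the cache unit of the DMA is a D flip-flop array, storing the read a pieces of data to be reassembled in the cache unit of the DMA includes storing the read a pieces of data to be reassembled by column in the D flip-flop array, where the storage address of the first of the a pieces of data to be reassembled stored in the k-th column of the D flip-flop array is (g, k), and the storage address of the first of the a pieces of data to be reassembled stored in the (k+1)-th column of the D flip-flop array is (g, k+1), where g and k are positive integers.
Correspondingly, writing the recombined data in the DMA into the storage medium corresponding to the target address according to the preset data format includes: reading by row the recombined data stored in the D flip-flop array, and writing the recombined data read by row into the storage medium corresponding to the target address by column.
Specifically, compared with the first type, in this type data is written into the internal cache in units of columns in the input direction and read from the internal cache in units of rows in the output direction. When several data format reforming buffers are added to the DMA with the D flip-flop array as the basic storage unit, the DMA reads one or more bus widths of data to be reassembled at a time from the storage medium corresponding to the source address according to the original data format and stores it by column in the D flip-flop array. For example, assuming that a pieces of data to be reassembled have been read and stored, when the storage address of the first of the a pieces stored in the k-th column is (g, k), the storage address of the first of the a pieces stored in the (k+1)-th column is (g, k+1). After the DMA has read data until a whole data format reforming buffer is full, processing similar to a matrix transpose is performed on the D flip-flop array: in the input direction, data is written into the internal cache in units of columns, and in the output direction, data is read from the internal cache in units of rows. DMA output-direction processing is then started, and data is written to the storage medium corresponding to the target address in the data format expected by the processor, one or more bus widths at a time. The write operation is repeated until all the data in the whole data format reforming buffer has been read out. It should be noted that the data input operation for the next block of data can be performed while the DMA is outputting data, so that the data is pipelined.
FIG. 5 is another schematic diagram of data reorganization performed by the DMA. As shown in FIG. 5, the DMA stores the 8 pieces of data to be reassembled read the first time in column 1, where the storage address of the first of these 8 pieces is (1, 1), and stores the 8 pieces of data to be reassembled read the second time in column 2, where the storage address of the first of these 8 pieces is (1, 2), and so on, until the internal cache of the DMA is full. The DMA then reads by row the recombined data stored in the D flip-flop array and writes the read data by column into the storage medium corresponding to the target address.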
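The column-in, row-out variant of FIG. 5 can be sketched in Python in the same spirit. Again this is only an illustrative model with 0-based indices, and `store_by_column` and `read_by_row` are hypothetical names.

```python
def store_by_column(bursts, a):
    """Burst k fills column k of the array, top to bottom."""
    array = [[None] * len(bursts) for _ in range(a)]
    for k, burst in enumerate(bursts):
        for g, value in enumerate(burst):
            array[g][k] = value
    return array


def read_by_row(array):
    """Read the array row by row; row g holds the g-th item of every burst."""
    return [list(row) for row in array]


# Item (k, g) is the g-th item of the k-th burst.
bursts = [[(k, g) for g in range(8)] for k in range(8)]
array = store_by_column(bursts, 8)
rows = read_by_row(array)
```

The regrouping is the same as in the row-in, column-out case; only the direction in which the buffer is written and read is swapped.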
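Third manner: if the buffer unit of the DMA is a static random access memory (SRAM), storing the a read data items to be reorganized in the buffer unit of the DMA includes: storing them row by row in the SRAM, where the first of the a items stored in row m of the SRAM has storage address (m,n), and the first of the a items stored in row m+1 has storage address (m+b,n+b); m, n and b are all positive integers.
Correspondingly, writing the reorganized data in the DMA into the storage medium corresponding to the target address according to the preset data format includes: reading one reorganized data item from the SRAM every b columns, and writing the a reorganized data items that were read into the storage medium corresponding to the target address row by row or column by column, where the reorganized data items read from the individual columns all have different row addresses in the SRAM, and the a reorganized data items that were read have the same attribute.
Specifically, when the buffer unit of the DMA is an SRAM, the DMA can read only one data item from each column when it reads the reorganized data stored in the SRAM, so the data is laid out in the SRAM differently from the layout in the D flip-flop array. In a specific implementation, when several data format reorganization buffers are added to the DMA with an SRAM as the basic storage unit, the DMA reads, in the original data format, one or more bus widths of data to be reorganized from the storage medium corresponding to the source address each time and stores it row by row in the SRAM. For example, if a data items are read and stored each time, the first of the a items stored in row m has storage address (m,n), and the first of the a items stored in row m+1 has storage address (m+b,n+b). After the DMA has read data until a whole data format reorganization buffer is full, it reads data from the internal buffer, starts its output-direction processing, and writes data into the storage medium corresponding to the target address in the data format expected by the processor, one or more bus widths per write. Specifically, the DMA reads one reorganized data item from the SRAM every b columns; the items read from the individual columns all have different row addresses in the SRAM, and the a items that are read have the same attribute. Having the same attribute can mean that the a reorganized data items are data for solving the same problem, that they are data with the same function, that the processor processes them in the same way, and so on. The DMA repeats this read operation until all data in the whole data format reorganization buffer has been read out. Note that while the DMA outputs one block of data, it can already perform the data input operation for the next block, so that the data is pipelined.
Figure 6 is a further schematic diagram of data reorganization by the DMA. As shown in Figure 6, if b is 1, the DMA stores the 8 data items from the first read in row 1, where the first of these 8 items has storage address (1,1); it stores the 8 data items from the second read in row 2, where the first of these 8 items has storage address (2,2); and so on, until the internal buffer of the DMA is full. The DMA then reads one reorganized data item from each column of the SRAM in turn and writes the 8 reorganized data items it has read into the storage medium corresponding to the target address row by row or column by column.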
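One way to read the Figure 6 layout is as a skewed (rotated) storage scheme: each row is shifted by b more columns than the previous one, so that items with the same attribute end up in different columns and different rows. The Python below is a sketch under several assumptions that the text does not state: the shift wraps around modulo the row width, the number of rows does not exceed `a`, and `a` and `b` are coprime. The names `store_skewed` and `read_attribute` are hypothetical.

```python
from math import gcd


def store_skewed(source, a, b=1):
    """Store read r as row r, rotated right by r * b columns (wrap-around assumed)."""
    reads = len(source) // a
    if len(source) % a != 0 or reads > a or gcd(a, b) != 1:
        raise ValueError("unsupported sizes for this sketch")
    sram = [[None] * a for _ in range(reads)]
    for r in range(reads):
        for k in range(a):
            sram[r][(k + r * b) % a] = source[r * a + k]
    return sram


def read_attribute(sram, k, b=1):
    """Collect item k of every read; returns (row, column, value) triples."""
    a = len(sram[0])
    picks = []
    for r, row in enumerate(sram):
        col = (k + r * b) % a   # each pick comes from a different column
        picks.append((r, col, row[col]))
    return picks


sram = store_skewed(list(range(16)), 4)
picks = read_attribute(sram, 1)
```

Because every pick lands in a different column and a different row, the constraint that only one item can be read from each column at a time is respected in this model.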
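Fourth manner: if the buffer unit of the DMA is an SRAM, storing the a read data items to be reorganized in the buffer unit of the DMA includes: storing them row by row in the SRAM, where the first of the a items from the i-th read has storage address (i,j), and the first of the a items from the (i+1)-th read has storage address (i,j+a).
Correspondingly, writing the reorganized data in the DMA into the storage medium corresponding to the target address according to the preset data format includes: reading one reorganized data item every a columns from the same row of the SRAM, and writing the a reorganized data items that were read into the storage medium corresponding to the target address row by row or column by column, where the a reorganized data items that were read have the same attribute.
Specifically, in contrast to the third manner, the SRAM in this manner can be a wide SRAM, in which case the target quantity of data to be reorganized can all be stored in the same row of the SRAM. When several data format reorganization buffers are added to the DMA with a wide SRAM as the basic storage unit, the DMA reads, in the original data format, one or more bus widths of data to be reorganized from the storage medium corresponding to the source address each time and stores it row by row in the SRAM. For example, if a data items are read in the i-th read, the first of these a items is stored at address (i,j); when another a items are read in the (i+1)-th read, the first of those a items is stored at address (i,j+a). After the DMA has stored the data to be reorganized in the SRAM, it begins to read data from the internal buffer, starts its output-direction processing, and writes data into the storage medium corresponding to the target address in the data format expected by the processor, one or more bus widths per write. Specifically, it reads one reorganized data item every a columns from the same row of the SRAM and writes the a reorganized data items that were read into the storage medium corresponding to the target address row by row or column by column, where the a items that were read have the same attribute. Having the same attribute can mean that the a reorganized data items are data for solving the same problem, that they are data with the same function, that the processor processes them in the same way, and so on. The DMA repeats this read operation until all data in the whole data format reorganization buffer has been read out. Note that while the DMA outputs one block of data, it can already perform the data input operation for the next block, so that the data is pipelined.
Figure 7 is a further schematic diagram of data reorganization by the DMA. As shown in Figure 7, the DMA stores the 8 data items from the first read in row 1, where the first of these 8 items has storage address (1,1); it also stores the 8 data items from the second read in row 1, where the first of these 8 items has storage address (1,9); and so on, until the first row of the internal buffer of the DMA is full. The DMA then reads one reorganized data item every 8 columns from the same row of the SRAM and writes the 8 reorganized data items it has read into the storage medium corresponding to the target address row by row or column by column.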
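The wide-SRAM scheme of Figure 7 amounts to appending every read to one long row and then picking items at a stride of a. The following Python is an illustrative sketch; `store_in_one_row` and `read_with_stride` are hypothetical names, and a Python list stands in for one SRAM row.

```python
def store_in_one_row(source, a):
    """Append each read of `a` items to the same row: read i starts at column i * a."""
    if len(source) % a != 0:
        raise ValueError("source length must be a multiple of a (assumption)")
    row = []
    for start in range(0, len(source), a):
        row.extend(source[start:start + a])
    return row


def read_with_stride(row, a):
    """For each offset k, take one item every `a` columns of the row."""
    return [row[k::a] for k in range(a)]


wide_row = store_in_one_row(list(range(6)), 3)
strided = read_with_stride(wide_row, 3)
```

Each inner list of `strided` gathers the items that sat at the same position within their respective reads.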
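Optionally, the buffer unit of the DMA can also be assembled from several narrow SRAMs. In this case, too, the target quantity of data to be reorganized can all be stored in the same row of the SRAM; when the reorganized data in the DMA is written into the storage medium corresponding to the target address, one reorganized data item can be read every a columns from the same row of the SRAM, and the a reorganized data items that were read can be written into the storage medium corresponding to the target address row by row or column by column.
It is also worth noting that the way in which the data to be reorganized is stored in the DMA is not limited to the four manners above. The data can be stored in the DMA in other ways, provided that the DMA stores the data to be reorganized from the storage medium corresponding to the source address in the DMA according to the original data format to obtain the reorganized data, and then writes the reorganized data into the storage medium corresponding to the target address according to the preset data format. This embodiment places no restriction on the specific way in which the data to be reorganized is stored in the DMA.
In the data reorganization method provided by the embodiments of this application, the source address and target address of the data to be reorganized and the target quantity of the data to be reorganized are obtained; the target quantity of data to be reorganized is read, in the original data format, from the storage medium corresponding to the source address and stored in the DMA to obtain the reorganized data; and the reorganized data in the DMA is then written into the storage medium corresponding to the target address according to the preset data format. Because the DMA reads the data to be reorganized into the DMA in the original data format according to the bus width, and then writes the reorganized data stored in the DMA into the storage medium corresponding to the target address according to the preset data format, it avoids reorganizing the data format by reading individual data items as in the prior art. This improves both the utilization of bandwidth resources and the efficiency of data format reorganization. In addition, depending on the buffer unit of the DMA, the data to be reorganized can be stored in the DMA in different formats, which makes the handling of the data to be reorganized more flexible.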
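The overall flow summarized above (obtain the source address, target address and target quantity, read a items at a time into the buffer, then write out in the preset format) can be sketched as follows. This is a software model under stated assumptions, not the hardware described in the application: `memory` is a flat Python list, addresses are list indices, the preset format is taken to be column-by-column output, and the target quantity is assumed to be a multiple of `a`. The name `dma_reorganize` is hypothetical.

```python
def dma_reorganize(memory, src, dst, count, a):
    """Copy `count` items from `src` to `dst`, regrouping them through a buffer."""
    if not 2 <= a <= count or count % a != 0:
        raise ValueError("need 2 <= a <= count and count divisible by a (assumption)")
    buffer = []
    read = 0
    # Repeat the read step until the target quantity has been read.
    while read < count:
        buffer.append(memory[src + read:src + read + a])
        read += a
    # Write phase: drain the buffer column by column into the target region.
    out = dst
    for col in range(a):
        for row in buffer:
            memory[out] = row[col]
            out += 1


memory = list("xyzxyzxyz") + [None] * 9
dma_reorganize(memory, 0, 9, 9, 3)
```

After the call, the target region holds the items grouped by attribute, while the source region is left untouched.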
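Figure 8 is a schematic structural diagram of Embodiment 1 of the data reorganization apparatus provided by the embodiments of this application. The reorganization apparatus may be a standalone DMA or an apparatus integrated in a DMA, and it may be implemented in software, hardware, or a combination of the two. As shown in Figure 8, the reorganization apparatus includes:
an obtaining module 11, configured to obtain the source address and target address of the data to be reorganized and the target quantity of the data to be reorganized;
a reading module 12, configured to read, in the original data format, the target quantity of data to be reorganized from the storage medium corresponding to the source address and store it in the DMA to obtain the reorganized data, where the original data format is the format in which the data to be reorganized is stored in the storage medium corresponding to the source address;
a writing module 13, configured to write the reorganized data in the DMA into the storage medium corresponding to the target address according to the preset data format.
The data reorganization apparatus provided by this embodiment of the application can carry out the method embodiments above; its implementation principle and technical effects are similar and are not repeated here.
Optionally, the reading module 12 is specifically configured to perform:
Step A: reading a data items to be reorganized from the storage medium corresponding to the source address according to the original data format, and storing the a read data items in the buffer unit of the DMA, where a is a positive integer greater than or equal to 2 and less than or equal to the target quantity;
Step B: repeating Step A until the quantity of data to be reorganized that has been read equals the target quantity.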
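Optionally, if the buffer unit of the DMA is a D flip-flop array:
the reading module 12 is specifically configured to:
store the a read data items to be reorganized row by row in the D flip-flop array, where the first of the a items stored in row p of the D flip-flop array has storage address (p,q), and the first of the a items stored in row p+1 has storage address (p+1,q); p and q are positive integers;
the writing module 13 is specifically configured to:
read the reorganized data stored in the D flip-flop array column by column;
write the reorganized data read column by column into the storage medium corresponding to the target address row by row.
Optionally, if the buffer unit of the DMA is a D flip-flop array:
the reading module 12 is specifically configured to:
store the a read data items to be reorganized column by column in the D flip-flop array, where the first of the a items stored in column k of the D flip-flop array has storage address (g,k), and the first of the a items stored in column k+1 has storage address (g,k+1); g and k are positive integers;
the writing module 13 is specifically configured to:
read the reorganized data stored in the D flip-flop array row by row;
write the reorganized data read row by row into the storage medium corresponding to the target address column by column.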
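Optionally, if the buffer unit of the DMA is a static random access memory (SRAM):
the reading module 12 is specifically configured to:
store the a read data items to be reorganized row by row in the SRAM, where the first of the a items stored in row m of the SRAM has storage address (m,n), and the first of the a items stored in row m+1 has storage address (m+b,n+b); m, n and b are all positive integers;
the writing module 13 is specifically configured to:
read one reorganized data item from the SRAM every b columns, and write the a reorganized data items that were read into the storage medium corresponding to the target address row by row or column by column, where the reorganized data items read from the individual columns all have different row addresses in the SRAM, and the a reorganized data items that were read have the same attribute.
Optionally, if the buffer unit of the DMA is a static random access memory (SRAM):
the reading module 12 is specifically configured to:
store the a read data items to be reorganized row by row in the SRAM, where the first of the a items from the i-th read has storage address (i,j), and the first of the a items from the (i+1)-th read has storage address (i,j+a);
the writing module 13 is specifically configured to:
read one reorganized data item every a columns from the same row of the SRAM, and write the a reorganized data items that were read into the storage medium corresponding to the target address row by row or column by column, where the a reorganized data items that were read have the same attribute.
The data reorganization apparatus provided by this embodiment of the application can carry out the method embodiments above; its implementation principle and technical effects are similar and are not repeated here.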
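Figure 9 is a schematic structural diagram of the DMA embodiment provided by the embodiments of this application. As shown in Figure 9, the DMA may include a processor 21 and a memory 22. The memory 22 may include high-speed RAM and may also include non-volatile memory (NVM), for example at least one disk memory. The memory 22 can store various programs used to perform various processing functions and to implement the method steps of this embodiment.
In this embodiment, the processor 21 is configured to obtain the source address and target address of the data to be reorganized and the target quantity of the data to be reorganized;
the processor 21 is further configured to read, in the original data format, the target quantity of data to be reorganized from the storage medium corresponding to the source address and store it in the DMA to obtain the reorganized data, where the original data format is the format in which the data to be reorganized is stored in the storage medium corresponding to the source address;
the processor 21 is further configured to write the reorganized data in the DMA into the storage medium corresponding to the target address according to the preset data format.
Optionally, the processor 21 is specifically configured to perform:
Step A: reading a data items to be reorganized from the storage medium corresponding to the source address according to the original data format, and storing the a read data items in the buffer unit of the DMA, where a is a positive integer greater than or equal to 2 and less than or equal to the target quantity;
Step B: repeating Step A until the quantity of data to be reorganized that has been read equals the target quantity.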
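Optionally, if the buffer unit of the DMA is a D flip-flop array:
the processor 21 is further configured to store the a read data items to be reorganized row by row in the D flip-flop array, where the first of the a items stored in row p of the D flip-flop array has storage address (p,q), and the first of the a items stored in row p+1 has storage address (p+1,q); p and q are positive integers;
the processor 21 is further configured to read the reorganized data stored in the D flip-flop array column by column, and to write the reorganized data read column by column into the storage medium corresponding to the target address row by row.
Optionally, if the buffer unit of the DMA is a D flip-flop array:
the processor 21 is further configured to store the a read data items to be reorganized column by column in the D flip-flop array, where the first of the a items stored in column k of the D flip-flop array has storage address (g,k), and the first of the a items stored in column k+1 has storage address (g,k+1); g and k are positive integers;
the processor 21 is further configured to read the reorganized data stored in the D flip-flop array row by row, and to write the reorganized data read row by row into the storage medium corresponding to the target address column by column.
Optionally, if the buffer unit of the DMA is a static random access memory (SRAM):
the processor 21 is further configured to store the a read data items to be reorganized row by row in the SRAM, where the first of the a items stored in row m of the SRAM has storage address (m,n), and the first of the a items stored in row m+1 has storage address (m+b,n+b); m, n and b are all positive integers;
the processor 21 is further configured to read one reorganized data item from the SRAM every b columns, and to write the a reorganized data items that were read into the storage medium corresponding to the target address row by row or column by column, where the reorganized data items read from the individual columns all have different row addresses in the SRAM, and the a reorganized data items that were read have the same attribute.
Optionally, if the buffer unit of the DMA is a static random access memory (SRAM):
the processor 21 is further configured to store the a read data items to be reorganized row by row in the SRAM, where the first of the a items from the i-th read has storage address (i,j), and the first of the a items from the (i+1)-th read has storage address (i,j+a);
the processor 21 is further configured to read one reorganized data item every a columns from the same row of the SRAM, and to write the a reorganized data items that were read into the storage medium corresponding to the target address row by row or column by column, where the a reorganized data items that were read have the same attribute.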
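The DMA provided by the embodiments of this application can carry out the method embodiments above; its implementation principle and technical effects are similar and are not repeated here.
This application further provides a readable storage medium in which instructions are stored. When at least one processor of the DMA executes the instructions, the DMA performs the data reorganization method provided in any of the method embodiments above.
This application further provides a program product that includes instructions stored in a readable storage medium. At least one processor of the DMA can read the instructions from the readable storage medium and execute them, so that the DMA carries out the data reorganization method provided in any of the method embodiments.
In a specific implementation of the DMA, it should be understood that the processor may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), and so on. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in this application may be performed directly by a hardware processor or by a combination of hardware and software modules in the processor.
All or some of the steps of the method embodiments above can be carried out by hardware under the control of program instructions. The program can be stored in a readable memory. When executed, the program performs the steps of the method embodiments above. The memory (storage medium) includes: read-only memory (ROM), RAM, flash memory, hard disk, solid-state drive, magnetic tape, floppy disk, optical disc, and any combination thereof.
A person skilled in the art will clearly understand that, for convenience and brevity of description, only the division into the functional modules above is used as an example. In practice, the functions can be assigned to different functional modules as required; that is, the internal structure of the apparatus can be divided into different functional modules to perform all or some of the functions described above. For the specific working processes of the systems, apparatuses and units described above, refer to the corresponding processes in the method embodiments; they are not repeated here.
In the embodiments provided in this application, it should be understood that the disclosed systems, apparatuses and methods can be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into modules or units is only a division by logical function, and other divisions are possible in an actual implementation. Several units or components can be combined or integrated into another system, and some features can be omitted or not performed. Furthermore, the mutual couplings, direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, apparatuses or units, and may be electrical, mechanical or of another form.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over several network units. Some or all of the units can be selected as actually required to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of this application may be integrated in one processing unit, each unit may exist physically on its own, or two or more units may be integrated in one unit. The integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. On this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform all or some of the steps of the methods described in the embodiments of this application. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.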

Claims (12)

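  1. A data reorganization method, characterized in that it is applied in direct memory access (DMA), the method comprising:
    obtaining the source address and target address of the data to be reorganized and the target quantity of the data to be reorganized;
    reading, in the original data format, the target quantity of data to be reorganized from the storage medium corresponding to the source address and storing it in the DMA to obtain reorganized data, wherein the original data format is the format in which the data to be reorganized is stored in the storage medium corresponding to the source address;
    writing the reorganized data in the DMA into the storage medium corresponding to the target address according to a preset data format.
  2. The method according to claim 1, characterized in that reading, in the original data format, the target quantity of data to be reorganized from the storage medium corresponding to the source address and storing it in the DMA comprises:
    Step A: reading a data items to be reorganized from the storage medium corresponding to the source address according to the original data format, and storing the a read data items in the buffer unit of the DMA, wherein a is a positive integer greater than or equal to 2 and less than or equal to the target quantity;
    Step B: repeating Step A until the quantity of data to be reorganized that has been read equals the target quantity.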
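  3. The method according to claim 2, characterized in that, if the buffer unit of the DMA is a D flip-flop array,
    storing the a read data items to be reorganized in the buffer unit of the DMA comprises:
    storing the a read data items to be reorganized row by row in the D flip-flop array, wherein the first of the a items stored in row p of the D flip-flop array has storage address (p,q), and the first of the a items stored in row p+1 has storage address (p+1,q), p and q being positive integers;
    and writing the reorganized data in the DMA into the storage medium corresponding to the target address according to the preset data format comprises:
    reading the reorganized data stored in the D flip-flop array column by column;
    writing the reorganized data read column by column into the storage medium corresponding to the target address row by row.
  4. The method according to claim 2, characterized in that, if the buffer unit of the DMA is a D flip-flop array,
    storing the a read data items to be reorganized in the buffer unit of the DMA comprises:
    storing the a read data items to be reorganized column by column in the D flip-flop array, wherein the first of the a items stored in column k of the D flip-flop array has storage address (g,k), and the first of the a items stored in column k+1 has storage address (g,k+1), g and k being positive integers;
    and writing the reorganized data in the DMA into the storage medium corresponding to the target address according to the preset data format comprises:
    reading the reorganized data stored in the D flip-flop array row by row;
    writing the reorganized data read row by row into the storage medium corresponding to the target address column by column.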
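  5. The method according to claim 2, characterized in that, if the buffer unit of the DMA is a static random access memory (SRAM),
    storing the a read data items to be reorganized in the buffer unit of the DMA comprises:
    storing the a read data items to be reorganized row by row in the SRAM, wherein the first of the a items stored in row m of the SRAM has storage address (m,n), and the first of the a items stored in row m+1 has storage address (m+b,n+b), m, n and b all being positive integers;
    and writing the reorganized data in the DMA into the storage medium corresponding to the target address according to the preset data format comprises:
    reading one reorganized data item from the SRAM every b columns, and writing the a reorganized data items that were read into the storage medium corresponding to the target address row by row or column by column, wherein the reorganized data items read from the individual columns all have different row addresses in the SRAM, and the a reorganized data items that were read have the same attribute.
  6. The method according to claim 2, characterized in that, if the buffer unit of the DMA is a static random access memory (SRAM),
    storing the a read data items to be reorganized in the buffer unit of the DMA comprises:
    storing the a read data items to be reorganized row by row in the SRAM, wherein the first of the a items from the i-th read has storage address (i,j), and the first of the a items from the (i+1)-th read has storage address (i,j+a);
    and writing the reorganized data in the DMA into the storage medium corresponding to the target address according to the preset data format comprises:
    reading one reorganized data item every a columns from the same row of the SRAM, and writing the a reorganized data items that were read into the storage medium corresponding to the target address row by row or column by column, wherein the a reorganized data items that were read have the same attribute.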
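  7. A data reorganization apparatus, characterized in that it comprises:
    an obtaining module, configured to obtain the source address and target address of the data to be reorganized and the target quantity of the data to be reorganized;
    a reading module, configured to read, in the original data format, the target quantity of data to be reorganized from the storage medium corresponding to the source address and store it in the DMA to obtain reorganized data, wherein the original data format is the format in which the data to be reorganized is stored in the storage medium corresponding to the source address;
    a writing module, configured to write the reorganized data in the DMA into the storage medium corresponding to the target address according to a preset data format.
  8. The apparatus according to claim 7, characterized in that the reading module is specifically configured to perform:
    Step A: reading a data items to be reorganized from the storage medium corresponding to the source address according to the original data format, and storing the a read data items in the buffer unit of the DMA, wherein a is a positive integer greater than or equal to 2 and less than or equal to the target quantity;
    Step B: repeating Step A until the quantity of data to be reorganized that has been read equals the target quantity.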
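  9. The apparatus according to claim 8, characterized in that, if the buffer unit of the DMA is a D flip-flop array,
    the reading module is specifically configured to:
    store the a read data items to be reorganized row by row in the D flip-flop array, wherein the first of the a items stored in row p of the D flip-flop array has storage address (p,q), and the first of the a items stored in row p+1 has storage address (p+1,q), p and q being positive integers;
    and the writing module is specifically configured to:
    read the reorganized data stored in the D flip-flop array column by column;
    write the reorganized data read column by column into the storage medium corresponding to the target address row by row.
  10. The apparatus according to claim 8, characterized in that, if the buffer unit of the DMA is a D flip-flop array,
    the reading module is specifically configured to:
    store the a read data items to be reorganized column by column in the D flip-flop array, wherein the first of the a items stored in column k of the D flip-flop array has storage address (g,k), and the first of the a items stored in column k+1 has storage address (g,k+1), g and k being positive integers;
    and the writing module is specifically configured to:
    read the reorganized data stored in the D flip-flop array row by row;
    write the reorganized data read row by row into the storage medium corresponding to the target address column by column.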
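  11. The apparatus according to claim 8, characterized in that, if the buffer unit of the DMA is a static random access memory (SRAM),
    the reading module is specifically configured to:
    store the a read data items to be reorganized row by row in the SRAM, wherein the first of the a items stored in row m of the SRAM has storage address (m,n), and the first of the a items stored in row m+1 has storage address (m+b,n+b), m, n and b all being positive integers;
    and the writing module is specifically configured to:
    read one reorganized data item from the SRAM every b columns, and write the a reorganized data items that were read into the storage medium corresponding to the target address row by row or column by column, wherein the reorganized data items read from the individual columns all have different row addresses in the SRAM, and the a reorganized data items that were read have the same attribute.
  12. The apparatus according to claim 8, characterized in that, if the buffer unit of the DMA is a static random access memory (SRAM),
    the reading module is specifically configured to:
    store the a read data items to be reorganized row by row in the SRAM, wherein the first of the a items from the i-th read has storage address (i,j), and the first of the a items from the (i+1)-th read has storage address (i,j+a);
    and the writing module is specifically configured to:
    read one reorganized data item every a columns from the same row of the SRAM, and write the a reorganized data items that were read into the storage medium corresponding to the target address row by row or column by column, wherein the a reorganized data items that were read have the same attribute.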
PCT/CN2017/087393 2017-06-07 2017-06-07 Data reorganization method and apparatus WO2018223302A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/087393 WO2018223302A1 (zh) 2017-06-07 2017-06-07 Data reorganization method and apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/087393 WO2018223302A1 (zh) 2017-06-07 2017-06-07 Data reorganization method and apparatus

Publications (1)

Publication Number Publication Date
WO2018223302A1 (zh) 2018-12-13

Family

ID=64565686

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/087393 WO2018223302A1 (zh) 2017-06-07 2017-06-07 Data reorganization method and apparatus

Country Status (1)

Country Link
WO (1) WO2018223302A1 (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
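CN101135980A (zh) * 2006-08-29 2008-03-05 飞塔信息科技(北京)有限公司 Apparatus and method for implementing zero copy based on the Linux operating system
CN101986287A (zh) * 2010-11-25 2011-03-16 中国人民解放军国防科学技术大学 Reorganization buffer for vector data streams
CN103207847A (zh) * 2013-04-27 2013-07-17 杭州士兰微电子股份有限公司 DMA controller and direct memory access control method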
US20130254596A1 (en) * 2012-03-22 2013-09-26 Intel Mobile Communications GmbH System and Method for Processing Trace Information

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17912739

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17912739

Country of ref document: EP

Kind code of ref document: A1