WO2021207919A1 - Controller, storage device access system, electronic device, and data transmission method - Google Patents
- Publication number
- WO2021207919A1 (application PCT/CN2020/084635)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- module
- data
- reordering
- read command
- cache
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/0223—User address space allocation, e.g. contiguous or non contiguous base addressing
- G06F12/023—Free address space management
- G06F12/0238—Memory management in non-volatile memory, e.g. resistive RAM or ferroelectric memory
Definitions
- This application relates to the field of computer technology, and in particular to a controller, a storage device access system, electronic equipment, and a data transmission method.
- the order of the data returned by the storage device to the controller often cannot meet the requirements of the processor.
- a reordering buffer can be provided in the controller, and the reordering buffer can reorder the data obtained from the storage device so that the return order of the data meets the requirements of the processor.
- the disadvantage of the above-mentioned technology is that the reordering buffer in the controller occupies a large area, which results in a large size and high cost of the controller.
- the embodiments of the present application provide a controller, a storage device access system, an electronic device, and a data transmission method, so as to solve the technical problem that the controller of the storage device is large in size and high in cost.
- an embodiment of the present application provides a controller for communicating with a storage device, including: a scheduling module, at least one reordering buffer, and multiple ports;
- the scheduling module is used to obtain data from the storage device
- the reordering buffer is used to reorder the data acquired by the scheduling module and then transmit it to the corresponding port;
- the port is used to send the data obtained from the reordering buffer to the corresponding processing device
- At least two ports of the multiple ports multiplex a reordering buffer.
- an embodiment of the present application provides a storage device access system, including the controller, storage device, and multiple processing devices described in the first aspect.
- an embodiment of the present application provides an electronic device including the storage device access system described in the second aspect.
- an embodiment of the present application provides a data transmission method applied to a controller that is used to communicate with a storage device, the controller including a scheduling module, at least one reordering buffer, and multiple ports. The method includes:
- the scheduling module obtains data from the storage device
- the reordering buffer reorders the data acquired by the scheduling module and transmits it to the corresponding port;
- the port sends the data obtained from the reordering buffer to the corresponding processing device
- At least two ports of the multiple ports multiplex a reordering buffer.
- embodiments of the present application provide a computer-readable storage medium, including instructions, which when run on a computer, cause the computer to execute the method described in the fourth aspect.
- At least two of the multiple ports multiplex a reordering buffer, which allows the reordering buffer to be shared and reduces the number of reordering buffers in the controller, thereby reducing the size and cost of the controller.
- FIG. 1 is a schematic diagram of an application scenario of a DDR controller provided by an embodiment of the application
- FIG. 2 is a schematic diagram of a read channel of a DDR controller provided by an embodiment of the application
- FIG. 3 is a schematic diagram of a write channel of a DDR controller provided by an embodiment of the application.
- FIG. 4 is a schematic diagram of the interaction principle of a DDR controller reading data process provided by an embodiment of the application;
- FIG. 5 is a schematic diagram of the working principle of a data reading process of a DDR controller provided by an embodiment of the application;
- Figure 6 is a schematic structural diagram of a single-channel DDR controller
- Figure 7 is a schematic structural diagram of a dual-channel DDR controller
- FIG. 8 is a schematic structural diagram of a controller for communicating with a storage device according to an embodiment of the application.
- Fig. 9 is a schematic diagram of parameters of various components in the controller shown in Fig. 8.
- FIG. 10 is a schematic diagram of the principle of data transmission by the controller shown in FIG. 8;
- FIG. 11 is a schematic structural diagram of another controller for communicating with a storage device according to an embodiment of the application.
- Fig. 12 is a schematic diagram 1 of the principle of data transmission by the controller shown in Fig. 11;
- FIG. 13 is a second schematic diagram of the data transmission principle of the controller shown in FIG. 11;
- FIG. 14 is a schematic structural diagram of another controller for communicating with a storage device according to an embodiment of the application.
- FIG. 15 is a schematic structural diagram of another controller for communicating with a storage device provided by an embodiment of the application.
- FIG. 16 is a schematic flowchart of a data transmission method provided by an embodiment of this application.
- An embodiment of the present application provides a controller for communicating with a storage device, which may include: a scheduling module, at least one reordering buffer, and multiple ports.
- the scheduling module is used to obtain data from the storage device.
- the reordering buffer is used for reordering the data obtained by the scheduling module and transmitting it to the corresponding port, and the port is used for sending the data obtained from the reordering buffer to the corresponding processing device, wherein at least two of the plurality of ports can multiplex one reordering buffer.
- the port of the controller can be used to connect to the processing device and send data to the processing device connected to it.
- the connection relationship between the ports and the processing devices can be set according to actual needs. For example, multiple ports can be connected to multiple processing devices in a one-to-one correspondence, or at least two of the multiple ports can be connected to the same processing device. That is, the processing devices connected to the ports may be different processing devices or the same processing device, which is not limited in the embodiments of the present application.
- the port sends data to its corresponding processing device, that is, the processing device connected to it.
- the storage device may be any device that can realize the data storage function, including but not limited to at least one of the following: RAM (Random Access Memory), SDRAM (Synchronous Dynamic Random Access Memory), DDR (Double Data Rate SDRAM, double data rate synchronous dynamic random access memory), etc.
- RAM Random Access Memory, random access memory
- SDRAM Synchronous Dynamic Random Access Memory
- DDR Double Data Rate SDRAM, double-rate synchronous dynamic random access memory
- the following description takes as an example the case in which the storage device is a DDR device and the controller is a DDR controller.
- the DDR device can be any type of DDR such as DDR2, DDR3, DDR4, and so on.
- a DDR controller can be set to connect to DDR devices to provide system memory space.
- ASIC Application Specific Integrated Circuit
- FIG. 1 is a schematic diagram of an application scenario of a DDR controller provided by an embodiment of the application.
- multiple access ports can be configured for the DDR controller.
- These access ports all follow certain bus standards, such as AXI (Advanced eXtensible Interface) and AHB (Advanced High-performance Bus) of AMBA (Advanced Microcontroller Bus Architecture), etc.
- AXI Advanced eXtensible Interface
- AHB Advanced High-performance Bus
- AMBA Advanced Microcontroller Bus Architecture
- the DDR controller can be accessed as a SLAVE (slave device), and there are multiple external processing devices as an access master (master device).
- These access masters can come from other IP (Intellectual Property) cores, such as an AP (application processor), DSP (digital signal processor), GPU (graphics processing unit), MEDIA (multimedia) IP core, etc.
- IP Intellectual Property
- AP application processor
- DSP digital signal processor
- GPU graphics processing unit
- MEDIA multimedia
- the system bottleneck lies in the access efficiency of DDR devices. Therefore, to maximize the access efficiency of DDR devices, the system is designed so that when a MASTER sends a read operation to a port of the DDR controller, the MASTER has already set aside storage space for the read data to be returned, so the output of DDR data from the port is not restricted by insufficient space.
- 8 accessing MASTERs are taken as an example to illustrate the application scenario of the embodiment of the present application.
- the buses used by the accessing MASTERs can be, for example, AMBA AXI buses.
- the interface of a DDR device is not an AXI bus standard interface; it follows another standard.
- the role of the DDR controller is to convert AXI bus access commands and timing into DDR device access commands and timing, thereby completing the data exchange between the DDR device and the accessing MASTER.
- AMBA's AXI bus has the following characteristics: read and write, and request and response, all use independent one-way transmission channels, which improves the parallelism of transmission and increases throughput; there are five channels for reading and writing data, namely the read address (read command) channel, read data channel, write address (write command) channel, write data channel, and write response channel; and a Register Slice (register slice) is easy to add to each independent channel, which is conducive to timing closure.
- FIG. 2 is a schematic diagram of a read channel of a DDR controller provided by an embodiment of the application.
- the read channel may specifically include a read address channel (Read address channel) and a read data channel (Read data channel).
- the master interface can transmit address and control signals to the slave interface through the read address channel, and the slave interface can return read data to the master interface through the read data channel, so as to complete the data read operation.
- FIG. 3 is a schematic diagram of a write channel of a DDR controller provided by an embodiment of the application.
- the write channel may include a write address channel (Write address channel), a write data channel (Write data channel), and a write response channel (Write response channel).
- the master interface can send address and control signals to the slave interface through the write address channel and then send write data to the slave interface through the write data channel, and the slave interface can return a write response (Write response) signal through the write response channel to complete the data writing operation.
- the AXI bus standard improves the utilization of bus bandwidth, but it also imposes the following ordering rules: there is no requirement on the response order of AXI transmissions between different Masters, and there is no requirement on the response order between transmissions with different IDs (identifications) issued by the same Master, but the response order of transmissions with the same ID must be consistent with the command order.
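The same-ID ordering rule can be illustrated with a minimal Python sketch. The function name `respects_axi_read_order` and the `(axi_id, tag)` tuples are hypothetical and used only for illustration; they are not part of the application or of the AXI specification.

```python
from collections import defaultdict, deque

def respects_axi_read_order(commands, responses):
    """Return True if responses with the same ID arrive in command order.

    commands / responses: lists of (axi_id, tag) tuples, where `tag` is a
    hypothetical label identifying one read command.
    """
    pending = defaultdict(deque)
    for axi_id, tag in commands:
        pending[axi_id].append(tag)
    for axi_id, tag in responses:
        # The oldest outstanding command of this ID must be answered first.
        if not pending[axi_id] or pending[axi_id][0] != tag:
            return False
        pending[axi_id].popleft()
    return True

commands = [(0, "a"), (1, "b"), (0, "c")]
# Different IDs may overtake each other ...
ok = respects_axi_read_order(commands, [(1, "b"), (0, "a"), (0, "c")])
# ... but two commands with the same ID may not.
bad = respects_axi_read_order(commands, [(0, "c"), (0, "a"), (1, "b")])
```

Only the relative order within one ID is checked, which is the constraint that later makes a reordering buffer necessary.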
- FIG. 4 is a schematic diagram of the interaction principle of a data read process of a DDR controller provided by an embodiment of the application.
- Figure 4 describes the sequence relationship between read commands with the same ID and read data from the same MASTER.
- the read data corresponding to the first read command is returned first, and the return order of read data remains consistent with the order in which the read command is issued.
- the MASTER sends a read command ARADDR to the DDR controller.
- FIG. 5 is a schematic diagram of the working principle of a data read process of a DDR controller provided by an embodiment of the application.
- After receiving the AXI bus read command ARADDR, the DDR controller divides each ARADDR into instruction fragments, and adds a token to each instruction fragment to record the attribute information of the instruction fragment.
- the attribute information may include at least one of the following: the ID, the serial number under that ID of the read command to which the instruction fragment belongs, the number of fragments of the read command to which the instruction fragment belongs, the order of each instruction fragment, and any other information used to implement subsequent data reorganization.
- the token of the instruction fragment may include the attribute information; or, the token may include any information that can determine the attribute information.
- the token may be a code, and the corresponding ID, number of fragments, etc. can be obtained by looking up the code in a table.
- the DDR controller can readjust the execution sequence of these instruction fragments according to the current state of the DDR device, so as to achieve the best execution efficiency of the DDR device and improve the access efficiency of the DDR device.
- the order of the tokens corresponding to each instruction fragment can be adjusted along with the adjustment of the order of the instruction fragments to ensure that the order of the instruction fragments corresponds to the order of the tokens one-to-one.
- the DDR device returns the read data in the adjusted order after receiving the instruction fragments in the adjusted order.
- the DDR controller adds corresponding tokens to the read data returned by the DDR device according to the previous token sequence. At this time, the order of reading data is different from the order in which the ARADDR command is issued, so it does not meet the AXI bus's data return order requirement for the same ID.
- the DDR controller can use a Read Reorder Buffer (RRB) to reorganize the return order of the read data according to the tokens carried by the read data, generating a correct RDATA return order that conforms to the AXI bus requirement on the return order of read data.
- the DDR controller splits each read command into multiple instruction fragments in turn. For example, ARADDR1 is split into 4 instruction fragments, ARADDR2 into 2, ARADDR3 into 5, ARADDR4 into 3, and ARADDR5 into 4.
- the number in each instruction fragment can be used to indicate which read command the instruction fragment comes from.
- the reordering buffer in the DDR controller is required to reorder the data. Specifically, because the instruction fragments correspond one-to-one to the returned data, and the instruction fragments also correspond one-to-one to the tokens, the data and the tokens are also in one-to-one correspondence.
- the data can be reorganized through the token corresponding to the data to obtain the data RDATA1 to RDATA5 corresponding to ARADDR1 to ARADDR5, and return to the corresponding MASTER.
- the order of tokens generated during the command splitting phase determines the order of data reorganization in the reordering buffer.
- the token can be used as an addressing pointer, so that the relevant attribute information of the initial ARADDR instruction can be obtained by addressing, and thus the amount of data to be returned as RDATA can be obtained.
- the reordering buffer can thereby know the number of pieces of data corresponding to a read command. Once all the data corresponding to a read command has been collected, that data can be sent to the corresponding port.
- the token generation sequence of the same ID can be maintained by using a FIFO (First Input First Output) memory. The tokens generated first are used first and the corresponding read data is used to generate RDATA.
- FIFO First Input First Output
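The split, out-of-order execution, and token-based reassembly described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the token is modelled as a dictionary with the hypothetical fields `cmd`, `index`, and `count`, all commands are assumed to share one ID, and the shuffle merely stands in for whatever execution order the scheduler chooses.

```python
import random
from collections import deque

def split_into_fragments(commands):
    """commands: list of (name, fragment_count). Returns one token per fragment."""
    tokens = []
    for name, count in commands:
        for index in range(count):
            tokens.append({"cmd": name, "index": index, "count": count})
    return tokens

def reorder(returned, issue_order):
    """Collect out-of-order (token, data) pairs; emit complete commands in issue order."""
    collected = {name: {} for name in issue_order}
    expected = {}
    fifo = deque(issue_order)  # token generation order for one ID
    output = []
    for token, data in returned:
        collected[token["cmd"]][token["index"]] = data
        expected[token["cmd"]] = token["count"]
        # Release from the head of the FIFO only once all of its fragments are present.
        while fifo and len(collected[fifo[0]]) == expected.get(fifo[0], -1):
            name = fifo.popleft()
            parts = collected.pop(name)
            output.append((name, [parts[i] for i in range(len(parts))]))
    return output

# Fragment counts taken from the ARADDR1..ARADDR5 example in the text.
commands = [("ARADDR1", 4), ("ARADDR2", 2), ("ARADDR3", 5), ("ARADDR4", 3), ("ARADDR5", 4)]
tokens = split_into_fragments(commands)
executed = tokens[:]
random.Random(0).shuffle(executed)  # stand-in for the scheduler's adjusted order
returned = [(t, f'{t["cmd"]}-{t["index"]}') for t in executed]
rdata = reorder(returned, [name for name, _ in commands])
```

Whatever order the fragments come back in, `rdata` lists the commands in their original issue order, each with its fragments in position order.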
- Fig. 6 is a structural diagram of a single-channel DDR controller.
- the DDR controller includes a plurality of ports 601, and each port 601 is used to connect to a MASTER.
- the port 601 can be implemented as a data storage module such as a FIFO memory.
- Each port 601 is configured with a reordering buffer 602.
- a scheduling module 603 may also be provided in the DDR controller.
- the scheduling module 603 may include an arbitration module 6031, an instruction cache 6032, an instruction execution module 6033, and the like.
- the arbitration module 6031 is used to arbitrate the read command sent by the MASTER, and send the read command after the arbitration to the instruction buffer 6032.
- the instruction buffer 6032 can be used to divide and sort the read commands.
- the execution module 6033 may be used to execute the divided instruction fragments.
- the size of the reordering buffer 602 in the DDR controller is generally affected by the following parameters: the data width (bit width) of the DDR device 604, which can be increased by connecting multiple DDR devices 604 in parallel; here it refers to the total external data width after parallel connection, recorded as D, usually 16, 32 or 64
- the BURST Length (burst length) of the DDR device 604 indicates the length of one data transfer, recorded as BL
- for general DDR devices such as DDR3, LPDDR3 and DDR4, BL is 8, while the BL of LPDDR4 is usually 16 or 32
- the depth M of the instruction cache 6032 is generally 32 or 64.
- the size of one reordering buffer 602 is generally set to M*BL*D bits. Because the bit width of the DDR device 604 is D and one transfer lasts BL cycles, the data returned by the DDR device 604 at one time is D*BL bits, and because the depth of the instruction cache 6032 is M, at most M instruction fragments can be stored. Therefore, a reordering buffer 602 of M*BL*D bits can meet the needs of data reorganization.
- the total size of the reordering buffers 602 required in single-channel mode is N*M*BL*D bits, where N is the number of ports 601.
- bit width W of the reordering buffer 602 is consistent with the bus data bit width of the DDR controller port 601, and is also consistent with the output bit width of the arbitration module 6031.
- FIG. 7 is a schematic diagram of the structure of a dual-channel DDR controller.
- the size of the rearrangement buffer required in dual-channel mode is twice that of single-channel.
- a dual-channel DDR controller may include multiple ports 701, and each port 701 corresponds to two reordering buffers 702.
- the scheduling module 703, the arbitration module 7031, the instruction buffer 7032, and the instruction execution module 7033 are each duplicated as well, so as to obtain data from the DDR device 704 through two channels.
- the specific implementation principle of the dual-channel DDR controller is similar to that of the single-channel DDR controller, except that the number of channels becomes two.
- the total reordering buffer required in dual-channel mode is 2*N*M*BL*D bits.
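The sizing formulas can be checked with a short calculation. The values below (8 ports, M = 32, BL = 16, D = 32) are illustrative choices within the ranges mentioned in the text, not figures stated for any particular embodiment, and the function names are hypothetical.

```python
def per_port_rrb_bits(m, bl, d):
    """Size of one reordering buffer: M * BL * D bits."""
    return m * bl * d

def total_rrb_bits(n_ports, m, bl, d, channels=1):
    """Total size when every port has its own buffer per channel."""
    return channels * n_ports * per_port_rrb_bits(m, bl, d)

per_port = per_port_rrb_bits(32, 16, 32)           # 16384 bits
single = total_rrb_bits(8, 32, 16, 32)             # 131072 bits
dual = total_rrb_bits(8, 32, 16, 32, channels=2)   # 262144 bits
```

With these example values, one buffer is 2 KiB, the single-channel total is 16 KiB, and the dual-channel total is 32 KiB.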
- the advantages of the single-channel and dual-channel DDR controller structures shown in Figure 6 and Figure 7 are simple management, easy implementation, and independent ports. If data blockage occurs on one port, it does not affect the access efficiency of the other ports to the DDR devices.
- the instruction cache is shared by all ports, and the total amount of data returned from the DDR device within a period of time is limited. In most cases, the reordering buffers of the ports are not fully occupied. In the extreme case where the instruction buffer holds only read commands from the same port, the read data returned by the DDR device occupies all of the reordering buffer space of that one port, while the reordering buffers of the other ports are all idle.
- the solutions shown in FIGS. 6 and 7 require a large cache space for the reordering cache, and each port cannot share the reordering cache uniformly.
- an embodiment of the present application provides a scheme in which multiple ports share a reordering buffer.
- a controller for communicating with a storage device, such as a DDR controller.
- the controller includes at least one reordering buffer and a plurality of ports, wherein at least two of the plurality of ports multiplex one reordering buffer.
- multiplexing means that the at least two ports share the same reordering buffer.
- the reordering buffer may determine the port corresponding to the data after acquiring the data transmitted from the storage device, and send the data to the corresponding port in the reordered order, so that multiple ports multiplex one reordering buffer.
- the multiplexed reordering buffer can store the data of each of the at least two ports, unlike the scheme shown in FIG. 6 or FIG. 7, in which each reordering buffer can only store the data of its corresponding port.
- the number of the reordering buffer is one, and all ports in the plurality of ports reuse the reordering buffer.
- the number of the reordering buffers is at least two, some of the ports in the plurality of ports reuse one of the reordering buffers, and other ports use another reordering buffer.
- the controller includes eight ports and two reordering buffers, where seven ports multiplex one reordering buffer and the remaining port uses the other reordering buffer alone, or four ports multiplex one reordering buffer and the other four ports multiplex the other reordering buffer.
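The multiplexing arrangements listed above can be written down as port-to-buffer mappings. The sketch below is illustrative only: the dictionary representation and the helper `shares_a_buffer` are assumptions introduced here, not structures defined in the application.

```python
from collections import Counter

def shares_a_buffer(port_to_buffer):
    """True if at least two ports map to the same reordering buffer."""
    usage = Counter(port_to_buffer.values())
    return any(count >= 2 for count in usage.values())

# Port index -> reordering buffer index, for eight ports.
all_shared = {port: 0 for port in range(8)}                    # one buffer for all ports
seven_plus_one = {port: (0 if port < 7 else 1) for port in range(8)}
four_plus_four = {port: port // 4 for port in range(8)}
per_port = {port: port for port in range(8)}                   # the FIG. 6 style layout
```

The first three mappings satisfy the "at least two ports multiplex a reordering buffer" condition, while the per-port layout of FIG. 6 does not.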
- FIG. 8 is a schematic structural diagram of a controller for communicating with a storage device according to an embodiment of the application.
- the controller may include: a scheduling module 803, at least one reordering buffer 802 and multiple ports 801;
- the scheduling module 803 is configured to obtain data from the storage device 804;
- the reordering buffer 802 is used to reorder the data acquired by the scheduling module 803 and then transmit it to the corresponding port 801;
- the port 801 is used to send data obtained from the reordering buffer to a corresponding processing device
- At least two ports 801 of the plurality of ports 801 multiplex a reordering buffer 802.
- the number of reordering buffer 802 is one, the number of ports 801 is eight, and eight ports 801 are multiplexed with one reordering buffer 802.
- those skilled in the art can adjust the number of reordering buffers 802, the number of ports 801, and the multiplexing arrangement according to actual needs.
- the port 801 can be used to connect with a processing device and send data to the processing device.
- the specific connection relationship between the processing device and the controller can be seen in Fig. 1, and the processing device can be connected to the controller as a master.
- the storage device 804 may be a DDR device, and the controller may be a DDR controller.
- the processing device may include at least one of the following: an application processor, a digital signal processor, a graphics processing unit, and a multimedia processor.
- the types of the multiple processing devices may be the same or different.
- the scheduling module 803 in the controller can be used to obtain data from the storage device 804.
- the scheduling module 803 may include: an instruction cache 8032 for storing and dividing read commands; an instruction execution module 8033 for sending the divided read commands to the storage device 804, so that the storage device 804 executes the divided read command; the arbitration module 8031 is configured to obtain the data returned by the storage device 804 according to the divided read command.
- the instruction cache 8032 may divide the read command into multiple instruction fragments, and the instruction execution module 8033 may send the divided instruction fragments to the storage device 804, so that the storage device 804 executes the instruction fragments and returns the corresponding data.
- the instruction fragments sent to the storage device 804 may be the instruction fragments after the order is shuffled, that is, the order of the read command executed by the storage device 804 and the read command issued by the processing device may not be completely consistent.
- the instruction execution module 8033 can directly send the instruction fragments to the storage device 804, or process the instruction fragments to make them suitable for execution by the storage device 804 and then send the processed instruction fragments to the storage device 804.
- the arbitration module 8031 can obtain the data returned by the storage device 804. In addition, the arbitration module 8031 can also perform an arbitration function. For example, before a read command reaches the instruction buffer 8032, it determines whether to execute the read command issued by the processing device: if the number of read commands issued exceeds the depth of the instruction buffer 8032, the read commands beyond that range are rejected.
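The depth-based rejection can be sketched as a bounded buffer. This is a simplified model under assumptions: the class `InstructionBuffer` and its methods `offer` and `complete` are hypothetical names, and a rejected command is simply reported as not accepted rather than queued or retried by the hardware.

```python
class InstructionBuffer:
    """Bounded store of outstanding read commands (hypothetical model)."""

    def __init__(self, depth):
        self.depth = depth
        self.entries = []

    def offer(self, command):
        """Arbitration step: accept the command only if a slot is free."""
        if len(self.entries) >= self.depth:
            return False
        self.entries.append(command)
        return True

    def complete(self, command):
        """Free the slot once the command has been executed."""
        self.entries.remove(command)

buf = InstructionBuffer(depth=2)
results = [buf.offer(c) for c in ("r0", "r1", "r2")]  # third command exceeds the depth
buf.complete("r0")
retry = buf.offer("r2")                               # accepted after a slot is freed
```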
- the reordering buffer 802 may be used to reorder the data acquired by the scheduling module 803 and then transmit it to the corresponding port 801.
- the reordering buffer 802 may be specifically used to: receive the data sent by the scheduling module 803, determine the port 801 corresponding to the data, and send the data to the corresponding port 801.
- determining which port 801 is the port 801 corresponding to the data can be achieved by a variety of methods.
- the scheduling module 803 can send the port information corresponding to the data to the reordering buffer 802, so that the reordering buffer 802 determines which port 801 to send the data to; or the token corresponding to the data may carry the port information corresponding to the data, so that the reordering buffer 802 determines which port 801 to send the data to.
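The second option, in which the token carries the port information, can be sketched as follows. The class `SharedReorderBuffer` and the token fields `port`, `cmd`, `index`, and `count` are assumptions made for illustration. The sketch shows only routing and per-command reassembly; ordering between different commands of the same ID is deliberately left out.

```python
from collections import defaultdict

class SharedReorderBuffer:
    """One reordering buffer serving several ports (hypothetical model)."""

    def __init__(self):
        self.slots = {}                       # (port, cmd) -> {index: data}
        self.port_queues = defaultdict(list)  # port -> packets delivered to it

    def receive(self, token, data):
        key = (token["port"], token["cmd"])
        parts = self.slots.setdefault(key, {})
        parts[token["index"]] = data
        if len(parts) == token["count"]:
            # Packet complete: deliver it in fragment order and free the storage.
            ordered = [parts[i] for i in range(token["count"])]
            self.port_queues[token["port"]].append(ordered)
            del self.slots[key]

rrb = SharedReorderBuffer()
rrb.receive({"port": 1, "cmd": "A", "index": 1, "count": 2}, "A1")
rrb.receive({"port": 0, "cmd": "B", "index": 0, "count": 1}, "B0")
rrb.receive({"port": 1, "cmd": "A", "index": 0, "count": 2}, "A0")
```

Data for ports 0 and 1 is interleaved in the same storage, and each completed packet is released to the port named in its token.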
- the size of the storage space of the reordering buffer 802 can be determined according to the bit width and burst length of the storage device 804 and the depth of the instruction buffer 8032.
- the storage space size of the reordering buffer 802 may be M*BL*D, where D is the bit width of the storage device 804, BL is its burst length, M is the depth of the instruction buffer 8032, and D, BL, and M are all positive integers.
- the reordering buffer 802 may be connected to the arbitration module 8031 to receive the data returned by the arbitration module 8031 from the storage device 804.
- the input bit width of the reordering buffer 802 may be equal to the output bit width.
- the bit width of the reordering buffer 802 may be greater than the output bit width of the arbitration module 8031, which can increase the data transmission speed of the reordering buffer 802 and increase the utilization rate of the reordering buffer 802.
- the bit width of the reordering buffer 802 may be 2 or 4 times the output bit width of the arbitration module 8031.
- if the output bit width of the arbitration module 8031 is W, the bit width of the reordering buffer 802 can be 2W or 4W, so that it can send 2 times or 4 times as much data in one cycle, ensuring data transmission speed and accuracy.
- the reordering buffer 802 may include a first reordering module 8021 and a second reordering module 8022, and the bit widths of the first reordering module 8021 and the second reordering module 8022 may both be twice the output bit width of the arbitration module 8031.
- the input bit width of each port 801 may be equal to the output bit width of the reordering buffer 802, thereby realizing fast data transmission.
- the port 801 may include a data storage module (DM) 8011 and a cache storage module (BM); the data storage module 8011 may be used to connect to the processing device, and the output bit width of the data storage module 8011 is equal to the input bit width of the processing device; the cache storage module is connected to the reordering buffer 802, the input bit width of the cache storage module is equal to the output bit width of the reordering buffer 802, and the output bit width of the cache storage module is equal to the input bit width of the data storage module 8011.
- both the data storage module 8011 and the cache storage module can be implemented by a FIFO memory to ensure that data can be read quickly and accurately.
- each of the cache storage modules includes a first cache storage module (first BM) 8012 and a second cache storage module (second BM) 8013.
- the trapezoid between the reordering cache 802 and the first cache storage module 8012 and the second cache storage module 8013 in FIG. 8 represents the connection relationship between the reordering cache 802 and the first cache storage module 8012 and the second cache storage module 8013. Any connection that can implement the reordering cache function is allowed.
- the first cache storage module 8012 is connected to the first reordering module 8021, and the second cache storage module 8013 is connected to the second reordering module 8022, so as to ensure the smooth transmission of data from the reordering cache 802 to the port 801.
- the input bit width of the first cache storage module 8012, the input bit width of the second cache storage module 8013, the output bit width of the first reordering module 8021, and the output bit width of the second reordering module 8022 can be equal to further improve the transmission efficiency.
- the depth of the cache storage module is 8 or 16 bits.
- Fig. 9 is a schematic diagram of parameters of various components in the controller shown in Fig. 8. The numbers above each component in Figure 9 indicate the corresponding parameters of the component.
- Figure 9 shows the parameter settings of a port 801.
- Fig. 10 is a schematic diagram of the principle of data transmission by the controller shown in Fig. 8.
- the one-time transmission length ARLEN of the processing device is 16
- the maximum data corresponding to this read command is 16*128bit
- the data that can be returned by executing an instruction fragment is BL*D, which is 16*32bit.
- all data corresponding to a read command is recorded as a data packet, and the data packet includes the multiple pieces of data corresponding to the multiple instruction fragments into which the read command is split; that is, the data packet corresponding to a read command is 16*128bit.
- After the storage device 804 returns 16*32bit data, the data bit width splicing is completed in the arbitration module 8031, and 4*128bit data is output from the arbitration module 8031. A data packet of 16*128bit therefore needs to be returned in 4 parts of 4*128bit each.
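The width conversion in this example is simple arithmetic, which the following sketch reproduces. The helper name `beats` is hypothetical; the numbers (BL*D = 16*32 bit, a 128-bit arbitration output, ARLEN = 16) are the ones used in the text.

```python
def beats(total_bits, width_bits):
    """Number of transfers of `width_bits` needed to carry `total_bits`."""
    return -(-total_bits // width_bits)  # ceiling division

fragment_bits = 16 * 32   # BL * D returned per instruction fragment
bus_width = 128           # output bit width of the arbitration module
packet_bits = 16 * 128    # ARLEN * bus width for one read command

beats_per_fragment = beats(fragment_bits, bus_width)      # 128-bit beats per fragment
fragments_per_packet = beats(packet_bits, fragment_bits)  # fragments per data packet
```

Each returned fragment becomes four 128-bit beats, and four such fragments make up one data packet.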
- the read command is divided into K instruction fragments, where K is a positive integer
- corresponding K pieces of data can be obtained through the K instruction fragments.
- the odd-numbered data can be stored in the first reordering module 8021
- the even-numbered data can be stored in the second reordering module 8022.
- the order of the four data returns may not be returned in the order of 1-1, 1-2, 1-3, 1-4, because the arbitration module 8031 synchronously returns the ports corresponding to the data And token, so the reordering cache 802 can query and store the position sequence information of the returned data this time in the original data packet.
- the read operation can be started. Specifically, the data numbered 1-1 and 1-2 can be read from the first reordering module 8021 and the second reordering module 8022 at the same time, then the data numbered 1-3 and 1-4 are read out in the same way, and the data are written into the first cache storage module 8012 and the second cache storage module 8013 in the corresponding port 801.
- the first cache storage module 8012 can read data from the first reordering module 8021, and the second cache storage module 8013 can read data from the second reordering module 8022.
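The odd/even placement and the paired read-out described above can be sketched as follows. This is a simplified model under stated assumptions: the two reordering modules are modeled as two Python lists, the position of each data item within its packet is assumed to be already known from the token, and all names are hypothetical.

```python
def store(slots, position, data):
    """Place one returned data item by its 1-based position in the packet.

    Odd positions go to the first reordering module (slots[0]),
    even positions go to the second reordering module (slots[1]).
    """
    module = 0 if position % 2 == 1 else 1
    row = (position - 1) // 2
    slots[module][row] = data


def read_out(slots):
    """Read one row from both modules per step: (1-1, 1-2), then (1-3, 1-4)."""
    return [(slots[0][row], slots[1][row]) for row in range(len(slots[0]))]


# Two modules with two rows each, enough for a packet of K = 4 data.
slots = [[None, None], [None, None]]

# Data may come back out of order; here 1-3 and 1-4 arrive first.
for position, data in [(3, "1-3"), (4, "1-4"), (1, "1-1"), (2, "1-2")]:
    store(slots, position, data)

pairs = read_out(slots)
print(pairs)
```

Each tuple in `pairs` corresponds to one simultaneous read from the two modules, so a packet of four data is read out in two steps regardless of the arrival order.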
- the storage resource portion of the data numbered 1-3, 1-4, 1-1, 1-2 can be released to other ports 801 or data corresponding to other tokens.
- when the arbitration module 8031 returns new read data, if a new data packet happens to be formed at this time, then in the most demanding case all of the previous data packet must be read out while the arbitration module 8031 is writing.
- in the most demanding scenario, all the data in the previous data packet must be read within 4 controller clock cycles. Because the bit widths of the first reordering module 8021 and the second reordering module 8022 are both 256 bits, and the total number of data in the data packet is 4, the read operation of the previous data packet can be completed within 4 controller clock cycles without affecting subsequent data storage operations, which meets the data storage requirements of the most demanding scenario.
- the reordering buffer 802 sends 4 pieces of 256*2 data to the buffer storage module.
- the buffer storage module splits the data into 4 pieces of 128*4 data, which are transmitted to the data storage module 8011 in 4 cycles, and the data storage module 8011 sends them to the processing device, thereby completing the data reading operation of the storage device 804.
- multiple ports 801 share the reordering buffer 802, and a small-capacity buffer FIFO memory is added to each port 801.
- a public-to-local method is adopted: the data stored in the common reordering buffer 802 is transferred to the local port 801 in time, so that a complete data packet of a port 801 will not occupy the reordering buffer 802 for a long time, and the resources of the reordering buffer 802 are released in time, which significantly reduces the number of reordering buffers 802.
- the size of the storage space of the reordering buffer 802 can be determined according to the parameters of other components. As mentioned above, assume that the output data bit width of the arbitration module 8031 is W, the depth of the instruction buffer 8032 is M, the burst length of the storage device 804 is BL, the combined bit width of the storage device 804 is D, and the number of ports 801 of the controller is N; then the size of each reordering buffer 802 is M*BL*D.
- the total storage space size required by the reordering cache 802 is N*M*BL*D.
- the storage space required by the reordering cache 802 is M*BL*D, and each port 801 adds cache storage module space of 2*16*2*W, so the total storage space size is N*2*16*2*W+M*BL*D. Therefore, compared with the solution shown in FIG. 6, the solution shown in FIG. 8 can save (N-1)*M*BL*D-N*2*16*2*W of storage resources.
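The saving can be evaluated numerically with a short sketch. The function name is hypothetical; BL, D and W follow the earlier example, while N = 8 and M = 64 are assumed values chosen only for illustration and are not taken from Table 1.

```python
def single_channel_saving(n_ports: int, m: int, bl: int, d: int, w: int) -> int:
    """Bits saved by sharing one reordering buffer (Fig. 8) versus one per port (Fig. 6)."""
    per_port_total = n_ports * m * bl * d                  # N*M*BL*D
    shared_total = n_ports * 2 * 16 * 2 * w + m * bl * d   # N*2*16*2*W + M*BL*D
    return per_port_total - shared_total


# BL = 16, D = 32, W = 128 follow the example in the text.
# N = 8 and M = 64 are assumptions made for this sketch.
saving = single_channel_saving(n_ports=8, m=64, bl=16, d=32, w=128)
print(saving)
```

With these assumed values the saving is a six-digit number of bits; the saving grows with N and M because the shared buffer is paid for once while the per-port buffers scale with the number of ports.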
- Table 1 shows, for given values of the output data bit width W of the arbitration module 8031, the depth M of the instruction buffer 8032, the burst length BL of the storage device 804, the combined bit width D of the storage device 804, and the number N of ports 801 of the controller, the number of bits saved by the scheme shown in Fig. 8 compared with the scheme shown in Fig. 6.
- Table 1 The amount of resources saved in single channel mode
- the solution shown in Fig. 8 can save a six-digit number of bits compared with the solution shown in Fig. 6, and the resources of the reordering buffer 802 can be shared by multiple ports 801.
- the logic storage resources are greatly reduced, and the cost of the controller is reduced.
- the controller for communicating with a storage device includes a scheduling module 803, at least one reordering buffer 802, and multiple ports 801.
- the scheduling module 803 is used to obtain data from the storage device 804, the reordering buffer 802 is used to reorder the data obtained by the scheduling module 803 and then transmit it to the corresponding port 801, and the port 801 is used to send the data obtained from the reordering buffer 802 to the corresponding processing device.
- at least two ports 801 of the multiple ports 801 multiplex a reordering buffer 802, which realizes the sharing of the reordering buffer 802, reduces the number of reordering buffers 802 in the controller, and reduces the size and cost of the controller.
- the reordering buffer includes two reordering modules. In other optional embodiments, the reordering buffer may include any number of reordering modules, as long as the timing requirement is met, namely that when the last data segment of the next data packet is stored in the reordering buffer, the previous data packet has already been read out to the port.
- each data packet contains 4 pieces of data
- a multiple of the bit width of the reordering module with respect to the bit width of the arbitration module, multiplied by the number of reordering modules, may be equal to the number of instruction fragments included in the read command.
- the number of instruction fragments contained in the read command is the number of data contained in a data packet
- the bit width of the reordering module divided by the bit width of the arbitration module, multiplied by the number of reordering modules, can be equal to the number of data contained in a data packet.
- the output bit width of the arbitration module is W
- the bit width of each reordering module is equal to 4W
- one reordering module is required to meet the demand.
- the bit width of each reordering module is equal to W, then four reordering modules are needed to meet the requirements.
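The relationship between module bit width and module count can be written as a small helper. This is a sketch under the assumption that a data packet contains K = 4 data, as in the example above; the function name is hypothetical.

```python
def reordering_modules_needed(k: int, module_bits: int, arb_bits: int) -> int:
    """Smallest module count such that (module_bits / arb_bits) * count >= k."""
    if module_bits % arb_bits:
        raise ValueError("module width must be a multiple of the arbitration width")
    ratio = module_bits // arb_bits
    return -(-k // ratio)  # ceiling division


W = 128  # output bit width of the arbitration module, as in the example
K = 4    # number of data in one data packet

counts = {width: reordering_modules_needed(K, width, W) for width in (4 * W, 2 * W, W)}
print(counts)
```

A module width of 4W needs one module, 2W needs two (the arrangement used in Fig. 8), and W needs four.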
- the reordering buffer starts the data read operation only after collecting a data packet.
- the start time of reading data from the reordering buffer does not necessarily have to wait until all the read data of the corresponding read command is returned; as long as it is confirmed according to the token that a certain data in the reordering buffer is the earliest-ordered data in the data packet that has not yet been returned to the port, that data can be output to the corresponding port.
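A minimal sketch of this early-release behavior is shown below. It assumes the token lookup has already yielded each data item's 1-based position in its packet; the dictionary stands in for the reordering buffer, and the names are hypothetical.

```python
def release_in_order(arrivals):
    """Release each data item as soon as it is the next one owed to the port.

    arrivals: iterable of (position, data) in arbitrary arrival order.
    Returns the released sequence and the peak number of items held at once.
    """
    pending = {}
    next_position = 1
    released = []
    peak_held = 0
    for position, data in arrivals:
        pending[position] = data
        peak_held = max(peak_held, len(pending))
        # Drain every item that is now contiguous with what was already sent.
        while next_position in pending:
            released.append(pending.pop(next_position))
            next_position += 1
    return released, peak_held


released, peak_held = release_in_order(
    [(2, "1-2"), (1, "1-1"), (4, "1-4"), (3, "1-3")]
)
print(released, peak_held)
```

In this arrival order at most two items are held at once, instead of the four that would be held if read-out waited for the whole packet, which illustrates why early release can reduce the storage needed.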
- the start interval of the read data operation in the reordering cache can follow the interval at which the storage device returns burst read data, thereby further reducing storage resources.
- the controller may also be a multi-channel controller.
- the number of reordering buffers may be multiple.
- the number of reordering buffers may be the number of channels of the controller. It should be noted that the number of channels of the controller refers to the number of parallel channels of the scheduling module of the controller, and is not necessarily equal to the number of ports of the controller.
- each reordering buffer can be multiplexed by at least two ports, thereby effectively saving the storage resources of the controller.
- the following uses a dual-channel controller as an example to describe the use of the reordering buffer in the multi-channel case.
- FIG. 11 is a schematic structural diagram of another controller for communicating with a storage device provided by an embodiment of the application.
- the controller may include: a scheduling module 113, two reordering buffers 112, and multiple ports 111;
- the scheduling module 113 is configured to obtain data from the storage device 114;
- the reordering buffer 112 is used to reorder the data acquired by the scheduling module 113 and then transmit it to the corresponding port 111;
- the port 111 is used to send the data obtained from the reordering buffer 112 to the corresponding processing device;
- each reordering buffer 112 is multiplexed by the multiple ports 111.
- the reordering buffer 112 is specifically used to: receive the data sent by the scheduling module 113, determine the port 111 corresponding to the data, and send the data to the corresponding port 111 in the reordered order, so as to realize the multiplexing of the reordering buffer 112 by multiple ports 111.
- the scheduling module 113 may include:
- two instruction caches 1132, both of which are used to store and divide read commands;
- the two instruction execution modules 1133 are both configured to send the divided read command to the storage device 114, so that the storage device 114 executes the divided read command;
- the two arbitration modules 1131 are both used to obtain the data returned by the storage device 114 according to the divided read command.
- the two instruction caches 1132, the two instruction execution modules 1133, and the two arbitration modules 1131 are used to implement dual-channel data transmission.
- the specific functions and implementation principles of each instruction cache 1132, instruction execution module 1133, and arbitration module 1131 can be referred to the implementation manners shown in FIG. 8 to FIG. 10, and will not be repeated here.
- Similar to the case of a single channel, assuming that the bit width of the storage device 114 is D, the burst length of the storage device 114 is BL, and the depth of each instruction cache 1132 is M, then the storage space size of each reordering cache 112 is M*BL*D, and the total storage space size of the two reordering buffers 112 is 2*M*BL*D, where D, BL, and M are all positive integers.
- each reordering buffer 112 may include a first reordering module 1121 and a second reordering module 1122, and the bit widths of the first reordering module 1121 and the second reordering module 1122 are equal, each being twice the output bit width of an arbitration module 1131.
- the input bit width of each port 111 may be equal to the combined bit width of the at least one reordering buffer 112.
- the input bit width of each port 111 may be equal to twice the output bit width of one reordering buffer 112.
- the port 111 includes a data storage module (DM) 1111 and a cache storage module (BM); the data storage module 1111 is connected to the processing device, and the output bit width of the data storage module 1111 is equal to the input bit width of the processing device.
- the cache storage module is connected to the reordering cache 112, the input bit width of the cache storage module is equal to the combined bit width of the two reordering caches 112, and the output bit width of the cache storage module is equal to the data The input bit width of the storage module 1111.
- each of the cache storage modules includes a first cache storage module (first BM) 1112, a second cache storage module (second BM) 1113, a third cache storage module (third BM) 1114, and a fourth cache storage module (fourth BM) 1115, which can realize the simultaneous acquisition of four data and improve the efficiency of data transmission.
- each reordering buffer 112 includes a first reordering module 1121 and a second reordering module 1122, so there are four reordering modules in the controller; each port 111 includes four cache storage modules, and the four reordering modules are connected to the four cache storage modules.
- the trapezoid in FIG. 11 can be used to indicate the connection relationship between the two reordering buffers 112 and the four buffer storage modules of each port, and any connection that can realize the data transmission function is allowed.
- the specific connection relationship between the four cache storage modules of each port and the four reordering modules can be set according to actual needs, which is not limited in this embodiment.
- the depth of the cache storage module may be 8 or 16 bits
- the input bit width of the first cache storage module 1112, the input bit width of the second cache storage module 1113, the input bit width of the third cache storage module 1114, the input bit width of the fourth cache storage module 1115, the output bit width of the first reordering module 1121, and the output bit width of the second reordering module 1122 are all equal, ensuring that data is transmitted quickly and accurately.
- the process of data transmission is similar to that of single-channel. Specifically, if the two arbitration modules 1131 do not return the last data at the same time, the situation is similar to a single channel: the first reordering module 1121 and the second reordering module 1122 return data to the port 111 according to their respective procedures, which will not be described in detail here.
- the reordering buffer 112 can implement the data packet output process according to the specific storage mode of the data.
- This embodiment provides the following two data storage methods: one is that multiple data in one data packet are stored in the same reordering buffer 112, and the other is that multiple data in one data packet are distributed across the two reordering buffers 112. They are described separately below.
- Fig. 12 is a schematic diagram 1 of the principle of data transmission by the controller shown in Fig. 11.
- the data belonging to the same read command output from the arbitration module 1131 are alternately stored, in the reordered order, in the first reordering module 1121 and the second reordering module 1122 of one reordering buffer 112.
- the read command is divided into K instruction fragments, where K is a positive integer
- corresponding K pieces of data can be obtained through the K instruction fragments.
- the odd-numbered data can be stored in the first reordering module 1121 in the first reordering buffer 112
- the even-numbered data can be stored in the second reordering module 1122 of that reordering buffer 112.
- After any one of the two reordering buffers 112 collects the K data corresponding to the read command, that reordering buffer 112 sends the K data to the corresponding port 111.
- if the first read command and the second read command correspond to the same port 111, it is then determined whether the first read command and the second read command belong to the same ID; if they belong to the same ID, the corresponding data is returned according to the sequence of the first read command and the second read command.
- according to the sequence of the first read command and the second read command, the reordering buffer 112 may send the data corresponding to the earlier read command to the first cache storage module 1112 and the second cache storage module 1113, and simultaneously send the data corresponding to the other read command to the third cache storage module 1114 and the fourth cache storage module 1115.
- if the data are numbered 1-1, 1-2, 1-3, 1-4, then data 1-1 and 1-3, originally in positions 1 and 3, can be stored in the first reordering module 1121 in the first reordering buffer 112, and data 1-2 and 1-4, originally in positions 2 and 4, can be stored in the second reordering module 1122.
- if the data are numbered 2-1, 2-2, 2-3, 2-4, then data 2-1 and 2-3, originally in positions 1 and 3, can be stored in the first reordering module 1121 in the second reordering buffer 112, and data 2-2 and 2-4, originally in positions 2 and 4, can be stored in the second reordering module 1122.
- the two reordering buffers 112 can each complete a data packet at the same time, and the two data packets can belong to the same port 111 or to different ports 111.
- the data packets stored in the first cache storage module 1112 and the second cache storage module 1113 may have priority over the data packets stored in the third cache storage module 1114 and the fourth cache storage module 1115. Assuming that data packet 1 has priority over data packet 2, data packet 1 is sent to the first cache storage module 1112 and the second cache storage module 1113, and data packet 2 is sent to the third cache storage module 1114 and the fourth cache storage module 1115. In this way, when the port 111 returns data to the processing device, data packet 1 can be returned to the processing device before data packet 2, ensuring the data transmission sequence.
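The priority routing of two packets into the four cache storage modules can be illustrated with a small model. The sketch assumes that odd-positioned data go to the first module of a pair and even-positioned data to the second; the text leaves the exact connection between reordering modules and cache storage modules open, so this mapping and all names are assumptions.

```python
def route_packets(earlier_packet, later_packet):
    """Send the earlier packet to BM1/BM2 and the later packet to BM3/BM4."""
    return {
        "BM1": earlier_packet[0::2],  # positions 1, 3 of the earlier packet
        "BM2": earlier_packet[1::2],  # positions 2, 4 of the earlier packet
        "BM3": later_packet[0::2],    # positions 1, 3 of the later packet
        "BM4": later_packet[1::2],    # positions 2, 4 of the later packet
    }


def drain(cache_modules):
    """Return data to the processing device: BM1/BM2 first, then BM3/BM4."""
    output = []
    for first, second in (("BM1", "BM2"), ("BM3", "BM4")):
        for odd_item, even_item in zip(cache_modules[first], cache_modules[second]):
            output.extend([odd_item, even_item])
    return output


packet_1 = ["1-1", "1-2", "1-3", "1-4"]
packet_2 = ["2-1", "2-2", "2-3", "2-4"]
order = drain(route_packets(packet_1, packet_2))
print(order)
```

Because the port always drains the first pair of cache storage modules before the second pair, placing the higher-priority packet in the first pair is enough to preserve the return order.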
- Fig. 13 is a second schematic diagram of the data transmission principle of the controller shown in Fig. 11.
- the data belonging to the same read command output from the arbitration module 1131 is alternately stored, in the reordered order, in the first reordering module 1121 and the second reordering module 1122 of the two reordering buffers 112.
- the (i+1)th data is stored in the first reordering module 1121 of one reordering buffer 112, the (i+2)th data is stored in the second reordering module 1122 of that reordering buffer 112, the (i+3)th data is stored in the first reordering module 1121 of the other reordering buffer 112, and the (i+4)th data is stored in the second reordering module 1122 of the other reordering buffer 112, where i is 0 or a multiple of 4.
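The distributed placement rule can be expressed as a mapping from a data item's position to a (buffer, module) pair. The sketch below uses zero-based identifiers for the two reordering buffers and their two modules; these identifiers are illustrative assumptions.

```python
def placement(position: int) -> tuple[int, int]:
    """Map a 1-based position within a packet to (buffer_id, module_id).

    buffer_id: 0 for one reordering buffer, 1 for the other.
    module_id: 0 for the first reordering module, 1 for the second.
    """
    offset = (position - 1) % 4  # the pattern repeats every four data
    return offset // 2, offset % 2


layout = {position: placement(position) for position in range(1, 9)}
print(layout)
```

Positions 1 to 4 land in four different modules, so four consecutive data of one packet can be written or read in parallel, and positions 5 to 8 repeat the same pattern.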
- the two reordering buffers 112 send the K data to the corresponding port 111.
- if the first read command and the second read command correspond to the same port 111, it is then determined whether the first read command and the second read command belong to the same ID; if they belong to the same ID, the corresponding data is returned according to the sequence of the first read command and the second read command.
- according to the sequence of the first read command and the second read command, the reordering buffer 112 may send the data corresponding to the earlier read command to the first cache storage module 1112 and the second cache storage module 1113, and simultaneously send the data corresponding to the other read command to the third cache storage module 1114 and the fourth cache storage module 1115.
- data packet 1 corresponding to a read command includes four data, numbered 1-1, 1-2, 1-3, 1-4
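- data 1-1 can be stored in the first reordering module 1121 in the first reordering cache 112
- the data 1-2 is stored in the second reordering module 1122 in the first reordering cache 112
- the data 1-3 is stored in the first reordering module 1121 in the second reordering cache 112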
- data 1-4 are stored in the second reordering module 1122 in the second reordering buffer 112.
- if the data are numbered 2-1, 2-2, 2-3, 2-4, then the data 2-1 can be stored in the first reordering module 1121 in the first reordering cache 112, the data 2-2 in the second reordering module 1122 in the first reordering cache 112, the data 2-3 in the first reordering module 1121 in the second reordering cache 112, and the data 2-4 in the second reordering module 1122 in the second reordering buffer 112.
- the two reordering buffers 112 can complete two data packets at the same time, and the two data packets can belong to the same port 111 or to different ports 111.
- the arrow in Figure 13 indicates that the reordering module sends data to the corresponding cache storage module.
- the data packets stored in the first cache storage module 1112 and the second cache storage module 1113 may have priority over the data packets stored in the third cache storage module 1114 and the fourth cache storage module 1115. Assuming that data packet 1 has priority over data packet 2, data packet 1 is sent to the first cache storage module 1112 and the second cache storage module 1113, and data packet 2 is sent to the third cache storage module 1114 and the fourth cache storage module 1115. In this way, when the port 111 returns data to the processing device, data packet 1 can be returned to the processing device before data packet 2, ensuring the data transmission sequence.
- the size of the storage space of the reordering cache 112 can be determined according to the parameters of other components. Assuming that the output data bit width of the arbitration module 1131 is W, the depth of the instruction buffer 1132 is M, the burst length of the storage device 114 is BL, the combined bit width of the storage device 114 is D, and the number of ports 111 of the controller is N, then the size of each reordering buffer 112 is M*BL*D.
- the total storage space size required by the reordering cache 112 is 2*N*M*BL*D.
- the storage space size required by the reordering cache 112 is 2*M*BL*D, and each port 111 adds cache storage module space of 4*8*2*W, so the total storage space size is N*4*8*2*W+2*M*BL*D. Therefore, compared with the solution shown in FIG. 7, the solution shown in FIG. 11 can save 2*(N-1)*M*BL*D-N*4*8*2*W of storage resources.
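The dual-channel saving depends on both N and M, and for a small instruction buffer depth the per-port cache storage can outweigh the shared buffers. The sketch below evaluates the formula and searches for the smallest port count at which the shared scheme saves bits. The function names, the search limit, and the values of M and N are assumptions for illustration; BL, D and W follow the earlier example.

```python
def dual_channel_saving(n_ports: int, m: int, bl: int, d: int, w: int) -> int:
    """Bits saved by the Fig. 11 scheme versus the Fig. 7 scheme."""
    return 2 * (n_ports - 1) * m * bl * d - n_ports * 4 * 8 * 2 * w


def min_ports_for_saving(m: int, bl: int, d: int, w: int, limit: int = 64):
    """Smallest N in [2, limit] with a positive saving, or None if there is none."""
    for n_ports in range(2, limit + 1):
        if dual_channel_saving(n_ports, m, bl, d, w) > 0:
            return n_ports
    return None


# BL = 16, D = 32, W = 128 as in the example; M and N are assumed values.
saving = dual_channel_saving(n_ports=8, m=64, bl=16, d=32, w=128)
thresholds = {m: min_ports_for_saving(m, 16, 32, 128) for m in (8, 16, 64)}
print(saving, thresholds)
```

Under these assumptions a depth of 64 pays off from two ports onward, a depth of 16 needs at least three ports, and a depth of 8 never yields a saving within the searched range, which suggests the benefit is tied to a reasonably deep instruction buffer.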
- Table 2 shows, for given values of the output data bit width W of the arbitration module 1131, the depth M of the instruction cache 1132, the burst length BL of the storage device 114, the combined bit width D of the storage device 114, and the number N of ports 111 of the controller, the number of bits saved by the scheme shown in Fig. 11 compared with the scheme shown in Fig. 7.
- the solution shown in Fig. 11 can save a six-digit number of bits compared with the solution shown in Fig. 7, and the resources of the reordering buffer 112 are shared by multiple ports 111.
- the logic storage resources are greatly reduced, and the cost of the controller is reduced.
- the controller provided in FIG. 11 for communicating with the storage device 114 includes: a scheduling module 113, two reordering buffers 112, and a plurality of ports 111.
- the scheduling module 113 includes two sets of instruction buffers 1132, instruction execution modules 1133, and arbitration modules 1131, and can realize data transmission in dual-channel mode.
- the reordering buffer 112 can reorder the data acquired by the scheduling module 113 and then transmit it to the corresponding port 111.
- the multiple ports 111 multiplex the two reordering buffers 112, which realizes the sharing of the reordering buffers 112 in the dual-channel mode, reduces the number of reordering buffers 112 in the controller, and reduces the size and cost of the controller.
- each port is provided with a cache storage module.
- the cache storage module may also be omitted and only the data storage module is retained. The following examples are used for description.
- FIG. 14 is a schematic structural diagram of another controller for communicating with a storage device according to an embodiment of the application.
- the controller may include: a scheduling module 143, a reordering buffer 142, and multiple ports;
- the scheduling module is used to obtain data from the storage device 144;
- the reordering buffer 142 is used to reorder the data acquired by the scheduling module 143 and then transmit it to the corresponding port;
- the port is used to send the data obtained from the reordering buffer 142 to the corresponding processing device
- At least two ports of the plurality of ports reuse the reordering buffer 142.
- the reordering cache may include a first reordering module 1421 and a second reordering module 1422
- the scheduling module 143 may include an instruction cache 1432, an instruction execution module 1433, and an arbitration module 1431.
- the port may include a data storage module (DM) 141 connected to the reordering buffer 142, and the input bit width of the data storage module 141 is equal to the output bit width of the reordering buffer 142
- the output bit width of the data storage module 141 is equal to the input bit width of the processing device.
- each data storage module 141 may include a FIFO memory, and the writing depth of the FIFO memory multiplied by the writing bit width may be equal to the reading depth multiplied by the reading bit width, thereby ensuring normal data transmission.
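The capacity constraint on an asymmetric FIFO (write depth times write width equals read depth times read width) can be checked with a short helper. The widths and depth used below are hypothetical values, not parameters stated for Fig. 14.

```python
def fifo_read_depth(write_depth: int, write_bits: int, read_bits: int) -> int:
    """Read depth that keeps write_depth * write_bits == read_depth * read_bits."""
    total_bits = write_depth * write_bits
    if total_bits % read_bits:
        raise ValueError("capacity is not a whole number of read words")
    return total_bits // read_bits


# Assumed values: 512-bit writes from the reordering buffer, depth 8,
# and 128-bit reads toward the processing device.
read_depth = fifo_read_depth(write_depth=8, write_bits=512, read_bits=128)
print(read_depth)
```

Keeping the two products equal means every word written on the wide side is consumed as a whole number of narrow words on the read side, so no data is stranded in the FIFO.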
- the embodiment shown in FIG. 14 is based on the embodiments shown in FIGS. 8 to 10, and adjusts the storage module in each port.
- the data storage module 141 directly receives the data sent by the reordering buffer 142, and the structure, function, implementation principle and technical effect of the other components are similar to those of the embodiment shown in FIG. 8 to FIG. 10, and will not be repeated here.
- FIG. 15 is a schematic structural diagram of another controller for communicating with a storage device according to an embodiment of the application.
- the controller may include: a scheduling module 153, two reordering buffers 152 and multiple ports;
- the scheduling module 153 is configured to obtain data from the storage device 154;
- the reordering buffer 152 is used for reordering the data acquired by the scheduling module 153 and then transmitting it to the corresponding port;
- the port is used to send the data obtained from the reordering buffer 152 to the corresponding processing device;
- each reordering buffer 152 is multiplexed by the multiple ports.
- each of the reordering buffers 152 may include a first reordering module 1521 and a second reordering module 1522
- the scheduling module 153 may include two instruction buffers 1532, two instruction execution modules 1533, and two arbitration modules 1531.
- the port may include a data storage module (DM) 151 connected to the reordering buffer 152.
- the input bit width of the data storage module 151 may be equal to twice the output bit width of a reordering buffer 152, and the output bit width of the data storage module 151 is equal to the input bit width of the processing device.
- the data storage module 151 of each port may include a FIFO memory, and the writing depth of the FIFO memory multiplied by the writing bit width may be equal to the reading depth multiplied by the reading bit width, thereby ensuring normal data transmission.
- the embodiment shown in Fig. 15 is based on the embodiment shown in Fig. 11 to Fig. 13 and adjusts the storage module in each port.
- the data storage module 151 directly receives the data sent by the reordering buffer 152, and the structure, function, implementation principle and technical effect of the other components are similar to those of the embodiment shown in FIG. 11, and will not be repeated here.
- a control unit may be provided in the reordering buffer for controlling the data transmission process, for example, determining the sequence of the data according to the token.
- the storage space size of the reordering buffer described in the embodiments of the present application may refer to the size of the storage space used for storing data sent to the processing device other than the control unit.
- the above takes a DDR controller as an example to describe the specific implementation principles of the embodiments of the present application. Those skilled in the art can understand that the controller may also be another type of controller that uses a reordering buffer, whose specific implementation principle is similar to that of the DDR controller and will not be repeated in the embodiments of the present application.
- FIG. 16 is a schematic flowchart of a data transmission method provided by an embodiment of this application.
- the method can be applied to a controller, the controller being used to communicate with a storage device, the controller including a scheduling module, at least one reordering buffer, and a plurality of ports.
- the data transmission method may include:
- the scheduling module obtains data from the storage device.
- the reordering buffer reorders the data acquired by the scheduling module and transmits the data to the corresponding port, where at least two of the multiple ports multiplex a reordering buffer;
- the port sends the data obtained from the reordering buffer to a corresponding processing device.
- reordering the data acquired by the scheduling module and transmitting the data to the corresponding port includes:
- the data is sent to the corresponding port in the reordered order.
- the scheduling module includes an instruction cache, an instruction execution module, and an arbitration module; the scheduling module acquiring data from the storage device includes:
- the instruction cache stores and divides read commands
- the instruction execution module sends the divided read command to the storage device, so that the storage device executes the divided read command
- the arbitration module obtains the data returned by the storage device according to the divided read command.
- the reordering buffer includes a first reordering module and a second reordering module; receiving data sent by the scheduling module includes:
- the reordering buffer includes a first reordering module and a second reordering module; the read command is divided into K instruction fragments, where K is a positive integer; and receiving the data sent by the scheduling module includes:
- K data corresponding to the K instruction fragments sent by the arbitration module in the scheduling module are received; among the K data, the odd-numbered data is stored in the first reordering module, and the even-numbered data is stored in The second reordering module.
- sending the data to the corresponding port in the reordered order includes:
- the K data corresponding to the read command are collected in the reordering buffer, the K data are sent to the corresponding port in the reordered order.
- the port includes a data storage module and a cache storage module;
- the cache storage module includes a first cache storage module and a second cache storage module;
- sending the K data to the corresponding port in the reordered order includes:
- the first reordering module in the reordering cache sends part of the K data to the first cache storage module, and the second reordering module in the reordering cache sends the other data of the K data to the second cache storage module;
- the first cache storage module and the second cache storage module send the K data to the data storage module.
- the controller is a dual-channel controller, and the number of the reordering caches, the instruction caches, the instruction execution modules, and the arbitration modules are all two; each of the reordering caches includes a first reordering module and a second reordering module.
- receiving the data sent by the scheduling module includes:
- the data belonging to the same read command sent by the arbitration module in the scheduling module is received, and the received data is alternately stored in the first reordering module and the second reordering module in one of the reordering buffers.
- receiving the data sent by the scheduling module includes:
- the read command is divided into K instruction fragments, and the K is a positive integer
- Receiving the data sent by the scheduling module includes:
- the i+1th data is stored in the first reordering module of a reordering cache
- the i+2th data is stored in the second reordering module of the reordering cache
- the (i+3)th data is stored in the first reordering module of another reordering cache, and the (i+4)th data is stored in the second reordering module of the other reordering cache, where i is 0 or a multiple of 4.
- sending the data to the corresponding port in the reordered order includes:
- after the multiple data corresponding to the read command are collected in the two reordering caches, the multiple data are sent to the corresponding port in the reordered order.
- sending the multiple data to the corresponding port in the reordered order includes:
- if the first read command and the second read command belong to the same ID, the corresponding data is returned according to the order of the first read command and the second read command.
- the port includes a data storage module and a cache storage module; each cache storage module includes a first cache storage module, a second cache storage module, a third cache storage module, and a fourth cache storage module; if the first read command and the second read command belong to the same ID, returning the corresponding data according to the order of the first read command and the second read command includes:
- the data corresponding to the earlier read command is sent to the first cache storage module and the second cache storage module, and at the same time the data corresponding to the other read command is sent to the third cache storage module and the fourth cache storage module.
- the storage device is a DDR device
- the controller is a DDR controller
- the processing device includes at least one of the following: an application processor, a digital signal processor, a graphics processing unit, and a multimedia processor.
- An embodiment of the present application also provides a storage device access system, including the controller, storage device, and multiple processing devices described in any of the foregoing embodiments.
- the system may be an SoC (System on Chip).
- an SoC is an integrated chip that takes an embedded system as its core, is based on IP reuse technology, integrates software and hardware, and aims to include as much of the product system as possible.
- the structure of the storage device access system provided by the embodiment of the present application can be implemented with reference to FIG. 1.
- the structures, functions, implementation principles, and technical effects of the components in the storage device access system provided in the embodiments of the present application can be referred to the foregoing embodiments, and details are not described herein again.
- An embodiment of the present application also provides an electronic device, including the storage device access system described above.
- the device may be any one of the following: unmanned aerial vehicle, unmanned vehicle, pan/tilt, camera, etc.
- An embodiment of the present application also provides a computer-readable storage medium, including instructions, which when run on a computer, cause the computer to execute the method described in any of the foregoing embodiments.
- the disclosed related remote control device and method can be implemented in other ways.
- the embodiments of the remote control device described above are merely illustrative.
- the division of the modules or units is only a division by logical function. In actual implementation there may be other divisions; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
- the mutual coupling, direct coupling, or communication connection displayed or discussed may be an indirect coupling or communication connection through some interfaces, remote control devices, or units, and may be electrical, mechanical, or in other forms.
- the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
- the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
- the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
- if the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
- the technical solution of the present application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions that cause a computer processor to execute all or part of the steps of the method described in each embodiment of the present application.
- the aforementioned storage media include: USB flash drives, removable hard disks, read-only memory (ROM), random access memory (RAM), magnetic disks, optical disks, and other media that can store program code.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Bus Control (AREA)
Abstract
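The embodiments of the present application provide a controller, a storage device access system, an electronic device, and a data transmission method. The controller includes a scheduling module, at least one reordering cache, and multiple ports. The scheduling module is configured to obtain data from the storage device; the reordering cache is configured to reorder the data obtained by the scheduling module and transmit it to the corresponding port; and the port is configured to send the data obtained from the reordering cache to the corresponding processing device. At least two of the multiple ports multiplex one reordering cache. Because at least two ports multiplex one reordering cache, the reordering cache can be shared, which reduces the number of reordering caches in the controller and therefore the size and cost of the controller.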
Description
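This application relates to the field of computer technology, and in particular to a controller, a storage device access system, an electronic device, and a data transmission method.
With the continuous progress of computer technology, storage devices are used more and more widely. Commonly used storage devices store data for use by a processor. However, because the bus standard of the processor often differs from that of a general storage device, a controller must be added to enable communication between the processor and the storage device.
In some implementations, the order of the data returned by the storage device to the controller often does not meet the requirements of the processor. To solve this problem, a reordering cache can be provided in the controller; the reordering cache reorders the data obtained from the storage device so that the return order of the data meets the requirements of the processor.
The disadvantage of the above technology is that the reordering cache in the controller occupies a large area, which makes the controller large and costly.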
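Summary of the Invention
The embodiments of the present application provide a controller, a storage device access system, an electronic device, and a data transmission method, to solve the technical problem that the controller of a storage device is large and costly.
In a first aspect, an embodiment of the present application provides a controller for communicating with a storage device, including: a scheduling module, at least one reordering cache, and multiple ports;
the scheduling module is configured to obtain data from the storage device;
the reordering cache is configured to reorder the data obtained by the scheduling module and transmit it to the corresponding port;
the port is configured to send the data obtained from the reordering cache to the corresponding processing device;
wherein at least two of the multiple ports multiplex one reordering cache.
In a second aspect, an embodiment of the present application provides a storage device access system, including the controller described in the first aspect, a storage device, and multiple processing devices.
In a third aspect, an embodiment of the present application provides an electronic device, including the storage device access system described in the second aspect.
In a fourth aspect, an embodiment of the present application provides a data transmission method. The method is applied to a controller that communicates with a storage device; the controller includes a scheduling module, at least one reordering cache, and multiple ports; and the method includes:
the scheduling module obtains data from the storage device;
the reordering cache reorders the data obtained by the scheduling module and transmits it to the corresponding port;
the port sends the data obtained from the reordering cache to the corresponding processing device;
wherein at least two of the multiple ports multiplex one reordering cache.
In a fifth aspect, an embodiment of the present application provides a computer-readable storage medium including instructions which, when run on a computer, cause the computer to execute the method described in the fourth aspect.
In the controller, storage device access system, electronic device, and data transmission method provided by the embodiments of the present application, at least two of the multiple ports multiplex one reordering cache. The reordering cache can therefore be shared, which reduces the number of reordering caches in the controller and thus the size and cost of the controller.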
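FIG. 1 is a schematic diagram of an application scenario of a DDR controller provided by an embodiment of this application;
FIG. 2 is a schematic diagram of the read channel of a DDR controller provided by an embodiment of this application;
FIG. 3 is a schematic diagram of the write channel of a DDR controller provided by an embodiment of this application;
FIG. 4 is a schematic diagram of the interaction principle of the data read process of a DDR controller provided by an embodiment of this application;
FIG. 5 is a schematic diagram of the working principle of the data read process of a DDR controller provided by an embodiment of this application;
FIG. 6 is a schematic structural diagram of a single-channel DDR controller;
FIG. 7 is a schematic structural diagram of a dual-channel DDR controller;
FIG. 8 is a schematic structural diagram of a controller for communicating with a storage device provided by an embodiment of this application;
FIG. 9 is a schematic diagram of the parameters of the components of the controller shown in FIG. 8;
FIG. 10 is a schematic diagram of the data transmission principle of the controller shown in FIG. 8;
FIG. 11 is a schematic structural diagram of another controller for communicating with a storage device provided by an embodiment of this application;
FIG. 12 is a first schematic diagram of the data transmission principle of the controller shown in FIG. 11;
FIG. 13 is a second schematic diagram of the data transmission principle of the controller shown in FIG. 11;
FIG. 14 is a schematic structural diagram of yet another controller for communicating with a storage device provided by an embodiment of this application;
FIG. 15 is a schematic structural diagram of still another controller for communicating with a storage device provided by an embodiment of this application;
FIG. 16 is a schematic flowchart of a data transmission method provided by an embodiment of this application.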
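The technical solutions in the embodiments of this application are described below with reference to the accompanying drawings.
Unless otherwise defined, all technical and scientific terms used herein have the meanings commonly understood by those skilled in the technical field of this application. The terms used in the specification are only for describing specific embodiments and are not intended to limit this application.
An embodiment of this application provides a controller for communicating with a storage device. The controller may include a scheduling module, at least one reordering cache, and multiple ports. The scheduling module is configured to obtain data from the storage device; the reordering cache is configured to reorder the data obtained by the scheduling module and transmit it to the corresponding port; and the port is configured to send the data obtained from the reordering cache to the corresponding processing device. At least two of the multiple ports may multiplex one reordering cache.
It can be understood that a port of the controller can be connected to a processing device and send data to it. The connections between ports and processing devices can be set according to actual needs. For example, multiple ports can be connected to multiple processing devices in one-to-one correspondence, or at least two of the ports can be connected to the same processing device. In other words, the processing devices connected to the ports may be different devices or the same device, which is not limited in the embodiments of this application. During data transmission, a port sends data to its corresponding processing device, that is, the processing device connected to it.
Optionally, the storage device may be any device capable of storing data, including but not limited to at least one of the following: RAM (Random Access Memory), SDRAM (Synchronous Dynamic Random Access Memory), DDR (Double Data Rate SDRAM), and so on.
For ease of description, the embodiments of this application take the storage device to be a DDR device and the controller to be a DDR controller. The DDR device may be any type of DDR, such as DDR2, DDR3, or DDR4.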
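For application scenarios that require a large memory capacity, a DDR controller can be provided to connect DDR devices and supply system memory space. ASIC (Application Specific Integrated Circuit) chips in fields such as mobile phone applications, image processing, and video games can use the DDR controller as the core for storing and accessing data, so improving the access efficiency of the DDR controller improves the processing performance of the whole chip.
FIG. 1 is a schematic diagram of an application scenario of a DDR controller provided by an embodiment of this application. As shown in FIG. 1, to improve the access efficiency of the DDR controller, multiple access ports can be configured for it. These access ports all follow a bus standard, such as AXI (Advanced eXtensible Interface) or AHB (Advanced High Performance Bus) of AMBA (Advanced Microcontroller Bus Architecture).
The DDR controller can be accessed as a SLAVE, with multiple external processing devices acting as access MASTERs. These MASTERs may come from other IP (Intellectual Property) cores, such as AP (application processor), DSP (digital signal processor), GPU (graphics processing unit), and MEDIA (multimedia) IP cores.
In general, the system bottleneck lies in the access efficiency of the DDR device. To maximize that efficiency, the system is designed so that when a MASTER issues a read operation to a port of the DDR controller, the MASTER has already allocated internal storage space for the read data about to be returned. The output of DDR data is therefore never throttled at the port for lack of space.
FIG. 1 uses eight access MASTERs as an example to illustrate the application scenario of the embodiments of this application; the buses of these MASTERs may, for example, all be AMBA AXI buses. The DDR device does not have an AXI standard interface; its interface follows a different standard. The role of the DDR controller is to convert AXI bus access commands and timing into DDR device access commands and timing, thereby completing the data exchange between the DDR device and the access MASTERs.
The AMBA AXI bus has the following characteristics: reads and writes, and requests and responses, are carried on independent unidirectional channels, which increases parallelism and throughput; five channels are used for reading and writing data, namely the read address (read command) channel, the read data channel, the write address (write command) channel, the write data channel, and the write response channel; and because the channels are independent, register slices are easy to insert, which helps timing closure.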
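FIG. 2 is a schematic diagram of the read channel of a DDR controller provided by an embodiment of this application. As shown in FIG. 2, the read channel may include a read address channel and a read data channel.
The master interface can send address and control signals to the slave interface through the read address channel, and the slave interface can return read data to the master interface through the read data channel, completing the read operation.
FIG. 3 is a schematic diagram of the write channel of a DDR controller provided by an embodiment of this application. As shown in FIG. 3, the write channel may include a write address channel, a write data channel, and a write response channel.
The master interface can send address and control signals to the slave interface through the write address channel and then send write data through the write data channel; the slave interface can return a write response signal through the write response channel, completing the write operation.
The AXI bus standard improves bus bandwidth utilization, but it imposes the following ordering rules: no response order is required between AXI transfers from different Masters, nor between transfers with different IDs issued by the same Master, but responses to transfers with the same ID must be returned in the same order as the commands.
FIG. 4 is a schematic diagram of the interaction principle of the data read process of a DDR controller provided by an embodiment of this application. FIG. 4 shows the ordering relationship between read commands and read data that have the same ID and come from the same MASTER: the read data for an earlier read command is returned first, so the read data return order matches the read command issue order.
As shown in FIG. 4, the MASTER sends read commands ARADDR to the DDR controller. For ease of description, the read commands are denoted ARADDR1 to ARADDR5 in issue order. If ARADDR1 to ARADDR5 have the same ID, the DDR controller must return RDATA1 to RDATA5 in that order, where RDATAi is the data returned in response to ARADDRi, i = 1, 2, 3, 4, 5.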
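FIG. 5 is a schematic diagram of the working principle of the data read process of a DDR controller provided by an embodiment of this application. After receiving AXI read commands ARADDR, the DDR controller splits each ARADDR into instruction fragments and attaches a token to each fragment to record the fragment's attribute information.
Optionally, the attribute information may include at least one of the following: the ID, the sequence number under that ID of the read command to which the fragment belongs, the number of fragments into which that read command was split, the order of the fragments, and any other information needed for later data reassembly.
The token of an instruction fragment may contain the attribute information itself, or it may contain any information from which the attribute information can be determined. For example, the token may be a code that is used to look up the corresponding ID, fragment count, and so on in a table.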
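Based on the current state of the DDR device, the DDR controller can rearrange the execution order of these instruction fragments to achieve the best execution efficiency and improve the access efficiency of the DDR device. Optionally, the order of the tokens can be adjusted along with the order of the instruction fragments, so that fragments and tokens stay in one-to-one correspondence.
After receiving the reordered instruction fragments, the DDR device returns read data in the adjusted order. The DDR controller attaches the corresponding token to each piece of returned read data according to the token order recorded earlier. At this point the order of the read data differs from the order in which the ARADDR commands were issued, so it does not satisfy the AXI requirement on the return order of data with the same ID. The DDR controller can use a reordering cache (Read Reorder Buffer, RRB) to reassemble the read data according to the tokens it carries and produce the correct RDATA return order required by the AXI bus.
Referring to FIG. 5, after receiving read commands ARADDR1 to ARADDR5, the DDR controller splits each read command into several instruction fragments. For example, ARADDR1 is split into 4 fragments, ARADDR2 into 2, ARADDR3 into 5, ARADDR4 into 3, and ARADDR5 into 4. In FIG. 5, the number in each instruction fragment indicates which read command the fragment comes from.
These instruction fragments are reordered and sent to the DDR device for execution, and data is returned. The number in each piece of returned data indicates which read command the data belongs to. As the data return order in FIG. 5 shows, the DDR device returns data in the same order as the adjusted instruction fragments.
Because the data order returned by the DDR device does not meet the MASTER's requirements, the reordering cache in the DDR controller must reorder the data. Specifically, instruction fragments correspond one-to-one to returned data and also one-to-one to tokens, so data and tokens correspond one-to-one as well. The data can therefore be reassembled by token to obtain RDATA1 to RDATA5 corresponding to ARADDR1 to ARADDR5, which are returned to the corresponding MASTER.
The token order generated in the command splitting stage determines the order of data reassembly in the reordering cache. A token can serve as an addressing pointer for looking up the attribute information of the original ARADDR command, from which the number of RDATA to return can be obtained.
Specifically, because a token can indicate how many instruction fragments a read command contains, the reordering cache knows how many pieces of data belong to a read command. Once all the data for a read command has been collected, that data can be sent to the corresponding port.
Because the tokens determine the order of data reassembly in the reordering cache, the RDATA under a given ID must be produced in the order in which the tokens were generated. Optionally, the token generation order for an ID can be maintained with a FIFO (First Input First Output) memory: the read data of the earliest generated token is used first to produce RDATA.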
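The token-driven reassembly described above can be sketched in software. The following Python is only an illustrative model under stated assumptions, not the hardware design: the function name `reorder_reads` and the tuple layouts are hypothetical, a token is modelled as a `(cmd_no, frag_idx)` pair, and a per-ID `deque` stands in for the FIFO that maintains token order.

```python
from collections import defaultdict, deque

def reorder_reads(commands, returned):
    """Model of token-based read reordering (hypothetical data layout).

    commands: list of (axi_id, cmd_no, fragment_count) in issue order.
    returned: list of (cmd_no, frag_idx, payload) in the order the memory
              returned them; frag_idx is 0-based within its command.
    Returns a list of (cmd_no, [payloads in fragment order]) in an order
    that preserves the issue order of commands sharing an ID.
    """
    pending = defaultdict(deque)   # per-ID FIFO of command numbers
    need, id_of = {}, {}
    for axi_id, cmd_no, count in commands:
        pending[axi_id].append(cmd_no)
        need[cmd_no] = count
        id_of[cmd_no] = axi_id

    store = defaultdict(dict)      # cmd_no -> {frag_idx: payload}
    out = []
    for cmd_no, frag_idx, payload in returned:
        store[cmd_no][frag_idx] = payload
        fifo = pending[id_of[cmd_no]]
        # Release every complete command sitting at the head of this ID's FIFO.
        while fifo and len(store[fifo[0]]) == need[fifo[0]]:
            head = fifo.popleft()
            out.append((head, [store[head][i] for i in range(need[head])]))
            del store[head]
    return out
```

In this sketch a command that completes early simply waits in `store` until every earlier command with the same ID has been released, which mirrors the same-ID ordering rule of the AXI bus.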
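FIG. 6 is a schematic structural diagram of a single-channel DDR controller. As shown in FIG. 6, the DDR controller includes multiple ports 601, each used to connect to one MASTER. A port 601 can be implemented as a data storage module such as a FIFO memory. To reorder data, each port 601 is given its own reordering cache 602.
The DDR controller may also include a scheduling module 603, which may include an arbitration module 6031, an instruction cache 6032, an instruction execution module 6033, and so on. The arbitration module 6031 arbitrates the read commands sent by the MASTERs and sends the arbitrated read commands to the instruction cache 6032; the instruction cache 6032 can split and order the read commands; and the instruction execution module 6033 can execute the resulting instruction fragments.
The size of the reordering cache 602 in the DDR controller generally depends on the following parameters: the data width (bit width) of the DDR device 604, which can be increased by connecting several DDR devices 604 in parallel and here means the total external data width after paralleling, denoted D, typically 16, 32, or 64; the burst length of the DDR device 604, which is the length of one data transfer, denoted BL, where common DDR devices such as DDR3, LPDDR3, and DDR4 have a BL of 8 and LPDDR4 usually has a BL of 16 or 32; and the depth M of the instruction cache 6032, typically 32 or 64.
Generally, the size of one reordering cache 602 is set to M*BL*D bit. Because the DDR device 604 has bit width D and one transfer lasts BL beats, the DDR device 604 returns D*BL bit per access. The instruction cache 6032 has depth M and can hold at most M instruction fragments, so a reordering cache 602 of M*BL*D bit is sufficient for data reassembly.
If the DDR controller has N ports 601, the total reordering cache 602 needed in single-channel mode is N*M*BL*D bit. The bit width W of the reordering cache 602 is generally the same as the bus data width of the DDR controller port 601 and as the output bit width of the arbitration module 6031.
FIG. 7 is a schematic structural diagram of a dual-channel DDR controller. Dual-channel mode needs twice as much reordering cache as single-channel mode. As shown in FIG. 7, the dual-channel DDR controller may include multiple ports 701, each corresponding to two reordering caches 702. In the scheduling module 703, the arbitration module 7031, the instruction cache 7032, and the instruction execution module 7033 are also duplicated, so that data is obtained from the DDR device 704 through two channels. The dual-channel DDR controller works on the same principle as the single-channel one, except that there are two channels. The total reordering cache needed in dual-channel mode is 2*N*M*BL*D bit.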
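As a quick arithmetic check of the sizing rule above, the sketch below evaluates M*BL*D for one reordering cache and N*M*BL*D (or 2*N*M*BL*D) for the per-port layouts of FIG. 6 and FIG. 7. The function names are hypothetical and the parameter values in the usage note are just one configuration from the ranges quoted in the text.

```python
def rrb_bits(m, bl, d):
    """Bits in one reordering cache: M fragments, each returning BL*D bits."""
    return m * bl * d

def per_port_rrb_bits(n, m, bl, d, channels=1):
    """Total bits when every port has its own reordering cache per channel."""
    return channels * n * rrb_bits(m, bl, d)
```

For example, with M = 64, BL = 16, and D = 32, one reordering cache holds 32768 bit; eight ports then need 262144 bit in single-channel mode and 524288 bit in dual-channel mode.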
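The single- and dual-channel DDR controller structures shown in FIG. 6 and FIG. 7 have the advantage of simple management and easy implementation. The ports are independent, so data congestion on one port does not affect the access efficiency of other ports to the DDR device. However, in these structures the instruction cache is shared by all ports, and the total amount of data returned from the DDR device in a given period is limited. In most cases the reordering caches of the ports are not fully occupied. In the extreme case, all read commands in the instruction cache come from the same port; the read data returned by the DDR device then fills the entire reordering cache of that port while the reordering caches of all other ports sit idle. The solutions of FIG. 6 and FIG. 7 therefore require a large amount of reordering cache space, and the ports cannot share it.
In view of this, the embodiments of this application provide a solution in which multiple ports share a reordering cache. Specifically, an embodiment of this application provides a controller for communicating with a storage device, for example a DDR controller. The controller includes at least one reordering cache and multiple ports, and at least two of the multiple ports multiplex one reordering cache.
Here, multiplexing means that the at least two ports share the same reordering cache. Specifically, after obtaining data transmitted from the storage device, the reordering cache can determine the port to which the data corresponds and send the data to that port in the reordered order, so that multiple ports multiplex one reordering cache.
It can be understood that a multiplexed reordering cache can store data for each of the at least two ports, unlike the solutions of FIG. 6 or FIG. 7, in which each reordering cache can only store data for its own port.
In one optional implementation, there is one reordering cache, and all of the ports multiplex it.
In another optional implementation, there are at least two reordering caches; some of the ports multiplex one of them, and the other ports use the others.
For example, the controller includes eight ports and two reordering caches. Seven ports may multiplex one reordering cache while the remaining port uses the other alone, or four ports may multiplex one reordering cache while the other four multiplex the other.
Some implementations of this application are described in detail below with reference to the accompanying drawings. Provided the embodiments do not conflict, the following embodiments and their features can be combined with one another.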
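An embodiment of this application provides a controller for communicating with a storage device. FIG. 8 is a schematic structural diagram of such a controller. As shown in FIG. 8, the controller may include a scheduling module 803, at least one reordering cache 802, and multiple ports 801;
the scheduling module 803 is configured to obtain data from the storage device 804;
the reordering cache 802 is configured to reorder the data obtained by the scheduling module 803 and transmit it to the corresponding port 801;
the port 801 is configured to send the data obtained from the reordering cache to the corresponding processing device;
wherein at least two ports 801 of the multiple ports 801 multiplex one reordering cache 802.
In the controller shown in FIG. 8 there is one reordering cache 802 and eight ports 801, and the eight ports 801 multiplex the one reordering cache 802. Those skilled in the art can of course adjust the number of reordering caches 802, the number of ports 801, and the multiplexing arrangement according to actual needs.
A port 801 can be connected to a processing device and send data to it. Optionally, the multiple ports 801 can be connected to multiple processing devices in one-to-one correspondence. Specifically, if there are n ports 801, they can be connected to n processing devices, with the i-th port 801 connected to the i-th processing device, i = 1, 2, ..., n. For example, if each port 801 connects to one processing device, eight ports 801 can connect eight processing devices and let them communicate with the storage device 804. For the specific connection between processing devices and the controller, see FIG. 1; a processing device can connect to the controller as a MASTER.
The storage device 804 may be a DDR device, and the controller may be a DDR controller. The processing device may include at least one of the following: an application processor, a digital signal processor, a graphics processing unit, and a multimedia processor. The multiple processing devices may be of the same type or of different types.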
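The scheduling module 803 in the controller can be used to obtain data from the storage device 804. Optionally, the scheduling module 803 may include: an instruction cache 8032 for storing and splitting read commands; an instruction execution module 8033 for sending the split read commands to the storage device 804 so that the storage device 804 executes them; and an arbitration module 8031 for obtaining the data that the storage device 804 returns in response to the split read commands.
Optionally, the instruction cache 8032 can split a read command into multiple instruction fragments, and the instruction execution module 8033 can send the fragments to the storage device 804 so that the storage device 804 executes them and returns the corresponding data. Referring to FIG. 5, the instruction fragments sent to the storage device 804 may be in a shuffled order; that is, the order of the read commands executed by the storage device 804 need not match the order of the read commands issued by the processing devices.
It can be understood that the instruction execution module 8033 can send the instruction fragments to the storage device 804 directly, or it can first process them into a form suitable for execution by the storage device 804 and then send the processed fragments.
The arbitration module 8031 can obtain the data returned by the storage device 804. It can also perform arbitration: for example, before a read command reaches the instruction cache 8032, it decides whether to execute the read command issued by a processing device, and if the number of read commands issued by one port 801 at one time exceeds the depth of the instruction cache 8032, it rejects the excess read commands.
The reordering cache 802 can reorder the data obtained by the scheduling module 803 and transmit it to the corresponding port 801. Optionally, the reordering cache 802 is specifically configured to: receive the data sent by the scheduling module 803, determine the port 801 to which the data corresponds, and send the data to that port 801 in the reordered order.
Which port 801 the data corresponds to can be determined in several ways. For example, the scheduling module 803 can send the port information for the data to the reordering cache 802, or the token attached to the data can carry the port information, so that the reordering cache can determine which port 801 to send the data to.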
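The storage size of the reordering cache 802 can be determined from the bit width and burst length of the storage device 804 and the depth of the instruction cache 8032. Optionally, if the bit width of the storage device 804 is D, its burst length is BL, and the depth of the instruction cache 8032 is M, the storage size of the reordering cache 802 may be M*BL*D, where D, BL, and M are all positive integers.
The reordering cache 802 can be connected to the arbitration module 8031 to receive the data that the arbitration module 8031 obtains from the storage device 804. The input bit width of the reordering cache 802 may equal its output bit width. The bit width of the reordering cache 802 may be greater than the output bit width of the arbitration module 8031, which speeds up data transfer out of the reordering cache 802 and improves its utilization.
Optionally, the bit width of the reordering cache 802 may be 2 or 4 times the output bit width of the arbitration module 8031. For example, if the output bit width of the arbitration module 8031 is W, the bit width of the reordering cache 802 may be 2W or 4W, so that at least 2 or 4 times as much data can be sent in one cycle, ensuring transmission speed and accuracy.
In this embodiment, the reordering cache 802 may include a first reordering module 8021 and a second reordering module 8022, each with a bit width twice the output bit width of the arbitration module 8031.
Optionally, the input bit width of each port 801 may equal the output bit width of the reordering cache 802, allowing fast data transfer. In this embodiment, the port 801 may include a data storage module (DM) 8011 and a cache storage module (BM). The data storage module 8011 can be connected to the processing device, and its output bit width equals the input bit width of the processing device. The cache storage module is connected to the reordering cache 802; its input bit width equals the output bit width of the reordering cache 802, and its output bit width equals the input bit width of the data storage module 8011.
Both the data storage module 8011 and the cache storage module can be implemented with FIFO memories, so that data is read quickly and accurately.
Optionally, each cache storage module includes a first cache storage module (first BM) 8012 and a second cache storage module (second BM) 8013. The trapezoid in FIG. 8 between the reordering cache 802 and the first and second cache storage modules 8012 and 8013 represents their connection; any connection that implements the reordering cache function is allowed. For example, the first cache storage module 8012 is connected to the first reordering module 8021 and the second cache storage module 8013 is connected to the second reordering module 8022, so that data in the reordering cache 802 is transferred smoothly to the port 801.
The input bit width of the first cache storage module 8012, the input bit width of the second cache storage module 8013, the output bit width of the first reordering module 8021, and the output bit width of the second reordering module 8022 may all be equal, which further improves transmission efficiency.
Optionally, the depth of the cache storage module is 8 or 16. A larger depth gives a larger effective bandwidth per transfer and saves transmission time; a smaller depth needs less storage space, which saves controller area and cost.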
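The implementation details of this embodiment are described below using a configuration in which the data width of the storage device 804 is D = 32, the burst length of the storage device 804 is BL = 16, the depth of the instruction cache 8032 is M = 64, and the AXI bus read data width of the port 801 is 128.
FIG. 9 is a schematic diagram of the parameters of the components of the controller shown in FIG. 8. The number above each component in FIG. 9 is the parameter of that component. As shown in FIG. 9, the bit width of the storage device 804 is D = 32 and the bit width of the arbitration module 8031 is W = 128. The first reordering module 8021 and the second reordering module 8022 each have a storage size of M*2W = 64*256, and the whole reordering cache 802 formed by them has a storage size exactly equal to M*BL*D = 64*16*32.
FIG. 9 shows the parameter settings of one port 801. In the controller, the first cache storage module 8012 of each port 801 may have a storage size of depth*2W = 16*256, and the data storage module 8011 may have a storage size of depth*AXI bus read data width = 32*128.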
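FIG. 10 is a schematic diagram of the data transmission principle of the controller shown in FIG. 8.
Taking one port 801 as an example, suppose the transfer length ARLEN of the processing device is 16. Storage space for the returned data of this AXI read command must then be reserved in the RRB. The data for this read command is at most 16*128 bit, while executing one instruction fragment returns BL*D, that is 16*32 bit. For ease of description, in this embodiment all the data for one read command is called a data packet; the data packet contains the pieces of data corresponding to the instruction fragments into which the read command was split. In other words, the data packet of one read command is 16*128 bit.
After the storage device 804 returns 16*32 bit of data, the arbitration module 8031 performs bit-width splicing and outputs 4*128 bit of data. A data packet of 16*128 bit is therefore returned as four pieces of 4*128 bit.
Optionally, data output by the arbitration module 8031 that belongs to the same read command can be stored alternately in the first reordering module 8021 and the second reordering module 8022 according to the reordered order. Specifically, the data obtained from the arbitration module 8031 is in shuffled order, and the reordered order is the order the data should originally have had.
For example, suppose the read command is split into K instruction fragments, K being a positive integer, and K pieces of data are obtained from them. Among these K pieces of data, the odd-numbered pieces can be stored in the first reordering module 8021 and the even-numbered pieces in the second reordering module 8022.
Referring to FIG. 10, suppose the four pieces of data for a read command are numbered 1-1, 1-2, 1-3, and 1-4. Then data 1-1 and 1-3, originally in positions 1 and 3, can be stored in the first reordering module 8021, and data 1-2 and 1-4, originally in positions 2 and 4, can be stored in the second reordering module 8022.
Because instruction fragments are executed out of order, these four pieces of data may not be returned in the order 1-1, 1-2, 1-3, 1-4. Since the arbitration module 8031 also returns the port and token for each piece of data, the reordering cache 802 can look up the position of the returned data within its original data packet and store it accordingly.
Suppose the arbitration module returns data in the order 1-3, 1-4, 1-1, 1-2. Once data 1-2 has been written into the reordering cache 802, the read operation can start. Specifically, data 1-1 and 1-2 can be read simultaneously from the first reordering module 8021 and the second reordering module 8022 respectively, followed by data 1-3 and 1-4 read simultaneously in the same way, and written into the first cache storage module 8012 and the second cache storage module 8013 of the corresponding port 801.
As shown in FIG. 10, the first cache storage module 8012 can read data from the first reordering module 8021, and the second cache storage module 8013 can read data from the second reordering module 8022.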
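The odd/even placement and the paired read-out described above can be modelled in a few lines. This is a simplified software sketch, not the circuit: the dictionaries stand in for the two reordering modules, and the names `store_packet` and `read_packet` are hypothetical.

```python
def store_packet(arrivals):
    """Place each arriving piece of data by its 1-based position in the packet.

    Odd positions go to the first reordering module, even positions to the
    second, regardless of the order in which the data arrives.
    """
    first, second = {}, {}
    for position, data in arrivals:
        (first if position % 2 == 1 else second)[position] = data
    return first, second

def read_packet(first, second):
    """Read one entry from each module per step, restoring the original order."""
    out = []
    for odd in sorted(first):
        out.append(first[odd])           # goes to the first cache storage module
        if odd + 1 in second:
            out.append(second[odd + 1])  # goes to the second cache storage module
    return out
```

With the arrival order 1-3, 1-4, 1-1, 1-2 used in the text, `store_packet` leaves 1-1 and 1-3 in the first module and 1-2 and 1-4 in the second, and `read_packet` yields 1-1, 1-2, 1-3, 1-4.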
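When the read operation completes, the storage resources occupied by data 1-3, 1-4, 1-1, and 1-2 can be released to other ports 801 or to data corresponding to other tokens. When the arbitration module 8031 next returns new read data, if that data happens to complete a new data packet, then in the most demanding case the previous data packet must be read out entirely while the arbitration module 8031 is writing.
Because BL = 16, at least BL/2 = 8 PHY clocks are needed to obtain 16*32 bit of data from the storage device 804. The ratio of the controller clock to the PHY clock is commonly 1:2 or 1:1, which corresponds to 4 or 8 controller clocks. That is, the arbitration module 8031 needs at least 4 or 8 controller clock cycles to complete a new data write.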
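The clock budget above is simple arithmetic and can be written out as a small helper. The function name and the `phy_clocks_per_controller_clock` parameter are hypothetical; the sketch only assumes, as the text does, that a burst of BL beats takes BL/2 PHY clocks.

```python
def controller_clocks_per_burst(bl, phy_clocks_per_controller_clock):
    """Minimum controller clocks needed to receive one burst.

    A burst of BL beats takes BL/2 PHY clocks (two beats per PHY clock).
    phy_clocks_per_controller_clock is 2 for a 1:2 clock ratio and 1 for 1:1.
    """
    phy_clocks = bl // 2
    return phy_clocks // phy_clocks_per_controller_clock
```

For BL = 16 this gives 4 controller clocks at a 1:2 ratio and 8 at a 1:1 ratio, so the 1:2 case is the tighter deadline for draining the previous data packet.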
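Therefore, in the most demanding scenario, all data in the previous data packet must be read out within 4 controller clock cycles. Because the first reordering module 8021 and the second reordering module 8022 are each 256 bit wide and a data packet contains 4 pieces of data in total, the previous data packet can be read out within exactly 4 controller clock cycles without affecting the storage of subsequent data, which satisfies the storage requirement of the most demanding scenario.
The reordering cache 802 sends four pieces of 256*2 data to the cache storage modules. The cache storage modules split the data into four pieces of 128*4 data and transfer them to the data storage module 8011 over 4 cycles, and the data storage module 8011 sends them to the processing device, completing the read from the storage device 804.
In this embodiment, multiple ports 801 share the reordering cache 802, and a small cache FIFO memory is added inside each port 801. During data transmission, a shared-to-local approach moves data from the shared reordering cache 802 into the local port 801 promptly, so that a complete data packet of one port 801 does not occupy the reordering cache 802 for long. The resources of the reordering cache 802 are released promptly, and the number of reordering caches 802 is significantly reduced.
In practice, the storage size of the reordering cache 802 can be determined from the parameters of the other components. As before, suppose the output data bit width of the arbitration module 8031 is W, the depth of the instruction cache 8032 is M, the burst length of the storage device 804 is BL, the combined bit width of the storage device 804 is D, and the controller has N ports 801. Then each reordering cache 802 has size M*BL*D.
With the solution of FIG. 6, the total storage needed for reordering caches is N*M*BL*D. With the solution of FIG. 8, the reordering cache 802 needs M*BL*D, and each port 801 adds cache storage modules of size 2*16*2*W, for a total of N*2*16*2*W+M*BL*D. Compared with the solution of FIG. 6, the solution of FIG. 8 therefore saves (N-1)*M*BL*D-N*2*16*2*W of storage.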
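Table 1 lists the number of bits saved by the solution of FIG. 8 relative to the solution of FIG. 6 for different values of the arbitration module 8031 output data bit width W, the instruction cache 8032 depth M, the storage device 804 burst length BL, the combined storage device 804 bit width D, and the number N of controller ports 801.
Table 1: Resources saved in single-channel mode
BL | D | W | M | N | Bits saved |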
16 | 32 | 64 | 64 | 8 | 196608 |
16 | 32 | 64 | 64 | 8 | 196608 |
32 | 32 | 64 | 32 | 8 | 196608 |
32 | 32 | 64 | 32 | 8 | 196608 |
32 | 32 | 64 | 64 | 8 | 425984 |
32 | 32 | 64 | 64 | 8 | 425984 |
16 | 32 | 128 | 64 | 8 | 163840 |
16 | 32 | 128 | 64 | 8 | 163840 |
32 | 32 | 128 | 32 | 8 | 163840 |
32 | 32 | 128 | 32 | 8 | 163840 |
32 | 32 | 128 | 64 | 8 | 393216 |
32 | 32 | 128 | 64 | 8 | 393216 |
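As Table 1 shows, for some common controller configurations the solution of FIG. 8 saves a six-digit number of bits compared with the solution of FIG. 6. Having multiple ports 801 share the resources of the reordering cache 802, rather than giving each port 801 its own reordering cache 802, greatly reduces the logic storage resources and lowers the cost of the controller.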
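The single-channel saving formula (N-1)*M*BL*D-N*2*16*2*W can be evaluated directly to reproduce the tabulated values. The sketch below is only a calculator for that formula; the function name is hypothetical, and the default `bm_depth=16` is the cache storage module depth used in the text.

```python
def saved_bits_single_channel(n, m, bl, d, w, bm_depth=16):
    """Bits saved by sharing one reordering cache across n ports.

    Per-port layout:  n reordering caches of m*bl*d bits each.
    Shared layout:    one reordering cache of m*bl*d bits, plus two cache
                      storage modules per port, each bm_depth entries of 2*w bits.
    """
    per_port_layout = n * m * bl * d
    shared_layout = m * bl * d + n * 2 * bm_depth * 2 * w
    return per_port_layout - shared_layout
```

For BL = 16, D = 32, W = 64, M = 64, and N = 8 this returns 196608, and for BL = 32, D = 32, W = 128, M = 64, and N = 8 it returns 393216, matching the corresponding rows of Table 1.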
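The controller for communicating with a storage device provided by this embodiment includes a scheduling module 803, at least one reordering cache 802, and multiple ports 801. The scheduling module 803 is configured to obtain data from the storage device 804; the reordering cache 802 is configured to reorder the data obtained by the scheduling module 803 and transmit it to the corresponding port 801; and the port 801 is configured to send the data obtained from the reordering cache 802 to the corresponding processing device. At least two ports 801 of the multiple ports 801 multiplex one reordering cache 802, so the reordering cache 802 can be shared, which reduces the number of reordering caches 802 in the controller and thus the size and cost of the controller.
In the solutions of FIG. 8 to FIG. 10, the reordering cache includes two reordering modules. In other optional implementations, the reordering cache may include any number of reordering modules, provided that, in terms of timing, the previous data packet has already been read out to the port by the time the last data fragment of the next data packet is stored in the reordering cache.
For example, suppose each data packet contains 4 pieces of data. In the most demanding case, each time one piece of data comes in, all the data of the previous data packet must be moved out, which amounts to outputting 4 pieces of data while 1 piece comes in.
Optionally, the ratio of the bit width of a reordering module to the bit width of the arbitration module, multiplied by the number of reordering modules, may equal the number of instruction fragments contained in a read command.
The number of instruction fragments in a read command is also the number of pieces of data in a data packet. The multiple obtained by dividing the bit width of a reordering module by the bit width of the arbitration module, multiplied by the number of reordering modules into which the reordering cache is divided, may equal the number of pieces of data in a data packet.
For example, when the output bit width of the arbitration module is W, one reordering module is enough if each reordering module has a bit width of 4W, and four reordering modules are needed if each has a bit width of W.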
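The relationship between module width, module count, and fragments per read command can be written as a one-line calculation. The helper below is a hypothetical illustration that solves the stated equality for the number of reordering modules and rejects combinations that do not divide evenly.

```python
def reorder_modules_needed(k, module_width, arbiter_width):
    """Number of reordering modules such that
    (module_width / arbiter_width) * modules == k,
    where k is the number of instruction fragments per read command."""
    if module_width % arbiter_width:
        raise ValueError("module width must be a multiple of the arbiter width")
    multiple = module_width // arbiter_width
    if k % multiple:
        raise ValueError("k must be a multiple of the width ratio")
    return k // multiple
```

With K = 4 and an arbiter width W = 128, a module width of 4W gives 1 module, 2W gives 2 modules (the arrangement of FIG. 8), and W gives 4 modules.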
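In addition, in the technical solutions of the above embodiments, the reordering cache starts reading data out only after a whole data packet has been collected. In other optional implementations, reading from the reordering cache need not wait until all the read data for a read command has returned. As soon as the token confirms that a piece of data in the reordering cache is the earliest piece in order among the data packets not yet returned to the port, that piece can be output to the corresponding port.
For example, suppose there are 5 read commands under some ID, corresponding to 5 data packets, and the first data packet has already been sent to the port in full. The data of the second packet should be returned in the order 2-1, 2-2, 2-3, 2-4. Suppose the reordering cache receives 2-3 first. Because other data precedes 2-3, it cannot be output immediately. The next piece received is 2-1; since 2-1 has the highest order priority, it can be returned at once without waiting for 2-2 and 2-4 to arrive. In this case, the interval between read operations on the reordering cache can follow the interval at which the storage device returns BURST read data, which further reduces storage resources.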
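The early-release variant described above can be illustrated with a small model that outputs a piece of data as soon as it is the next one in order. This is a sketch for a single data packet; the name `stream_in_order` and the log format are hypothetical.

```python
def stream_in_order(arrivals):
    """Release data as soon as it is next in order within its packet.

    arrivals: 1-based positions of one packet's data, in arrival order.
    Returns a list of (arrived_position, [positions released at that moment]).
    """
    held = set()
    next_pos = 1
    log = []
    for pos in arrivals:
        held.add(pos)
        released = []
        # Drain every held piece that is now at the front of the order.
        while next_pos in held:
            held.remove(next_pos)
            released.append(next_pos)
            next_pos += 1
        log.append((pos, released))
    return log
```

For the arrival order 2-3, 2-1, 2-2, 2-4, the model holds 2-3, releases 2-1 on arrival, releases 2-2 together with the held 2-3, and finally releases 2-4.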
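The above describes an implementation of a single-channel controller. Optionally, the controller may also be a multi-channel controller, in which there may be multiple reordering caches. Optionally, the number of reordering caches may equal the number of channels of the controller. Note that the number of channels of the controller means the number of parallel channels of the controller's scheduling module, which is not necessarily equal to the number of ports of the controller.
In a multi-channel controller, each reordering cache can be multiplexed by at least two ports, which effectively saves the controller's storage resources. The use of reordering caches in the multi-channel case is described below using a dual-channel controller as an example.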
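FIG. 11 is a schematic structural diagram of another controller for communicating with a storage device provided by an embodiment of this application. As shown in FIG. 11, the controller may include a scheduling module 113, two reordering caches 112, and multiple ports 111;
the scheduling module 113 is configured to obtain data from the storage device 114;
the reordering cache 112 is configured to reorder the data obtained by the scheduling module 113 and transmit it to the corresponding port 111;
the port 111 is configured to send the data obtained from the reordering cache 112 to the corresponding processing device;
wherein each reordering cache 112 is multiplexed by the multiple ports 111.
Optionally, the reordering cache 112 is specifically configured to: receive the data sent by the scheduling module 113, determine the port 111 to which the data corresponds, and send the data to that port 111 in the reordered order, so that multiple ports 111 multiplex the reordering cache 112.
Optionally, the scheduling module 113 may include:
two instruction caches 1132, each for storing and splitting read commands;
two instruction execution modules 1133, each for sending the split read commands to the storage device 114 so that the storage device 114 executes them;
two arbitration modules 1131, each for obtaining the data that the storage device 114 returns in response to the split read commands.
The duplicated instruction caches 1132, instruction execution modules 1133, and arbitration modules 1131 implement dual-channel data transmission. For the specific functions and principles of each instruction cache 1132, instruction execution module 1133, and arbitration module 1131, see the implementations of FIG. 8 to FIG. 10; details are not repeated here.
As in the single-channel case, if the bit width of the storage device 114 is D, its burst length is BL, and the depth of each instruction cache 1132 is M, then each reordering cache 112 has a storage size of M*BL*D and the two reordering caches 112 together have a storage size of 2*M*BL*D, where D, BL, and M are all positive integers.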
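As shown in FIG. 11, each reordering cache 112 may include a first reordering module 1121 and a second reordering module 1122, each with a bit width twice the output bit width of one arbitration module 1131.
The input bit width of each port 111 may equal the combined bit width of the at least one reordering cache 112. When there are two reordering caches 112, the input bit width of each port 111 may be twice the output bit width of one reordering cache 112.
Optionally, the port 111 includes a data storage module (DM) 1111 and a cache storage module (BM). The data storage module 1111 is connected to the processing device, and its output bit width equals the input bit width of the processing device.
The cache storage module is connected to the reordering caches 112. Its input bit width equals the combined bit width of the two reordering caches 112, and its output bit width equals the input bit width of the data storage module 1111.
Specifically, each cache storage module includes a first cache storage module (first BM) 1112, a second cache storage module (second BM) 1113, a third cache storage module (third BM) 1114, and a fourth cache storage module (fourth BM) 1115, so that four pieces of data can be obtained at the same time, improving transmission efficiency.
In this embodiment, each of the two reordering caches 112 includes a first reordering module 1121 and a second reordering module 1122, so the controller has four reordering modules in total. Each port 111 includes four cache storage modules, and the four reordering modules are connected to the four cache storage modules.
The trapezoid in FIG. 11 represents the connection between the two reordering caches 112 and the four cache storage modules of each port; any connection that implements the data transmission function is allowed. In practice, the specific connection between the four cache storage modules of each port and the four reordering modules can be set according to actual needs, which is not limited in this embodiment.
Optionally, the depth of the cache storage module may be 8 or 16. In each cache storage module, the input bit widths of the first cache storage module 1112, the second cache storage module 1113, the third cache storage module 1114, and the fourth cache storage module 1115, and the output bit widths of the first reordering module 1121 and the second reordering module 1122, are all equal, so that data is transferred quickly and accurately.
In dual-channel mode, data transmission proceeds much as in single-channel mode. Specifically, if the two arbitration modules 1131 do not return the last piece of data of a packet at the same time, the process is like the single-channel case: the first reordering module 1121 and the second reordering module 1122 return data to the port 111 following their own flows, and details are not repeated here.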
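If the two arbitration modules 1131 return the last piece of data at the same time, the reordering caches 112 can output the data packets according to how the data is stored. This embodiment provides two storage schemes: in one, the pieces of data of a data packet are stored in the same reordering cache 112; in the other, they are spread across the two reordering caches 112. Each is described below.
FIG. 12 is a first schematic diagram of the data transmission principle of the controller shown in FIG. 11. In the solution of FIG. 12, data output by an arbitration module 1131 that belongs to the same read command is stored alternately, in the reordered order, in the first reordering module 1121 and the second reordering module 1122 of one reordering cache 112.
For example, suppose the read command is split into K instruction fragments, K being a positive integer, and K pieces of data are obtained from them. Among these K pieces of data, the odd-numbered pieces can be stored in the first reordering module 1121 of one reordering cache 112 and the even-numbered pieces in the second reordering module 1122 of that reordering cache 112.
After either of the two reordering caches 112 has collected the K pieces of data for a read command, that reordering cache 112 sends the K pieces of data to the corresponding port 111.
If the two reordering caches 112 simultaneously complete the K pieces of data for a first read command and the K pieces of data for a second read command, and both read commands correspond to the same port 111, it is determined whether the first and second read commands belong to the same ID. If they do, the corresponding data is returned according to the order of the first and second read commands.
Optionally, according to the order of the first and second read commands, the reordering caches 112 can send the data of the earlier read command to the first cache storage module 1112 and the second cache storage module 1113, and at the same time send the data of the other read command to the third cache storage module 1114 and the fourth cache storage module 1115.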
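Suppose data packet 1 of a read command contains four pieces of data numbered 1-1, 1-2, 1-3, and 1-4. Then data 1-1 and 1-3, originally in positions 1 and 3, can be stored in the first reordering module 1121 of the first reordering cache 112, and data 1-2 and 1-4, originally in positions 2 and 4, in its second reordering module 1122.
Suppose data packet 2 of another read command contains four pieces of data numbered 2-1, 2-2, 2-3, and 2-4. Then data 2-1 and 2-3, originally in positions 1 and 3, can be stored in the first reordering module 1121 of the second reordering cache 112, and data 2-2 and 2-4, originally in positions 2 and 4, in its second reordering module 1122.
If the two pieces of data returned simultaneously by the two arbitration modules 1131 are each the last piece of their own data packet, the two reordering caches 112 complete two data packets at the same time. These two packets may belong to the same port 111 or to different ports 111.
Suppose data packets 1 and 2 both belong to the same port 111. If they have the same ID, the order in which they must be returned has to be determined. The tokens attached to the data can be used to decide whether packet 1 or packet 2 is returned first, and the packets are sent to the corresponding port 111 in that order. The arrows in FIG. 12 show the reordering modules sending data to the corresponding cache storage modules.
Optionally, a data packet stored in the first cache storage module 1112 and the second cache storage module 1113 can take precedence over a data packet stored in the third cache storage module 1114 and the fourth cache storage module 1115. If packet 1 precedes packet 2, packet 1 is sent to the first cache storage module 1112 and the second cache storage module 1113, and packet 2 is sent to the third cache storage module 1114 and the fourth cache storage module 1115. When the port 111 returns data to the processing device, it can then return packet 1 before packet 2, preserving the transmission order.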
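The decision made when two data packets complete at the same time can be sketched as follows. This is an illustrative model only: the packet dictionaries, the key names, and the group labels "BM1+BM2" and "BM3+BM4" are hypothetical. The text does not specify an order for two packets with different IDs on the same port, so the sketch keeps the argument order in that case, and it assumes a packet going alone to its own port uses the first pair of cache storage modules.

```python
def dispatch(packet_a, packet_b):
    """Assign two simultaneously completed packets to cache storage modules.

    Each packet is a dict with keys 'port', 'axi_id', 'seq' (issue order of
    its read command within the ID) and 'name'.
    Returns {port: {bm_group: packet_name}}.
    """
    if packet_a["port"] != packet_b["port"]:
        # Different ports: each packet is written to its own port.
        # Using the first BM pair here is an assumption of this sketch.
        return {p["port"]: {"BM1+BM2": p["name"]} for p in (packet_a, packet_b)}

    first, second = packet_a, packet_b
    same_id = packet_a["axi_id"] == packet_b["axi_id"]
    if same_id and packet_b["seq"] < packet_a["seq"]:
        # Same ID: the packet of the earlier read command must come first.
        first, second = packet_b, packet_a
    return {first["port"]: {"BM1+BM2": first["name"], "BM3+BM4": second["name"]}}
```

Because the port drains BM1 and BM2 before BM3 and BM4 in this model, placing the earlier packet in the first pair is what preserves the same-ID return order.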
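The data is spread across four reordering modules so that all of it can be read out before the next new data from the arbitration modules 1131 is written. Since a data packet contains 4 pieces of data in total, the reads of packet 1 and packet 2 can be completed within exactly 4 controller clock cycles.
If packet 1 and packet 2 do not belong to the same port 111, each is simply written to its own port 111 as in single-channel mode.
FIG. 13 is a second schematic diagram of the data transmission principle of the controller shown in FIG. 11. In the solution of FIG. 13, data output by the arbitration modules 1131 that belongs to the same read command is stored alternately, in the reordered order, in the first reordering modules 1121 and second reordering modules 1122 of the two reordering caches 112.
Suppose a read command is split into K instruction fragments, K being a positive integer. Among the K pieces of data corresponding to the K instruction fragments, the (i+1)th piece is stored in the first reordering module 1121 of one reordering cache 112, the (i+2)th piece in the second reordering module 1122 of that reordering cache 112, the (i+3)th piece in the first reordering module 1121 of the other reordering cache 112, and the (i+4)th piece in the second reordering module 1122 of the other reordering cache 112, where i is 0 or a multiple of 4.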
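The four-way placement rule above reduces to modular arithmetic on the position of each piece of data within its packet. The helper below is a hypothetical illustration that returns 0-based indices for the reordering cache and for the reordering module inside it.

```python
def target_module(position):
    """Map a 1-based position within a packet to (cache_index, module_index).

    cache_index:  0 for the first reordering cache, 1 for the other one.
    module_index: 0 for the first reordering module, 1 for the second.
    Positions i+1..i+4 (i = 0, 4, 8, ...) cycle through the four modules.
    """
    slot = (position - 1) % 4
    return slot // 2, slot % 2
```

For a four-piece packet this places 1-1 and 1-2 in the first and second modules of the first reordering cache and 1-3 and 1-4 in the first and second modules of the second reordering cache; a fifth piece would wrap around to the first module of the first cache.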
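Similarly, after the K pieces of data for a read command have been collected in the two reordering caches 112, the two reordering caches 112 send the K pieces of data to the corresponding port 111.
If the two reordering caches 112 simultaneously complete the K pieces of data for a first read command and the K pieces of data for a second read command, and both read commands correspond to the same port 111, it is determined whether the first and second read commands belong to the same ID. If they do, the corresponding data is returned according to the order of the first and second read commands.
Optionally, according to the order of the first and second read commands, the reordering caches 112 can send the data of the earlier read command to the first cache storage module 1112 and the second cache storage module 1113, and at the same time send the data of the other read command to the third cache storage module 1114 and the fourth cache storage module 1115.
As shown in FIG. 13, suppose data packet 1 of a read command contains four pieces of data numbered 1-1, 1-2, 1-3, and 1-4. Then data 1-1 can be stored in the first reordering module 1121 of the first reordering cache 112, data 1-2 in the second reordering module 1122 of the first reordering cache 112, data 1-3 in the first reordering module 1121 of the second reordering cache 112, and data 1-4 in the second reordering module 1122 of the second reordering cache 112.
Suppose data packet 2 of another read command contains four pieces of data numbered 2-1, 2-2, 2-3, and 2-4. Then data 2-1 can be stored in the first reordering module 1121 of the first reordering cache 112, data 2-2 in the second reordering module 1122 of the first reordering cache 112, data 2-3 in the first reordering module 1121 of the second reordering cache 112, and data 2-4 in the second reordering module 1122 of the second reordering cache 112.
If the two pieces of data returned simultaneously by the two arbitration modules 1131 are each the last piece of their own data packet, the two reordering caches 112 complete two data packets at the same time. These two packets may belong to the same port 111 or to different ports 111.
Suppose data packets 1 and 2 both belong to the same port 111. If they have the same ID, the order in which they must be returned has to be determined. The tokens attached to the data can be used to decide whether packet 1 or packet 2 is returned first, and the packets are sent to the corresponding port 111 in that order.
The arrows in FIG. 13 show the reordering modules sending data to the corresponding cache storage modules. Optionally, a data packet stored in the first cache storage module 1112 and the second cache storage module 1113 can take precedence over a data packet stored in the third cache storage module 1114 and the fourth cache storage module 1115. If packet 1 precedes packet 2, packet 1 is sent to the first cache storage module 1112 and the second cache storage module 1113, and packet 2 is sent to the third cache storage module 1114 and the fourth cache storage module 1115. When the port 111 returns data to the processing device, it can then return packet 1 before packet 2, preserving the transmission order.
The data is spread across four reordering modules so that all of it can be read out before the next new data from the arbitration modules 1131 is written. Since a data packet contains 4 pieces of data in total and each piece takes two clock cycles to transfer, the reads of packet 1 and packet 2 can be completed within exactly 4 controller clock cycles.
If packet 1 and packet 2 do not belong to the same port 111, each is simply written to its own port 111 as in single-channel mode.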
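In practice, the storage size of the reordering cache 112 can be determined from the parameters of the other components. Suppose the output data bit width of the arbitration module 1131 is W, the depth of the instruction cache 1132 is M, the burst length of the storage device 114 is BL, the combined bit width of the storage device 114 is D, and the controller has N ports 111. Then each reordering cache 112 has size M*BL*D.
With the solution of FIG. 7, the total storage needed for reordering caches is 2*N*M*BL*D. With the solution of FIG. 11, the reordering caches 112 need 2*M*BL*D, and each port 111 adds cache storage modules of size 4*8*2*W, for a total of N*4*8*2*W+2*M*BL*D. Compared with the solution of FIG. 7, the solution of FIG. 11 therefore saves 2*(N-1)*M*BL*D-N*4*8*2*W of storage.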
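Table 2 lists the number of bits saved by the solution of FIG. 11 relative to the solution of FIG. 7 for different values of the arbitration module 1131 output data bit width W, the instruction cache 1132 depth M, the storage device 114 burst length BL, the combined storage device 114 bit width D, and the number N of controller ports 111.
Table 2: Resources saved in dual-channel mode
BL | D | W | M | N | Bits saved |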
16 | 32 | 64 | 64 | 8 | 360448 |
16 | 32 | 64 | 64 | 8 | 360448 |
32 | 32 | 64 | 32 | 8 | 360448 |
32 | 32 | 64 | 32 | 8 | 360448 |
32 | 32 | 64 | 64 | 8 | 753664 |
32 | 32 | 64 | 64 | 8 | 753664 |
16 | 32 | 128 | 64 | 8 | 327680 |
16 | 32 | 128 | 64 | 8 | 327680 |
32 | 32 | 128 | 32 | 8 | 327680 |
32 | 32 | 128 | 32 | 8 | 327680 |
32 | 32 | 128 | 64 | 8 | 720896 |
32 | 32 | 128 | 64 | 8 | 720896 |
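As Table 2 shows, for some common controller configurations the solution of FIG. 11 saves a six-digit number of bits compared with the solution of FIG. 7. Having multiple ports 111 share the resources of the reordering caches 112, rather than giving each port 111 its own reordering caches 112, greatly reduces the logic storage resources and lowers the cost of the controller.
The controller of FIG. 11 for communicating with the storage device 114 includes a scheduling module 113, two reordering caches 112, and multiple ports 111. The scheduling module 113 includes two sets of instruction cache 1132, instruction execution module 1133, and arbitration module 1131, enabling dual-channel data transmission. The reordering caches 112 can reorder the data obtained by the scheduling module 113 and transmit it to the corresponding port 111. The multiple ports 111 multiplex the two reordering caches 112, so the reordering caches 112 are shared in dual-channel mode, which reduces the number of reordering caches 112 in the controller and thus the size and cost of the controller.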
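In the technical solutions of the above embodiments, every port has a cache storage module. In other optional implementations, the cache storage module can be omitted and only the data storage module kept, as described in the following embodiments.
FIG. 14 is a schematic structural diagram of yet another controller for communicating with a storage device provided by an embodiment of this application. As shown in FIG. 14, the controller may include a scheduling module 143, a reordering cache 142, and multiple ports;
the scheduling module is configured to obtain data from the storage device 144;
the reordering cache 142 is configured to reorder the data obtained by the scheduling module 143 and transmit it to the corresponding port;
the port is configured to send the data obtained from the reordering cache 142 to the corresponding processing device;
wherein at least two of the multiple ports multiplex the reordering cache 142.
Specifically, the reordering cache may include a first reordering module 1421 and a second reordering module 1422, and the scheduling module 143 may include an instruction cache 1432, an instruction execution module 1433, and an arbitration module 1431. The port may include a data storage module (DM) 141 connected to the reordering cache 142. The input bit width of the data storage module 141 equals the output bit width of the reordering cache 142, and the output bit width of the data storage module 141 equals the input bit width of the processing device.
Optionally, each data storage module 141 may include a FIFO memory whose write depth multiplied by write bit width equals its read depth multiplied by read bit width, ensuring normal data transmission.
The embodiment of FIG. 14 builds on the embodiments of FIG. 8 to FIG. 10 and adjusts the storage modules inside each port: the data storage module 141 receives the data sent by the reordering cache 142 directly. The structure, functions, implementation principles, and technical effects of the other components are similar to those of the embodiments of FIG. 8 to FIG. 10 and are not repeated here.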
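FIG. 15 is a schematic structural diagram of still another controller for communicating with a storage device provided by an embodiment of this application. As shown in FIG. 15, the controller may include a scheduling module 153, two reordering caches 152, and multiple ports;
the scheduling module 153 is configured to obtain data from the storage device 154;
the reordering cache 152 is configured to reorder the data obtained by the scheduling module 153 and transmit it to the corresponding port;
the port is configured to send the data obtained from the reordering cache 152 to the corresponding processing device;
wherein each reordering cache 152 is multiplexed by the multiple ports.
Specifically, each reordering cache 152 may include a first reordering module 1521 and a second reordering module 1522, and the scheduling module 153 may include two instruction caches 1532, two instruction execution modules 1533, and two arbitration modules 1531.
The port may include a data storage module (DM) 151 connected to the reordering caches 152. The input bit width of the data storage module 151 may be twice the output bit width of one reordering cache 152, and the output bit width of the data storage module 151 equals the input bit width of the processing device.
Optionally, the data storage module 151 of each port may include a FIFO memory whose write depth multiplied by write bit width equals its read depth multiplied by read bit width, ensuring normal data transmission.
The embodiment of FIG. 15 builds on the embodiments of FIG. 11 to FIG. 13 and adjusts the storage modules inside each port: the data storage module 151 receives the data sent by the reordering caches 152 directly. The structure, functions, implementation principles, and technical effects of the other components are similar to those of the embodiment of FIG. 11 and are not repeated here.
On the basis of the technical solutions of the above embodiments, a control unit can be provided in the reordering cache to control the data transmission process, for example to determine the order of data from the tokens. It can be understood that the storage size of the reordering cache referred to in the embodiments of this application can mean the size of the storage, excluding the control unit, used to hold the data to be sent to the processing devices.
The above describes the implementation principles of the embodiments of this application using a DDR controller as an example. Those skilled in the art will understand that the controller may also be another type of controller provided with a reordering cache; the implementation principle is similar to that of the DDR controller and is not repeated here.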
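An embodiment of this application also provides a data transmission method. FIG. 16 is a schematic flowchart of the method. The method can be applied to a controller that communicates with a storage device; the controller includes a scheduling module, at least one reordering cache, and multiple ports. As shown in FIG. 16, the data transmission method may include:
161. The scheduling module obtains data from the storage device.
162. The reordering cache reorders the data obtained by the scheduling module and transmits it to the corresponding port, wherein at least two of the multiple ports multiplex one reordering cache.
163. The port sends the data obtained from the reordering cache to the corresponding processing device.
In an optional implementation, reordering the data obtained by the scheduling module and transmitting it to the corresponding port includes:
receiving the data sent by the scheduling module;
determining the port to which the data corresponds;
sending the data to the corresponding port in the reordered order.
In an optional implementation, the scheduling module includes an instruction cache, an instruction execution module, and an arbitration module; the scheduling module obtaining data from the storage device includes:
the instruction cache stores and splits read commands;
the instruction execution module sends the split read commands to the storage device so that the storage device executes them;
the arbitration module obtains the data that the storage device returns in response to the split read commands.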
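In an optional implementation, the reordering cache includes a first reordering module and a second reordering module; receiving the data sent by the scheduling module includes:
receiving data belonging to the same read command sent by the arbitration module in the scheduling module, and storing the received data alternately in the first reordering module and the second reordering module according to the reordered order.
In an optional implementation, the reordering cache includes a first reordering module and a second reordering module; the read command is split into K instruction fragments, K being a positive integer; receiving the data sent by the scheduling module includes:
receiving the K pieces of data corresponding to the K instruction fragments sent by the arbitration module in the scheduling module; among the K pieces of data, the odd-numbered pieces are stored in the first reordering module and the even-numbered pieces in the second reordering module.
In an optional implementation, sending the data to the corresponding port in the reordered order includes:
after the K pieces of data corresponding to the read command have been collected in the reordering cache, sending the K pieces of data to the corresponding port in the reordered order.
In an optional implementation, the port includes a data storage module and a cache storage module; the cache storage module includes a first cache storage module and a second cache storage module; sending the K pieces of data to the corresponding port in the reordered order includes:
the first reordering module in the reordering cache sends part of the K pieces of data to the first cache storage module, and the second reordering module in the reordering cache sends the other pieces of the K pieces of data to the second cache storage module;
the first cache storage module and the second cache storage module send the K pieces of data to the data storage module.
In an optional implementation, the controller is a dual-channel controller, and there are two each of the reordering cache, instruction cache, instruction execution module, and arbitration module; each reordering cache includes a first reordering module and a second reordering module.
In an optional implementation, receiving the data sent by the scheduling module includes:
receiving data belonging to the same read command sent by an arbitration module in the scheduling module, and storing the received data alternately in the first reordering module and the second reordering module of one of the reordering caches.
In an optional implementation, receiving the data sent by the scheduling module includes:
receiving data belonging to the same read command sent by an arbitration module in the scheduling module, and storing the received data alternately, in the reordered order, in the first reordering modules and second reordering modules of the two reordering caches.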
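In an optional implementation, the read command is split into K instruction fragments, K being a positive integer;
receiving the data sent by the scheduling module includes:
receiving the K pieces of data corresponding to the K instruction fragments sent by the arbitration module in the scheduling module;
wherein, among the K pieces of data, the (i+1)th piece is stored in the first reordering module of one reordering cache, the (i+2)th piece in the second reordering module of that reordering cache, the (i+3)th piece in the first reordering module of the other reordering cache, and the (i+4)th piece in the second reordering module of the other reordering cache, where i is 0 or a multiple of 4.
In an optional implementation, sending the data to the corresponding port in the reordered order includes:
after the multiple pieces of data corresponding to a read command have been collected in the two reordering caches, sending the multiple pieces of data to the corresponding port in the reordered order.
In an optional implementation, after the multiple pieces of data corresponding to a read command have been collected in the two reordering caches, sending the multiple pieces of data to the corresponding port in the reordered order includes:
if the two reordering caches simultaneously complete the multiple pieces of data for a first read command and the multiple pieces of data for a second read command, determining whether the first read command and the second read command belong to the same ID;
if the first read command and the second read command belong to the same ID, returning the corresponding data according to the order of the first read command and the second read command.
In an optional implementation, the port includes a data storage module and a cache storage module; each cache storage module includes a first cache storage module, a second cache storage module, a third cache storage module, and a fourth cache storage module; if the first read command and the second read command belong to the same ID, returning the corresponding data according to the order of the first read command and the second read command includes:
if the first read command and the second read command belong to the same ID, sending, according to the order of the first read command and the second read command, the data of the earlier read command to the first cache storage module and the second cache storage module, and at the same time sending the data of the other read command to the third cache storage module and the fourth cache storage module.
In an optional implementation, the storage device is a DDR device and the controller is a DDR controller.
In an optional implementation, the processing device includes at least one of the following: an application processor, a digital signal processor, a graphics processing unit, and a multimedia processor.
The method in this embodiment can be implemented on the controller described in any of the above embodiments. For the specific implementation principles, processes, and beneficial effects of the method, see the above embodiments; details are not repeated here.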
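An embodiment of this application also provides a storage device access system, including the controller described in any of the above embodiments, a storage device, and multiple processing devices.
The system may be an SoC (System on Chip). An SoC is an integrated chip that takes an embedded system as its core, is based on IP reuse technology, integrates software and hardware, and aims to include as much of the product system as possible.
The structure of the storage device access system provided by the embodiments of this application can be implemented with reference to FIG. 1. For the structure, functions, implementation principles, and technical effects of the components of the storage device access system, see the foregoing embodiments; details are not repeated here.
An embodiment of this application also provides an electronic device, including the storage device access system described above. Optionally, the device may be any one of the following: an unmanned aerial vehicle, an unmanned vehicle, a gimbal, a camera, and so on.
For the structure, functions, implementation principles, and technical effects of the components of the electronic device provided by the embodiments of this application, see the foregoing embodiments; details are not repeated here.
An embodiment of this application also provides a computer-readable storage medium including instructions which, when run on a computer, cause the computer to execute the method described in any of the above embodiments.
The technical solutions and technical features of the above embodiments may be used alone or in combination provided they do not conflict with one another. As long as they do not go beyond the knowledge of those skilled in the art, they are equivalent embodiments within the scope of protection of this application.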
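In the several embodiments provided in this application, it should be understood that the disclosed related remote control device and method can be implemented in other ways. For example, the remote control device embodiments described above are merely illustrative. The division of the modules or units is only a division by logical function, and in actual implementation there may be other divisions; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented. In addition, the mutual coupling, direct coupling, or communication connection displayed or discussed may be an indirect coupling or communication connection through some interfaces, remote control devices, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer processor to execute all or part of the steps of the methods described in the embodiments of this application. The aforementioned storage media include USB flash drives, removable hard disks, read-only memory (ROM), random access memory (RAM), magnetic disks, optical disks, and other media that can store program code.
The above are only embodiments of this application and do not limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of this application, or any direct or indirect application in other related technical fields, is likewise included in the scope of patent protection of this application.
Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they can still modify the technical solutions recorded in the foregoing embodiments or equivalently replace some or all of the technical features, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of this application.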
Claims (56)
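- A controller for communicating with a storage device, characterized by comprising: a scheduling module, at least one reordering buffer, and a plurality of ports; the scheduling module is configured to obtain data from the storage device; the reordering buffer is configured to reorder the data obtained by the scheduling module and transmit the reordered data to the corresponding port; the port is configured to send the data obtained from the reordering buffer to the corresponding processing device; wherein at least two of the plurality of ports share one reordering buffer.
- The controller according to claim 1, characterized in that the number of reordering buffers is one; all of the plurality of ports share the reordering buffer.
- The controller according to claim 1, characterized in that the number of reordering buffers is at least two; some of the plurality of ports share one of the reordering buffers, and the other ports use another reordering buffer.
- The controller according to claim 1, characterized in that the input bit width of each port is equal to the combined bit width of the at least one reordering buffer.
- The controller according to claim 1, characterized in that the reordering buffer is specifically configured to: receive the data sent by the scheduling module, determine the port corresponding to the data, and send the data to the corresponding port in the reordered order.
- The controller according to claim 1, characterized in that the scheduling module comprises: an instruction buffer configured to store and split read commands; an instruction execution module configured to send the split read commands to the storage device so that the storage device executes the split read commands; and an arbitration module configured to obtain the data returned by the storage device according to the split read commands.
- The controller according to claim 6, characterized in that the bit width of the storage device is D, the burst length of the storage device is BL, and the depth of the instruction buffer is M; the storage space of the reordering buffer is M*BL*D; wherein D, BL and M are all positive integers.
- The controller according to claim 6, characterized in that the reordering buffer is connected to the arbitration module; the input bit width of the reordering buffer is greater than the output bit width of the arbitration module.
- The controller according to claim 6, characterized in that the reordering buffer comprises at least one reordering module; the ratio of the bit width of the reordering module to the bit width of the arbitration module, multiplied by the number of reordering modules, is equal to the number of instruction fragments contained in a read command.
- The controller according to claim 6, characterized in that the reordering buffer comprises a first reordering module and a second reordering module; the bit width of each of the first reordering module and the second reordering module is twice the output bit width of the arbitration module.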
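- The controller according to claim 10, characterized in that the port comprises a data storage module, and the data storage module is connected to the reordering buffer; the input bit width of the data storage module is equal to the output bit width of the reordering buffer, and the output bit width of the data storage module is equal to the input bit width of the processing device.
- The controller according to claim 11, characterized in that the port comprises a data storage module and a buffer storage module; the data storage module is configured to be connected to the processing device, and the output bit width of the data storage module is equal to the input bit width of the processing device; the data storage module is connected to the reordering buffer through the buffer storage module, the input bit width of the buffer storage module is equal to the output bit width of the reordering buffer, and the output bit width of the buffer storage module is equal to the input bit width of the data storage module.
- The controller according to claim 12, characterized in that the buffer storage module comprises a first buffer storage module and a second buffer storage module.
- The controller according to claim 13, characterized in that the input bit width of the first buffer storage module, the input bit width of the second buffer storage module, the output bit width of the first reordering module, and the output bit width of the second reordering module are all equal.
- The controller according to claim 12, characterized in that the depth of the buffer storage module is 8 or 16 bits.
- The controller according to claim 14, characterized in that data belonging to the same read command that is obtained by the reordering buffer is stored alternately, in the reordered order, in the first reordering module and the second reordering module.
- The controller according to claim 14, characterized in that the read command is split into K instruction fragments, K being a positive integer; wherein, among the K data corresponding to the K instruction fragments, the odd-numbered data are stored in the first reordering module and the even-numbered data are stored in the second reordering module.
- The controller according to claim 17, characterized in that, after the K data corresponding to the read command have all been gathered in the reordering buffer, the reordering buffer sends the K data to the corresponding port.
- The controller according to claim 18, characterized in that the first reordering module in the reordering buffer is configured to send data to the first buffer storage module of the corresponding port, and the second reordering module sends data to the second buffer storage module of the corresponding port.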
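- The controller according to claim 1, characterized in that the controller is a multi-channel controller and there are a plurality of reordering buffers; wherein each reordering buffer is shared by at least two ports.
- The controller according to claim 20, characterized in that the controller is a dual-channel controller and comprises two of the reordering buffers.
- The controller according to claim 21, characterized in that the scheduling module comprises: two instruction buffers, each configured to store and split read commands; two instruction execution modules, each configured to send the split read commands to the storage device so that the storage device executes the split read commands; and two arbitration modules, each configured to obtain the data returned by the storage device according to the split read commands.
- The controller according to claim 22, characterized in that the bit width of the storage device is D, the burst length of the storage device is BL, and the depth of each instruction buffer is M; the storage space of the two reordering buffers is 2*M*BL*D; wherein D, BL and M are all positive integers.
- The controller according to claim 22, characterized in that each reordering buffer comprises a first reordering module and a second reordering module; the bit width of each of the first reordering module and the second reordering module is twice the output bit width of one arbitration module.
- The controller according to claim 24, characterized in that the port comprises a data storage module, and the data storage module is connected to the reordering buffers; the input bit width of the data storage module is equal to twice the output bit width of one reordering buffer, and the output bit width of the data storage module is equal to the input bit width of the processing device.
- The controller according to claim 24, characterized in that the port comprises a data storage module and a buffer storage module; the data storage module is configured to be connected to the processing device, and the output bit width of the data storage module is equal to the input bit width of the processing device; the data storage module is connected to the reordering buffers through the buffer storage module, the input bit width of the buffer storage module is equal to the combined bit width of the two reordering buffers, and the output bit width of the buffer storage module is equal to the input bit width of the data storage module.
- The controller according to claim 26, characterized in that each buffer storage module comprises a first buffer storage module, a second buffer storage module, a third buffer storage module and a fourth buffer storage module.
- The controller according to claim 27, characterized in that, in each buffer storage module, the input bit width of the first buffer storage module, the input bit width of the second buffer storage module, the input bit width of the third buffer storage module, the input bit width of the fourth buffer storage module, the output bit width of the first reordering module, and the output bit width of the second reordering module are all equal.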
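- The controller according to claim 27, characterized in that data belonging to the same read command that is output from the arbitration module is stored alternately, in the reordered order, in the first reordering module and the second reordering module of one reordering buffer.
- The controller according to claim 27, characterized in that data belonging to the same read command that is output from the arbitration module is stored alternately, in the reordered order, in the first reordering modules and the second reordering modules of the two reordering buffers.
- The controller according to claim 30, characterized in that the read command is split into K instruction fragments, K being a positive integer; wherein, among the K data corresponding to the K instruction fragments, the (i+1)-th data is stored in the first reordering module of one reordering buffer, the (i+2)-th data is stored in the second reordering module of that reordering buffer, the (i+3)-th data is stored in the first reordering module of the other reordering buffer, and the (i+4)-th data is stored in the second reordering module of the other reordering buffer, where i is 0 or a multiple of 4.
- The controller according to claim 27, characterized in that, after the plurality of data corresponding to one read command have all been gathered in the two reordering buffers, the two reordering buffers send the plurality of data to the corresponding port.
- The controller according to claim 27, characterized in that, if the plurality of data corresponding to a first read command and the plurality of data corresponding to a second read command are gathered in the two reordering buffers at the same time, and the first read command and the second read command correspond to the same port, it is determined whether the first read command and the second read command belong to the same ID; if the first read command and the second read command belong to the same ID, the corresponding data are returned according to the order of the first read command and the second read command.
- The controller according to claim 33, characterized in that the reordering buffers, according to the order of the first read command and the second read command, send the data corresponding to the earlier read command to the first buffer storage module and the second buffer storage module, and at the same time send the data corresponding to the other read command to the third buffer storage module and the fourth buffer storage module.
- The controller according to any one of claims 1-34, characterized in that the storage device is a DDR device and the controller is a DDR controller.
- The controller according to any one of claims 1-34, characterized in that the processing device includes at least one of the following: an application processor, a digital signal processor, a graphics processing unit, and a multimedia processor.
- A storage device access system, characterized by comprising: the controller according to any one of claims 1-36, a storage device, and a plurality of processing devices.
- An electronic device, characterized by comprising: the storage device access system according to claim 37.
- An electronic device, characterized in that the device is any one of the following: an unmanned aerial vehicle, an unmanned vehicle, a gimbal, a camera.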
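- A data transmission method, characterized in that the method is applied to a controller, the controller is configured to communicate with a storage device, and the controller comprises a scheduling module, at least one reordering buffer, and a plurality of ports, the method comprising: the scheduling module obtaining data from the storage device; the reordering buffer reordering the data obtained by the scheduling module and transmitting the reordered data to the corresponding port; the port sending the data obtained from the reordering buffer to the corresponding processing device; wherein at least two of the plurality of ports share one reordering buffer.
- The method according to claim 40, characterized in that reordering the data obtained by the scheduling module and transmitting the reordered data to the corresponding port comprises: receiving the data sent by the scheduling module; determining the port corresponding to the data; and sending the data to the corresponding port in the reordered order.
- The method according to claim 41, characterized in that the scheduling module comprises an instruction buffer, an instruction execution module and an arbitration module; the scheduling module obtaining data from the storage device comprises: the instruction buffer storing and splitting read commands; the instruction execution module sending the split read commands to the storage device so that the storage device executes the split read commands; and the arbitration module obtaining the data returned by the storage device according to the split read commands.
- The method according to claim 42, characterized in that the reordering buffer comprises a first reordering module and a second reordering module; receiving the data sent by the scheduling module comprises: receiving data belonging to the same read command that is sent by the arbitration module in the scheduling module, and storing the received data alternately, in the reordered order, in the first reordering module and the second reordering module.
- The method according to claim 42, characterized in that the reordering buffer comprises a first reordering module and a second reordering module; the read command is split into K instruction fragments, K being a positive integer; receiving the data sent by the scheduling module comprises: receiving the K data corresponding to the K instruction fragments that are sent by the arbitration module in the scheduling module; among the K data, the odd-numbered data are stored in the first reordering module and the even-numbered data are stored in the second reordering module.
- The method according to claim 44, characterized in that sending the data to the corresponding port in the reordered order comprises: after the K data corresponding to the read command have all been gathered in the reordering buffer, sending the K data to the corresponding port in the reordered order.
- The method according to claim 45, characterized in that the port comprises a data storage module and a buffer storage module; the buffer storage module comprises a first buffer storage module and a second buffer storage module; sending the K data to the corresponding port in the reordered order comprises: the first reordering module in the reordering buffer sending part of the K data to the first buffer storage module, and the second reordering module in the reordering buffer sending the rest of the K data to the second buffer storage module; and the first buffer storage module and the second buffer storage module sending the K data to the data storage module.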
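- The method according to claim 42, characterized in that the controller is a dual-channel controller, and the numbers of reordering buffers, instruction buffers, instruction execution modules and arbitration modules are each two; each reordering buffer comprises a first reordering module and a second reordering module.
- The method according to claim 47, characterized in that receiving the data sent by the scheduling module comprises: receiving data belonging to the same read command that is sent by an arbitration module in the scheduling module, and storing the received data alternately in the first reordering module and the second reordering module of one of the reordering buffers.
- The method according to claim 47, characterized in that receiving the data sent by the scheduling module comprises: receiving data belonging to the same read command that is sent by an arbitration module in the scheduling module, and storing the received data alternately, in the reordered order, in the first reordering modules and the second reordering modules of the two reordering buffers.
- The method according to claim 47, characterized in that the read command is split into K instruction fragments, K being a positive integer; receiving the data sent by the scheduling module comprises: receiving the K data corresponding to the K instruction fragments that are sent by an arbitration module in the scheduling module; wherein, among the K data, the (i+1)-th data is stored in the first reordering module of one reordering buffer, the (i+2)-th data is stored in the second reordering module of that reordering buffer, the (i+3)-th data is stored in the first reordering module of the other reordering buffer, and the (i+4)-th data is stored in the second reordering module of the other reordering buffer, where i is 0 or a multiple of 4.
- The method according to claim 47, characterized in that sending the data to the corresponding port in the reordered order comprises: after the plurality of data corresponding to a read command have all been gathered in the two reordering buffers, sending the plurality of data to the corresponding port in the reordered order.
- The method according to claim 51, characterized in that, after the plurality of data corresponding to a read command have all been gathered in the two reordering buffers, sending the plurality of data to the corresponding port in the reordered order comprises: if the plurality of data corresponding to a first read command and the plurality of data corresponding to a second read command are gathered in the two reordering buffers at the same time, determining whether the first read command and the second read command belong to the same ID; if the first read command and the second read command belong to the same ID, returning the corresponding data according to the order of the first read command and the second read command.
- The method according to claim 52, characterized in that the port comprises a data storage module and a buffer storage module; each buffer storage module comprises a first buffer storage module, a second buffer storage module, a third buffer storage module and a fourth buffer storage module; if the first read command and the second read command belong to the same ID, returning the corresponding data according to the order of the first read command and the second read command comprises: if the first read command and the second read command belong to the same ID, sending, according to the order of the first read command and the second read command, the data corresponding to the earlier read command to the first buffer storage module and the second buffer storage module, and at the same time sending the data corresponding to the other read command to the third buffer storage module and the fourth buffer storage module.
- The method according to claim 40, characterized in that the storage device is a DDR device and the controller is a DDR controller.
- The method according to claim 40, characterized in that the processing device includes at least one of the following: an application processor, a digital signal processor, a graphics processing unit, and a multimedia processor.
- A computer-readable storage medium, characterized by comprising instructions which, when run on a computer, cause the computer to execute the method according to any one of claims 40-55.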
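For illustration only, the placement rules and the storage-size expressions recited in the claims above can be sketched in Python. This is a software model under stated assumptions, not the claimed hardware: the function names `place_single`, `place_dual` and `reorder_storage_bits` are hypothetical, data positions are counted from 1 as in the claims, the two reordering buffers are labelled 0 and 1 arbitrarily, and the size is assumed to be in bits because D is a bit width.

```python
def place_single(n: int) -> int:
    """Single reordering buffer: odd-numbered data go to the first
    reordering module (1), even-numbered data go to the second (2)."""
    return 1 if n % 2 == 1 else 2


def place_dual(n: int) -> tuple:
    """Two reordering buffers: within each group of four data, the
    (i+1)-th and (i+2)-th go to modules 1 and 2 of one buffer, the
    (i+3)-th and (i+4)-th go to modules 1 and 2 of the other buffer,
    where i is 0 or a multiple of 4. Returns (buffer index, module)."""
    pos = (n - 1) % 4          # position inside the current group of four
    return pos // 2, pos % 2 + 1


def reorder_storage_bits(d: int, bl: int, m: int, channels: int = 1) -> int:
    """Storage space M*BL*D for one reordering buffer, or 2*M*BL*D for
    the two buffers of a dual-channel controller (channels=2)."""
    return channels * m * bl * d


single_layout = [place_single(n) for n in range(1, 5)]
dual_layout = [place_dual(n) for n in range(1, 9)]
size_one = reorder_storage_bits(d=32, bl=8, m=16)
size_two = reorder_storage_bits(d=32, bl=8, m=16, channels=2)
```

The values D=32, BL=8 and M=16 are example inputs chosen for the sketch and do not come from the application.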
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
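PCT/CN2020/084635 WO2021207919A1 (zh) | 2020-04-14 | 2020-04-14 | Controller, storage device access system, electronic device and data transmission method |
CN202080004989.7A CN112703489A (zh) | 2020-04-14 | 2020-04-14 | Controller, storage device access system, electronic device and data transmission method |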
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2020/084635 WO2021207919A1 (zh) | 2020-04-14 | 2020-04-14 | Controller, storage device access system, electronic device and data transmission method |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021207919A1 (zh) | 2021-10-21 |
Family
ID=75514818
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/084635 WO2021207919A1 (zh) | Controller, storage device access system, electronic device and data transmission method | 2020-04-14 | 2020-04-14 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN112703489A (zh) |
WO (1) | WO2021207919A1 (zh) |
- 2020
  - 2020-04-14 CN CN202080004989.7A patent/CN112703489A/zh active Pending
  - 2020-04-14 WO PCT/CN2020/084635 patent/WO2021207919A1/zh active Application Filing
Also Published As
Publication number | Publication date |
---|---|
CN112703489A (zh) | 2021-04-23 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 20931531; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 20931531; Country of ref document: EP; Kind code of ref document: A1 |