CN102207916B - Instruction prefetch-based multi-core shared memory control equipment - Google Patents

Publication number
CN102207916B
Authority
CN
China
Prior art keywords
instruction
data
write
read
address
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN 201110141796
Other languages
Chinese (zh)
Other versions
CN102207916A (en)
Inventor
李康
光青
郝跃
雷理
彭毓佳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University
Priority to CN 201110141796
Publication of CN102207916A
Application granted
Publication of CN102207916B
Legal status: Expired - Fee Related
Anticipated expiration

Landscapes

  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The invention discloses an instruction prefetch-based multi-core shared memory control device. The memory control device comprises an access instruction buffer module, an instruction parsing and address decoding module, a data read-write control module, a memory control module and a memory interface module. The data read-write control module controls the transfer of data between the memory and an on-chip multi-core processor. The memory control module generates an instruction prefetch flag signal and fetches the next access instruction from the access instruction buffer module in advance; the next access instruction is pre-decoded by the instruction parsing and address decoding module, and the memory control module dynamically selects an open-page or close-page policy for the memory according to the control information of the prefetched instruction. The memory control device can reduce the delay caused by processor memory accesses, improve data transfer efficiency, and meet the high memory bus throughput required when the processors work in parallel.

Description

A multi-core shared memory control device based on instruction prefetch
Technical field
The present invention relates to a data storage control system and, specifically, to a multi-core shared memory control device based on instruction prefetch.
Background art
Since the 1980s, processor performance has roughly doubled every 18 months in line with Moore's Law, whereas memory access latency has improved by only about 10% every 12 months on average. The resulting gap between processor and memory performance, known as the "memory wall", keeps widening and has become a primary concern in parallel real-time multi-core processing systems.
At present, dynamic random access memory (DRAM) is widely used for data storage, and a DRAM control device is normally used to control reads and writes to the DRAM. The DRAM control device receives read/write requests from the processor and, by parsing the commands, controls the DRAM so that data is written to the DRAM or DRAM data is returned to the processor. Fig. 1 is a block diagram of a traditional multi-core shared DRAM control device 120. As shown in Fig. 1, the DRAM control device comprises: a data FIFO 100, which temporarily holds data written by the processor or data to be returned to the processor; a control module 102, which, according to the type of the processor's access instruction, controls the internal state transitions and the data transfer on the data path, and sends memory control signals to the I/O interface module 108; a data path 104, which controls data transfer, moving data to be written from the data FIFO 100 to the I/O interface module 108 or writing data read from the I/O interface into the data FIFO 100; and the I/O interface module 108, which accepts control signals from the control module 102 and data signals from the data path, and completes command and data transfers with the memory according to the memory's timing specification.
The DRAM memory 110 is a storage array organized in rows and columns, and operations on it must follow strict industry standards. First, a row activation specifies the bank address (Bank ADDR) and the row address (Row ADDR); then, after the row-to-column delay (tRCD), the column address (Column ADDR) is specified. Only at this point is a particular memory location actually selected, and the data appears on the memory bus after the read latency (tCL). If the currently activated row is precharged immediately after the read or write to this address completes, the policy is called close page (Close Page); if the current row is not precharged, it is called open page (Open Page).
At present, DRAM control devices generally adopt a static control policy: after a DRAM read or write completes, the device always closes, or always keeps open, the row just accessed, according to a single fixed choice. How well a static policy works is closely tied to the DRAM addressing pattern. For example, under a static open-page (Open Page) policy, no precharge is issued after a read or write completes, and the accessed row stays open. The static open-page policy is well suited to applications whose DRAM accesses are strongly correlated. However, if the next access addresses another row of the same bank, a page conflict (Page Conflict) occurs: the current row must first be closed, and only then can the new row address and column address be issued. Similarly, a DRAM control device can adopt a static close-page policy, in which a precharge is issued after every read or write to close the current row. The static close-page policy suits applications whose DRAM accesses are weakly correlated. However, if the next read or write targets the same row of the same bank, which is called a page fast hit (Page Fast Hit), the close-page policy still requires the row address and column address to be issued again. A static control policy therefore depends heavily on the memory access pattern of the application; it may conflict with the actual DRAM addressing and so increase the processor's memory access delay. To reduce DRAM access time, some DRAM control devices use prediction to adjust the policy dynamically, but this increases the complexity of the DRAM control device, and a wrong prediction also lengthens memory access time and lowers DRAM bus throughput.
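The trade-off between the two static policies can be illustrated with a small latency model. This is only a sketch under stated assumptions: the function name `read_latency`, the `Timing` class and the cycle counts are hypothetical, the two accesses are assumed to be issued back to back (so a close-page precharge is not hidden), and a different bank is assumed to be idle and precharged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Timing:
    tRP: int   # precharge time, in clock cycles
    tRCD: int  # row activation to column command delay
    tCL: int   # read latency


def read_latency(policy, prev, nxt, t):
    """Cycles before read data appears for access `nxt`, issued right after `prev`.

    `prev` and `nxt` are (bank, row) tuples; `policy` is "open" or "close".
    """
    same_bank = prev[0] == nxt[0]
    same_row = same_bank and prev[1] == nxt[1]
    if policy == "open":
        if same_row:
            return t.tCL                   # row still open: column command only
        if same_bank:
            return t.tRP + t.tRCD + t.tCL  # page conflict: close, reopen, read
        return t.tRCD + t.tCL              # other bank, assumed idle
    if policy == "close":
        if same_bank:
            return t.tRP + t.tRCD + t.tCL  # wait for the precharge, then reopen
        return t.tRCD + t.tCL              # other bank, assumed idle
    raise ValueError(f"unknown policy: {policy!r}")
```

Under these assumptions, a same-row access costs tRP + tRCD more under the close-page policy than under the open-page policy, while a same-bank, different-row access costs the same under both.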
The traditional multi-core shared DRAM control device is connected to the on-chip multi-core processor through an on-chip interconnect, such as the AMBA bus architecture. A parallel real-time processing system generally uses split bus transfers. The traditional AMBA bus divides a bus transfer into two sub-cycles: an address cycle and a data cycle. When an on-chip processor needs to read from the DRAM control device, which acts as a slave, it first sends a request to the bus arbiter. Once the arbiter grants the bus, the processor obtains access to the DRAM control device, enters the address cycle, and sends the control signals and the DRAM address to the DRAM control device over the bus. If the DRAM control device determines that the operation will take a long time, it requests a split transfer from the bus arbiter; the processor gives up the bus, and other processors can then obtain it. When the DRAM control device has completed addressing and readout and the data needed by the processor is ready, it asks the arbiter for a data transfer. After the master processor regains the bus, it initiates a bus transfer again and, through a new address cycle and data cycle, fetches the data read from the DRAM from the data bus. A split transfer thus allows other masters to be granted the bus while a slave performs a long operation. Compared with a conventional bus transfer, it lets bus ownership switch quickly between masters and so reduces bus delay. However, the split transfer process of a traditional shared bus is complex and costs time: the slave must again request a data transfer from the bus arbiter and must then wait until the master regains the bus before the data transfer can be completed in a new bus transfer cycle. This increases the delay with which the master processor accesses the slave DRAM memory, lowers data throughput, and limits parallel processing performance.
Traditional multi-core shared memory control devices are connected to the on-chip multi-core processor by a shared bus. Although they use split bus transfers and internally adopt either a static policy or dynamic prediction, they all suffer from low data throughput and long memory access delay. A real-time parallel multi-core shared system therefore needs a new DRAM control device that addresses these problems.
Summary of the invention
To overcome the above defects of the prior art, the invention provides a multi-core shared memory control device based on instruction prefetch. The memory controller has an interface specification fully consistent with dynamic random access memory and is compatible with the interface and transfer specifications of general-purpose memory devices: synchronous DRAM (SDRAM), double data rate SDRAM (DDR), second-generation DDR SDRAM (DDR2) and third-generation DDR SDRAM (DDR3).
The purpose of the invention is achieved through the following technical solution:
A multi-core shared memory control device based on instruction prefetch comprises an access instruction buffer module, an instruction parsing and address decoding module, a data read-write control module, a memory control module and a memory interface module. The memory interface module exchanges data with the external memory bus and is also connected to the memory control module inside the control device.
The access instruction buffer module stores the access instructions sent by the on-chip multi-core processor; each instruction comprises a command type, address information and the corresponding control information;
The instruction parsing and address decoding module parses the command of each access instruction and decodes its address, passes the decoded command type, memory address and data transfer count to the memory control module, and at the same time delivers the other control information of the instruction, including the processor ID and the address of the processor's internal read/write register, to the data read-write control module;
The memory control module, according to the command type, memory address and data transfer count decoded by the instruction parsing and address decoding module, controls the memory interface module and the data read-write control module so that data is transferred correctly between the memory and the processor;
The data read-write control module receives the control signals from the memory control module and from the instruction parsing and address decoding module, as well as the data read from the memory, actively initiates data write or read operations, and controls the transfer of data between the memory and the on-chip multi-core processor;
The memory interface module, following the standard timing of the memory, writes data from the memory control module correctly into the memory, or reads data correctly from the memory and passes it to the memory control module.
In the control device as described, under the effect of the instruction prefetch flag signal, the access instruction buffer module can take out the next instruction in advance and feed it to the instruction parsing and address decoding module for pre-decoding.
In the control device as described, the memory control module comprises:
Read-write control logic: responsible for updating the control information register, for controlling when the instruction prefetch flag signal is sent, and for dynamically selecting the open-page or close-page policy according to the information in the flag register;
Control information register: holds the control information of the current instruction, including the command type, memory address and data transfer count;
Address comparator: compares the memory address of the current instruction with the memory address of the prefetched next instruction, and produces control information that marks the relation between the two addresses;
Flag register: stores, according to the above control information, the value that encodes the relation between the memory address of the currently executing instruction and the memory address of the prefetched next instruction.
In the control device as described, the instruction prefetch comprises the following steps:
Step 500: under the effect of the instruction prefetch flag signal, prefetch an instruction from the access instruction buffer module in advance and pre-decode it in the instruction parsing and address decoding module, then go to step 502;
Step 502: compare the memory address obtained by decoding the prefetched instruction with the memory address of the current instruction, then go to step 504;
Step 504: determine whether the memory addresses of the prefetched instruction and the current instruction point to the same row of the same bank; if so, go to step 506; if not, go to step 508;
Step 506: set the low two bits of the flag register to 0x1 and leave the high two bits of the flag register 406 unchanged, then go to step 514;
Step 508: determine whether the memory addresses of the prefetched instruction and the current instruction point to different banks; if so, go to step 510; if not, go to step 512;
Step 510: set the low two bits of the flag register to 0x2 and write the bank address Bank ADDR of the current instruction 412 into the high two bits of the flag register, then go to step 512;
Step 512: clear the low two bits of the flag register and leave the high two bits unchanged, then go to step 514;
Step 514: the read-write control logic controls the internal state transitions of the memory control device according to the low two bits of the flag register and, when the memory control device starts to execute the prefetched instruction 410, updates the control information register 402 with the control information of the prefetched instruction 410.
In the control device as described, the data read-write control module comprises:
Internal bus interface: accepts the data and control information passed from the memory control module and the instruction parsing and address decoding module, including the processor ID, the address of the designated internal processor register, and the data bus request;
Address and data registers of the read and write data buses: hold the data of the current access instruction, the data bus request and the internal processor register address;
Processor ID register: holds the processor ID of the current access instruction.
In the control device as described, the data read-write control module controls the transfer of data between the memory and the on-chip multi-core processor through the following steps:
Step 1: select the data path of the corresponding processor according to the processor ID;
Step 2: when the multi-core processor writes data to the memory, send a data bus write request and the internal processor register address on the selected data path; when the multi-core processor reads data from the memory, send the data, a data bus read request and the internal processor register address on the selected data path;
Step 3: the processor responds to the data bus read or write request, either reading the data from its internal register and placing it on the data write bus, or writing the data on the data read bus into its internal register;
Step 4: the read-write control module accepts the data from the data write bus or, once the data on the data read bus has been written into the internal processor register, notifies the processor that the data is ready.
In the control device as described, the memory is SDRAM or DDR.
Embodiments of the invention have the following beneficial effects. With the above scheme, the memory control module dynamically selects the open-page or close-page policy through instruction prefetch, which reduces the time lost to static policies or dynamic prediction and improves the performance of parallel real-time multi-core systems. In addition, the data read-write control module can itself initiate operations on the read and write data buses and communicates with the on-chip multi-core processor in a simpler and more direct way, which reduces on-chip data bus delay and improves data bus throughput. The DRAM controller of the embodiments can therefore reduce the delay caused by processor memory accesses, improve data transfer efficiency, and better meet the high memory bus throughput required when the processors work in parallel.
Description of drawings
The above and other purposes, features and advantages of the invention will become more apparent from the following detailed description of its embodiments in conjunction with the drawings, in which:
Fig. 1 is a schematic diagram of a traditional multi-core shared memory control device;
Fig. 2 is a schematic diagram of the multi-core shared memory control device based on instruction prefetch;
Fig. 3 is a schematic diagram of the on-chip multi-core system interconnect based on read and write data buses;
Fig. 4 is a structural diagram of the instruction prefetch of the DRAM memory control device of the embodiment of the invention;
Fig. 5 is a flow chart of the instruction prefetch of the DRAM memory control device of the embodiment of the invention;
Fig. 6 is a flow chart of the read-write state transitions of the DRAM memory control device of the embodiment of the invention;
Fig. 7a is a timing diagram showing how the DRAM memory control device of the embodiment of the invention reduces access delay compared with a traditional DRAM memory controller that uses the close-page policy;
Fig. 7b is a timing diagram showing how the DRAM memory control device of the embodiment of the invention reduces access delay compared with a traditional DRAM controller that does not use bank interleaving;
Fig. 7c is a timing diagram showing how the DRAM memory control device of the embodiment of the invention reduces access delay compared with a traditional DRAM controller that uses the open-page policy;
Fig. 8 is a structural diagram of the data read-write control module of the DRAM memory control device of the embodiment of the invention;
Fig. 9 is a flow chart of the data read bus operation interface protocol of the DRAM memory control device of the embodiment of the invention;
Fig. 10 is a flow chart of the data write bus operation interface protocol of the DRAM memory control device of the embodiment of the invention.
Embodiments
The memory control device according to embodiments of the invention is described below with reference to the drawings, in which identical reference numerals denote identical elements throughout. It should be understood that the embodiments described here are only illustrative and should not be interpreted as limiting the scope of the invention.
Embodiment 1
Fig. 2 is a schematic diagram of the multi-core shared memory control device based on instruction prefetch. In this embodiment, memory access commands and data reads and writes are handled on separate bus structures, which allows memory accesses to execute in parallel to the greatest possible extent. The command bus 304 is a unidirectional bus that only carries the memory access instructions sent by the processors. In this embodiment, the format of an access instruction comprises the instruction type, the ID of the processor issuing the instruction, the memory address to be accessed, the internal processor register address, and the number of data items to be transferred.
To improve data throughput, the data bus in this embodiment is divided into a data read bus 302 and a data write bus 300, used for data read and data write operations respectively. Because the structure is shared by multiple processors, and in some cases each processor may also have several hardware threads, the memory control device contains an access instruction buffer module 202 to ensure that the memory instructions sent by all processors are buffered in a queue. The command bus 304 can perform priority arbitration according to the type of memory instruction so that memory instructions execute as efficiently as possible. On the two data buses, the memory control device uses the data read-write control module 200 to ensure that the data handled by the shared memory control device corresponds one to one with the designated processor or processor thread. The data read-write control module 200 is also responsible for notifying the processor that the memory operation has finished. Fig. 3 shows the interconnect between the multiple processors and the shared memory controller. The read and write data buses 302, 300 and the command bus 304 are connected in a crossbar (CrossBar) interconnect structure, which effectively reduces the transaction delay of the on-chip interconnect. After the multithreaded processors 306~310 send a memory access instruction, they can continue with the work of other threads; data write or read operations are actively initiated by the data read-write control module 200 in the memory control device and do not occupy the instruction execution cycles of the processors 306~310.
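The access instruction format and the instruction queue of this embodiment can be sketched as a small data structure. This is only an illustration: the class and field names (`AccessInstruction`, `AccessInstructionBuffer`, `op`, `reg_address` and so on) are hypothetical, field widths are not given in the text and are not modeled, and the queue is assumed to be plain first-in first-out, without the priority arbitration mentioned for the command bus.

```python
from collections import deque
from dataclasses import dataclass
from enum import Enum


class Op(Enum):
    READ = "read"
    WRITE = "write"


@dataclass(frozen=True)
class AccessInstruction:
    op: Op             # instruction type
    processor_id: int  # ID of the issuing processor
    mem_address: int   # memory address to access
    reg_address: int   # internal processor register address
    count: int         # number of data items to transfer


class AccessInstructionBuffer:
    """FIFO queue of pending access instructions (assumed ordering)."""

    def __init__(self):
        self._queue = deque()

    def push(self, instr):
        self._queue.append(instr)

    def pop(self):
        # Returns None when no instruction is waiting.
        return self._queue.popleft() if self._queue else None

    def __len__(self):
        return len(self._queue)
```

Keeping the processor ID and register address inside each instruction is what lets the data read-write control module later route the data to the right processor without the processor having to poll the bus.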
Embodiment 2
Fig. 4 shows the instruction prefetch implementation of the memory control device of this embodiment. The implementation involves the memory control module 206, the access instruction buffer module 202 and the instruction parsing and address decoding module 204.
The memory control module 206 comprises read-write control logic 400, a control information register 402, an address comparator 404 and a flag register 406. The read-write control logic 400 controls the internal state transitions of the memory control device so that data is read and written correctly, and decides when to send the instruction prefetch flag signal 408. It determines the next state from the information of the current instruction 412 held in the control information register 402 and from the content of the flag register 406. The control information register 402 holds the control information of the current instruction 412, including the command type, memory address and data transfer count of the access instruction. When the read-write control logic 400 moves on to execute the next instruction, the information of the prefetched next instruction 410 is written into the control information register 402. The address comparator 404 compares the memory address of the current instruction 412 with that of the prefetched instruction 410. A memory address comprises the bank address Bank ADDR, the row address Row ADDR and the column address Column ADDR; the address comparator 404 compares only the Bank ADDR and Row ADDR of the current instruction 412 and the prefetched instruction 410, and stores the resulting equal or unequal relation in the flag register 406. The flag register 406 holds this relation information and passes it to the read-write control logic 400. In this embodiment the flag register 406 is 4 bits wide. When the lowest bit is high, the current instruction 412 and the prefetched next instruction 410 point to the same Bank ADDR and the same Row ADDR; when it is low, they do not. When the second bit is high, the addresses of the current instruction 412 and the prefetched instruction 410 point to different Bank ADDR values; when it is low, they do not point to different banks. The third and fourth bits hold the bank address of the current instruction 412.
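The 4-bit flag register layout described above can be illustrated with a pair of pack and unpack helpers. This is a minimal sketch: the function and constant names are hypothetical, and the two-bit bank field implies at most four banks, which is an inference from the stated register width rather than something the text states.

```python
SAME_ROW = 0x1   # bit 0: same bank and same row as the prefetched instruction
DIFF_BANK = 0x2  # bit 1: prefetched instruction points to a different bank


def pack_flags(same_row, diff_bank, bank):
    """Build the 4-bit flag value: bits 3..2 = bank, bit 1 = diff_bank, bit 0 = same_row."""
    if not 0 <= bank < 4:
        raise ValueError("bank must fit in two bits")
    if same_row and diff_bank:
        raise ValueError("same row and different bank are mutually exclusive")
    value = bank << 2
    if diff_bank:
        value |= DIFF_BANK
    if same_row:
        value |= SAME_ROW
    return value


def unpack_flags(value):
    """Return (same_row, diff_bank, bank) from a 4-bit flag value."""
    return bool(value & SAME_ROW), bool(value & DIFF_BANK), (value >> 2) & 0x3
```

For example, `pack_flags(False, True, 1)` gives `0b0110`: the low two bits read 0x2 (different bank) and the high two bits record bank 1.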
The memory control structure and register definitions above are used to implement the prefetch of memory access instructions. Fig. 5 shows the flow of the instruction prefetch operation of the memory control device of the embodiment of the invention, which comprises:
Step 500: under the effect of the instruction prefetch flag signal 408, prefetch an instruction from the access instruction buffer module 202 in advance and pre-decode it in the instruction parsing and address decoding module 204. Then go to step 502;
Step 502: compare the memory address obtained by decoding the prefetched instruction 410 with the memory address of the current instruction 412, then go to step 504;
Step 504: determine whether the memory addresses of the prefetched instruction 410 and the current instruction 412 point to the same row of the same bank; if so, go to step 506; if not, go to step 508;
Step 506: set the low two bits of the flag register 406 to 0x1 and leave the high two bits of the flag register 406 unchanged. Then go to step 514;
Step 508: determine whether the memory addresses of the prefetched instruction 410 and the current instruction 412 point to different banks. If so, go to step 510; if not, go to step 512;
Step 510: set the low two bits of the flag register 406 to 0x2 and write the bank address Bank ADDR of the current instruction 412 into the high two bits of the flag register. Then go to step 512;
Step 512: clear the low two bits of the flag register 406 and leave the high two bits of the flag register 406 unchanged. Then go to step 514;
Step 514: the read-write control logic 400 in the memory control module 206 controls the internal state transitions of the memory control device according to the low two bits of the flag register 406 and, when the memory control device starts to execute the prefetched instruction 410, updates the control information register 402 with the control information of the prefetched instruction 410. In this embodiment, the read-write control logic 400 sends the instruction prefetch flag signal in the column select state of the current instruction 412 and takes the next instruction out of the access instruction buffer module 202 in advance. Then go to step 500 to begin the next instruction prefetch.
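The comparison in steps 502 to 512 can be sketched as a single function that returns the new flag register value. This is an illustrative sketch, not the patented circuit: the function name and the `(bank, row)` tuple representation are hypothetical. One further assumption is labeled in the code: the text sends step 510 to step 512, which would clear the 0x2 value just written, yet the read-write flow later tests for 0x2, so the sketch assumes the value set in step 510 is the one that reaches step 514.

```python
def update_flag_register(flag, current, prefetched):
    """Return the new 4-bit flag value after comparing two decoded addresses.

    `current` and `prefetched` are (bank, row) tuples; column addresses are
    not compared. `flag` is the previous 4-bit register value.
    """
    high = flag & 0xC  # bits 3..2: previously stored bank address
    cur_bank, cur_row = current
    nxt_bank, nxt_row = prefetched

    # Steps 504 and 506: same row of the same bank, keep the high bits.
    if cur_bank == nxt_bank and cur_row == nxt_row:
        return high | 0x1

    # Steps 508 and 510: different bank, record the current bank in the high bits.
    # ASSUMPTION: this value is kept (flow continues at step 514, not step 512).
    if cur_bank != nxt_bank:
        return ((cur_bank & 0x3) << 2) | 0x2

    # Step 512: same bank, different row. Clear the low bits, keep the high bits.
    return high
```

The three return values correspond to the three policies chosen later by the read-write control logic: keep the row open (0x1), interleave banks (0x2), or close the row (0x0).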
Fig. 6 is a flow chart showing how the read-write control logic 400 controls memory reads and writes according to the content of the flag register 406. It comprises:
Step 600: when no instruction currently needs to access the memory, the read-write control logic 400 issues no valid command and the memory enters the idle state. If a new access instruction is to be executed, go to step 602;
Step 602: the read-write control logic 400 issues a row activate command to activate the memory row address of the current instruction 412. Then go to step 604;
Step 604: the read-write control logic 400 waits for the row activation delay tRCD (see Fig. 7) and then goes to step 606; during this period it issues no valid command;
Step 606: the read-write control logic 400 issues a column select command to select the memory column address of the current instruction 412. Then go to step 608;
Step 608: the read-write control logic 400 determines whether the low two bits of the flag register 406 equal 0x2. If so, go to step 610; if not, go to step 612;
Step 610: according to the high two bits of the flag register 406, the read-write control logic 400 issues a precharge command to close the row accessed by the previous instruction. Then go to step 612;
Step 612: the read-write control logic 400 sends the instruction prefetch flag signal 408 at column select; after the delay tDELAY the value of the flag register 406 has been updated, then go to step 614;
Step 614: the read-write control logic 400 determines whether the low two bits of the flag register 406 equal 0x1. If so, go to step 606 and select the memory column address of the next instruction; if not, go to step 616;
Step 616: the read-write control logic 400 determines whether the low two bits of the flag register 406 equal 0x2. If so, go to step 602 and activate the memory row address of the next instruction; if not, go to step 618;
Step 618: the read-write control logic 400 issues a precharge command to close the row accessed by the current instruction, then go to step 620;
Step 620: the read-write control logic 400 waits for the delay tRP, then goes to step 602 and activates the memory row address of the next instruction.
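The order of DRAM commands produced by steps 602 to 620 can be illustrated with a short command scheduler. This is a sketch under several assumptions, all hypothetical: the function name `schedule`, the command tuples `("ACT", bank, row)`, `("COL", bank, col)` and `("PRE", bank)`, the omission of the tRCD, tDELAY and tRP waits, and the choice to close the open row when no further instruction is queued (the text does not say what happens in that case). The flag update assumes the 0x2 value set for a different bank is preserved.

```python
def schedule(accesses):
    """Return the DRAM command sequence for a list of (bank, row, col) accesses."""
    cmds = []
    flag = 0  # 4-bit flag register: bits 3..2 bank, bits 1..0 relation
    for i, (bank, row, col) in enumerate(accesses):
        low = flag & 0x3
        if i == 0 or low != 0x1:
            cmds.append(("ACT", bank, row))          # steps 602/604: open the row
        cmds.append(("COL", bank, col))              # step 606: column select
        if low == 0x2:                               # steps 608/610
            cmds.append(("PRE", (flag >> 2) & 0x3))  # close the previous bank's row
        # Step 612: prefetch the next instruction and update the flag register.
        if i + 1 < len(accesses):
            nxt_bank, nxt_row, _ = accesses[i + 1]
            if (nxt_bank, nxt_row) == (bank, row):
                flag = (flag & 0xC) | 0x1            # same row: keep the page open
            elif nxt_bank != bank:
                flag = ((bank & 0x3) << 2) | 0x2     # other bank: interleave
            else:
                flag &= 0xC                          # same bank, other row
        else:
            flag &= 0xC                              # ASSUMPTION: nothing queued, close
        if flag & 0x3 == 0:
            cmds.append(("PRE", bank))               # steps 618/620: close current row
    return cmds
```

With two reads to the same row, the second access needs only a column command; with two reads to different banks, the precharge of the first bank is issued after the second bank's column command, so it overlaps with the second data transfer.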
The instruction-prefetch-based memory control device effectively reduces and hides memory access latency and improves the bandwidth utilization of the DRAM bus. As shown in Fig. 6, the read-write control logic 400 moves between states according to the value of flag register 406 and thereby controls reads and writes to the memory. If the low two bits of flag register 406 are 0x1, the next instruction points to the same row of the same bank. After the current instruction finishes, the read-write control logic 400 does not enter the precharge state; the row accessed by the current instruction is kept open under the open-page policy, which reduces the access latency of the next read or write. If the low two bits of flag register 406 are 0x2, the next instruction points to a different bank. The control logic then switches to dynamic bank-interleaved processing, which hides the precharge delay across banks: while the current instruction executes, the read-write control logic 400 activates the row address of the next instruction, so that the data transfer of the next instruction follows immediately when the data transfer of the current instruction ends. The high two bits of flag register 406 hold the bank address of the previous instruction; while the current instruction transfers data, the row accessed by the previous instruction is precharged and closed. The row accessed by the previous instruction is thus handled under the close-page policy. If the low two bits of flag register 406 are 0x0, the next instruction points to a different row of the same bank. After the current instruction finishes, the read-write control logic 400 enters the precharge state: the memory control device adopts the close-page policy and closes the row accessed by the current instruction.
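The encoding of flag register 406 described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the register's actual implementation: the 4-bit packing (bits 3..2 for the bank address, bits 1..0 for the address relation), the `Address` tuple, and the function name `update_flag` are introduced here; the three relation values and the rule that the high two bits change only when the bank changes follow the text.

```python
from typing import NamedTuple


class Address(NamedTuple):
    bank: int  # 2-bit bank address, so four banks in this sketch
    row: int
    col: int


def update_flag(flag: int, current: Address, prefetched: Address) -> int:
    """Return the new flag value after comparing the prefetched instruction's
    address with the address of the current instruction."""
    high = flag & 0b1100                      # high two bits: saved bank address
    if prefetched.bank == current.bank and prefetched.row == current.row:
        return high | 0x1                     # same row of the same bank
    if prefetched.bank != current.bank:
        # Different bank: remember the current bank so that its row can be
        # precharged while the next instruction transfers data.
        return ((current.bank & 0b11) << 2) | 0x2
    return high | 0x0                         # different row of the same bank


flag = update_flag(0b0000, Address(1, 5, 8), Address(2, 7, 0))
print(f"{flag:04b}")
```

Keeping the saved bank address in the same register as the relation bits means the precharge of the previous row needs no further address lookup, which is how the text describes the use of the high two bits.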
If the low two bits of flag register 406 are 0x1, the next read or write to the DRAM is a page fast hit. The DRAM memory control device of this embodiment adopts the open-page policy: after the current instruction completes, the accessed row is not precharged and closed but is kept open. Compared with a traditional DRAM controller that uses the close-page policy, the instruction-prefetch-based memory control device reduces memory access latency. Fig. 7a is a timing diagram showing how the memory control device of the invention uses the open-page policy to reduce access latency. Suppose two adjacent access instructions are both memory reads that point to the same row a of the same bank; the column address of the first instruction is m and that of the next instruction is m+4. Because the open-page policy is used, the memory control device of the invention does not need to precharge row a and activate row a again, and therefore saves tRP+tRCD compared with a traditional DRAM controller that uses the close-page policy.
If the low two bits of flag register 406 are 0x2, the next read or write to the DRAM is a page hit, and the DRAM memory control device of this embodiment uses bank interleaving to hide the precharge delay. Fig. 7b is a timing diagram of hiding the precharge delay with bank interleaving. Suppose two adjacent access instructions are both memory reads; the first points to row a, column m of the first bank (Bank 0), and the second points to row b, column n of the second bank (Bank 1). If a traditional DRAM controller does not use bank interleaving, accesses to Bank 0 and Bank 1 are serial: row b of Bank 1 is opened only after row a of Bank 0 has been closed. The instruction-prefetch-based memory control device compares the addresses of the two adjacent instructions and learns that the next instruction operates on Bank 1, so it activates the row address of Bank 1 while Bank 0 is transferring data. When the data transfer of Bank 0 completes, the data of Bank 1 begin to appear on the data lines; likewise, while row b, column n of Bank 1 is transferring data, row a of Bank 0 is precharged. The instruction-prefetch-based memory control device therefore saves tRP+tRCD compared with a traditional controller that does not use bank interleaving. A traditional DRAM controller that does use bank interleaving assumes by default that adjacent accesses are page hits. Such a controller generally selects the close-page policy: if two adjacent accesses are a page hit, bank interleaving hides the precharge time and reduces the memory access time. If two adjacent accesses are instead a page fast hit, however, the memory access time increases, as shown in Fig. 7a. When two adjacent operations are a page fast hit, the instruction-prefetch-based memory control device therefore saves tRP+tRCD compared with a traditional DRAM controller that uses bank interleaving.
If the low two bits of flag register 406 are 0x0, the next read or write to the DRAM is a page conflict. The DRAM memory control device of this embodiment adopts the close-page policy: after the current instruction completes, the accessed row is precharged and closed. Compared with a traditional DRAM controller that uses the open-page policy, the instruction-prefetch-based memory control device reduces memory access latency. The timing comparison is shown in Fig. 7c. Suppose two adjacent access instructions are both memory reads that point to the same bank, with the first instruction pointing to row a and the second to row b. A traditional DRAM controller with the open-page policy does not precharge and close the accessed row after each instruction completes. When a page conflict occurs, it must first close row a, opened by the previous instruction, before it can activate row b for the current instruction, so the memory access time of the current instruction is tRP+tRCD+tCL. Because the instruction-prefetch-based memory control device uses the close-page policy in this case, the memory access time of the current instruction is tRCD+tCL, a saving of tRP.
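The three comparisons (Fig. 7a to 7c) reduce to simple arithmetic on the DRAM timing parameters. The Python sketch below restates them; the function name `cycles_saved` and the example cycle counts are assumptions chosen for illustration and do not come from the patent or from any particular DRAM datasheet.

```python
def cycles_saved(tRP: int, tRCD: int, tCL: int) -> dict:
    """Cycles saved on the second of two adjacent reads, for each value of
    the low two bits of flag register 406, against the baseline controller
    named in the text for that case."""
    conflict_open_page = tRP + tRCD + tCL   # baseline must close row a, then open row b
    conflict_prefetch = tRCD + tCL          # row a was already precharged
    return {
        "0x1 page fast hit vs close-page": tRP + tRCD,
        "0x2 page hit vs no bank interleaving": tRP + tRCD,
        "0x0 page conflict vs open-page": conflict_open_page - conflict_prefetch,
    }


# Hypothetical timing values in clock cycles.
for case, saved in cycles_saved(tRP=3, tRCD=4, tCL=5).items():
    print(case, saved)
```

In every case the saving is independent of tCL, because the column access latency is paid by both the baseline and the prefetch-based controller.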
Embodiment 3
The DRAM memory control device of this embodiment includes a data read-write control module 200, implemented as shown in Fig. 8. The data read-write control module 200 comprises:
Internal bus interface 812: receives the control signals from the memory control module 206 and the instruction parsing and address decoding module 204, and the data read from the DRAM memory 110; the control signals include the access instruction type, the processor ID, the internal register address of the specified multithreaded processor, and the data bus request signal;
Data read bus address/data register 806: holds the register address of the specified multithreaded processor on the data read bus, the data read bus request signal, and the data read bus data;
Data write bus address register 808: holds the register address of the specified multithreaded processor on the data write bus and the data write bus request signal;
Data write bus data register 810: holds the data on the data write bus;
Data read bus ID register 802: holds the ID of the specified multithreaded processor on the data read bus;
Data write bus ID register 804: holds the ID of the specified multithreaded processor on the data write bus.
The main function of the data read-write control module 200 is to transfer data between the memory and the specified multithreaded processor in a timely manner. The module initiates the read and write operations on the data buses and provides independent read and write data buses. On the one hand, the data read-write control module 200 "pulls" the data to be written from the internal registers of the specified multithreaded processor onto the data bus through the write bus and delivers the data to the memory control module 206. On the other hand, it "pushes" the data read from the memory onto the data bus through the read bus, and the data are written into the internal registers of the specified multithreaded processor under the control of the data read-write control module 200. The operation of the data channels is described below.
When a specified hardware thread of a specified processor in the multi-core processor needs to write data to the memory, the write data channel of the data read-write control module 200 is selected according to the access instruction type and the processor ID, in preparation for the write. The interface protocol of the data write bus 300 follows the flow shown in Fig. 9. First, the specified processor and its thread are selected according to the processor ID; the data write bus request signal and the internal register address of the processor are then sent to the specified thread. The number of cycles for which the request signal is held corresponds to the number of data items to be read from the registers. The processor samples the data write bus request signal and the internal address value, responds to the request, and places the data from the specified register address on the data bus. The data read-write control module 200 passes the data through the memory control module 206 for buffering and then places them directly on the data bus of the memory chip.
When a specified hardware thread of a specified processor in the multi-core processor needs to read data from the memory, the read data channel of the data read-write control module 200 is selected in preparation for the read. The interface protocol of the data read bus 302 follows the workflow shown in Fig. 10. First, the specified processor and its thread are selected according to the processor ID, and the corresponding read data path is selected by the internal register address of the specified processor and the data read bus request signal. The data read from the memory are then placed on the data read bus. The number of cycles for which the data read bus request signal is held corresponds to the valid data length. Finally, the data read-write control module 200 writes the data on the data read bus into the internal registers of the processor and sends a "data ready" signal to the processor thread, notifying the thread that it can continue executing.
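The two data channels can be sketched as a pair of Python functions. This is a behavioural illustration only, with several assumptions that are not in the text: register files are modelled as lists keyed by (processor ID, thread ID), one word moves per request cycle, a burst uses consecutive register addresses, and the names `pull_write_data`, `push_read_data` and the string `"data_ready"` are hypothetical.

```python
def pull_write_data(register_files, proc_id, thread_id, reg_addr, count):
    """Write channel: select the thread by ID, hold the request for `count`
    cycles and collect one word per cycle for the memory control module."""
    regs = register_files[(proc_id, thread_id)]      # select processor and thread
    # Assumption: the burst reads consecutive registers starting at reg_addr.
    return [regs[reg_addr + cycle] for cycle in range(count)]


def push_read_data(register_files, proc_id, thread_id, reg_addr, data):
    """Read channel: write the words read from memory into the thread's
    registers, then report that the data are ready."""
    regs = register_files[(proc_id, thread_id)]      # select processor and thread
    for offset, word in enumerate(data):             # request held for len(data) cycles
        regs[reg_addr + offset] = word
    return "data_ready"                              # lets the thread continue


register_files = {(0, 1): [10, 11, 12, 13]}
burst = pull_write_data(register_files, 0, 1, reg_addr=1, count=2)
status = push_read_data(register_files, 0, 1, reg_addr=2, data=burst)
print(burst, status, register_files[(0, 1)])
```

Because the controller initiates both directions, the processor thread only has to respond to requests and wait for the ready notification, which matches the pull and push roles described for the write and read buses.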
The DRAM memory control device of this embodiment adopts distributed arbitration, scheduling the command bus queue, the data write bus 300, and the data read bus 302 separately. Memory access instructions issued by multiple processor threads are scheduled by the command bus arbiter, and the instruction prefetch policy maximizes their execution efficiency. At the same time, the read-write control logic 400 controls data transfer between the off-chip memory and the multiple processor threads, and an "instruction complete" signal notifies the processor thread to process the data promptly. Compared with a traditional shared-bus transfer scheme, this transfer control scheme is better suited to the high memory bus throughput required when multithreaded processors work in parallel. It is also structurally simple and can reduce bus delay.
The purpose and technical solution of the present invention have been described in detail above. It should be understood that the above description does not limit the scope of the invention; any modification, improvement, or the like made within the principles and technical basis of the present invention shall fall within the scope of protection of the present invention.

Claims (6)

1. A multi-core shared memory control device based on instruction prefetch, characterized by comprising:
An access instruction buffer module, for storing the access instructions issued by the on-chip multi-core processor, each instruction comprising an instruction type, address information, and corresponding control information;
An instruction parsing and address decoding module, for parsing and address-decoding the access instruction, inputting the decoded instruction type, memory address, and data transfer count to the memory control module, and at the same time passing the other control information of the instruction, including the on-chip multi-core processor ID and the internal read/write register address of the on-chip multi-core processor, to the data read-write control module;
A memory control module, which, according to the instruction type, memory address, and data transfer count decoded by the instruction parsing and address decoding module, controls the memory interface module and the data read-write control module to complete the correct transfer of data between the memory and the on-chip multi-core processor;
A data read-write control module, which receives the control signals from the memory control module and the instruction parsing and address decoding module and the data read from the memory, actively initiates data write or read operations, and controls the transfer of data between the memory and the on-chip multi-core processor;
A memory interface module, for controlling, according to the standard timing of the memory, the correct writing of data from the memory control module into the memory, or the correct reading of data from the memory and the writing of the data into the memory control module;
The instruction prefetch comprises the following steps:
Step 500: under the action of the instruction prefetch marking signal, prefetch an instruction from the access instruction buffer module in advance and pre-decode it through the instruction parsing and address decoding module; then jump to step 502;
Step 502: compare the memory address information obtained by decoding the prefetched instruction with the memory address of the current instruction; then jump to step 504;
Step 504: determine whether the memory addresses pointed to by the prefetched instruction and the current instruction are in the same row of the same bank; if so, jump to step 506; if not, jump to step 508;
Step 506: set the low two bits of the flag register to 0x1, leaving the high two bits of the flag register (406) unchanged; then jump to step 514;
Step 508: determine whether the memory addresses pointed to by the prefetched instruction and the current instruction are in different banks; if so, jump to step 510; if not, jump to step 512;
Step 510: set the low two bits of the flag register to 0x2 and write the bank address (Bank ADDR) of the current instruction (412) into the high two bits of the flag register; then jump to step 514;
Step 512: clear the low two bits of the flag register, leaving the high two bits of the flag register unchanged; then jump to step 514;
Step 514: the read-write control logic controls the internal state transitions of the memory control device according to the low two bits of the flag register, and, when the prefetched instruction (410) begins to be executed by the memory control device, updates the control information register (402) with the control information of the prefetched instruction (410);
The flow by which the read-write control logic (400) controls memory reads and writes according to the content of the flag register (406) comprises:
Step 600: when no instruction currently needs to access the memory, the read-write control logic (400) issues no valid command and the memory enters the idle state; if a new access instruction is to be executed, jump to step 602;
Step 602: the read-write control logic (400) issues a row-activate command, activating the memory row address of the current instruction (412); then jump to step 604;
Step 604: the read-write control logic (400) jumps to step 606 after the row-activate delay tRCD; during this period the read-write control logic issues no valid command;
Step 606: the read-write control logic (400) issues a column-select command, selecting the memory column address of the current instruction (412); then jump to step 608;
Step 608: the read-write control logic (400) determines whether the low two bits of the flag register (406) are 0x2; if so, jump to step 610; if not, jump to step 612;
Step 610: the read-write control logic (400) issues a precharge command, according to the high two bits of the flag register (406), to close the row accessed by the previous instruction; then jump to step 612;
Step 612: the read-write control logic (400) sends the instruction prefetch marking signal (408) at column select; after the delay tDELAY the value of the flag register (406) is updated; then jump to step 614;
Step 614: the read-write control logic (400) determines whether the low two bits of the flag register (406) are 0x1; if so, jump to step 606 to select the memory column address of the next instruction; if not, jump to step 616;
Step 616: the read-write control logic (400) determines whether the low two bits of the flag register (406) are 0x2; if so, jump to step 602 to activate the memory row address of the next instruction; if not, jump to step 618;
Step 618: the read-write control logic (400) issues a precharge command to close the row accessed by the current instruction; then jump to step 620;
Step 620: the read-write control logic (400) waits for the tRP delay, then jumps to step 602 to activate the memory row address of the next instruction.
2. The control device according to claim 1, characterized in that the access instruction buffer module fetches the next instruction in advance under the action of the instruction prefetch marking signal, and the next instruction is input to the instruction parsing and address decoding module for pre-decoding.
3. The control device according to claim 1, characterized in that the memory control module comprises:
Read-write control logic: responsible for updating the control information register, controlling the sending of the instruction prefetch marking signal, and dynamically selecting the open-page or close-page policy of the memory according to the information in the flag register;
Control information register: holds the control information of the current instruction, including the instruction type, memory address, and data transfer count;
Address comparator: compares the memory address of the currently executing instruction with the memory address of the prefetched next instruction, and generates control information marking the address relation;
Flag register: stores, according to the control information marking the address relation, the specific value representing the relation between the memory address of the currently executing instruction and the memory address of the prefetched next instruction.
4. The control device according to claim 1, characterized in that the data read-write control module comprises:
Internal bus interface: receives the data and control information passed from the memory control module and the instruction parsing and address decoding module, including the on-chip multi-core processor ID, the internal register address of the specified on-chip multi-core processor, and the data bus request;
Address/data registers of the read and write data buses: hold the data, the data bus request, and the on-chip multi-core processor internal register address of the current access instruction;
Processor ID register: holds the on-chip multi-core processor ID of the current access instruction.
5. The control device according to claim 4, characterized in that the data read-write control module controls the transfer of data between the memory and the on-chip multi-core processor through the following steps:
Step 1: select the data path of the corresponding on-chip multi-core processor according to the on-chip multi-core processor ID;
Step 2: when the on-chip multi-core processor is to write data to the memory, send the data write bus request and the on-chip multi-core processor internal register address on the selected data path; when the on-chip multi-core processor is to read data from the memory, send the data, the data read bus request, and the on-chip multi-core processor internal register address on the selected data path;
Step 3: the on-chip multi-core processor responds to the data read/write bus request signal, reading data from its internal register and placing them on the data write bus, or writing the data on the data read bus into its internal register;
Step 4: the data read-write control module accepts the data from the data write bus, or, after the data on the data read bus have been written into the on-chip multi-core processor internal register, notifies the on-chip multi-core processor that the data are ready.
6. The control device according to any one of claims 1 to 5, further characterized in that the memory is SDRAM or DDR.
CN 201110141796 2011-05-30 2011-05-30 Instruction prefetch-based multi-core shared memory control equipment Expired - Fee Related CN102207916B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201110141796 CN102207916B (en) 2011-05-30 2011-05-30 Instruction prefetch-based multi-core shared memory control equipment

Publications (2)

Publication Number Publication Date
CN102207916A CN102207916A (en) 2011-10-05
CN102207916B true CN102207916B (en) 2013-10-30

Family

ID=44696757

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201110141796 Expired - Fee Related CN102207916B (en) 2011-05-30 2011-05-30 Instruction prefetch-based multi-core shared memory control equipment

Country Status (1)

Country Link
CN (1) CN102207916B (en)

Families Citing this family (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9430596B2 (en) 2011-06-14 2016-08-30 Montana Systems Inc. System, method and apparatus for a scalable parallel processor
CN102567278A (en) * 2011-12-29 2012-07-11 中国科学院计算技术研究所 On-chip multi-core data transmission method and device
US9141550B2 (en) * 2013-03-05 2015-09-22 International Business Machines Corporation Specific prefetch algorithm for a chip having a parent core and a scout core
US20150199134A1 (en) * 2014-01-10 2015-07-16 Qualcomm Incorporated System and method for resolving dram page conflicts based on memory access patterns
WO2018205117A1 (en) * 2017-05-08 2018-11-15 华为技术有限公司 Memory access method for multi-core system, and related apparatus, system and storage medium
CN108959133B (en) * 2017-05-22 2021-12-10 扬智科技股份有限公司 Circuit structure capable of sharing memory and digital video conversion device
CN110618833B (en) * 2018-06-19 2022-01-11 深圳大心电子科技有限公司 Instruction processing method and storage controller
CN111124433B (en) * 2018-10-31 2024-04-02 华北电力大学扬中智能电气研究中心 Program programming equipment, system and method
CN110322979B (en) * 2019-07-25 2024-01-30 美核电气(济南)股份有限公司 Nuclear power station digital control computer system core processing unit based on FPGA
CN110399325B (en) * 2019-07-30 2023-05-30 江西理工大学 Improved IP core based on IIC bus protocol
CN112559397A (en) * 2019-09-26 2021-03-26 阿里巴巴集团控股有限公司 Device and method
CN110941578B (en) * 2019-11-26 2021-05-04 成都天玙兴科技有限公司 LIO design method and device with DMA function
CN111045732B (en) * 2019-12-05 2023-06-09 腾讯科技(深圳)有限公司 Data processing method, chip, device and storage medium
CN111143820B (en) * 2019-12-20 2022-08-02 苏州浪潮智能科技有限公司 Optical module access method, optical module access equipment and storage medium
CN113127177B (en) * 2019-12-30 2023-11-14 澜起科技股份有限公司 Processing device and distributed processing system
CN113689902B (en) * 2020-05-19 2023-09-01 长鑫存储技术有限公司 Method for generating memory address data, computer-readable storage medium and apparatus
CN112181879B (en) * 2020-08-28 2022-04-08 珠海欧比特宇航科技股份有限公司 APB interface module for DMA controller, DMA controller and chip
CN113703835B (en) * 2021-08-11 2024-03-19 深圳市德明利技术股份有限公司 High-speed data stream processing method and system based on multi-core processor
CN114036096B (en) * 2021-11-04 2024-05-03 珠海一微半导体股份有限公司 Read controller based on bus interface
CN114265872B (en) * 2022-02-24 2022-05-24 苏州浪潮智能科技有限公司 Interconnection device for bus
CN115269015B (en) * 2022-09-26 2022-12-02 沐曦集成电路(南京)有限公司 Shared variable processing system based on Atomic instruction
CN116166606B (en) * 2023-04-21 2023-07-14 无锡国芯微高新技术有限公司 Cache control architecture based on shared tightly coupled memory
CN117331880A (en) * 2023-08-15 2024-01-02 北京城建智控科技股份有限公司 Dual-core communication device, method and electronic equipment

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6219764B1 (en) * 1998-08-03 2001-04-17 Micron Technology, Inc. Memory paging control method
JP2006107021A (en) * 2004-10-04 2006-04-20 Canon Inc Memory controller
CN101048762A (en) * 2004-08-27 2007-10-03 高通股份有限公司 Method and apparatus for transmitting memory pre-fetch commands on a bus
CN101078979A (en) * 2007-06-29 2007-11-28 东南大学 Storage control circuit with multiple-passage instruction pre-fetching function
CN101165662A (en) * 2006-10-18 2008-04-23 国际商业机器公司 Method and apparatus for implementing memory accesses
US20090063777A1 (en) * 2007-08-30 2009-03-05 Hiroyuki Usui Cache system

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
A DRAM Precharge Policy Based on Address Analysis; Chiyuan Ma et al.; 10th Euromicro Conference on Digital System Design Architecture, Methods and Tools; 20071230; pp. 244-248 *
Chiyuan Ma et al. A DRAM Precharge Policy Based on Address Analysis. 10th Euromicro Conference on Digital System Design Architecture, Methods and Tools. 2007, pp. 244-248.
Efficient Use of Memory Bandwidth to Improve Network Processor Throughput; Jahangir Hasan et al.; Proceedings of the 30th Annual International Symposium on Computer Architecture (ISCA'03); 20031230; pp. 300-313 *
Jahangir Hasan et al. Efficient Use of Memory Bandwidth to Improve Network Processor Throughput. Proceedings of the 30th Annual International Symposium on Computer Architecture (ISCA'03). 2003, pp. 300-313.
JP Laid-Open 2006-107021A, 2006.04.20

Also Published As

Publication number Publication date
CN102207916A (en) 2011-10-05

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20131030

Termination date: 20190530
