CN1209213A - High performance universal multi-port internally cached dynamic random access memory system, architecture and method - Google Patents

High performance universal multi-port internally cached dynamic random access memory system, architecture and method

Info

Publication number
CN1209213A
CN1209213A CN96180069A CN96180069A CN1209213A CN 1209213 A CN1209213 A CN 1209213A CN 96180069 A CN96180069 A CN 96180069A CN 96180069 A CN96180069 A CN 96180069A CN 1209213 A CN1209213 A CN 1209213A
Authority
CN
China
Prior art keywords
data
dram
interface
buffer
port
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN96180069A
Other languages
Chinese (zh)
Other versions
CN1120495C (en
Inventor
穆凯什·查特
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual filed Critical Individual
Publication of CN1209213A publication Critical patent/CN1209213A/en
Application granted granted Critical
Publication of CN1120495C publication Critical patent/CN1120495C/en
Anticipated expiration legal-status Critical
Expired - Fee Related legal-status Critical Current

Classifications

    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11C STATIC STORES
    • G11C7/00 Arrangements for writing information into, or reading information out from, a digital store
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11C STATIC STORES
    • G11C7/00 Arrangements for writing information into, or reading information out from, a digital store
    • G11C7/10 Input/output [I/O] data interface arrangements, e.g. I/O data control circuits, I/O data buffers
    • G11C7/1075 Input/output [I/O] data interface arrangements, e.g. I/O data control circuits, I/O data buffers for multiport memories each having random access ports and serial ports, e.g. video RAM
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0893 Caches characterised by their organisation or structure
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14 Handling requests for interconnection or transfer
    • G06F13/16 Handling requests for interconnection or transfer for access to memory bus


Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Dram (AREA)
  • Memory System Of A Hierarchy Structure (AREA)
  • Static Random-Access Memory (AREA)
  • Acyclic And Carbocyclic Compounds In Medicinal Compositions (AREA)
  • Transition And Organic Metals Composition Catalysts For Addition Polymerization (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A novel low cost/high performance multi-port internally cached dynamic random access memory architecture called 'AMPIC DRAM', and consequentially a unique system architecture which eliminates current serious system bandwidth limitations, providing a means to transfer blocks of data internal to the chip, orders of magnitude faster than the traditional approach, and with the chip also interconnecting significantly higher numbers of resources with substantially enhanced performance and at notably lower cost through the use of a system configuration based on this novel architecture and working equally efficiently for both main memory functions and as graphics memory, thus providing a truly low cost, high performance unified memory architecture.

Description

High performance universal multi-port internally cached dynamic random access memory system, architecture and method
The invention relates to dynamic random access memory (DRAM) technology, and is more specifically directed to novel DRAM system architectures that eliminate current system bandwidth limitations and related problems and provide significantly enhanced system performance at reduced cost, enabling substantially universal use in many applications as a result of providing a unified memory architecture.
A large number of system designs, particularly in networking/communications, are limited in performance by contention among various resources for access to the main system memory (almost always DRAM). This heavy contention is a direct result of using a single-bus architecture, in which one bus interconnects the CPU, main memory and I/O resources. This and similar past and current architectures also hinder the ability of the CPU to manage more I/O interfaces, because of severe bandwidth limitations.
Similar system bandwidth limitations have also forced graphics/multimedia designers to separate graphics memory from main memory, which adversely affects system cost. Special DRAMs for graphics have also been developed to further enhance video data bandwidth capacity. Although some system architectures using different types of DRAM have been proposed that would let a common memory serve as both main memory and graphics memory, these approaches have had limited success because one of the two functions always performs poorly.
Prior to the present invention, therefore, no low-cost, high-performance unified memory architecture was substantially available. The present invention provides an innovative DRAM architecture, discussed below, and a resulting unique system architecture that largely eliminate these problems and provide much greater data bandwidth, so that a far larger number of resources can be interconnected with enhanced performance and at significantly reduced cost. A system configuration based on this novel architecture works equally efficiently for main memory functions and for graphics memory, thus truly providing a low-cost, high-performance unified memory architecture. This chip solution is called "AMPIC DRAM", for A Multi Port Internally Cached DRAM.
Background of the present invention
As discussed above, most high-performance systems of this nature tend to use a bus-based architecture in which a single system bus interconnects the CPU, main memory and I/O resources, as shown in Fig. 1, discussed later (the terms "main memory" and "system memory" are used interchangeably here). This is a relatively straightforward design that leaves room for expansion, but it has serious limitations. Whenever the CPU or a peripheral needs to access main memory (usually implemented with DRAM), it must arbitrate for access to the system bus. The amount of concurrent activity in the system is thus limited by the overall capacity of the external bus.
As CPU speed increases, system bus bandwidth must increase correspondingly to realize the full potential of the system. Increasing bus bandwidth, however, is extremely difficult and very expensive, to the point of becoming technically infeasible or prohibitively costly. The number of I/O resources that can operate on the bus is also limited by the bandwidth. It should be noted that although a single bus theoretically allows a high degree of expandability, in actual operation contention severely limits such expansion.
This problem is prevalent in all types of applications. To illustrate these problems better, networking and graphics applications are described below as examples, but the invention is by no means limited to these examples.
Networking Application Example
Typical networking equipment (also called interconnect devices), such as switches, routers, bridges and hubs, interconnects multiple networks such as ATM, SONET, Token Ring, FDDI, Ethernet and Fiber Channel, as shown in Fig. 2, described later. A typical design includes a high-performance CPU and a large amount of main memory, usually implemented with traditional DRAM, as shown in Figs. 3 and 4, described later. Data from each network is transferred to main memory in the form of packets (a packet is a collection of bytes), processed by the CPU, and then usually forwarded to its respective destination network.
All of the networks mentioned above (ATM, SONET, Fiber Channel, Token Ring, FDDI, etc.) use different means of transferring data from one point to another. They differ in hardware, software and data transfer rate. Interconnect equipment is needed so that a user on one of these networks can communicate seamlessly with a user on another network that uses a different protocol.
In a typical interconnect device, the network interfaces are provided by network interface controllers (also commonly called network controllers) specific to each type of network. Ethernet thus has a different network interface than Fiber Channel or ATM (Figs. 3 and 4).
To illustrate a typical data flow in the exemplary system configuration of Fig. 4, the following typical parameters may be assumed:
A. the system bus is 32 bits wide (4 bytes);
B. four traditional 2M × 8 DRAMs are organized as 2M × 32;
C. there are four network interfaces: Fiber Channel, ATM, Ethernet and FDDI; and
D. the packet size is 1024 bytes.
Consider the case in which a user on, say, the Ethernet network sends a packet to a user on, say, the FDDI network. The Ethernet interface controller of the interconnect device receives the packet and analyzes it in the controller chip, typically storing only the relevant content in its local FIFO (first in, first out) memory for subsequent transfer to main memory. Because several devices, including the CPU and the various network controllers, sit on the system bus, all active resources must arbitrate for the system memory bus. After the Ethernet controller acquires the bus through arbitration, the data is transferred to system memory over the 32-bit-wide system bus interface. Since the packet contains 1024 bytes and each transfer moves 4 bytes to main memory, 256 such transfers are needed to move the packet. If the network controller is allowed only one 4-byte transfer each time it acquires the bus, at least 256 arbitration cycles are also required. (The number of arbitrations is smaller if the network controller has burst transfer capability. With a 16-byte burst per acquisition, for example, a minimum of 64 arbitration cycles is required.)
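The transfer and arbitration counts in this example follow from simple division. As an illustrative sketch only (not part of the patent disclosure), the arithmetic can be written in Python; the function names are hypothetical, and the model assumes that every bus acquisition moves at most one burst.

```python
import math


def bus_transfers(packet_bytes: int, bus_width_bytes: int) -> int:
    """Number of bus-wide transfers needed to move one packet."""
    return math.ceil(packet_bytes / bus_width_bytes)


def arbitration_cycles(packet_bytes: int, burst_bytes: int) -> int:
    """Minimum arbitrations if each bus grant moves at most burst_bytes."""
    return math.ceil(packet_bytes / burst_bytes)


PACKET_BYTES = 1024     # parameter D of the example
BUS_WIDTH_BYTES = 4     # parameter A: 32-bit system bus

print(bus_transfers(PACKET_BYTES, BUS_WIDTH_BYTES))   # 256 transfers
print(arbitration_cycles(PACKET_BYTES, 4))            # 256 arbitrations, no burst
print(arbitration_cycles(PACKET_BYTES, 16))           # 64 arbitrations, 16-byte burst
```

The same counts apply again when the packet is read back out of main memory toward its destination network.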
After the packet is stored in main memory, it is processed by the CPU (primarily the header information) and, in this example, redirected to the FDDI port. The process now runs in reverse. The FDDI interface controller fetches the data from main memory and transfers it to its on-chip FIFO memory. This also requires 256 transfers and a corresponding number of arbitrations. The data is then concurrently transferred from the FDDI controller chip onto its network.
The operating speeds of the networks are: FDDI, 100 Mbit/s; Ethernet, 10/100 Mbit/s; ATM, about 600 Mbit/s; Token Ring, 16 Mbit/s; and Fiber Channel, 800 Mbit/s.
The large number of transfers and the time spent in arbitration consume a significant portion of the available data bandwidth and also reduce the access frequency of the CPU. As the number of network interfaces increases or higher-speed interfaces are added, the time available to each resource, including the CPU, decreases, which clips the peak performance of the system. This also forces designers to seek higher-performance CPUs and associated expensive components, driving up cost. Because of these severe limitations, the number of networks that can be connected through this prior-art type of system bus remains very low, and the problem worsens as more and more networks of ever higher speed are added to accommodate, for example, Internet-related expansion.
Graphics/Multimedia Application Example
To illustrate the background further, consider a graphics/multimedia application. A system of this type has two main memory-related functions:
(a) updating the screen memory for the graphics to be displayed; and
(b) reading the screen memory at a very high rate to update the cathode ray tube (CRT) or other screen display.
The first task requires a large number of frequent data transfers, called "BitBlt", from one memory location to another, but this requirement tends to be bursty in nature. It consumes a considerable portion of the system bandwidth and has therefore made it necessary to use separate memory to store graphics data, as shown in Fig. 5, described later, which adversely affects system cost. Consider the case in which 16 rows of screen memory must be updated using conventional 2M × 8 DRAM components. To move 16 rows of data to a new location, the number of data transfers required is:
Number of rows to be transferred (16) × number of columns in each row (1024) = 16384
Obviously, a corresponding number of arbitrations for the system bus is also required. Moreover, this large number of transfers must take place within a very short time, consuming most of the available data bandwidth in a small time slot and starving the CPU and other I/O resources. Prior to the present invention, however, existing DRAM manufacturers offered no real breakthrough to alleviate this problem.
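The BitBlt count above can be reproduced with a one-line calculation. The following Python sketch is illustrative only; the function name is hypothetical, and it assumes one transfer per column location and, in the worst case, one arbitration per transfer.

```python
def bitblt_transfers(rows: int, columns_per_row: int) -> int:
    """Transfers needed to move `rows` rows when each column is moved separately."""
    return rows * columns_per_row


transfers = bitblt_transfers(rows=16, columns_per_row=1024)
print(transfers)  # 16384 transfers, and up to as many bus arbitrations
```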
Of course, the screen memory must also be read repeatedly to load and refresh the CRT display. The bandwidth needed for this update varies with the type of display (VGA, Super VGA, etc.) but tends to be on the order of one hundred megabytes per second or more. Unlike "BitBlt", the CRT update requirement is continuous in nature; like "BitBlt", it also uses a significant amount of system bandwidth.
As an example, consider the following case:
A. the display size is 1024 × 768 pixels;
B. the display is non-interlaced and is updated 72 times per second; and
C. each of the three colors red, green and blue uses 8 bits per pixel.
The required bandwidth, in bytes per second, is:
1024 × 768 × 72 × 8 × 3 / 8 ≈ 170 MByte/s.
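For reference, the refresh bandwidth figure can be checked with a short calculation. This Python sketch is illustrative only; the function and parameter names are hypothetical, and "MByte" is taken here to mean 10^6 bytes.

```python
def refresh_bandwidth_bytes_per_s(width_px: int, height_px: int,
                                  refresh_hz: int, bits_per_color: int,
                                  colors: int) -> int:
    """Bytes per second needed to read every pixel once per refresh."""
    bits_per_pixel = bits_per_color * colors
    return width_px * height_px * refresh_hz * bits_per_pixel // 8


bw = refresh_bandwidth_bytes_per_s(1024, 768, 72, bits_per_color=8, colors=3)
print(bw)               # 169869312 bytes per second
print(round(bw / 1e6))  # 170, i.e. about 170 MByte/s
```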
This requirement of roughly 170 MByte/s is substantial, and it is difficult for conventional DRAM to meet if the DRAM is also used as main memory. This led to the development of more expensive, specialized DRAMs; one widely used type is the video DRAM, also called "VRAM". Most VRAMs are dual-ported, with a few exceptions that add a third port. A typical VRAM has a system interface similar to that of a traditional DRAM, but it also has a row-wide buffer inside the chip (called the SAM, Serial Access Memory) that interacts with the outside world through separate data pins equal in number to those of the system interface, as shown in Fig. 6, described later. As an example, a 256K × 8 VRAM also has an additional 8-bit-wide port for continuously streaming refresh data to the CRT. The "SAM" buffer has a fixed connection to the external display interface. In operation, the CPU (or the system bus master) accesses the VRAM through the system data interface and stores or updates the screen image in the VRAM. The screen data of an entire row is then moved into the "SAM" buffer in a single access. This data is subsequently transferred to the display through the SAM I/O interface, which is as wide as the system interface.
Such VRAMs are an effective solution only for designs that must interact with a single graphics source/destination. They are more expensive than traditional DRAMs because of the additional pins and larger silicon die, and the architecture is very rigid. Expandability to more interfaces with more devices is severely limited because the pin count grows significantly. The connection of the "SAM" to the external I/O interface is fixed, and the data port size is also predetermined. This approach also fails to solve the problem of speeding up huge data movement requirements. Prior to the present invention, therefore, VRAM was regarded as a solution only for lack of any better alternative.
VRAMs (especially the three-port versions) have occasionally been proposed for networking applications but have rarely been used, owing to their rigid I/O structure noted above, their very limited ability to connect multiple resources (strictly speaking, only two), larger board space, more expensive construction and higher power consumption.
State of the Art in System Configurations and Related Problems
In summary, typical prior-art system configurations based on traditional DRAM (such as that of Fig. 3, mentioned above) suffer from the limitations described above. The problems of networking applications, where a large number of high-speed networks must be interconnected at low cost, remain largely unsolved, resulting in a higher price per network interface.
System bandwidth limitations and the constant CRT display update requirement led to the development of video DRAM, which is intended primarily for graphics applications and is generally used in configurations such as that of Fig. 6, as discussed earlier. Although this configuration performs better than traditional DRAM, as the performance requirements for both main memory and display data bandwidth increase, its adverse side effects add to system cost: main memory is separated from graphics memory, and VRAM itself is more expensive.
Rambus Inc. has also developed another type of prior-art DRAM, called "RDRAM", which operates at 250 MHz, works well for graphics applications, and can even be more economical than VRAM for high-end graphics/multimedia environments, but this approach still requires two separate buses to be maintained.
Cost is paramount in the PC market (which consumes 60% of all chips). The search therefore began for a configuration called "Unified Memory Architecture", in which the graphics and main memory functions share a common bus and only one type of memory device is used.
One possibility is to use VRAM as both graphics and main memory, but the added cost of the components offsets the benefit gained from having a common memory. Another possible solution is to use the RDRAM mentioned above, which has fewer pins per chip than VRAM and therefore lower power consumption, a smaller footprint and relatively lower cost. Unfortunately, because of its block-oriented protocol and interface limitations, it is very inefficient for non-localized main memory accesses and therefore does not lend itself well to the "Unified Memory Architecture" concept. The use of RDRAM also poses a host of significant electrical engineering design challenges related to emissions, noise and PCB layout, making the design task very difficult.
The search therefore continues for a more viable low-cost/high-performance unified memory architecture that meets the diverse requirements of main memory and graphics memory accesses equally efficiently.
The present invention is directed to the effective solution of these pressing problems. It is believed to be a breakthrough in the evolution of a novel DRAM architecture and method that:
(a) provides significantly higher system data bandwidth through architectural innovation rather than through device speed alone;
(b) moves large amounts of data to and from multiple I/O resources with minimal impact on system bandwidth;
(c) interconnects a significantly larger number of I/O resources than traditional approaches;
(d) moves large blocks of data inside the chip at least an order of magnitude faster and with almost no impact on system bandwidth;
(e) can be configured to accommodate the different data transfer rates of the I/O resources;
(f) reduces the latency between the reception of an incoming packet and its subsequent transmission;
(g) provides a low pin count;
(h) is reasonable in cost;
(i) has low power consumption;
(j) provides a simple system interface that minimizes design effort; and
(k) works equally efficiently for both main memory and graphics requirements, thereby providing a true "Unified Memory Architecture" and a substantially universal approach.
Objects of the Invention
An object of the present invention, accordingly, is to provide a new and improved dynamic random access memory (DRAM) system, architecture and method using a novel multi-port internally cached DRAM structure that eliminates current system bandwidth limitations and related problems while providing significantly enhanced system performance at reduced cost, thereby enabling substantially universal use across a myriad of applications.
A further object is to provide such a novel system in which blocks of data are transferred inside the chip an order of magnitude faster than with traditional approaches, and which can interconnect a significantly larger number of resources with substantially enhanced performance and at notably lower cost.
Another object is to provide a system configuration based on this novel architecture that works equally efficiently for both main memory functions and graphics memory, that is, a true high-performance unified memory architecture.
Other and further objects are explained hereinafter and are more particularly delineated in the appended claims.
Summary
In summary, from one of its broader viewpoints, the invention is an improved DRAM architecture for use in a system having a master controller, such as a central processing unit (CPU), with parallel data ports, and a dynamic random access memory (DRAM), each connected to and competing for access to a common system bus interface. The architecture comprises a multi-port internally cached DRAM (AMPIC DRAM) that includes: a plurality of independent serial data interfaces, each connected between a separate external I/O resource and the internal DRAM memory through a corresponding buffer; a switching module interposed between the serial interfaces and the buffers; and switching module logic control for connecting the serial interfaces to the buffers under a dynamic configuration set by the bus master controller, such as said CPU, so as to allocate switching appropriate to the desired data routing. Preferred and best mode designs and techniques are described in detail below.
Drawings
The invention will now be described with reference to the accompanying drawings, of which Figs. 1-6 illustrate the prior art as follows:
Fig. 1 is a block diagram of a typical prior-art single-bus parallel architecture;
Fig. 2 shows a typical prior-art network configuration;
Figs. 3 and 4 show typical prior-art networking equipment using DRAM for configurations such as that of Fig. 2;
Fig. 5 is a block diagram of a prior-art graphics application configuration with separate memories and using traditional DRAM;
Fig. 6 is a similar diagram of a typical graphics application architecture using VRAM;
Fig. 7 is a block diagram of a system architecture constructed in accordance with the present invention and using its multi-port internally cached (AMPIC) DRAM;
Fig. 8 is a similar view of a partial top-level architecture of the "AMPIC DRAM" of Fig. 7, showing the multiplexer/crossbar switching among the auxiliary serial interfaces, the buffers and the DRAM core;
Fig. 9 shows details of an illustrative serial data multiplexer implementation of Fig. 8;
Fig. 10 shows an example of multiple serial interfaces configured as ports;
Fig. 11 is a diagram of 2-bit ports of the AMPIC DRAM and the associated control lines;
Figs. 12 and 13 are diagrams of exemplary serial data transfer formats, with Fig. 13 illustrating two bits per port;
Fig. 14 is a partial top-level block diagram of an example of a two-bank "AMPIC DRAM" control module architecture with the parallel row internal transaction intervention (termed PRITI) described later, without storage elements;
Fig. 15 shows the sequence of operations of a PRITI transfer, with an internal data transfer between two banks;
Fig. 16 is similar to Fig. 14, but has two row-wide sets of storage elements for the "PRITI" function;
Fig. 17 shows the sequence of operations of a "PRITI" transfer with the two row-wide sets of storage elements of Fig. 16, with an internal data exchange between two banks;
Fig. 18 is similar to Fig. 17, but uses only one row-wide set of storage elements;
Fig. 19 shows an example of a useful pinout of the "AMPIC DRAM" of the invention with nine exemplary serial interfaces;
Fig. 20 shows exemplary networking equipment built with the AMPIC DRAM of the invention and a 32-bit-wide system bus operated with a CPU;
Fig. 21 is a similar diagram for a graphics application;
Fig. 22 is a similar diagram of a four-bank system configuration, with each bank connected to a different network interface;
Fig. 23 is similar to Fig. 22, but uses two banks of "AMPIC DRAM" and two banks of traditional DRAM;
Fig. 24 is also similar to Fig. 22, but with two banks used for graphics and two banks used for other applications;
Fig. 25 is a diagram of a further modification of the "AMPIC DRAM" architecture, with two internal banks and the aforementioned "PRITI" capability, one bank serving as main memory and the other serving graphics or other applications;
Fig. 26 is a modification of the AMPIC DRAM system of Fig. 19, adapted for the so-called "PARAS" interface and access described in copending U.S. patent application No. 08/320058 (October 7, 1994), with a low pin count for an integrated memory architecture. (That application discloses a method and apparatus for improving the access performance of asynchronous and synchronous dynamic random access memory devices through a novel interfacing and access procedure in which the same pins are used for row, column and data accesses, in both write and read cycles, which effectively increases data bandwidth and addressing range in a package of substantially the same size with fewer pins.)
Fig. 27 is a partial top-level block diagram of an example of a multi-bank "AMPIC DRAM" control module architecture with the aforementioned parallel row internal transaction intervention (PRITI), with one row-wide set of storage elements.
Preferred Embodiments of the Invention
The invention will now be described. In its new "AMPIC DRAM"-centered solution, the bandwidth and other bottleneck problems described earlier are eliminated by significantly reducing the number of transfers and the corresponding number of arbitrations on the system bus, which greatly improves overall system performance and provides much faster internal data transfer capability. Other benefits include system expandability with much less impact on data bandwidth, with the resulting reduction in system cost noted above.
Referring to Fig. 7, a CPU unit using conventional primary parallel-port data is connected to a system bus, which is also connected to a main memory unit containing the "AMPIC DRAM" of the invention, described later. Auxiliary serial interface inputs (#1 to #n) from respective input/output (I/O) resources #1 to #n are also connected to the main memory unit.
In this illustration, multiple one-bit-wide independent serial interfaces are thus provided on the "AMPIC DRAM" for transferring data between the I/O resources and main memory. These serial interfaces are in addition to the primary parallel port for the system bus interface, which is used by the central processing unit (CPU) or a similar master controller device. The number of such serial interfaces is limited only by device technology, pin count, power consumption, cost and the like. Serial data received through these interfaces #1 to #n, or to be transmitted through them, is stored inside the "AMPIC DRAM" in respective small buffers #1 to #n, as shown more fully in Fig. 8. For practical reasons a buffer may range from 64 bytes to 512 bytes, but in theory its size is limited by the layout of the sense amplifiers. In a traditional layout, it is limited to the number of data bits available in one row of the DRAM core. Thus, if 1024 bytes are available per row access, an "AMPIC DRAM" can be designed with a maximum buffer size of 1024 bytes per internal bank. If "m" is the number of buffers and "n" is the number of serial interfaces, then the number of packet buffers "m" (the terms "packet buffer" and "buffer" are used interchangeably here) is greater than or equal to the number of serial interfaces "n". The upper limit on "m" is set by technology limitations rather than by the architecture.
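The two sizing constraints stated in this paragraph (at least as many buffers as serial interfaces, and a buffer no larger than one row access) can be captured in a small configuration check. The following Python sketch is illustrative only; the class and field names are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AmpicConfig:
    serial_interfaces: int  # "n" in the text
    buffers: int            # "m" in the text
    buffer_bytes: int       # size of each packet buffer
    row_bytes: int          # bytes available per DRAM row access

    def validate(self) -> None:
        """Raise ValueError if a constraint from the text is violated."""
        if self.buffers < self.serial_interfaces:
            raise ValueError("m (buffers) must be >= n (serial interfaces)")
        if self.buffer_bytes > self.row_bytes:
            raise ValueError("buffer cannot exceed one row access")


# Four interfaces, four 512-byte buffers, 1024 bytes per row access.
config = AmpicConfig(serial_interfaces=4, buffers=4,
                     buffer_bytes=512, row_bytes=1024)
config.validate()
print("configuration accepted")
```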
According to the preferred embodiment, a Port Multiplier and/or crossbar switch logical OR combination (Port Multiplier/crossbar switch among Fig. 8) is connected to " m " individual buffer with " n " individual serial line interface.Being connected between each serial line interface and a buffer, making dynamic constitution, and do to be suitable for the change of data route by CPU (or current system bus primary controller).
Be used to organize functional block diagram that a possible serial data interface of structure 4 serial line interfaces and 4 buffers implements as shown in Figure 9.But can there be number of ways to realize desirable architecture, and keeps basic concepts identical.When data in Fig. 8 are need be between packet buffer and DRAM nuclear mobile, among different activity data pack buffers and CPU, arbitrate.But by serial line interface from/receive or send data to packet buffer, then need not to do any arbitration.
An incoming packet buffer can be redefined as an output packet buffer and its data rerouted to the destination without even performing the intermediate step of transferring data between the buffer and the core DRAM. This reduces the latency between receiving an incoming packet and subsequently transmitting it to its destination. This is possible only because the "AMPIC DRAM" can assign any buffer to any serial interface via the multiplexer/crossbar switch module.
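The buffer reassignment described above can be illustrated with a small software model. This is only a sketch under stated assumptions: the class `MuxCrossbar` and its methods `assign`, `receive` and `transmit` are hypothetical names introduced here, and the model ignores timing, arbitration and port widths. It shows a packet arriving on one serial interface and leaving on another without passing through the DRAM core.

```python
class MuxCrossbar:
    """Toy model (hypothetical) of the multiplexer/crossbar switch module."""

    def __init__(self, n_interfaces, m_buffers):
        if m_buffers < n_interfaces:
            raise ValueError("the description requires m >= n")
        self.buffers = [bytearray() for _ in range(m_buffers)]
        self.route = {}  # serial interface index -> buffer index

    def assign(self, interface, buffer):
        """Reconfigure routing, as the CPU or current bus master would."""
        self.route[interface] = buffer

    def receive(self, interface, data):
        """Serial data arriving on an interface lands in its assigned buffer."""
        self.buffers[self.route[interface]] += data

    def transmit(self, interface):
        """Drain the buffer currently assigned to an interface."""
        buf = self.buffers[self.route[interface]]
        out = bytes(buf)
        buf.clear()
        return out


xbar = MuxCrossbar(n_interfaces=4, m_buffers=4)
xbar.assign(interface=0, buffer=2)   # buffer 2 acts as an incoming packet buffer
xbar.receive(0, b"packet")
xbar.assign(interface=3, buffer=2)   # buffer 2 is redefined as an output buffer
forwarded = xbar.transmit(3)         # the DRAM core is never touched
```

In this model the packet is forwarded purely by changing the routing table, which mirrors the latency saving claimed for redefining an incoming buffer as an output buffer.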
In the "AMPIC DRAM" architecture, multiple one-bit-wide serial interfaces can also be configured as a narrow-width bus (termed a "port") of a size such as 1, 2, 4 or 8, although in theory there is no such limit. The size can be any number from 1 to "n" permitted by device technology, and it also depends on the implementation. Once multiple serial interfaces are grouped and defined as a port, they are all connected to a common packet buffer, as shown more particularly in Fig. 10, where a one-bit-wide port is the same as a serial interface. This allows faster data transfer while maintaining flexibility, and is extremely useful for interfacing with resources that operate with different bandwidth and data transfer requirements. It follows that each packet buffer (Fig. 8) is able to interface with a maximum of "n" serial interfaces simultaneously if they are defined as one port. The buffer is configured for the same port size as the port to which it is connected (sometimes also referred to as docked).
The serial data flow on each port is controlled by its respective control line. Each port interface consists of one control line and a set of serial data interfaces. For example, if each serial port is only one bit wide, one control line is used per data line. If two serial interfaces are configured as one port, one control line is used for the two-bit port, and so on, as shown in Fig. 11. Moreover, to minimize pin count, a serial interface control line can also be configured as a serial data interface line, as long as the criterion of one control line per port is met. The association of each control line with its port is configurable. The purpose of the control lines is to control the data flow between the I/O resources and the serial ports. If pin count is not a concern for the manufacturer, separate control pins can of course be provided. For some applications the control pins are not even needed, and the parallel interface of the I/O resource to the system bus is sufficient for exchanging control information.
The data transfer format between an I/O resource and the "AMPIC DRAM" serial ports is such that each memory chip (part of the same external bank) receives and transmits data bits on its port simultaneously, as shown in Fig. 12. This is best explained with an example. Assume four "AMPIC DRAMs" of the 2M × 8 form with a 32-bit-wide system interface (where a port is defined as a one-bit serial interface), as shown in Fig. 12. Each of the four chips receives data simultaneously. Chip 0 receives bit 0, chip 1 receives bit 8, chip 2 receives bit 16, and chip 3 receives bit 24. In the next cycle, all bit numbers are incremented by 1. This continues until all 32 bits have been transferred, so that each chip has received its 8 bits. Once completed, the process is repeated for the next 32 bits, and so on, as indicated in Fig. 12.
Consider now another example in which a port is defined as consisting of two serial interfaces, as shown in Fig. 13. The I/O resource is thus presented with an interface of 8 bits in total, and it must supply two bits to each "AMPIC DRAM" simultaneously. The bits are ordered so that chip 0 receives bit 0 and bit 1, chip 1 receives bit 8 and bit 9, chip 2 receives bit 16 and bit 17, and chip 3 receives bit 24 and bit 25, all simultaneously. In the next cycle, all bit numbers are incremented by 2. This continues until all 32 bits have been transferred, so that each chip has received its 8 bits. Once completed, the process is repeated for the next 32 bits, and so on.
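The bit ordering in the two examples above can be reproduced with a short calculation. The following is a minimal sketch; the function name `bit_schedule` and its parameters are hypothetical, and it assumes that each chip owns 8 consecutive bits of the 32-bit word and takes them lowest bit first, as the examples describe.

```python
def bit_schedule(port_width, chips=4, bits_per_chip=8):
    """Return, per clock cycle, the word bit numbers each chip receives.

    Chip k owns word bits [k * bits_per_chip, (k + 1) * bits_per_chip) and
    takes `port_width` of them per cycle, lowest bit first.
    """
    if bits_per_chip % port_width:
        raise ValueError("port_width must divide bits_per_chip")
    cycles = bits_per_chip // port_width
    return [
        [
            [chip * bits_per_chip + cycle * port_width + lane
             for lane in range(port_width)]
            for chip in range(chips)
        ]
        for cycle in range(cycles)
    ]


one_bit = bit_schedule(port_width=1)   # one serial interface per port
two_bit = bit_schedule(port_width=2)   # two serial interfaces per port
print(one_bit[0])  # [[0], [8], [16], [24]]
print(two_bit[0])  # [[0, 1], [8, 9], [16, 17], [24, 25]]
```

With one-bit ports the 32-bit word takes 8 cycles; with two-bit ports it takes 4, which is the faster transfer that grouping interfaces into a port is meant to provide.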
It should be noted that this architecture of the invention does not prevent I/O resources, such as network controller chips, from sharing the parallel system bus if so desired. This can be useful for tasks such as controller configuration and status management.
Preferably, the "AMPIC DRAM" is provided with one master clock pin, as shown in Fig. 19, and every serial interface is configured to operate at a multiple or submultiple of this clock rate, giving the flexibility to accommodate a variety of resources. It is also possible to provide more than one independent clock instead of a single master clock, limited only by device technology, pin count and cost. It should be noted that the clock frequency allocation is a characteristic of the serial interface, not of the buffer. Any of the "m" buffers can thus be docked to any of the serial ports and operate at that port's speed.
In addition, the configurability of the "AMPIC DRAM" of the invention allows a serial interface to be switched from one buffer to another without interrupting the transfer. This has many important applications in networking and graphics. One buffer can be loaded with a row-wide block of data in a single access while another buffer is being used to transmit information. Other vendors have implemented a somewhat similar two-buffer approach in VRAMs, called split buffer transfer, but it is quite different from the present invention: there, the interface between the buffer and the external I/O is always fixed and has the same width as the VRAM system data width. The multiplexer/crossbar switch module of the "AMPIC" of the invention completely eliminates all such limitations.
There can be more than one internal bank within the "AMPIC DRAM", connected by a row-wide bus, such that all the buffers reside on this bus; or, in another embodiment, a separate group of buffers can be provided for each internal bank.
The impact on system bandwidth of the frequently used "BitBlt" operations discussed earlier can be greatly reduced by connecting more than one bank of internal DRAM core via a row-wide interface, as shown in Fig. 14. When a transfer must be made from one internal bank of the memory to another, the appropriate row address of each bank is strobed simultaneously with the corresponding "RAS" signal. Once the data from the row that was read is available at the sense amplifiers, it is written to the other bank. Direction control is provided by internal logic, and Fig. 15 shows the sequencing of this operation. After one row transfer is complete, another can be started, and the process continues until done. This type of transfer is named "PRITI" (Parallel Row Internal Transaction Intervention). Obviously, while such an internal transfer is taking place, other accesses to the banks involved are not permitted. It should be noted that transfers on the serial interfaces can still proceed in parallel with this internal transfer. A similar concept, contrasted later with the features of the present invention, is disclosed in U.S. Patent No. 5,473,566 (issued December 5, 1995).
With this novel technique, a very large amount of data can be transferred in one row access time. As an example, consider an "AMPIC DRAM" with two internal banks, each of 1M × 8. The internal structure of each bank is 1K rows, each 8K bits wide (1K columns of 8 bits). With the "PRITI" capability, 8K bits can be transferred internally in one "RAS" cycle. This is a major improvement over existing approaches, in which data is transferred 8 bits at a time via the system bus interface, requiring 1K cycles and the corresponding arbitration for the same row even in the best case. If there are "r" rows and "c" columns, the total number of transfers required with the "PRITI" capability of the invention is "r", whereas the total number required with the conventional approach is "r × c".
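The "r" versus "r × c" comparison can be checked numerically for the 1M × 8 bank used in the example. This is a minimal sketch; the function `transfers_needed` and the constant names are hypothetical, and the column count of 1K per row is inferred from the 1K-row, 8K-bit-per-row organization stated above.

```python
def transfers_needed(rows, columns_per_row, use_priti):
    """Number of transfers needed to move `rows` rows between two internal banks."""
    return rows if use_priti else rows * columns_per_row


ROWS = 1024      # 1K rows per 1M x 8 bank
COLUMNS = 1024   # 1K columns of 8 bits each, i.e. 8K bits per row

conventional = transfers_needed(ROWS, COLUMNS, use_priti=False)
priti = transfers_needed(ROWS, COLUMNS, use_priti=True)
print(conventional, priti, conventional // priti)  # 1048576 1024 1024
```

Moving an entire bank therefore takes 1K row transfers instead of 1M bus transfers, a reduction by a factor equal to the number of columns per row.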
The process is the same for any number of banks. If "m" internal banks are connected by a row-wide interface, the "PRITI" module can transfer data from one bank to more than one of the remaining banks simultaneously. This is very useful when a broadcast packet is moved from one bank to all other internal banks. With the invention, no row-wide registers or latches (also referred to as a row-wide set of storage elements) are needed to perform this operation, so it can be implemented very economically.
The top-level internal structure of the "PRITI" module is shown in the aforementioned Fig. 14. The "PRITI" module is loaded with the starting row address of each bank and the transfer count. Once configured, it arbitrates to acquire the internal buses of both banks. Many variations of this basic concept are possible: for example, once configured, the "PRITI" module may perform a predetermined number of burst transfers after acquiring the right to access the rows, or it may release the bus after every transfer so that other resources can share the DRAM core.
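A software analogy may help make the module's configuration concrete. The sketch below is an assumption-laden model, not the circuit: the function `priti_transfer` and its parameters are hypothetical, banks are modelled as lists of rows, and each loop iteration stands for one row-wide transfer. It takes the starting row address of each bank and a transfer count, and it also covers the one-to-many case in which a row is written to several destination banks at once.

```python
def priti_transfer(banks, src, dst_list, src_row, dst_rows, count):
    """Copy `count` consecutive rows from bank `src` to every bank in `dst_list`.

    `dst_rows[i]` is the starting row address in `dst_list[i]`. Each loop
    iteration stands for one row-wide transfer (one RAS cycle in the text).
    Returns the number of row transfers performed.
    """
    for offset in range(count):
        row = banks[src][src_row + offset]          # row available at the sense amplifiers
        for dst, start in zip(dst_list, dst_rows):
            banks[dst][start + offset] = list(row)  # written to each destination bank
    return count


# Three tiny banks of 4 rows x 8 columns; real rows would be far wider.
banks = [[[b * 100 + r] * 8 for r in range(4)] for b in range(3)]
moved = priti_transfer(banks, src=0, dst_list=[1, 2],
                       src_row=1, dst_rows=[0, 2], count=2)
```

Here two rows of bank 0 are broadcast to banks 1 and 2 in two row transfers, regardless of how many columns each row holds.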
Another possible embodiment of the invention uses a row-wide set of storage elements, as indicated in Fig. 27 (or any circuit capable of performing the logically equivalent task), to perform a data exchange operation. As an example, for a 1M × 1 DRAM with 1024-bit-wide rows, the set of storage elements contains 1024 storage elements. Here, a row of one internal bank (called bank 2) is accessed with a read operation, and the data retrieved at the sense amplifiers of that bank is stored in the row-wide set of storage elements. Data is then retrieved from the other bank (bank 1) and written to bank 2. After this operation, the data held in the storage elements is written to bank 1. The diagram of Fig. 18 shows a suitable sequence for such an operation. This implementation requires less circuitry than the approach with two sets of storage elements described later, and still performs the data exchange, though at the cost of somewhat slower execution. The approach is general and applies to banks 1 through "m". This capability allows a massive exchange of information in a very short time, which makes it an extremely useful tool for multimedia and graphics applications. The implementation obviously requires more circuitry than the original approach because of the added set of storage elements, but there is no need to save the original data before new data is moved into its location.
Another modification of the invention uses two sets of storage elements, as indicated in Fig. 16 (or any circuit capable of performing the logically equivalent task), to perform a data exchange operation. Here, a row in each of the two internal banks is accessed simultaneously with a read operation, and the data retrieved at the sense amplifiers is stored in the row-wide sets of storage elements, as indicated in Fig. 16. The data thus retrieved and stored is then written back to the two rows simultaneously. The diagram of Fig. 17 shows an exemplary sequence for such an operation.
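The two exchange variants can be contrasted in a small model. This is only a sketch: the functions `exchange_one_set` and `exchange_two_sets` are hypothetical names, rows are modelled as Python lists, and the returned step counts (3 and 2) are a modelling assumption that reflects the sequential phases described above rather than measured cycle counts.

```python
def exchange_one_set(bank1, bank2, row1, row2):
    """Swap rows using a single row-wide set of storage elements (three phases)."""
    storage = list(bank2[row2])        # phase 1: bank 2 row -> storage elements
    bank2[row2] = list(bank1[row1])    # phase 2: bank 1 row -> bank 2
    bank1[row1] = storage              # phase 3: storage elements -> bank 1
    return 3


def exchange_two_sets(bank1, bank2, row1, row2):
    """Swap rows using two row-wide sets of storage elements (two phases)."""
    set_a, set_b = list(bank1[row1]), list(bank2[row2])  # phase 1: both rows read together
    bank1[row1], bank2[row2] = set_b, set_a              # phase 2: both rows written back together
    return 2


bank1 = [[1, 1], [2, 2]]
bank2 = [[9, 9], [8, 8]]
steps_one = exchange_one_set(bank1, bank2, row1=0, row2=1)
after_first = (list(bank1[0]), list(bank2[1]))   # rows are now swapped
steps_two = exchange_two_sets(bank1, bank2, row1=0, row2=1)  # swaps them back
```

The single-set variant trades one extra phase for half the storage, which matches the circuitry versus speed trade-off described in the text.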
The "PRITI" approach of the invention is therefore not limited to two internal banks, but is equally applicable to any multi-bank organization within a DRAM chip. It is even possible to add the "PRITI" capability to a conventional DRAM without the rest of the "AMPIC DRAM" architecture. A more sophisticated "PRITI" may, moreover, have transfer boundaries defined in terms of columns as well as rows, which requires additional registers to hold the column addresses.
Unlike the system of said U.S. Patent No. 5,473,566, the preferred embodiment of the invention requires only one row-wide set of storage elements, rather than one per internal bank. This makes the approach of the invention suitable for general application while keeping the DRAM structure relatively inexpensive.
Further improvements of the invention include:
a. Providing more than one row-wide bus, each with its own set of storage elements, connecting the plurality of banks, so that more than one "PRITI" transfer can take place in parallel. In general, with "m" banks, the maximum possible number of row-wide buses without redundancy is "m/2". If one set of the storage elements described above is used per bus, only "m/2" sets of storage elements are needed to perform "m" separate simultaneous "PRITI" transfers, one per bank;
b. If the number of banks is large, sub-grouping the banks on separate buses. As an example, in an 8-bank configuration, 4 banks can reside on one bus while the other 4 reside on a second bus, each group with its "PRITI" transfers implemented as described above, and the two sub-groups are then connected via another bus with any of the "PRITI" transfer capabilities described above.
Although a row-wide bus is used in this description, a bus that is a fraction of a row wide is also effective if cost reduction is required. It should also be noted that the memory need not be a DRAM to take advantage of this particular capability.
Impact of the "AMPIC DRAM" on Interface Design
This chip has a somewhat different pin-out that reflects its unique architecture. One possible pin-out for a 2M × 8 chip with 9 serial interfaces is shown by way of example in Fig. 19; the added pins require changes in the interface design of a main memory based on the "AMPIC DRAM".
A "WAIT" signal is provided to the system bus interface whenever an internal transfer between a buffer and the DRAM core is taking place. The CPU (or other master controller) can either use this signal to delay the start of its access or, in an alternative implementation, the access cycle can be extended so that the internal transfer completes before the access proceeds.
Because the "AMPIC DRAM" is highly configurable, a mechanism is needed to distinguish a regular DRAM core access from a configuration command or a buffer transfer interaction. The approach in Fig. 19 is to provide an additional control signal that indicates whether an access is a command access or a data access.
During a command cycle, the command instruction can be carried over the data lines, since they are not used during the "RAS" cycle. This is especially useful for internal transfer commands, where a DRAM core address must be supplied together with the buffer ID. This scheme allows the traditional "RAS" and "CAS" signals to supply the core DRAM address while the data lines carry the buffer number or any other additional information or instructions. In fact, it is possible to issue two commands: one when "RAS" goes active, and another when "CAS" is asserted. There are multiple ways to implement this access mechanism, and the choice also depends on device technology and cost considerations.
Although the "AMPIC DRAM" of the invention has more pins than a traditional DRAM because of the serial ports, if the "PARAS" type DRAM model proposed earlier in said copending application is used, this DRAM can be obtained with only a marginal increase in pin count.
Networking Application Example with the "AMPIC DRAM"
As explained earlier, according to the invention a serial interface or port is provided between each network controller and the main memory. Data movement between the controllers and the main memory is primarily serial. Serial data received from a network controller, or data to be transmitted to a network controller, is stored in a packet buffer assigned to it by the system bus master. This discussion assumes, of course, that the network controllers are able to source or receive serial data streams in the format required by the new system architecture.
Consider the same example as before, namely a 32-bit-wide bus with four 2M × 8 "AMPIC DRAMs" in place of traditional DRAMs, row-wide packet buffers, and four network interfaces, where a user on, say, an Ethernet network sends a packet of 1024 bytes to another user on, say, an FDDI network. In the new system architecture with main memory based on the "AMPIC DRAM", as shown in Fig. 20, the data is received by the serial port on the "AMPIC" connected to the Ethernet controller. No arbitration is required, and no main memory bandwidth is consumed in the transfer. After the data transfer is complete (each of the four "AMPIC DRAMs" receives 256 bytes), the data can be transferred in its entirety to the DRAM core in just one access, once the packet buffer has acquired the internal bus through arbitration. When a row address is supplied to the DRAM core, its sense amplifiers hold all the data bits of that row. The entire packet buffer can thus be stored in one access. If the packet buffer is smaller than one row, multiple (though still few) accesses are needed.
This is a major improvement over the existing parallel bus solution described earlier, which requires 256 accesses and the corresponding arbitrations.
After the packet has been transferred to the "AMPIC DRAM" core, it is processed by the CPU and, in this example, redirected to the FDDI port. The reverse process now takes place. The packet is transferred from the core to the appropriate packet buffer in a single access, which requires arbitration. The data is then transferred from the packet buffer to the FDDI controller via the serial ports, and moves from the FDDI controller chip onto its network. Again, this reverse process requires only one arbitration for its transfer, compared with 256 transfers and the corresponding arbitrations in existing designs.
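The access counts quoted for this example follow directly from the packet size and the bus width. The sketch below simply restates that arithmetic; the variable names are hypothetical, and the single-access figure assumes a row-wide packet buffer, as in the example above.

```python
PACKET_BYTES = 1024
BUS_WIDTH_BYTES = 4        # 32-bit system bus
CHIPS = 4                  # four 2M x 8 AMPIC DRAMs

# Conventional parallel-bus design: one arbitrated access per bus-wide word.
conventional_accesses = PACKET_BYTES // BUS_WIDTH_BYTES

# AMPIC DRAM: the serial ports fill the packet buffers without arbitration,
# then one row-wide access moves the whole buffer into the DRAM core.
bytes_per_chip = PACKET_BYTES // CHIPS
ampic_accesses = 1

# Receive path plus transmit path (core -> buffer) for the forwarded packet.
print(2 * conventional_accesses, 2 * ampic_accesses, bytes_per_chip)  # 512 2 256
```

Forwarding one 1024-byte packet thus costs two arbitrated core accesses in total instead of 512.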
A further advantage of the novel DRAM of the invention is the significant gain available for broadcast packets, which can be loaded into all the appropriate buffers in only one access and then transmitted to the entire network. In addition, multiple rows can be accessed sequentially, loaded into different buffers, and then transferred via their ports, giving a considerable improvement in performance.
Graphics/Multimedia Application Example with the "AMPIC DRAM"
As pointed out earlier, a major part of the bandwidth of any graphics system is consumed by "BitBlt" operations, in which a massive amount of data must be moved from one area of memory to another. This consumes a considerable portion of system bandwidth, and therefore the DRAM used for graphics purposes is generally kept separate from the main system memory. This, however, adversely affects system cost. The invention also provides a way to eliminate the need for two separate buses, as discussed later.
Take the earlier example of prior-art Fig. 5, in which 16 rows of the display need to be updated, but now with DRAM parts that are "AMPIC DRAMs" of the same 2M × 8 size, each with two internal banks of 1M × 8 and equipped with the "PRITI" capability. The 16 rows of data can be transferred to the new location with a number of data transfers that is exactly the same as the number of rows:
Number of transfers = number of rows to be transferred (16).
This is another major improvement over the 16,384 transfers and corresponding arbitrations required by current designs based on ordinary DRAM, and represents a performance improvement of three orders of magnitude. It also reduces the impact on system bandwidth proportionally. According to the invention, a massive amount of data can be transferred in just one row access time. The novelty of this architecture at both the system level and the chip level allows the unique DRAM configuration to deliver enhanced system performance.
The "AMPIC DRAM" can also be configured, as described above, to supply graphics screen data to the display at high speed.
Consider, for example, the following illustrative parameters applied to the configuration of Fig. 21:
a. each chip has 5 serial interfaces;
b. four such chips implement a 32-bit-wide system bus;
c. 4 serial interfaces of each chip are defined as one port and used to transfer display data; and
d. the data retrieval rate of each port is 100 MHz (it can be faster than this).
In this example, because 4 serial interfaces per chip are used for graphics, a 16-bit-wide graphics interface is implemented that can supply data at a rate of 2 bytes per clock, giving a bandwidth of 200 megabytes per second, which is sufficient for most graphics applications. If an "AMPIC DRAM" with 9 serial interfaces is used, each chip can be configured for an 8-bit-wide bus to provide a still larger display data bandwidth.
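The 200-megabyte-per-second figure can be verified from the parameters listed above. This is a minimal sketch; the function name `display_bandwidth_bytes_per_s` is hypothetical, and it assumes each serial interface delivers one bit per clock.

```python
def display_bandwidth_bytes_per_s(chips, interfaces_per_port, clock_hz):
    """Bytes per second from one display port per chip, one bit per interface per clock."""
    interface_bits = chips * interfaces_per_port
    return interface_bits * clock_hz // 8


bw = display_bandwidth_bytes_per_s(chips=4, interfaces_per_port=4,
                                   clock_hz=100_000_000)
print(bw)  # 200000000
```

Under the same assumptions, an 8-interface port on each of the four chips would give 400 megabytes per second; that figure is an extrapolation and is not stated in the text.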
The example above uses one external bank with a 32-bit-wide system bus. Some applications, however, can use more than one external bank, for example 4 banks of "AMPIC DRAMs", each bank 32 bits wide, as shown in Fig. 22. This architecture allows different network interfaces to be connected to each bank if desired. Compared with current solutions, this can greatly increase network interconnection capability at a reasonable cost. As an example, if each "AMPIC DRAM" provides 9 serial interfaces and each bank is connected to 4 network interfaces, a total of 16 networks can be connected. This is clearly a major gain over the prior art, which usually tops out at 4 to 5 interfaces.
With this architecture, when more than one external bank is used, the "AMPIC DRAM" serial ports of one bank can also be connected to the serial ports of another bank. This provides an additional path between banks, so that the packet buffers can be used to transfer data quickly between external banks.
It is not necessary to use only "AMPIC DRAMs" in a system configuration. Some applications can mix "AMPIC DRAMs" with existing types of DRAM, as proposed in the variant of Fig. 23.
In another system configuration, the "AMPIC DRAM" can be used both to provide the graphics or display interface, as indicated in Fig. 24, and to connect to other types of I/O resources, such as video cameras or satellite interfaces.
Unified Memory Architecture
In an ideal environment, as explained earlier, it is best to have one common memory chip for both the graphics and the main memory functions while still providing the necessary performance. This is called the "Unified Memory Architecture" (UMA) approach. It is currently the subject of intense debate, and several solutions have been proposed, including the RDRAM chip mentioned above. Although that proposal uses fewer pins, resulting in lower power consumption, a smaller investment and relatively low cost, its packet-based protocol and the interface limitations discussed earlier prevent it from working efficiently as main memory, where accesses tend to be non-localized.
Another possible solution is to use the VRAM mentioned earlier for both main memory and graphics memory, but the added cost does not justify the change.
Before the present invention, therefore, no reasonable solution had appeared that meets the widely diverse needs of the PC market.
A system-level solution based on the "AMPIC DRAM" of the invention, providing the configurable serial interfaces and the "PRITI" capability discussed earlier in connection with the embodiment of Fig. 19, certainly fills this void. It has more signal pins than the part proposed by RAMBUS (though perhaps fewer power and ground pins) but fewer than a VRAM, and it is equally efficient in both kinds of operation. In fact, with this implementation of the invention, both the graphics and the main memory functions can reside in the same memory with a negligible reduction in bandwidth, thereby achieving the elusive goal of the "Unified Memory Architecture".
Another possible embodiment of this solution has two internal banks on the chip. One follows the "AMPIC" model and can be used for graphics or similar applications, while the second internal bank, which may be larger, can resemble traditional DRAM-based memory, with both banks sharing the "PRITI" capability of the invention, as shown in Fig. 25. This effective combination provides the best of both worlds: one bank appears as main memory, and the other appears as optimized graphics memory. Because the "PRITI" capability allows a massive amount of data to be transferred between the two internal banks with very little impact on system bandwidth, this chip architecture provides all the capabilities needed for a common chip and a single bus that serve all types of applications.
If the interface access mechanism of the so-called "PARAS" DRAM of said copending application is used together with the "AMPIC", as in Fig. 26, the invention can be further improved to reduce pin count and cost, yielding a chip optimized for both main memory and graphics memory requirements at the lowest possible cost. Considering the example of a 2M × 8 DRAM with the "PARAS" interface, 8 pins are saved, and these can be used to provide the serial interfaces. If only 5 serial interfaces are implemented, the pin count of this high-performance, low-cost chip is comparable to that of a traditional DRAM, with, of course, the significant added benefits explained earlier. The advantages of such a combined DRAM are:
a. a greatly enhanced system data bandwidth, obtained by architectural innovation rather than by device speed alone;
b. the ability to move massive amounts of data to and from multiple I/O resources with minimal impact on system bandwidth;
c. configurability to accommodate the different data transfer rates of the I/O resources;
d. the ability to move very large blocks of data inside the chip in a time frame that is orders of magnitude shorter and with negligible impact on system bandwidth;
e. a low pin count for the functionality provided;
f. relatively low cost because of the reduced pin count;
g. relatively low power consumption;
h. reduced latency between the reception of an incoming packet and its subsequent transmission;
i. interconnection of a much larger number of I/O resources than with traditional approaches;
j. a system design interface that is almost identical to that of existing DRAM, which minimizes the design cycle; and
k. equally efficient operation for both main memory and graphics requirements, thus providing a Unified Memory Architecture.
Further modifications will occur to those skilled in this art, including providing logic to unload the same packet buffer to other ports of similar definition, providing the ability to cascade buffers, or applying the serial interface and buffer switching to memory devices other than DRAM. All such modifications are considered to fall within the spirit and scope of the invention as defined in the appended claims.

Claims (27)

1. An improved DRAM architecture for use in a system having a master controller, such as a central processing unit (CPU) with parallel data ports, and a dynamic random access memory (DRAM), each connected to and competing for access to a common system bus interface, the architecture comprising a multi-port internally cached DRAM (AMPIC DRAM), the AMPIC DRAM comprising: a plurality of independent serial data interfaces, each connected between a separate external I/O resource and the internal DRAM memory through a corresponding buffer; a switching module interposed between the serial interfaces and the buffers; and switching module logic control for connecting the serial interfaces to the buffers under dynamic configuration by the bus master controller, such as said CPU, so as to switch the allocation as appropriate for the desired data routing.
2. A system as claimed in claim 1, wherein the switching module comprises one or more multiplexers or crossbar switches, or a combination thereof.
3. A system as claimed in claim 1, wherein said memory is the DRAM core main memory of the system.
4. A system as claimed in claim 3, wherein the buffers are data packet buffers, and means is provided for arbitrating access to the bus interface among the various active packet buffers and the CPU, no arbitration being required for receiving data into or transmitting data from the packet buffers via the serial interfaces.
5. A system as claimed in claim 1, wherein the AMPIC DRAM switching module assigns any buffer to any serial interface without any intermediate step of transferring data between the buffers and the core DRAM.
6. A system as claimed in claim 1, wherein each independent serial interface is one bit wide.
7. A system as claimed in claim 6, wherein a plurality of the one-bit-wide serial interfaces is configured, with a common I/O resource, as a narrow-width bus or port connected to a common buffer.
8. A system as claimed in claim 7, wherein each buffer is able to interface with all the serial interfaces simultaneously when they are defined as a port, and each buffer is configured for the same port size as the port to which it is connected or docked.
9. A system as claimed in claim 1, wherein a control line is provided for each port interface to control the serial data flow between the corresponding I/O resource and the serial port.
10. A system as claimed in claim 1, wherein a plurality of AMPIC DRAM chips is provided, each connected between the bus interface and the I/O resource serial interfaces, with one or more serial interfaces used as a port.
11. A system as claimed in claim 1, wherein at least two internal DRAM core banks are provided, connected via a row-wide interface with the row addresses strobed simultaneously with the corresponding RAS signal cycles, so that after data is read from a row of one bank, it is written to at least one other bank.
12. A system as claimed in claim 11, wherein direction control is provided by internal logic so that, after one row transfer is complete, another transfer can be started, the resulting parallel row internal transaction intervention (PRITI) continuing until completion.
13, system described in claim 12 is characterized in that being provided with the visit that refusal is examined DRAM during this inner the transmission, but allows the device of the transmission on serial line interface during this inner transmission.
14, system described in claim 13, it is characterized in that two line width, two memory unit groups carry out interface between described memory bank, and be provided with and visit delegation in each memory bank, storage simultaneously with described operation and advance described memory unit, write back to the device in two sources simultaneously then.
15, system described in claim 13, it is characterized in that a line width memory unit group is set on this journey width bus interface, and being provided with the line data of a memory bank of storage, the line data of a described memory bank is written to described at least one other memory bank after at least one other memory bank is write data a described memory bank.
16. The system of claim 1, wherein the AMPIC DRAM chip is provided, in addition to separate serial interface pins, with address, data, RAS, CAS, write, wait, command and master clock pins on the bus interface side; the CPU uses the wait signal, whenever an internal transfer is taking place between a buffer and the DRAM core, either to delay the start of an access (wait) or to extend the access cycle so that the internal transfer completes before the access is processed; the command control signals for an access are supplied on the data lines, which are unused during the RAS cycle; the RAS and CAS lines provide the core DRAM address and the data lines provide the buffer number or additional instruction information; and the master clock controls the serial interfaces.
17. The system of claim 1, adapted for networking applications, comprising a plurality of network controllers, each serially interfaced with a corresponding one or more banks of main-memory AMPIC DRAM, which in turn are connected to one side of the bus interface, with the CPU connected to the other side of the bus interface, wherein data movement between the controllers and main memory is primarily serial, and serial data received from a network controller, as data to be sent to a network controller, is stored in the packet buffer assigned to it by the system bus master CPU.
18. The system of claim 17, wherein, when one network controller sends data to a second network controller, the data received through the serial port of said one network controller is transferred to the corresponding DRAM-based main memory without arbitration and without consuming main-memory bandwidth; the data is transferred to the corresponding DRAM core in only a single access, after the packet buffer has gained the bus through arbitration, with the row address data supplied to the DRAM core; and the packet transferred into the AMPIC DRAM is processed by the CPU and re-routed to the port of said second network controller, the data being transferred to the corresponding packet buffer in a single access after arbitration and then sent through the corresponding serial port to said second network controller and its network.
19. The system of claim 1, adapted for graphics/multimedia applications involving the transfer of a minimum number of display rows, comprising a plurality of display serial-interface ports connected to a plurality of memory AMPIC DRAMs, which in turn are connected to one side of said bus interface, with the CPU connected to the other side of the interface, wherein data movement between the display ports and memory is primarily serial, and serial data received from a display interface is stored in the packet buffer assigned by the system bus master CPU.
20. The system of claim 19, wherein means is provided for transferring said number of display rows with an equal number of data transfer operations.
21. The system of claim 17, wherein one or more additional external banks of conventional DRAM, having no serial interface, are also connected to the bus interface.
22. The system of claim 17, wherein one or more additional external banks of AMPIC DRAM are also provided, connected to the bus interface and serially interfaced with graphics display data ports.
23. A method for eliminating DRAM system bandwidth limitations, materially increasing data transfer speed, materially reducing bus arbitration requirements, enabling increased I/O resource interfacing, and reducing cost and power consumption, in a system having a parallel-data-port CPU controlling a system bus interface connected to memory comprising one or more banks of DRAM units, the method comprising: equipping each DRAM unit with at least one supplemental serial data port for interfacing, through a corresponding serial interface, with an external I/O data resource;
providing, internally of each DRAM, a buffer for each serial interface, with a switching module interposed between the buffers and the serial interfaces; and causing the CPU to control the connection of the serial interfaces to the buffers by dynamically configuring the switching of the switching module as appropriate for the desired data routability.
24. The method of claim 23, wherein said switching is effected by multiplexing, by crossbar switching, or by both.
25. The method of claim 23, wherein said switching assigns any buffer to any serial interface without any intermediate step of transferring data between the buffer and the DRAM memory, and wherein arbitration for bus interface access is performed between the buffers and the CPU, whereas no arbitration is required to receive data into, or transmit data from, the buffers via the serial interfaces.
26. The improved DRAM architecture of claim 1, wherein a single chip internally contains at least two DRAM banks together with the switching module and buffers, and a parallel row internal transaction intervention (PRITI) means for internal row transfers of data, connected so that at least one bank, serially interfaced to an I/O graphics display, uses another bank mainly as the main-memory core; the CPU, accessing one or both banks, moves data into said other bank, and the data is moved between the banks under the control of the PRITI means, thereby providing a chip suitable for a unified memory architecture.
27. An architecture for an improved memory unit for use in a system having a master controller, such as a central processing unit (CPU) with parallel data ports, and a random access memory unit, each connected to and competing for access to a common system bus interface, comprising a multi-port internally cached memory unit that includes: a plurality of independent serial data interfaces, each connected between a separate external I/O resource and the internal memory of the unit through corresponding buffers; a switching module interposed between the serial interfaces and the buffers; and a switching module logic control for connecting the serial interfaces to the buffers under dynamic configuration by the bus master controller, such as said CPU, for switching allocation as appropriate for the desired data routability.
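
The data paths recited in the claims above can be pictured with a small behavioural model. The sketch below is illustrative only and is not taken from the patent: the class `AmpicDramModel`, its method names, and the arbitration counter are all hypothetical, and timing, pins, and row widths are ignored. It shows three points from the claims: the CPU dynamically configures which buffer each serial interface is switched to; serial traffic reaches a buffer without any bus arbitration; and a buffer is moved to or from a core DRAM row in a single arbitrated access.

```python
from dataclasses import dataclass, field


@dataclass
class AmpicDramModel:
    """Toy model of one chip; every name here is hypothetical."""
    num_interfaces: int
    num_buffers: int
    num_rows: int
    route: dict = field(default_factory=dict)    # serial interface -> buffer
    buffers: dict = field(default_factory=dict)  # buffer -> buffered data
    core: dict = field(default_factory=dict)     # row address -> row data
    bus_arbitrations: int = 0                    # arbitrated bus accesses so far

    def configure(self, mapping):
        """CPU-side dynamic configuration of the switching element."""
        for iface, buf in mapping.items():
            if not 0 <= iface < self.num_interfaces:
                raise ValueError(f"no serial interface {iface}")
            if not 0 <= buf < self.num_buffers:
                raise ValueError(f"no buffer {buf}")
        self.route = dict(mapping)

    def serial_receive(self, iface, bits):
        """Serial data lands in the routed buffer; no bus arbitration."""
        self.buffers.setdefault(self.route[iface], []).extend(bits)

    def serial_send(self, iface):
        """Drain the routed buffer onto the serial side; no bus arbitration."""
        return self.buffers.pop(self.route[iface], [])

    def store_buffer(self, buf, row):
        """Buffer -> core DRAM row in one arbitrated access."""
        if not 0 <= row < self.num_rows:
            raise ValueError(f"no row {row}")
        self.bus_arbitrations += 1
        self.core[row] = self.buffers.pop(buf, [])

    def load_buffer(self, row, buf):
        """Core DRAM row -> buffer in one arbitrated access."""
        self.bus_arbitrations += 1
        self.buffers[buf] = list(self.core[row])


# Controller on interface 0 forwards a packet to the controller on interface 3.
chip = AmpicDramModel(num_interfaces=4, num_buffers=4, num_rows=8)
chip.configure({0: 0, 3: 1})
chip.serial_receive(0, [1, 0, 1, 1])
chip.store_buffer(0, row=2)          # first arbitrated access
chip.load_buffer(2, 1)               # second arbitrated access
forwarded = chip.serial_send(3)

# Re-switching a buffer to another interface, with no trip through the core.
chip.configure({0: 0})
chip.serial_receive(0, [0, 1])
chip.configure({3: 0})
redirected = chip.serial_send(3)
```

In this model the store-and-forward path costs two arbitrated accesses in total, while the second path reassigns a filled buffer to a different serial interface and costs none, which is the distinction the claims draw between bus-side and serial-side transfers.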
CN96180069A 1995-12-29 1996-08-12 High performance universal multi-port internally cached dynamic random access memory system, architecture and method Expired - Fee Related CN1120495C (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US08/581,467 US5799209A (en) 1995-12-29 1995-12-29 Multi-port internally cached DRAM system utilizing independent serial interfaces and buffers arbitratively connected under a dynamic configuration
US08/581,467 1995-12-29

Publications (2)

Publication Number Publication Date
CN1209213A true CN1209213A (en) 1999-02-24
CN1120495C CN1120495C (en) 2003-09-03

Family

ID=24325313

Family Applications (1)

Application Number Title Priority Date Filing Date
CN96180069A Expired - Fee Related CN1120495C (en) 1995-12-29 1996-08-12 High performance universal multi-port internally cached dynamic random access memory system, architecture and method

Country Status (15)

Country Link
US (2) US5799209A (en)
EP (1) EP0870303B1 (en)
JP (1) JP3699126B2 (en)
KR (1) KR100328603B1 (en)
CN (1) CN1120495C (en)
AT (1) ATE197101T1 (en)
AU (1) AU721764B2 (en)
CA (1) CA2241841C (en)
DE (1) DE69610714T2 (en)
DK (1) DK0870303T3 (en)
GR (1) GR3035261T3 (en)
HK (1) HK1018342A1 (en)
IL (1) IL125135A (en)
TW (1) TW318222B (en)
WO (1) WO1997024725A1 (en)

Also Published As

Publication number Publication date
HK1018342A1 (en) 1999-12-17
CN1120495C (en) 2003-09-03
AU6529596A (en) 1997-07-28
US5799209A (en) 1998-08-25
IL125135A (en) 2002-12-01
EP0870303B1 (en) 2000-10-18
ATE197101T1 (en) 2000-11-15
DK0870303T3 (en) 2001-02-26
WO1997024725A1 (en) 1997-07-10
CA2241841A1 (en) 1997-07-10
DE69610714T2 (en) 2001-05-10
EP0870303A1 (en) 1998-10-14
KR19990076893A (en) 1999-10-25
CA2241841C (en) 1999-10-26
JP2000501524A (en) 2000-02-08
KR100328603B1 (en) 2002-10-19
DE69610714D1 (en) 2000-11-23
US6108725A (en) 2000-08-22
IL125135A0 (en) 1999-01-26
GR3035261T3 (en) 2001-04-30
TW318222B (en) 1997-10-21
JP3699126B2 (en) 2005-09-28
AU721764B2 (en) 2000-07-13

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20030903

Termination date: 20150812

EXPY Termination of patent right or utility model