TW201019120A - Method, apparatus and system for reducing memory latency - Google Patents

Method, apparatus and system for reducing memory latency Download PDF

Info

Publication number
TW201019120A
TW201019120A TW98136752A TW98136752A TW201019120A TW 201019120 A TW201019120 A TW 201019120A TW 98136752 A TW98136752 A TW 98136752A TW 98136752 A TW98136752 A TW 98136752A TW 201019120 A TW201019120 A TW 201019120A
Authority
TW
Taiwan
Prior art keywords
memory
data
instruction
reducing
group
Prior art date
Application number
TW98136752A
Other languages
Chinese (zh)
Other versions
TWI467381B (en
Inventor
Alan Ruberg
Seung-Jong Lee
Hyung-Rok Lee
Dae-Yun Shim
Dongyun Lee
Sung-Joon Kim
Anu Murthy
Original Assignee
Silicon Image Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Silicon Image Inc filed Critical Silicon Image Inc
Publication of TW201019120A publication Critical patent/TW201019120A/en
Application granted granted Critical
Publication of TWI467381B publication Critical patent/TWI467381B/en

Links

Landscapes

  • Information Transfer Systems (AREA)
  • Memory System (AREA)

Abstract

A method, apparatus and system for reducing memory latency is disclosed. In one embodiment, data between a host computer system and a memory is communicated via a port or a group of ports at the memory over multiple time intervals, wherein the host computer is coupled to the memory. Further, a command associated with the data is communicated between the host computer system and the memory via the port or the group of ports over a single time interval.

Description

201019120 六、發明說明: 【發明所屬之技術領域】 —本2明之實施例大致上是關於電腦記憶體之領域,更 特疋而5 κ針對串列琿記憶體溝通潛時及可靠度之改良。 【先前技術】 在使用同速串列介面之記憶體系統中,一主機(嬖如系 統整合晶片、電腦、圖形控制器或其他等)或複數個^機及 系經由各埠以傳輸指令及資料,其被期望能提 、取,丸並結合錯誤偵測,以確保正常的系統運作。 串列連接(Serial Link)具有固有之潛時(Latency),因為 ^次僅傳送—位元。此外’串列化及解串列化步驟將產 領外之潛時。分別使用這些埠並不能明顯改善潛時,因 此利用-種特殊的存取方法(譬如從各埠存取不同的、專用 的兒憶體區域’譬如串化存取(StHpedA⑽ 寬°藉由啟動埠連結(PortBindin細用一致的多個蜂善)頻 ❹:憶體潛時可藉由包含許多位元之資料同時傳送而降低, ”增加頻寬而無需格式化的存取方法。 5己憶體亦需要—定量的資料安全性。舉例而言,在一 ^列通道中’除非制會導致不可接受潛時之方法,否則 其可能發生無法偵狀錯誤。在—種連結埠之情況中 些埠在指令期中仍處於閒置。在那些相同的期間中,未使 =寬係填滿了重複的指令。此方法可利用二性= 之伸至早一埠,以提供埠構造之特色。 圖一顯示一種習知串列位元配置100,依EIA標準 3 201019120 RS-232-C。在圖中,其資料之串列傳送係相似於一種 RS-232連接,其中藉由對其逐個輪流觀察並指定其於值 124中不同之思義,各別的二位元值(位元)ι至us可被 組合為整體值124。舉例而言,若其第一位元為1〇4,且被 指定為值124中最大有效位元(M〇st Significant BU, MSB) ’接著是第二位元1〇6及其他等等(譬如位元ι〇8、 110、112、114及116等),直到最小有效位元如㈣ Significant Bit,LSB)填入了最後一溝通之位元ιι8。在此 例中之組合值被稱為框(Frame)128,其包含值124及停止 及開始位元102及120。此外,框128係利用了額外位元 加以描述,稱為成框位元(Framing Bits)126,其包含開始 位元120及停止位元122,使接收器可用以尋找框128的 開始,並可破認預期中框何時到達。在其他架構中,即使 當發送器及接收器間資料速率有些微不同或改變時,亦可 利用成框位元126以利接收器可靠地尋找個別位元。 ❹ 冑越各串列連結對記憶體之溝通將引起大幅潛時,且 提供超過-個主機存取一單一記憶體將引起記憶體資源之 稷雜=。此外,記憶體可能具有—或多埠,各包含一 發送器帛列接收器及相關電路以改善潛時及頻寬。在 (BQUnd pQn Si卜某些痒在指令期 間仍為閒置。在同樣時期φ,去μ m ' 性重藉Μ八使用之頻寬將填滿了空間 =广t此方法可利用在不同時期中之時間重複 从伸至早-蟑,以提供所料構造之特色。在 車的情況中’資料係傳送於多淳,但是指令必須能獨立運。 4 201019120 =Γ首先、,未使用之埠可能包含重複的指令(Command Duplicates)。第二,甘比 κ 因為對於錯誤管理之串二歹;令可能f布於相同時間。此外, .之串列化、解串列化、使資料成框,以 t!步化之額外步驟,因此串列溝通增加潛時於平行 溝通。 【發明内l界】而要導人及實施能減少記憶體潛時之技術。 Φ ^發=揭露-種用於改善埠記憶體溝通潛時及可靠度 之方法、裝置及系統。 夕古、土 例中’本發明提供一種用於降低記憶體潛時 所述夺^3於一主機電腦系統及一記憶體之間經由位於 ''°之一埠或一組埠於多時間間隔溝通資料,其中 ==機電腦系統係搞合於所述記憶體,所述蜂為所述組 τ埴赤邮及於所述主機電腦系統及所述記憶體之間經由所 斤述組埠於一單一時間間隔溝通相關於所述資料之 Φ 一才曰令。 之裝署Γ&例巾本發明提供一種用於降低記憶體潛時 主機電腦系統’輕合至一記憶體,所述記 戶 =主=位=所述記憶體之一琿或一組蜂於多時間間隔從 二 m祕收資料,所料為所❹埠之一;及 ,、中所述記憶體係經過修改,以麫 一單一時間間隔從所述主機電收^ 乂組於 电脑系統接收相關於所述資料 知令。 在實施例中’本發明提供一種用於降低記憶體潛時 5 201019120 之系統,包含一主機電腦系統,耦合於一記憶體,所述記 憶體使用一埠連結系統以降低記憶體潛時,所述埠連結系 統包含複數個埠以溝通資料及指令,其中所述複數個埠之 二或以上之埠係可被連結為一或多組埠,所述埠連結系統 係用以:經由位於所述記憶體之一埠或一組埠於多時間間 隔而於所述主機電腦系統及所述記憶體之間溝通資料,所 述埠係為所述組埠之一;及經由所述埠或所述組埠於一單 一時間間隔於所述主機電腦系統及所述記憶體之間溝通相 ®關於所述資料之一指令。 【實施方式】 本發明之實施例大致上是關於串列埠記憶體溝通潛時 及可靠度之改良,然而,其同樣可應用於其他形式之介面, 譬如高速並列介面(High-Speed Parallel)。 此中所述之「§己憶體(Mem〇ry)」係意指在電腦系統(嬖 如圖二D及二E)中之一元件,其負責操取先前儲存之資 ⑩料X使用於任何「主機(Host)」(譬如一電腦處理器)或週 邊(譬如鍵盤、顯示器、攝像機、大量儲存裝置(磁碟、光 ,、磁帶及其他等)、網路控制器、或無線網路)。一般而 a,纪憶體係柄合於-或多個在電腦系統中處理資料之微 處理器。資料可藉由一主機而儲存於記憶體中,壁如於隨 機存取記憶體(RAM)、靜態隨機存取記憶體(sram)、動離 酼機存取記憶體(DRAM)、快閃記憶體(Flash)、可 記憶體(PROM)、可抹除可鋥—、^ $ 口貝 j徠除了程式化唯讀記憶體(EPR〇M)、電 可抹除可程式化唯讀記憶體(EEPR0M)或譬如預先決定 6 201019120 的唯讀記憶體(ROM)。主機可經由一匯流排(譬如pci)或經 由中間3己憶體控制器以直接地存取記憶體。對記憶體之串 歹J存取牵涉到單一序列電子訊號經過單一電路之行進及往 ^記憶體(相似於圖—中所示之RS 232)之有意義指令及 責料間之轉換。執行轉換之電路稱為「埠(p〇rt)」。 在一實施例中,為了減少記憶體之潛時,因此利用一 種,罩機制(Masking Scheme),使得指令及資料之寫入可 ㈤在相同溝通框(Communication Frame)中包含遮蔽資訊 而加以描述,以降低框中之位元數及潛時。此外,為了降 低潛時因此提供一種基於記憶體之協定,以降低更短之框 ^ j之,曰時’以提供相對於舊有的動態隨機存取記憶體協 疋而S更大之彈性,並減少增加頻寬時指令組之改變。 圖一 A顯不一種單一主機連結埠記憶體2〇〇之實施 例。實施例圖中連結埠記憶體200包含一記憶體核心 2〇2(譬如DRAM或快閃記憶體),其包含多記憶庫(Bank)(譬 © 庫)且相連於連結蟑記憶體200之連結埠記憶體系統 2〇4 5己憶體核心202之記憶庫係溝通於多埠,譬如四埠 206至212。所有四埠2〇6至212 一起運作以提供介面至具 #可變頻見之單-主機。這些記憶庫係獨立地使用,譬如 同時從一记憶庫讀取並從另一記憶庫寫入。記憶體核心 202亦可包含一記憶體讀取匯流排以讀取資料,及一記憶 f寫入匯流排以寫入資料。然而,在記憶體核心搬中可 月b具有單一連接可讀取並且寫入資料。此外,連結埠記 憶體系統204包含一連結多工器綱及一連結解多工器 7 201019120 262 ’將分別描述於圖二κ及二i。 =知技術中,所有指令及資料位元之傳送係立刻傳 送、各平仃線路,其同時到達於編碼形成時。$而, 度變快時,流經這些各線路之資料可能不正確地或於^誤 的時間被取樣(譬如使用相關時脈訊號)。為了解決高速取 樣問題,因此利用一種自取樣串列訊號(譬如圖一、之 RS:232)。然而,平行方法將使潛時增加,因為其資料係依 時處理。在-實施例中,對於串列蟑與記憶體溝通之溝通 ©潛時及可靠度係利用了多串列介面或痒之編組及重複指令 (Duplicating CoInmands) ’其可為接連地或時間性地,亦或 於多埠間之相同時間或空間性地。在此中說明了此技術之 實施例,並利用單一主機埠連結之埠記憶體2〇〇加以實施。 在一實施例中,一發送器(Tx)可將十六位元之平行資 料轉換為一種串列位元串流,並傳送此單一位元串流。而 一接收器(Receiver,RX)可接收十六位元單一串流並將其 ❹轉換為一種平行串流。在此例中,一區域記憶體可為三;^ 二位元寬且於大約相同之速率。在圖中之埠記憶體2〇〇 中,係利用四埠206至212,因此四埠2〇6至212間各通 道具有一種一百二十八位元資料運動,隨著一種六十四位 元之資料串流(譬如,十六位元乘上四槔等於各通道六十四 位元)。此一百二十八位元運動係由晶片中必要電路所支 援。 不同於習知技術,實施例中各埠2〇6至212利用一種 串化器/解串器以於一較快之速率對資料串流進行串列化 8 201019120 及解串列化。舉例而言,可利用一鎖相迴路(Phase Locked Loop,PLL)(譬如圖中244)以將一輸入時脈乘以較高之速 率,以匹配用於取樣各位元之輸入資料速率。雖然資料串 流到達埠206至212之時間可能稍有不同,其可使這些資 料串流流動更快。換言之,各串流係以相同速度並快更多 的速率而流動(即各資料串流可與其他資料串流以相同之 速率流動;然而,各資料串流可被設定為較其先前之流動 速率更快)。此外,各位元之時序可能並非完全完美,但並 ®不需要對準這些位元,因為這些位元的實際到達時間並不 造成影響。因此,並非於接腳(Pin)處對各位元同步,取而 代之在各埠206至212中解串化之後對這些位元同步。舉 例而言,如同星號230至236所示。同樣地,在解串化這 些埠206至212之後,位於或接近這些星號230至236之 資料位元之流速相較於高速外部記憶體介面214至228(分 別表示 TxO、Txl、Tx2、Tx3、RxO、Rxl、Rx2 及 Rx3)(譬 &如兩百五十皮秒)而言可能慢上二十倍(譬如五奈秒)。在一 實施例中,這些埠206至212能夠進行相位偵測、資料位 元管理、資料位元取樣及通道對準(Lane Alignment)等。 指令解讀器248繼續處理指令,根據通道構形(進一步 描述於圖二C、七C及七D中)之附屬/密切相關之指令。 其中240表示PLE訊號、242表示/LPD訊號、246表 示REFCLK訊號、250表示模式暫存器、250表示指令訊 號、252表示三角、258表示庫訊號及260表示遮罩訊號。 圖二B顯示四埠記憶體200之一種單一主機連結270 9 201019120 之實施例。實施例圖中表示記憶體200與一主機271之連 接270。主機271 —次作出一讀取及一寫入之運作。串列 槔記憶體技術(Serial Port Memory Technology,SPMT)將分 組埠一起定義,以形成寬廣之資料溝通。而在一連結群組 中可以動態地選擇埠206至212之數目。舉例而言,可利 用單一埠或任何數量之埠(譬如二的次方之數量)來結合以 提供埠206至212。當使用較少埠時,將使用較少之接腳 及較少之電力。當使用較多埠時,將增加頻寬並使得獲得 _同樣資料之潛時減少。其中連結埠之數量可隨時改變。 圖二C顯示用於單一主機介面之埠連結選擇275之實 施例。其中276表示一埠(可提供十六位元存取及單一指 令)、277表示二埠(可提供三十二位元資料存取及三十二位 元指令或指令重複),及278表示四或多埠(可提供六十四 位元資料存取及同時之三十二位元指令及重複)。當連結二 或以上之埠277及278時,資料將傳遞於群組中所有琿上, ❹並有效地乘以資料頻寬。然而,個別指令可能僅需要一埠, 留下其餘之埠未使用。因此為了避免浪費頻寬、改善記憶 體運作及儲存指令頻寬,因此提供一組密切相關之指令或 附屬指令(Adjunct Command)。這些指令可正確使用額外頻 寬,使其不致浪費,且如此密切相關/附屬指令可發布於一 指令之前或之後,並可與其餘埠同時發布。舉例而言,當 發布一現行指令(Active Command,ACT)時,可同時發布 現行庫指令(Active Bank Command,ABNK)作為附屬,以 完成指令。相似地,附屬寫入遮罩指令(Write Mask 201019120201019120 VI. Description of the invention: [Technical field to which the invention pertains] - The embodiment of the present invention is generally related to the field of computer memory, and more particularly, the improvement of 5 κ for serial communication memory latency and reliability. [Prior Art] In a memory system using the same speed serial interface, a host (such as a system integrated chip, a computer, a graphics controller, or the like) or a plurality of devices and systems are used to transmit commands and data. It is expected to be able to extract, take, and combine error detection to ensure normal system operation. Serial Link has an inherent Latency because ^ is only transmitted as a bit. In addition, the 'serialization and deserialization steps will take the time outside the production. The use of these flaws alone does not significantly improve the latency, so a special access method is used (for example, accessing different, dedicated child memory regions from each node) such as serial access (StHpedA(10) width by starting 埠The link (PortBindin uses a consistent number of bee good) frequency: the memory potential can be reduced by simultaneously transmitting data containing many bits, "increasing the bandwidth without a formatted access method. There is also a need for quantitative data security. For example, in a column of channels, 'unless the system causes unacceptable latency, it may be undetectable. In the case of a link, some It is still idle during the command period. In those same periods, the = wide line is not filled with repeated instructions. This method can use the two sex = stretch to the next one to provide the characteristics of the 埠 structure. A conventional tandem bit configuration 100, according to EIA Standard 3 201019120 RS-232-C. In the figure, the serial transmission of data is similar to an RS-232 connection, in which it is observed and designated by rotation one by one. It is not in the value 124 In the same sense, the individual two-bit values (bits) ι to us can be combined into an overall value of 124. For example, if the first bit is 1〇4 and is specified as the largest value of 124 The valid bit (M〇st Significant BU, MSB) ' is followed by the second bit 1〇6 and others, etc. (such as bits 〇8, 110, 112, 114, and 116, etc.) until the least significant bit is (4) The Significant Bit (LSB) is filled with the last communication bit ι0. The combined value in this example is called Frame 128, which contains the value 124 and the stop and start bits 102 and 120. In addition, the box The 128 series is described using additional bits, called Framing Bits 126, which includes a start bit 120 and a stop bit 122, making the receiver available to find the beginning of block 128 and can be expected to be broken. When the middle frame arrives, in other architectures, even when the data rate between the transmitter and the receiver is slightly different or changed, the framed bit 126 can be utilized to facilitate the receiver to reliably find individual bits. The communication of serial links to memory will cause significant latency and provide more than one host A single memory will cause noisy memory resources. In addition, the memory may have - or multiple channels, each containing a transmitter array receiver and associated circuitry to improve latency and bandwidth. (BQUnd pQn Si Some itch is still idle during the instruction. In the same period φ, go to μ m 'sexual heavy borrowing Μ eight used bandwidth will fill the space = wide t This method can be used to repeat the time from different periods In the early days - 蟑, to provide the characteristics of the structure. In the case of the car, the 'data is transmitted in multiple orders, but the instructions must be able to operate independently. 4 201019120 =ΓFirst, if not used, may contain repeated instructions ( Command Duplicates). Second, Ganbi κ is because of the mismatch of error management; so that f can be distributed at the same time. In addition, serialization, deserialization, and data-framed are additional steps of t!, so serial communication increases latency and parallel communication. [Invented within 1 world] It is necessary to guide people and implement techniques that can reduce the potential of memory. Φ ^ 发 = expose - a method, device and system for improving the latency and reliability of memory communication. In the case of the ancient times, the invention provides a method for reducing the memory potential of the host computer system and a memory between the host computer system and a memory device. Communication data, wherein the == machine computer system is engaged in the memory, and the bee is in the group τ埴赤邮 and between the host computer system and the memory via the group A single time interval communicates the Φ related to the information. The present invention provides a method for reducing the memory potential of a host computer system 'lightweight to a memory, the card = main = bit = one of the memories or a group of bees The time interval is from two m secret data, which is expected to be one of the contents; and, the memory system is modified to receive the relevant data from the host computer at a single time interval in the computer system. Know the order in the information. In an embodiment, the present invention provides a system for reducing memory latency 5 201019120, comprising a host computer system coupled to a memory, the memory using a connection system to reduce memory latency. The linking system includes a plurality of ports for communicating data and instructions, wherein the plurality of two or more nodes can be linked into one or more sets of files, the connecting system is configured to: One or more of the memory communicates data between the host computer system and the memory over a plurality of time intervals, the lanthanum being one of the group ;; and via the 埠 or the The group communicates with the host computer system and the memory at a single time interval with respect to one of the instructions. [Embodiment] Embodiments of the present invention generally relate to improvements in serial communication memory latency and reliability, however, the same can be applied to other forms of interfaces, such as High-Speed Parallel. The term "Mem〇ry" as used herein means one of the components in a computer system (see Figures 2D and IIE), which is responsible for the operation of the previously stored material. Any "Host" (such as a computer processor) or peripherals (such as a keyboard, monitor, video camera, mass storage device (disk, light, tape, etc.), network controller, or wireless network) . Typically, a, the memory system handles - or multiple microprocessors that process data in a computer system. The data can be stored in the memory by a host, such as random access memory (RAM), static random access memory (sram), dynamic access memory (DRAM), flash memory. Flash (Flash), Memory (PROM), Can be erased -, ^ 口 口 j徕 In addition to stylized read-only memory (EPR〇M), electrically erasable programmable read-only memory ( EEPR0M) or, for example, a read-only memory (ROM) of 6 201019120. The host can access the memory directly via a bus (e.g., pci) or via an intermediate 3 memory controller. The serial access to the memory involves the conversion of a single sequence of electronic signals through a single circuit and the conversion of meaningful instructions and responsibilities between the memory (similar to the RS 232 shown in the figure). The circuit that performs the conversion is called "埠(p〇rt)". In an embodiment, in order to reduce the potential of the memory, a Masking Scheme is used to enable the writing of instructions and data to be described in the same communication frame including the masking information. To reduce the number of bits and latency in the box. In addition, in order to reduce the latency, a memory-based protocol is provided to reduce the shorter frame size, and to provide greater flexibility with respect to the old dynamic random access memory. And reduce the change of the instruction set when increasing the bandwidth. Figure 1 shows an example of a single host-connected memory. In the embodiment, the link memory 200 includes a memory core 2 (such as DRAM or flash memory), which includes a multi-bank (譬© library) and is connected to the link of the link memory 200.埠Memory System 2〇4 5 The memory of the core 202 is communicated in many ways, such as four 206 to 212. All four 埠2〇6 to 212 operate together to provide an interface to a single-host with #invertible view. These memories are used independently, such as reading from one memory and writing from another memory at the same time. The memory core 202 can also include a memory read bus to read data, and a memory f write bus to write data. However, in the memory core, there is a single connection to read and write data. In addition, the link memory system 204 includes a link multiplexer and a linker multiplexer 7 201019120 262 ′ will be described in FIG. 2 κ and II i, respectively. In the known technology, the transmission of all instructions and data bits is transmitted immediately, and each flat line arrives at the same time as the code is formed. $, while the degree becomes faster, the data flowing through these lines may be sampled incorrectly or at the wrong time (for example, using the relevant clock signal). In order to solve the problem of high-speed sampling, a self-sampling serial signal is used (譬, RS, 232). However, parallel methods will increase latency because their data are processed on time. In the embodiment, the communication between the serial port and the memory © latency and reliability utilizes multiple serial interface or iterative grouping and repeating instructions (Duplicating CoInmands) 'which may be consecutively or temporally , or in the same time or spatially. Embodiments of this technique are described herein and are implemented using a single host-connected memory. In one embodiment, a transmitter (Tx) can convert sixteen bit parallel data into a serial bit stream and transmit the single bit stream. A receiver (Receiver, RX) can receive a sixteen-bit single stream and convert it into a parallel stream. In this example, a region of memory can be three; ^ two bits wide and at about the same rate. In the memory 2〇〇 in the figure, four 埠 206 to 212 are used, so each channel between four 埠 2 〇 6 and 212 has a one hundred and twenty octet data motion, with a sixty four The data stream of the meta (for example, sixteen bits multiplied by four turns equals sixty-four bits of each channel). This one hundred and twenty-eight-bit motion system is supported by the necessary circuitry in the chip. Different from the prior art, each of the embodiments 〇2〇6 to 212 uses a serializer/deserializer to serialize the data stream at a faster rate. 201019120 and deserialization. For example, a Phase Locked Loop (PLL) (譬 244) can be utilized to multiply an input clock by a higher rate to match the input data rate used to sample the bits. Although the time at which the data stream reaches 埠206 to 212 may be slightly different, it allows these data streams to flow faster. In other words, each stream flows at the same speed and at a much faster rate (ie, each data stream can flow at the same rate as other data streams; however, each data stream can be set to flow earlier than it. The rate is faster). In addition, the timing of the elements may not be perfect, but ® does not need to be aligned to these bits because the actual arrival time of these bits does not affect. Therefore, instead of synchronizing the bits at the pin, the bits are synchronized after being deserialized in the respective blocks 206 to 212. For example, as shown by asterisks 230 through 236. Similarly, after deserializing these turns 206 to 212, the flow rates of the data bits at or near these asterisks 230 through 236 are compared to the high speed external memory interfaces 214 through 228 (representing TxO, Txl, Tx2, Tx3, respectively). RxO, Rxl, Rx2, and Rx3) (譬& such as two hundred and fifty picoseconds) may be twenty times slower (e.g., five nanoseconds). In one embodiment, the ports 206 through 212 are capable of phase detection, data bit management, data bit sampling, and lane alignment. Instruction interpreter 248 continues to process instructions based on the associated/closely related instructions of the channel configuration (further described in Figures 2C, 7C, and 7D). 240 denotes PLE signal, 242 denotes /LPD signal, 246 denotes REFCLK signal, 250 denotes mode register, 250 denotes command signal, 252 denotes triangle, 258 denotes library signal and 260 denotes mask signal. FIG. 2B shows an embodiment of a single host connection 270 9 201019120 of the four-way memory 200. The embodiment shows a connection 270 between the memory 200 and a host 271. The host 271 makes a read and a write operation. Serial Port Memory Technology (SPMT) defines groupings together to form a broad data communication. The number of 埠206 to 212 can be dynamically selected in a link group. For example, a single 埠 or any number of 埠 (such as the number of powers of two) may be combined to provide 埠 206 to 212. When fewer turns are used, fewer pins and less power will be used. When more 埠 is used, the bandwidth is increased and the latency for obtaining the same data is reduced. The number of links can be changed at any time. Figure 2C shows an embodiment of a 埠 link selection 275 for a single host interface. 276 denotes one (available for sixteen-bit access and a single instruction), 277 for two (a 32-bit data access and 32-bit instruction or instruction repetition), and 278 for four Or more (available for 64-bit data access and 32-bit instructions and repetitions at the same time). When two or more links 277 and 278 are linked, the data is passed to all the cells in the group, and is effectively multiplied by the data bandwidth. However, individual instructions may only need one 埠, leaving the rest unused. Therefore, in order to avoid wasting bandwidth, improving memory operation and storing instruction bandwidth, a set of closely related instructions or accompanying instructions (Adjunct Command) is provided. These instructions correctly use the extra bandwidth so that it is not wasted, and such closely related/affiliated instructions can be posted before or after an instruction and can be issued simultaneously with the rest. For example, when an Active Command (ACT) is issued, the current library command (Active Bank Command, ABNK) can be issued as an attachment to complete the instruction. Similarly, the auxiliary write mask instruction (Write Mask 201019120)

Command,WMSK)可同時伴隨一寫入指令(Write Command,WR)。在埠0處可接受戶斤有指令,但相關之附 屬指令可由其他埠接收,以保留指令頻寬。指令ACT、 ABNK、WR及WMSK係更進一步描述於圖七B至七D中, 以作為參考。 此外,一種單一選項可啟動指令之重複,以改善錯誤 偵測,以從不正常之記憶體運作狀態中避免錯誤之指令。 當啟動此選項時 單一痒將對於在第一框中之一指令與 ⑩後續框之重複(Duplicate)進行比較。當連結及使用二或以 上之埠277及278時,對於重複而言並不需使用額外之頻 寬,因為重複係於相同時間顯示於另一埠。雖然指令被重 複,但在此實施例中資料並未重複。當具有至少四連結埠 278時,將可能同時使用重複及附屬指令。 參 圖二D顯示一種智慧型行動電話架構280之實施例, 其包含一基頻處理器282及一應用處理器281,及個別的 揮發性記憶體(譬如DRAM 274、SRAM/DRAM 283)、非揮 發性記憶體(譬如NAND 272及NOR快閃273)及溝通通道 (分享記憶體)269位於兩處理器281及282之間。記憶體 272至274及283係用以儲存及接收可執行程式碼及相關 於連接之處理器未分享之維持隱密性資料。任何的分享或 溝通係藉由溝通通道269加以執行。應用處理器281可耦 合於其他週邊裝置,譬如攝像機201及顯示裝置203。 圖二E顯示對於圖二D之選擇實施例之一種智慧型行 動電話架構284,其包含SPDRAM 285。在一實施例中, 11 201019120 忑隱體被刀旱於基頻處理器282及應用處理器28 i之間。 在實施例中,SPDRAM 285可用以溝通基頻處理器282 及應用處理器281、用於兩處理器之儲存程式碼及資料, 及減少需用以實施此架構之記憶元件或技術數量。此外, 可減/屺憶體及處理器之間的連結數量,包含排除特定之 溝通通道。在一實施例中提供了區隔,使得一些主機可存 取部分記憶體而另-些則不可。這使得記憶體裝置可分享 於安全環境,譬如對於基頻軟體。舉例而言,應用處理器 281可將基頻軟體影像載入spDRAM 中並指示基頻 處理器282其影像已就位。接著基頻處理器282將移除對 其他主機之存取,並確認此影像之有效性。若其正確基 頻處理器282可自影像繼續運作,而不需與運行於應用^ 理器281之軟體有所分裂。 圖二F顯示多主機連結構形286、287及288之實施 例。在一實施例中,可利用多主機功能結合多谭連結 例而言,若一主機(譬如一應用處理器281)需要更多頻寬, 其可利用其介面之許多連結蟑,而其他主機可繼續使用單 -埠。在實施例圖中提供在—種四蟑裝置上連結許多主機 之一些組合286(二獨立主機、雙埠/各主機)、287(二獨立 主機、主機厂單一痒、主機2:雙璋)及288(四獨立 每埠一主機)。舉例而言,在組合286争,主機i及2之介 :係分別對應於兩蟑。在組合287中’主機!之介面係;; 應於埠〇,而主機2之介面對應於兩埠2及3。在組合Mg 中’各主機之介面對應於一單一璋。可理解實施例中可提 12 201019120 供主機埠連結或介面之任意組合,譬如一單一主機可一起 連結全部四個埠。何埠實際上對應於何主機則可視暫存器 之β又疋而定,其給予連結痒群組之長度。 圖二G顯示多主機連結埠記憶體292之實施例。圖中 多主機埠連結記憶體(多主機記憶體系統)292包含四埠 mo,並溝通於包含八記憶庫289之記憶體核心291。為了 簡潔之目的,僅顯示有限數量之埠29〇及記憶庫289。雖 然多主機記憶體系統292相似於圖二Α之單一主機記憶體 系統204,此中來自各埠29〇之資料係個別可得於各庫 289/在此實施例中,記憶庫289係定義為整體多主機記憶 體系統292之部分,其可獨立地追縱資料傳送。此外,藉 由提供個別之存取,在指令期間埠29〇之單一埠可相關於 記憶庫289之-單-記憶庫,且不與存取其他記憶庫之其 他埠相衝突。連結之多工器293及解多工器294係相乘, 以產生一種交叉開關(Crossbar Switch)之可能實施例,以導 ❹引記憶庫289及埠290之多埠群組間之資料。 圖二Η顯示用於可達十六槔之一種淳連結控制暫存器 挪之實施例,及用於可達十六埠之一種重複指令確認暫 存器296。為了簡潔之目的,實施例採用之連結發生於多 個二位元倍數之連續埠(譬如埠丨、2或4),且隨著匹配標 準(Matching ModUlus)(譬如埠〇對應於四埠、埠〇或2對 應於兩槔,或任何埠對應於單—埠)。埠可根據暫存器之設 定而確認其本身在-連結群組之身分。圖中顯示了 一種結 合十六淳之蜂控制暫存器295。此連結描述及提供了一'種 13 201019120 P白層模式,111此若未設定位元,貞彳所有琿將獨立、s七 於兩埠之裝置而言,僅使用了位开n阜將獨立運作。對 士 ^一 使用了位兀0,而對於四埠裝置而 :埠連^至4描述了可能之連結’藉由增加對於四埠之 、阜連'、·.剩餘者,並連結所有蟑。對於人埠裝 士 =之剩餘者位於位元…、四璋連結位於位5元13 且所有埠位於位το 15°此模式可無限制繼續下去。 獨自=’蟑•可以不屬於一連結群組,在此情況中它們可 ❹丄 以不是一連結群組的-部分並可個別運 罾:^們可為不只-連結群組之—部分一種對於此衝 =之處理技術在於選擇敎之最大連結群組。當利用暫存 器295增加一埠於一連結群組時,下個指令係接著使用於 連結群組之内容中,且在一新埠準備好之前並不需要發布 新的才曰7。當從連結群組移除一璋時,其可能被停止或隨 後立即獨自使用。 ’實施例中將每埠一位元分配至一暫存器,以啟 ❹動重複指令(Duplicate Command)確認,如圖中所示之重複 才曰·?確 < 暫存器296。若一埠被連結於任何群組,其將確 〜其連、·、λ埠之私令值。若其並非連結於一群組則將於其 連續週期_發現重複。 圖二I顯示一種連結之解多工器294之實施例。在一 實施例中,埠準備通道(p〇rt Ready Lanes)294a(譬如 p〇n_rdy lanes)係由個別埠29〇所產生,並被授予其連結指 令。舉例而言,當四埠29〇被連結時,將發布所有埠準備 通道294a。然而,若僅埠29〇之埠2及3被連結於兩埠群 201019120 組中,則會發布蟑準備通道294&之_-吻[3 : 2]。相似 二右早獨運作埠! ’則僅發布p〇rt—_⑴。此技術被用 於確認從琿290至前往記憶庫289之正料準備通道純 之傳送大小及路由,以及建立整體記憶體字元 鎖(Latches)。 圖二J顯示表2973及2975之實施例其顯示連結解 器之路由。舉例而言’當資料到達時,解多工器根據 路由功能表297a,x料資料路由至正確之通道。此 ®器暫存器接著根據enable_fn(圖二工中之298)之功能,以 將用於記憶體之資料捕捉至正確的埠通道中,如表Μ% 所示。一旦拾鎖了所有資料,則藉由enaMe一& 利用 wr一strobe而命令其核心儲存資料。 一利用一種平行資料路徑,可達成寫入遮蔽功能或禁止 選疋資料之儲存。在儲存週期之開端,係根據如 設定所有遮蔽(譬如,禁止所㈣道)。達時,相 ❹關之遮罩係隨著資料而加以路由及儲存。若非全部資料到 達(譬如中斷或短傳送(ShGrt Trans㈣),則僅抵達之資料會 被加以儲存’因為資料通道未抵達所以未有機會清除相關 遮罩。 圖二K顯示一種連結之多工器⑼之實施例。在一實 施例中,蟑準備(譬如卿身)29如及讀取指令(譬如 read_cmd)訊號係由記憶體之讀取料(ι^心卿,叫 加以延遲2",使得從記憶體(及拾鎖)到達之資料可即時選 擇輸出埠290。如此埠之選擇係簡單地利用延遲輸出值所 15 201019120 達成。來自-讀取指令之P(m_rdy通道 2们加以解讀,相似於解 係:由多工器 通道映射至輸出埠29。,其係根據圖二之^ 功能。 <衣所不之 為了簡潔之目的,採取可能從記憶體 位元)資料字元於每-週期,且具有储存或延遲之:率 而古需要更多㈣戈 阜之情況。假使相較於輸出週期 而二要更多週期來取得資料’則隨著—「預取得緩衝區 w (Pre-Fetch Buffer)」而建古一猫分、、 種杉心,其從記憶體載有較 二:’:跨越連續週期而選擇較短之片段。況 二二T可連結於預取得緩衝區。為了使資料節流 (仏ottle),指令解讀器可將讀取指令分割為較短之量,並 以較低速率發布中間指令,以匹配於輸出速率。 圖三顯示-觀於框同步化之步驟實施例。起初,纪 憶體埠係關閉電力以系統重置302。為了開啟璋電力,連 ❹結電力關閉(Link P〇wer_D〇wn,/LPD)係被驅動為高位 304,使/LPD等於零且使埠停止。然而,當/LpD等於一時, 則開始框搜尋306,以尋找特定碼或稱為SYNc之位元順 序。當偵測到SYNC時,步驟前進至一種運作模式3〇8。 此步驟可繼續於多埠(若實施),如圖四所示。 因為主機及記憶體以串列式交換資料,接收器係被同 步化以確認在一框中之位元位置對應關係。為了確認正確 之同步化,其連結在「框搜尋(FrameSearch)」情況3〇6中 搜尋一特定之位元序列。舉例而言,起初串列連結傳送兩 201019120 同步化位元序列之一者:SYNC及SYNC2。藉由主機及記 憶體之使用,發送器之實體層(Rx-PHY)可偵測這些成框資 料封包。SYNC在一重置或錯誤後之連結提起(Link Bring-up)中扮演一種關鍵角色。此外,在正常運作中,在 任何未使用之框中,記憶體Tx-PHY將傳送SYNC。在正 常運作中,在未使用之框中,主機Tx-PHY將傳送SYNC 或SYNC2。當偵測到SYNC並且從記憶體加以辨識時,其 步驟前進至正常運作模式308中。若成框失敗,譬如指示 _ 一種二十位元解碼錯誤,則舉例而言記憶體將轉回「框搜 尋」情況306,直到再次偵測到SYNC。在任何狀態中, 若/LPD變為零,則表示埠將轉回至「連結關閉(Link Down)」狀態304,並且重新開始。 記憶體傳送SYNC2以指示接收主機資料之錯誤,或 因為留下「連結關閉」狀態或一種成框錯誤。主機獨佔地 藉由發送SYNC做出回應,直到記憶體重新建立成框並開 _始傳送S YNC。主機在指令間傳送S YNC2,以用於適當的 錯誤回復運作。SYNC及SYNC2建立及回復連接成框,且 主機安排協調連結之建立。 圖四顯示一種用於電力控制之步驟實施例。其接收 /LPD。在決策區塊402,其確認一埠是否開啟。在/LPD前 之斜線係表示反邏輯,譬如當/LPD等於零時連結是電力關 閉的,表示其電力並非開啟,則步驟將前進至區塊412, 以停止主埠。相似地,/LPD等於一表示並非電力關閉,表 示其為電力開啟。若/LPD等於一(譬如連結之電力開啟)則 17 201019120 執行一訓練步驟,在處理區塊404,尋找一框以用於特定 碼或位元序列(譬如SYNC)。用於SYNC搜尋之訓練步驟 將持續到偵測到SYNC為止,且接著在處理區塊4〇6,其 步驟將進入一種運作模式。此步驟將對於圖三作進一步描 述。在決策區塊408,將確認是否有埠錯誤。若是,步= 將繼續至決策區塊402。若否,則在決策區塊41〇將確認 是否更多埠接連進入運作模式中。若未增加其他埠,步驟 將繼續於處理區塊406之運作模式甲。然而,若偵測到額 外之埠,步驟將繼續於處理區塊414以訓練新埠。 XI些額外(多)埠係處理於處理區塊416。多埠之使用亦 描述於圖九以作為參考。在決策區塊418將確認埠錯誤。 若確認有蟑錯誤,譬如在決策區塊430藉由痒被關閉電力 (譬如/LPD=G)所造成之埠錯誤。若是,在處理區塊432將 停止所有埠,且步驟將前進於決策區塊434之單一埠模 式。若在決策區塊434中/LPD並非零(譬如/LpD=i),則在 籲處理區塊436將訓練所有埠,且步驟將繼續至處理區塊 416。請往回參考決策區塊43〇,若/LpD並非零(譬如 /LPD=1)’則步驟將前進至在處理區塊428之訓練錯誤埠, 且更繼續於處理區塊416。 ^睛返回參考決策區塊418,若未發現埠錯誤,則在決 策區塊420將另外決定是否有更多璋加入。若是,步驟將 繼續至處理區塊414之新埠訓練步驟(譬如尋找各新璋之 =YNC)。右沒有開啟額外之埠,則在決策區塊々Μ將確認 疋否移除任何埠。若$,步驟將繼續至處理區塊416。若 18 201019120 是,在處理區塊424將停止任何被移除之埠。在此中在 決策區塊426將確認是否可得單一埠,以轉回至單—璋模 式。若是,步驟將繼續至處理區塊4〇6之單一埠模式。若 否,步驟將繼續至處理區塊416之多埠模式。 右 電力控制238(參考圖二A)係負責傳遞/LpD,而埠 至212係負責訓練其本身及圖三及圖四中之處理。 圖五顯示利用單一埠之重複確認及指令解讀之步驟 施例。利用多埠之更複雜步驟將顯示於圖九。如圖五中之 ⑩進一步敘述,在處理區塊5〇2,係於一埠(譬如第一埠或主 埠)執行資料之接收、讀取及解碼,以開始第一框。在^ 區塊504將確認是否偵測到埠錯誤。若偵測到錯誤,在區 塊528將以-轉回錯誤以結束步驟。若未債測到璋錯誤了 在決策區塊506將確認是否開啟重複。應理解其重^可 必要性及需求加以開啟或關閉。若開啟重複,步驟將前進 至步驟區塊508之資料接收、讀取及解碼,於當下隨著第 鬱二框之埠加以執行。再者,在決策區塊5ι〇將確認是 測到埠錯誤。若是,步驟將結束於區塊似之錯誤轉回。 若否,步驟將繼續至區塊512,以確認第一框是否等於第 =塊2,步驟將繼續至區塊Μ,若否,步驟將繼續 至&塊528以錯誤轉回而結束。 塊5^未7到蜂錯誤(且未開啟重複,請返回參考決策區 ),:驟將繼續確認此框是否為指令或資料。若此框 ^曰々’在決策區塊516將確認此指令是否有效。奸八 *、、、無效,步驟將於區塊528以錯誤轉回而結束。若指令= 19 201019120 =於518將確認此指令是否位於序列中或是 列中,步驟將於區 传㈣而結束。若指令係位於序列中,在處 區塊520將處理指令,且在區塊530將發布一正常轉回。 凊返回參考決策區塊514,若框為㈣ 參 塊522將確認記憶體是否準備好進行寫入運作。若否^ ㈣於區塊528以錯誤轉回結束。若是,在處理區塊524 負料將被寫入記憶體’且步驟將於區塊53〇結束於正 =。在此實施例中’區塊516、518及52〇之步驟係執行於 圖六顯示—種於蟑中執行各種功能之步驟實施例。在 區塊602提供接收、讀取及解碼#料串流之步驟。舉例而 言:於-埠經由RX接收單一資料串流(以位元),並於圖中 斤示形成平行串"IL並接著解碼(譬如使用1 解碼)。 鬱利用連結電力關閉(/LPD)訊號以控制所有埠之電力(經由 電力控制機制),使電力進入及外出於單一主機連結蜂記憶 體之所有崞’如圖二A所示(譬如圖二A中之虛線表示/LPD 之電力控制)。在決策區塊6〇4將確認/LpD是否等於零。 若為零,步驟將於區塊614以轉回錯誤結束。然而,若/LpD 並非等於零,步驟將繼續於處理區塊606。 在處理區塊606將讀取資料框,其包含埠以接收一種 按位元(Bitwise)之資料串流,並產生框之平行串流(譬如二 十位元解串化)。在處理區塊6〇8將對框解碼(譬如使用 20 201019120 17B/20B解碼技術)以接著產生有效資料。在決策區塊61〇 將確認框之有效性。舉例而言,將確認框是否具有一種二 十位元碼,且被正確地解碼為一種十七位元之值。若轉換 失敗,因為不明確使其並未產生任何結果,其有效性將失 敗,且在區塊614轉回錯誤。然而’若轉換成功且產生結 果,則資料框將被視為有效,且在區塊612將為正常轉回, 此將於圖九中進一步說明。 圖七A顯示一種十七位元後解碼(p〇st_Dec〇ded)框格 鬱式700之實施例。實施例圖中之一種十七位元解碼框7〇〇 可用以傳送十七位元資料、指令及/或狀態,且經過轉換編 碼以產生用於串列傳輸之二十位元框。資料、指令及狀態 係以二十位元框加以傳送及接收。當接收時將執行相反之 步驟’其中一十位元轉換編碼框係經過解碼以產生一種十 七位元框700 ’以保存資料、指令及狀態。 圖中所示之十七位元後解碼框(格式)700將前十六位 眷元貝獻於酬載(Payload)702,且將最後一位元(即第十七位 元)貢獻於酬載指標704。記憶體存取格式建立於基本解碼 格式。舉例而言,位元十六704可表示其酬載是設定為一 或零,以用於資料、指令或狀態。在框對框之基礎上,指 令及寫入資料可分享其接收者連結。為了降低潛時,指令 可被插入(或優先)於一寫入資料串流,以延遲寫入指令之 完成。 圖七B顯示一種指令、狀態或資料編碼框格式72〇之 實施例。圖中實施例包含但不限於一種串列淳 21 201019120 DRAM(SPDRAM)之指令、狀態及資料編碼框720之實施 例。圖中十七位元編碼框720具有可延伸性,其可提供彈 性以保留多位元,可於未來增加額外之指令(舉例而言,視 技術或需求改變)。舉例而言,旗標722及次指令724佔據 框720之前七位元(位元零至七),且因為在次指令724中 大部分項目皆為一,因此此區域(包含旗標區域722)可於未 來增加額外指令(譬如,可達十六指令),以擴充框720。 相似地,有其他區域具有有限之範圍(譬如模式暫存器 ❹群組726),其亦可用於額外指令(譬如模式暫存器群組726 之次指令區域,其僅包含三指令)。另一個如此區域為 DRAM指令群組728(譬如DRAM指令群組728之次指令 區域,其全部為一),可用以增加其他指令。 SYNC 730控制及維持連結框同步化,而SYNC2 732 表示一種特定連結運作狀態。SYNC 730及732皆為關於 圖三及圖四之進一步說明。資料(資料框)734包含一種十七 ⑩位元框,相似於圖七A中所示之資料框700,其包含第十 七位元設定為一及後續兩個八位元之位元組。現行庫指令 (ABNK)736及現行指令(ACT)738將於圖七C中說明。寫 入指令(WR)740起始一種對於特定庫及行(Column)之記憶 體寫入週期。寫入遮罩(WMSK)742設定一種八位元之位元 組遮罩,用以於步驟中寫入指令,且隨著WR指令740以 具有任何效果。WMSK 742將於圖七D進一步說明,以作 為參考。 讀取(Read,RD)744表示一種讀取指令,以起始一種 22 201019120 記憶體讀取週期’而叢發停止(Burst St〇P,BSTP)746表示 一種指令,以中斷一埠當下之讀取或寫入指令’且係根據 所特定庫。預充(Precharge ’ PCG)748表示一種指令’以預 充指令中特定記憶庫,而預充全部(Precharge All,PCA)750 包含一指令,以同時預充所有記憶庫。逐記憶庫更新 (Per-Bank Refresh,REFB)752使特定記憶庫可自動更新’ 而所有記憶庫更新(All-Bank Refresh,REFA)754根據一内 部計數器使所有記憶庫更新。在發布REFA指令前,所有 ®記憶庫皆位於預充狀態中。 模式暫存器寫入(Mode Register Write,MRW)758表示 一種指令,以執行寫入至一模式暫存器。模式暫存器寫入 資料(Mode Register Write Data,MRD)760 隨著 MRW 指令 758之後提供寫入資料,在下個中間框、從埠0,且是以 MRD指令760之形式。模式暫存器讀取(Mode Register Read,MRR)756表示一種指令,以從一模式暫存器執行讀 ❿取。自更新電力關閉(Self-Refresh Power-Down,SPRD)762 將使記憶體核心立刻進入自我再新狀態。電力關閉離開 (Power-Down Exit,PDX)764表示一種指令,其發布以離 開自更新電力關閉,且用以在連接建立後喚醒記憶體核心。 圖七C表示一種ABNK(埠0(ΑΒΝΚ))及ACT指令736 及738之實施例。為了同時傳送二或以上之指令,它們將 可彼此支援對方功能或是功能性為正交。第三種標準包含 複雜性’因為語意記憶(Memory Semantics)或實施決策可 能因為正交性而造成失敗。舉例而言,串列埠DRAM可包 23 201019120 含一指令以啟動(Activate) —記憶庫,且其將要啟動之列位 址對於框而言太長。在單一埠情況中,此指令將需要二或 以上之框,但對於連結埠而言,其可於二或以上埠之上溝 通於一框時間中。 舉例而言,ABNK 736設定目標記憶庫753及列位址 755之較高五位元用於後續之ACT指令738。ACT指令738 被傳送至最後ABNK指令736中所指定之記憶庫753。若 連結二或以上之埠,則可能於埠二出現一種選擇性ABNK _ 736指令。列位址之較低十五位元765係指定於ACT指令 738之最不重要十五位元,而其最重要之五位元係指定於 最後ABNK指令736之較低五位元,或出現於埠2之ABNK 770。此實施例指出各指令736及738可於後續框中任何時 間獨立運作。這提供了可變之埠群組尺寸、獨立於埠群組 尺寸一般控制器,及符合跨埠連結之語意。此外,指令738 及770彼此互補,並可同時執行。此外,757表示一種較 ❿高列位址。 圖七D顯示一種WMSK及WR指令742及740之實 施例。圖七D顯示一種WR指令742及相關之位元組/寫 入遮罩742,以選擇性寫入。WMSK 742表示一種指令, 其設定為在步驟中用於WR指令740之一種八位元組遮罩 772,且跟在WR指令740之後以具有任何效果。在溝通 八位元組資料後,遮罩772重新開始接下來八位元組。在 遮罩772中之字母「H」表示字元轉換之高位元組(譬如位 元15至8),而「L」表示低位元組(譬如位元7至0)。 24 201019120 皿l 740起始—種記憶體寫人週期,至指定記憶 庫774及行776。—旦傳送職指令74〇,則 入^ 資料。若連結二或以上之蟑,則一種選擇性wms^令 780 =傳送至璋2 ’涵蓋或遮蔽前八位元祖(778)。遮罩爪 重複母八位70 ’除非其被接續之WMSK指令所重設。二或 多槔連結之其他實施例包含同時讀取及寫入之結合或同 時啟動及寫人,根據其記憶體及介面之語意學。 圖八A、八B及八C顯示-種寫入遮罩模型_、85〇 及875之實施例。對於使用串列溝通之記憶體而言,減少 =可整除(Indivisible)之傳送之位元數量係用以降低潛 日、。不可整除之傳送係定義為一種以位元之框或字元長 ^ ’其描述-種整體資料量(譬如—位元組)或-種可執行 才曰令包含任何需要完成此指令之中間運算元資料,譬如「寫 入」及目標地址。 ” 入對於大部分記憶體而言,寫入運算子同時包含败指 鑤令、地址、運算子(此例中為遮罩)及寫入資料。對於較快 之記憶體裝置而言,需要描述指令之速度變得過高,因此 ,利用-種叢發傳送(Bum Transfer)。叢發傳送係起始於 指令及起始資料,但繼續於資料串流及後續被計算之地址 (譬如增加的)。無論資料何時傳送,其皆伴隨著額外之 入遮罩指示訊號。 —因為指令及地址可能非連續資料傳送所必須,因此隨 者串列溝通’指令、地址、寫入遮罩及資料之編碼將可能 效率不足。對此情況,資料將隨著界&指令及地址利用 25 201019120 叢發傳送以交付資料。為了降低潛時,寫入遮罩或WMSK 指令(譬如每位元組一位元)僅需伴隨當寫入叢發中位置值 未儲存之資料。雖然如此之最佳化對於串列介面效率可為 關鍵的,此機制係可用以減少在平行記憶體介面中之所需 頻覓。由於串列介面改善了多主機記憶體之實用性,藉由 放置寫入遮罩至指令串流中,各主機將可使用獨立寫入遮 罩以獨立傳送。為了減少包含WMSK伴隨資料以減少潛時 之相關性’在叢發中將採用及說明一種之三用模型。 ❿ 圖八A顯示一種重複模式WMSK模型800之實施例。 其包含之前的記憶體内容(Mem〇ry c〇ntents Bef〇re)8〇2、 指令串流804及之後的記憶體内容(Mem〇ry c〇ntents After)806。在圖中,WMSK重複於各傳送,譬如在包含紅、 綠及藍資料之矩形中僅改變紅值,其他兩顏色將被遮蔽, 且此WMSK將於此矩形中重複跨越所有三原色資料。圖八 B顯示一種起始及末端WMSK模型85〇之實施例,包含之 ❿刖的s己憶體内容852、指令串流854及之後的記憶體内容 856。此中,寫入遮罩僅用於傳送之初始部分。舉例而言, 網路封包可起始於一奇(〇dd)傳送邊界(傳送之四位元組之 第一位兀組)以最佳化存取(對準)於封包中其餘資料結 構。一旦耗盡了起始遮罩,整個剩餘之封包資料將被寫入 s己憶體。為了完成傳送,因此將插入一新的WMSK以修剪 最後兩位元組。圖八C顯示一種利用多串列介面以用於重 複模式之WMSK模型875之實施例,其中包含之前的記憶 體内容876、指令串流878及之後的記憶體内容88〇。此中, 26 201019120 WMSK係用於一 I 一播 構。舉例而言,傳送尺寸中選擇—資料結 、一十一位几之整數中僅寫入第二位元組。 參 用。二:模_^卿& 85。,寫人遮罩係重複使用或不常使 >而°纟種形式之傳送(譬如快取寫人及大量儲存 要遮罩°在這些情況中,於資料中包含寫入遮 f疋:有效率的’因為其大部分時間並被未使用。相較於 早元傳送而言《小或較小之傳送並未獲得叢發傳送之優 1因此將指定資料、指令、地址及寫入遮罩。如此之短 專送通常發生於内部之快取記憶體,以從此類常用運 減輕叢發傾向之記憶體。 請,意模型_及850中之假設,寫入遮罩係包含資 4 ’但疋右要獲得叢發傳送之優點,則將寫人遮罩鱼指令 加以連結是不足夠的。對此情況,將來自指令及資料之寫 入遮罩傳达之去搞合(De_pling)將可被理解為—種新的 指令。在包含不可分割傳送(框)之單一串列串流中,寫入 鲁,令係被發布於其-框中之位址,且其資料串流對於所計 异之記憶體而言係定址於一種框序列中,且寫入遮翠係被 描述為一個別指令並被應用於單位叢發,以發布於寫入浐 令之後,且位於所需資料中。一單位叢發係被定義為位^ 之數量,為實施之單-寫入遮罩位元乘上寫入遮罩指令中 之寫入遮罩位元數量。當發布寫入指令時,寫入遮罩將被 清除,使後續資料被寫入。若寫入遮罩指令立刻跟隨著寫 入指令’其應用將開始於第一單位叢發。 … 若使用模型800中所述之重複模式,則遮罩將重複於 27 201019120 所有單位叢發m詩錢巾改變,縣發布一種額 外寫入遮罩指令’使新的寫入遮罩應用至所有後續資料。 ,使用模型85G中所述之起始模式,則寫人遮罩將於第一 皁位叢發之後被清除。若需要額外遮罩(譬如於末端單位叢 ,中),則將發布-種額外寫人遮罩指令,其僅應用於下一 單位叢發,而那時寫入遮罩將被清除。 對於杈型875而言,係利用一種多埠版本之模型8〇〇, 其中遮罩係被重複,但WMSK指令同時發生肖職指令同 時發生,但於不同之蟑上。若使用多串列介面,其可能產 生更複雜的指令配置。若一起使用兩璋’舉例而言,寫入 指令可被結合於一埠上,而第一寫入遮罩於另一璋上以 改善頻寬之利用性。 圖九顯示一種利用多埠以重複核對(DupHcati〇n 及指令解讀之步驟實施例。在區塊902開始處理埠 之步驟’在處理區塊904隨著多埠之第一埠之處理。在處 ❹里區塊906,在第一埠經由對應之Rx接收具有資料之資料 =流(接著將被解碼)。參考處理區塊9〇6之用語「埠m+i」, m」表不連結群組,而「丨」表示連結群組中之數目。在 此實施例中’ i開始於零,而因為實施單—主機,因此历 等於零。在決策區塊9〇8將確認第一埠(埠〇)有益任何铒 Γ若發現錯誤,在_ 942將以—錯誤轉㈣結束步;^ 若未毛現錯誤,步驟將繼續至處理區塊910以確認下— 埠:直到確認過所有埠。舉例而言,在決策區塊 912將確 心疋否剩下任何埠。若是,步驟將繼續至處理區塊9%並 28 201019120 至下一埠。若否,步驟將繼續至處理區塊914。 在決策區塊916將確認重複是否開啟。若是,在決 區塊918將確認當前之埠並根據其結果,其步驟將於區塊 942以轉回錯誤結束,或(若未開啟重複,則返回參考決策 區塊916)步驟將繼續至決策區塊92〇,其中將確認此埠是 否具有資料。若其資料並未重複其將不會進行比較。若有 資料,在決策區塊936將執行寫入運作。若未進行寫入, 在區塊942步驟將以轉回錯誤而結束。若執行寫入運作,’ 則在處理區塊938資料將從所有埠被寫入記憶體,且在區 塊940執行一正常轉回。 °° 々請返回參考決策區塊920,若埠並不具有資料,在決 策區塊922將執行指令有效性確認。在決策區塊922將確 認痒指令是否有效’舉例而言,將確認一指令列表。若指 令並非有效,則步驟將結束於區塊942。若發現指令有效, 在决策區塊924將確認指令是否於序列中(譬如,指令是否 ❹在正確位置)。若指令並非位於序列中,在區塊942將轉回 錯誤。若發現指令位於序列中,則將於處理區塊926進行 指令處理。 在决策區塊928將確認下一埠,以觀察下一 槔疋否有重複資料。因為資料之重複通常牵涉到一對埠, 在處理區塊930其埠數量將以二增加以確認下兩埠。請返 回參考決策區塊928,若答案為是,則在處理區塊934將 ,擇下(單一)埠。接著步驟將前進至決策區塊932,以確認 疋否要處理更多埠。若是,步驟將繼續至決策區塊916。 29 201019120 右否’在區塊94 0將發布·一正常轉回。 在一實施例中’資料係接收於一璋且指令係接收於此 埠。指令被處理於指令解讀器248(如圖二Α所示),在此 處表示為區塊922、924及926。而重複核對係執行於以兩 個三角254所表示之位置(如圖二a所示),在此處表示為 區塊 916、918 及 920。 圖十表示一種指令重複模型1 〇〇2、1004及1006之實 施例。在一實施例中指令重複係用以改良錯誤偵測。指令 ⑮係被傳送兩次,且原始指令被比較於重複指令。若使用一 或二埠1002及1004,在原始指令1008及1〇12之後框上 立即重複指令1010及1 〇 14。若使用四或以上之蟑丨, 則重複指令1016及1 〇 18將出現於其他蟑。 指令係經過特定選擇,因為:(1)在一種連結埠情況 ▲中,重複可用以填滿未使用之頻寬;(2)指令之錯誤解讀可 能造成未預料之結果,譬如違反指令排序(譬如,啟動已啟 ⑩動之庫,或寫入未啟動之庫)或破壞不相關於目前傳送之一 記憶體位置;反之,若一指令為正確,則任何壞資料至少 限制於目前之傳送;及(3)雖然重複資料可產生較好之社 果,但有效系統頻寬將變為一半,因為對於指令之可得自 由空間並非可得於資料串流中。 才"重複模型、1GG4卩丨刪顯示—種單一 Γ重2複=複之連結璋_及_之組合。其更顯 ,及多指令如何一起工作 中可傳送最大量為二之不同指令。 在门才[時間 201019120 在單一埠模型1002中,指令係單一地發布,且其重複 1010跟隨於指令1008之後。對於兩埠模型1〇〇4而言重 複指令1014係傳送於相同框時間。然❿,若重複被關閉, 兩指令可佔據此框時間。對於四或以上之埠模$ 而 言’二(或以上)之指令1020及1〇22可伯據一框時間,且 兩指令贈及1()22皆可於相同框時間中被重複為重 t 1〇16及1G18。其對於同時傳送之指令數量或於群組中 埠數量之粒度(Granularity)並無必須之限制。 —^據使用模型_2、_4及,其係可機會地執 在某些情況中這將節省潛時,但其將衡量錯誤機 得,則錯誤的結果將立刻可得。 製自為可 上述内容之目的在於解釋本 細節是為了提供對於本發= 2定 春=:在其他-些例子中,-些習知的結構; 在圖中之元件間可能包含中二= 說明或心件可包含額外之輸入或輸出而未加以 本發明之各種實施例可包 由硬體元件加以實施,或可包含於電腦程步驟可藉 指令中,可用以產生一一 式或電腦可讀取 有指令以執行這些步驟之邏輯電二特別目的之處理器或具 可藉由硬體及軟體之結合:;=:選擇性地,這些步驟 31 201019120 說明書中所述之一或多個模、_ 相關於像是關於實施例之多主機改^或成分,位於或 軟體及/或以上之組合。在一實施^機制’可包含硬體、 社貫施例中一模組包含軟體、敕 ==及1或設定,可經由-機器/電™體: f“加以提供。-製品可包含具有内容之機器可存 取/讀取之媒體以提供指令、資料或其他等 子裝置(舉例而言,像是_濾波器、一磁碟顿述之— 磁碟控制器)’以執行所述各種操作及執行動作。 ❿ ^明各種實施例之部分可由-電腦程式產品所提 供,其可包含一電腦可讀取媒體,具有儲存於其中之電腦 程式指令’其可用以程式化一電腦(或其他電子裝置)以執 打根據本發明實施例之步驟。機器可讀取媒體可包含但不 限於軟碟、光碟、唯讀光碟(CD-ROM)、磁光碟 (Magnet〇-0ptical Disk)、唯讀記憶體、隨機存取記憶體、 可抹除可程式化唯讀記憶體(EPR〇M)、電子可抹除可程式 ❹化唯讀記憶體(EEPR0M)、磁或光學卡、快閃記憶體或其 他形式之適用於儲存電子指令之媒體/機器可讀取媒體。此 外,本發明亦可被下載為電腦程式產品,纟中程式可從一 遠端電腦傳送至一提出請求之電腦。 所述之許夕方法係為其最基本之形式,但在未背離本 發明之基本範圍下,這些任何方法之步驟可加以增添或刪 除,且所述訊息之資訊可增加或減少。明顯地,本領域中 技藝者可據此加以修改或變更。特定之實施例並非用以提 供限制而是為了說明本發明。本發明實施例之範圍並非由 32 201019120 特又實化例所決定,而是以下述申請專利範圍。 若°兒月元件「A」是耗合至元件「B」,元件A可直 接地輕合至疋件B或間接地經由譬如元件C加以搞合。當Command, WMSK) can be accompanied by a Write Command (WR). At 埠0, the user can receive instructions, but related dependent instructions can be received by other , to retain the command bandwidth. The commands ACT, ABNK, WR, and WMSK are further described in Figures 7B through 7D for reference. In addition, a single option initiates the repetition of instructions to improve error detection to avoid erroneous instructions from abnormal memory operating conditions. When this option is activated, a single iteration will be compared to a Duplicate of one of the instructions in the first box and the 10 subsequent boxes. When linking and using two or more of 277 and 278, no additional bandwidth is required for repetition because the repeat is displayed at the same time at the same time. Although the instructions are repeated, the material is not repeated in this embodiment. When there are at least four links 278, it will be possible to use both repeat and attached instructions. Figure 2D shows an embodiment of a smart mobile phone architecture 280 comprising a baseband processor 282 and an application processor 281, and individual volatile memory (e.g., DRAM 274, SRAM/DRAM 283), non- Volatile memory (such as NAND 272 and NOR flash 273) and communication channel (shared memory) 269 are located between the two processors 281 and 282. Memory 272 to 274 and 283 are used to store and receive executable code and maintain privacy information that is not shared by the connected processor. Any sharing or communication is performed by communication channel 269. The application processor 281 can be coupled to other peripheral devices such as the camera 201 and the display device 203. Figure 2E shows a smart phone architecture 284 for the selected embodiment of Figure 2D, which includes SPDRAM 285. In an embodiment, 11 201019120 忑 is hidden between the baseband processor 282 and the application processor 28 i. In an embodiment, SPDRAM 285 can be used to communicate baseband processor 282 and application processor 281, storage code and data for both processors, and to reduce the number of memory elements or technologies needed to implement the architecture. In addition, the number of links between the memory and the processor can be reduced, including the exclusion of specific communication channels. In one embodiment, a segmentation is provided such that some hosts can access a portion of the memory while others are not. This allows the memory device to be shared in a secure environment, such as for a baseband software. For example, the application processor 281 can load the baseband software image into the spDRAM and instruct the baseband processor 282 that its image is in place. The baseband processor 282 will then remove access to other hosts and confirm the validity of this image. If its correct baseband processor 282 can continue to operate from the image, it does not need to be split with the software running on the application 281. Figure 2F shows an embodiment of a multi-master connection configuration 286, 287, and 288. In an embodiment, the multi-host function can be combined with the multi-tan link example. If a host (such as an application processor 281) needs more bandwidth, it can utilize many interfaces of its interface, while other hosts can Continue to use single-埠. In the embodiment diagram, some combinations 286 (two independent hosts, dual ports/hosts), 287 (two independent hosts, single factory itch, host 2: double 璋) of a plurality of hosts are connected on a four-inch device and 288 (four independents per host). For example, in the combination 286, the hosts i and 2: correspond to two. In combination 287 'host! The interface is; 应, and the interface of the host 2 corresponds to two 埠 2 and 3. In the combination of Mg, the interface of each host corresponds to a single frame. It can be understood that in the embodiment, 12 201019120 can be used for any combination of hosts or interfaces, for example, a single host can connect all four ports together. What is actually corresponding to the host is dependent on the beta of the scratchpad, which gives the length of the connected itchy group. Figure 2G shows an embodiment of a multi-host interface memory 292. In the figure, the multi-host 埠 connected memory (multi-host memory system) 292 includes four 埠 mo and communicates with the memory core 291 including eight memories 289. For the sake of brevity, only a limited number of files and memory 289 are shown. Although the multi-host memory system 292 is similar to the single host memory system 204 of FIG. 2, the data from each of the 埠29〇 is available to each bank 289. In this embodiment, the memory 289 is defined as A portion of the overall multi-host memory system 292 that can track data transfers independently. In addition, by providing individual accesses, a single frame during the command period can be associated with a single-memory bank of memory 289 and does not conflict with accessing other memory banks. The multiplexer 293 and the multiplexer 294 are multiplied to produce a possible embodiment of a crossbar switch to direct data between the plurality of groups of memory banks 289 and 290. Figure 2 shows an embodiment of a 淳 link control register for up to sixteen ,, and a repeat instruction acknowledge register 296 for up to sixteen 。. For the sake of brevity, the connection used in the embodiment occurs in a continuous number of multiples of multiples (such as 埠丨, 2 or 4), and with the matching criterion (Matching ModUlus) (譬如埠〇 corresponds to four 埠, 埠〇 or 2 corresponds to two 槔, or any 埠 corresponds to a single 埠).确认 You can confirm your identity in the -link group based on the settings of the scratchpad. The figure shows a bee control register 295 in combination with a sixteen inch. This link describes and provides a '13 201019120 P white layer mode, 111 if the bit is not set, all the devices will be independent, s seven in two devices, only use the bit open n阜 will be independent Operation. For 士一一, the position 兀0 is used, and for the four 埠 device: 埠连^ to 4 describes the possible link 'by adding to the four, 阜, '. For the person's wearer = the rest of the person is in the position..., the four-way link is at the position of 5 yuan 13 and all the 埠 are at the position το 15°. This mode can continue without limit. Alone = '蟑• can not belong to a link group, in which case they can be part of a link group and can be operated individually: ^ can be more than - link group - part of a pair The processing technique of this punch = is to select the largest link group. When the temporary cache 295 is used to add a link group, the next command is then used in the content of the link group, and it is not necessary to release a new game 7 until a new one is ready. When a trip is removed from a link group, it may be stopped or used immediately afterwards. In the embodiment, each bit is allocated to a temporary register to confirm the Duplicate Command, as shown in the figure. Indeed < Register 296. If a group is linked to any group, it will confirm the private value of its, ·, λ埠. If it is not linked to a group, it will be repeated in its continuous cycle. FIG. 2I shows an embodiment of a linked demultiplexer 294. In one embodiment, the p〇rt Ready Lanes 294a (e.g., p〇n_rdy lanes) are generated by an individual and are given a link instruction. For example, when four 埠29〇 are linked, all 埠 preparation channels 294a will be issued. However, if only 29〇2 and 3 are linked to the two groups 201019120, the preparation channel 294&_ kiss [3: 2] will be released. Similar to the two right to operate alone! ‘Only publish p〇rt—_(1). This technique is used to confirm the normal transfer size and routing from 珲290 to memory 289, and to create a global memory character lock (Latches). Figure 2J shows an embodiment of Tables 2973 and 2975 which shows the route of the linker. For example, when the data arrives, the demultiplexer routes the data to the correct channel according to the routing function table 297a. The ® register is then based on the function of enable_fn (298 in Figure 2) to capture the data for the memory into the correct channel, as shown in Table %. Once all the information has been locked, the core storage data is ordered by enaMe & using wr-strobe. By using a parallel data path, write masking or copying of selected data can be achieved. At the beginning of the storage period, all masks are set according to, for example, the (four) lane is prohibited. At the time of the arrival, the masks of the relevant parts are routed and stored with the data. If not all data arrives (such as interruption or short transmission (ShGrt Trans (4)), only the arrival data will be stored 'because the data channel has not arrived so there is no chance to clear the relevant mask. Figure 2K shows a connected multiplexer (9) In an embodiment, a preparation (such as a read_cmd) signal and a read command (such as read_cmd) are read from a memory ("much", called delay 2" The data arriving at the memory (and the lock) can be selected immediately for output 埠 290. The choice of this is simply achieved by using the delayed output value 15 201019120. The P from the read command (m_rdy channel 2 is interpreted, similar to Solution: The multiplexer channel is mapped to the output 埠29. It is based on the function of Figure 2. <clothing is not for the sake of brevity, taking data characters from memory bits in every cycle, with storage or delay: rate and the need for more (four) Ge 阜. If you want more data to get the data than the output cycle, then you will build a cat, and a cedar heart, along with the pre-fetch Buffer. There are two: ': Select a shorter segment across consecutive cycles. Condition 22 can be linked to the pre-fetch buffer. To throttle the data, the instruction interpreter divides the read instruction into shorter quantities and issues intermediate instructions at a lower rate to match the output rate. Figure 3 shows a step-by-step embodiment of the block synchronization. Initially, the memory system shut down the power to reset 302. In order to turn on the power, the Link Power Close (Link P〇wer_D〇wn, /LPD) is driven to the high bit 304, causing /LPD to be equal to zero and stopping. However, when /LpD is equal to one, then block search 306 is initiated to find the particular code or bit order known as SYNc. When SYNC is detected, the step proceeds to an operational mode 3〇8. This step can continue with multiple (if implemented), as shown in Figure 4. Since the host and the memory exchange data in tandem, the receivers are synchronized to confirm the positional correspondence of the bits in a frame. In order to confirm the correct synchronization, the link searches for a specific bit sequence in the "FrameSearch" case 3〇6. For example, initially the serial link transmits one of the two 201019120 synchronization bit sequences: SYNC and SYNC2. The physical layer of the transmitter (Rx-PHY) can detect these framed data packets by using the host and the memory. SYNC plays a key role in Link Bring-up after a reset or error. In addition, in normal operation, the memory Tx-PHY will transmit SYNC in any unused box. In normal operation, the host Tx-PHY will transmit SYNC or SYNC2 in the unused box. When SYNC is detected and recognized from the memory, its steps proceed to normal operating mode 308. If the frame fails, such as an indication _ a twentieth decoding error, then for example the memory will switch back to the "box search" case 306 until SYNC is detected again. In any state, if /LPD goes to zero, it means that 埠 will switch back to the "Link Down" state 304 and start over. The memory transmits SYNC2 to indicate an error in receiving the host data, or because a "link closed" status or a framed error is left. The host exclusively responds by sending a SYNC until the memory is re-established and the S YNC is transmitted. The host transfers S YNC2 between instructions for proper error recovery operation. The SYNC and SYNC2 setup and reply connections are framed, and the host schedules the establishment of a coordinated link. Figure 4 shows an embodiment of a step for power control. It receives /LPD. At decision block 402, it is determined if a turn is on. The slash before /LPD indicates the inverse logic. For example, if the link is power-off when /LPD is equal to zero, indicating that its power is not on, the step will proceed to block 412 to stop the master. Similarly, /LPD equals one to indicate that the power is not off, indicating that it is power on. If /LPD is equal to one (e.g., the connected power is turned on) then 17 201019120 performs a training step in which block 404 is searched for a particular code or sequence of bits (e.g., SYNC). The training step for the SYNC search will continue until SYNC is detected, and then in processing block 4〇6, the steps will enter an operational mode. This step will be further described in Figure 3. At decision block 408, a confirmation is made as to whether there is an error. If so, step = will continue to decision block 402. If not, then in decision block 41, it will be confirmed whether more 埠 are successively entered into the operational mode. If no other defects are added, the step will continue to operate mode block 406. However, if an additional flaw is detected, the step will continue to process block 414 to train the new trick. Some additional (multiple) systems are processed in processing block 416. The use of multiple uses is also described in Figure IX for reference. At decision block 418, an error will be confirmed. If an error is confirmed, such as in the decision block 430, the power is turned off by the itch (e.g., /LPD = G). If so, all blocks will be stopped at processing block 432 and the step will advance to a single mode of decision block 434. If /LPD is not zero in decision block 434 (e.g., /LpD = i), then all processing blocks will be trained and the process will continue to processing block 416. Referring back to decision block 43, if /LpD is not zero (e.g., /LPD = 1) then the step will proceed to the training error in processing block 428 and continue to processing block 416. The eye returns to reference decision block 418, and if no error is found, then decision block 420 will additionally determine if there are more 璋 joins. If so, the step will continue to the new training step of processing block 414 (e.g., looking for each new =YNC). If there is no extra 开启 on the right, you will be confirmed in the decision block 疋 No 移除 will be removed. If $, the step will continue to processing block 416. If 18 201019120 is, processing block 424 will stop any removal. Here, at decision block 426, it will be confirmed if a single defect is available to switch back to the single-璋 mode. If so, the step will continue to the single mode of processing block 4〇6. If not, the step will continue to the multi-mode mode of processing block 416. Right power control 238 (refer to Figure 2A) is responsible for transmitting /LpD, while 埠 to 212 is responsible for training itself and the processing in Figures 3 and 4. Figure 5 shows the steps for repeating confirmation and instruction interpretation using a single unit. More complex steps using multiples will be shown in Figure 9. As further described in FIG. 5, in the processing block 5〇2, the receiving, reading and decoding of the data are performed in a frame (such as the first port or the main port) to start the first frame. At ^ block 504, it will be confirmed if a defect is detected. If an error is detected, at block 528, the error will be returned to the end step. If the debt is not detected, the decision block 506 will confirm whether the repetition is turned on. It should be understood that its re-use and need are turned on or off. If the repetition is turned on, the step proceeds to the data reception, reading, and decoding of the step block 508, which is performed immediately after the second frame. Furthermore, in the decision block 5 〇 〇 will confirm that it is a detected error. If it is, the step will end in the block like error back. If not, the step will continue to block 512 to confirm if the first block is equal to block = 2, and the step will continue to block Μ, if not, the step will continue until & block 528 ends with an error turn back. Block 5^ does not 7 to bee error (and does not open the repeat, please return to the reference decision area), : will continue to confirm whether this box is an instruction or data. If this box ^曰々' is in decision block 516 it will be confirmed if this instruction is valid. The traits are *, , and invalid, and the steps will end in block 528 with an error back. If the instruction = 19 201019120 = 518 will confirm whether the instruction is in the sequence or in the column, the step will end in the transmission (4). If the instruction is in the sequence, block 520 will process the instruction and block 530 will post a normal return.凊 Return to reference decision block 514, if the box is (4), block 522 will confirm whether the memory is ready for write operation. If no ^ (4) at block 528, the error is reversed. If so, the processing block 524 will be written to the memory & and the step will end at block 53. The steps of 'blocks 516, 518, and 52' in this embodiment are performed as shown in Figure 6 for an example of the steps in which various functions are performed. Block 602 provides the steps of receiving, reading, and decoding the #stream stream. For example: 埠-埠 receives a single stream of data (in bits) via RX, and in the figure shows a parallel string "IL and then decodes (such as using 1 decoding). Yu uses the Linked Power Off (/LPD) signal to control all of the power (via the power control mechanism), allowing power to enter and out of the single host to connect all of the bee memory' as shown in Figure 2A (Figure 2A) The dotted line in the middle indicates the power control of /LPD). In decision block 6〇4 it will be confirmed if /LpD is equal to zero. If zero, the step will end at block 614 with a return error. However, if /LpD is not equal to zero, the step will continue to process block 606. At processing block 606, a data frame is received that contains 埠 to receive a bitwise data stream and produces a parallel stream of frames (e.g., twenty-bit deserialization). The block is decoded at processing block 6〇8 (e.g., using 20 201019120 17B/20B decoding techniques) to then generate valid data. In decision block 61〇 the validity of the box will be confirmed. For example, it will be confirmed if the box has a twenty-bit code and is correctly decoded as a seventeen-bit value. If the conversion fails, because it is not clear that it has not produced any results, its validity will fail and an error will be returned at block 614. However, if the conversion is successful and results are produced, the data frame will be considered valid and will be returned normally at block 612, as will be further illustrated in Figure 9. Figure 7A shows an embodiment of a 17-bit post-decoding (p〇st_Dec〇ded) box 700. A seventeen bit decoding block 7A of the embodiment diagram can be used to transfer seventeen bit data, instructions and/or status, and is converted to produce a twiter box for serial transmission. Data, instructions and status are transmitted and received in a twentieth box. The opposite step will be performed when receiving' where a ten-bit conversion coding frame is decoded to produce a seven-bit box 700' to hold data, instructions and status. The seventeen-bit post-decoding block (format) 700 shown in the figure presents the first sixteen elements in the payload (Payload) 702, and contributes the last bit (ie, the seventeenth bit) to the reward. Indicator 704. The memory access format is based on the basic decoding format. For example, bit sixteen 704 may indicate that its payload is set to one or zero for use in data, instructions, or status. On the basis of the box-to-frame, the instructions and the written data can share their recipient links. To reduce latency, instructions can be inserted (or prioritized) into a write stream to delay the completion of the write command. Figure 7B shows an embodiment of an instruction, status or data encoding frame format 72. The embodiment of the figure includes, but is not limited to, an embodiment of an instruction, status, and data encoding block 720 of a serial port 21 201019120 DRAM (SPDRAM). The seventeen-bit coding block 720 in the figure is extensible, which provides flexibility to preserve multiple bits, and additional instructions can be added in the future (for example, depending on technology or demand changes). For example, flag 722 and secondary instruction 724 occupy seven bits (bits zero through seven) before block 720, and because most of the items in secondary instruction 724 are one, this region (including flag region 722) Additional instructions (e.g., up to sixteen instructions) may be added in the future to expand block 720. Similarly, other regions have a limited range (e.g., mode register ❹ group 726), which can also be used for additional instructions (e.g., the instruction region of mode register group 726, which contains only three instructions). Another such area is the DRAM instruction group 728 (e.g., the sub-instruction area of the DRAM instruction group 728, all of which is one), which can be used to add other instructions. The SYNC 730 controls and maintains the link frame synchronization, while the SYNC2 732 represents a specific link operation state. Both SYNC 730 and 732 are further described with respect to Figures 3 and 4. The data (frame) 734 contains a seventeen 10-bit box similar to the data box 700 shown in Figure 7A, which contains the seventeenth bit set to one and the next two octets. The current library instruction (ABNK) 736 and the current order (ACT) 738 will be illustrated in Figure VII. Write command (WR) 740 initiates a memory write cycle for a particular bank and column. The write mask (WMSK) 742 sets an octet byte mask for writing instructions in the step and has any effect with the WR instruction 740. WMSK 742 will be further illustrated in Figure 7D for reference. Read (RD) 744 represents a read command to initiate a 22 201019120 memory read cycle ' and Burst St〇P (BSTP) 746 represents an instruction to interrupt a current read. The instruction is fetched or written' and is based on the particular library. Precharge (PCG) 748 represents an instruction 'to pre-charge a particular bank of instructions, and Precharge All (PCA) 750 contains an instruction to pre-charge all banks simultaneously. Per-Bank Refresh (REFB) 752 enables specific banks to be automatically updated' and All-Bank Refresh (REFA) 754 updates all banks based on an internal counter. All ® banks are in the precharge state prior to the release of the REFA instruction. Mode Register Write (MRW) 758 represents an instruction to perform a write to a mode register. Mode Register Write Data (MRD) 760 provides write data following the MRW instruction 758, in the next middle frame, from 埠0, and in the form of an MRD instruction 760. Mode Register Read (MRR) 756 represents an instruction to perform a read from a mode register. The Self-Refresh Power-Down (SPRD) 762 will cause the memory core to immediately enter the self-renew state. Power-Down Exit (PDX) 764 represents an instruction that is issued to leave the self-updating power off and used to wake up the memory core after the connection is established. Figure 7C shows an embodiment of ABNK (埠0(ΑΒΝΚ)) and ACT commands 736 and 738. In order to transmit two or more instructions at the same time, they will support each other's functions or functionality as orthogonal to each other. The third criterion involves complexity' because Memory Semantics or implementation decisions can fail due to orthogonality. For example, the serial 埠 DRAM can include 23 201019120 with an instruction to initiate - the memory bank, and its column address to be started is too long for the frame. In a single case, this command would require a box of two or more, but for a link, it could be traversed in a box time on two or more. For example, ABNK 736 sets the upper five bits of target memory 753 and column address 755 for subsequent ACT instructions 738. The ACT instruction 738 is passed to the memory bank 753 specified in the last ABNK instruction 736. If two or more links are connected, a selective ABNK _ 736 instruction may appear on the second. The lower fifteen-bit 765 of the column address is assigned to the least significant fifteen of the ACT instruction 738, and the most significant five-bit element is assigned to the lower five bits of the last ABNK instruction 736, or appears ABNK 770 of Yu埠2. This embodiment indicates that instructions 736 and 738 can operate independently at any time in subsequent frames. This provides a variable group size, a general controller independent of the group size, and the semantics of the cross-link. Additionally, instructions 738 and 770 are complementary to each other and can be executed simultaneously. In addition, 757 represents a higher column address. Figure 7D shows an embodiment of WMSK and WR commands 742 and 740. Figure 7D shows a WR command 742 and associated byte/write mask 742 for selective writing. WMSK 742 represents an instruction that is set to use an octet mask 772 for the WR instruction 740 in the step and follows the WR instruction 740 to have any effect. After communicating the octet data, the mask 772 restarts the next octet. The letter "H" in the mask 772 indicates the high byte of the character conversion (e.g., bits 15 to 8), and the "L" indicates the low byte (e.g., bit 7 to 0). 24 201019120 Dish l 740 Start - a memory write cycle, to the specified memory 774 and line 776. Once the transfer order is 74〇, enter the data. If two or more links are linked, then a selective wms^ 780 = transmitted to 璋2' covers or masks the first octet (778). The masking claw repeats the female eight-bit 70' unless it is reset by the continuation of the WMSK instruction. Other embodiments of the two or more links include simultaneous reading and writing of the combination or simultaneous activation and writing of the person, according to the semantics of the memory and interface. Figures 8A, 8B and 8C show an embodiment of the write mask models _, 85 〇 and 875. For memory using serial communication, the number of bits transmitted by the reduced = indivisible is used to reduce the latent time. An indivisible transmission is defined as a box or a character length ^ 'the description of the total amount of data (such as a byte) or an executable to contain any intermediate operations that need to complete the instruction. Metadata such as "write" and destination address. For most memory, the write operator contains both the command, the address, the operator (in this case, the mask), and the data to be written. For faster memory devices, a description is required. The speed of the instruction becomes too high, so the Bum Transfer is used. The burst transmission starts at the instruction and the start data, but continues at the data stream and subsequent calculated addresses (such as increased No matter when the data is transmitted, it is accompanied by an additional mask indication signal. - Because the command and address may be required for non-continuous data transfer, the serial communication is followed by 'instruction, address, write mask and data. Encoding may be inefficient. In this case, the data will be transmitted with the boundaries & directives and addresses using 25 201019120 bursts. To reduce latency, write masks or WMSK instructions (such as one per tuple) Elementary) only need to accompany the data stored in the burst where the position value is not stored. Although such optimization can be critical for tandem interface efficiency, this mechanism can be used to reduce parallel memory. In order to improve the practicability of multi-host memory, by placing a write mask into the instruction stream, each host can use an independent write mask for independent transfer. Include WMSK companion data to reduce latency correlation' A three-purpose model will be used and illustrated in the burst. ❿ Figure VIIIA shows an embodiment of a repetitive pattern WMSK model 800. It contains the previous memory content (Mem 〇ry c〇ntents Bef〇re)8〇2, instruction stream 804 and subsequent memory contents (Mem〇ry c〇ntents After) 806. In the figure, WMSK is repeated in each transmission, for example, in containing red, green Only the red value is changed in the rectangle of the blue data, the other two colors will be masked, and the WMSK will repeatedly span all the three primary colors in this rectangle. Figure VIIIB shows an embodiment of the start and end WMSK model 85〇, including The suffix content 852, the instruction stream 854, and the subsequent memory content 856. Here, the write mask is only used for the initial portion of the transmission. For example, the network packet can start at Odd (〇dd) transfer boundary The first group of the transmitted four-bytes is optimized to access (align) the rest of the data structure in the packet. Once the starting mask is exhausted, the entire remaining packet data will be written to In order to complete the transfer, a new WMSK will be inserted to trim the last two tuples. Figure VIIIC shows an embodiment of a WMSK model 875 that utilizes a multi-string interface for repeating mode, including the previous memory. The body content 876, the command stream 878, and the subsequent memory contents 88. Among them, 26 201019120 WMSK is used for an I-cast. For example, the transmission size is selected - the data node, the eleven-digit number Only the second byte is written in the integer. Reference. Two: Mode _^ Qing & 85. Writer masks are reused or infrequently used to transmit the form (such as cache writers and mass storage to mask). In these cases, the data is included in the mask. Efficiency 'because it is not used most of the time. Compared to the early transmission, "small or smaller transmissions do not get the best of the burst transmission 1 so the data, instructions, addresses and write masks will be specified. Such a short delivery usually occurs in the internal cache memory, in order to reduce the tendency of the memory from such common use. Please, the assumptions in the Italian model _ and 850, the write mask contains 4 ' but疋Right to get the advantages of the burst transmission, it is not enough to link the man-masking fish instructions. In this case, the de-pling will be conveyed from the command and data. It is understood as a new instruction. In a single serial stream containing an indivisible transmission (box), the code is written to the address in its frame, and its data stream is calculated. The memory of the different memory is addressed in a sequence of frames and written in the occult It is described as a separate instruction and is applied to the unit burst to be published after the write command, and is located in the required data. A unit burst is defined as the number of bits ^, for the implementation of single-write The masked bit is multiplied by the number of write mask bits in the write mask instruction. When the write command is issued, the write mask will be cleared, so that subsequent data is written. Immediately follow the write command 'its application will start in the first unit burst. ... If you use the repeat mode described in model 800, the mask will repeat on 27 201019120 all units of the censored money change, the county released An additional write mask instruction 'applies a new write mask to all subsequent data. Using the start mode described in model 85G, the write mask will be cleared after the first soap level is burst. If additional masking is required (such as in the end unit bundle, medium), an additional write mask command will be issued that will only be applied to the next unit burst, and the write mask will be cleared at that time. For the Type 875, it uses a multi-version version of the model 8〇 The mask is repeated, but the WMSK instruction occurs at the same time, but on the other hand. If multiple serial interfaces are used, it may generate more complicated instruction configuration. In other words, the write command can be combined on one frame and the first write mask on the other frame to improve the usability of the bandwidth. Figure 9 shows a multi-turn for repeated verification (DupHcati〇n and instruction interpretation) The step embodiment begins at block 902 where the processing step 904 follows the processing of the first block in the processing block 904. In the block 906, the first frame receives the data via the corresponding Rx. The data = stream (which will then be decoded). The term "埠m+i" is used in the reference processing block 9〇6, the m" table does not link the group, and the "丨" indicates the number in the link group. In the example, 'i starts at zero, and because the single-host is implemented, the calendar is equal to zero. In decision block 9〇8, it will be confirmed that the first 埠(埠〇) is beneficial. If any error is found, the _ 942 will end with _ ERROR (4); ^ If the error is not detected, the step will continue to the processing block. 910 to confirm - 埠: until all 埠 has been confirmed. For example, in decision block 912, it will be true that there are no flaws left. If so, the steps will continue until the processing block is 9% and 28 201019120 to the next. If no, the step will continue to processing block 914. At decision block 916, it is confirmed whether the repetition is on. If so, at block 918, the current acknowledgment will be confirmed and, depending on the result, the steps will end at block 942 with a return error, or (if the repetition is not turned on, return to reference decision block 916) the steps will continue until the decision is made. Block 92〇, which will confirm whether this file has data. If the information is not repeated, it will not be compared. If there is data, a write operation will be performed in decision block 936. If no writes have been made, the block 942 step will end with a return error. If a write operation is performed, then at processing block 938 the data will be written to the memory from all ports and a normal switch back is performed at block 940. °° 返回 Please return to the reference decision block 920. If there is no data, the decision block 922 will perform the instruction validity check. In decision block 922, it will be confirmed whether the iteration command is valid. For example, a list of instructions will be confirmed. If the order is not valid, the step will end at block 942. If the command is found to be valid, then at decision block 924 it will be confirmed if the instruction is in the sequence (e.g., if the instruction is in the correct position). If the instruction is not in the sequence, an error will be returned at block 942. If the instruction is found to be in the sequence, then processing is performed at processing block 926. The next block will be confirmed in decision block 928 to see if there is duplicate data in the next. Since the repetition of the data usually involves a pair of defects, the number of defects in the processing block 930 will be increased by two to confirm the next two. Returning to decision decision block 928, if the answer is yes, then in processing block 934, select (single) 埠. The next step will proceed to decision block 932 to confirm if more defects are to be processed. If so, the step will continue to decision block 916. 29 201019120 Right No' will be released in block 94 0. In one embodiment, the data is received and the command is received. The instructions are processed by instruction interpreter 248 (shown in Figure 2), shown here as blocks 922, 924, and 926. The repetitive check is performed at the position indicated by the two triangles 254 (as shown in Fig. 2a), and is represented here as blocks 916, 918 and 920. Figure 10 shows an embodiment of an instruction repetition model 1 100 2, 1004, and 1006. In one embodiment, the instruction repetition is used to improve error detection. Instruction 15 is transmitted twice and the original instruction is compared to the repeated instruction. If one or two of 1002 and 1004 are used, the instructions 1010 and 1 〇 14 are repeated immediately after the original instructions 1008 and 1〇12. If four or more are used, the repeat command 1016 and 1 〇 18 will appear on the other side. The instruction is specifically selected because: (1) in a connection ▲ case ▲, the repetition can be used to fill the unused bandwidth; (2) the misinterpretation of the instruction may result in unanticipated results, such as a violation of the order of instructions (eg , start a library that has been activated, or write to an unstarted library) or destroy a memory location that is not related to the current transmission; conversely, if an instruction is correct, then any bad data is limited to at least the current transmission; (3) Although repeated data can produce better results, the effective system bandwidth will be half, because the free space available for instructions is not available in the data stream. Only "repetition model, 1GG4 卩丨 delete display - a single Γ weight 2 complex = complex link 璋 _ and _ combination. It is more obvious, and how multiple instructions work together can deliver a maximum of two different instructions. At the door [time 201019120 in the single 埠 model 1002, the instructions are issued singly, and their repetition 1010 follows the instruction 1008. For the two models 1〇〇4, the repeat command 1014 is transmitted at the same frame time. Then, if the repetition is turned off, the two instructions can occupy this frame time. For four or more models $, the two (or more) instructions 1020 and 1〇22 can be counted as one frame time, and the two instructions give 1 () 22 can be repeated in the same box time as a heavy t 1〇16 and 1G18. There is no limit to the number of instructions that can be transmitted at the same time or the granularity of the number of 中 in the group. -^ According to the use of models_2, _4 and, it is possible to perform in some cases which will save potential, but it will measure the error, and the wrong result will be available immediately. The purpose of the above is to explain the details in order to provide for the present invention = 2 in the other - some examples, some of the conventional structures; between the components in the figure may contain the second two = description Or the core member may include additional inputs or outputs without the various embodiments of the invention being implemented by hardware components, or may be included in a computer program step, which may be used to generate one-to-one or computer-readable A processor or device having instructions for performing these steps may be a combination of hardware and software:; =: selectively, one or more of the modules described in the specification of the step 31 201019120, _ related to a plurality of host modifications or components, for example, in combination with or in combination with software and/or the like. In an implementation mechanism, a module may include a hardware, a module containing software, 敕== and 1 or a setting, which may be provided via a -machine/electric TM body: f". The article may contain content The machine can access/read the medium to provide instructions, data or other sub-devices (for example, like a filter, a disk-disc-disk controller) to perform the various operations described. And the execution of the action. ❿ ^ Part of the various embodiments may be provided by a computer program product, which may include a computer readable medium having computer program instructions stored therein that can be used to program a computer (or other electronic The device is configured to perform the steps according to the embodiments of the present invention. The machine readable medium may include, but is not limited to, a floppy disk, a compact disk, a CD-ROM, a magneto-optical disk, and a read-only memory. Body, random access memory, erasable programmable read only memory (EPR〇M), electronic erasable programmable read only memory (EEPR0M), magnetic or optical card, flash memory or Other forms are suitable for storing electronic instructions The body/machine can read the media. In addition, the present invention can also be downloaded as a computer program product, and the program can be transmitted from a remote computer to a requesting computer. Forms, but without departing from the basic scope of the invention, the steps of any of these methods may be added or deleted, and the information of the message may be increased or decreased. Obviously, those skilled in the art may modify or modify the information accordingly. The specific embodiments are not intended to be limiting, but to illustrate the invention. The scope of the embodiments of the present invention is not determined by the specific embodiment of 32 201019120, but the scope of the following patent application. It is consumed to the component "B", and the component A can be directly coupled to the component B or indirectly via the component C. when

書或凊求項說明一元件、特徵、結構、步驟或特性A w成」一 tl件、特徵、結構、步驟或特性B,其表示「A」 至少部分造成了「B」,但其中可包含至少另一元件、特徵、 結構L步驟或特性有助於造成「B」。若說明書中表示「可」、 ❹可或「可以」包含一元件、特徵、結構、步驟或特 ,則並非必需包含此特定元件、特徵、結構、步驟或特 性。若說明書中矣干「 斗、「 _ 盆僅目士 —個」元件,那並不表示 八僅具有一個所述元件。 提及「一實施例 個實施例 實把例是本發明之—種實施方式或範例。說明書中 一眘大& t Γ l a . 些實施例」或「其 他實施例」,表示實施例之一特定元件、特徵、 或特性係包含於至少一此實祐 ’ 實施例’但未必於所有實施例。 鬱種U —實施例」、「一個實施例」《「一些實 未必意指所有實施例。應理解在 」 :、·各種特徵有時是群料—單—實_ 1式或其中 之-者咬多:= 助於理解各種發明觀點 利之二二方法不可解讀為表示請求專 發U要於各㉔求項中超過所記載之更多特徵 如說,下述請求項反應出發明之觀點依賴於少於 一 述揭露之實施例之所有特徵。因此,藉此特意將::: 入說明卜各請求項表示其本身為本發明之_獨立實施= 33 201019120 【圖式簡單說明】 本發明之實施例剌以舉例說明而非 之圖式中相同之元件符號係表示相同之元件。限制,伴隨 一圖一顯示一種RS_232之習知串列位元配置 顯不-種單—主機連結埠記憶體之實施例;圖二 =A 種四埠記憶體之單-主機連結之實施例;圖2 ”肩不— 對於一單一主機介面之埠連結選擇之實施例,·圖種 一種智慧型行動電話架構之實施例;圖二£顯示對^不 參D之利用串列璋DRAM(SPDRAM)之一種智慧型電話架: 之選擇性實施例;圖二F顯示多主機連結構形之實施例. 圖二G顯示一種多主機連結埠記憶體之實施例;圖^ =顯 不用於可達十六埠之一種埠連結控制暫存器及用於可達十 六琿之一種重複指令確認暫存器之實施例;圖二I顯示— 種連結解多工器之實施例;圖二j顯示連結解多工器路由 表之實施例;圖二K顯示一種連結解多工器之實施例;圖 ❾二L顯示一種連結多工器路由表之實施例;圖三顯示一種 用於框同步化之步驟實施例;圖四顯示一種用於電力控制 之步驟實施例;圖五顯示一種實施單一埠進行重複確認及 指令解讀之步驟實施例;圖六顯示一種在一蟑中接收及解 碼框之步驟實施例;圖七A顯示一種十七位元的後-解碼 框(格式)實施例;圖七B顯示一種指令、狀態及資料編碼 框之實施例;圖七C顯示一種現行記憶庫及現行指令之實 施例;圖七D顯示一種寫入遮罩及寫入指令之實施例;圖 八A、八B及八C顯示一種寫入遮罩模型之實施例;圖九 34 201019120 顯示一種利用多埠以重複核對及解讀指令之步驟實施例; 及圖十顯示一種指令重複模型之實施例。 【主要元件符號說明】 100習知串列位元配置 102、122停止位元 104、106、108、110、112、114、116、118 位元 12 0開始位元 124值 φ I26成框位元 128框 200埠記憶體 201攝像機 202記憶體核心 203顯示裝置 204埠記憶體系統 206、208、210、212 埠 ❹ 214、216、218、220 ' 222 ' 224 ' 226、228 外部記憶 體介面 230、232、234、236 星號 238電力控制 240 PLE訊號 242 /LPD 訊號 244鎖相迴路 246 REFCLK 訊號 35 201019120 248指令解讀器 250模式暫存器 252、254 三角 256指令訊號 258庫訊號 260遮罩訊號 262解多工器 264多工器 ® 269溝通通道 270單一主機連結 271主機 272 NAND 快閃 273 NOR快閃A book or a claim means that a component, feature, structure, step or characteristic A w is a t-piece, feature, structure, step or characteristic B, which means that "A" at least partially causes "B", but may include At least another component, feature, structure L step or characteristic contributes to "B". It is not necessary to include a particular element, feature, structure, step or feature in the specification. If the instructions in the manual are "fighting, "_ basin only meticulous", it does not mean that eight has only one of the components. Reference is made to "an embodiment of the present invention, which is an embodiment or an example of the present invention. In the specification, a cautionary embodiment" or "another embodiment" means one of the embodiments. Particular elements, features, or characteristics are included in at least one such embodiment, but not necessarily in all embodiments. "U", "Usage", "an embodiment", "some do not necessarily mean all embodiments. It should be understood": , · various characteristics are sometimes group material - single - real - 1 or any of them - Biting more: = Helping to understand the various ideas of the invention. The method can not be interpreted as indicating that the request for the special issue U is more than the stated characteristics in each of the 24 claims. The following claims reflect the idea of the invention. Less than all of the features of the disclosed embodiments. Therefore, it is specifically intended that::: indicates that each request item indicates itself as the invention _ independent implementation = 33 201019120 [Simplified description of the drawings] The embodiments of the present invention are illustrated by way of example and not by the drawings. The component symbols are the same components. Limitation, with the accompanying FIG. 1 showing an RS-232 conventional serial bit configuration, an embodiment of a single-host connection memory; FIG. 2 is an embodiment of a single-host connection of a four-four memory; Figure 2 "Shoulders" - an embodiment of a single host interface connection selection, an embodiment of a smart phone architecture; Figure 2 shows the use of tandem DRAM (SPDRAM) An intelligent telephone rack: an alternative embodiment; FIG. 2F shows an embodiment of a multi-host connection structure. FIG. 2G shows an embodiment of a multi-host interface memory; FIG. An embodiment of a connection control register and a repeat instruction confirmation register for up to sixteen; FIG. 2I shows an embodiment of a connection demultiplexer; FIG. An embodiment of a multiplexer routing table is illustrated; FIG. 2K shows an embodiment of a connection demultiplexer; FIG. 2L shows an embodiment of a connection multiplexer routing table; and FIG. 3 shows a method for frame synchronization. Step embodiment; Figure 4 shows one for power control Step embodiment; FIG. 5 shows an embodiment of a step of performing a single acknowledgment for repeated acknowledgment and instruction interpretation; FIG. 6 shows an embodiment of a step of receiving and decoding a frame in one ;; FIG. Post-decoding block (format) embodiment; Figure 7B shows an embodiment of an instruction, status, and data encoding block; Figure 7C shows an embodiment of a current memory bank and current instructions; Figure 7D shows a write mask And an embodiment of a write command; FIGS. 8A, 8B, and 8C show an embodiment of a write mask model; FIG. IX 34 201019120 shows an embodiment of a step of using multiple scans to repeatedly check and interpret instructions; An embodiment of an instruction repetition model is shown. [Main element symbol description] 100 conventional serial bit configuration 102, 122 stop bits 104, 106, 108, 110, 112, 114, 116, 118 bits 12 0 start Bit 124 value φ I26 into frame bit 128 frame 200 埠 memory 201 camera 202 memory core 203 display device 204 埠 memory system 206, 208, 210, 212 埠❹ 214, 216, 218, 220 ' 222 ' 224 ' 226, 228 external memory interface 230, 232, 234, 236 asterisk 238 power control 240 PLE signal 242 / LPD signal 244 phase-locked loop 246 REFCLK signal 35 201019120 248 instruction reader 250 mode register 252, 254 triangle 256 command signal 258 library signal 260 mask signal 262 solution multiplexer 264 multiplexer® 269 communication channel 270 single host link 271 host 272 NAND flash 273 NOR flash

274 DRAM 275埠連結選擇 〇 276 -埠 277 二埠 278四或多埠 279表274 DRAM 275 埠 Link Selection 〇 276 - 埠 277 2 埠 278 4 or more 279

280智慧型行動電話架構 281應用處理器 282基頻處理器 283 SRAM/DRAM 285 SPDRAM 36 201019120 286、287、288多主機連結構形 289記憶庫 290槔 291記憶體核心 292多主機連結埠記憶體 293多工器 294解多工器 294a埠準備通道 ❿ 295埠連結控制暫存器 296重複指令確認暫存器 297a、297b 表 298 enable_fn 299讀取潛時 302、304、306、308 步驟 402、404、406、408、410、412、414、416、418、420、 ⑩ 422、424、426、428、430、432、434、436 步驟 502、504、506、508、510、512、514、516、518、520、 522、524、528、530 步驟 602、604、606、608、610、612、614 步驟 700框 702酬載 704酬載指標 7 2 0編碼框格式 722旗標 37 201019120 724次指令 726模式暫存器群組 728 DRAM指令群組 730 SYNC 732 SYNC2 734資料 736現行庫指令 738現行指令 © 740寫入指令 742寫入遮罩 744讀取 746叢發停止 748預充 750預充全部 752逐庫更新 φ 7 5 3記憶庫 754所有庫更新 755列位址 756模式暫存器讀取 757較高列位址 758模式暫存器寫入 760模式暫存器寫入資料 762自我再新電力關閉 764電力關閉離開 38 201019120 765較低列位址 770 ABNK 772遮罩位元 774記憶庫 776行位址 778遮罩 780選擇性WMSK指令 800、850、875 模型 ❹ 802、852、876之前的記憶體内容 804、854、878指令串流 806、856、880之後的記憶體内容 902、904、906、908、910、912、914、916、918、920、 922、924、926、928、930、932、934、936 ' 938、 940步驟 1002、1004、1006 模型 ⑩ 1008、1012原始指令 1010、1014、1016、1018 重複指令 1020、1022 指令 39280 smart mobile phone architecture 281 application processor 282 baseband processor 283 SRAM/DRAM 285 SPDRAM 36 201019120 286, 287, 288 multi-host connection structure 289 memory 290 槔 291 memory core 292 multi-host connection 埠 memory 293 The multiplexer 294 demultiplexer 294a prepares the channel 295. The link control register 296 repeats the command confirmation register 297a, 297b. Table 298 enable_fn 299 reads the latency 302, 304, 306, 308. Steps 402, 404, 406, 408, 410, 412, 414, 416, 418, 420, 10 422, 424, 426, 428, 430, 432, 434, 436 Steps 502, 504, 506, 508, 510, 512, 514, 516, 518 , 520, 522, 524, 528, 530 steps 602, 604, 606, 608, 610, 612, 614 step 700 block 702 payload 704 payload index 7 2 0 coding frame format 722 flag 37 201019120 724 instructions 726 mode Register group 728 DRAM instruction group 730 SYNC 732 SYNC2 734 data 736 current library instruction 738 current instruction © 740 write instruction 742 write mask 744 read 746 burst stop 748 pre-charge 750 pre-fill all 752 library Update φ 7 5 3 Memory Library 754 All Library Updates 755 Address 756 mode register read 757 higher column address 758 mode register write 760 mode register write data 762 self-renew power off 764 power off leave 38 201019120 765 lower column address 770 ABNK 772 mask bit 774 memory 776 row address 778 mask 780 selective WMSK instruction 800, 850, 875 model 802, 852, 876 prior to memory contents 804, 854, 878 instruction stream 806, 856, 880 Subsequent memory contents 902, 904, 906, 908, 910, 912, 914, 916, 918, 920, 922, 924, 926, 928, 930, 932, 934, 936 '938, 940 steps 1002, 1004, 1006 Model 10 1008, 1012 original instructions 1010, 1014, 1016, 1018 repeat instructions 1020, 1022 instructions 39

Claims (1)

201019120 七、申請專利範圍: 1. 一種用於降低記憶體潛時之方法,包含: 於一主機電腦系統及-記憶體之間經由位於該記憶體 之一埠或一組埠於多時間間隔溝通資料,其中該 腦系統係耦合於該記憶體,該琿為該組埠之一;及 於該主機電腦系統及該記憶體之間經由該淳或节 於一單一時間間隔溝通相關於該資料之一指令。心、阜 ❿2· 求項i所述之用於降低記憶體潛時之方法 =於該組埠中之隸㈣可於記憶體運作中任何時^ 3·如請求項1所述之用於降低記憶體潛時 :::::該组埠…被該指令佔據’而對該指::: 項3所述之用於降低記憶體潛時之方法,其中1 後續♦曰令係如同該指令於該單一時間間隔中溝通、。“ 求項1所述之用於降低記憶體潛時之方法,八 維持該組埠之埠而溝通該指令之-重複指令Λ 3 6. 201019120 元之數量 罩資訊之資料’以更降低溝通位 之方法’其中該 ::求項6所述之用於降低記憶體潛時 、’’、機制於一資料串流中係可改變。 之方法,其中該 同 8.:凊求項7所述之用於降低 溝通框中寫入未包含遮罩資 料機制係經祕改,⑽止於其中 11 溝通柩由舍、L, 丨罵入指令之相 訊之資料 述之用於降低記憶體潛時之方法,其中該 、、、機制係自動地重複於後續單元傳送。 10:=:所述之用於降低記憶體潛時之方法,其中該 遮罩機制在一皁一單元傳送後停止。 魯 該 二求項1所述之用於降低記憶體潛時之方法,其中 才曰々可插入於一寫入資料串流中。 狀 12.如=求項i所述之用於降低記憶體潛時之方法,其中 態資訊可插入於一寫入資料串流中。 /、 13.一種用於降低記憶體潛時之裝置,包含·· 主機電腦系統,輕合至一今,陪妙 — 祠口主°己隐體,该記憶體經由位於 201019120 :»己隐體之一痒或一組埠於多時間間隔從該主機電腦 系統接收資料,該埠為該組埠之一;及 ^ U e 'it體係㈣修改’以經由料或該組璋於一單 -時間間隔從該主機電腦系統接收相關於該資料之一 14.201019120 VII. Patent application scope: 1. A method for reducing the memory potential time, comprising: communicating between a host computer system and a memory via a time interval of one or a group of the memory Information, wherein the brain system is coupled to the memory, the 珲 is one of the group ;; and the host computer system and the memory communicate with the data at a single time interval via the 淳 or the node An instruction. Heart, 阜❿2· The method described in item i for reducing the potential of memory = in the group 埠 (4) can be used in any operation of the memory ^ 3 · as described in claim 1 for lowering Memory latent time::::: The group 埠...occupied by the instruction' and the finger::: Item 3 is used to reduce the memory latency, wherein 1 subsequent ♦ command is like the instruction Communicate in this single time interval. "The method described in Item 1 for reducing the potential of the memory, the eighth is to maintain the group and communicate the instruction - repeat the instruction Λ 3 6. The amount of information of the 201019120 yuan cover information to reduce the communication bit The method of: wherein: the method of claim 6 is for reducing memory latency, '', the mechanism is changeable in a data stream, wherein the same is as described in 8. It is used to reduce the mechanism of writing data without masks in the communication box, and (10) the information in the communication, which is used to reduce the memory latency. The method, wherein the mechanism is automatically repeated in subsequent unit transmissions. 10:=: The method for reducing memory latency, wherein the mask mechanism stops after a soap unit is delivered. The method for reducing memory latency according to the second item 1, wherein the method can be inserted into a write data stream. 12. The use of the item i is used to reduce the memory potential. In the time method, the state information can be inserted into a write data stream. /, 13. A device for reducing the potential of memory, including · host computer system, light to the present, with the wonderful - 祠口主°己隐体, the memory via the 201019120 :» 隐隐One of the itching or a set of time to receive data from the host computer system at multiple intervals, the 埠 is one of the group ;; and ^ U e 'it system (4) modify 'to pass the material or the group 璋 in a single-time The interval is received from the host computer system related to one of the data. 如明求項13所述之用於降低記憶體潛時之裝置,直令 該埠於該組料之隸屬係可於記憶體運作中任何時間 15.如請求項14所述之用於降低記憶體潛時之裝置, =體係經過修改,以經由維持該組㈣不被該指 7佔據,而接收該指令之後續指令。The apparatus for reducing the memory latency as described in claim 13, wherein the membership of the group is directly operable at any time during the operation of the memory. 15. The method of claim 14 is for reducing memory. The device of the submersible, the system is modified to receive subsequent instructions of the instruction by maintaining the group (4) not occupied by the finger 7. 16.如請求項15所述之用於降低記憶 4後續指令係如同該指令於該單一 體潛時之裝置,其中 時間間隔中溝通。 Π.如請求項B所述之用於降低記憶體潛時之裝置, 該記憶體係經過修改,以經由維持 I /、中 才曰令之一重複指令。 坎叹〇豕 18.如請求項17所述之用於降低記憶體潛時 該記憶體更被修改,以實施一遮覃 ,其中 入貝料之相同溝通框中禁止寫入未勺人 」该寫 匕3遮罩資訊之資 42 201019120 料以$降低溝通位元之數量及記憶 201019120 體潛時 置,其中 = : 19所述之用於降低記憶體潜時之穿置“ «罩機制係經過修改,以禁止 :裝置,其中 同溝同該寫入指令之;j:曰 φ 寫入未包含遮罩資訊之資料。 21.如μ求項2G所述之用於降低記憶體潛時 該遮罩機制係自動重複於後續單元傳送。、,八中 中 22::r:— ❿23·:::項13所述之用於降低記憶體潛時之裝置,盆中 孩才曰π係可插入於一寫入資料串流中。 、 24. 如:求項13所述之用於降低記憶體潛時之裝置,其 狀態資訊係可插入於一讀取資料串流中。 、 25. —種用於降低記憶體潛時之系統,包含: -主機電腦系統,耦合於一記憶體,該記憶體使用一埠 連結系統以降低記憶體潛時,料連#系統包含複數個 43 201019120 埠以溝通資料及指令,其中該複數個埠之二或以上之埠 係可被連結為一或多組埠,該埠連結系統係用以: 經由位於該記憶體之一埠或一組埠於多時間間隔而於 該主機電腦系統及該記憶體之間溝通資料,該埠係為該 組埠之一;及 經由該#或該組槔於一# —日夺間間隔於該主機電腦系 統及該記憶體之間溝通相關於該資料之一指令。 ® 26·如請求項25所述之用於降低記憶體潛時之系統,其中 S亥埠於該組埠中之隸屬係可於記憶體運作中任何時間 r\T — * 27·=:求項25所述之用於降低記憶體潛時之系統 二車連結系統更經過修改,以經由維持該料 W 曰令佔據而溝通該指令之後續指令。 28·如π求項25料之歸降低記憶體料 5亥後續指令係如同該指令於該單—時間間隔中溝通、。 該二:Γ统 用於/低記憶體潛時之系統,其t 溝通改’以經由维持該組痒之卑而 30.如請求 ,23所述之用於降低記憶體潛 J、示統,其中 44 201019120 該埠連結系統係實 之相同溝通框中林 降低溝通位元之數 施一遮罩機制,以於如同該寫入資料 止寫入未包含遮罩資訊之資料,以更 量及記憶體潛時。 31.如請求項3 、+ 述之用於降低記憶體潛時之李铋甘士 该遮罩機制係 am Α + τι糸統,其中 货J於—貧料串流中改變。 3 2.如請求項3丨张、+、 〇 _ . _ 斤述之用於降低記憶體潛時之系祐^ , 该遮罩機制係痤 卞疋糸統,其中 同溝通框中寫入包含遮罩資 — U過修改,以禁止於如同該寫人指令之相 訊之資料。 3.二:求項32所述之用於降低記憶 该遮罩機制係更經過修改,以林 手之系統,其中 相同溝通框中寫入去匕g 如同該寫入指令之 甲冩入未包含遮罩資訊之資料。 ❹34::求項33所述之用於降低記憶體潛 "”、罩機制係自動重複於後續單元傳送。、、’其中 35. 月求項25所述之用於降低記憶體潛時之系統其中 °亥才曰令係可插入於一寫入資料串 流中。 45 201019120 37.如請求項25所述之用於降低記憶體潛時之系統,其中 狀態資訊係可插入於一讀取資料串流中。16. The method for reducing memory as described in claim 15 wherein the subsequent instructions are as if the instructions were in the single body, wherein the time interval is communicated.装置. The apparatus for reducing memory latency as described in claim B, the memory system being modified to repeat the command by maintaining one of the I/, 中 commands. The sigh is 18. The memory used in claim 17 is used to reduce the potential of the memory, and the memory is modified to implement a concealer, wherein the same communication box in the bedding is prohibited from writing to the unspoken person. Write 匕3 mask information information 42 201019120 It is expected to reduce the number of communication bits and memory 201019120 body time, which = 19 is used to reduce the memory potential of the wear-through "The cover mechanism is Modify to prohibit: the device, wherein the same groove is the same as the write command; j: 曰 φ writes the data that does not contain the mask information. 21. The μ2 is used to reduce the memory potential. The mask mechanism is automatically repeated in subsequent unit transmissions. , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , In the data stream, the device is inserted into a data stream, and the state information is used to reduce the memory potential. A system for reducing memory latency, comprising: - a host computer system coupled to a memory The memory uses a link system to reduce the memory latency. The system contains a plurality of 43 201019120 埠 to communicate data and instructions, wherein the plurality of 埠 or more can be linked to one or The plurality of sets of ports are configured to: communicate data between the host computer system and the memory via one or more of the time intervals of the memory, the system being the group And one of the instructions relating to the information is communicated between the host computer system and the memory via the ########. In the system for reducing the potential of the memory, the membership of the group in the group can be used at any time during the operation of the memory r\T — * 27·=: claim 25 for reducing the memory potential The system two-car connection system has been modified to communicate the subsequent instructions of the instruction by maintaining the material. 28·If the π-item 25 material is reduced, the memory material is followed by the instruction. Communicate in the single-time interval, The second: the system used for / low memory latent time, its t communication changed 'to maintain the group of itching humble 30. As requested, 23 used to reduce memory potential J, the system , where 44 201019120 the link system is the same as the communication frame in the same communication box, the number of communication bits is reduced by a masking mechanism, so as to write the data without the mask information as the data is written, Memory potential. 31. As requested in Item 3, +, which is used to reduce the memory latency, the mask mechanism is am Α + τι糸, where the cargo J changes in the lean stream. 3 2. If the request item 3 丨 、, +, 〇 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ Masking - U has been modified to prohibit access to information such as the instructions of the writer. 3. Two: The method described in Item 32 is used to reduce memory. The mask mechanism is modified to the forester's system, in which the same communication box is written to 匕g as if the write command was not included. Mask information. ❹34:: referred to in item 33 for reducing memory potential", the mask mechanism is automatically repeated in subsequent unit transmissions.,, wherein 35. month claim 25 is used to reduce memory latency. The system can be inserted into a write data stream. 45 201019120 37. The system for reducing memory latency according to claim 25, wherein the state information can be inserted into a read In the data stream. 4646
TW98136752A 2008-10-23 2009-10-29 Method, apparatus and system for reducing memory latency TWI467381B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US60513408A 2008-10-23 2008-10-23
US10948008P 2008-10-29 2008-10-29

Publications (2)

Publication Number Publication Date
TW201019120A true TW201019120A (en) 2010-05-16
TWI467381B TWI467381B (en) 2015-01-01

Family

ID=44831612

Family Applications (1)

Application Number Title Priority Date Filing Date
TW98136752A TWI467381B (en) 2008-10-23 2009-10-29 Method, apparatus and system for reducing memory latency

Country Status (1)

Country Link
TW (1) TWI467381B (en)

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4807189A (en) * 1987-08-05 1989-02-21 Texas Instruments Incorporated Read/write memory having a multiple column select mode
US6851033B2 (en) * 2002-10-01 2005-02-01 Arm Limited Memory access prediction in a data processing apparatus
US7657667B2 (en) * 2004-03-25 2010-02-02 International Business Machines Corporation Method to provide cache management commands for a DMA controller
US20060090059A1 (en) * 2004-10-21 2006-04-27 Hsiang-I Huang Methods and devices for memory paging management
US7346713B2 (en) * 2004-11-12 2008-03-18 International Business Machines Corporation Methods and apparatus for servicing commands through a memory controller port

Also Published As

Publication number Publication date
TWI467381B (en) 2015-01-01

Similar Documents

Publication Publication Date Title
KR101611516B1 (en) Method and system for improving serial port memory communication latency and reliability
JP2575557B2 (en) Super computer system
JP4742116B2 (en) Out-of-order DRAM sequencer
TW200428406A (en) Multiport memory architecture, devices and systems including the same, and methods of using the same
US8762616B2 (en) Bus system and bridge circuit connecting bus system and connection apparatus
TW200832406A (en) Memory controller including a dual-mode memory interconnect
WO2021207919A1 (en) Controller, storage device access system, electronic device and data transmission method
JPH10340224A (en) High performance and high band width memory using sdram and system therefor
US20050125590A1 (en) PCI express switch
US6675251B1 (en) Bridge device for connecting multiple devices to one slot
US7822903B2 (en) Single bus command having transfer information for transferring data in a processing system
CN103077132A (en) Cache processing method and protocol processor cache control unit
US20040066673A1 (en) Memory device and system having a variable depth write buffer and preload method
CN1996276A (en) Data transmission of multiple processor system
TWI250405B (en) Cache bank interface unit
US7549074B2 (en) Content deskewing for multichannel synchronization
CN110008162B (en) Buffer interface circuit, and method and application for transmitting data based on buffer interface circuit
CN109032818A (en) A kind of method of isomorphism system internuclear synchronization and communication
CN109616149A (en) A kind of eMMC host controller, eMMC control system and control method
TW201019120A (en) Method, apparatus and system for reducing memory latency
US20090265485A1 (en) Ring-based cache coherent bus
CN209249081U (en) A kind of eMMC host controller and eMMC control system
Liljeberg et al. Asynchronous interface for locally clocked modules in ULSI systems
CN101228733B (en) Method of data transmission between two asynchronous system and asynchronous data buffer
TW548923B (en) Data register in communication system and method thereof