WO2011069780A1 - Cache memory and access method - Google Patents

Cache memory and access method

Info

Publication number
WO2011069780A1
WO2011069780A1 (PCT/EP2010/067524)
Authority
WO
WIPO (PCT)
Prior art keywords
gate
signal
gated
access
read
Application number
PCT/EP2010/067524
Other languages
English (en)
Inventor
Bao Truong
Michael Ju Hyeok Lee
Samuel Ward
Original Assignee
International Business Machines Corporation
Ibm United Kingdom Limited
Application filed by International Business Machines Corporation, Ibm United Kingdom Limited filed Critical International Business Machines Corporation
Priority to CN201080055681.1A priority Critical patent/CN102652311B/zh
Publication of WO2011069780A1 publication Critical patent/WO2011069780A1/fr

Links

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 12/00: Accessing, addressing or allocating within memory systems or architectures
    • G06F 12/02: Addressing or allocation; Relocation
    • G06F 12/08: Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F 12/0802: Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F 12/0893: Caches characterised by their organisation or structure
    • G06F 12/0895: Caches characterised by their organisation or structure of parts of caches, e.g. directory or tag array
    • G: PHYSICS
    • G11: INFORMATION STORAGE
    • G11C: STATIC STORES
    • G11C 11/00: Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor
    • G11C 11/21: Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using electric elements
    • G11C 11/34: Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using electric elements using semiconductor devices
    • G11C 11/40: Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using electric elements using semiconductor devices using transistors
    • G11C 11/41: Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using electric elements using semiconductor devices using transistors forming static cells with positive feedback, i.e. cells not needing refreshing or charge regeneration, e.g. bistable multivibrator or Schmitt trigger
    • G11C 11/412: Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using electric elements using semiconductor devices using transistors forming static cells with positive feedback, i.e. cells not needing refreshing or charge regeneration, e.g. bistable multivibrator or Schmitt trigger using field-effect transistors only
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 2212/00: Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F 2212/10: Providing a specific technical effect
    • G06F 2212/1028: Power efficiency
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • The present application relates generally to a cache access memory, a method for gating a read access of a row of such a memory, and a design structure.
  • Random access memory most commonly refers to computer chips that temporarily store dynamic data to enhance computer performance. By storing frequently used or active files in random access memory, a computer may access the data faster than if the computer retrieves the data from a far larger hard drive. Random access memory is volatile memory, meaning it loses its contents once power is cut. This is different from non-volatile memory such as hard disks and flash memory, which do not require a power source to retain data. When a computer shuts down properly, data located in random access memory is committed to permanent storage on the hard drive or flash drive. At the next boot-up, RAM begins to fill with programs automatically loaded at startup and with files opened by the user.
  • Random access memory, which may also be referred to as a cache memory array, comprises a plurality of memory cells, with an individual logic circuit associated with each memory cell.
  • Cache memory arrays may also employ the concept of a valid bit. Each logical row of memory cells contains at least one bit used to indicate whether the data stored is valid or invalid. Traditionally, the lookup would occur regardless of the state of the valid bit. Additional logic after the memory array output would discard the data returned from a read operation if the value stored for the valid bit denoted invalid data.
  • The memory cell used to store the valid bit may differ from traditional cells, such as the 6T cell. This difference consists of a reset port that may switch the state of the cell without the need for a standard wordline-driver-enabled access.
  • A method is provided, in a cache access memory, for gating a read access of any row in the cache access memory that has been invalidated.
  • the illustrative embodiment sends, by an address decoder in the cache access memory, a memory access to a non-gated wordline driver and a gated wordline driver associated with the memory access.
  • the illustrative embodiment determines, by the non-gated wordline driver, whether the memory access is a write access or a read access. Responsive to the non-gated wordline driver determining the memory access as being the read access, the illustrative embodiment outputs, by the non-gated wordline driver, the data stored in a valid bit memory cell to the gated wordline driver.
  • the illustrative embodiment determines, by the gated wordline driver, whether the memory access is the write access or the read access.
  • the illustrative embodiment determines, by the gated wordline driver, whether the data from the valid bit memory cell from the non-gated wordline driver indicates either valid data or invalid data. Responsive to the data being invalid, the illustrative embodiment denies, by the gated wordline driver, an output of the data in a row of memory cells associated with the gated wordline driver.
  • a cache access memory may comprise an address decoder in the cache access memory that sends a memory access to a non-gated wordline driver and a gated wordline driver associated with the memory access.
  • the non-gated wordline driver determines whether the memory access is a write access or a read access and outputs the data stored in a valid bit memory cell to the gated wordline driver in response to the non-gated wordline driver determining the memory access as being the read access.
  • the gated wordline driver determines whether the memory access is the write access or the read access, determines whether the data from the valid bit memory cell from the non-gated wordline driver indicates either valid data or invalid data in response to the gated wordline driver determining the memory access as being the read access, and denies an output of the data in a row of memory cells associated with the gated wordline driver in response to the data being invalid.
  • a design structure embodied in a machine readable medium for designing, manufacturing, or testing an integrated circuit is provided.
  • The design structure may be encoded on a machine-readable data storage medium and may comprise elements that, when processed in a computer-aided design system, generate a machine-executable representation of the cache access memory.
  • the design structure may be a hardware description language (HDL) design structure.
  • the design structure may comprise a netlist and may reside on a storage medium as a data format used for the exchange of layout data of integrated circuits.
  • Figure 1 is an exemplary block diagram of a processor in accordance with an illustrative embodiment;
  • Figure 2 illustrates a high-level example of a typical cache memory array comprising multiple memory cells in accordance with an illustrative embodiment;
  • Figure 3 depicts an example of a typical memory cell in accordance with an illustrative embodiment;
  • Figure 4 illustrates one example of a cache memory array comprising multiple memory cells and valid bit memory cells in accordance with an illustrative embodiment;
  • Figure 5 depicts one exemplary implementation of a non-gated wordline driver in accordance with an illustrative embodiment;
  • Figure 6 depicts one exemplary implementation of a gated wordline driver in accordance with an illustrative embodiment;
  • Figure 7 depicts an example of a valid bit memory cell in accordance with an illustrative embodiment;
  • Figure 8 is a flowchart outlining an exemplary operation of a cache memory array using a valid bit memory cell and the gated wordline driver in accordance with one illustrative embodiment; and
  • Figure 9 is a flow diagram of a design process used in semiconductor design, manufacture, and/or test.
  • The illustrative embodiments provide a mechanism for gating the read access of any row in a cache access memory array (for example, a Static Random Access Memory (SRAM) based cache memory) that has been invalidated.
  • Figure 1 is provided as one example of a data processing environment in which a cache memory array may be utilized, i.e. in a cache of a processor.
  • Figure 1 is only offered as an example data processing environment in which the aspects of the illustrative embodiments may be implemented and is not intended to state or imply any limitation with regard to the types of, or configurations of, data processing environments in which the illustrative embodiments may be used. To the contrary, any environment in which a cache memory array may be utilized is intended to be within the spirit and scope of the present invention.
  • FIG. 1 is an exemplary block diagram of processor 100 in accordance with an illustrative embodiment.
  • Processor 100 includes controller 102, which controls the flow of instructions and data into and out of processor 100. Controller 102 sends control signals to instruction unit 104, which includes L1 cache 106. Instruction unit 104 issues instructions to execution unit 108, which also includes L1 cache 110. Execution unit 108 executes the instructions and holds or forwards any resulting data to, for example, L2 cache 112 or controller 102. In turn, execution unit 108 retrieves data from L2 cache 112 as appropriate. Instruction unit 104 also retrieves instructions from L2 cache 112 when necessary. Controller 102 sends control signals to control storage or retrieval of data from L2 cache 112.
  • Processor 100 may contain additional components not shown, and is merely provided as a basic representation of a processor and does not limit the scope of the present invention.
  • the hardware in Figure 1 may vary depending on the implementation.
  • Other internal hardware or peripheral devices, such as flash memory, equivalent non-volatile memory, or optical disk drives and the like, may be used in addition to or in place of the hardware depicted in Figure 1.
  • the processes of the illustrative embodiments may be applied to a multiprocessor data processing system, without departing from the spirit and scope of the present invention.
  • the data processing system 100 may take the form of any of a number of different data processing systems including client computing devices, server computing devices, a tablet computer, laptop computer, telephone or other communication device, a personal digital assistant (PDA), or the like.
  • Data processing system 100 may be a portable computing device which is configured with flash memory to provide non-volatile memory for storing operating system files and/or user-generated data, for example.
  • data processing system 100 may be any known or later developed data processing system without architectural limitation.
  • Figure 2 illustrates a high-level example of a typical cache memory array 200 comprising multiple memory cells 202 in accordance with an illustrative embodiment.
  • Memory cells 202 in a particular row 204 are connected to one another by wordlines 208.
  • Wordlines 208 of each row 204 are also connected to wordline drivers 210 which receive output 212 from address decoder 214 that identifies which row 204 is to be output and cache memory array 200 outputs the corresponding data entry through data outputs 216.
  • Memory cells 202 in a particular column 206 are connected to one another by a pair of bitlines 218 which are driven to complementary values during read/write operations and are traditionally precharged to the supply voltage.
  • the true and complement bitlines 218 feed bitline evaluators 220, which may be sense amplifiers, to convert the differential signal to a single-ended signal for use in logic downstream.
  • address decoder 214 receives an address associated with a read/write access from external logic 222. Address decoder 214 decodes the address and signals the particular one of wordline drivers 210 associated with the decoded address using output 212. The particular one of wordline drivers 210 then fires due to the signal from address decoder 214 and the data in the associated row 204 of memory cells 202 is output through data outputs 216 if the access is a read access or, if the access is a write access, data is written to memory cells 202 in associated row 204.
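  • As a frame of reference only, the conventional flow just described can be sketched behaviorally in Python; the class and method names below are hypothetical, and the model abstracts away bitlines, precharge, and sense amplifiers:

```python
# Minimal behavioral sketch of a conventional, non-gated cache array access:
# the address decoder selects one row, the wordline driver always fires, and
# the row is either written or read out.

class SimpleCacheArray:
    def __init__(self, num_rows, row_width):
        self.rows = [[0] * row_width for _ in range(num_rows)]

    def _decode(self, address):
        return address % len(self.rows)        # simplified address decode

    def access(self, address, is_write, write_data=None):
        row = self._decode(address)            # address decoder selects a row
        if is_write:
            self.rows[row] = list(write_data)  # wordline fires, row is written
            return None
        return list(self.rows[row])            # wordline fires, row drives the bitlines
```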
  • Figure 3 depicts an example of a typical memory cell, such as one of memory cells 202 of Figure 2, in accordance with an illustrative embodiment.
  • Memory cell 300 forms the basis for most static random-access memories in CMOS technology.
  • Memory cell 300 uses six transistors 301-306 to store and access one bit.
  • Transistors 301-304 in the center form two cross-coupled inverters, as illustrated in the more simplified memory cell 310 comprising inverters 311 and 312. Due to the feedback structure created by inverters 311 and 312, a low input value on inverter 311 will generate a high input value on inverter 312, whose output feeds back and stores the low value on inverter 311.
  • Conversely, a high input value on inverter 311 will generate a low input value on inverter 312, whose output feeds back and stores the high value on inverter 311. Therefore, inverters 311 and 312 will retain their current logical value, whatever that value is.
  • Lines 317 and 318 between inverters 311 and 312 are connected to separate bitlines 319 and 320 via two n-channel pass-transistors 315 and 316.
  • the gates of transistors 315 and 316 are driven by wordline 321.
  • wordline 321 is used to address and enable all bits of one memory word. As long as wordline 321 is kept low, memory cell 310 is disconnected from bitlines 319 and 320.
  • Inverters 311 and 312 keep feeding themselves and memory cell 310 stores its current value.
  • When wordline 321 is high, both transistors 315 and 316 are conducting and connect the inputs and outputs of inverters 311 and 312 to bitlines 319 and 320. That is, inverters 311 and 312 drive the current data value stored inside memory cell 310 onto bitline 319 and the inverted data value onto inverted bitline 320.
  • This data may then be amplified by a bitline evaluator, such as bitline evaluators 220 of Figure 2, and generates the output value of memory cell 310 during a read operation.
  • To write new data into memory cell 310, wordline 321 is activated and, depending on the current value stored inside memory cell 310, there might be a short-circuit condition and the value inside memory cell 310 is literally overwritten. This only works because transistors 301-304 that make up inverters 311 and 312 are very weak. That is, transistors 301-304 are considered weak because, when new data is to be written to transistors 301-304, the current state of transistors 301-304 may be easily overridden with the new state.
  • bitlines 218 in Figure 2 and bitlines 319 and 320 in Figure 3 must span the entire height of the cache memory array and tend to be highly capacitive. Since power is directly proportional to capacitance, lower power consumption results if the cache memory array bitlines are precharged and discharged less often.
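  • As general background rather than part of the original specification, bitline switching power is commonly modeled as P_bitline ≈ α · C_bitline · V_DD² · f, where α is the fraction of cycles in which the bitline is precharged and then discharged, C_bitline is the bitline capacitance, V_DD is the supply voltage, and f is the clock frequency; because C_bitline is large, reducing how often the bitlines are discharged (reducing α) directly reduces power.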
  • Known methods to save array power rely on reducing supply voltages to induce a "sleep" mode or on decreasing bitline swing.
  • The illustrative embodiments gate the read access of any row in a cache memory array that has been invalidated. When a read access to an invalid row is requested, that row's wordline driver does not fire. Both bitlines stay at the precharge voltage and very little bitline power is dissipated.
  • The illustrative embodiments implement a valid bit through the addition of one memory cell per row. Programming the valid bit requires a firing of the wordline driver, as with any write operation. However, the actual writing of the valid bit is then gated by a dedicated write enable signal. If this dedicated write enable signal is not asserted when the wordline fires, no data is driven to the valid bit; instead, the contents of the valid bit cell are driven to the bitlines and a read occurs.
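  • A minimal behavioral sketch of this per-row valid bit, with hypothetical names and no electrical detail, might look as follows: the valid bit is only programmed when the wordline fires with the dedicated write enable asserted; a wordline fire without that write enable simply reads the stored value.

```python
# Illustrative model of the valid bit cell's write gating (not actual circuitry).

class ValidBitCell:
    def __init__(self):
        self.value = 0  # 0 = row invalid, 1 = row valid

    def on_wordline_fire(self, valid_write_enable, new_value=None):
        if valid_write_enable:
            self.value = new_value  # dedicated write enable asserted: program the bit
            return None
        return self.value           # otherwise the stored value is driven out (a read)
```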
  • FIG. 4 illustrates one example of a cache memory array 400 comprising multiple memory cells 402 and valid bit memory cells 424 in accordance with an illustrative embodiment.
  • Memory cells 402 are arranged as an array having rows 404 and columns 406. Memory cells 402 in a particular row 404 are connected to one another by wordlines 408. Wordlines 408 of each row 404 are also connected to gated wordline drivers 410, which receive output 412 from address decoder 414 that identifies which row is to be output as well as output from an associated valid bit memory cell 424 that indicates whether the row is valid or not.
  • Memory cells 402 in a particular column 406, as well as valid bit memory cells 424 in column 430, are connected to one another by a pair of bitlines 418 which are driven to complementary values during read/write operations and are traditionally precharged to the supply voltage.
  • the true and complement bitlines 418 feed bitline evaluators 420, which may be sense amplifiers, to convert the differential signal to a single-ended signal for use in logic downstream.
  • address decoder 414 receives an address associated with a read/write access from external logic 422. Address decoder 414 decodes the address and signals the particular one of non-gated wordline drivers 426 and gated wordline drivers 410 associated with the decoded address using outputs 412. The particular one of non-gated wordline drivers 426 then fires due to the signal from address decoder 414 and the valid bit in the associated valid bit memory cell 424 is output through data output 428 to the associated gated wordline drivers 410.
  • the particular one of gated wordline drivers 410 fires due to the signal from address decoder 414 and the data in the associated row 404 of memory cells 402 is output through data outputs 416 if the access is a read access.
  • If the access is a write access, data is written to memory cells 402 in associated row 404 regardless of whether the data in data output 428 indicates that the data in the associated ones of memory cells 402 is valid or invalid.
  • Figure 5 depicts one exemplary implementation of a non-gated wordline driver, such as non-gated wordline drivers 426 of Figure 4. Non-gated wordline driver 500 comprises AND gates 502, 504, and 506 as well as OR gate 508.
  • When an access received from an address decoder, such as address decoder 414 of Figure 4, is a read access, read_enable signal 510 is set high into AND gate 502 and the read access complement, read_enable' signal 512, is set low into AND gate 504. Since the access is a read access, write_enable signal 514 is set low into AND gate 504 and the write access complement, write_enable' signal 516, is set high into AND gate 502. Since read_enable signal 510 and write_enable' signal 516 are both high, AND gate 502 fires into OR gate 508.
  • OR gate 508 then fires and with address decode signal 518 from the address decoder, AND gate 506 fires and outputs a read access signal to the associated valid bit memory cell, such as valid bit memory cell 424 of Figure 4.
  • the valid bit memory cell then outputs an appropriate signal to an associated gated wordline driver, such as gated wordline driver 410 of Figure 4.
  • the signal from the valid bit memory cell would be high if the data is valid or low if the data is not valid.
  • If the access is a write access, read_enable signal 510 is set low into AND gate 502 and the read access complement, read_enable' signal 512, is set high into AND gate 504. Since the access is a write access, write_enable signal 514 is set high into AND gate 504 and the write access complement, write_enable' signal 516, is set low into AND gate 502. Since write_enable signal 514 and read_enable' signal 512 are both high, AND gate 504 fires into OR gate 508. OR gate 508 then fires and, with address decode signal 518 from the address decoder, AND gate 506 fires and outputs a write access signal to the associated valid bit memory cell. The valid bit memory cell then outputs an appropriate signal to an associated gated wordline driver, such as gated wordline driver 410 of Figure 4.
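  • The logic described for Figure 5 can be summarized as a Boolean sketch inferred from the prose above (not actual RTL from the patent): the non-gated driver fires for the decoded row on either a read or a write, independent of the valid bit.

```python
# Boolean sketch of the non-gated wordline driver of Figure 5.

def non_gated_wordline_driver(address_decode, read_enable, write_enable):
    read_path = read_enable and not write_enable    # AND gate 502
    write_path = write_enable and not read_enable   # AND gate 504
    either_access = read_path or write_path         # OR gate 508
    return address_decode and either_access         # AND gate 506 drives the valid bit cell
```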
  • FIG. 6 depicts one exemplary implementation of a gated wordline driver, such as gated wordline driver 410 of Figure 4, in accordance with an illustrative embodiment.
  • Gated wordline driver 600 comprises AND gates 602, 604, and 606 as well as OR gate 608.
  • When the access received from the address decoder is a read access, read_enable signal 610 is set high into AND gate 602 and the read access complement, read_enable' signal 612, is set low into AND gate 604. Since the access is a read access, write_enable signal 614 is set low into AND gate 604 and the write access complement, write_enable' signal 616, is set high into AND gate 602.
  • AND gate 602 also looks to valid bit signal 620 from the valid bit memory cell to determine whether to fire or not. If valid bit signal 620 is low, then AND gate 602 does not fire; conversely, if valid bit signal 620 is high, then AND gate 602 fires into OR gate 608. OR gate 608 then fires and, with address decode signal 618 from the address decoder, AND gate 606 fires and outputs a read access signal to the associated row of memory cells, such as memory cells 402 of Figure 4.
  • If the access is a write access, read_enable signal 610 is set low into AND gate 602 and the read access complement, read_enable' signal 612, is set high into AND gate 604. Since the access is a write access, write_enable signal 614 is set high into AND gate 604 and the write access complement, write_enable' signal 616, is set low into AND gate 602. Since write_enable signal 614 and read_enable' signal 612 are both high, AND gate 604 fires into OR gate 608. OR gate 608 then fires and, with address decode signal 618 from the address decoder, AND gate 606 fires and outputs a write access signal to the associated memory cells. As can be seen, regardless of valid bit signal 620 from the valid bit memory cell, a write access will always occur.
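  • The corresponding Boolean sketch for Figure 6, again inferred from the prose rather than taken from the patent, differs only in that the read path also requires the valid bit, so a read of an invalidated row never fires the wordline, while writes are unaffected.

```python
# Boolean sketch of the gated wordline driver of Figure 6.

def gated_wordline_driver(address_decode, read_enable, write_enable, valid_bit):
    read_path = read_enable and not write_enable and valid_bit  # AND gate 602
    write_path = write_enable and not read_enable               # AND gate 604
    either_access = read_path or write_path                     # OR gate 608
    return address_decode and either_access                     # AND gate 606 drives the row
```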
  • Figure 7 depicts an example of a valid bit memory cell, such as valid bit memory cells 424 of Figure 4, in accordance with an illustrative embodiment.
  • Valid bit memory cell 700, which is similar to memory cell 310 of Figure 3, may use six transistors to store and access one bit. As with memory cell 310 in Figure 3, the four transistors in the middle form two cross-coupled inverters, illustrated in the more simplified valid bit memory cell 700 as inverters 711 and 712. Due to the feedback structure created by inverters 711 and 712, a low input value on inverter 711 will generate a high input value on inverter 712, whose output feeds back and stores the low value on inverter 711.
  • Conversely, a high input value on inverter 711 will generate a low input value on inverter 712, whose output feeds back and stores the high value on inverter 711. Therefore, inverters 711 and 712 will retain their current logical value, whatever that value is.
  • Lines 717 and 718 between inverters 711 and 712 are connected to separate bitlines 719 and 720 via two n-channel pass-transistors 715 and 716.
  • the gates of transistors 715 and 716 are driven by wordline 721.
  • wordline 721 is used to address and enable all bits of one memory word. As long as wordline 721 is kept low, valid bit memory cell 700 is disconnected from bitlines 719 and 720. Inverters 711 and 712 keep feeding themselves and valid bit memory cell 700 stores its current value.
  • When wordline 721 is high, both transistors 715 and 716 are conducting and connect the inputs and outputs of inverters 711 and 712 to bitlines 719 and 720. That is, inverters 711 and 712 drive the current data value stored inside valid bit memory cell 700 onto bitline 719 and the inverted data value onto inverted bitline 720. This data may then be amplified by a bitline evaluator, such as bitline evaluators 420 of Figure 4, and generates the output value of valid bit memory cell 700 during a read operation. To write new data into valid bit memory cell 700, wordline 721 is activated and, depending on the current value stored inside valid bit memory cell 700, there might be a short-circuit condition and the value inside valid bit memory cell 700 is literally overwritten.
  • Valid bit memory cell 700 also comprises inverter 722, which allows the value stored in inverters 711 and 712 to be output. This output is illustrated as output signal 723 and is the input to the gated wordline driver; it corresponds to data output 428 into gated wordline driver 410 of Figure 4 and to valid bit signal 620 of gated wordline driver 600 of Figure 6.
  • the illustrative embodiments provide a mechanism to save power in memory arrays implemented with a valid bit.
  • the power savings lie in gating off the read access to any row with invalid data.
  • the invalid data condition prohibits the wordline driver from firing and thus stops any bitline from being discharged. No power is saved during a write operation since every bit (valid bit included) must be programmed to the incoming value.
  • The valid bit memory cell and the gated wordline driver circuitry of the illustrative embodiments are preferably implemented in an integrated circuit device.
  • The valid bit memory cell and the gated wordline driver circuitry may be used, for example, in a cache of a processor.
  • In some illustrative embodiments, the circuitry described above may further be implemented as one or more software routines that approximate the operation of the circuits.
  • The illustrative embodiments may be embodied in circuitry of a hardware device, such as an integrated circuit, processor, or the like, but they may also be implemented as software instructions executed by a processor.
  • FIG. 8 is a flowchart outlining an exemplary operation of a cache memory array using a valid bit memory cell and the gated wordline driver in accordance with one illustrative embodiment. It will be understood that each block of the flowchart illustration, and combinations of blocks in the flowchart illustration, can be implemented by computer program instructions. These computer program instructions may be provided to a processor or other programmable data processing apparatus to produce a machine, such that the instructions which execute on the processor or other programmable data processing apparatus create means for implementing the functions specified in the flowchart block or blocks.
  • These computer program instructions may also be stored in a computer-readable memory or storage medium that can direct a processor or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory or storage medium produce an article of manufacture including instruction means which implement the functions specified in the flowchart block or blocks.
  • blocks of the flowchart illustration support combinations of means for performing the specified functions, combinations of steps for performing the specified functions and program instruction means for performing the specified functions. It will also be understood that each block of the flowchart illustration, and combinations of blocks in the flowchart illustration, can be implemented by special purpose hardware-based computer systems which perform the specified functions or steps, or by combinations of special purpose hardware and computer instructions.
  • the operation starts by an address decoder in a cache memory array receiving an address associated with a read/write access from external logic (step 802).
  • the address decoder decodes the address and signals a non-gated wordline driver and a gated wordline driver associated with the decoded address (step 804). From this point the operation splits.
  • the non-gated wordline driver determines whether the access associated with the decoded address is a write access or a read access (step 806).
  • If at step 806 the non-gated wordline driver determines that the access is a write access, then the non-gated wordline driver fires, the data associated with the write access is written to the valid bit memory cell associated with the non-gated wordline driver, and the data stored in the valid bit memory cell is output to the gated wordline driver (step 808), with this part of the operation ending thereafter. If at step 806 the non-gated wordline driver determines that the access is a read access, then the non-gated wordline driver fires and the data stored in the valid bit memory cell is output to the gated wordline driver (step 810), with this part of the operation ending thereafter.
  • In parallel, the gated wordline driver determines whether the access associated with the decoded address is a write access or a read access (step 812). If at step 812 the gated wordline driver determines that the access is a write access, then the gated wordline driver fires and the data associated with the write access is written to the memory cells associated with the gated wordline driver (step 814), with this part of the operation ending thereafter. If at step 812 the gated wordline driver determines that the access is a read access, then the gated wordline driver determines whether the valid bit from the non-gated wordline driver indicates valid or invalid data (step 816). If at step 816 the valid bit indicates that the data is valid, then the gated wordline driver fires and the data in the associated row of memory cells is output (step 818), with this part of the operation ending thereafter. If at step 816 the valid bit indicates that the data is invalid, then the gated wordline driver does not fire (step 820), with this part of the operation ending thereafter.
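  • Putting the steps of Figure 8 together, a behavioral sketch in Python might look as follows; the helper and parameter names are hypothetical (e.g., make_valid stands in for the valid-bit data carried by a write) and the model ignores electrical behavior.

```python
# End-to-end behavioral sketch of the flow in Figure 8.

def cache_access(rows, valid_bits, address, is_write, write_data=None, make_valid=True):
    row = address % len(rows)                     # step 804: decode the address

    # Non-gated wordline driver path (steps 806-810): always fires.
    if is_write:
        valid_bits[row] = 1 if make_valid else 0  # step 808: program the valid bit
    valid = valid_bits[row]                       # valid bit output to the gated driver

    # Gated wordline driver path (steps 812-820).
    if is_write:
        rows[row] = list(write_data)              # step 814: write regardless of the valid bit
        return None
    if valid:
        return list(rows[row])                    # step 818: driver fires, row is output
    return None                                   # step 820: driver does not fire, bitlines stay precharged
```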
  • the illustrative embodiments provide a valid bit memory cell and gated wordline driver circuits that save power in memory arrays implemented with a valid bit.
  • the power savings lies in gating off the read access to any row with invalid data.
  • the invalid data condition prohibits the wordline driver from firing and thus stops any bitline from being discharged. No power is saved during a write operation since every bit (valid bit included) must be programmed to the incoming value.
  • the circuit as described above may be part of the design for an integrated circuit chip.
  • the chip design may be created in a graphical computer programming language, and stored in a computer storage medium (such as a disk, tape, physical hard drive, or virtual hard drive such as in a storage access network). If the designer does not fabricate chips or the photolithographic masks used to fabricate chips, the designer transmits the resulting design by physical means (e.g., by providing a copy of the storage medium storing the design) or electronically (e.g., through the Internet) to such entities, directly or indirectly.
  • The stored design may then be converted into the appropriate format (e.g., GDSII (Graphic Database System II)) for the fabrication of photolithographic masks, which typically include multiple copies of the chip design in question that are to be formed on a wafer.
  • photolithographic masks may be utilized to define areas of the wafer (and/or the layers thereon) to be etched or otherwise processed.
  • the resulting integrated circuit chips can be distributed by the fabricator in raw wafer form (that is, as a single wafer that has multiple unpackaged chips), as a bare die, or in a packaged form.
  • the chip may be mounted in a single chip package (such as a plastic carrier, with leads that are affixed to a motherboard or other higher level carrier) or in a multichip package (such as a ceramic carrier that has either or both surface interconnections or buried interconnections).
  • the chip may then be integrated with other chips, discrete circuit elements, and/or other signal processing devices as part of either (a) an intermediate product, such as a motherboard, or (b) an end product.
  • the end product can be any product that includes integrated circuit chips, ranging from toys and other low-end applications to advanced computer products having a display, a keyboard or other input device, and a central processor.
  • the end products in which the integrated circuit chips may be provided may include game machines, game consoles, hand-held computing devices, personal digital assistants, communication devices, such as wireless telephones and the like, laptop computing devices, desktop computing devices, server computing devices, or any other computing device.
  • Figure 9 shows a block diagram of an exemplary design flow 900 used, for example, in semiconductor IC logic design, simulation, test, layout, and manufacture.
  • Design flow 900 includes processes and mechanisms for processing design structures to generate logically or otherwise functionally equivalent representations of the embodiments of the invention shown in Figures 4-7.
  • the design structures processed and/or generated by design flow 900 may be encoded on machine-readable transmission or storage media to include data and/or instructions that when executed or otherwise processed on a data processing system generate a logically, structurally, or otherwise functionally equivalent representation of hardware components, circuits, devices, or systems.
  • Figure 9 illustrates multiple such design structures including an input design structure 920 that is preferably processed by a design process 910.
  • Design structure 920 may be a logical simulation design structure generated and processed by design process 910 to produce a logically equivalent functional representation of a hardware device.
  • Design structure 920 may also or alternatively comprise data and/or program instructions that, when processed by design process 910, generate a functional representation of the physical structure of a hardware device. Whether representing functional and/or structural design features, design structure 920 may be generated using electronic computer-aided design (ECAD) such as implemented by a core developer/designer.
  • When encoded on a machine-readable data transmission or storage medium, design structure 920 may be accessed and processed by one or more hardware and/or software modules within design process 910 to simulate or otherwise functionally represent an electronic component, circuit, electronic or logic module, apparatus, device, or system such as those shown in Figures 4-7.
  • design structure 920 may comprise files or other data structures including human and/or machine-readable source code, compiled structures, and computer-executable code structures that when processed by a design or simulation data processing system, functionally simulate or otherwise represent circuits or other levels of hardware logic design.
  • Such data structures may include hardware-description language (HDL) design entities or other data structures conforming to and/or compatible with lower-level HDL design languages such as Verilog and VHDL (Very High Speed Integrated Circuit HDL), and/or higher level design languages such as C or C++.
  • Design process 910 preferably employs and incorporates hardware and/or software modules for synthesizing, translating, or otherwise processing a design/simulation functional equivalent of the components, circuits, devices, or logic structures shown in Figures 4-7 to generate a netlist 980 which may contain design structures such as design structure 920.
  • Netlist 980 may comprise, for example, compiled or otherwise processed data structures representing a list of wires, discrete components, logic gates, control circuits, I/O devices, models, etc. that describes the connections to other elements and circuits in an integrated circuit design.
  • Netlist 980 may be synthesized using an iterative process in which netlist 980 is resynthesized one or more times depending on design specifications and parameters for the device.
  • netlist 980 may be recorded on a machine-readable data storage medium.
  • The medium may be a non-volatile storage medium such as a magnetic or optical disk drive, a compact flash, or other flash memory. Additionally, or in the alternative, the medium may be a system or cache memory, buffer space, or electrically or optically conductive devices and materials on which data packets may be transmitted and intermediately stored via the Internet or other suitable networking means.
  • Design process 910 may include hardware and software modules for processing a variety of input data structure types including netlist 980.
  • These data structure types may reside, for example, within library elements 930 and include a set of commonly used elements, circuits, and devices, including models, layouts, and symbolic representations, for a given manufacturing technology.
  • the data structure types may further include design specifications 940, characterization data 950, verification data 960, design rules 970, and test data files 985 which may include input test patterns, output test results, and other testing information.
  • Design process 910 may further include modules for performing standard circuit design processes such as timing analysis, verification, design rule checking, place and route operations, etc.
  • Design process 910 employs and incorporates well-known logic and physical design tools such as HDL compilers and simulation model build tools to process design structure 920 together with some or all of the depicted supporting data structures to generate a second design structure 990.
  • Design structure 990 preferably comprises one or more files, data structures, or other computer-encoded data or instructions that reside on transmission or data storage media and that, when processed by an ECAD system, generate a logically or otherwise functionally equivalent form of one or more of the embodiments of the invention shown in Figures 4-7.
  • Design structure 990 may comprise a compiled, executable HDL simulation model that functionally simulates the devices shown in Figures 4-7.
  • Design structure 990 may also employ a data format used for the exchange of layout data of integrated circuits and/or symbolic data format (e.g. information stored in a GDSII (GDS2), GL1, OASIS (Open Artwork System Interchange Standard), map files, or any other suitable format for storing such design data structures).
  • Design structure 990 may comprise information such as, for example, symbolic data, map files, test data files, design content files, manufacturing data, layout parameters, wires, levels of metal, vias, shapes, data for routing through the manufacturing line, and any other data processed by semiconductor manufacturing tools to fabricate embodiments of the invention as shown in Figures 4-7.
  • Design structure 990 may then proceed to a stage 995 where, for example, design structure 990 proceeds to tape-out, is released to manufacturing, is released to a mask house, is sent to another design house, is sent back to the customer, etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Microelectronics & Electronic Packaging (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Static Random-Access Memory (AREA)

Abstract

A mechanism is provided for gating a read access of any row in a cache memory that has been invalidated. An address decoder in the cache memory sends a memory access to a non-gated wordline driver and a gated wordline driver associated with the memory access. The non-gated wordline driver outputs the data stored in a valid bit memory cell to the gated wordline driver in response to the non-gated wordline driver determining that the memory access is a read access. The gated wordline driver determines whether the data from the valid bit memory cell from the non-gated wordline driver indicates valid data or invalid data in response to the gated wordline driver determining that the memory access is a read access, and denies an output of the data in a row of memory cells associated with the gated wordline driver in response to the data being invalid.
PCT/EP2010/067524 2009-12-10 2010-11-16 Cache memory and access method WO2011069780A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201080055681.1A CN102652311B (zh) 2009-12-10 2010-11-16 高速访问存储器和方法

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US12/635,234 2009-12-10
US12/635,234 US8014215B2 (en) 2009-12-10 2009-12-10 Cache array power savings through a design structure for valid bit detection

Publications (1)

Publication Number Publication Date
WO2011069780A1 true WO2011069780A1 (fr) 2011-06-16

Family

ID=43640632

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2010/067524 WO2011069780A1 (fr) 2009-12-10 2010-11-16 Cache memory and access method

Country Status (3)

Country Link
US (1) US8014215B2 (fr)
CN (1) CN102652311B (fr)
WO (1) WO2011069780A1 (fr)

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8837226B2 (en) * 2011-11-01 2014-09-16 Apple Inc. Memory including a reduced leakage wordline driver
US8400864B1 (en) * 2011-11-01 2013-03-19 Apple Inc. Mechanism for peak power management in a memory
US9158328B2 (en) 2011-12-20 2015-10-13 Oracle International Corporation Memory array clock gating scheme
US9224439B2 (en) 2012-06-29 2015-12-29 Freescale Semiconductor, Inc. Memory with word line access control
US9977485B2 (en) 2012-09-18 2018-05-22 International Business Machines Corporation Cache array with reduced power consumption
CN103106918B (zh) * 2012-12-24 2015-12-02 西安华芯半导体有限公司 一种使用单端口存储单元的两端口静态随机存储器
US9117498B2 (en) 2013-03-14 2015-08-25 Freescale Semiconductor, Inc. Memory with power savings for unnecessary reads
US9672938B2 (en) 2014-04-22 2017-06-06 Nxp Usa, Inc. Memory with redundancy
GB2533972B (en) * 2015-01-12 2021-08-18 Advanced Risc Mach Ltd An interconnect and method of operation of an interconnect
US9384795B1 (en) * 2015-04-29 2016-07-05 Qualcomm Incorporated Fully valid-gated read and write for low power array
CN106297868B (zh) * 2015-05-12 2018-11-06 晶豪科技股份有限公司 驱动子字线的半导体存储器元件
DE102017114986B4 (de) * 2016-12-13 2021-07-29 Taiwan Semiconductor Manufacturing Co. Ltd. Speicher mit symmetrischem Lesestromprofil und diesbezügliches Leseverfahren
CN108536473B (zh) * 2017-03-03 2021-02-23 华为技术有限公司 读取数据的方法和装置
CN110875072B (zh) * 2018-08-29 2021-09-07 中芯国际集成电路制造(北京)有限公司 一种存取存储器的字线驱动电路和静态随机存取存储器
CN111240581B (zh) * 2018-11-29 2023-08-08 北京地平线机器人技术研发有限公司 存储器访问控制方法、装置和电子设备
KR20210082769A (ko) * 2019-12-26 2021-07-06 삼성전자주식회사 리페어 동작을 수행하는 메모리 장치, 그것을 포함하는 메모리 시스템 및 그것의 동작 방법

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6016534A (en) * 1997-07-30 2000-01-18 International Business Machines Corporation Data processing system for controlling operation of a sense amplifier in a cache
US20080259667A1 (en) * 2007-04-17 2008-10-23 Wickeraad John A Content addressable memory
US20090244992A1 (en) * 2008-03-28 2009-10-01 Marco Goetz Integrated circuit and method for reading the content of a memory cell

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2842809B2 (ja) * 1995-06-28 1999-01-06 甲府日本電気株式会社 キャッシュ索引の障害訂正装置
US5911153A (en) * 1996-10-03 1999-06-08 International Business Machines Corporation Memory design which facilitates incremental fetch and store requests off applied base address requests
US6157986A (en) * 1997-12-16 2000-12-05 Advanced Micro Devices, Inc. Fast linear tag validation unit for use in microprocessor
US6198684B1 (en) * 1999-12-23 2001-03-06 Intel Corporation Word line decoder for dual-port cache memory
US6510506B2 (en) * 2000-12-28 2003-01-21 Intel Corporation Error detection in cache tag array using valid vector
US7240277B2 (en) * 2003-09-26 2007-07-03 Texas Instruments Incorporated Memory error detection reporting
CN101477833B (zh) * 2009-01-08 2010-12-01 西安电子科技大学 钟控异步fifo存储器

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6016534A (en) * 1997-07-30 2000-01-18 International Business Machines Corporation Data processing system for controlling operation of a sense amplifier in a cache
US20080259667A1 (en) * 2007-04-17 2008-10-23 Wickeraad John A Content addressable memory
US20090244992A1 (en) * 2008-03-28 2009-10-01 Marco Goetz Integrated circuit and method for reading the content of a memory cell

Also Published As

Publication number Publication date
US8014215B2 (en) 2011-09-06
US20110141826A1 (en) 2011-06-16
CN102652311A (zh) 2012-08-29
CN102652311B (zh) 2015-06-24

Similar Documents

Publication Publication Date Title
US8014215B2 (en) Cache array power savings through a design structure for valid bit detection
US8493774B2 (en) Performing logic functions on more than one memory cell within an array of memory cells
US8526256B2 (en) Single-ended sense amplifier with read-assist
US7379347B1 (en) Memory device and method for performing write operations in such a memory device
JP3983032B2 (ja) 半導体記憶装置
EP3304555B1 (fr) Circuit d'assistance à l'écriture de mémoire orienté rangée et à faible consommation
US7817481B2 (en) Column selectable self-biasing virtual voltages for SRAM write assist
US7613050B2 (en) Sense-amplifier assist (SAA) with power-reduction technique
US8659963B2 (en) Enhanced power savings for memory arrays
CN114730595A (zh) 具有锁存器的静态随机存取存储器读路径
CN108352175B (zh) 低功率高性能sram中的感测放大器
US7502276B1 (en) Method and apparatus for multi-word write in domino read SRAMs
US9047981B2 (en) Bit-flipping in memories
US20080137450A1 (en) Apparatus and method for sram array power reduction through majority evaluation
Ataei et al. A 64 kb differential single-port 12T SRAM design with a bit-interleaving scheme for low-voltage operation in 32 nm SOI CMOS
Cosemans et al. A 3.6 pJ/access 480 MHz, 128 kb on-chip SRAM with 850 MHz boost mode in 90 nm CMOS with tunable sense amplifiers
JP2003303494A (ja) 半導体記憶装置
US7123500B2 (en) 1P1N 2T gain cell
Hsiao et al. Design of low-leakage multi-port SRAM for register file in graphics processing unit
US8375172B2 (en) Preventing fast read before write in static random access memory arrays
US11017848B2 (en) Static random-access memory (SRAM) system with delay tuning and control and a method thereof
US20140140157A1 (en) Complementary Metal-Oxide-Semiconductor (CMOS) Min/Max Voltage Circuit for Switching Between Multiple Voltages
JP2001243764A (ja) 半導体記憶装置
US9047930B2 (en) Single-ended low-swing power-savings mechanism with process compensation
Chen et al. A 100 MHz SRAM Design in 180 nm Process

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 201080055681.1

Country of ref document: CN

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 10782237

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 10782237

Country of ref document: EP

Kind code of ref document: A1