WO2016063667A1 - Reconfigurable device - Google Patents
Reconfigurable device
- Publication number
- WO2016063667A1 PCT/JP2015/076610 JP2015076610W
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- address
- data
- line
- memory
- memory cell
- Prior art date
Classifications
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03K—PULSE TECHNIQUE
- H03K19/00—Logic circuits, i.e. having at least two inputs acting on one output; Inverting circuits
- H03K19/02—Logic circuits, i.e. having at least two inputs acting on one output; Inverting circuits using specified components
- H03K19/173—Logic circuits, i.e. having at least two inputs acting on one output; Inverting circuits using specified components using elementary logic circuits as components
- H03K19/177—Logic circuits, i.e. having at least two inputs acting on one output; Inverting circuits using specified components using elementary logic circuits as components arranged in matrix form
- H03K19/17724—Structural details of logic blocks
- H03K19/17728—Reconfigurable logic blocks, e.g. lookup tables
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11C—STATIC STORES
- G11C11/00—Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor
- G11C11/21—Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using electric elements
- G11C11/34—Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using electric elements using semiconductor devices
- G11C11/40—Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using electric elements using semiconductor devices using transistors
- G11C11/41—Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using electric elements using semiconductor devices using transistors forming static cells with positive feedback, i.e. cells not needing refreshing or charge regeneration, e.g. bistable multivibrator or Schmitt trigger
- G11C11/413—Auxiliary circuits, e.g. for addressing, decoding, driving, writing, sensing, timing or power reduction
- G11C11/417—Auxiliary circuits, e.g. for addressing, decoding, driving, writing, sensing, timing or power reduction for memory cells of the field-effect type
- G11C11/418—Address circuits
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11C—STATIC STORES
- G11C11/00—Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor
- G11C11/21—Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using electric elements
- G11C11/34—Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using electric elements using semiconductor devices
- G11C11/40—Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using electric elements using semiconductor devices using transistors
- G11C11/41—Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using electric elements using semiconductor devices using transistors forming static cells with positive feedback, i.e. cells not needing refreshing or charge regeneration, e.g. bistable multivibrator or Schmitt trigger
- G11C11/413—Auxiliary circuits, e.g. for addressing, decoding, driving, writing, sensing, timing or power reduction
- G11C11/417—Auxiliary circuits, e.g. for addressing, decoding, driving, writing, sensing, timing or power reduction for memory cells of the field-effect type
- G11C11/419—Read-write [R-W] circuits
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11C—STATIC STORES
- G11C7/00—Arrangements for writing information into, or reading information out from, a digital store
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11C—STATIC STORES
- G11C8/00—Arrangements for selecting an address in a digital store
- G11C8/10—Decoders
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03K—PULSE TECHNIQUE
- H03K19/00—Logic circuits, i.e. having at least two inputs acting on one output; Inverting circuits
- H03K19/02—Logic circuits, i.e. having at least two inputs acting on one output; Inverting circuits using specified components
- H03K19/173—Logic circuits, i.e. having at least two inputs acting on one output; Inverting circuits using specified components using elementary logic circuits as components
- H03K19/177—Logic circuits, i.e. having at least two inputs acting on one output; Inverting circuits using specified components using elementary logic circuits as components arranged in matrix form
- H03K19/17736—Structural details of routing resources
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03K—PULSE TECHNIQUE
- H03K19/00—Logic circuits, i.e. having at least two inputs acting on one output; Inverting circuits
- H03K19/02—Logic circuits, i.e. having at least two inputs acting on one output; Inverting circuits using specified components
- H03K19/173—Logic circuits, i.e. having at least two inputs acting on one output; Inverting circuits using specified components using elementary logic circuits as components
- H03K19/177—Logic circuits, i.e. having at least two inputs acting on one output; Inverting circuits using specified components using elementary logic circuits as components arranged in matrix form
- H03K19/17736—Structural details of routing resources
- H03K19/17744—Structural details of routing resources for input/output signals
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03K—PULSE TECHNIQUE
- H03K19/00—Logic circuits, i.e. having at least two inputs acting on one output; Inverting circuits
- H03K19/02—Logic circuits, i.e. having at least two inputs acting on one output; Inverting circuits using specified components
- H03K19/173—Logic circuits, i.e. having at least two inputs acting on one output; Inverting circuits using specified components using elementary logic circuits as components
- H03K19/177—Logic circuits, i.e. having at least two inputs acting on one output; Inverting circuits using specified components using elementary logic circuits as components arranged in matrix form
- H03K19/17748—Structural details of configuration resources
- H03K19/1776—Structural details of configuration resources for memories
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
Definitions
- the present invention relates to a reconfigurable device and a semiconductor device including the same.
- The CPU performs arithmetic processing on data held in its registers, prefetching the data to be operated on from the cache into the registers. If the data in the cache is not the target data, the CPU determines that a “cache miss” has occurred and reads the data from the main memory.
- When the amount of data is large, data processing tends to be delayed even though the arithmetic itself is only a repetition of simple operations. Such processing does not need the advanced capabilities of a processor. Therefore, instead of transferring the data to the CPU, the processing is performed on the memory side, and the CPU is used only when more advanced arithmetic processing is required, thereby speeding up data processing.
- the semiconductor device according to the present embodiment is arranged on the main memory side and is responsible for repeating simple operations, thereby reducing main memory access from the CPU and increasing the speed of data processing.
- The reconfigurable device includes a plurality of logic units connected to each other by an address line or a data line. Each of the logic units includes: a plurality of address lines; a plurality of data lines; an address decoder that decodes an address input from a part of the plurality of address lines; and a memory cell array unit that has a plurality of memory cells specified by the decode line of the address decoder and outputs data read from the specified memory cells to the data lines. An address line of the memory cell array unit is connected to a data output line of the main memory.
- Because this semiconductor device operates as a logic element and/or a connection element using a multi-lookup table, it clearly differs from an FPGA, which realizes wiring connections by selection circuits.
- The reconfigurable device according to item 1 or 2, wherein each of the logic units comprises: a first address decoder for decoding an address input from a part of the plurality of address lines; a second address decoder for decoding an address input from another part of the plurality of address lines; a first memory cell unit having a plurality of memory cells specified by a decode line of the first address decoder; and a second memory cell unit having a plurality of memory cells specified by a decode line of the second address decoder.
- Item 4: The semiconductor device according to Item 3, wherein the first memory cell unit and the second memory cell unit store a plurality of truth table data and are connected to a second plurality of address lines that carry data specifying one of the plurality of truth table data.
- A semiconductor device comprising a reconfigurable device that includes a plurality of logic units connected to each other by an address line or a data line. Each of the logic units includes: a plurality of address lines; a plurality of data lines; an address decoder that decodes an address input from a part of the plurality of address lines; and a memory cell array unit that has a plurality of memory cells specified by the decode line of the address decoder and outputs data read from the specified memory cells to the data lines. A data output of the memory cell array unit is connected to an address line of the main memory.
- The second reconfigurable device includes a plurality of logic units connected to each other by an address line or a data line. Each of the logic units includes: a plurality of address lines; a plurality of data lines; an address decoder that decodes an address input from a part of the plurality of address lines; and a memory cell array unit that has a plurality of memory cells specified by the decode line of the address decoder and outputs data read from the specified memory cells to the data lines. A data output of the memory cell array unit is connected to an address line of the main memory.
- Item 7 The semiconductor device according to Item 5 or 6, further comprising a scale adjustment circuit for adjusting a circuit scale between the main memory and the reconfigurable device.
- This embodiment can reduce the main memory access from the CPU and increase the data processing speed.
- FIG. 9 is a diagram illustrating a circuit example of the MLUT illustrated in FIG. 8.
- FIG. It is a figure explaining MRLD using MLUT shown in FIG. It is a figure which shows the circuit example of MLUT which can perform synchronous asynchronous switching based on 2nd Embodiment. It is a figure which shows an example of a scale adjustment circuit. It is a figure which shows an example of MLUT. It is a figure which shows an example of MLUT which operate
- Searching data in the main memory takes a great deal of time and effort because the search is executed by checking addresses sequentially. Therefore, to process information efficiently, it is usual to organize in advance which information exists at which address in memory and to prepare metadata, for example a hash table, so that the processor's search burden is reduced. However, preparing metadata takes a lot of time, the data must be maintained repeatedly, and the apparatus must be made larger and more powerful, for example by parallelizing processors. The computer device or semiconductor device described below can realize various functions in memory without requiring metadata.
- FIG. 1 is a diagram illustrating a first example of the overall configuration of a computer device according to the present embodiment.
- the computer device 10 includes a processor 510, a main memory 600, a communication unit 530, an external storage device 540, a drive device 550, and an I / O controller 560.
- the processor 510 includes a processor core 511, an L2 cache controller 512, an L2 cache memory 514, and a memory controller 516.
- the processor 510 is connected to the communication unit 530 and the external storage device 540 via the I / O controller 560.
- the processor 510 is a device that loads data from the main memory 600 by executing a program stored in the main memory 600, calculates the loaded data, and stores the calculation result in the main memory 600.
- The memory controller 516 provides the interface to the main memory 600 on the computer device 10, such as reading and writing data and, if the main memory 600 is a DRAM, refreshing it. For example, it loads data from the main memory 600 into the L2 cache memory 514 and stores data from the L2 cache controller 512 to the main memory 600.
- the L2 cache memory 514 holds a part of data stored in the main memory 600.
- The L2 cache memory 514 includes the data held in the L1 cache memory included in the processor core 511.
- The L2 cache controller 512 operates to keep data that the processor core 511 accesses frequently in the L2 cache memory 514 and to evict data with low access frequency from the L2 cache memory 514 to the main memory 600.
- The processor core 511 has the arithmetic function of the processor 510 described above. Although FIG. 1 shows one processor core, there may be more than one.
- In that case, one processor core operates as a master and executes a program, while the other processor cores operate as slaves that share and execute the program.
- Such a master operation may be described as an instruction sequence in the program and carried out by executing that instruction sequence.
- the I / O controller 560 is an input / output control device that controls connection between the processor 510 and other units.
- the I / O controller 560 operates according to a standard such as PCI Express (Peripheral Component Interconnect Express), for example.
- the main memory 600 is a device that stores data and programs.
- the processor 510 can access the main memory 600 without going through the I / O controller 560.
- the main memory 600 is, for example, a DRAM (Dynamic Random Access Memory).
- External storage device 540 is a non-volatile storage device that stores programs and data stored in main memory 600.
- the external storage device 540 is a disk array using a magnetic disk, an SSD (Solid State Drive) using a flash memory, or the like.
- the communication unit 530 is connected to the network 1100 as a communication path, and transmits and receives data between the computer apparatus 10 and another computer apparatus connected to the network 1100.
- the communication unit 530 is, for example, a NIC (Network Interface Controller).
- The drive device 550 is a device that reads and writes a storage medium 1200 such as a floppy (registered trademark) disk, a CD-ROM (Compact Disc Read Only Memory), or a DVD (Digital Versatile Disc).
- the drive device 550 includes a motor that rotates the storage medium 1200, a head that reads and writes data on the storage medium 1200, and the like.
- the storage medium 1200 can store a program.
- In addition to a program that defines arithmetic processing, the storage medium 1200 can store a circuit description 1210 written in a circuit description language, such as a C language description or a hardware description language (HDL) for designing an integrated circuit, and a logic configuration program 1220 that generates the truth table data 1230.
- Truth table data 1230 is generated by the processor core 511, but may be stored in a storage medium 1200 and carried as shown in the figure. In this case, truth table data 1230 is generated by another computer device (not shown).
- the drive device 550 reads a program from the storage medium 1200 set in the drive device 550.
- the processor 510 stores the program read by the drive device 550 in the main memory 600 or the external storage device 540.
- The truth table data 1230 is written to the reconfigurable device 20 and causes the reconfigurable device 20 to execute the desired arithmetic processing. It is thereby differentiated from the programs, both of which are executed by the processor core.
- 1.1 Semiconductor Device. The semiconductor device 16 is composed of at least a main memory and a reconfigurable device.
- the reconfigurable device 20 is connected to the data output of the main memory 600.
- The reconfigurable device implements a circuit that performs simple operations, for example a sequential comparator or an automaton.
- When the memory controller 516 reads a predetermined address space using the address AD, the reconfigurable device 20 performs an operation on the data RD1 output from that address space of the main memory 600 and outputs the necessary data RD2 to the memory controller 516. Because the data RD2 obtained from RD1 is already the result of processing that the processor would previously have performed, the processing load on the processor 510 can be reduced.
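The read path described above can be sketched in software. This is only a minimal model, assuming the reconfigurable device is configured as a sequential comparator; the function name `read_with_comparator`, the list used as main memory, and the `key` parameter are illustrative assumptions and do not appear in the disclosure.

```python
def read_with_comparator(main_memory, start, length, key):
    """Model of a memory-side filter (all names are hypothetical).

    The memory controller reads the address space [start, start + length).
    Each word RD1 leaving the main memory passes through a comparator, and
    only matching words are forwarded to the controller as RD2.
    """
    rd2 = []
    for ad in range(start, start + length):
        rd1 = main_memory[ad]          # data output of the main memory
        if rd1 == key:                 # simple operation done beside the memory
            rd2.append((ad, rd1))      # reduced data returned to the processor
    return rd2


memory = [5, 7, 5, 9]
matches = read_with_comparator(memory, 0, len(memory), key=5)
```

In this model the processor receives only the two matching entries rather than all four words, which is the sense in which the processing load moves to the memory side.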
- the processor core 511 executes the process specified by the instruction on the data held in the register (not shown) according to the instruction read from the L1 cache memory (not shown). Instructions include floating point arithmetic, integer arithmetic, address generation, branch instruction execution, and store or load operations. That is, the processor core 511 can dynamically execute any instruction according to the program.
- The reconfigurable device 20 is configured by MLUTs (Multi Look Up Tables), and reconfiguring it requires rewriting memory. Therefore, unlike the processor core 511, whose circuit configuration follows predetermined instructions, it cannot execute a variety of arithmetic processes at high speed. However, by taking over, for example, data search processing in image processing, or data search processing performed in parallel, it can dramatically reduce main memory accesses from the processor 510 and greatly improve the throughput of the computer apparatus 10.
- the data output line of the main memory 600 is connected to the address input line of the reconfigurable device 20. Since the main memory 600 is highly integrated, it is preferable that the reconfigurable device 20 is also highly integrated. Therefore, it is preferable that the memory of the reconfigurable device 20 is also composed of a DRAM constituting the main memory.
- FIG. 2 is a diagram showing a second example of the overall configuration of the computer apparatus according to the present embodiment. Unlike FIG. 1, the reconfigurable device 20A is mounted before the address input of the main memory 600.
- the reconfigurable device 20A converts the address AD1 into the address AD2.
- the reconfigurable device 20B performs an operation on the data RD1 output from the address space of the main memory 600 based on the address AD2, and outputs the data RD2 to the memory controller 516.
- Reconfigurable devices 20A and 20B realize memory defect relief, CAM (Content Addressable Memory), and the like.
- Memory defect relief replaces defective bits with redundant bits in order to improve yield.
- The main memory generally has a fixed relief circuit, but memory defect relief can also be realized by the reconfigurable device 20A. This makes it possible to relieve defects autonomously by testing the memory and switching defective locations to other addresses.
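The address conversion used for defect relief can be illustrated as a lookup from AD1 to AD2. The sketch below is a software analogy under stated assumptions: the class `AddressRemapper`, its method names, and the spare addresses are hypothetical, and the actual device would hold the mapping as truth table data rather than in a dictionary.

```python
class AddressRemapper:
    """Hypothetical model of address conversion AD1 -> AD2 for defect relief."""

    def __init__(self, spare_addresses):
        self.spares = list(spare_addresses)  # redundant locations still free
        self.table = {}                      # defective AD1 -> replacement AD2

    def mark_defective(self, ad1):
        """Assign a spare address to a location that failed a memory test."""
        if ad1 in self.table:
            return self.table[ad1]
        if not self.spares:
            raise RuntimeError("no spare address left")
        self.table[ad1] = self.spares.pop(0)
        return self.table[ad1]

    def translate(self, ad1):
        """Return AD2: the spare if AD1 is defective, otherwise AD1 itself."""
        return self.table.get(ad1, ad1)


remapper = AddressRemapper(spare_addresses=[0x100, 0x101])
remapper.mark_defective(0x02A)
```

Healthy addresses pass through unchanged, so only the locations found defective by a test are redirected.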
- FIG. 3 shows an example of an arithmetic unit configured.
- the reconfigurable device 20 can configure the computing unit of FIG. 3 in parallel for each data output.
- For the memory data at the word address selected and read from the main memory 600, the operation is freely selected from: direct assignment to the flip-flop; logical product, logical sum, or exclusive logical sum with the past flag held in the flip-flop; and logical negation of either the memory data or the past flip-flop output. An arbitrary 1-bit operation can thus be performed on n bits in parallel. For example, when 8-bit data is calculated, the 1-bit operation is repeated a predetermined number of times.
- the current information processing data is a collection of 1-bit data, and all information processing is possible with this calculation method as long as each 1-bit information is defined.
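The bit-serial scheme can be sketched as n independent lanes, each holding one flip-flop flag that is combined with one memory bit per cycle. This is a minimal illustration, not the circuit of FIG. 3; the operation names and the functions `step` and `run` are assumptions made for the example.

```python
# One-bit operations between the memory bit d and the flip-flop flag f.
# The operation names are illustrative assumptions.
OPERATIONS = {
    "assign":   lambda d, f: d,        # load memory data into the flip-flop
    "and":      lambda d, f: d & f,    # logical product with the past flag
    "or":       lambda d, f: d | f,    # logical sum with the past flag
    "xor":      lambda d, f: d ^ f,    # exclusive OR with the past flag
    "not_data": lambda d, f: d ^ 1,    # negation of the memory data
    "not_flag": lambda d, f: f ^ 1,    # negation of the past flag
}


def step(flags, word_bits, op):
    """Apply one selected 1-bit operation to all n lanes in parallel."""
    fn = OPERATIONS[op]
    return [fn(d, f) for d, f in zip(word_bits, flags)]


def run(words, ops):
    """Read one word per cycle and update the n flip-flops each time."""
    flags = [0] * len(words[0])
    for word_bits, op in zip(words, ops):
        flags = step(flags, word_bits, op)
    return flags


# Two cycles: load the first word, then AND it with the second word.
result = run([[1, 0, 1, 1], [1, 1, 0, 1]], ["assign", "and"])
```

Longer operand widths are handled by running more cycles, which corresponds to repeating the 1-bit operation a predetermined number of times.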
- the reconfigurable device is configured by a logical unit called MLUT, and these are configured as a logical element and / or a connection element by storing truth table data.
- Such a circuit is not limited to the arithmetic unit shown in FIG.
- The reconfigurable devices 20A and 20B can realize a CAM. Because a CAM has a match circuit for each memory cell, its circuit configuration becomes extremely large, and it must be designed from scratch as a special memory for which existing memory IP cannot be used. However, if the reconfigurable device 20A registers the data in the main memory 600 in the form of an index, it can output the address of the main memory 600 that stores a given data word.
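The index-based alternative to a per-cell match circuit can be sketched as a table that maps a data word to the address holding it. The functions `build_index` and `cam_lookup` and the sample contents are hypothetical; a dictionary stands in for the table that the reconfigurable device would store.

```python
def build_index(main_memory):
    """Register each stored data word with the address that holds it."""
    index = {}
    for address, word in enumerate(main_memory):
        index.setdefault(word, address)   # keep the first address for a word
    return index


def cam_lookup(index, word):
    """Return the main-memory address of `word`, or None on a miss."""
    return index.get(word)


main_memory = ["cat", "dog", "emu", "dog"]
index = build_index(main_memory)
```

A lookup such as `cam_lookup(index, "dog")` returns an address directly, so the memory is addressed by content without scanning it sequentially.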
- the reconfigurable logical device is also referred to as MRLD (Memory based Reconfigurable Logic Device) (registered trademark), and will be described using the same reference numeral 20.
- the MLUTs are directly connected without interposing wiring elements, and the function of the synchronous SRAM supplied as the memory IP is effectively used.
- An address transition detection unit is provided so that the synchronous SRAM can also operate asynchronously. With this arrangement, blocks that do not constitute logic receive no input signal, so no address transition occurs in them and power consumption is reduced. Blocks that constitute logic receive input signals, so a clock is generated and a predetermined logic value is output.
- The MRLD 20 includes an MLUT array 60 in which a plurality of MLUTs 30 using synchronous SRAM are arranged in an array, and a row decoder 61 and a column decoder 62 that identify the memory cell targeted by a memory read or write operation of the MLUT 30.
- the MLUT 30 is composed of a synchronous SRAM.
- By storing data regarded as a truth table in the storage elements of its memory, the MLUT 30 performs a logic operation in which it acts as a logic element, as a connection element, or as both.
- The logic operation uses the logic address LA and the logic data LD signals, which are indicated by solid lines.
- the logic address LA is used as an input signal for the logic circuit.
- the logic data LD is used as an output signal of the logic circuit.
- the logic address LA of the MLUT 30 is connected to the data line of the logic operation data LD of the adjacent MLUT.
- the logic realized by the logic operation of the MRLD 20 is realized by truth table data stored in the MLUT 30.
- Some MLUTs 30 operate as logic elements, that is, as combinational circuits such as AND circuits and adders.
- the other MLUTs operate as connection elements that connect the MLUTs 30 that realize the combinational circuit. Rewriting of truth table data for the MLUT 30 to realize a logical element and a connection element is performed by a write operation to the memory.
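How stored truth table data makes a memory behave as a logic element or a connection element can be sketched as follows. The class `MLUT`, the helper `configure`, and the 2-bit address width are assumptions chosen to keep the example small; they are not the disclosed circuit.

```python
class MLUT:
    """Memory used as a lookup table: logic address in, logic data out."""

    def __init__(self, address_bits):
        self.cells = [0] * (1 << address_bits)   # one word per address

    def write(self, address, data):
        """Memory write operation: stores one word of truth table data."""
        self.cells[address] = data

    def read(self, logic_address):
        """Logic operation: the stored word is the output of the circuit."""
        return self.cells[logic_address]


def configure(mlut, function):
    """Fill every cell with function(address), i.e. write a truth table."""
    for address in range(len(mlut.cells)):
        mlut.write(address, function(address))


# Logic element: data bit 0 = (address bit 0) AND (address bit 1).
and_gate = MLUT(address_bits=2)
configure(and_gate, lambda a: (a & 1) & ((a >> 1) & 1))

# Connection element: address bit 0 is passed through to data bit 1.
wire = MLUT(address_bits=2)
configure(wire, lambda a: (a & 1) << 1)
```

Reconfiguring either unit is only another call to `configure`, which mirrors the statement that rewriting truth table data is an ordinary memory write.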
- the write operation of the MRLD 20 is performed by the write address AD and the write data WD, and the read operation is performed by the write address AD and the read data RD.
- the write address AD is an address for specifying a memory cell in the MLUT 30.
- With its m signal lines, the write address AD can specify any one of 2^m memory cells.
- the row decoder 61 receives the MLUT address via the m signal lines, decodes the MLUT address, and selects and specifies the MLUT 30 that is the target of the memory operation.
- The memory operation address is used in both the memory read and write operations; it is decoded by the row decoder 61 and the column decoder 62 via the m signal lines to select the target memory cell.
- the logical address LA is decoded by a decoder in the MLUT.
- the row decoder 61 decodes x bits of m bits of the write address AD in accordance with control signals such as a read enable signal re and a write enable signal we, and outputs a decoded address n to the MLUT 30.
- the decode address n is used as an address for specifying a memory cell in the MLUT 30.
- The column decoder 62 decodes y bits of the m bits of the write address AD and has the same function as the row decoder 61: it outputs the decode address n to the MLUT 30, receives the write data WD, and outputs the read data RD.
- n × t bit data is input from the MLUT array 60 to the row decoder 61.
- the row decoder outputs re and we for o rows. That is, the o line corresponds to the s line of the MLUT.
- A word line of a specific memory cell is selected by activating only one bit out of the o bits. Since t MLUTs each output n-bit data, n × t bits of data are selected from the MLUT array 60, and the column decoder 62 is used to select one of them.
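The division of the m-bit address between the two decoders, and the activation of a single select line, can be sketched as below. Which bits go to which decoder is not stated in the text, so the split used here (upper x bits to the row decoder, lower y bits to the column decoder) and the function names are assumptions for illustration.

```python
def split_address(ad, x, y):
    """Split an m-bit address (m = x + y) into row and column fields.

    Assumption for illustration: the upper x bits go to the row decoder
    and the lower y bits go to the column decoder.
    """
    row = (ad >> y) & ((1 << x) - 1)
    column = ad & ((1 << y) - 1)
    return row, column


def one_hot(index, width):
    """Decode an index into select lines with exactly one active bit."""
    return [1 if i == index else 0 for i in range(width)]


row, column = split_address(0b10110, x=3, y=2)
word_lines = one_hot(row, width=1 << 3)
```

With x = 3 the row decoder drives eight lines and exactly one of them is active, matching the description of activating only one bit to select a word line.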
- AD in FIG. 4 corresponds to RD1 in FIG. 1
- RD in FIG. 4 corresponds to RD2 in FIG.
- AD in FIG. 4 corresponds to AD1 in FIG. 2
- RD in FIG. 4 corresponds to AD2 in FIG.
- AD in FIG. 4 corresponds to RD1 in FIG. 2
- RD in FIG. 4 corresponds to RD2 in FIG.
- FIG. 7 is a diagram schematically showing an MRLD configured by horizontally stacking MLUTs composed of two memory cell units shown in FIG.
- FIG. 8 is a diagram showing the input / output lines of the MLUT.
- The MLUT 30 shown in FIG. 7 has inputs of addresses A0L to A7L shown in FIG. 8 from the left direction and inputs of addresses A0R to A7R shown in FIG. 8 from the right direction. It has outputs of data D0L to D7L shown in FIG. 8 in the left direction and outputs of data D0R to D7R shown in FIG. 8 in the right direction.
- This proposal comprises 8K bits (256 words × 16 bits × 2 MLUTs).
- FIG. 9 is a diagram illustrating a circuit example of the MLUT illustrated in FIG.
- the MLUT 30 illustrated in FIG. 9 includes memory cell units 31A and 31B.
- the memory cell unit is, for example, an SRAM.
- the memory cell unit 31A includes a plurality of memory cells that are specified by the first plurality of address lines from one side and output to the first plurality of data lines that is twice as many as the first plurality of address lines.
- the memory cell unit 31B has a plurality of memory cells that are specified by the second plurality of address lines from the other side and output to the second plurality of data lines that is twice the number of the second plurality of address lines.
- the MLUT 30 outputs a part of the first plurality of data lines and the second plurality of data lines to one side, and outputs the other part of the first plurality of data lines and the second plurality of data lines to the other side.
- Each memory cell unit stores truth table data in a memory cell for each direction. Therefore, each of the memory cell units 31A and 31B stores right-to-left truth table data and left-to-right truth table data. That is, the MLUT stores two truth table data each defining a specific data output direction.
- Making the number of data lines of each memory cell unit larger than the number of address lines, and making the data output of each memory cell unit bidirectional, reduces the number of required memory cells while still enabling bidirectional data output.
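The two-unit arrangement can be sketched as two 256-word, 16-bit memories whose outputs are divided between the two directions. The text does not say which data bits go which way or how outputs heading the same way are merged, so the bit split and the bitwise OR below are assumptions, as is the class name `BidirectionalMLUT`.

```python
class BidirectionalMLUT:
    """Two memory cell units, each with 8 address lines and 16 data lines.

    Assumptions for illustration only: the upper 8 data bits of each unit
    leave to the left, the lower 8 bits leave to the right, and outputs
    heading the same way are merged with a bitwise OR.
    """

    WORDS = 1 << 8          # 256 words per unit

    def __init__(self):
        self.unit_a = [0] * self.WORDS   # addressed from the left (A0L-A7L)
        self.unit_b = [0] * self.WORDS   # addressed from the right (A0R-A7R)

    def evaluate(self, address_left, address_right):
        a = self.unit_a[address_left]
        b = self.unit_b[address_right]
        data_left = (a >> 8) | (b >> 8)        # D0L-D7L
        data_right = (a & 0xFF) | (b & 0xFF)   # D0R-D7R
        return data_left, data_right


mlut = BidirectionalMLUT()
mlut.unit_a[0x03] = 0x1234   # left input 0x03: 0x12 returns left, 0x34 goes right
outputs = mlut.evaluate(address_left=0x03, address_right=0x00)
```

Each unit needs only 2^8 words because it decodes the address lines of one side, whereas a single memory decoding all 16 address lines would need 2^16 words.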
- FIG. 10 shows a more detailed circuit example than the MLUT shown in FIG.
- the MLUT 30 shown in FIG. 10 includes memory cell units 31A and 31B, address decoders 11A and 11B, address selectors 15A and 15B, I / O (input / output) buffers 12A and 12B, and data selectors 13A and 13B.
- the memory cell units 31A and 31B of the MLUT 30 each have an address decoder, an address selector, an I / O buffer, and a data selector.
- The input addresses to the memory cell units 31A and 31B are addresses A0L to A7L and A8 to A15, and addresses A0R to A7R and A8 to A15, respectively. Therefore, the memory cell units 31A and 31B each have a large capacity of 512K bits: 2^16 (65,536) words × 8 bits.
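The capacities quoted for the two configurations follow directly from the number of address lines and the data width. The helper `capacity_bits` is a hypothetical name used only to check the arithmetic; the 512K figure is computed here for one unit with 16 address lines.

```python
def capacity_bits(address_lines, data_bits, units=1):
    """Bits stored by `units` memories of 2**address_lines words each."""
    return (1 << address_lines) * data_bits * units


KIBI = 1024

# 8 address lines, 16 data bits, two units: 256 words x 16 bits x 2.
small_k = capacity_bits(address_lines=8, data_bits=16, units=2) // KIBI

# 16 address lines, 8 data bits: 2**16 words x 8 bits.
large_k = capacity_bits(address_lines=16, data_bits=8) // KIBI
```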
- FIG. 9 is a schematic diagram and does not show the decoders and other peripheral circuits of the memory cell units. A decoder is prepared for each memory cell unit: the address decoders 11A and 11B described in FIG. are arranged between the address selectors 15A and 15B and the memory cell units 31A and 31B. Therefore, the address decoders may decode all the addresses output from the address selectors 15A and 15B.
- the address selectors 15A and 15B are selection circuits for switching between an address line for logic operation and an address for writing, and are necessary when the memory cell is a single port. When the memory cell is a dual port, an address selector is not necessary.
- The data selectors 13A and 13B are selection circuits that switch between the output data and the write data WD.
- MRLD can use a conventional large-capacity memory device without going through semiconductor design prototyping and manufacturing for a dedicated small SRAM.
- a memory IP (Intellectual Property)
- the area of the address decoder and sense amplifier is large, and the composition ratio of the memory itself is 50% or less. This also becomes an overhead of the MRLD and is inefficient.
- the ratio of address decoders and sense amplifiers decreases, and the memory usage efficiency increases. For this reason, the present proposal for a large-capacity memory is effective in the case of an MRLD chip.
- The MLUT described here is a bidirectionally arranged MLUT and has the same functional configuration as the MLUT described with reference to FIGS. However, unlike that bidirectionally arranged MLUT, it is provided with both a memory cell unit for synchronous operation and a memory cell unit for asynchronous operation.
- The memory cell unit for synchronous operation and the memory cell unit for asynchronous operation form a pair, but only one of them operates as a logic element and/or a connection element. Because the data outputs of both are connected by a wired-OR connection or an OR circuit, the data “0” is stored in every cell of the memory cell unit that does not operate.
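The reason the non-operating unit must hold all zeros can be seen from a small model of the OR connection. The function `combined_output` and the 2-bit example table are assumptions made for illustration.

```python
def combined_output(sync_unit, async_unit, address):
    """OR the data outputs of the paired memory cell units."""
    return sync_unit[address] | async_unit[address]


truth_table = [0b00, 0b01, 0b10, 0b11]
idle = [0] * len(truth_table)      # the non-operating unit stores all zeros

# Asynchronous unit active, synchronous unit idle.
async_mode = [combined_output(idle, truth_table, a) for a in range(4)]

# Synchronous unit active, asynchronous unit idle.
sync_mode = [combined_output(truth_table, idle, a) for a in range(4)]
```

Because OR with 0 leaves a value unchanged, the active unit's truth table appears at the output unmodified in either mode.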
- FIG. 11 is a diagram showing an example of a MLUT circuit capable of synchronous and asynchronous switching.
- the MLUT 30 shown in FIG. 11 includes memory cell units 31A to 31D, address decoders 11A to 11D, I / O (input / output) buffers 13A to 13D, selection circuits 32A to 32D, a data selection circuit 33, and an address transition detection unit 35.
- the address transition detector 35 includes an ATD (Address Transition Detector) circuit, and detects the address transition by comparing the logical address transmitted together with the clock with the previously transmitted logical address.
- ATD Address Transition Detector
- FIG. 5 is a circuit diagram showing an example of the address transition detection unit.
- FIG. 6 is a timing chart of the address transition detection unit.
- the address transition detection unit 35 includes negative logical sum (NOR) circuits 110A and 110B, a logical sum (OR) circuit 120, an exclusive logical sum (EOR) circuit 130, delay circuits 140A to 140C, a flip-flop (FF) 150, inverters 160A and 160B, and a D latch 170.
- NOR negative logical sum
- OR logical sum
- EOR exclusive logical sum
- FF flip-flop
- the signal S1 is an address input signal output from the processor.
- Signal S2 is the output of the D latch.
- the D latch 170 latches so as not to change for a certain period. This is to ignore subsequent address transitions due to noise or the like.
- the signal S3 is a delayed signal output from the D latch 170. As shown in FIG. 5, it is delayed by the delay circuit 140B so that a clock is generated at both the rising edge and the falling edge; the delay sets the clock width of the signal S4.
- the signal S4 is the clock signal generated when a change is detected, and is output from the EOR 130.
- because the EOR 130 receives both the input and the output of the delay circuit 140B, it outputs the signal level “high” when the two signal levels differ. An address transition can thereby be detected.
- the time T1 of S4 shown in FIG. 6 indicates the time from the detection of the change of the logical address to the FF fetch, and the time T2 indicates the time from the detection of the change of the logical address to the reading of the memory cell unit.
- the OR circuit 120 receives the signal S4 together with other address transition signals and outputs their logical OR.
- the output of the OR circuit 120 is delayed by the delay circuit 140C, and the signal S5 is output.
- the signal S5 is the delayed signal output from the delay circuit 140C; it waits for the enable signal of the D latch 170 before the clock is input.
- the signal S6 is an extended version of the signal S5 and generates the pulse of the enable signal.
- the NOR circuit 110A outputs a signal S7 that is a NOR operation value of the signals S5 and S6.
- the signal S7 becomes an enable signal for the D latch 170.
- the signal S8 is a signal obtained by inverting the signal S5 by the inverter 160A, and is used by the FF 150 as a clock for latching the address signal.
- the signal S9 is used to enable the memory cell unit 31 in the subsequent stage, the signal S10 is used as a clock (atd_clk) of the memory cell unit 31, and the signal S11 is used as an address of the memory cell unit 31.
- a signal S10 in FIG. 5 indicates the time from detection of a change in logical address to reading from the memory.
- a clock is generated when the address for which a data request is made changes, and the memory is driven.
- because the memory is activated only then, power consumption can be reduced autonomously, without driving the memory when it is unnecessary.
- the memory cell units 31A to 31D are synchronous SRAMs. Each of the memory cell units 31A to 31D stores truth table data for connection in the left direction and the right direction.
- the memory cell units 31B and 31D operate in synchronization with the system clock.
- the memory cell units 31A and 31C operate in synchronization with an ATD generation clock (also referred to as “internal clock signal”) generated by an address transition circuit 35 described later, and therefore operate asynchronously with respect to the clock (system clock).
- because the ATD generation clock operates at a frequency higher than that of the system clock signal, the memory cell units 31A and 31C appear from outside the MLUT 30 to operate asynchronously, and thereby provide an asynchronous function.
- the memory cell units 31A and 31C have the same functions as the memory cell units 31A and 31B shown in FIGS. The same applies to the memory cell units 31B and 31D.
- the address decoders 11A and 11B both decode addresses A0 to A3 input from the left side, and output decode signals to the memory cell units 31A and 31B, respectively, to activate the word lines of the memory cell units 31A and 31B.
- the address decoders 11C and 11D decode addresses A4 to A7 input from the right side, and output decode signals to the memory cell units 31C and 31D, respectively, to activate the word lines of the memory cell units 31C and 31D.
- the address decoders 11A and 11C decode the SRAM address asynchronous signal (sram_address (async)), and the address decoders 11B and 11D decode the SRAM address synchronous signal (sram_address (sync)); the word line of the memory cell unit specified by the decode signal is activated.
- each memory cell unit is a 16 words × 8 bits memory block.
- the memory cell units 31A and 31B can be used as 16 words × 8 bits × 2 in the synchronous mode and 16 words × 8 bits × 2 in the asynchronous mode. Synchronous and asynchronous operations cannot be performed simultaneously. For example, when logic data is written to a synchronously operating memory cell unit, all "0" must be written to the asynchronously operating memory cell unit.
- the data output of the memory cell unit may be a wired OR as shown in the figure, or an OR logic circuit may be provided.
- Selection circuit: the selection conditions for the selection circuits are shown in the table below.
- the selection circuits 32A to 32D are circuits for selecting the operation of the memory cell units 31A and 31C for asynchronous operation or the memory cell units 31B and 31D for synchronous operation.
- when asynchronous operation is selected by the selection signal (Select), the selection circuit 32A selects the atd_ad latch address (S11 shown in FIG. 3) generated by the address transition circuit 35 and outputs it as the SRAM address asynchronous signal (sram_address (async)). If asynchronous operation is not selected, the logical address is output as it is.
- the selection circuit 32B selects and outputs the ATD generation clock generated by the address transition circuit 35 when the asynchronous operation is selected by the selection signal (Select). If asynchronous operation is not selected, the clock is output as is.
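- when asynchronous operation is selected by the selection signal (Select), the selection circuit 32C selects and outputs the ATD generation chip select generated by the address transition circuit 35. If asynchronous operation is not selected, the SRAM chip enable is output as it is.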
- the selection circuit 32D outputs the logical address as it is when the synchronous operation is selected by the selection signal (Select).
- Truth table 1 is a truth table that forms an AND circuit using A0 and A1 and outputs it to D0.
- truth table 2 is a truth table that forms an AND circuit using A0 and A4 and outputs it to D0. Because the logic of truth table 1 uses only A3-A0, it can be computed by the memory cell unit 31A alone; if “0” is written in the other memory cell unit, the OR operation leaves the result unaffected by that memory cell unit, and the problem of forbidden logic does not occur.
- the I / O (input / output) buffers 13A to 13D provide an FF function by reading data from the data line of the memory cell unit in synchronization with either the clock or the ATD generation clock.
- the I / O (input / output) buffers 13A to 13D include a sense amplifier that amplifies a voltage output from the bit line of the memory cell.
- the selection circuit 32 outputs the SRAM data output (O_mdata) as either SRAM data output or logical data output according to the selection signal.
- FIG. 12 is a diagram illustrating an example of the scale adjustment circuit.
- the scale adjustment circuit 21A is disposed between the main memory 600 and the MRLD 20A, and the circuit scale adjustment circuit 21B is disposed between the main memory 600 and the MRLD 20B.
- FIG. 13 is a diagram illustrating an example of an MLUT.
- the notation of the address selector, the I / O buffer, and the data selector is omitted in order to explain the logical operation.
- the logic operation data lines D0 to D3 connect the 16 storage elements 40 in series.
- the address decoder 9 is configured to select four storage elements connected to any of the 16 word lines based on signals input to the logic address input LA lines A0 to A3.
- These four storage elements are connected to logic operation data lines D0 to D3, respectively, and output data stored in the storage elements to logic operation data lines D0 to D3.
- the four storage elements 40A, 40B, 40C, and 40D can be selected.
- the storage element 40A is connected to the logic operation data line D0.
- the storage element 40B is connected to the logic operation data line D1.
- the storage element 40C is connected to the logic operation data line D2.
- the storage element 40D is connected to the logic operation data line D3.
- signals stored in the storage elements 40A to 40D are output to the logic operation data lines D0 to D3.
- the MLUTs 30A and 30B receive the logical address input LA from the logical address input LA lines A0 to A3, and output the values stored in the four storage elements 40 selected by the address decoder 9 based on that logical address input LA to the logic operation data lines D0 to D3, respectively, as logic operation data.
- the logical address input LA line A2 of the MLUT 30A is connected to the logic operation data line D0 of the adjacent MLUT 30B, and the MLUT 30A receives the logic operation data output from the MLUT 30B as the logical address input LA.
- the logic operation data line D2 of the MLUT 30A is connected to the logic address input LA line A0 of the MLUT 30B, and the logic operation data output from the MLUT 30A is received by the MLUT 30B as the logic address input LA.
- based on signals input to the logic address input LA lines A0 to A3 of the MLUT 30A, the logic operation data line D2 of the MLUT 30A outputs the signal stored in one of the 16 storage elements connected to the logic operation data line D2 to the logic address input LA line A0 of the MLUT 30B.
- based on signals input to the logic address input LA lines A0 to A3 of the MLUT 30B, the logic operation data line D0 of the MLUT 30B outputs the signal stored in one of the 16 storage elements connected to the logic operation data line D0 to the logic address input LA line A2 of the MLUT 30A.
- the MLUTs are connected to each other using a pair of address lines and data lines.
- a pair of address lines and data lines used for MLUT connection such as the logic address input LA line A2 of the MLUT 30A and the logic operation data line D2, is referred to as an “AD pair”.
- the MLUTs 30A and 30B have four AD pairs, but the number of AD pairs is not limited to four, as will be described later.
- FIG. 14 is a diagram illustrating an example of an MLUT that operates as a logic circuit.
- the logical address input LA lines A0 and A1 are input to the two-input NOR circuit 701.
- the logical address input LA lines A2 and A3 are input to the two-input NAND circuit 702.
- the output of the two-input NOR circuit 701 and the output of the two-input NAND circuit 702 are input to the two-input NAND circuit 703, and the output of the two-input NAND circuit 703 is output to the logic operation data line D0.
- FIG. 15 is a diagram showing a truth table of the logic circuit shown in FIG. 14. Since the logic circuit of FIG. 14 has four inputs, all the inputs A0 to A3 are used as inputs. On the other hand, since there is only one output, only the output D0 is used as an output. “*” is written in the columns of outputs D1 to D3 of the truth table, indicating that either “0” or “1” may be used. However, when the truth table data is actually written into the MLUT for reconfiguration, either “0” or “1” must be written in these fields.
- FIG. 16 is a diagram illustrating an example of an MLUT that operates as a connection element.
- the MLUT as a connection element outputs the signal of the logic address input LA line A0 to the logic operation data line D1, outputs the signal of the logic address input LA line A1 to the logic operation data line D2, and outputs the signal of the logic address input LA line A2 to the logic operation data line D3.
- the MLUT as the connection element further operates to output the signal of the logic address input LA line A3 to the logic operation data line D0.
- FIG. 17 is a diagram showing a truth table of the connection element shown in FIG. 16.
- the connection element shown in FIG. 16 has 4 inputs and 4 outputs. Therefore, all inputs A0-A3 and all outputs D0-D3 are used.
- the MLUT operates as a connection element that outputs the signal of the input A0 to the output D1, the signal of the input A1 to the output D2, the signal of the input A2 to the output D3, and the signal of the input A3 to the output D0.
- FIG. 18 is a diagram illustrating an example of a connection element realized by an MLUT having four AD pairs of AD pair 0, AD pair 1, AD pair 2, and AD pair 3.
- AD0 has a logic address input LA line A0 and a logic operation data line D0.
- AD1 has a logic address input LA line A1 and a logic operation data line D1.
- AD2 has a logic address input LA line A2 and a logic operation data line D2.
- AD3 has a logic address input LA line A3 and a logic operation data line D3.
- a two-dot chain line indicates a signal flow in which a signal input to the logic address input LA line A0 of the AD pair 0 is output to the logic operation data line D1 of the AD pair 1.
- a broken line indicates a signal flow in which a signal input to the logic address input LA line A1 of the AD pair 1 is output to the logic operation data line D2 of the AD pair 2.
- a solid line indicates a flow of a signal in which a signal input to the logic address input LA line A2 of the AD pair 2 is output to the logic operation data line D3 of the AD pair 3.
- a one-dot chain line indicates a signal flow in which a signal input to the logic address input LA line A3 of the AD pair 3 is output to the logic operation data line D0 of the AD pair 0.
- the MLUT 30 has four AD pairs, but the number of AD pairs is not particularly limited to four.
- FIG. 19 is a diagram illustrating an example in which one MLUT operates as a logic element and a connection element.
- the logical address input LA lines A0 and A1 are input to the 2-input NOR circuit 121, and the output of the 2-input NOR circuit 121 and the logical address input LA line A2 are input to the 2-input NAND circuit 122.
- a logic circuit is formed that outputs the output of the 2-input NAND circuit 122 to the logic operation data line D0.
- at the same time, a connection element that outputs the signal of the logic address input LA line A3 to the logic operation data line D2 is formed.
- FIG. 20 shows a truth table of the logic element and connection element shown in FIG. 19.
- the logic operation of FIG. 19 uses the three inputs A0 to A2 and uses the one output D0.
- the connection element of FIG. 19 outputs the signal of the input A3 to the output D2.
- FIG. 21 is a diagram illustrating an example of logical operations and connection elements realized by an MLUT having four AD pairs of AD0, AD1, AD2, and AD3.
- AD0 has a logic address input LA line A0 and a logic operation data line D0.
- AD1 has a logic address input LA line A1 and a logic operation data line D1.
- AD2 has a logic address input LA line A2 and a logic operation data line D2.
- AD3 has a logic address input LA line A3 and a logic operation data line D3.
- the MLUT 30 realizes two operations, ie, a logic operation with three inputs and one output and a connection element with one input and one output, with one MLUT 30.
- the logic operation uses the logic address input LA line A0 of AD pair 0, the logic address input LA line A1 of AD pair 1, and the logic address input LA line A2 of AD pair 2 as inputs, and uses the logic operation data line D0 of AD pair 0 as an output. Further, the connection element outputs a signal input to the logic address input LA line A3 of the AD pair 3 to the logic operation data line D2 of the AD pair 2, as indicated by a broken line.
- truth table data applied to the reconfigurable semiconductor device described using the first and second embodiments is generated by an information processing apparatus that executes a software program for logical configuration.
- the information processing apparatus may be the computer apparatus 10, or may be another computer apparatus that has the same hardware resources as the computer apparatus 10 and is connected to the network 1100.
- the computer device 10 includes a processor 510, a main memory 600, and a drive device 550.
- the processor 510 executes the logic configuration software 1210 loaded from the communication unit 530 or the drive device 550, generates truth table data 1230 from a circuit description language 1220, such as a C language description or a hardware description language (HDL) for designing an integrated circuit, and stores it in the main memory 600.
- the processor 510 writes the generated truth table data 1230 to the reconfigurable device 20.
- the drive device 550 is a device that reads and writes a storage medium 1200 such as a DVD (Digital Versatile Disc) or a flash memory.
- the drive device 550 includes a motor that rotates the storage medium 1200, a head that reads and writes data on the storage medium 1200, and the like.
- the drive device 550 reads a program from the set storage medium 1200.
- the processor 510 stores the program or truth table data read by the drive device 550 in the main memory 600.
- when the truth table data 1230 is read into the reconfigurable device 20, a function as a logic element and / or a connection element is constructed by specific means in which the truth table data and hardware resources cooperate.
- the truth table data can also be said to be data having a structure indicating a logical structure called a truth table.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Computer Hardware Design (AREA)
- General Engineering & Computer Science (AREA)
- Computing Systems (AREA)
- Computer Networks & Wireless Communication (AREA)
- Microelectronics & Electronic Packaging (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Logic Circuits (AREA)
Abstract
Description
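The reconfigurable device comprises a plurality of logic units connected to one another by address lines or data lines,
each of the logic units having:
a plurality of address lines;
a plurality of data lines;
an address decoder that decodes an address input from some of the plurality of address lines; and
a memory cell array unit that has a plurality of memory cells specified by decode lines of the address decoder and outputs data read from the specified memory cells to the data lines,
wherein the address lines of the memory cell array unit are connected to the data output lines of the main memory.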
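a first address decoder that decodes an address input from some of the plurality of address lines;
a second address decoder that decodes an address input from another part of the plurality of address lines;
a first memory cell unit having a plurality of memory cells specified by decode lines of the first address decoder; and
a second memory cell unit having a plurality of memory cells specified by decode lines of the second address decoder; the reconfigurable device according to item 1 or 2.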
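a reconfigurable device connected to the main memory,
the reconfigurable device comprising a plurality of logic units connected to one another by address lines or data lines,
each of the logic units having:
a plurality of address lines;
a plurality of data lines;
an address decoder that decodes an address input from some of the plurality of address lines; and
a memory cell array unit that has a plurality of memory cells specified by decode lines of the address decoder and outputs data read from the specified memory cells to the data lines,
wherein the data output of the memory cell array unit is connected to the address lines of the main memory, the reconfigurable device being included in the semiconductor device.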
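The second reconfigurable device comprises a plurality of logic units connected to one another by address lines or data lines,
each of the logic units having:
a plurality of address lines;
a plurality of data lines;
an address decoder that decodes an address input from some of the plurality of address lines; and
a memory cell array unit that has a plurality of memory cells specified by decode lines of the address decoder and outputs data read from the specified memory cells to the data lines,
wherein the data output of the memory cell array unit is connected to the address lines of the main memory;
the semiconductor device according to item 6.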
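For a processor, finding data in the main memory means searching for information while checking addresses one after another, which is extremely costly in time and load. To process information efficiently, therefore, one usually organizes in advance what information resides at which address in memory and prepares metadata, for example a hash table, so that the processor's search burden is reduced. However, preparing metadata takes a great deal of time and the data maintenance must be repeated, so the apparatus must be scaled up and supplied with more power, for example by parallelizing processors. The computer device or semiconductor device described below can realize various functions in memory without requiring metadata.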
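FIG. 1 is a diagram showing a first example of the overall configuration of the computer device according to the present embodiment. As shown in FIG. 1, the computer device 10 includes a processor 510, a main memory 600, a communication unit 530, an external storage device 540, a drive device 550, and an I/O controller 560.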
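Reference numeral 16 denotes a semiconductor device composed of at least the main memory and the reconfigurable device. The reconfigurable device 20 is connected to the data output of the main memory 600. The reconfigurable device implements a circuit that performs simple operations, for example a sequential comparator or an automaton.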
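Hereinafter, the reconfigurable logic device is also called an MRLD (Memory based Reconfigurable Logic Device) (registered trademark) and is described using the same reference numeral 20. In the MRLD, the MLUTs are connected directly to one another without intervening wiring elements, and the functions of the synchronous SRAM supplied as memory IP are put to effective use. Although not illustrated for the MLUTs in the following description, each includes an address transition detection unit, which makes even a synchronous SRAM operate asynchronously. In addition to this asynchronous operation, no input signal enters a block that does not constitute logic, so no address transition occurs there and power is saved. A block that constitutes logic receives input signals, so a clock is generated and the block can output the prescribed logic value.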
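The power-saving behaviour of the address transition detection unit can be illustrated with a small behavioural model. The Python sketch below is only an illustration under stated assumptions: the class and function names are hypothetical, the first address presented is treated as a transition, and the circuit-level details of the ATD (delay circuits, latch, pulse widths) are ignored.

```python
class AddressTransitionDetector:
    """Behavioural stand-in for an ATD: emits a clock pulse only when the address changes."""

    def __init__(self):
        self.previous_address = None  # assumption: nothing is latched at start-up

    def detect(self, address: int) -> bool:
        """Return True (a clock pulse) if the address differs from the latched one."""
        changed = address != self.previous_address
        self.previous_address = address
        return changed


def count_memory_reads(addresses):
    """Count how often the memory is driven for a stream of addresses."""
    atd = AddressTransitionDetector()
    return sum(1 for address in addresses if atd.detect(address))


# A block whose inputs never change drives its memory only for the first address.
idle_reads = count_memory_reads([3, 3, 3, 3, 3, 3])
# A block whose inputs change generates one clock per address transition.
active_reads = count_memory_reads([3, 5, 5, 6, 3, 3])
```

In this model a block with constant inputs causes a single memory access, while a block with changing inputs is clocked once per change, which is the qualitative effect the text attributes to the ATD.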
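FIG. 7 is a diagram schematically showing an MRLD configured by stacking side by side the MLUTs of FIG. 9, each consisting of two memory cell units. FIG. 8 is a diagram showing the input and output lines of an MLUT. The MLUT 30 shown in FIG. 7 receives the addresses A0L to A7L of FIG. 8 from the left and the addresses A0R to A7R of FIG. 8 from the right, and outputs the data D0L to D7L of FIG. 8 to the left and the data D0R to D7R of FIG. 8 to the right. With the conventional scheme, an MLUT with n = 8 requires 1 Mbit, and the CLB equivalent grows to 4 Mbit. In the present proposal, by contrast, it is configured with 8 Kbit (256 words × 16 bits × 2 MLUTs), as described later.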
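The memory sizes quoted above can be checked with a short calculation. The sketch below assumes that the 1 Mbit figure comes from a single table addressed by all 16 address lines (8 from the left, 8 from the right) with 16 data lines, and that the split arrangement uses two units of 8 address lines and 16 data lines each; the function name `lut_bits` is hypothetical.

```python
def lut_bits(address_lines: int, data_lines: int) -> int:
    """Number of memory bits in a lookup table: 2^address_lines words of data_lines bits."""
    return (2 ** address_lines) * data_lines


# Assumed conventional arrangement: one table decoded by all 16 address lines.
conventional_bits = lut_bits(address_lines=16, data_lines=16)

# Assumed split arrangement: two units, each 256 words x 16 bits.
split_bits = 2 * lut_bits(address_lines=8, data_lines=16)

reduction_factor = conventional_bits // split_bits

print(conventional_bits, split_bits, reduction_factor)
```

Under these assumptions the two results correspond to 1 Mbit and 8 Kbit, matching the figures in the text.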
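The MLUT described here is a bidirectionally arranged MLUT and has the same functional configuration as the MLUT described with reference to FIGS. 7 and 8. Unlike the bidirectionally arranged MLUT above, however, it includes a memory cell unit for synchronous operation and a memory cell unit for asynchronous operation. The memory cell unit for synchronous operation and the memory cell unit for asynchronous operation form a pair, but only one of them operates as a logic element and/or a connection element. Because the two data outputs are joined by a wired-OR connection or an OR circuit, all-"0" data is stored in the memory cell unit that does not operate.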
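Why the non-operating memory cell unit must hold all "0" data can be seen in a minimal model of the OR-combined outputs. The sketch below is an assumption-laden illustration: it models each unit as a list of 16 words of 8 bits read with the same address, and the function names and the example truth table (D0 = A0 AND A1) are chosen here for illustration only.

```python
WORDS = 16        # words per memory cell unit (16 words x 8 bits)
DATA_MASK = 0xFF  # 8 data bits


def and_table():
    """Truth table data: D0 = A0 AND A1, all other data bits 0."""
    return [1 if (address & 0b11) == 0b11 else 0 for address in range(WORDS)]


def mlut_output(sync_unit, async_unit, address):
    """Both units are read with the same address and their outputs are ORed."""
    return (sync_unit[address] | async_unit[address]) & DATA_MASK


active = and_table()   # unit that implements the logic
idle = [0] * WORDS     # non-operating unit, filled with "0"

outputs = [mlut_output(idle, active, a) for a in range(WORDS)]

# If the idle unit held stale non-zero data, it would corrupt the OR result.
stale = [1] * WORDS
corrupted = [mlut_output(stale, active, a) for a in range(WORDS)]
```

With the idle unit zero-filled, `outputs` equals the active unit's truth table; with stale data in the idle unit, D0 is forced to 1 for every address.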
The signal lines shown in FIG. 11 are described in Table 1 below.
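The memory cell units 31A to 31D are synchronous SRAMs. Each of the memory cell units 31A to 31D stores truth table data for connection in the left and right directions. The memory cell units 31B and 31D operate in synchronization with the system clock. The memory cell units 31A and 31C, on the other hand, operate in synchronization with the ATD generation clock (also called the "internal clock signal") generated by the address transition circuit 35 described later, and therefore operate asynchronously with respect to the clock (system clock). Because the ATD generation clock runs at a higher frequency than the system clock signal, the memory cell units 31A and 31C appear from outside the MLUT 30 to operate asynchronously, and thereby provide an asynchronous function.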
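Memory division also brings a characteristic constraint: forbidden logic configurations. The need for forbidden logic is explained using the two truth tables shown in Table 2.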
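The forbidden-logic constraint can be explored with a small check. The sketch below rests on assumptions not stated in the source: the MLUT output D0 is modelled as the OR of one table addressed by A0 to A3 and another addressed by A4 to A7, and the helper `realizable_by_or` is a hypothetical name. It tests whether a target function can be written as g(left) OR h(right) by trying the largest tables g and h that never output 1 where the target is 0.

```python
LEFT_ADDRESSES = range(16)   # A0..A3, decoded by the left-hand unit
RIGHT_ADDRESSES = range(16)  # A4..A7, decoded by the right-hand unit


def realizable_by_or(target):
    """Check whether target(left, right) == g(left) OR h(right) for some tables g, h."""
    # g may be 1 for a left address only if the target is 1 for every right address.
    g = [all(target(left, right) for right in RIGHT_ADDRESSES)
         for left in LEFT_ADDRESSES]
    # h may be 1 for a right address only if the target is 1 for every left address.
    h = [all(target(left, right) for left in LEFT_ADDRESSES)
         for right in RIGHT_ADDRESSES]
    # If even these maximal tables do not reproduce the target, no tables do.
    return all((g[left] or h[right]) == bool(target(left, right))
               for left in LEFT_ADDRESSES for right in RIGHT_ADDRESSES)


def and_a0_a1(left, right):
    """AND of A0 and A1: both inputs belong to the left unit."""
    return (left & 1) & ((left >> 1) & 1)


def and_a0_a4(left, right):
    """AND of A0 and A4: the inputs are split across the two units (A4 is bit 0 of right)."""
    return (left & 1) & (right & 1)


same_unit_ok = realizable_by_or(and_a0_a1)
split_unit_ok = realizable_by_or(and_a0_a4)
```

In this model an AND whose inputs lie in one unit is realizable with the other unit zero-filled, whereas an AND whose inputs are split across the two units is not, which is one way to read the forbidden-logic restriction.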
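The I/O (input/output) buffers 13A to 13D provide an FF function by reading data from the data lines of the memory cell units in synchronization with either the clock or the ATD generation clock. The I/O (input/output) buffers 13A to 13D include sense amplifiers that amplify the voltage output from the bit lines of the memory cells.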
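Because the MRLD 20 is composed of small memory cell units, its integrated circuit scale becomes larger than that of the main memory 600, and the two do not match. FIG. 12 is a diagram showing an example of the scale adjustment circuit. The scale adjustment circuit 21A is arranged between the main memory 600 and the MRLD 20A, and the circuit scale adjustment circuit 21B is arranged between the main memory 600 and the MRLD 20B.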
A. Logic element
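FIG. 13 is a diagram showing an example of an MLUT. In FIG. 13, the address selector, the I/O buffer, and the data selector are omitted in order to explain the logic operation. The MLUTs 30A and 30B shown in FIG. 13 each have four logic address input LA lines A0 to A3, four logic operation data lines D0 to D3, 4 × 16 = 64 storage elements 40, and an address decoder 9. Each of the logic operation data lines D0 to D3 connects 16 storage elements 40 in series. The address decoder 9 is configured to select, based on the signals input to the logic address input LA lines A0 to A3, the four storage elements connected to one of the 16 word lines. These four storage elements are connected to the logic operation data lines D0 to D3, respectively, and output the data they store to the logic operation data lines D0 to D3. For example, when appropriate signals are input to the logic address input LA lines A0 to A3, the four storage elements 40A, 40B, 40C, and 40D can be selected. Here, the storage element 40A is connected to the logic operation data line D0, the storage element 40B to the logic operation data line D1, the storage element 40C to the logic operation data line D2, and the storage element 40D to the logic operation data line D3. The signals stored in the storage elements 40A to 40D are then output to the logic operation data lines D0 to D3. In this way, the MLUTs 30A and 30B receive the logic address input LA from the logic address input LA lines A0 to A3 and output the values stored in the four storage elements 40 selected by the address decoder 9 according to that logic address input LA to the logic operation data lines D0 to D3 as logic operation data. The logic address input LA line A2 of the MLUT 30A is connected to the logic operation data line D0 of the adjacent MLUT 30B, and the MLUT 30A receives the logic operation data output from the MLUT 30B as a logic address input LA. Likewise, the logic operation data line D2 of the MLUT 30A is connected to the logic address input LA line A0 of the MLUT 30B, and the logic operation data output by the MLUT 30A is received by the MLUT 30B as a logic address input LA. For example, based on the signals input to the logic address input LA lines A0 to A3 of the MLUT 30A, the logic operation data line D2 of the MLUT 30A outputs the signal stored in one of the 16 storage elements connected to the logic operation data line D2 to the logic address input LA line A0 of the MLUT 30B. Similarly, based on the signals input to the logic address input LA lines A0 to A3 of the MLUT 30B, the logic operation data line D0 of the MLUT 30B outputs the signal stored in one of the 16 storage elements connected to the logic operation data line D0 to the logic address input LA line A2 of the MLUT 30A. In this way, MLUTs are coupled to each other using a pair consisting of one address line and one data line. Hereinafter, a pair of an address line and a data line used to couple MLUTs, such as the logic address input LA line A2 and the logic operation data line D2 of the MLUT 30A, is called an "AD pair".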
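The lookup behaviour of a four-input, four-output MLUT and one direction of an AD-pair link can be sketched as follows. This is a behavioural illustration only: the `MLUT` class, the bit ordering (A0 as least significant address bit, bit i of a word driving Di), and the example contents of the two tables are assumptions introduced here, not taken from the source.

```python
class MLUT:
    """16 words x 4 bits: address lines A0..A3 select a word, its bits appear on D0..D3."""

    def __init__(self, words):
        if len(words) != 16:
            raise ValueError("an MLUT with four address lines holds 16 words")
        self.words = list(words)

    def read(self, a0, a1, a2, a3):
        # Assumption: A0 is the least significant address bit.
        address = a0 | (a1 << 1) | (a2 << 2) | (a3 << 3)
        word = self.words[address]
        # Assumption: bit i of the stored word drives data line Di.
        return tuple((word >> i) & 1 for i in range(4))


# Hypothetical contents: MLUT 30A puts A0 XOR A1 on D2, MLUT 30B puts NOT A0 on D0.
mlut_30a = MLUT([((a & 1) ^ ((a >> 1) & 1)) << 2 for a in range(16)])
mlut_30b = MLUT([1 - (a & 1) for a in range(16)])


def chain(a0, a1):
    """Feed D2 of MLUT 30A into A0 of MLUT 30B (one direction of the AD-pair link)."""
    d2_of_a = mlut_30a.read(a0, a1, 0, 0)[2]
    return mlut_30b.read(d2_of_a, 0, 0, 0)[0]


results = [chain(a0, a1) for a0 in (0, 1) for a1 in (0, 1)]
```

With these example contents the two chained MLUTs compute XNOR of the two inputs; the reverse link (D0 of MLUT 30B into A2 of MLUT 30A) is left unconnected here to avoid modelling feedback.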
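FIG. 16 is a diagram showing an example of an MLUT that operates as a connection element. In FIG. 16, the MLUT as a connection element operates to output the signal of the logic address input LA line A0 to the logic operation data line D1, the signal of the logic address input LA line A1 to the logic operation data line D2, and the signal of the logic address input LA line A2 to the logic operation data line D3. The MLUT as a connection element further operates to output the signal of the logic address input LA line A3 to the logic operation data line D0.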
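The truth table data for such a connection element can be generated mechanically. The sketch below assumes the same illustrative bit ordering as a plain lookup table (A0 as least significant address bit, bit i of each word driving Di); `connection_word` is a hypothetical helper name.

```python
def connection_word(address: int) -> int:
    """Stored word for one address of the connection element: D0=A3, D1=A0, D2=A1, D3=A2."""
    a = [(address >> i) & 1 for i in range(4)]  # a[i] is the value on line Ai
    d = [a[3], a[0], a[1], a[2]]                # d[i] is the value driven on line Di
    return sum(bit << i for i, bit in enumerate(d))


# 16 words x 4 bits of truth table data, indexed by the address A3..A0.
connection_table = [connection_word(address) for address in range(16)]

for address, word in enumerate(connection_table):
    print(f"A3..A0={address:04b} -> D3..D0={word:04b}")
```

Under this bit ordering the table is simply a one-bit left rotation of the 4-bit address, which is a compact way to confirm that each input is routed to exactly one output.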
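FIG. 19 is a diagram showing an example in which one MLUT operates as both a logic element and a connection element. In the example shown in FIG. 19, a logic circuit is formed in which the logic address input LA lines A0 and A1 are the inputs of the 2-input NOR circuit 121, the output of the 2-input NOR circuit 121 and the logic address input LA line A2 are the inputs of the 2-input NAND circuit 122, and the output of the 2-input NAND circuit 122 is output to the logic operation data line D0. At the same time, a connection element is formed that outputs the signal of the logic address input LA line A3 to the logic operation data line D2.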
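The combined logic and connection behaviour can likewise be written out as truth table data. In the sketch below the unused outputs D1 and D3 are written as 0 (an arbitrary choice for don't-care columns), the bit ordering is the same illustrative assumption as before, and `combined_word` is a hypothetical helper name.

```python
def combined_word(address: int) -> int:
    """Stored word for one address: D0 = NAND(NOR(A0, A1), A2), D2 = A3, D1 = D3 = 0."""
    a0, a1, a2, a3 = ((address >> i) & 1 for i in range(4))
    nor_out = 1 - (a0 | a1)        # 2-input NOR circuit 121
    d0 = 1 - (nor_out & a2)        # 2-input NAND circuit 122, output on D0
    d2 = a3                        # connection element: A3 is passed through to D2
    return d0 | (d2 << 2)          # D1 and D3 are unused and written as 0 here


# 16 words x 4 bits of truth table data for the combined element.
combined_table = [combined_word(address) for address in range(16)]

for address, word in enumerate(combined_table):
    print(f"A3..A0={address:04b} -> D3..D0={word:04b}")
```

Because D0 depends only on A0 to A2 and D2 depends only on A3, the two functions share one table without interfering with each other.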
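The truth table data applied to the reconfigurable semiconductor device described using the first and second embodiments is generated by an information processing apparatus that executes a software program for logic configuration. For example, the information processing apparatus may be the computer device 10, or may be another computer device that has the same hardware resources as the computer device 10 and is connected to the network 1100.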
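11 Address decoder
12 I/O buffer
13 Data selector
20 Reconfigurable device
30 MLUT
31 Memory cell unit
32 Selection circuit
35 Address transition detection unit
60 MLUT array
61 Row decoder
62 Column decoder
510 Processor
530 Communication unit
540 External storage device
550 Drive device
600 Main memory
1100 Network
1200 Storage medium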
Claims (7)
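- A reconfigurable device connected to a main memory, wherein
the reconfigurable device comprises a plurality of logic units connected to one another by address lines or data lines,
each of the logic units having:
a plurality of address lines;
a plurality of data lines;
an address decoder that decodes an address input from some of the plurality of address lines; and
a memory cell array unit that has a plurality of memory cells specified by decode lines of the address decoder and outputs data read from the specified memory cells to the data lines,
and wherein the address lines of the memory cell array unit are connected to the data output lines of the main memory.
- The reconfigurable device according to claim 1, wherein the memory cell unit is a multi-lookup table.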
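- The reconfigurable device according to claim 1 or 2, wherein each of the logic units comprises:
a first address decoder that decodes an address input from some of the plurality of address lines;
a second address decoder that decodes an address input from another part of the plurality of address lines;
a first memory cell unit having a plurality of memory cells specified by decode lines of the first address decoder; and
a second memory cell unit having a plurality of memory cells specified by decode lines of the second address decoder.
- The semiconductor device according to claim 3, wherein the first memory cell unit and the second memory cell unit store a plurality of truth table data and are connected to a second plurality of address lines that output data specifying any one of the plurality of truth table data.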
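- A semiconductor device comprising:
a main memory; and
a reconfigurable device connected to the main memory, wherein
the reconfigurable device comprises a plurality of logic units connected to one another by address lines or data lines,
each of the logic units having:
a plurality of address lines;
a plurality of data lines;
an address decoder that decodes an address input from some of the plurality of address lines; and
a memory cell array unit that has a plurality of memory cells specified by decode lines of the address decoder and outputs data read from the specified memory cells to the data lines,
and wherein the data output of the memory cell array unit is connected to the address lines of the main memory.
- The semiconductor device according to claim 6, further comprising a second reconfigurable device, wherein
the second reconfigurable device comprises a plurality of logic units connected to one another by address lines or data lines,
each of the logic units having:
a plurality of address lines;
a plurality of data lines;
an address decoder that decodes an address input from some of the plurality of address lines; and
a memory cell array unit that has a plurality of memory cells specified by decode lines of the address decoder and outputs data read from the specified memory cells to the data lines,
and wherein the data output of the memory cell array unit is connected to the address lines of the main memory.
- The semiconductor device according to claim 5 or 6, further comprising, between the main memory and the reconfigurable device, a scale adjustment circuit that adjusts the circuit scale of the two.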
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2016555138A JP6378775B2 (ja) | 2014-10-22 | 2015-09-18 | 再構成可能デバイス |
US15/514,179 US9923561B2 (en) | 2014-10-22 | 2015-09-18 | Reconfigurable device |
CN201580056708.1A CN107078740A (zh) | 2014-10-22 | 2015-09-18 | 可重构设备 |
EP15852793.7A EP3211795A4 (en) | 2014-10-22 | 2015-09-18 | Reconfigurable device |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2014215160 | 2014-10-22 | ||
JP2014-215160 | 2014-10-22 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2016063667A1 true WO2016063667A1 (ja) | 2016-04-28 |
Family
ID=55760712
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2015/076610 WO2016063667A1 (ja) | 2014-10-22 | 2015-09-18 | 再構成可能デバイス |
Country Status (6)
Country | Link |
---|---|
US (1) | US9923561B2 (ja) |
EP (1) | EP3211795A4 (ja) |
JP (1) | JP6378775B2 (ja) |
CN (1) | CN107078740A (ja) |
TW (1) | TWI618357B (ja) |
WO (1) | WO2016063667A1 (ja) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10074493B2 (en) * | 2016-11-21 | 2018-09-11 | Aeroflex Colorado Springs Inc. | Radiation-hardened break before make circuit |
US10312918B2 (en) * | 2017-02-13 | 2019-06-04 | High Performance Data Storage And Processing Corporation | Programmable logic design |
CN112735493B (zh) * | 2019-10-28 | 2023-06-13 | 敦泰电子股份有限公司 | 静态随机存取内存系统及其数据读写方法 |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2004093445A1 (ja) * | 2003-04-16 | 2004-10-28 | Fujitsu Limited | Ip画像伝送装置 |
WO2014163098A2 (ja) * | 2013-04-02 | 2014-10-09 | 太陽誘電株式会社 | 半導体装置 |
Family Cites Families (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5726584A (en) * | 1996-03-18 | 1998-03-10 | Xilinx, Inc. | Virtual high density programmable integrated circuit having addressable shared memory cells |
JP3106998B2 (ja) * | 1997-04-11 | 2000-11-06 | 日本電気株式会社 | メモリ付加型プログラマブルロジックlsi |
JP4212257B2 (ja) * | 2001-04-26 | 2009-01-21 | 株式会社東芝 | 半導体集積回路 |
US6765408B2 (en) * | 2002-02-11 | 2004-07-20 | Lattice Semiconductor Corporation | Device and method with generic logic blocks |
CN1311637C (zh) * | 2002-03-18 | 2007-04-18 | 皇家飞利浦电子股份有限公司 | 基于lut可重配置逻辑体系结构的配置存储器实现 |
JP2007531956A (ja) * | 2003-07-14 | 2007-11-08 | ズィーモス テクノロジー,アイエヌシー. | 1t1csram |
KR100606242B1 (ko) * | 2004-01-30 | 2006-07-31 | 삼성전자주식회사 | 불휘발성 메모리와 호스트간에 버퍼링 동작을 수행하는멀티 포트 휘발성 메모리 장치, 이를 이용한 멀티-칩패키지 반도체 장치 및 이를 이용한 데이터 처리장치 |
US8138788B2 (en) * | 2005-05-31 | 2012-03-20 | Fuji Xerox Co., Ltd. | Reconfigurable device |
WO2009001426A1 (ja) * | 2007-06-25 | 2008-12-31 | Taiyo Yuden Co., Ltd. | 半導体装置 |
JP5140029B2 (ja) * | 2009-03-30 | 2013-02-06 | 太陽誘電株式会社 | 半導体装置 |
US20110137805A1 (en) | 2009-12-03 | 2011-06-09 | International Business Machines Corporation | Inter-cloud resource sharing within a cloud computing environment |
US8952721B2 (en) * | 2010-06-24 | 2015-02-10 | Taiyo Yuden Co., Ltd. | Semiconductor device |
US8572538B2 (en) * | 2011-07-01 | 2013-10-29 | Altera Corporation | Reconfigurable logic block |
JP5927012B2 (ja) * | 2012-04-11 | 2016-05-25 | 太陽誘電株式会社 | 再構成可能な半導体装置 |
JP5822772B2 (ja) * | 2012-04-11 | 2015-11-24 | 太陽誘電株式会社 | 再構成可能な半導体装置 |
US8767501B2 (en) * | 2012-07-17 | 2014-07-01 | International Business Machines Corporation | Self-reconfigurable address decoder for associative index extended caches |
US9350357B2 (en) * | 2012-10-28 | 2016-05-24 | Taiyo Yuden Co., Ltd. | Reconfigurable semiconductor device |
US9514259B2 (en) * | 2012-11-20 | 2016-12-06 | Taiyo Yuden Co., Ltd. | Logic configuration method for reconfigurable semiconductor device |
WO2014163099A2 (ja) * | 2013-04-02 | 2014-10-09 | 太陽誘電株式会社 | 再構成可能な論理デバイス |
JP6306846B2 (ja) * | 2013-09-16 | 2018-04-04 | 太陽誘電株式会社 | 再構成可能な論理デバイス |
JP6517626B2 (ja) * | 2015-08-11 | 2019-05-22 | 太陽誘電株式会社 | 再構成可能な半導体装置 |
-
2015
- 2015-09-18 US US15/514,179 patent/US9923561B2/en not_active Expired - Fee Related
- 2015-09-18 EP EP15852793.7A patent/EP3211795A4/en not_active Withdrawn
- 2015-09-18 WO PCT/JP2015/076610 patent/WO2016063667A1/ja active Application Filing
- 2015-09-18 JP JP2016555138A patent/JP6378775B2/ja not_active Expired - Fee Related
- 2015-09-18 CN CN201580056708.1A patent/CN107078740A/zh active Pending
- 2015-09-25 TW TW104131918A patent/TWI618357B/zh not_active IP Right Cessation
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2004093445A1 (ja) * | 2003-04-16 | 2004-10-28 | Fujitsu Limited | Ip画像伝送装置 |
WO2014163098A2 (ja) * | 2013-04-02 | 2014-10-09 | 太陽誘電株式会社 | 半導体装置 |
Non-Patent Citations (1)
Title |
---|
See also references of EP3211795A4 * |
Also Published As
Publication number | Publication date |
---|---|
EP3211795A4 (en) | 2018-10-03 |
US9923561B2 (en) | 2018-03-20 |
TW201626726A (zh) | 2016-07-16 |
TWI618357B (zh) | 2018-03-11 |
US20170279451A1 (en) | 2017-09-28 |
EP3211795A1 (en) | 2017-08-30 |
CN107078740A (zh) | 2017-08-18 |
JP6378775B2 (ja) | 2018-08-22 |
JPWO2016063667A1 (ja) | 2017-06-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107408404B (zh) | 用于存储器装置的设备及方法以作为程序指令的存储 | |
US10346092B2 (en) | Apparatuses and methods for in-memory operations using timing circuitry | |
JP6325726B2 (ja) | 半導体装置 | |
US9093135B2 (en) | System, method, and computer program product for implementing a storage array | |
US10203878B2 (en) | Near memory accelerator | |
JP6791522B2 (ja) | インデータパス計算動作のための装置及び方法 | |
US20160283111A1 (en) | Read operations in memory devices | |
JPH1173400A (ja) | ロジック混載dramlsi | |
US20130111102A1 (en) | Semiconductor memory devices | |
JP3334589B2 (ja) | 信号遅延装置及び半導体記憶装置 | |
JP2017038247A (ja) | 再構成可能な半導体装置 | |
EP3729289A2 (en) | A memory apparatus and method for controlling the same | |
JP6378775B2 (ja) | 再構成可能デバイス | |
JP2021152900A (ja) | オーバーラップする範囲を有するコマンドを管理するストレージ装置及びオーバーラップをチェックする方法 | |
WO2013011848A1 (ja) | 半導体メモリ装置 | |
KR102518010B1 (ko) | 휘발성 메모리에 대한 극성 기반 데이터 트랜스퍼 기능 | |
US10559345B1 (en) | Address decoding circuit performing a multi-bit shift operation in a single clock cycle | |
CN112486904A (zh) | 可重构处理单元阵列的寄存器堆设计方法及装置 | |
US7400550B2 (en) | Delay mechanism for unbalanced read/write paths in domino SRAM arrays | |
KR102326661B1 (ko) | 이퀄라이저 장치 및 이를 포함하는 메모리 장치 | |
US11700004B2 (en) | Direct bi-directional gray code counter | |
US11854602B2 (en) | Read clock start and stop for synchronous memories | |
US20210303215A1 (en) | Memory controller, memory, and related memory system | |
KR20240064001A (ko) | 인코딩된 인에이블 클록 게이터 | |
CN110750943A (zh) | 一种具有同步多端口的寄存器组文件的实现方法 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 15852793 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2016555138 Country of ref document: JP Kind code of ref document: A |
|
WWE | Wipo information: entry into national phase |
Ref document number: 15514179 Country of ref document: US |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
REEP | Request for entry into the european phase |
Ref document number: 2015852793 Country of ref document: EP |