WO2007008178A1 - High performance computer architecture - Google Patents

Publication number
WO2007008178A1
WO2007008178A1 PCT/SG2006/000194 SG2006000194W WO2007008178A1 WO 2007008178 A1 WO2007008178 A1 WO 2007008178A1 SG 2006000194 W SG2006000194 W SG 2006000194W WO 2007008178 A1 WO2007008178 A1 WO 2007008178A1
Authority
WO
WIPO (PCT)
Prior art keywords
processor
memory
processing
board
output
Prior art date
Application number
PCT/SG2006/000194
Other languages
French (fr)
Inventor
Liang Shing Ng
Original Assignee
Continuum Science And Technologies Sdn Bhd
Priority date
Filing date
Publication date
Application filed by Continuum Science And Technologies Sdn Bhd
Publication of WO2007008178A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/76 Architectures of general purpose stored program computers
    • G06F15/80 Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
    • G06F15/8007 Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors; single instruction multiple data [SIMD] multiprocessors
    • G06F15/8023 Two dimensional arrays, e.g. mesh, torus

Definitions

  • the present invention relates generally to the field of computer architecture and, more particularly, to processor-memory architecture for improved computational performance. More specifically, the present invention relates to a processor-memory architecture where an array of processor-memory chips is mounted on a circuit board, each processor-memory chip in turn comprises an array of processor-memory cells, and each processor-memory cell comprises its individual local memory bank and processing unit.
  • Computers are used for a wide variety of purposes. They are used, for example, in data processing, billing, inventory control, data gathering, modeling of complex natural phenomena, content creation, games and other applications.
  • the microprocessor and the system memory are separately packaged as distinct integrated circuit packages or chips.
  • the microprocessor and the system memory are connected electrically on a motherboard or a printed circuit board (PCB).
  • a circuit board used in the computer system includes, among others, an array of processor-memory chips, and within each processor-memory chip there is an array of processor-memory cells.
  • each processor-memory cell includes a processing unit and local memory.
  • a processing unit typically includes an arithmetic logic unit (ALU), registers, control unit and other related components.
  • the present invention may include a single processor-memory chip on a printed circuit board.
  • the processing unit and the system memory are located within the same processor-memory chip so that the data transfer problem as described above is overcome.
  • the improved architecture is realized by having a processor-memory system of various configurations where each configuration includes a circuit board on which an array of processor-memory chips is mounted, and each chip comprises an array of processor-memory cells.
  • processor-memory chips having a permanent storage medium, such as flash memory.
  • a circuit board (10) comprising: -
  • a board level input/output and control module (11) adapted for communication with an input/output device
  • processing-memory system (12) having connection with said board level input/output and control module (11);
  • said processing-memory system (12) includes a plurality of processor-memory chips (13) arranged in an array, said processor-memory chips are functionally connected to said board level input/output and control module (11)
  • each of said processor-memory chips (13) having at least one processor-memory cell (14);
  • processor-memory cell (14) having local memory (15) and a processing unit (16).
  • the instruction and data are read from the local memory and processed by the processing unit, and written back to the local memory.
  • the processing unit (16) carries out intercell read/write operations wherein the processing unit reads and writes instructions and data in the memory banks (15) of the neighbouring processor-memory cells.
  • chip level input/output and control module (13a) facilitates the intercell read/write operation.
  • the instruction and processing are performed in parallel within the processor-memory cells.
  • a multi-board system (20) comprising:-
  • a motherboard (18); a motherboard level input/output and control module (21); and a plurality of circuit boards (10);
  • circuit boards (10) are mounted on said motherboard (18) and are electrically connected to each other (10) through said motherboard level input/output and control module (21) for parallel processing of data and instruction stored in processor-memory chips (13) on said circuit boards (10).
  • a system comprising: -
  • said program instructions are spatially executed by distributing the instructions spatially into an array of processor-memory cells (14), whereby said processing-memory system (12) has an array of processor-memory chips (13), each of said processor-memory chips having said array of processor-memory cells (14).
  • the object may also be accomplished by providing,
  • a method of providing fault-tolerant operation characterized in that a program, instruction or the like is executed in a plurality of processor-memory cells (14) of a processing-memory system (12), said processing-memory system (12) having an array of processor-memory chips (13) mounted on a circuit board (10), each of said processor-memory chips having an array of said processor-memory cells (14).
  • the method of providing fault-tolerant operation is further characterized in that when any said processor-memory cell (14) experiences irrecoverable functional failure, said malfunctioning processor-memory cell is rebooted by another processor-memory cell (14).
  • the object may also be accomplished by providing, in wafer production of a processor-memory integrated circuit, characterized in that said integrated circuit has an array of processor-memory cells (14), each of said processor-memory cells (14) having a local memory (15) and a processing unit (16), and when any of said processor-memory cells is considered defective, said defective processor-memory cell is functionally bypassed or not used and said processor-memory chip is considered acceptable.
  • the object may also be accomplished by providing,
  • a processor-memory integrated circuit characterized in that a permanent storage medium is incorporated within said processor-memory integrated circuit.
  • Figure 1 shows a block diagram representation of a conventional computer system.
  • Figure 2 shows a diagrammatic layout of a proposed computer system architecture of the present invention.
  • Figure 3 shows a diagrammatic representation of the processor-memory chip shown in Figure 2.
  • Figure 4 shows a diagrammatic representation of the processor-memory cell shown in Figure 3.
  • Figure 5 shows a diagrammatic representation of a network of a plurality of circuit boards each mounted with the processing-memory system according to the invention.
  • Figure 6 shows a diagrammatic representation of a network of a plurality of the circuit boards and at least one multi-board system.
  • Figure 7 shows a diagrammatic representation of a plurality of circuit boards mounted on each multi-board system and a network of the plurality of multi-board systems.
  • FIG. 1 there is shown a block diagram simplification of a conventional computer system.
  • the system is typically realized on a motherboard.
  • the system includes a processing unit (1) within a microprocessor (2), system memory (3) and input/output control module (4).
  • the system memory includes random-access-memory (RAM) and it is normally located in separate chip package from the microprocessor.
  • the processing unit typically contains an arithmetic logic unit (ALU, not shown), control unit (flag registers, not shown) and others.
  • the input/output control module (4) acts as the controller for communication between the processor and input/output devices such as external storage (not shown), input systems such as a keyboard (not shown) and output devices (display, speaker etc., also not shown).
  • the components in the system interact with each other via busses (5) where data are transferred amongst various components in the system.
  • the processing unit is often known as the central processing unit as there is usually only one processing unit in a conventional computer system.
  • Figure 2 shows a simplified layout representation of a proposed computer architecture configured according to the embodiment of the present invention.
  • the processing-memory system comprises a plurality of processor-memory chips (13) electrically and functionally connected together via buses and the board level input/output and control module (11). Since a plurality of processors (to be further explained later) is involved in this invention, input/output and control modules are needed at different levels of the computer architecture (motherboard level, board level, chip level) to coordinate operations of the plurality of processors, such as to coordinate operations of the processor-memory cells (14) in the processor-memory chips (13). In the conventional computer system mentioned earlier, by contrast, only one processor is involved, so the input/output control module is directly controlled by the processor itself.
  • the board level input/output and control module (11) also provides control and connection to other input/output devices where such input/output devices may include hard disk (not shown), external storage (not shown), input devices such as keyboard (not shown), output devices (display, speaker etc, also not shown) and network devices (not shown) as typically known in the art via external connections (5a).
  • the processing-memory system (12) of the proposed invention comprises a multiplicity of processor-memory chips (13) arranged in an array and electrically/functionally connected to the board level input/output and control module (11).
  • the board level input/output and control module (11) coordinates input/output among the processor-memory chips and facilitates external input/output operations.
  • Within each of the processor-memory chips, there are arranged processor-memory cells (14, Figure 3).
  • Each of the processor-memory cells (14) has its local memory (15, Figure 4) and processing unit (16, Figure 4).
  • Figure 3 shows a simplified representation of a single processor-memory chip (13) that forms part of the chip array in the processing-memory system (12, Figure 2).
  • the processor-memory chip (13) comprises an array of such processor-memory cells (14).
  • the number of such chips within the motherboard and the number of processor-memory cells within the processor-memory chip may be set by the system designer, or according to requirements or subject to physical limits in the fabrication process and cost.
  • Figure 4 shows the representation of a single processor-memory cell (14) disclosed in Figure 3.
  • each of the processor-memory cells (14) comprises local memory (15) and a processing unit (16).
  • Such local memory may include a multiplicity of memory banks.
  • the processing unit includes an arithmetic logic unit (ALU, not shown), registers (not shown) and a control unit (also not shown).
  • Each processing unit (16) in the processor-memory cells (14) can read from and write to local memory banks (15) of other processor-memory cells directly or through a chip level input/output and control module (13a) on each processor-memory chip (13).
  • Explicit coordination through the chip level input/output and control module (13a) is required in the case of direct intercell read/write operation.
  • data and instructions can be passed through the chip level input/output and control module (13a) or a dedicated input/output channel, or by direct read/write to the memory bank (15) of a neighbouring cell with the coordination of the chip level input/output and control module.
  • the chip level input/output and control module (13a) coordinates input/output among the processor-memory cells and facilitates external input/output operations.
  • fault-tolerant computer operation may be realized as the operating system kernel and application programs may be executed on multiple processor-memory cells simultaneously. Should one processor-memory cell "crash" (i.e. experience irrecoverable functional failure), it can be re-booted by another processor-memory cell, thus making the system "un-crashable".
  • Execution of a computer program in the proposed system may be achieved via "spatial program execution" or a combination of conventional "temporal program execution" and "spatial program execution".
  • in conventional "temporal program execution", instructions are executed serially in temporal sequence
  • the instructions are distributed spatially in the array of the processor-memory cells.
  • the program instruction may be executed in one processor-memory cell and followed by the execution of the next instruction located in another processor-memory cell.
  • the combination of "temporal program execution" and "spatial program execution" would also produce substantial improvement over the prior art.
  • Such a new mode of program execution would generally require a new computer programming language and operating system. This is generally due to the fact that typical existing programming languages and operating systems are designed based on the principle of temporal execution.
  • Another benefit from the implementation of the embodiment of the present invention is improved production yield of processor-memory integrated circuits.
  • in wafer production of the integrated circuit of the present invention, a bad cell may be marked but the whole package may not necessarily be rejected. This is because the rest of the cells may still be usable. This is done by marking and bypassing the defective cells but passing the overall integrated circuit. It is envisaged that a new algorithm and marking scheme may be needed to cater for such purpose.
  • Non-volatile memory such as FLASH RAM may be included as permanent storage within the processor-memory chip.
  • Non-volatile memory is a semiconductor memory that retains its contents when power is switched off.
  • improvement with respect to data transfer speed may be realized as compared to a conventional system where permanent storage (the hard disk) is separated from the circuit board.
  • Mobile devices or robotic systems are some of the applications that could benefit from such flexibility.
  • the circuit board is further characterized in that an optional auxiliary memory (17) is arranged on the circuit board as shown in Figure 2.
  • the auxiliary memory is made of conventional Random Access Memory (RAM) or other similar chips and is electrically connected to each of the processor-memory cells via busses (5) for the expansion of memory capacity of the processor-memory cells based upon the computing needs imposed upon this new processor.
  • RAM Random Access Memory
  • the circuit boards (10) can be connected together by a network means through the board level input/output and control module (11) on each circuit board (10) for parallel processing of data and instructions as shown in Figure 5.
  • the invention can also be realized as a new multi-board system (20) as shown in Figures 6 and 7 in which a motherboard (18) is installed with a plurality of the new circuit boards (10) as disclosed earlier.
  • the circuit boards (10) are electrically connected to each other (10) via busses on the motherboard for parallel processing of data and instruction stored in processor-memory chips (13) on said circuit boards (10).
  • The respective board level and motherboard level input/output and control modules (11, 21) on each circuit board and on the motherboard will provide control and connection to other input/output devices connected to the multi-board system (20).
  • the motherboard level input/output and control module (21) coordinates input/output among the circuit boards (10) and facilitates external input/output operations.
  • These input/output devices may include hard disk (not shown), external storage (not shown), input devices such as keyboard (not shown), output devices (display, speaker etc, also not shown) and network devices (not shown) as typically known in the art.
  • a plurality of such multi-board systems (20) can also be electrically connected together by a network means as shown in Figure 7 for parallel processing of data and instructions stored in processor-memory chips (13) on circuit boards (10) installed in each multi-board system (20).
  • Another variation of the computing system would be a plurality of circuit boards (10) and a plurality of multi-board systems (20) connected by a network means to each other (10, 20), as shown in Figure 6, for parallel processing of data and instructions stored in processor-memory chips (13) on each of said circuit boards (10).

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multi Processors (AREA)

Abstract

An architecture for high performance computing comprises arrays of processor-memory chips (13) containing arrays of processor-memory cells (14). Each processor-memory cell has a processing unit (16) and a local memory (15). The arrays of processor-memory chips can be connected to other arrays through a common motherboard or by other networking means. Processing power is increased through the presence of a higher number of processing units, and the direct connection of memory units to their corresponding processing units reduces data transmission bottlenecks. Processor-memory cells found defective can also be bypassed, allowing processor-memory chips with faulty processor-memory cells to be considered acceptable for use.

Description

High Performance Computer Architecture
Field of the Invention
The present invention relates generally to the field of computer architecture and, more particularly, to processor-memory architecture for improved computational performance. More specifically, the present invention relates to a processor-memory architecture where an array of processor-memory chips is mounted on a circuit board, each processor-memory chip in turn comprises an array of processor-memory cells, and each processor-memory cell comprises its individual local memory bank and processing unit.
Background of the Invention
The advent of computers has brought many changes to the way we work, conduct business and live, and to many other areas of our lives. Computers, or at least parts of computers, exist in many of the electronic and electrical devices around us. Computers are used for a wide variety of purposes. They are used, for example, in data processing, billing, inventory control, data gathering, modeling of complex natural phenomena, content creation, games and other applications.
In conventional mainstream computer systems, such as the personal computer (PC), the microprocessor and the system memory are separately packaged as distinct integrated circuit packages or chips. The microprocessor and the system memory are connected electrically on a motherboard or a printed circuit board (PCB). Such separation has created a bottleneck in data transfer between the system memory and the processor, thus limiting the overall computational performance of such computer systems.
It is therefore an object of the present invention to provide solutions to the data transfer problem in conventional mainstream computer architecture. It is proposed that the computer system utilize a so-called "processor-memory" architecture where a circuit board used in the computer system includes, among others, an array of processor-memory chips, and within each processor-memory chip there is an array of processor-memory cells. Each processor-memory cell includes a processing unit and local memory. It is generally understood in the field that a processing unit typically includes an arithmetic logic unit (ALU), registers, control unit and other related components. Although not intended to be a limiting feature, the present invention may include a single processor-memory chip on a printed circuit board. Through the implementation of the proposed invention, two distinct advantages may be observed. First, the processing unit and the system memory are located within the same processor-memory chip so that the data transfer problem described above is overcome. Second, there will be a large number (typically more than 10) of ALUs within the processor-memory chip in the proposed architecture, as compared to present high-end systems where only a small number (typically fewer than 10) of ALUs are employed. With a higher number of ALUs, more processing power is advantageously offered.
Summary of the Invention
It is therefore an object of the present invention to provide an improved architecture where the data transfer rate between the processing unit and memory is significantly improved. The improved architecture is realized by having a processor-memory system of various configurations where each configuration includes a circuit board on which an array of processor-memory chips is mounted, and each chip comprises an array of processor-memory cells.
It is also another object of the present invention to provide such improved architecture where each of the processor-memory cells has local memory and a processing unit.
It is yet another object of the present invention to provide a program execution method where program instructions may be spatially executed.
Yet, it is another object of the present invention to provide a fault-tolerant system.
Yet, it is another object of the present invention to provide a method of achieving high yield wafer fabrication of processor-memory chips.
Yet, it is another object of the present invention to provide a processor-memory chip having a permanent storage medium, such as flash memory.
These and other objects of the present invention are accomplished by providing,
A circuit board (10) comprising: -
a board level input/output and control module (11) adapted for communication with an input/output device; and
a processing-memory system (12), said processing-memory system (12) having connection with said board level input/output and control module (11);
characterized in that :-
said processing-memory system (12) includes a plurality of processor-memory chips (13) arranged in an array, said processor-memory chips are functionally connected to said board level input/output and control module (11);
each of said processor-memory chips (13) having at least one processor-memory cell (14); and
said processor-memory cell (14) having local memory (15) and a processing unit (16).
Preferably, the instruction and data are read from the local memory and processed by the processing unit, and written back to the local memory. Preferably, the processing unit (16) carries out intercell read/write operations wherein the processing unit reads and writes instructions and data in the memory banks (15) of the neighbouring processor-memory cells.
Furthermore, the chip level input/output and control module (13a) facilitates the intercell read/write operation.
Also preferably, the instruction and processing are performed in parallel within the processor-memory cells.
A multi-board system (20) comprising:-
a motherboard (18); a motherboard level input/output and control module (21); and a plurality of circuit boards (10);
characterized in that :-
said circuit boards (10) are mounted on said motherboard (18) and are electrically connected to each other (10) through said motherboard level input/output and control module (21) for parallel processing of data and instruction stored in processor-memory chips (13) on said circuit boards (10).
A system comprising: -
a plurality of circuit boards (10); and a plurality of multi-board systems (20)
characterized in that:- said plurality of circuit boards (10) and said plurality of multi-board systems (20) are connected by a network means to each other (10, 20) for parallel processing of data and instructions stored in processor-memory chips (13) on said circuit boards (10).
The objects may be further accomplished by providing,
A program execution method for a circuit board (10) having a board level input/output and control module (11) and a processing-memory system (12);
characterized in that said program instructions are spatially executed by distributing the instructions spatially into an array of processor-memory cells (14), whereby said processing-memory system (12) has an array of processor-memory chips (13), each of said processor-memory chips having said array of processor-memory cells (14).
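As an illustration only, the spatial distribution of instructions can be sketched in Python. The round-robin placement, the representation of instructions as functions, and the names `distribute` and `execute_spatially` are assumptions made for this sketch; the source does not specify a placement scheme or an instruction format.

```python
def distribute(instructions, num_cells):
    """Place instruction i in cell i % num_cells.

    Round-robin placement is an assumption; the source only states that
    instructions are distributed spatially over the cell array.
    """
    cells = [[] for _ in range(num_cells)]
    for index, instruction in enumerate(instructions):
        cells[index % num_cells].append((index, instruction))
    return cells


def execute_spatially(cells, value):
    """Run the instructions in program order, hopping from cell to cell.

    Returns the final value and the sequence of cell ids that executed
    each instruction.
    """
    placed = [
        (index, cell_id, instruction)
        for cell_id, slots in enumerate(cells)
        for index, instruction in slots
    ]
    placed.sort(key=lambda item: item[0])  # restore program order
    trace = []
    for _, cell_id, instruction in placed:
        value = instruction(value)
        trace.append(cell_id)
    return value, trace
```

With three instructions placed over two cells, execution would alternate between the cells, each instruction being executed in the cell where it is stored.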
Yet, the object may also be accomplished by providing,
A method of providing fault-tolerant operation, characterized in that a program, instruction or the like is executed in a plurality of processor-memory cells (14) of a processing-memory system (12), said processing-memory system (12) having an array of processor-memory chips (13) mounted on a circuit board (10), each of said processor-memory chips having an array of said processor-memory cells (14).
The method of providing fault-tolerant operation is further characterized in that when any said processor-memory cell (14) experiences irrecoverable functional failure, said malfunctioning processor-memory cell is rebooted by another processor-memory cell (14).
Also, the object may be accomplished by providing, in wafer production of a processor-memory integrated circuit, a method characterized in that said integrated circuit has an array of processor-memory cells (14), each of said processor-memory cells (14) having a local memory (15) and a processing unit (16), and when any of said processor-memory cells is considered defective, said defective processor-memory cell is functionally bypassed or not used and said processor-memory chip is considered acceptable.
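A minimal sketch of the defect-bypass idea follows, written in Python. The boolean defect map, the `min_usable` threshold, and the function names are hypothetical; the source does not define a marking scheme or an acceptance criterion beyond bypassing defective cells.

```python
def usable_cells(defect_map):
    """Return the indices of cells not marked defective.

    defect_map is a list of booleans, True meaning "defective"
    (an assumed marking scheme, not one given in the source).
    """
    return [i for i, defective in enumerate(defect_map) if not defective]


def accept_chip(defect_map, min_usable=1):
    """Accept the chip if enough cells remain after bypassing bad ones.

    The min_usable threshold is an assumption for illustration.
    """
    return len(usable_cells(defect_map)) >= min_usable
```

Under these assumptions, a chip with one bad cell out of four would still pass with three usable cells, whereas a conventional single-processor die with the same defect would be rejected.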
And, the object may also be accomplished by providing,
A processor-memory integrated circuit, characterized in that a permanent storage medium is incorporated within said processor-memory integrated circuit.
Brief Description of the Drawings
The embodiments of the invention will now be described, by way of example only, with reference to the accompanying figures in which:
Figure 1 shows a block diagram representation of a conventional computer system.
Figure 2 shows a diagrammatic layout of a proposed computer system architecture of the present invention.
Figure 3 shows a diagrammatic representation of the processor-memory chip shown in Figure 2.
Figure 4 shows a diagrammatic representation of the processor-memory cell shown in Figure 3.
Figure 5 shows a diagrammatic representation of a network of a plurality of circuit boards each mounted with the processing-memory system according to the invention.
Figure 6 shows a diagrammatic representation of a network of a plurality of the circuit boards and at least one multi-board system.
Figure 7 shows a diagrammatic representation of a plurality of circuit boards mounted on each multi-board system and a network of the plurality of multi-board systems.
Detailed Description of the Preferred Embodiments
Referring to Figure 1, there is shown a simplified block diagram of a conventional computer system. The system is typically realized on a motherboard. In the most minimal configuration, the system includes a processing unit (1) within a microprocessor (2), system memory (3) and an input/output control module (4). Typically the system memory includes random-access memory (RAM) and is normally located in a separate chip package from the microprocessor. The processing unit (1) typically contains an arithmetic logic unit (ALU, not shown), a control unit (flag registers, not shown) and others. The input/output control module (4) acts as the controller for communication between the processor and input/output devices such as external storage (not shown), input systems such as a keyboard (not shown) and output devices (display, speaker etc., also not shown). The components in the system interact with each other via busses (5), over which data are transferred amongst the various components in the system. The processing unit is often known as the central processing unit as there is usually only one processing unit in a conventional computer system. As the system memory and microprocessor are located in separate packages, the connection between them becomes a bottleneck for data transfer. This could affect the performance of the computer system, even with the application of advanced computer instruction sets such as Reduced Instruction Set Computing (RISC), etc.
Reference is now made to Figures 2 to 7. Figure 2 shows a simplified layout representation of a proposed computer architecture configured according to the embodiment of the present invention. The embodiment is realized on a circuit board (10) having a board level input/output and control module (11) and a processing-memory system (12). The processing-memory system comprises a plurality of processor-memory chips (13) electrically and functionally connected together via buses and the board level input/output and control module (11). Since a plurality of processors (to be further explained later) is involved in this invention, input/output and control modules are needed at different levels of the computer architecture (motherboard level, board level, chip level) to coordinate operations of the plurality of processors, such as to coordinate operations of the processor-memory cells (14) in the processor-memory chips (13). In the conventional computer system mentioned earlier, by contrast, only one processor is involved, so the input/output control module is directly controlled by the processor itself.
The board level input/output and control module (11) also provides control and connection, via external connections (5a), to other input/output devices, which may include a hard disk (not shown), external storage (not shown), input devices such as a keyboard (not shown), output devices (display, speaker etc., also not shown) and network devices (not shown) as typically known in the art. The processing-memory system (12) of the proposed invention comprises a multiplicity of processor-memory chips (13) arranged in an array and electrically/functionally connected to the board level input/output and control module (11). The board level input/output and control module (11) coordinates input/output among the processor-memory chips and facilitates external input/output operations. Within each of the processor-memory chips, there are arranged processor-memory cells (14, Figure 3). Each of the processor-memory cells (14) has its local memory (15, Figure 4) and processing unit (16, Figure 4).
Figure 3 shows a simplified representation of a single processor-memory chip (13) that forms part of the chip array in the processing-memory system (12, Figure 2). As shown in this figure, the processor-memory chip (13) comprises an array of such processor-memory cells (14). The number of such chips within the motherboard and the number of processor-memory cells within the processor-memory chip may be set by the system designer, according to requirements, or subject to physical limits in the fabrication process and cost.
Figure 4 shows the representation of a single processor-memory cell (14) disclosed in Figure 3. As shown in the figure, each of the processor-memory cells (14) comprises local memory (15) and a processing unit (16). Such local memory may include a multiplicity of memory banks. The processing unit includes an arithmetic logic unit (ALU, not shown), registers (not shown) and a control unit (also not shown). Each processing unit (16) in the processor-memory cells (14) can read from and write to the local memory banks (15) of other processor-memory cells, either directly or through a chip level input/output and control module (13a) on each processor-memory chip (13). Explicit coordination through the chip level input/output and control module (13a) is required in the case of a direct intercell read/write operation. For such intercell read/write operations, data and instructions can be passed through the chip level input/output and control module (13a), through a dedicated input/output channel, or by direct read/write to the memory bank (15) of a neighbouring cell with the coordination of the chip level input/output and control module.
The chip level input/output and control module (13a) coordinates input/output among the processor-memory cells and facilitates external input/output operations.
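By way of illustration only, the cell, chip and board hierarchy and the coordinated intercell read/write described above may be modelled in software as follows. This is a minimal sketch and not the disclosed hardware: the class names, the method names `intercell_write` and `intercell_read`, and the use of a dictionary as local memory are all assumptions introduced for this example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Cell:
    """Models a processor-memory cell (14): local memory (15) owned by one processing unit (16)."""
    local_memory: Dict[int, Any] = field(default_factory=dict)  # address -> value


@dataclass
class Chip:
    """Models a processor-memory chip (13) holding an array of cells."""
    cells: List[Cell]

    def _check(self, src: int, dst: int) -> None:
        # Stand-in for the coordination done by the chip level
        # input/output and control module (13a); the rule is hypothetical.
        if src == dst:
            raise ValueError("an intercell operation needs two distinct cells")

    def intercell_write(self, src: int, dst: int, addr: int, value: Any) -> None:
        """Cell `src` writes `value` into the local memory of cell `dst`."""
        self._check(src, dst)
        self.cells[dst].local_memory[addr] = value

    def intercell_read(self, src: int, dst: int, addr: int) -> Any:
        """Cell `src` reads from the local memory of cell `dst`."""
        self._check(src, dst)
        return self.cells[dst].local_memory[addr]


def make_chip(n_cells: int) -> Chip:
    """Build one chip with `n_cells` empty cells."""
    return Chip(cells=[Cell() for _ in range(n_cells)])


def make_board(n_chips: int, n_cells: int) -> List[Chip]:
    """Build a board-level array of chips (the processing-memory system (12))."""
    return [make_chip(n_cells) for _ in range(n_chips)]
```

In this sketch every intercell access passes through the chip object, which mirrors the requirement that such operations be coordinated by the chip level module; a real device could equally use a dedicated channel, as the description notes.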
With the implementation of the computer system architecture as proposed by the present invention, various improvements may be obtained apart from the substantial reduction of data transfer bottlenecks as mentioned earlier. For example, fault-tolerant computer operation may be realized, as the operating system kernel and application programs may be executed on multiple processor-memory cells simultaneously. Should one processor-memory cell "crash" (i.e. experience an irrecoverable functional failure), it can be re-booted by another processor-memory cell, thus making the system "un-crashable".
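The re-boot of a failed cell by a healthy one may be pictured with the following sketch. It is an illustration under stated assumptions rather than the disclosed mechanism: the `Cell` class, the `crash`, `reboot` and `recover_all` names, and the idea of restoring a cell by copying the rescuer's memory image are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Cell:
    """Hypothetical model of a processor-memory cell running a replica of the kernel."""
    cell_id: int
    alive: bool = True
    memory: Dict[str, str] = field(default_factory=dict)

    def crash(self) -> None:
        """Simulate an irrecoverable functional failure of this cell."""
        self.alive = False
        self.memory.clear()


def reboot(rescuer: Cell, failed: Cell) -> None:
    """Restore `failed` from the memory image held by the healthy `rescuer`."""
    if not rescuer.alive:
        raise RuntimeError("rescuer cell is itself not running")
    failed.memory = dict(rescuer.memory)  # copy, so the cells stay independent
    failed.alive = True


def recover_all(cells: List[Cell]) -> int:
    """Reboot every crashed cell from the first healthy cell; return the number rebooted."""
    healthy = next((c for c in cells if c.alive), None)
    if healthy is None:
        raise RuntimeError("no healthy cell left to perform a reboot")
    rebooted = 0
    for cell in cells:
        if not cell.alive:
            reboot(healthy, cell)
            rebooted += 1
    return rebooted
```

The sketch also makes one limit explicit: recovery presupposes that at least one cell is still running, so the scheme tolerates the failure of individual cells rather than the simultaneous failure of all of them.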
Execution of a computer program in the proposed system may be achieved via "spatial program execution" or a combination of conventional "temporal program execution" and "spatial program execution". In conventional "temporal program execution", instructions are executed serially in temporal sequence, whereas in the new "spatial program execution" scheme, the instructions are distributed spatially in the array of processor-memory cells. A program instruction may be executed in one processor-memory cell, followed by the execution of the next instruction located in another processor-memory cell. The combination of "temporal program execution" and "spatial program execution" would also produce substantial improvement over the prior art. Such a new mode of program execution would generally require a new computer programming language and operating system, because typical existing programming languages and operating systems are designed on the principle of temporal execution.
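The distinction between the two execution modes may be illustrated by the following sketch, which is conceptual only. It assumes a round-robin placement of instructions over the cells and models each instruction as a function on a single integer value; neither assumption comes from the disclosure, and the function names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Instruction = Callable[[int], int]


def run_temporal(program: List[Instruction], value: int) -> int:
    """Temporal execution: one processor runs every instruction serially."""
    for instr in program:
        value = instr(value)
    return value


def distribute(program: List[Instruction], n_cells: int) -> List[Dict[int, Instruction]]:
    """Place instruction i in cell i % n_cells (round-robin placement is an assumption)."""
    cells: List[Dict[int, Instruction]] = [dict() for _ in range(n_cells)]
    for i, instr in enumerate(program):
        cells[i % n_cells][i] = instr
    return cells


def run_spatial(cells: List[Dict[int, Instruction]], n_instr: int,
                value: int) -> Tuple[int, List[int]]:
    """Spatial execution: each instruction runs in the cell that stores it.

    Returns the final value and the sequence of cell indices visited.
    """
    trace: List[int] = []
    for i in range(n_instr):
        cell_index = i % len(cells)
        value = cells[cell_index][i](value)  # executed where it is stored
        trace.append(cell_index)
    return value, trace
```

For a three-instruction program placed on two cells, control visits cell 0, then cell 1, then cell 0 again, and the result equals that of serial execution; only the location of each step differs.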
Another benefit from the implementation of the embodiment of the present invention is improved production yield of the processor-memory integrated circuit. In wafer production of the integrated circuit of the present invention, a bad cell may be marked without the whole package necessarily being rejected, because the rest of the cells may still be usable. This is done by marking and bypassing the defective cells while passing the overall integrated circuit. It is envisaged that a new algorithm and marking scheme may be needed to cater for such a purpose.
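One possible form of such a marking scheme is sketched below, purely as an illustration. The acceptance threshold `min_good` is an assumption introduced here (the description does not state how many working cells a chip must retain), as are the function names.

```python
from typing import List, Tuple


def usable_cells(test_results: List[bool]) -> List[int]:
    """Return the indices of cells that passed wafer test (True means passed)."""
    return [index for index, passed in enumerate(test_results) if passed]


def accept_chip(test_results: List[bool], min_good: int) -> Tuple[bool, List[int]]:
    """Accept a chip if enough cells work, and list the cells left in service.

    Defective cells are simply absent from the returned list, which stands
    in for marking them so that they are bypassed at run time.
    """
    good = usable_cells(test_results)
    return len(good) >= min_good, good
```

For instance, a four-cell chip with one defective cell would be accepted under a threshold of three working cells, and later stages would address only the three cells in the returned list.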
Non-volatile memory such as FLASH RAM may be included as permanent storage within the processor-memory chip. Non-volatile memory is a semiconductor memory that retains its contents when power is switched off. By incorporating non-volatile memory within the processor-memory chip, an improvement in data transfer speed may be realized compared to a conventional system where permanent storage (the hard disk) is separate from the circuit board. Mobile devices and robotic systems are some of the applications that could benefit from such flexibility.
The circuit board is further characterized in that an optional auxiliary memory (17) is arranged on the circuit board as shown in Figure 2. The auxiliary memory is made of conventional Random Access Memory (RAM) or other similar chips and is electrically connected to each of the processor-memory cells via busses (5) for the expansion of memory capacity of the processor-memory cells based upon the computing needs imposed upon this new processor.
Furthermore, the circuit boards (10) can be connected together by a network means through the board level input/output and control module (11) on each circuit board (10) for parallel processing of data and instructions as shown in Figure 5. The invention can also be realized as a new multi-board system (20) as shown in Figures 6 and 7, in which a motherboard (18) is installed with a plurality of the new circuit boards (10) as disclosed earlier. On a single multi-board system (20), the circuit boards (10) are electrically connected to each other via busses on the motherboard for parallel processing of data and instructions stored in the processor-memory chips (13) on said circuit boards (10). The respective board level and motherboard level input/output and control modules (11, 21) on each circuit board and on the motherboard provide control and connection to other input/output devices connected to the multi-board system (20). The motherboard level input/output and control module (21) coordinates input/output among the circuit boards (10) and facilitates external input/output operations. These input/output devices may include a hard disk (not shown), external storage (not shown), input devices such as a keyboard (not shown), output devices (display, speaker etc., also not shown) and network devices (not shown) as typically known in the art.
A plurality of such multi-board systems (20) can also be electrically connected together by a network means as shown in Figure 7 for parallel processing of data and instructions stored in the processor-memory chips (13) on the circuit boards (10) installed in each multi-board system (20). Another variation of the computing system is one in which a plurality of circuit boards (10) and a plurality of multi-board systems (20) are connected to each other (10, 20) by a network means as shown in Figure 6 for parallel processing of data and instructions stored in the processor-memory chips (13) on each of said circuit boards (10).
It is believed that the features of the present invention may be incorporated into many electronic devices, computers or the like. Such devices may include smart phones, PDAs, robotic systems and so on. While the preferred embodiments of the present invention have been described, it should be understood that various changes, adaptations and modifications may be made thereto. It should be understood, therefore, that the invention is not limited to the details of the illustrated invention shown in the figures and that variations in such minor details will be apparent to one skilled in the art.

Claims

1. A circuit board (10) comprising:-
a board level input/output and control module (11) adapted for communication with an input/output device; and
a processing-memory system (12), said processing-memory system (12) having connection with said board level input/output and control module (11);
characterized in that :-
said processing-memory system (12) includes a plurality of processor-memory chips (13) arranged in an array, said processor-memory chips are functionally connected to said board level input/output and control module (11);
each of said processor-memory chips (13) having at least one processor-memory cell (14) and chip level input/output and control module (13a), wherein said processor-memory cells are functionally connected to said chip level input/output and control module (13a); and
said processor-memory cell (14) having local memory (15) and a processing unit (16).
2. A circuit board as claimed in claim 1, further characterized in that each of said processor-memory chips (13) having an array of said processor-memory cell (14).
3. A circuit board as claimed in any one of claims 1 or 2, further characterized in that instruction and data are read from said local memory (15) in said processor-memory cell to be processed by said processing unit (16) in the same said processor-memory cell and written back into the local memory (15).
4. A circuit board as claimed in claim 3, further characterized in that execution of said instruction and processing of said data are performed in parallel within said processor-memory cells (14).
5. A circuit board as claimed in any one of claims 1 or 2, further characterized in that said processing unit (16) carries out intercell read/write operations wherein said processing unit reads and writes instructions and data in said memory banks (15) of neighbouring said processor-memory cells.
6. A circuit board as claimed in claim 5, wherein said intercell read/write operation is facilitated by said chip level input/output and control module.
7. A circuit board as claimed in claim 1, further characterized in that an auxiliary memory (17) is arranged on said circuit board, said auxiliary memory electrically connected to each of the processor-memory cells.
8. A plurality of the circuit board (10) as claimed in claim 1, characterized in that said circuit boards are connected by a network means for parallel processing of data and instructions stored in processor-memory chips (13) on said circuit boards (10).
9. A multi-board system (20) comprising:-
a motherboard (18); a motherboard level input/output and control module (21); and a plurality of circuit boards (10) as claimed in claims 1 to 5;
characterized in that :-
said circuit boards (10) are mounted on said motherboard (18) and are electrically connected to each other (10) through said motherboard level input/output and control module (21) for parallel processing of data and instruction stored in processor-memory chips (13) on said circuit boards (10).
10. A plurality of multi-board systems (20) as claimed in claim 7 characterized in that said multi-boards are connected by a network means for parallel processing of data and instructions stored in processor-memory chips (13) on said circuit boards (10) installed in each said multi-board system (20).
11. A system comprising:-
a plurality of circuit boards (10) as claimed in claims 1 to 8; and a plurality of multi-board system (20) as claimed in claims 9 to 10 characterized in that:- said plurality of circuit boards (10) and said plurality of multi-board systems (20) are connected by a network means to each other (10, 20) for parallel processing of data and instructions stored in processor-memory chips (13) on said circuit boards (10).
12. A program execution method for a circuit board (10) having a board level input/output and control module (11) and a processing-memory system (12);
characterized in that said program instructions are spatially executed by distributing the instructions spatially into an array of processor-memory cells (14), whereby said processing-memory system (12) having an array of processor-memory chips (13), each of said processor-memory chips having said array of processor-memory cells (14).
13. A program execution method as claimed in claim 12, further characterized in that said spatial execution is where an instruction of the program is executed in one processor-memory cell and the next instruction of the program is executed in another processor-memory cell.
14. A program execution method as claimed in claim 12, further characterized in that said spatial execution is performed within a single chip.
15. A program execution method as claimed in claim 12, further characterized in that said program execution is performed via a combination of said spatial execution and a temporal execution whereby in said temporal execution, instructions are executed serially.
16. A method of providing fault-tolerant operation; characterized in that a program, instruction or the like is executed in a plurality of processor-memory cells (14) of a processing-memory system (12), said processing-memory system (12) having an array of processor-memory chips (13) mounted on a circuit board (10), each of said processor-memory chips having an array of said processor-memory cells (14).
17. The method of providing fault-tolerant operation as claimed in claim 16 further characterized in that when any said processor-memory cell (14) experiences irrecoverable functional failure, said malfunctioned processor-memory cell is rebooted by another processor-memory cell (14).
18. A method of increasing yield in the wafer production of processor-memory integrated circuit, characterized in that said integrated circuit having an array of processor-memory cells (14), each of said processor-memory cells (14) having a local memory (15) and a processing unit (16) and when any of said processor-memory cell is considered defective, said defective processor-memory cell is functionally bypassed or not used and said processor-memory chip is considered acceptable.
19. A processor-memory integrated circuit as claimed in claims 1 to 18, characterized in that a permanent storage medium is incorporated within said processor-memory integrated circuit.
PCT/SG2006/000194 2005-07-14 2006-07-12 High performance computer architecture WO2007008178A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
MYPI20053217 2005-07-14

Publications (1)

Publication Number Publication Date
WO2007008178A1 (en) 2007-01-18

Family

ID=37637432

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/SG2006/000194 WO2007008178A1 (en) 2005-07-14 2006-07-12 High performance computer architecture

Country Status (2)

Country Link
TW (1) TW200712918A (en)
WO (1) WO2007008178A1 (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040015735A1 (en) * 1994-03-22 2004-01-22 Norman Richard S. Fault tolerant cell array architecture

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
"PowerHawk Series 700 Multiprocessing Systems", CONCURRENT COMPUTER CORPORATION, December 2001 (2001-12-01), XP003005938, Retrieved from the Internet <URL:http://www.ccur.com/isddocs/cpb-hw-Power%20Hawk%20700.pdf> *
"Technology@Intel Magazine", INTEL, May 2005 (2005-05-01), XP003005941, Retrieved from the Internet <URL:http://www.intel.com/technology/magazine/computing/Dual-core-0505.pdf> *
"The AMD Opteron Processor for Servers and Workstations", AMD, 13 April 2005 (2005-04-13), Retrieved from the Internet <URL:http://www.multicore.amd.com/en/Products/AMD_Opteron_Overview.pdf> *
SUDHARSANAN S.: "MAJC-5200: A High Performance Microprocessor for Multimedia Computing", INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM, 1 May 2000 (2000-05-01) - 5 May 2000 (2000-05-05), XP003005940, Retrieved from the Internet <URL:http://www.ipdps.cc.gatech.edu/2000/pdvim/1800164.pdf> *
TENDLER ET AL.: "POWER4 System Microarchitecture", October 2001 (2001-10-01), XP003005939, Retrieved from the Internet <URL:http://www.03.ibm.com/servers/eserver/pseries/hardware/whitepapers/power4.pdf> *

Also Published As

Publication number Publication date
TW200712918A (en) 2007-04-01

Similar Documents

Publication Publication Date Title
CN102292778B (en) Memory devices and methods for managing error regions
US20180231605A1 (en) Configurable Vertical Integration
EP0752132B1 (en) Cell-based defect tolerant architecture with beneficial use of unassigned spare cells
US20150106574A1 (en) Performing Processing Operations for Memory Circuits using a Hierarchical Arrangement of Processing Circuits
US8611123B2 (en) Complex semiconductor device for use in mobile equipment
US10747703B2 (en) Memory with alternative command interfaces
US9767858B2 (en) Register files including distributed capacitor circuit blocks
US20180276161A1 (en) PCIe VIRTUAL SWITCHES AND AN OPERATING METHOD THEREOF
EP0973099A2 (en) Parallel data processor
US6266735B1 (en) Information processing system
KR101077285B1 (en) Processor surrogate for use in multiprocessor systems and multiprocessor system using same
CN117852600B (en) Artificial intelligence chip, method of operating the same, and machine-readable storage medium
US7882479B2 (en) Method and apparatus for implementing redundant memory access using multiple controllers on the same bank of memory
CN114761936A (en) Memory module with computing power
US8001310B2 (en) Scalable computer node having an expansion module that is socket-compatible with a central processing unit
CN112071352B (en) Method, circuit, storage medium and terminal for reducing read current of nonvolatile flash memory
WO2007008178A1 (en) High performance computer architecture
CN115915577A (en) Techniques for protecting inductors on circuit boards
US5835505A (en) Semiconductor integrated circuit and system incorporating the same
JP7517772B2 (en) Enhanced performance computing system, semiconductor substrate, computer program, and method for operating an enhanced performance computing system - Patents.com
JP2003316571A (en) Parallel processor
US20230418604A1 (en) Reconfigurable vector processing in a memory
CN112486904A (en) Register file design method and device for reconfigurable processing unit array
US20070118672A1 (en) Redundant link mezzanine daughter card
US20230315334A1 (en) Providing fine grain access to package memory

Legal Events

Date Code Title Description
DPE2 Request for preliminary examination filed before expiration of 19th month from priority date (pct application filed from 20040101)
121 Ep: the epo has been informed by wipo that ep was designated in this application
NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: COMMUNICATION UNDER RULE 112(1) EPC, EPO FORM 1205A DATED 20/05/08.

122 Ep: pct application non-entry in european phase

Ref document number: 06769677

Country of ref document: EP

Kind code of ref document: A1