GB2571297A - Computing element - Google Patents
- Publication number
- GB2571297A GB2571297A GB1802913.2A GB201802913A GB2571297A GB 2571297 A GB2571297 A GB 2571297A GB 201802913 A GB201802913 A GB 201802913A GB 2571297 A GB2571297 A GB 2571297A
- Authority
- GB
- United Kingdom
- Prior art keywords
- processor
- interfaces
- application specific
- interface
- computing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
- G06F15/163—Interprocessor communication
- G06F15/17—Interprocessor communication using an input/output type connection, e.g. channel, I/O port
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored program computers
- G06F15/78—Architectures of general purpose stored program computers comprising a single central processing unit
- G06F15/7807—System on chip, i.e. computer system on a single chip; System in package, i.e. computer system on one or more chips in a single package
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/14—Handling requests for interconnection or transfer
- G06F13/20—Handling requests for interconnection or transfer for access to input/output bus
Abstract
A computing element comprises an application specific processor 7 (ASP; for example a mobile phone processor or a processor that does not comprise a general purpose native bus interface) having a plurality of heterogeneous application specific input/output interfaces, for example modem 24, Flash memory 25, network, solid-state drive (SSD) 26, WiFi® 27, screen 28, processor native memory, general-purpose peripheral, or processor aware memory interfaces. One or more of the plurality of heterogeneous application specific i/o interfaces are repurposed to form a data link 30 between the ASP and a switch fabric. The repurposed interfaces may be arranged to provide native memory access through a processor-aware address scheme with the switch fabric. Two or more repurposed interfaces may be joined together to form a single data link 30. A computing node may comprise a plurality of computing elements, a storage element, a network port, and a switch fabric configured to couple the computing elements, at least one storage element and at least one network port to each other.
Description
COMPUTING ELEMENT [0001] The present application relates to a computing element and a method for providing a computing element.
Background [0002] Traditionally a computing node is defined as one of two classes of computing element. The first class is a computing element with specific processing capability, a local memory, and application specific input/output (IO) interfaces, with the local memory and interfaces being controlled, and dedicated for use by, the computing element, and is generally referred to as an Application Specific Processor (ASP). The second class differs in that the computing element has limited application specific IO interfaces but has an additional general purpose IO interface, and is generally referred to as a General Purpose Processor (GPP). Typically, the processing, memory and interface resources assigned to the computing node are selected so as to be able to manage the tasks envisaged for the computing node. Such a computing node may for example be used as an independently capable server.
[0003] Typically, a computing element is an application specific processor (ASP) such as an application specific integrated circuit (ASIC) specifically designed to provide the processing and other functionality required by the computing node. In some examples the processing element may be a general purpose processor configured to provide the processing and other functionality required by the computing node through the general purpose I/O interface.
[0004] A problem with computing elements using a general purpose IO interface is that although this can be general purpose, and so in principle is useable for any specific use, in practice it will often be found that a desired specific use will exceed the capability of the general purpose IO interface, so that this interface will cause a bottleneck for communication to and from the computing element. The current solution to this problem is to integrate a specific IO interface that provides the required capability with the processor as an ASIC.
[0005] Problems with the ASIC approach are that it may be time consuming and costly to design and manufacture an ASP providing the desired functionality, and the inclusion of a specific IO interface makes the ASP unsuitable for general purpose use. Further, a general purpose IO interface on a processor may be relatively inefficient at providing the desired functionality, and may have significant unused functionality, resulting in the processor being unnecessarily large and costly to address the needs of a specific application.
[0006] Another problem with the traditional general purpose computing node architecture is that the resources which can be assigned to a single node may be limited by what is technically possible and/or by cost. For example, improving the processing performance of the computing node by using a faster processor is limited by the available fabrication technology and consequential density limits, and by the high cost of high performance processors. Further, increasing the computing capability by improving processor performance without also updating the general purpose IO capability may not result in any useable improvement for specific applications, for example due to bottlenecks, potentially worsening the problem of the processor being unnecessarily large and costly to address the needs of a specific application.
[0007] Another problem with the traditional ASP based computing node architecture is that the resources available to the node are fixed. As a result, if the computing node is assigned a task requiring a different balance of resources from those available to the ASIC the efficiency of the computing node may be relatively low. For example a computing node assigned a memory intensive task may find that the rate at which the task can be carried out is constrained by the amount of memory and memory access bandwidth available to the computing node, while other resources available to the node, such as the processor, are underutilized.
[0008] The embodiments described below are not limited to implementations which solve any or all of the disadvantages of the known approaches described above.
Summary [0009] This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
[0010] A system and method are provided for using the compute elements of an ASIC in a manner which repurposes available application specific IO interfaces as repurposed IO, so as to enable application specific use cases that were outside the target design of the ASP, with a capability that can exceed that of any general purpose IO interface optionally available on the device.
[0011] In a first aspect, the present disclosure provides a computing element comprising an application specific processor 'ASP' having a plurality of heterogeneous application specific input/output 'IO' interfaces; wherein one or more of the plurality of heterogeneous application specific IO interfaces are repurposed to form a data link between the ASP and a switch fabric.
[0012] In a second aspect, the present disclosure provides a computing node comprising: a plurality of computing elements according to the first aspect; at least one storage element; at least one network port; and a switch fabric configured to couple the plurality of computing elements, the at least one storage element and the at least one network port to each other.
[0013] In a third aspect, the present disclosure provides a method of providing a computing element, the method comprising: obtaining an application specific processor 'ASP' having a plurality of heterogeneous application specific input/output 'IO' interfaces; and repurposing one or more of the plurality of heterogeneous application specific IO interfaces to form a data link between the ASP and a switch fabric.
[0014] The preferred features may be combined as appropriate, as would be apparent to a skilled person, and may be combined with any of the aspects of the invention.
Brief Description of the Drawings [0015] Embodiments of the invention will be described, by way of example, with reference to the following drawings, in which:
[0016] Figure 1 is an explanatory diagram of a computing node according to a first embodiment;
[0017] Figure 2 is an explanatory diagram of a processor useable in the computing node of figure 1;
[0018] Figure 3 is an explanatory diagram of a computing element of the computing node of figure 1; and
[0019] Figure 4 is an explanatory diagram of a computer system according to a second embodiment.
[0020] Common reference numerals are used throughout the figures to indicate similar features.
Detailed Description [0021] Embodiments of the present invention are described below by way of example only. These examples represent the best ways of putting the invention into practice that are currently known to the Applicant, although they are not the only ways in which this could be achieved. The description sets forth the functions of the example and the sequence of steps for constructing and operating the example. However, the same or equivalent functions and sequences may be accomplished by different examples.
[0022] Figure 1 shows a diagrammatic illustration of a computing node 1 according to a first embodiment of the present invention.
[0023] In the illustrated example the computing node 1 comprises a group of four processing elements 2a to 2d, an attached switch fabric 3, a data storage element 4 and a network port 5. The group of processing elements 2, switch fabric 3, storage element 4 and network port 5 are physical resource elements mounted on a common physical substrate 6, such as a PCB, and in combination provide the computing node 1 with desired application specific capabilities.
[0024] The switch fabric 3 provides an interconnection between the various elements within the computing node, and bridges to an external interface or interfaces. This can be defined as a crossbar switching matrix that allows each end point of the switch fabric a full bandwidth connection to any other end point without different end points sharing bandwidth, or by any other interconnection scheme capable of transferring data between the different elements of the computing node. Preferably, signal traffic between two end points of the switch fabric does not reduce the bandwidth available for signal traffic between other end points.
[0025] Each processing element 2 comprises a repurposed computing element 7 and an associated local memory element 8, and optionally may further comprise a non-volatile cache element 9. The computing element 7 is a repurposed application specific integrated circuit (ASIC) comprising a plurality of application specific input/output (IO) interfaces, which comprises at least a processor. Each local memory element 8 may be a dynamic random access memory (DRAM) controlled by the associated computing element 7. Each non-volatile cache element 9 may be a flash memory element. In the illustrated example each of the computing elements 7 is an identical mobile phone processor, such as a Samsung Exynos 7420 processor. It will be understood that mobile phone processors are examples of an application specific processor (ASP) implemented using ASIC techniques; they are designed to operate alone to provide communications functionality in devices, and are not designed to be used collectively in groups, or to be used for general processing tasks, or to be partitioned and/or repurposed for other application specific uses.
[0026] The use of mobile phone processors as the processors comprising each computing element 7 may be advantageous because mobile phone processors are high performance processors with well mapped and assessed capabilities and are readily available at a relatively low cost because they are produced in very large numbers, resulting in economies of scale.
[0027] The computing elements 7 may comprise multi-core processors having two or more independent central processing units (CPUs), commonly referred to as cores.
[0028] The switch fabric 3 is arranged to interconnect together the various components of the computing node 1 in either a fixed or a reconfigurable manner using a processor native addressing scheme or a processor aware addressing scheme between the processor memory interface and the memory.
[0029] All processors using the Von-Neumann model of a processor implement one or more interfaces to address IO units and memory locations; these interfaces are natively controlled by the processor hardware. Any such processor therefore natively addresses a memory location using its native addressing scheme, referred to as the Processor Native Addressing Scheme. It is also possible to include a bridge between the processor and memory native interfaces so long as the bridge is aware of the requirements of the processor native addressing scheme. Such bridged processor native addressing schemes are referred to as a Processor Aware Addressing Scheme. For example, the ARM AMBA interconnect architecture specification provides the definition of the native interfaces supported by ARM processors. For clarity, both the native and processor aware addressing schemes do not require software to encode the addressing scheme. In contrast, an example of an addressing scheme which is neither natively defined nor processor aware would be the TCP/IP addressing scheme used by the Internet.
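The distinction drawn in [0029] can be illustrated with a minimal model (a hypothetical Python sketch, not part of the patent): a bridge is "processor aware" if it honours the processor's native addresses in hardware, without software re-encoding them into a scheme such as TCP/IP.

```python
class Memory:
    """A flat memory addressed by the processor's native scheme."""
    def __init__(self, size):
        self.cells = bytearray(size)

    def read(self, addr):
        return self.cells[addr]

    def write(self, addr, value):
        self.cells[addr] = value


class AwareBridge:
    """A processor-aware bridge: it forwards native reads/writes using a
    fixed mapping known to the bridge hardware, so software never has to
    encode addresses into a different scheme."""
    def __init__(self, memory, base=0):
        self.memory = memory
        self.base = base  # fixed offset applied by the bridge

    def read(self, addr):
        return self.memory.read(self.base + addr)

    def write(self, addr, value):
        self.memory.write(self.base + addr, value)


# The processor addresses memory identically whether the access is
# direct (native) or via the bridge (processor aware).
ram = Memory(256)
bridge = AwareBridge(ram, base=128)
bridge.write(5, 42)          # processor uses its native address 5
assert ram.read(133) == 42   # the bridge mapped it transparently
```

A network scheme such as TCP/IP, by contrast, would require software on the processor to serialise the address and data into packets, which is exactly what the native and processor aware schemes avoid.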
[0030] The switch fabric 3 in the illustrated example of figure 1 connects the processing elements 2 to the storage element 4 and the network port 5, as indicated in figure 1 by the interconnecting arrows. The switch fabric 3 bridges all of the processing elements 2 and provides extended bus links from the processing elements 2 and their constituent computing elements 7 to the network port 5. Accordingly, the switch fabric 3 can be bridged to an external network through the network port 5. The network port 5 can act as a native network port into all of the resources of the computing elements 7 using IO interfaces of the computing elements 7 interconnected with the switch fabric 3. A native network port is a port from the switch fabric which bridges between the processor native or aware address scheme used by the switch fabric and the address scheme used by a network.
[0031] In the illustrated example of figure 1 the native network port 5 is an Ethernet port. Accordingly, the switch fabric 3 can provide the processing elements 2a to 2d with high speed Ethernet connections and provide a storage area network interconnecting the processing elements 2a to 2d with the storage element 4. In other examples the network port 5 may provide other types of network connection and/or internet connection instead of, or in addition to, Ethernet connection.
[0032] The switch fabric 3 is formed by fixed or reconfigurable switching hardware and may comprise one or more field programmable gate arrays (FPGAs). The reconfigurable switching hardware of the switch fabric 3 is reconfigured under the control of driver software operating on the CPUs 20 of the processors 7 of the computing node 1.
[0033] The storage element 4 may be a solid-state drive (SSD) mounted to a bridge between the switch fabric and an SSD slot of the physical substrate 6.
[0034] The specifications of the different components of the computing node 1 may be selected to provide a desired balanced ratio of system resources, such as processing, memory, storage and networking system resources.
[0035] In alternative examples the computing node 1 may include additional or alternative physical resource elements providing different system resources, such as accelerators, or other components used in computer facilities such as servers.
[0036] The computing node 1 may comprise further components such as power supply and management components, cooling systems, heat sinks, etc. These are not shown in figure 1, to improve clarity.
[0037] The switch fabric will include interface bridges to connect to the computing elements 7 and to other application specific interfaces such as to Ethernet and storage.
[0038] The switch fabric 3 of the computing node 1 using a processor native or processor aware addressing scheme enables physical disaggregation of the physical resource elements of the computing node 1 mounted on the physical substrate 6, allowing the computing resources provided by the physical resource elements to be treated as a plurality of operationally independent resource element types, with each resource element type being composed of a pool of resource elements. The physical resource elements being physically disaggregated means that the different physical resource elements are not directly connected to either the ASP or the GPP in an instance of the computing node, but are located together in groups of physical resource elements defined by the physical location or arrangement of the different physical resource elements. For example, the resource elements may be regarded as separate pools of processor elements, memory elements, persistent storage elements, and networking elements. These resource elements from the different pools can then be grouped and combined together as required for specific tasks by the switch fabric 3, regardless of their physical locations, to create the I/O interfaces available to a compute element.
[0039] The plurality of pools of resource elements can also be regarded as directly attached to either the ASP or GPP and then being expressed in a single plane of disaggregated logical resources. The single disaggregated logical resource plane formed by the computing node provides resource pools of processor elements, memory elements, persistent storage elements, and networking elements. Elements can be selected from these resource pools and grouped together as required for specific tasks.
[0040] Figure 2 shows a schematic illustration of a mobile phone processor 7, which may be used as one of the computing elements 7a to 7d of the computing node 1 in the example of figure 1.
[0041] The mobile phone processor 7 comprises a central processing unit (CPU) 20, accelerators 21, and graphics processing 22, together with an interface 23 providing an interface to access and control the local memory element 8 and the non-volatile cache element 9 associated with the processor 7.
[0042] The mobile phone processor 7 further comprises a plurality of different application specific IO interfaces. Specifically, in the illustrated example the IO interfaces comprise a modem interface 24, a flash memory interface 25, an SSD interface 26, a WiFi interface 27, and a screen interface 28.
[0043] It should be understood that figure 2 is a schematic illustration to show the different functionalities of the processor 7 and is not intended to illustrate the actual physical arrangement of the different functions in the processor 7.
[0044] Figure 3 shows a more detailed schematic illustration of one of the processing elements 2 of the example of figure 1.
[0045] As discussed previously, the processing element 2 comprises a mobile phone processor 7, a local memory element 8, and optionally a non-volatile cache element 9. The processing element 2 is connected either directly to the switch fabric 3, or to corresponding interface bridges of the switch fabric 3, by a data connection 30. The switch fabric 3 and data connection 30 are connected to the processor 7; the local memory element 8 and the non-volatile cache element 9 are connected to the processor 7, and are not necessarily connected directly to the switch fabric 3 or the data connection 30.
[0046] In the illustrated example the mobile phone processor 7 is connected to the switch fabric 3 using two of the application specific IO interfaces of the processor 7, which are repurposed by software to become repurposed IO interfaces and used differently to their originally intended interface functions to provide a data link between the processor 7 and the switch fabric 3. In the illustrated example the modem interface 24 and the SSD interface 26 of the processor 7 which provide the data link are connected to the switch fabric 3 by the data connection 30. The modem interface 24 may, for example, be a 4G modem interface. It will be understood that although the data connection 30 is shown as a single arrow this is a diagrammatic representation and it may comprise plural physical conductors. The mobile phone processor 7 does not have any native bus interface.
[0047] The 4G modem interface 24 and the SSD interface 26 of the processor 7 are different interfaces, each of which can support a high bandwidth, or in other words, high speed, data link and provide native (that is, driverless) memory access which can directly address memory owned by, or assigned to, the processor 7. Accordingly, by repurposing and aggregating the capability of these two interfaces of the processor 7 and using their combined bandwidth to provide the data link connecting the processor 7 to the switch fabric 3 a sufficiently large bandwidth and low latency data link can be provided to allow the processing power of the processor 7 to be fully available for use as a computing resource by the computing node 1.
[0048] In alternative examples a different two interfaces of the processor 7 may be repurposed and combined to provide the data link. The WiFi interface 27 of the processor 7 is also an interface which can support a high bandwidth data link and provide native memory access. Accordingly, in such examples the 4G modem interface 24 and the WiFi interface 27 of the processor 7, or the SSD interface 26 and the WiFi interface 27 of the processor 7, could instead be repurposed from their intended different interface functions to provide a data link connected to the switch fabric 3 by the connection 30.
[0049] In alternative examples where a higher bandwidth low latency connection between the processor 7 and the switch fabric 3 is required, the WiFi interface 27 of the processor 7 could be repurposed to provide the data link in addition to, and in combination with, the 4G modem interface 24 and the SSD interface 26. Such a higher bandwidth low latency connection may be required, for example where the computing resources required by the computing node 1 necessitate a higher bandwidth low latency connection, or where the processor 7 is a processor having a higher processing capability compared to the bandwidth of its IO interfaces.
[0050] In alternative examples where a lower bandwidth low latency connection between the processor 7 and the switch fabric 3 is required, only one of the 4G modem interface 24, the SSD interface 26, and the WiFi interface 27 of the processor 7 could be used to form the data link connecting the processor 7 to the switch fabric 3 via the connection 30. A lower bandwidth low latency connection may be sufficient, for example where the computing resources required by the computing node 1 only require a lower bandwidth low latency connection, or where the processor 7 is a processor having a lower processing capability compared to the bandwidth of its IO interfaces.
[0051] In general, one or more of the application specific IO interfaces of the processor 7 may be repurposed and used in combination to form the data link connecting the processor 7 to the switch fabric 3, with the number and identity of the IO interfaces used and combined being selected as necessary to provide a data link having the required bandwidth and latency of connection between the processor 7 and the switch fabric 3. In general this required bandwidth and latency of connection will be determined by the computing resources which it is desired for the processor 7 to make available to the computing node 1. The switch fabric may include interface bridges to connect to the repurposed application specific IO interfaces of the processor 7 and to create a processor aware addressing scheme.
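The selection logic described in [0051] can be sketched as follows (the interface names and bandwidth figures below are illustrative assumptions, not taken from the patent):

```python
def select_interfaces(available, required_bandwidth):
    """Greedily pick repurposable IO interfaces until their combined
    bandwidth meets the requirement; returns (chosen names, total bandwidth).

    `available` maps interface name -> bandwidth (Gbit/s)."""
    chosen, total = [], 0.0
    # Prefer the fastest interfaces first, to minimise the number used.
    for name, bw in sorted(available.items(), key=lambda kv: -kv[1]):
        if total >= required_bandwidth:
            break
        chosen.append(name)
        total += bw
    if total < required_bandwidth:
        raise ValueError("available interfaces cannot meet the requirement")
    return chosen, total


# Illustrative figures only: a modem, an SSD and a WiFi interface.
ifaces = {"modem": 3.0, "ssd": 4.0, "wifi": 2.0}
chosen, total = select_interfaces(ifaces, required_bandwidth=6.0)
# chosen -> ["ssd", "modem"], total -> 7.0 Gbit/s
```

A real implementation would also weigh latency and energy cost per interface, as paragraph [0054] notes, rather than bandwidth alone.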
[0052] In examples where the processor 7 comprises a general purpose IO interface, one or more of the application specific IO interfaces of the processor 7 may be repurposed and used in combination with the general purpose IO interface to form the data link to connect the processor 7 to the switch fabric 3, with the number and identity of the application specific IO interfaces used and combined with the general purpose IO interface being selected as necessary to provide a data link having the required bandwidth and latency of connection between the processor 7 and the switch fabric 3. The switch fabric may include interface bridges to connect to the repurposed application specific IO interfaces and the general purpose IO interface of the processor 7.
[0053] In general, only IO interfaces of the processor 7 which can support a high bandwidth data link and can provide native memory access, either directly or through protocol conversion, are suitable for use to form the data link providing the connection between the processor 7 and the switch fabric 3. In the illustrated example, the 4G modem interface 24, the SSD interface 26, and the WiFi interface 27 of the processor 7 are suitable for use to form the data link connecting between the processor 7 and the switch fabric 3, but the screen interface 28 is typically not.
[0054] Since the data link between the processor 7 and the switch fabric 3 may consist of multiple independently controlled/managed IO interfaces, this aggregate data link can be made resilient to failure of any specific IO interface. Further, IO interfaces can be dynamically allocated/deallocated to the combined interface to compensate for IO interface failures, and/or to respond to changes in the demanded data link bandwidth, and to scale usage energy cost against bandwidth demand.
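The resilience behaviour in [0054] amounts to maintaining an active set of interfaces and promoting a spare when one fails. A minimal sketch, with illustrative interface names:

```python
class AggregateLink:
    """Sketch of an aggregated data link that survives the failure of an
    individual IO interface by promoting a spare interface."""
    def __init__(self, active, spares):
        self.active = set(active)
        self.spares = list(spares)

    def fail(self, iface):
        # Deallocate the failed interface and, if a spare is available,
        # allocate it so the link's aggregate bandwidth is preserved.
        self.active.discard(iface)
        if self.spares:
            self.active.add(self.spares.pop(0))


link = AggregateLink(active={"modem", "ssd"}, spares=["wifi"])
link.fail("ssd")
# The link now runs over the modem and WiFi interfaces.
```

The same mechanism can run in reverse to deallocate interfaces when demand falls, trading bandwidth against energy cost as the paragraph describes.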
[0055] In some examples the different repurposed interfaces may be joined together with one another, and possibly with any general purpose IO interface, to form the data link between the processor 7 and the switch fabric 3.
[0056] In the illustrated example the processors 7 are Samsung Exynos 7420 processors. In other examples alternative mobile phone processors may be used.
[0057] In further alternative examples suitable application specific processors (ASPs) other than mobile phone processors may be used as the processors 7. A suitable ASP will be an ASP having one or more repurposable IO interfaces capable of supporting a high bandwidth data link and providing native memory access, either directly or through a processor aware address protocol conversion scheme. It should be understood that even when only a single IO interface is used, an application specific IO interface must generally still be repurposed, because ASPs in general, and mobile phone processors in particular, do not have general purpose IO interfaces intended for either processor native or processor aware connection to a switch fabric enabling subsequent connection to alternative application specific IO interfaces.
[0058] In order to repurpose an application specific IO interface of an ASP (in the illustrated example the processor 7) to create a repurposed IO interface for interconnection with the switch fabric 3, it is necessary either to configure the application specific IO interface of the ASP to read and write a region of the processing element memory (in the illustrated example the local memory element 8 associated with the processor 7) directly, with a mechanism of synchronization and/or notification of the ASP when memory updates are made, for example through a processor interrupt, or to translate the native scheme of the application specific IO interface to follow these same semantics.
[0059] Translating the native protocol can be carried out by protocol conversion, which can include, for example, the encapsulation of memory read/write/synchronization operations as Ethernet packets on a network.
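As a purely illustrative sketch of the encapsulation mentioned in [0059], a memory write could be framed as a payload inside an Ethernet frame; the opcode and field layout below are assumptions for illustration, not taken from the patent:

```python
import struct

ETHERTYPE_MEMOP = 0x88B5  # an EtherType from the IEEE local experimental range


def encapsulate_write(dst_mac, src_mac, addr, value):
    """Pack a memory write (64-bit address, 64-bit value) as an Ethernet
    frame: 6-byte destination MAC, 6-byte source MAC, 2-byte EtherType,
    then the operation payload. Layout is illustrative."""
    header = dst_mac + src_mac + struct.pack("!H", ETHERTYPE_MEMOP)
    payload = struct.pack("!BQQ", 0x01, addr, value)  # 0x01 = write opcode
    return header + payload


def decapsulate(frame):
    """Recover (opcode, addr, value) from a frame built above."""
    return struct.unpack("!BQQ", frame[14:14 + 17])


frame = encapsulate_write(b"\x02" * 6, b"\x02" * 6, 0x1000, 0xCAFE)
op, addr, value = decapsulate(frame)
assert (op, addr, value) == (0x01, 0x1000, 0xCAFE)
```

A bridge at the receiving end would apply the write to the target memory region and return a synchronization notification the same way.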
[0060] Software running on the ASP (in the illustrated example the processor 7) then aggregates each of the assigned IO memory regions, together with each of the corresponding notifications (for example an interrupt) issued by the corresponding application specific IO interfaces when they complete an update of their associated memory region, into a unified memory region with a single subsequent notification or interrupt, and presents this to the operating system running on the ASP as a single device, that is, as a single repurposed IO interface.
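The aggregation described in [0060] can be sketched as follows (region names and payloads are illustrative; a real implementation would run in kernel space and use actual interrupts):

```python
class AggregatedInterface:
    """Sketch of the aggregation software: several per-interface memory
    regions are presented as one unified region, and the per-interface
    completion notifications collapse into a single notification."""
    def __init__(self, region_names):
        self.regions = {name: b"" for name in region_names}
        self.pending = set(region_names)
        self.notified = False

    def on_interface_update(self, name, data):
        # Called (e.g. from an interrupt handler) when one underlying
        # application specific IO interface finishes updating its region.
        self.regions[name] = data
        self.pending.discard(name)
        if not self.pending:
            self.notified = True  # single notification raised to the OS

    def unified_region(self):
        # Present the per-interface regions as one contiguous region.
        return b"".join(self.regions[n] for n in sorted(self.regions))


agg = AggregatedInterface(["modem", "ssd"])
agg.on_interface_update("ssd", b"BB")
assert not agg.notified          # still waiting on the modem region
agg.on_interface_update("modem", b"AA")
assert agg.notified              # one notification for the unified region
```

The uniform-semantics requirement of [0061] corresponds here to every interface feeding the same `on_interface_update` path, regardless of its original protocol.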
[0061] This interface aggregation software can also control how many application specific IO interfaces to include in the aggregation, provide resilience of the repurposed IO interface, and ensure that where multiple application specific IO interfaces are combined into a single repurposed IO interface, this single repurposed IO interface presents uniform semantics to the operating system across the different active application specific IO interfaces.
[0062] The disclosed use of a plurality of ASPs, such as mobile phone processors, to form a computing node, with IO interfaces of the ASPs being repurposed to form connections to a switch fabric of the computing node may provide a high performance reconfigurable computing node, without excessive cost.
[0063] The computing nodes 1 according to the above examples may be combined to form a scaleable computer, for example a scaleable server.
[0064] Figure 4 shows a diagrammatic illustration of a computer system 40 according to a second embodiment of the present invention.
[0065] The computer system 40 comprises a set of four computing nodes 1 according to the first embodiment, an additional fabric switch 41, and a network port 42. Accordingly, the computing system 40 comprises 16 (i.e. 4x4) processors 7.
[0066] The additional fabric switch 41 interconnects the four computing nodes 1 and the network port 42. The additional fabric switch 41 operates to extend the switch fabrics 3 of the individual computing nodes 1 into a single common switch fabric, and to bridge the four computing nodes 1, and all of their components, to an external network through the network port 42.
[0067] The additional fabric switch 41 is formed by reconfigurable switching hardware and may comprise one or more FPGAs.
[0068] In an alternative embodiment the additional fabric switch 41 may be omitted and the four computing nodes 1 may be connected directly to the network port 42. In such embodiments the switch fabrics 3 of the computing nodes 1 may cooperate to create a common single switch fabric bridging the four computing nodes 1.
[0069] In either embodiment, the common switch fabric across the four computing nodes enables the plurality of pools of resource elements provided by the different computing nodes 1 of the computer system 40 to be regarded as making up a single disaggregated logical resource plane. The single disaggregated logical resource plane formed by the different computing nodes 1 provides resource pools of processor elements, memory elements, persistent storage elements, and networking elements.
[0070] In some examples the computing nodes 1 and/or the computer 40 may include additional or alternative physical resource elements providing different system resources, such as accelerators, or other components used in computer facilities such as servers. The computing resources provided by these other physical resource elements will also be contained in a resource pool in the disaggregated logical resource plane.
[0071] The computer system 40 may be a blade for use as a data center server.
[0072] Each resource element in a pool of resource elements of a resource element type in the disaggregated logical resource plane operates independently of any other resource element, so that a plurality of them can be grouped together, or encapsulated, with one or more instances for each resource element type, by a resource element manager to create a logically defined computing facility. Accordingly, a computing facility may be created by selecting any number of instances of any type of logical resource element from the pools of logical resource element types of the disaggregated logical resource plane. The disaggregation of the physical resource elements, and the subsequent encapsulation of instances of logical resource elements performed by the resource element manager, is made possible by the processor native or processor aware addressing scheme which is adopted, as opposed to conventional networking aware schemes of physical disaggregation.
[0073] More specifically, a computing facility can be created, dynamically or statically, by a) physicalization of resource elements through a common physical address space in which disaggregated resources are placed physically within the processor native address space, or b) virtualization of resource elements in which resource elements can be shared between processors over any form of abstracted communication, or c) any combination thereof.
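The selection of resource element instances from per-type pools described in paragraphs [0072] and [0073] can be sketched as follows. This is an illustrative model under assumed names (the patent does not specify any API for the resource element manager); it shows only the pooling and encapsulation step, not the physicalization or virtualization of the underlying elements.

```python
from collections import defaultdict


class ResourceElementManager:
    """Encapsulates instances from per-type resource pools into a
    logically defined computing facility."""

    def __init__(self):
        # resource element type -> list of available element identifiers
        self.pools = defaultdict(list)

    def register(self, elem_type, elem_id):
        """Place a disaggregated resource element into its type's pool."""
        self.pools[elem_type].append(elem_id)

    def create_facility(self, spec):
        """Create a facility from a spec mapping element type to instance
        count, e.g. {'processor': 2, 'storage': 1}."""
        facility = {}
        for elem_type, count in spec.items():
            pool = self.pools[elem_type]
            if len(pool) < count:
                raise RuntimeError(f"pool '{elem_type}' exhausted")
            # Remove the selected instances from the pool and
            # encapsulate them in the facility.
            facility[elem_type] = [pool.pop() for _ in range(count)]
        return facility
```

A facility may thus combine any number of instances of any type, dynamically or statically, and the elements it releases could later be returned to their pools.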
[0074] From a logical view, each logical resource element type becomes a logical pool of resources, built internally using traditional processor system on a chip (SoC) addressing schemes, within a global pool of resources. As a result, no single resource element is the master of the computing facility, and, as such, networking can serve storage without processor element involvement. It also means the capabilities of each computing node can be independently defined and instantiated without the traditional cost of building a new SoC with different IO resource capabilities.
[0075] From a physical view, each computing facility is created by the convergence of processing, memory, storage and networking system resources, using a physically balanced ratio of the capabilities required to create the computing facility. A single computing facility can therefore include any number of processing elements, storage elements, or network resource elements of the computing node. No physical resource element operates independently, but only when connected with one or more of the other physical resource element types. The resource elements are arranged using a processor aware addressing scheme of physical disaggregation, and therefore also need to interconnect to become a meaningful system.
[0076] There is no fixed CPU allocation or memory dedicated to the processing elements 2; instead there is a pool of memories distributed across the processing elements 2 which can be used by the different processing elements 2, and the different processing elements 2 can be connected together to adapt the processing capability to the requirements of the specific required tasks. Likewise, the global resource pool addressing scheme allows physical network port resources placed anywhere in the system to be attached to a processing element 2 as if the resource were physically attached to the local address bus of the processor.
[0077] All logical resources are therefore considered at the same level of importance in the computer system.
[0078] Additionally, it is not necessary to go through the CPU to 'speak' with the memory or the other resources physically associated with a compute facility; access is possible directly through the switch fabric address, which may be globally any resource address, without management by any other element of the system (assuming the appropriate security and access privileges).
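The direct, CPU-free access described in paragraphs [0076] to [0078] can be sketched as a flat global address space in which every resource element occupies a window reachable through the switch fabric. This is a minimal model under assumed names; the patent does not prescribe any particular address map layout.

```python
class MemoryElement:
    """A resource element exposing byte-addressable storage."""
    def __init__(self, size):
        self.data = bytearray(size)

    def read(self, offset):
        return self.data[offset]


class SwitchFabric:
    """Routes a global address directly to the owning resource element,
    without mediation by any processing element."""
    def __init__(self):
        self.windows = []   # (base, limit, element) address windows
        self.next_base = 0

    def attach(self, element, size):
        # Allocate the element a window in the flat global address space,
        # as if it sat on the processor's local address bus.
        base = self.next_base
        self.windows.append((base, base + size, element))
        self.next_base += size
        return base         # the element's global base address

    def read(self, addr):
        for base, limit, element in self.windows:
            if base <= addr < limit:
                return element.read(addr - base)  # direct access, no CPU
        raise ValueError("unmapped global address")
```

Any element attached to the fabric, wherever it is physically placed, is then reachable at its global address by any other element with the appropriate access privileges.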
[0079] In the illustrated examples the or each computing node comprises four repurposed ASPs. In alternative examples the computing node may comprise a different number of repurposed ASPs, and specifically may comprise one or more repurposed ASPs.
[0080] In the illustrated examples the or each computing node comprises four identical repurposed ASPs. In alternative examples the computing node may comprise a number of heterogeneous repurposed ASPs which are different from one another, and not identical.
[0081] In the illustrated examples the computing system comprises four computing nodes. In alternative examples the computing system may comprise a different number of computing nodes, and specifically may comprise one or more computing nodes.
[0082] In the illustrated examples the computing system comprises four identical computing nodes. In alternative examples the computing system may comprise a number of heterogeneous computing nodes, which are different from one another, and not identical.
[0083] In the illustrated examples the ASP does not have any native bus interface. In other examples the ASP could have a native bus interface.
[0084] In the illustrated examples the different resource specific IO interfaces of the processor comprise a 4G modem interface, a flash memory interface, an SSD interface, a
WiFi interface, and a screen driver. In other examples other different interfaces may be provided instead of, or in addition to, these interfaces.
[0085] In the illustrated examples the 4G modem interface, SSD interface, and/or WiFi interface are used individually or in combination to provide the data link. In other examples different interfaces may be used instead of, or in addition to, these interfaces.
[0086] In some examples a modem interface may be used. The 4G modem interface in the illustrated examples is an example of a modem interface. In some examples a Flash memory interface may be used. In some examples a Network interface may be used. In some examples a processor native memory interface may be used. In some examples a general purpose peripheral interface may be used. In some examples a processor aware memory interface may be used.
[0087] In the described embodiments of the invention a computing node or computing system is provided. Platform software comprising an operating system or any other suitable platform software may be provided at the computing node or computing system to enable application software to be executed.
[0088] Although the computing system is shown as a single device it will be appreciated that this system may be distributed or located remotely and accessed via a network or other communication link (e.g. using a communication interface).
[0089] The term 'computer' is used herein to refer to any device with processing capability such that it can execute instructions. Those skilled in the art will realise that such processing capabilities are incorporated into many different devices and therefore the term 'computer' includes PCs, servers, mobile telephones, personal digital assistants and many other devices.
[0090] It will be understood that the benefits and advantages described above may relate to one embodiment or may relate to several embodiments. The embodiments are not limited to those that solve any or all of the stated problems or those that have any or all of the stated benefits and advantages.
[0091] Any reference to 'an' item refers to one or more of those items. The term 'comprising' is used herein to mean including the method steps or elements identified, but that such steps or elements do not comprise an exclusive list and a method or apparatus may contain additional steps or elements.
[0092] It will be understood that the above description of a preferred embodiment is given by way of example only and that various modifications may be made by those skilled in the art. Although various embodiments have been described above with a certain degree of particularity, or with reference to one or more individual embodiments, those skilled in the art could make numerous alterations to the disclosed embodiments without departing from the spirit or scope of this invention.
Claims (19)
1. A computing element comprising an application specific processor 'ASP' having a plurality of heterogeneous application specific input/output 'IO' interfaces;
wherein one or more of the plurality of heterogeneous application specific IO interfaces are repurposed to form a data link between the ASP and a switch fabric.
2. The computing element according to claim 1, wherein two or more of the plurality of heterogeneous application specific IO interfaces are repurposed to form the data link between the ASP and the switch fabric in combination.
3. The computing element according to claim 1 or claim 2, wherein all of the plurality of heterogeneous application specific IO interfaces are repurposed to form the data link between the ASP and the switch fabric in combination.
4. The computing element according to claim 2 or claim 3, wherein the repurposed heterogeneous application specific IO interfaces are joined together to form a single data link.
5. The computing element according to any preceding claim, wherein each of the plurality of heterogeneous application specific IO interfaces is arranged to provide native memory access.
6. The computing element according to any preceding claim, wherein each of the repurposed heterogeneous application specific IO interfaces is arranged to provide native memory access through a processor aware address scheme with the switch fabric.
7. The computing element according to any preceding claim, wherein the plurality of heterogeneous application specific IO interfaces comprise one or more of: a modem interface; a Flash memory interface; a Network interface; a processor native memory interface; a general purpose peripheral interface; a processor aware memory interface.
8. The computing element according to any preceding claim, wherein the ASP is a mobile phone processor.
9. The computing element according to any preceding claim, wherein the ASP does not comprise a general purpose processor native bus interface.
10. A computing node comprising:
a plurality of computing elements according to any preceding claim;
at least one storage element; at least one network port; and a switch fabric configured to couple the plurality of computing elements, the at least one storage element and the at least one network port to each other.
11. A method of providing a computing element, the method comprising:
obtaining an application specific processor 'ASP' having a plurality of heterogeneous application specific input/output 'IO' interfaces; and repurposing one or more of the plurality of heterogeneous application specific IO interfaces to form a data link between the ASP and a switch fabric.
12. The method according to claim 11, wherein the repurposing comprises:
repurposing two or more of the plurality of heterogeneous application specific IO interfaces to form the data link between the ASP and the switch fabric in combination.
13. The method according to claim 11 or claim 12, wherein all of the plurality of heterogeneous application specific IO interfaces are repurposed to form the data link between the ASP and the switch fabric in combination.
14. The method according to claim 12 or claim 13, wherein the repurposed heterogeneous application specific IO interfaces are joined together to form a single data link.
15. The method according to any of claims 11 to 14, wherein each of the plurality of heterogeneous application specific IO interfaces is arranged to provide native memory access.
16. The method according to any of claims 11 to 15, wherein each of the repurposed heterogeneous application specific IO interfaces is arranged to provide native memory access through a processor aware address scheme with the switch fabric.
17. The method according to any one of claims 11 to 16, wherein the plurality of heterogeneous application specific IO interfaces comprise one or more of: a modem interface; a Flash memory interface; a Network interface; a processor native memory interface; a general purpose peripheral interface; a processor aware memory interface.
18. The method according to any one of claims 11 to 17, wherein the ASP is a mobile phone processor.
19. The method according to any one of claims 11 to 18, wherein the ASP does not comprise a native bus interface.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB1802913.2A GB2571297A (en) | 2018-02-22 | 2018-02-22 | Computing element |
PCT/GB2019/050480 WO2019162677A1 (en) | 2018-02-22 | 2019-02-21 | Computing element |
Publications (2)
Publication Number | Publication Date |
---|---|
GB201802913D0 GB201802913D0 (en) | 2018-04-11 |
GB2571297A true GB2571297A (en) | 2019-08-28 |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090006808A1 (en) * | 2007-06-26 | 2009-01-01 | International Business Machines Corporation | Ultrascalable petaflop parallel supercomputer |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11200192B2 (en) * | 2015-02-13 | 2021-12-14 | Amazon Technologies, Inc. | Multi-mode system on a chip |
Non-Patent Citations (1)
Title |
---|
Nikola Rajovic et al.: 'The Mont-Blanc Prototype: An Alternative Approach for HPC Systems', published in 'SC16: International Conference for High Performance Computing, Networking, Storage and Analysis'; 13 November 2016 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
WAP | Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1) |