WO2024092188A1 - Firmware broadcast in a multi-chip module - Google Patents

Info

Publication number
WO2024092188A1
WO2024092188A1 PCT/US2023/078008 US2023078008W WO2024092188A1 WO 2024092188 A1 WO2024092188 A1 WO 2024092188A1 US 2023078008 W US2023078008 W US 2023078008W WO 2024092188 A1 WO2024092188 A1 WO 2024092188A1
Authority
WO
WIPO (PCT)
Prior art keywords
tile
follower
leader
bus
phys
Prior art date
Application number
PCT/US2023/078008
Other languages
French (fr)
Inventor
Jon Kenneth NICOLL
Peter Korger
Subhash Roy
Original Assignee
Kandou Labs SA
Kandou Us, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kandou Labs SA, Kandou Us, Inc.
Publication of WO2024092188A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/38Information transfer, e.g. on bus
    • G06F13/42Bus transfer protocol, e.g. handshake; Synchronisation
    • G06F13/4282Bus transfer protocol, e.g. handshake; Synchronisation on a serial bus, e.g. I2C bus, SPI bus
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/06Addressing a physical block of locations, e.g. base addressing, module addressing, memory dedication
    • G06F12/0646Configuration or reconfiguration
    • G06F12/0692Multiconfiguration, e.g. local and global addressing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/10Program control for peripheral devices
    • G06F13/12Program control for peripheral devices using hardware independent of the central processor, e.g. channel or peripheral processor
    • G06F13/122Program control for peripheral devices using hardware independent of the central processor, e.g. channel or peripheral processor where hardware performs an I/O function other than control of data transfer
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14Handling requests for interconnection or transfer
    • G06F13/16Handling requests for interconnection or transfer for access to memory bus
    • G06F13/1605Handling requests for interconnection or transfer for access to memory bus based on arbitration
    • G06F13/1652Handling requests for interconnection or transfer for access to memory bus based on arbitration in a multiprocessor architecture
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14Handling requests for interconnection or transfer
    • G06F13/20Handling requests for interconnection or transfer for access to input/output bus
    • G06F13/24Handling requests for interconnection or transfer for access to input/output bus using interrupt
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/38Information transfer, e.g. on bus
    • G06F13/382Information transfer, e.g. on bus using universal interface adapter
    • G06F13/385Information transfer, e.g. on bus using universal interface adapter for adaptation of a particular data processing system to different peripheral devices
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/38Information transfer, e.g. on bus
    • G06F13/40Bus structure
    • G06F13/4004Coupling between buses
    • G06F13/4027Coupling between buses using bus bridges
    • G06F13/404Coupling between buses using bus bridges with address mapping
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/16Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
    • G06F15/163Interprocessor communication
    • G06F15/167Interprocessor communication using a common memory, e.g. mailbox
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00Digital computers in general; Data processing equipment in general
    • G06F15/76Architectures of general purpose stored program computers
    • G06F15/78Architectures of general purpose stored program computers comprising a single central processing unit
    • G06F15/7807System on chip, i.e. computer system on a single chip; System in package, i.e. computer system on one or more chips in a single package
    • G06F15/7825Globally asynchronous, locally synchronous, e.g. network on chip
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/4401Bootstrapping
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/54Interprogram communication
    • G06F9/542Event management; Broadcasting; Multicasting; Notifications

Abstract

A multi-chip module (MCM) is described herein which includes a number of components communicatively coupled by one or more busses (325, 625, 725, 725') and controlled by a processor (300). Some of these components require firmware to operate correctly. The firmware is typically loaded during an initialization process, called a "boot process". The system and methods described herein allow for updating of the firmware from time to time. Standards such as the Peripheral Component Interconnect Express (PCIe) standard typically place limits on boot time, e.g. a PCIe-compliant MCM may be required to be performing PCIe link training following boot after a certain time, e.g. 100ms or 120ms. Loading firmware during a boot process can take up a significant portion of this time, particularly in the case where multiple components require firmware to be loaded. Described herein is a technique that is capable of loading firmware to multiple MCM components in a time-efficient manner.

Description

FIRMWARE BROADCAST IN A MULTI-CHIP MODULE
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application claims the benefit of U.S. Application No. 63/381,264, filed October 27, 2022, entitled “SECURE BOOTABLE PCIE RETIMER”, which is hereby incorporated herein by reference in its entirety for all purposes.
REFERENCES
[0002] The following references are herein incorporated by reference in their entirety for all purposes:
[0003] PCI Express Base Specification Revision 6.0.1, Version 1.0, September 13, 2022, accessible at pcisig[dot]com/specifications.
[0004] PCI Express Retimer Test Specification Revision 4.0, Version 1.0, June 10, 2022, accessible at pcisig[dot]com/specifications.
[0005] U.S. Application No. 13/895,206, filed May 15, 2013, which granted as U.S. Patent No. 9,288,082 on March 15, 2016, entitled “Circuits for Efficient Detection of Vector Signaling Codes for Chip-To-Chip Communication Using Sums of Differences”, naming Roger Ulrich and Peter Hunt (referred to herein as [Ulrich]).
BACKGROUND
[0006] As signals propagate over wires, they tend to degrade - that is, the signal to noise ratio decreases. This attenuation of a signal is often measured in decibels (dB) and tends to increase with the length of the wire that the signal is transmitted over.
[0007] Many electronics standards define a maximum loss for signals transmitted between an upstream component and a downstream component. For example, the Peripheral Component Interconnect Express (PCIe) 5.0 standard gives a -36dB loss budget at 16GHz for transmission from an upstream component (typically a root complex or switch) to a downstream component (typically an endpoint or switch). Failure to comply with this loss budget results in non-compliance with the standard, which is undesirable. However, it can be difficult to meet a loss budget in practice, particularly in the case of longer wires and higher data rates.
[0008] To resolve this issue, a retimer can be used. A retimer is a component that is located in the signal path between the upstream component and the downstream component. The retimer breaks the link between the upstream component and downstream component into two entirely separate links. The retimer is configured to condition the signal it receives via an upstream pseudo-port before transmitting the conditioned signal out via a downstream pseudo-port. Typically, a retimer equalizes the incoming signal and recovers the clocking of the incoming signal, such that the output of the retimer is a high amplitude, low noise and low jitter signal. A retimer can thus significantly reduce the total losses between the upstream and downstream components, bringing a previously non-compliant link within specification.
[0009] Retimers, and more generally, input/output devices, can be provided as a multi-chip module (MCM) that includes a plurality of tiles communicatively coupled in some manner. Each tile can comprise a plurality of components such as physical data lane circuits (PHYs). These components may require some configuration data to operate correctly, e.g. firmware. The data may be the same for each of the plurality of components. Providing each component with the data in a serial fashion can take an undesirably long time and create corresponding performance issues with the MCM.
BRIEF DESCRIPTION
[0010] A multi-chip module (MCM) is described herein which includes a number of components communicatively coupled by one or more busses and controlled by a processor such as a CPU, microcontroller, etc. Some of these components require firmware to operate correctly. The firmware is typically loaded during an initialization process, called a “boot process” or simply “boot”. The system and methods described herein allow for updating of the firmware from time to time. Standards such as the Peripheral Component Interconnect Express (PCIe) standard typically place limits on boot time, e.g. a PCIe-compliant MCM may be required to be performing PCIe link training following boot after a certain time, e.g. 100ms or 120ms. Loading firmware during a boot process can take up a significant portion of this time, particularly in the case where multiple components require firmware to be loaded. Described herein is a technique that is capable of loading firmware to multiple MCM components in a time-efficient manner. The MCM is described herein in the context of a retimer but this disclosure is not limited to retimers as a MCM performing any function can make use of the techniques disclosed in this specification.
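By way of non-limiting illustration, the timing benefit described in paragraph [0010] can be estimated with simple arithmetic. The sketch below compares writing the same package to each component in turn against a single broadcast write. The component count, package size and bus throughput are assumed figures chosen purely for illustration and do not appear in this disclosure; only the 100ms limit is taken from the text above.

```python
def load_time_ms(package_bytes, bus_bytes_per_s, n_writes):
    """Time in milliseconds to push the package over the bus n_writes times."""
    return 1000.0 * n_writes * package_bytes / bus_bytes_per_s


# Assumed, illustrative figures (not taken from the disclosure).
N_COMPONENTS = 16            # hypothetical number of components needing firmware
PACKAGE_BYTES = 32 * 1024    # hypothetical firmware package size
BUS_BYTES_PER_S = 4_000_000  # hypothetical sustained bus throughput

# Example boot-time limit mentioned in the text.
BOOT_LIMIT_MS = 100.0

# Serial loading repeats the transfer once per component.
serial_ms = load_time_ms(PACKAGE_BYTES, BUS_BYTES_PER_S, N_COMPONENTS)
# A broadcast write transfers the package once for all components.
broadcast_ms = load_time_ms(PACKAGE_BYTES, BUS_BYTES_PER_S, 1)

print(f"serial:    {serial_ms:.3f} ms (within limit: {serial_ms <= BOOT_LIMIT_MS})")
print(f"broadcast: {broadcast_ms:.3f} ms (within limit: {broadcast_ms <= BOOT_LIMIT_MS})")
```

Under these assumed figures the serial approach alone exceeds the example limit, whereas the broadcast approach consumes only a small fraction of it.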
[0011] Techniques are disclosed for performing broadcast write operations in a single-tile package and a multi-tile package comprising multiple tiles. The package can provide any functionality, e.g. retimer functionality. A broadcast address is simultaneously assigned to a plurality of components of the MCM and a data package is transmitted to all of the components simultaneously using the broadcast address. In the multi-tile case, a first broadcast write is performed to simultaneously write to components on a leader tile and a second broadcast write is performed to simultaneously write to components on one or more follower tiles. The broadcast write is performed using a broadcast address or address range that is common to the leader tile components and follower tile components. A broadcast read operation is also disclosed, where this is performed in a MCM. A bitwise OR operation is performed on the read results to generate a resultant value. The resultant value can be used to determine if any component on any of the tiles has raised an interrupt request.
[0012] A method according to an embodiment comprises obtaining, by a processor of a MCM having a plurality of tiles and a plurality of physical layer data lane circuits (PHYs) distributed across the plurality of tiles, a PHY configuration data package; and simultaneously writing, by the processor, via a bus coupled to the processor and to the plurality of PHYs, the configuration data package to the plurality of PHYs using a broadcast address space of the MCM, the broadcast address space of the MCM comprising at least one address assigned to all of the plurality of PHYs, the MCM also having a non-broadcast address space comprising a plurality of unique addresses each respectively assigned to a corresponding one of the plurality of PHYs.
[0013] An apparatus according to an embodiment comprises a multi-chip module (MCM) having a plurality of tiles and a plurality of physical layer data lane circuits (PHYs) distributed across the plurality of tiles, a processor and a bus coupled to the processor and to the plurality of PHYs, the processor configured to: obtain a PHY configuration data package; and simultaneously write, via the bus, the configuration data package to the plurality of PHYs using a broadcast address space of the MCM, the broadcast address space of the MCM comprising at least one address assigned to all of the plurality of PHYs, the MCM also having a non-broadcast address space comprising a plurality of unique addresses each respectively assigned to a corresponding one of the plurality of PHYs.
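By way of non-limiting illustration, the relationship between the broadcast address space and the non-broadcast address space of paragraphs [0012] and [0013] can be modelled in software. The following is a minimal sketch only: the class names `Phy` and `Bus`, the base addresses and the window size are hypothetical and are not defined in this disclosure, and a real bus would perform this decoding in hardware.

```python
class Phy:
    """Hypothetical PHY model that stores received words by offset."""

    def __init__(self, unique_base):
        self.unique_base = unique_base  # start of this PHY's unique window
        self.words = {}


class Bus:
    """Hypothetical bus model with one broadcast window shared by all PHYs."""

    def __init__(self, phys, broadcast_base, window_size):
        self.phys = phys
        self.broadcast_base = broadcast_base
        self.window_size = window_size

    def write(self, address, value):
        """Write one word and return the number of PHYs that accepted it."""
        if self.broadcast_base <= address < self.broadcast_base + self.window_size:
            # Broadcast address: every PHY latches the same word.
            offset = address - self.broadcast_base
            for phy in self.phys:
                phy.words[offset] = value
            return len(self.phys)
        for phy in self.phys:
            if phy.unique_base <= address < phy.unique_base + self.window_size:
                # Unique address: only the addressed PHY latches the word.
                phy.words[address - phy.unique_base] = value
                return 1
        raise ValueError(f"address {address:#x} is not mapped")


WINDOW = 0x1000                                    # assumed window size
phys = [Phy(0x1000 * (i + 1)) for i in range(4)]   # assumed unique bases
bus = Bus(phys, broadcast_base=0xF000, window_size=WINDOW)

# One pass over the package configures all PHYs via the broadcast window.
package = [0xA5, 0x5A, 0xFF]
for offset, word in enumerate(package):
    bus.write(0xF000 + offset, word)

# A write to a unique address reaches a single PHY only.
bus.write(0x2000 + 0x10, 0x77)
```

In this model the package is transferred once regardless of the number of PHYs, while the unique addresses remain available for per-PHY accesses.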
[0014] A method according to another embodiment comprises: simultaneously transmitting, by a processor of a MCM having a leader tile having a plurality of physical data lane leader circuits (leader PHYs) and one or more follower tiles each having a respective plurality of physical data lane follower circuits (follower PHYs), a broadcast read instruction to a leader tile register located on the leader tile and to one or more follower tile registers respectively located on the one or more follower tiles; receiving, responsive to the broadcast read instruction, a leader tile read result from the leader tile register; receiving, responsive to the broadcast read instruction, one or more follower tile read results from respective ones of the one or more follower tile registers; and performing a bitwise OR operation on the leader tile read result and the one or more follower tile read results to generate a resultant read value.
[0015] An apparatus according to another embodiment comprises: a multi-chip module (MCM) having a plurality of physical layer data lane leader circuits (leader PHYs) located on a leader tile of the MCM and a plurality of physical data lane follower circuits (follower PHYs) respectively located on each of one or more follower tiles of the MCM; a leader tile register located on the leader tile; one or more follower tile registers respectively located on the one or more follower tiles; and a processor and a bus coupled to the processor and to the plurality of PHYs; wherein the processor is configured to: simultaneously transmit a broadcast read instruction to the leader tile register and to the one or more follower tile registers; receive, responsive to the broadcast read instruction, a leader tile read result from the leader tile register; receive, responsive to the broadcast read instruction, one or more follower tile read results from respective ones of the one or more follower tile registers; and perform a bitwise OR operation on the leader tile read result and the one or more follower tile read results to generate a resultant read value.
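By way of non-limiting illustration, the combination of read results described in paragraphs [0014] and [0015] can be sketched as follows. The function name `broadcast_read` and the example register values are hypothetical; the sketch shows only the bitwise OR step and does not model the bus transaction itself.

```python
from functools import reduce
from operator import or_


def broadcast_read(tile_read_results):
    """Return the bitwise OR of the read results received from all tiles."""
    return reduce(or_, tile_read_results, 0)


# Hypothetical register contents, one bit per interrupt source.
leader_result = 0b0000
follower_results = [0b0100, 0b0000, 0b0001]

resultant = broadcast_read([leader_result] + follower_results)

# A non-zero resultant value indicates that at least one tile has a bit set.
any_bit_set = resultant != 0
print(f"resultant read value: {resultant:#06b}, any bit set: {any_bit_set}")
```

Because the OR preserves any set bit, a single read suffices to determine whether any tile has a bit set, after which individual tiles could be read via their unique addresses to locate the source.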
BRIEF DESCRIPTION OF FIGURES
[0016] FIG. 1 is a block diagram of an apparatus comprising a multi-chip module suitable for implementing embodiments described herein.
[0017] FIG. 2 is a block diagram of a retimer suitable for implementing embodiments described herein.
[0018] FIG. 3 is a block diagram of a single tile retimer suitable for implementing embodiments described herein.
[0019] FIG. 4 is a schematic drawing of contents of a memory external to a retimer, which memory can hold a data package for writing to components of the retimer.
[0020] FIG. 5 is a block diagram of a two-tile retimer suitable for implementing embodiments described herein.
[0021] FIG. 6 is a block diagram of the follower tile of the two-tile retimer of FIG. 5.
[0022] FIG. 7 is a block diagram of a four-tile retimer suitable for implementing embodiments described herein.
[0023] FIG. 8 is a flow chart illustrating a process for performing a broadcast write operation in a single-tile retimer, according to an embodiment.
[0024] FIG. 9 is a further block diagram of the four-tile retimer of FIG. 7.
[0025] FIG. 10 is a flow chart illustrating a process for performing a broadcast write operation in a multi-tile retimer, according to an embodiment.
[0026] FIG. 11 is a flow chart illustrating a process for performing a broadcast read operation in a multi-tile retimer, according to an embodiment.
[0027] FIG. 12 is a block diagram of a multi-chip module that is capable of performing multicast operations, according to an embodiment.
[0028] FIG. 13 is a schematic diagram of an address space suitable for use with the embodiment of Fig. 12, according to an embodiment.
[0029] FIG. 14 is a flow chart illustrating a process for performing a write operation in a multi-chip module like the multi-chip module of Fig. 12 that supports broadcast and multicast operations, according to an embodiment.
DETAILED DESCRIPTION
[0030] At times in this specification reference is made to the PCIe standard. This is to assist in the understanding of this disclosure by describing certain features in the context of a particular standard. However, it should be appreciated that, unless expressly stated otherwise, teaching herein has applicability outside of the PCIe standard.
[0031] Fig. 1 shows in schematic form a multi-chip module (MCM) 100 that is suitable for implementing embodiments described herein. MCM 100 includes a leader tile 105 that has a processor 110 located on it. Also present are follower tiles 115a, 115b and 115c. In the illustrated embodiment there are three follower tiles but it should be understood that this is purely exemplary and any number of follower tiles, e.g. one, two, three, four etc. can be present. Leader tile 105 and follower tiles 115a - 115c are all part of the same package.
[0032] Leader tile 105 is distinguished from follower tiles 115a - 115c at least in that leader tile 105 has an operating processor 110. That is, processor 110 is performing computations at least some of the time when MCM 100 is in a powered-on state. Although not shown in Fig. 1, follower tiles 115a - 115c may each also have a processor located on them for ease of circuit design and fabrication. However, if present, the follower tile processors are inactive during use of MCM 100, e.g. powered off.
[0033] Leader tile 105 is communicatively coupled to each of the follower tiles 115a - 115c via respective couplings 120a, 120b, 120c. These can be wires of a bus, e.g. a Serial Peripheral Interface (SPI) bus. More detail on this is provided later in this specification.
[0034] Each tile has one or more physical layer entities such as a pseudo-port, a Serialiser/Deserialiser (SerDes), etc. located on it. These are referred to herein as physical layer data lane circuits (PHYs). In the illustrated embodiment four PHYs 125 are present on each tile, but this number is not fixed as any number of PHYs can alternatively be present. (In the interest of clarity, one PHY has been labelled in Fig. 1 on the understanding that the identically drawn elements in Fig. 1 are also PHYs). It is also possible for different numbers of PHYs to be present on some or all tiles compared to others of the tiles. The leader tile 105 may have a different number of PHYs to each of the follower tiles 115a - 115c. It is also possible for the leader tile 105 to have no PHYs on it and to act primarily as a control tile for the MCM. Other variations on these themes are possible.
[0035] In the illustrated embodiment each PHY is capable of communicating in both directions with some entity external to MCM 100, e.g. a root complex, endpoint or other device. Bidirectional communication is not strictly required as PHYs that support communication in only one direction can alternatively be present.
[0036] In the following description, the example of MCM 100 being a retimer is used to give additional context to embodiments. However, it should be appreciated that references to ‘retimer’ throughout this specification can be replaced with ‘MCM’ unless the description relates specifically to retiming functions.
[0037] Fig. 2 shows in schematic form a system 200 incorporating a retimer 210. Retimer 210 is a multi-lane PCIe retimer, meaning that it is configured to process (i.e. retime) multiple lanes of PCIe traffic simultaneously. Retimer 210 is coupled to an upstream component 205 that is typically a root complex or a switch. This coupling is via upstream pseudo-port 220a of retimer 210. Similarly, retimer 210 is coupled via downstream pseudo-port 220b to a downstream component 215, typically a switch or endpoint. Upstream pseudo-port 220a and downstream pseudo-port 220b are examples of PHYs.
[0038] It is thus apparent from Fig. 2 that retimer 210 functions to divide a link between upstream component 205 and downstream component 215 into two parts. Retimer 210 is configured to condition the signal received via upstream pseudo-port 220a and to provide a clean signal with low jitter and good signal to noise ratio as an output of downstream pseudo-port 220b. Retimer 210 is bi-directional, and thus is also capable of conditioning a signal received as an input to downstream pseudo-port 220b. In this case, the clean output signal would be sent out via upstream pseudo-port 220a.
[0039] Fig. 3 shows retimer 210 in schematic form in additional detail. For ease of understanding, some components of retimer 210 have been omitted.
[0040] Retimer 210 includes a CPU core 300, also referred to herein as a processor. This is equivalent to processor 110 of Fig. 1. CPU core 300 is configured to perform various tasks to support the function of retimer 210. One such task is the loading of firmware from external non-volatile memory to boot ROM 305 during a boot process, and to load firmware to PHYs of retimer 210. More detail on this boot process is provided later. CPU core 300 acts in accordance with instructions stored in instruction RAM 310 and operates on data stored in data RAM 315. CPU core 300 is also coupled to interrupt request (IRQ) controller 320 to enable CPU core 300 to receive interrupt requests from other components of retimer 210, and/or from external components.
[0041] CPU core 300 is also coupled to Advanced Peripheral Bus (APB) interconnect 325. The APB interconnect enables CPU core 300 to communicate with other components of retimer 210 that are coupled to this bus - reference is made to Fig. 3 in this regard. It will be appreciated that APB interconnect 325 can be replaced with an alternative bus, e.g. AHB or another AMBA bus, without departing from the scope of this disclosure.
[0042] APB interconnect 325 also enables other components of retimer 210 to communicate with instruction RAM 310 directly in a controlled manner (see ‘access restriction’ in Fig. 3). This ensures that only components that should be able to access instruction RAM 310 can do so, and further that instructions that any such components place in instruction RAM 310 are legitimate.
[0043] Retimer 210 also includes a non-volatile read-only memory that could be a one-time programmable (OTP) memory 330 as shown in Fig. 3. Other forms of non-volatile ROM could alternatively be used. OTP memory 330 stores, among other things, a public key, or hash of a public key, that is usable by CPU core 300 to check that firmware is genuine as it is loaded by CPU core 300.
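By way of non-limiting illustration, where OTP memory 330 stores a hash of a public key rather than the key itself, a public key supplied alongside the firmware could be checked against the stored hash before it is trusted. The sketch below assumes SHA-256 as the hash function, which is an assumption made for illustration and is not specified in this disclosure; the function name and example key bytes are likewise hypothetical. Verification of a firmware signature using the public key is outside the scope of this sketch.

```python
import hashlib
import hmac


def public_key_matches_otp(public_key, otp_digest):
    """Return True if the SHA-256 digest of public_key equals the OTP value.

    SHA-256 is assumed here for illustration only.
    """
    computed = hashlib.sha256(public_key).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(computed, otp_digest)


# Hypothetical key material; the digest stands in for the OTP contents.
genuine_key = b"example public key bytes"
otp_digest = hashlib.sha256(genuine_key).digest()

print(public_key_matches_otp(genuine_key, otp_digest))
print(public_key_matches_otp(b"substituted key bytes", otp_digest))
```

Storing only the digest reduces the amount of one-time programmable memory required, since the full key can be carried with the firmware image and checked at load time.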
[0044] Firmware is loaded from an external non-volatile memory. Here, ‘external’ refers to the memory being located off-die, i.e. it is not part of the die 335 that CPU core 300 is part of. The external non-volatile memory can be part of the MCM package, or it may be external to the MCM package. In the illustrated embodiment the external non-volatile memory is a SPI flash memory 340. CPU core 300 communicates with SPI flash 340 via an SPI bus, with the corresponding SPI leader 345 being connected to APB interconnect 325 to provide the complete communication channel between CPU core 300 and SPI flash 340. This configuration is provided as an example and is not the only possible configuration. For example, external non-volatile memory could instead be an EEPROM and in that case CPU core 300 could communicate with the EEPROM via an I2C bus (see I2C bus leader 350 in Fig. 3) that is coupled to APB interconnect 325. Further variations are possible, and it should be understood that any variation that enables CPU core 300 to communicate with external non-volatile memory is within the scope of this disclosure.
[0045] It is noted that the PCIe standard as applicable to retimers requires an I2C bus to be present. However, it has been recognised that I2C is a relatively slow interface such that problems can arise when loading firmware from the external memory. Specifically, an I2C bus and EEPROM may make it difficult to meet certain timing requirements of the PCIe specification. For this reason, a SPI bus and SPI flash 340 can be used to significantly reduce firmware loading times by virtue of the fact that an SPI interface offers a higher data transfer rate than an I2C interface. Given this, it is contemplated that in some implementations the I2C bus could be omitted entirely.
[0046] Retimer 210 also includes timer 355, general purpose input/output pin(s) (GPIO) 360 and system management bus (SMBus) 365. These components are all coupled to APB interconnect 325 to facilitate communication with other components of retimer 210.
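By way of non-limiting illustration of the I2C and SPI comparison in paragraph [0045], the raw transfer time for a firmware image can be estimated from the bus bit rate. The figures below are assumptions chosen for illustration: a 400 kbit/s I2C bus (Fast-mode), a single-bit SPI bus clocked at 50 MHz, and a 256 KiB image. Protocol overhead such as addressing, acknowledge bits and command bytes is ignored, so real transfer times would be somewhat longer.

```python
def transfer_time_ms(image_bytes, bus_bits_per_s):
    """Raw time in milliseconds to move image_bytes at the given bit rate."""
    return 1000.0 * image_bytes * 8 / bus_bits_per_s


# Assumed, illustrative figures (not taken from the disclosure).
IMAGE_BYTES = 256 * 1024        # hypothetical firmware image size
I2C_BITS_PER_S = 400_000        # I2C Fast-mode bit rate
SPI_BITS_PER_S = 50_000_000     # hypothetical single-bit SPI clock rate

i2c_ms = transfer_time_ms(IMAGE_BYTES, I2C_BITS_PER_S)
spi_ms = transfer_time_ms(IMAGE_BYTES, SPI_BITS_PER_S)

print(f"I2C: {i2c_ms:.2f} ms")
print(f"SPI: {spi_ms:.2f} ms")
```

Under these assumptions the I2C transfer takes several seconds, whereas the SPI transfer completes in tens of milliseconds, which is consistent with the motivation given above for using SPI flash 340.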
[0047] Timer 355 provides a programmable timing capability, e.g. to allow the performance of periodic tasks between which a low power state may be entered. GPIO 360 provides one or more general purpose pins that are unused by default, but which may be controlled by software to be used in some manner, e.g. to extend the functionality of retimer 210 in some way. SMBus 365 provides a facility for communicating information (e.g. status, configuration, device name, type, etc.) about devices coupled to retimer 210 and also for transmitting commands to said devices. One or more of timer 355, GPIO 360 and SMBus 365 could be omitted, or replaced with another component of similar functionality, without departing from the scope of this disclosure.
[0048] Retimer 210 further includes a plurality of physical layer data lane circuits (PCIe PHYs) 370, e.g. four or eight PHYs. These represent physical-layer components, e.g. a serializer/deserializer (SerDes). PHYs 370 are coupled to APB interconnect 325 to provide a communication path to CPU core 300, as well as any other component of retimer 210 also coupled to APB interconnect 325. PHYs 370 require CPU core 300 to initialise them by providing a PHY configuration data package to the PHYs during a boot process. The configuration data package could be a PCIe PHY configuration data package, for example. This PHY configuration data package could be loaded by CPU core 300 from SPI flash 340, for example. More information on this process is provided later.
[0049] Retimer 210 additionally includes a PCIe switch 375 that is coupled to APB interconnect 325. PCIe switch 375 implements PCIe switching functionality as defined by the relevant part of the PCIe standard. This enables retimer 210 to operate in a PCIe switching mode if desired. It will be appreciated that PCIe switch 375 can be omitted in the case where it is not necessary for retimer 210 to provide a PCIe switching capability.
[0050] Fig. 3 includes a placeholder ‘peripheral N’ 380 that is coupled to APB interconnect 325 to illustrate that retimer 210 is not limited to the specific set of peripherals illustrated in Fig. 3. Additional peripherals coupled to APB interconnect 325 may be added to retimer 210 as desired. Examples include: one or more PCIe Compute Express Links (CXLs), Physical Coding Sublayer (PCS) components, a packet inspecting component, a Joint Test Action Group (JTAG) interface, and/or a high-speed die-to-die interface as described in [Ulrich]. Peripheral N 380 thus represents one or more peripherals.
[0051] Fig. 4 shows one set of possible contents for SPI flash 340. Many variations are possible and it should thus be understood that Fig. 4 is provided with a view to assisting in the understanding of this disclosure rather than restricting its scope.
[0052] SPI flash 340 is split into two regions (a.k.a. partitions) - an active region and an inactive region. Each region corresponds to a set of addresses in SPI flash 340. These addresses do not necessarily need to be continuous - indeed, as illustrated in Fig. 4, they can be interposed between one another. An active region refers to a set of memory addresses that hold information that will be used by CPU core 300 on next boot whereas an inactive region refers to a set of memory addresses that hold information that will not be used by CPU core 300 on next boot. The purpose of this partitioning is to allow updated firmware to be stored in the inactive region without disrupting the operation of the active region. This means that, in the event the updated firmware image is not usable (e.g. it is corrupt or invalid), the retimer can still boot from the existing firmware image stored in the active region.
[0053] The active and inactive statuses are set by one or more flags that are stored in header 400. Header 400 can store any other information that is deemed to be useful, such as the size of each memory region in bits, a starting address of each region, a date on which the SPI flash was last updated, version information, and the like.
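By way of non-limiting illustration, the role of header 400 in selecting the region used on next boot can be sketched as a small data structure. The field names, the use of a Boolean flag per region and the example addresses and sizes below are hypothetical; this disclosure does not prescribe a particular header layout.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """Hypothetical header entry describing one region of the flash."""
    start_address: int
    size_bits: int
    active: bool


@dataclass
class Header:
    """Hypothetical model of the header: region entries plus version info."""
    regions: tuple
    version: str


def select_boot_region(header):
    """Return the single region flagged active, or raise if ambiguous."""
    active = [region for region in header.regions if region.active]
    if len(active) != 1:
        raise ValueError("exactly one region must be flagged active")
    return active[0]


# Assumed example contents.
header = Header(
    regions=(
        Region(start_address=0x001000, size_bits=8 * 0x40000, active=True),
        Region(start_address=0x041000, size_bits=8 * 0x40000, active=False),
    ),
    version="1.0",
)

boot_region = select_boot_region(header)
print(f"booting from region at {boot_region.start_address:#08x}")
```

Rejecting a header in which zero or several regions are flagged active is a design choice of this sketch; an implementation could equally fall back to a default region.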
[0054] The active region includes an active firmware image 405. This is the firmware image that will be used by CPU core 300 the next time retimer 210 is booted. Active firmware image 405 includes a configuration file 410, PHY configuration data package 415 and an application 420. It will be appreciated that this is just one example and that active firmware image 405 could alternatively include different information, or additional information, to that shown in Fig. 4.
[0055] Configuration file 410 stores information that is used by CPU core 300 during a boot process to configure retimer 210. For example, configuration file 410 could include one or more values that are to be respectively written to one or more registers of retimer 210 during the boot process. Protocol-specific information can be stored in configuration file 410, such as one or more PCIe vendor-defined message codes.
[0056] PHY configuration data package 415 serves to configure PHYs 370. PHY configuration data package 415 could be PHY firmware - that is, a smaller firmware image within active firmware image 405 - and/or PHY configuration data (e.g. initial values for one or more registers of the PHY). PHY configuration data package 415 is used to initialise and/or configure PHYs 370, e.g. CPU core 300 provides PHY configuration data package 415 to each of PHYs 370 during a boot process. PHY configuration data package 415 provides a convenient and secure channel for configuring PHYs 370 and updating their firmware, the latter because a new firmware image with updated PHY firmware can be loaded into SPI flash 340. More information is provided later in this specification relating to the process by which PHY configuration data package 415 is transmitted to each of PHYs 370.
[0057] Application 420 is an executable file that is run by CPU core 300 to enable it to boot correctly. During boot, application 420 is loaded by CPU core 300 and executed once loaded, assuming any security checks that are put in place are passed successfully.
[0058] Active firmware image 405 can also include a second stage bootloader (not shown). The second stage bootloader is an application that handles loading of certain items such as a real-time operating system (RTOS), to assist application 420. The second stage bootloader can be omitted if not needed.
[0059] Inactive firmware image 425 has the same structure as active firmware image 405. It also includes a configuration file, the PHY configuration data package and an application as described above. As mentioned earlier, inactive firmware image 425 can differ from active firmware image 405 in aspects such as firmware version - e.g. the PHY configuration data package, configuration file and/or application in inactive firmware image 425 can be a different version than its counterpart in the active firmware image 405.
[0060] Thus far the discussion has been restricted to a single-tile configuration, in which the components of retimer 210 are located on a single die 335 (other than SPI flash 340 which is external to the die). Figs. 5 and 6 show a multi-tile configuration in which a second tile is introduced. The components of the second tile are located on a separate, second die 500. As shown in Fig. 6, the components of the second tile are largely identical to those of the first tile and have been given reference signs with identical suffix to those of Fig. 3 to reflect this. Reference is thus made to the preceding discussion in this regard.
[0061] The first tile is referred to herein as the leader tile (a.k.a. master tile) and the second tile is referred to herein as the follower tile (a.k.a. slave tile). A distinction between the leader tile and follower tile in many embodiments is that some of the components on the follower tile are inactive - i.e. either off entirely, or in a low power state. In one embodiment, at least the CPU core 600 on the follower tile is inactive. The follower tile is thus controlled and configured by CPU core 300 on the leader tile. In another embodiment, the following components are inactive on the follower tile: CPU core 600, boot ROM 605, instruction RAM 610, data RAM 615, IRQ controller 620, OTP memory 630, SPI leader 645, I2C leader 650, timer 655, GPIO 660, SMBus 665 and T2T SPI leader 670. These components are present on the follower tile die as it is easier from a manufacturing perspective to produce identical tiles and designate one as leader and the other as follower. However, alternatively the above-mentioned components could be omitted from the follower tile die. Similarly, the leader tile includes both T2T SPI leader 385 and T2T SPI follower 390, with only the T2T SPI leader 385 being active. As noted above, alternative non-identical manufacture is possible in which only the T2T leader is present on the leader tile and only the T2T follower is present on the follower tile. Further, during die testing, certain die defects that affect leader tile functions/circuits might nonetheless be deemed acceptable for a die to act as a follower tile, thus increasing production yield percentages.
[0062] It is also pointed out that there is no SPI flash (or other external memory) coupled to the follower tile. This is because only the leader tile CPU core 300 is active, hence there is no need to load firmware to inactive CPU core 600 of the follower tile.
[0063] The leader tile and follower tile communicate via a bus that spans both dies 335 and 500 (see Fig. 5). In the case of Figs. 5 and 6 this bus is a tile-to-tile (‘T2T’) SPI bus, but alternative bus types could be used in place of an SPI bus if desired.
[0064] More specifically, the leader tile includes a T2T SPI bus leader 385 that is coupled to a corresponding T2T SPI bus follower 675 on the follower tile via wires extending between the leader and follower tiles. These wires could be circuit traces, for example. Collectively, the T2T SPI leader 385 and T2T SPI follower 675 are referred to herein as the ‘T2T SPI bus’. T2T SPI leader 385 is coupled to APB interconnect 325 to enable communication with other components on the leader tile, e.g. CPU core 300. Similarly, T2T SPI follower 675 is coupled to APB interconnect 625 on the follower tile to enable communication with other components on the follower tile, e.g. PHYs 685, PCIe switch 675 and other peripherals 680.
[0065] Remaining true to the principle of identical tiles, in Figs. 5 and 6 both the T2T SPI leader 670 and T2T SPI follower 675 are shown on the follower tile. However, it should be appreciated that only T2T SPI follower 675 is active on the follower tile of Fig. 6. Similarly, the leader tile includes both T2T SPI leader 385 and T2T SPI follower 390, with only the T2T SPI leader 385 being active. As noted above, alternative non-identical manufacture is possible in which only the T2T leader is present on the leader tile and only the T2T follower is present on the follower tile.
[0066] The follower tile has its own set of follower PHYs 685, PCIe switch 675 and other peripherals 680. These are the same as the corresponding items shown on Fig. 3 and reference is thus made to the discussion above. Follower PHYs 685, PCIe switch 675 and other peripherals 680 can be controlled by the CPU core 300 of the leader tile via the T2T SPI bus and APB interconnect 625 on the follower tile.
[0067] More than one bus can be present that spans both dies to provide multiple channels of communication between the dies. For example, a high speed die-to-die SerDes-based interface as described in [Ulrich] could additionally or alternatively be present. The high-speed interface described in [Ulrich] is a high bandwidth bus that enables relatively large volumes of data to be exchanged between the leader and follower tiles. Other bus types could additionally or alternatively be present, e.g. a Universal Chiplet Interconnect Express (UCIe) bus and/or an I2C bus.
[0068] It is possible to extend the two-tile configuration discussed above to further tiles. A four-tile configuration is shown in Fig. 7. In this configuration there is one leader tile and three follower tiles (tiles 1, 2 and 3). Each of the four tiles is on its own die - leader tile is on die 335, follower tile 1 is on die 500, follower tile 2 is on die 700 and follower tile 3 is on die 700’. Each follower tile is the same as the follower tile shown in Figs. 5 and 6 and as discussed above. The leader tile is the same as discussed above. T2T SPI leader 385 on the leader tile is coupled to the respective T2T SPI follower on each follower tile - i.e. T2T follower 675, 775 and 775’. This enables CPU core 300 to control any component on any of the follower tiles. Although not shown for clarity in Fig. 7, the leader tile and each follower tile has its own PHYs, PCIe switch and/or other peripherals of the type discussed above, which are all controllable by CPU core 300.
[0069] In the general case, it is possible to extend to N tiles with one leader and N-1 follower tiles coupled via an inter-tile bus like the T2T SPI bus described above. Alternative bus types, e.g. I2C or Universal Chiplet Interconnect Express (UCIe), can be used instead of SPI if desired.
[0070] In a boot process of retimer 210 it is necessary to initialise PHYs 370. Initialisation can include loading configuration data for use by PHYs 370, e.g. to set initial values for registers to select an operational mode. Initialisation can additionally or alternatively include loading firmware for use by a processor of each PHY 370, e.g. into an SRAM of each PHY. This can be achieved by transmitting PHY configuration data package 415 to each PHY 370 during a boot process.
[0071] The PHYs 370 may include their own internal memory, e.g. SRAM, and contents of the PHY configuration data package 415 transmitted to the PHYs may be loaded into this internal memory. CPU core 300 does not need to have knowledge of how the PHYs work or what the data package contains - all that is required for CPU core 300 to perform the data package transfer is the data package itself and an address of each PHY to send it to.
[0072] As multiple PHYs are present even in the single tile case, it is necessary to transmit the PHY configuration data package to each PHY. Performing this process sequentially is relatively time consuming and so, to avoid this unwanted delay during boot, a broadcast address or address range is assigned to each of the plurality of PHYs 370 simultaneously. This allows the PHY configuration data package to be transmitted to each PHY of the plurality of PHYs 370 simultaneously, reducing the total time taken to transmit the PHY configuration data package to all of the PHYs. Further information on this ‘broadcast write’ process is provided below.
[0073] A process for sending a PHY configuration data package like package 415 to PHYs 370 in a one-tile system is shown in Fig. 8. This process can be used as part of a boot process. Additionally or alternatively this process can be used in any other context in which it is desirable to provide a PHY configuration data package to all of PHYs 370 simultaneously.
[0074] In step 800, a processor of multi-lane PCIe retimer 210 obtains a PHY configuration data package. The processor could be CPU core 300 or processor 110, for example. The PHY configuration data package could be part of a firmware image, e.g. firmware image 405. The processor could obtain the configuration data package as part of a firmware update operation, for example. The PHY configuration data package can be obtained from a source external to retimer 210, e.g. SPI flash 340. SPI flash 340 can in turn receive the data package from another source, e.g. over a network such as the internet, a cellular network, and the like. The data package can be stored in SPI flash 340 (or an EEPROM, alternatively) before being loaded, e.g. as part of firmware image 405. The loading process can involve one or more security checks to ensure that the firmware image is valid and authentic before it is loaded. In cases where no firmware update is being performed (e.g. during a boot process), the data package can be retrieved from SPI flash 340 or some other memory.
[0075] Where the PHY configuration data package is part of a firmware image, the processor identifies the PHY configuration data package within the firmware image. The PHY configuration data package can be identified by information contained in a header, e.g. header 400. Additional information such as a version number, a target PHY that the PHY configuration data package is suitable for use with, and the like can be included in the data package if desired.
[0076] In step 805, the processor simultaneously writes, via a bus coupled to the processor and to the plurality of PHYs, the configuration data package to the plurality of PHYs using a broadcast address space of the multi-lane PCIe retimer. In the case of the configuration shown in Fig. 3, the bus is APB interconnect 325. This is one possible configuration shown to assist in the understanding of the invention and should not be taken as limiting, as variations to this configuration are possible. The broadcast address space of the multi-lane PCIe retimer comprises at least one broadcast address assigned to all of the plurality of PHYs. That is, any data put onto the bus associated with the broadcast address space will be retrieved by all of the plurality of PHYs 370 because they have all been assigned the at least one broadcast address. This means that the PHY configuration data package can be transmitted to all of PHYs 370 simultaneously, reducing the total time taken to provide the PHY configuration data package to PHYs 370 compared with a set of successive write operations that each write to one of the PHYs 370.
[0077] This should be contrasted with a non-broadcast address space that the multi-lane PCIe retimer also has. The non-broadcast address space comprises a plurality of unique addresses each respectively assigned to a corresponding one of the plurality of PHYs. This means that it is also possible to address a particular one of the plurality of PHYs individually using the respective unique address of the particular PHY. The non-broadcast address space and the broadcast address space can be part of the same global memory map that provides addresses for other components located on the leader tile (and components located on follower tile(s), in multi-tile embodiments) of the multi-lane PCIe retimer.
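The relationship between the broadcast address space and the non-broadcast address space can be illustrated with the following Python sketch, which is a simplified behavioural model rather than a description of APB interconnect 325. The class names `Phy` and `Bus` and all address values are hypothetical. Each modelled PHY accepts a write if the address matches either its own unique address or the shared broadcast address.

```python
class Phy:
    """Behavioural stand-in for one PHY with a small internal memory."""

    def __init__(self, unique_addr, broadcast_addr):
        self.unique_addr = unique_addr
        self.broadcast_addr = broadcast_addr
        self.sram = []

    def on_write(self, addr, word):
        # Accept the word if it targets this PHY alone or all PHYs at once.
        if addr in (self.unique_addr, self.broadcast_addr):
            self.sram.append(word)


class Bus:
    """Behavioural stand-in for a bus that presents each write to every PHY."""

    def __init__(self, phys):
        self.phys = phys

    def write(self, addr, word):
        for phy in self.phys:
            phy.on_write(addr, word)


# Hypothetical addresses: four unique addresses and one shared broadcast address.
phys = [Phy(unique_addr=0x1000 * (i + 1), broadcast_addr=0xF000) for i in range(4)]
bus = Bus(phys)
for word in (0xA1, 0xB2, 0xC3):        # stand-in for the configuration data package
    bus.write(0xF000, word)            # one bus write reaches all four PHYs
bus.write(phys[2].unique_addr, 0xFF)   # a unique address reaches one PHY only
```

With the broadcast address, the three-word package needs three bus writes, whereas writing to each of the four PHYs in succession would need twelve.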
[0078] The at least one broadcast address can comprise a 24-bit address that is assigned to all of the PHYs 370. It will be appreciated that the disclosure is not limited to 24-bit addresses and that addresses of fewer or more bits than 24 bits can alternatively be used. The broadcast address can be offset from a base address in an address space, which may be a 32-bit address space. The base address may be equal to the product of a constant value and an identifier associated with the tile, e.g. a tile number.
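As a worked example of the addressing arithmetic just described, the following Python sketch forms a 32-bit global address from a tile number and a 24-bit offset. The constant `TILE_STRIDE`, the value of `BROADCAST_OFFSET` and the function names are assumptions chosen for illustration; this disclosure does not specify particular values.

```python
TILE_STRIDE = 0x0100_0000       # hypothetical constant: 2**24 addresses per tile
OFFSET_MASK = 0x00FF_FFFF       # a 24-bit offset within a tile
BROADCAST_OFFSET = 0x00F0_0000  # hypothetical 24-bit broadcast address


def tile_base(tile_number):
    """Base address = constant multiplied by the tile identifier."""
    return TILE_STRIDE * tile_number


def global_address(tile_number, offset):
    """Combine a tile base with a 24-bit offset inside a 32-bit address space."""
    if not 0 <= offset <= OFFSET_MASK:
        raise ValueError("offset must fit in 24 bits")
    address = tile_base(tile_number) + offset
    if address > 0xFFFF_FFFF:
        raise ValueError("address does not fit in the 32-bit address space")
    return address


# The same 24-bit broadcast offset appears under every tile base.
for tile in range(4):
    print(hex(global_address(tile, BROADCAST_OFFSET)))
```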
[0079] It is also the case that it can be necessary to transmit data to a plurality of PHYs in a multi-tile configuration. Fig. 9 illustrates those components that are relevant for the discussion of the transmission of data to a plurality of PHYs in a multi-tile configuration and omits those components that are not, for clarity. PHYs located on a follower tile are referred to as ‘follower PHYs’.
[0080] As shown in Fig. 9, T2T SPI leader 385 has a plurality of follower select lines, denoted FS. Each follower select line is coupled to a respective one of the T2T SPI followers 675, 775 and 775’. Boundaries between tiles are shown as dashed lines. Other SPI lines, e.g. clock, Leader Out Follower In (LOFI), Leader In Follower Out (LIFO), etc., are not shown in the interests of clarity, but it will be appreciated that such lines can additionally be present.
[0081] Each SPI follower uses the follower select line to determine whether it is an intended recipient of data currently on the SPI bus. Specifically, an active or ‘asserted’ follower select line informs the corresponding SPI follower that the data currently on the SPI bus is to be read by said SPI follower.
[0082] In the case of a broadcast write, T2T SPI leader 385 simultaneously asserts all follower select lines. This has the effect of causing all of the SPI followers 675, 775 and 775’ to receive data put onto the T2T SPI bus, enabling a broadcast write operation to occur. In the illustrated example the follower select line is active low, but an active high scheme could alternatively be used.
[0083] As shown in Fig. 9, each SPI follower 675, 775, 775’ is coupled to respective PHYs 685, 785, 785’ via respective APB interconnects 625, 725, 725’. The SPI protocol does not support addressing, but the APB protocol does. Part of the data put onto the T2T SPI bus by CPU core 300 is APB address information, and specifically the broadcast address corresponding to all of the follower PHYs 685, 785, 785’ (e.g. 12 or 24 total follower PHYs, depending on the number of PHYs on each follower tile). Each PHY on each follower tile is deliberately assigned the same APB broadcast address or address range, this also being the broadcast address or address range assigned to the PHYs 370 on the leader tile. As this broadcast address space relates to multiple APB interconnects (e.g. 325, 625), the broadcast address space is described herein as being associated with ‘the multi-lane PCIe retimer’. This is to acknowledge that multiple physical instances of an APB bus are associated with the broadcast address space in a multi-tile embodiment.
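The assertion of follower select lines can be sketched in Python as a mapping from a set of targeted followers to the electrical level driven on each line. The function name `select_line_levels` and the zero-based follower indices are hypothetical and are used only for this illustration, which covers both the active low scheme of the illustrated example and the alternative active high scheme.

```python
def select_line_levels(targets, num_followers, active_low=True):
    """Return the level (0 or 1) driven on each follower select line.

    targets is the set of follower indices whose select line is asserted.
    With active_low=True an asserted line is driven to 0 and an idle line to 1.
    """
    levels = []
    for index in range(num_followers):
        asserted = index in targets
        # An asserted line is low when active_low is set, high otherwise.
        levels.append(int(asserted != active_low))
    return levels


print(select_line_levels({0, 1, 2}, 3))               # broadcast: all lines asserted
print(select_line_levels({1}, 3))                     # single follower targeted
print(select_line_levels({1}, 3, active_low=False))   # same target, active high
```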
[0084] The combination of asserting all follower select lines simultaneously and putting data on the T2T SPI bus that includes APB broadcast address(es) simultaneously corresponding to all of the PHYs for all of the follower tiles thus enables all PHYs on all follower tiles to be written to simultaneously. This can reduce the time taken to configure and/or update all of the PHYs. This is desirable in a time-sensitive process such as a boot process.
[0085] Each PHY (including follower PHYs) is also separately assigned a unique address / address range, so that it is also possible to write to one specific PHY if desired. From the perspective of the leader tile processor, the entire multi-tile module has a single global address space that includes separate regions for each PHY as well as a broadcast region for writing to all PHYs and follower PHYs simultaneously.
[0086] The APB address space is a global address space across all tiles. This means it is possible to address any register on any tile via this global address space. One particular configuration provides a base address for each tile that is given by a tile identifier multiplied by a constant. The tile identifier can be a tile number and the constant can be a base address for the leader tile. Other memory space constructions are possible. Each register on each tile has a unique address or address range assigned to it within this global address space. Each PHY of PHYs 370, 685, 785, 785’ thus also has a unique address or address range assigned to it.
[0087] It is possible to perform the broadcast write as a one-step process or a two-step process. Each is described in turn below.
[0088] In a one-step broadcast write process, SPI bus control information is included in the data that is put onto the T2T SPI bus. Assuming for the sake of illustration 24-bit APB addresses and a 32-bit data word size, data put onto the SPI bus can be of the following format. This is referred to herein as a ‘control packet’.
[Figure: bit layout of the 32-bit control packet]
Bits 0-23 are address bits (‘a’), bits 24, 25 and 26 are follower select bits (‘s’) and bits 27-31 are reserved bits (‘r’). In this particular case there are three follower select bits because there are three follower tiles (and hence three T2T SPI followers) in this example. The reserved bits provide space for additional follower select bits - in this case, there are five reserved bits and so up to eight follower select bits can be provided, supporting up to eight follower tiles. The principles established here can be extended to any number of follower tiles by increasing the word size. Other encoding schemes can be used, e.g. the follower select bits specify a value that corresponds to a given follower tile, with one value corresponding to all follower tiles simultaneously. One example of this uses two follower select bits and is as follows:
[Figure: table showing an example encoding using two follower select bits]
[0089] The address bits ‘a’ form an APB address. In the case of a broadcast write, the same APB address is assigned to all of the PHYs on all of the follower tiles simultaneously. The T2T-SPI followers are each configured as bus leader on their respective APB interconnects, enabling each T2T-SPI follower to instruct its respective APB interconnect to perform a write operation to the respective PHYs the APB interconnect is coupled to. In some cases the address data can be omitted because the T2T-SPI bus can auto-increment addresses such that it already knows which address to write data to.
[0090] The follower select bits each respectively correspond to one of the T2T SPI followers 675, 775, 775’. The value of each follower select bit indicates whether the corresponding SPI follower is to receive data or not. Specifically, the T2T SPI leader 385 controls the follower select lines FS1, FS2 and FS3 based on the values of the follower select bits. Each follower select bit corresponds to one of the follower select lines. In the case of a broadcast write, all three of the follower select bits would be set to an ‘active’ value (e.g. ‘1’), leading to all three of the follower select lines being asserted simultaneously. In the case of non-broadcast writes, only the follower select bit(s) corresponding to the follower tile(s) that is/are to be written to would be set to the active value, with the remaining follower select bit(s) being set to an inactive value (e.g. ‘0’).
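The control packet format described above (address bits 0-23, follower select bits 24-26, reserved bits 27-31) can be sketched in Python as a pair of pack and unpack helpers. The function names, and the assumption that bit 24 corresponds to the first follower select line, bit 25 to the second and bit 26 to the third, are illustrative choices and are not stated in this disclosure.

```python
ADDR_BITS = 24       # bits 0-23 carry the APB address
SELECT_SHIFT = 24    # bits 24-26 carry the follower select bits
NUM_FOLLOWERS = 3    # bits 27-31 are reserved and left at zero here


def pack_control(addr, selects):
    """Build a 32-bit control packet from an address and per-follower flags."""
    if not 0 <= addr < (1 << ADDR_BITS):
        raise ValueError("address must fit in 24 bits")
    if len(selects) != NUM_FOLLOWERS:
        raise ValueError("one select flag is required per follower")
    word = addr
    for index, selected in enumerate(selects):
        if selected:
            word |= 1 << (SELECT_SHIFT + index)
    return word


def unpack_control(word):
    """Recover the address and the per-follower flags from a control packet."""
    addr = word & ((1 << ADDR_BITS) - 1)
    selects = [bool((word >> (SELECT_SHIFT + i)) & 1) for i in range(NUM_FOLLOWERS)]
    return addr, selects


broadcast = pack_control(0x123456, [True, True, True])   # all followers selected
single = pack_control(0x000010, [False, True, False])    # second follower only
print(hex(broadcast), hex(single))
```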
[0091] In a two-step broadcast write process, follower select control information is sent to the T2T SPI leader 385 separately from the address data. The follower select information could be sent in-band as illustrated above in the one-step process, or another channel could be used such as a System Management bus (SMBus). The address data can be sent separately and before the PHY configuration data package is transmitted. In some cases the address data can be omitted because the T2T-SPI bus can auto-increment addresses such that it already knows which address to write data to.
[0092] In each of the one-step and two-step broadcast write cases, once the follower select and address information (if required) has been provided, data can be transmitted. The T2T SPI leader 385 can keep the follower select lines asserted until it receives new instructions regarding follower select line configuration. Similarly, the APB bus can continue writing to the address(es) specified (possibly by auto-incrementing) until new addressing information is provided. In this way, the PHY configuration data package can be broadcast to all of the PHYs on all of the follower tiles simultaneously.
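The behaviour described here, in which the follower select configuration is held and the address auto-increments while data is streamed, can be modelled with the following Python sketch. The class names, the `configure` and `send` methods and the 4-byte address stride are hypothetical; the sketch models behaviour only and does not represent SPI or APB signalling.

```python
class FollowerTile:
    """Stand-in for a follower tile; memory models PHY-addressable storage."""

    def __init__(self):
        self.memory = {}

    def apb_write(self, addr, word):
        self.memory[addr] = word


class T2TLeader:
    """Stand-in for a T2T bus leader that holds its last configuration."""

    def __init__(self, followers):
        self.followers = followers
        self.selected = set()
        self.next_addr = 0

    def configure(self, selected, start_addr):
        # Follower select and address information is supplied once.
        self.selected = set(selected)
        self.next_addr = start_addr

    def send(self, word, stride=4):
        # Select lines stay asserted; the address advances automatically.
        for index in self.selected:
            self.followers[index].apb_write(self.next_addr, word)
        self.next_addr += stride


followers = [FollowerTile() for _ in range(3)]
leader = T2TLeader(followers)
leader.configure(selected={0, 1, 2}, start_addr=0x100)
for word in (0xAA, 0xBB, 0xCC):   # stand-in for the configuration data package
    leader.send(word)
```

After the single `configure` call, each data word reaches all three modelled follower tiles without any further control information being sent.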
[0093] A process for transmitting a PHY configuration data package to PHYs of a leader tile and also to follower PHYs of one or more follower tiles in a multi-tile retimer is discussed below in connection with Fig. 10. The multi-tile retimer can include two or more tiles, one of which is a leader tile and the remaining tile(s) is/are follower tiles.
[0094] The configuration of the multi-tile retimer is as illustrated in Figs. 7 and 9, on the understanding that just one follower tile may be present. Specifically, the retimer includes a leader tile having a leader tile bus (e.g. APB interconnect 325) and processor (e.g. CPU core 300) located on the leader tile. The retimer also comprises a follower tile having a plurality of follower PHYs (e.g. PHYs 685) coupled to a follower bus (e.g. APB interconnect 625) located on the follower tile. The leader tile and the follower tile are communicatively coupled via a T2T bus (e.g. the T2T SPI bus discussed above). The T2T bus has a T2T bus leader (e.g. T2T-SPI leader 385) located on the leader tile and a T2T bus follower (e.g. T2T-SPI follower 675) located on the follower tile and coupled to the follower bus.
[0095] In step 1000, the processor simultaneously writes, via a bus coupled to the processor and to the plurality of PHYs, the configuration data package to the plurality of PHYs using a broadcast address space of the multi-lane PCIe retimer. This step is the same as step 805 and thus reference is made to the discussion of step 805 above. Step 1000 is the part of the process that writes the configuration data package to the PHYs on the leader tile.
[0096] In step 1005, the processor simultaneously writes, via the T2T bus, the PHY configuration data package to a plurality of follower PHYs located on a follower tile or tiles using the broadcast address space of the multi-lane PCIe retimer. The PHY configuration data package can be of the type discussed above, e.g. configuration information and/or firmware for PHYs 685, 785 and/or 785’. As discussed above, at least one address of the broadcast address space is assigned to all of the plurality of follower PHYs that are located on the follower tile or tiles.
[0097] In the case where there are multiple follower tiles, step 1005 is performed in respect of each follower tile simultaneously. Reference is again made to the discussion above in connection with Fig. 9. Every follower PHY on all of the follower tiles has the same broadcast address / address range, meaning that every follower PHY on every follower tile is written to simultaneously.
[0098] It is pointed out that writing to the follower tiles involves the T2T bus. This bus is involved to enable the PHY configuration data package to be transmitted from the leader tile to the follower tile(s). At the follower tile(s), the PHY configuration data package is then placed onto the local APB interconnect (e.g. 625, 725, 725’) of the follower tile(s) for onward transmission to the follower PHYs. The techniques discussed above in connection with Fig. 9 can be used to send APB-format data packets over the T2T bus.
[0099] The process of Fig. 10 enables the PHY configuration data package to be transmitted to both PHYs 370 of the leader tile and follower PHYs 685, 785 and/or 785’ of the follower tile(s). The transmission to the leader and follower tiles is performed separately because the leader tile PHYs are directly coupled to the CPU core 300 via APB interconnect 325, such that the T2T bus is not required to communicate with the leader tile PHYs 370. The APB bus broadcast address(es) for the leader tile PHYs 370 can nevertheless be the same as for the follower PHYs 685, 785 and/or 785’ as this simplifies the broadcast write from the perspective of CPU core 300.
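The two-part process of Fig. 10 can be sketched in Python as follows, purely as a behavioural model. The class names `ApbInterconnect` and `T2TBus`, the function `distribute` and the value of `BROADCAST_ADDR` are hypothetical. The sketch writes the package to the leader tile PHYs over the local bus, then to all follower tiles over the tile-to-tile bus, using the same broadcast address in both cases.

```python
BROADCAST_ADDR = 0xF00000  # hypothetical broadcast address shared by all PHYs


class ApbInterconnect:
    """Stand-in for one tile's local bus and the PHYs coupled to it."""

    def __init__(self, num_phys):
        self.phy_sram = [[] for _ in range(num_phys)]

    def write(self, addr, word):
        if addr == BROADCAST_ADDR:
            # Every PHY on this tile has been assigned the broadcast address.
            for sram in self.phy_sram:
                sram.append(word)


class T2TBus:
    """Stand-in for the tile-to-tile bus with all follower selects asserted."""

    def __init__(self, follower_buses):
        self.follower_buses = follower_buses

    def broadcast(self, addr, word):
        # Each follower forwards the write onto its own local bus.
        for bus in self.follower_buses:
            bus.write(addr, word)


def distribute(package, leader_bus, t2t_bus):
    for word in package:                       # corresponds to step 1000
        leader_bus.write(BROADCAST_ADDR, word)
    for word in package:                       # corresponds to step 1005
        t2t_bus.broadcast(BROADCAST_ADDR, word)


leader_bus = ApbInterconnect(num_phys=4)
follower_buses = [ApbInterconnect(num_phys=4) for _ in range(3)]
distribute([0x01, 0x02], leader_bus, T2TBus(follower_buses))
```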
[0100] It is also possible to perform broadcast read operations using the broadcast address space in a multi-tile environment. One scenario in which a broadcast read is considered to be useful is detecting whether any interrupt requests have been raised.
[0101] Fig. 11 indicates how a broadcast read can be performed. The broadcast read operation is targeted at a particular register that is common to all of the tiles in the multi-tile environment. This register can be located on the PHYs of each tile, but does not have to be as any register that is common to all of the tiles can be read using this process.
[0102] In step 1100, a processor (e.g. CPU core 300) of the multi-lane PCIe multi-tile retimer simultaneously transmits a broadcast read operation to a leader tile register located on the leader tile and one or more follower tile registers respectively located on the one or more follower tiles. The broadcast read operation can specify a broadcast read address, the broadcast read address being simultaneously assigned to the leader tile register and to the one or more follower tile registers. Reference is made to the discussion above regarding broadcast writes as the address assigning technique for broadcast reads is the same. As noted in the case of broadcast writes, in some cases the relevant bus (e.g. APB interconnect 325, 625, 725, 725’) supports auto-incrementing such that the broadcast read address does not always need to be provided to the bus.
[0103] The broadcast read address can be a 24-bit address, for example, of the type discussed above. However, this disclosure is not limited in this regard and addresses of other sizes can be used instead.
[0104] In step 1105, a leader tile read result is received from the leader tile register. The leader tile read result is responsive to the broadcast read instruction transmitted in step 1100. The leader tile read result can be received by the processor or by another entity such as a storage buffer or similar.
[0105] In step 1110, one or more follower tile read results are received from respective ones of the one or more follower tile registers. The one or more follower tile read results are responsive to the broadcast read instruction transmitted in step 1100. The result(s) can be received by the processor or by another entity such as a storage buffer or similar. One result is received for each follower tile, e.g. there will be three follower tile read results in the case of the multi-tile arrangement shown in Fig. 7.
[0106] In step 1115, a bitwise OR operation is performed on the leader tile read result and the one or more follower tile read results to generate a resultant read value. The bitwise OR operation is performed using all read results simultaneously - e.g. in the four-tile case of Fig. 7 (one leader tile and three follower tiles), the bitwise OR operation is as follows:
Resultant read value = bitwise OR (leader tile read result, follower tile 1 read result, follower tile 2 read result, follower tile 3 read result)
[0107] The bitwise OR operation serves to generate a resultant read value of 1 if any of the leader tile read result or the follower tile read result(s) are 1. The only case where the resultant read value is 0 is where all of the leader tile read result and follower tile read result(s) are 0. This is useful in certain situations such as determining whether an interrupt request has been raised. In this case the leader tile register and follower tile register(s) can serve as interrupt request registers. The processor can poll these interrupt request registers using the broadcast read process of Fig. 11 and determine whether any of the tiles have raised an interrupt request based on the resultant read value. In the case where the resultant read value indicates that an interrupt request has been raised, the processor can take further action to identify specifics of the interrupt request such as which component generated it. The processor can then handle the interrupt request as it deems appropriate.
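The combination of read results in step 1115 can be illustrated with a short Python sketch. The function names and the example register values are hypothetical. The second helper illustrates one possible follow-up action, namely identifying which tiles returned a non-zero result, and is an assumption about how the processor might proceed rather than a step described in this disclosure.

```python
from functools import reduce
from operator import or_


def resultant_read_value(read_results):
    """Bitwise OR of the leader tile result and all follower tile results."""
    return reduce(or_, read_results, 0)


def tiles_with_pending(read_results):
    """Hypothetical follow-up: indices of tiles whose result is non-zero."""
    return [index for index, value in enumerate(read_results) if value != 0]


# Index 0 is the leader tile; indices 1 to 3 are the follower tiles.
results = [0b0000, 0b0100, 0b0000, 0b0001]
print(bin(resultant_read_value(results)))
print(tiles_with_pending(results))
```

A resultant read value of zero indicates that no tile has a bit set in the register that was read, so no further per-tile reads are needed in that case.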
[0108] It is possible to adapt the broadcast read to operate with just a sub-set of the total set of registers being read, e.g. just registers located on one tile. In this case the other registers not being read can be gated off by the T2T SPI leader 385 to avoid their signal being taken into account when performing the bitwise OR operation. Alternatively, the T2T SPI bus can be configured as active high so that the registers that are not being read are at a low value during the broadcast read operation.
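The gating of registers that are not being read can be modelled in Python as shown below. The function name `gated_read_value` and the boolean enable list are hypothetical; the sketch shows only the logical effect of gating, in that a gated-off result contributes nothing to the bitwise OR.

```python
def gated_read_value(read_results, enabled):
    """Bitwise OR over only those read results whose tile is enabled."""
    value = 0
    for result, is_enabled in zip(read_results, enabled):
        if is_enabled:
            value |= result
    return value


results = [0b0001, 0b0010, 0b0100, 0b1000]
print(gated_read_value(results, [True, True, True, True]))      # all tiles read
print(gated_read_value(results, [False, True, False, False]))   # one tile read
```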
[0109] The broadcast read process described above can reduce the total time taken to identify a certain condition such as a pending interrupt request. This is because the read operation is performed simultaneously across all tiles, rather than sequentially. This can lead to improved error handling responsiveness, for example.
[0110] The process of Fig. 11 can be performed by a multi-lane multi-tile PCIe retimer having a plurality of physical layer data lane leader circuits (leader PHYs) located on a leader tile of the PCIe retimer and a plurality of physical data lane follower circuits (follower PHYs) respectively located on each of one or more follower tiles of the PCIe retimer. The PCIe retimer also has a leader tile register located on the leader tile; and one or more follower tile registers respectively located on the one or more follower tiles; and a processor and a bus coupled to the processor and to the plurality of PHYs.
[0111] The processor of the PCIe retimer is configured to perform the process of Fig. 11. Specifically, the processor is configured to simultaneously transmit a broadcast read instruction to the leader tile register and to the one or more follower tile registers; receive, responsive to the broadcast read instruction, a leader tile read result from the leader tile register; receive, responsive to the broadcast read instruction, one or more follower tile read results from respectively ones of the one or more follower tile registers; and perform a bitwise OR operation on the leader tile read result and the one or more follower tile read results to generate a resultant read value. The processor can additionally be configured to perform any of the additional actions described above in connection with Fig. 11. The processor can be CPU core 300, for example.
[0112] In addition to the broadcast write operation discussed above, or in the alternative to the broadcast write operation, this disclosure also contemplates a so-called multicast operation. A multicast involves transmission of data to more than one PHY simultaneously, but not the entire set of PHYs in a MCM. That is, a multicast transmission targets a proper subset of PHYs within the MCM.
[0113] Fig. 12 illustrates an embodiment of a MCM that is capable of performing multicast operations. Fig. 12 shares common features with Fig. 1 and the numbering of Fig. 1 has been used where appropriate to illustrate this commonality. As in the earlier parts of this disclosure, the MCM of Fig. 12 may provide retimer functionality, although the MCM is not limited to this as it can provide other, different functionality.
[0114] In Fig. 12 PHYs are grouped into groups of two, each referred to as a ‘PHY top’. PHY top 1200 is labelled in Fig. 12 on the understanding that the other elements labelled ‘PHY top’ are identical to PHY top 1200.
[0115] As shown, in the illustrated embodiment each PHY top comprises two PHYs. This number is exemplary as generally speaking a PHY top can comprise any number of PHYs, e.g. three PHYs, four PHYs, and above. The significance of the number of PHYs in a PHY top is that it controls the granularity of a multicast operation, since as described in more detail below, a multicast targets a particular PHY top and consequently all PHYs that are part of said PHY top.
[0116] Fig. 12 shows four PHYs and two PHY tops per tile but these numbers can also be varied. For example, there could alternatively be eight PHYs per tile with four PHY tops each respectively comprising two of said eight PHYs. Further variations, e.g. eight PHYs with two PHY tops respectively comprising four of said eight PHYs are also possible.
[0117] Each PHY is shown in Fig. 12 as having a configuration space ‘cfg’ in the local memory of the PHY. The configuration space contains data related to the configuration of the PHY. This data may be provided to each PHY when the MCM of Fig. 12 is powered on and/or rebooted, etc. It may be necessary to comply with particular timing constraints in a power on or boot sequence, e.g. as prescribed by a protocol such as PCIe. The multicast operation described herein can reduce the total time required to load data into each PHY of a PHY top, e.g. into the configuration space of each said PHY, making it easier to comply with protocol requirements.
[0118] Communication between tiles is enabled by a tile-to-tile interface. In the embodiment of Fig. 12 this interface is a T2T SPI bus of the type described earlier in this specification. Alternative bus types, e.g. UCIe, I2C etc. can be used in place of the T2T SPI bus. The T2T SPI bus functions as described earlier in this specification, with the description being omitted here in the interests of brevity.
[0119] Referring now to Fig. 13, one embodiment of an address space 1300 suitable for use with the embodiment of Fig. 12 is shown in graphical form. Address space 1300 is the address space for the entire MCM. This address space is split into four regions 1305a - 1305d, each respectively corresponding to one of the four tiles 105, 115a - 115c. It will be understood that, in connection with Fig. 13, reference to an ‘address’ can mean a single address or an address range. In the case of an address range, the constituent addresses of the range can be contiguous or non-contiguous.
[0120] Each region 1305a - 1305d contains a unique address (‘PHY i address’, i = 0 to 3) for each PHY to enable one specific PHY to be targeted. Each region 1305a - 1305d also contains an address for each PHY top (‘PHY Top j’, j = 0 or 1) to enable a specific PHY top to be targeted. It will be understood from Fig. 12 that targeting a specific PHY top has the result of targeting the PHY top’s constituent PHYs, so that e.g. targeting PHY Top 0 on Tile 0 via address range 1305a will result in data being transmitted to both PHY 0 and PHY 1 on Tile 0.
[0121] The MCM also includes a broadcast address 1310. This is not associated with any particular tile because, as described earlier, broadcast address 1310 can be used to target all PHYs on all tiles in one operation.
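The three levels of addressing described in connection with Fig. 13 can be sketched as a simple address decoder. The following Python example is a sketch under stated assumptions, not the disclosed address map: the tile, PHY and PHY top counts follow Fig. 12, but every numeric value (`REGION_SIZE`, `BROADCAST_ADDR`, the offsets within a region) and every function name is hypothetical.

```python
TILES = 4                 # leader tile plus three follower tiles, as in Fig. 12
PHYS_PER_TILE = 4
PHYS_PER_TOP = 2
TOPS_PER_TILE = PHYS_PER_TILE // PHYS_PER_TOP
REGION_SIZE = 0x10        # hypothetical size of each per-tile region
BROADCAST_ADDR = 0xFF     # hypothetical value standing in for broadcast address 1310


def phy_addr(tile, phy):
    """Unique address of a single PHY (hypothetical layout)."""
    return tile * REGION_SIZE + phy


def phy_top_addr(tile, top):
    """Multicast address of one PHY top (hypothetical layout)."""
    return tile * REGION_SIZE + PHYS_PER_TILE + top


def decode(addr):
    """Return the set of (tile, phy) pairs targeted by an address."""
    if addr == BROADCAST_ADDR:
        # Broadcast: every PHY on every tile.
        return {(t, p) for t in range(TILES) for p in range(PHYS_PER_TILE)}
    tile, offset = divmod(addr, REGION_SIZE)
    if not 0 <= tile < TILES:
        raise ValueError(f"address {addr:#x} is outside the address space")
    if offset < PHYS_PER_TILE:
        # Single PHY operation.
        return {(tile, offset)}
    top = offset - PHYS_PER_TILE
    if top < TOPS_PER_TILE:
        # Multicast: all constituent PHYs of the addressed PHY top.
        first = top * PHYS_PER_TOP
        return {(tile, first + i) for i in range(PHYS_PER_TOP)}
    raise ValueError(f"address {addr:#x} is not mapped")
```

With this layout, `decode(phy_top_addr(0, 0))` yields PHY 0 and PHY 1 on tile 0, matching the PHY Top 0 example given above.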
[0122] While the addresses have been shown in Fig. 13 as contiguous, this is not essential as noncontiguous addresses can alternatively be used. More generally, alternative address schemes that achieve the three levels of mapping of Fig. 13 - namely, single PHY operations, multicast operations and broadcast operations - are also within the scope of this disclosure.
[0123] Fig. 14 illustrates a process for performing a write operation in a MCM like that shown in Fig. 12 where broadcast and multicast operations are supported. This process is described in the context of a T2T SPI bus as described above, on the understanding that modifications can be made to the process to support a different type of bus.
[0124] In element 1400, the tile(s) that are to be written to are selected. In the case of a T2T SPI bus, this can include setting chip select to ‘active’ in respect of all tiles that are to be written to. If the tile is the leader tile, a loopback path can be used to select this tile. Where the operation is a direct write to a single PHY, chip select is set to select the tile that the single PHY resides on. In the case of a multicast write, chip select is set to target the tile(s) on which the PHY top(s) that are to take part in the write operation reside. For a broadcast write, chip select is set to target all tiles.
[0125] In element 1405, the data is transmitted via the T2T SPI bus, or an alternative bus in the case where a SPI bus is not used. It will be appreciated that the tile selection process of element 1400, e.g. the chip select settings, will control which tile(s) act upon the data.
[0126] In element 1410, the recipient tile(s) each respectively determine whether the address that the data relates to is a broadcast address. In the case where the address is not a broadcast address, the process moves to element 1415 where the recipient tile(s) each write the data to the PHY or PHY top corresponding to the address. The data could be written to a configuration space for each PHY as shown in Fig. 12, for example.
[0127] In the case where the address is a broadcast address then the process moves to element 1420 where the recipient tile(s) each write the data to all of the PHYs located on the respective tile. The data could be written to a configuration space for each PHY as shown in Fig. 12, for example.
[0128] It will be appreciated that the actions specified in elements 1410, 1415 and 1420 are separately carried out for each tile that is selected in element 1400.
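The flow of elements 1400 to 1420 can be modelled in a few lines of Python. This is an illustrative sketch only: the `Tile` class, the `write` function, the `BROADCAST` marker and the tuple form of non-broadcast addresses are assumptions introduced here, and the loop over selected tiles stands in for what the disclosure describes as a simultaneous transfer on the bus.

```python
BROADCAST = "broadcast"  # hypothetical marker standing in for the broadcast address


class Tile:
    """Model of one tile holding a configuration space per PHY (assumed layout)."""

    def __init__(self, phys_per_tile=4, phys_per_top=2):
        self.cfg = [None] * phys_per_tile
        self.phys_per_top = phys_per_top

    def receive(self, address, data):
        # Element 1410: is the address a broadcast address?
        if address == BROADCAST:
            # Element 1420: write to all PHYs located on this tile.
            targets = range(len(self.cfg))
        else:
            # Element 1415: write to the addressed PHY or PHY top.
            kind, index = address  # ("phy", i) or ("top", j), a hypothetical encoding
            if kind == "phy":
                targets = [index]
            elif kind == "top":
                start = index * self.phys_per_top
                targets = range(start, start + self.phys_per_top)
            else:
                raise ValueError(f"unknown address kind: {kind!r}")
        for phy in targets:
            self.cfg[phy] = data


def write(tiles, selected, address, data):
    """Elements 1400 and 1405: select tiles, then transmit the data.

    `selected` lists the tile indices whose chip select is active; only
    those tiles act upon the transmitted data.
    """
    for index in selected:
        tiles[index].receive(address, data)
```

For example, `write(tiles, [1], ("top", 1), data)` models a multicast write to the second PHY top of tile 1, while `write(tiles, range(len(tiles)), BROADCAST, data)` models a broadcast write.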
[0129] In addition to the embodiments described above, the following clauses set out further embodiments of the disclosure.
[0130] Clause 1: A method, comprising: obtaining, by a processor of a multi-chip module (MCM) having a plurality of tiles and a plurality of physical layer data lane circuits (PHYs) distributed across the plurality of tiles, a PHY configuration data package; and simultaneously writing, by the processor, via a bus coupled to the processor and to the plurality of PHYs, the configuration data package to the plurality of PHYs using a broadcast address space of the MCM, the broadcast address space of the MCM comprising at least one address assigned to all of the plurality of PHYs, the MCM also having a non-broadcast address space comprising a plurality of unique addresses each respectively assigned to a corresponding one of the plurality of PHYs.
[0131] Clause 2: The method of clause 1, wherein obtaining the PHY configuration data package comprises receiving the PHY configuration data package within a firmware image as part of a firmware update operation.
[0132] Clause 3: The method of clause 1 or clause 2, further comprising: simultaneously writing, by the processor via a tile-to-tile (T2T) bus coupled to a leader tile of the plurality of tiles and to a follower tile of the plurality of tiles, the PHY configuration data package to at least one follower PHY of the plurality of PHYs located on the follower tile using the broadcast address space of the MCM, at least one of the plurality of PHYs located on the leader tile of the MCM and the non-broadcast address space comprising at least one unique address assigned to the tile-to-tile bus and further comprising at least one unique address respectively assigned to the at least one follower PHY.
[0133] Clause 4: The method of clause 1 or clause 2, further comprising: simultaneously writing, by the processor via a tile-to-tile (T2T) bus coupled to a leader tile of the MCM and to a plurality of follower tiles of the plurality of tiles, the configuration data package to a plurality of follower PHYs respectively located on the plurality of follower tiles using the broadcast address space of the MCM, a plurality of the plurality of PHYs located on the leader tile of the MCM and the non-broadcast address space comprising at least one unique address assigned to the tile-to-tile bus and further comprising a plurality of unique addresses each respectively assigned to a corresponding one of the plurality of follower PHYs.
[0134] Clause 5: The method of clause 4, wherein the T2T bus is a Serial Peripheral Interface (SPI) bus having a T2T bus leader located on the leader tile and a plurality of T2T bus followers respectively located on the plurality of follower tiles, and wherein simultaneously writing the configuration data package to the plurality of follower PHYs comprises: simultaneously asserting, by the T2T bus leader, all of a plurality of follower select lines of the T2T bus, the plurality of follower select lines respectively coupled to the plurality of T2T bus followers; and transmitting, by the T2T bus leader, the PHY configuration data package whilst all of the plurality of follower select lines are simultaneously asserted.
[0135] Clause 6: The method of clause 4, wherein the T2T bus is an SPI bus having a T2T bus leader located on the leader tile and a plurality of T2T bus followers respectively located on the plurality of follower tiles, and wherein simultaneously writing the configuration data package to the plurality of follower PHYs comprises: setting, by the processor, a plurality of follower select bits of a control packet to an active value, each follower select bit respectively corresponding to one of the plurality of T2T bus followers; transmitting, by the processor, the control packet to the T2T bus leader; reading, by the T2T bus leader, the follower select bits of the control packet; simultaneously asserting, by the T2T bus leader, each of a plurality of follower select lines of the T2T bus based on the respective values of the follower select bits; and transmitting, by the T2T bus leader, the PHY configuration data package whilst each of the plurality of follower select lines is simultaneously asserted.
[0136] Clause 7: The method of any preceding clause, wherein the obtaining the PHY configuration data package further comprises: receiving a firmware image including the PHY configuration data package; writing, by the processor, the firmware image to a non-volatile memory; retrieving, by the processor, firmware authentication information from a read-only memory of the retimer; authenticating, by the processor, the firmware image using the firmware authentication information; and selectively preventing execution of the firmware image and selectively preventing distribution of the PHY data package based on the authenticating.
[0137] Clause 8: An apparatus, comprising: a multi-chip module (MCM) having a plurality of tiles and a plurality of physical layer data lane circuits (PHYs) distributed across the plurality of tiles, a processor and a bus coupled to the processor and to the plurality of PHYs, the processor configured to: obtain a PHY configuration data package; and simultaneously write, via the bus, the configuration data package to the plurality of PHYs using a broadcast address space of the MCM, the broadcast address space of the MCM comprising at least one address assigned to all of the plurality of PHYs, the MCM also having a non-broadcast address space comprising a plurality of unique addresses each respectively assigned to a corresponding one of the plurality of PHYs.
[0138] Clause 9: The apparatus of clause 8, wherein the processor is further configured to receive the PHY configuration data package within a firmware image as part of a firmware update operation.
[0139] Clause 10: The apparatus of clause 8 or clause 9, further comprising: a tile-to-tile (T2T) bus coupled to a leader tile of the plurality of tiles and to a follower tile of the plurality of tiles; wherein: at least one leader PHY of the plurality of PHYs is located on the leader tile and at least one follower PHY of the plurality of PHYs is located on the follower tile; and the processor is further configured to simultaneously write the configuration data package to the at least one follower PHY via the T2T bus and using the broadcast address space of the MCM, and the non-broadcast address space further comprises at least one unique address assigned to the tile-to-tile bus and at least one address respectively assigned to the at least one follower PHY.
[0140] Clause 11: The apparatus of clause 8 or clause 9, further comprising: a tile-to-tile (T2T) bus coupled to a leader tile of the plurality of tiles and to a plurality of follower tiles of the plurality of tiles; wherein: at least one leader PHY of the plurality of PHYs is located on the leader tile and at least one follower PHY of the plurality of PHYs is respectively located on each of the plurality of follower tiles such that the MCM comprises a plurality of follower PHYs; the processor is further configured to simultaneously write the configuration data package to the plurality of follower PHYs via the T2T bus and using the broadcast address space of the MCM, and the non-broadcast address space further comprises at least one unique address assigned to the tile-to-tile bus and a plurality of unique addresses each respectively assigned to a corresponding one of the plurality of follower PHYs.
[0141] Clause 12: The apparatus of clause 11, wherein the T2T bus is a Serial Peripheral Interface (SPI) bus, the SPI bus comprising: a T2T bus leader located on the leader tile and coupled to a plurality of follower select lines; and a plurality of T2T bus followers respectively located on the plurality of follower tiles and respectively coupled to the plurality of follower select lines; wherein the T2T bus leader is configured to: simultaneously assert all of the plurality of follower select lines; and transmit the PHY configuration data package whilst all of the plurality of follower select lines are simultaneously asserted.
[0142] Clause 13: The apparatus of clause 11, wherein the T2T bus is an SPI bus, the SPI bus comprising: a T2T bus leader located on the leader tile and coupled to a plurality of follower select lines; and a plurality of T2T bus followers respectively located on the plurality of follower tiles and respectively coupled to the plurality of follower select lines; wherein the processor is further configured to: set a plurality of follower select bits of a control packet to an active value, each follower select bit respectively corresponding to one of the plurality of T2T bus followers; and transmit the control packet to the T2T bus leader; wherein the T2T bus leader is further configured to: simultaneously assert each of the plurality of follower select lines based on the respective values of the follower select bits; and transmit the PHY configuration data package whilst each of the plurality of follower select lines is simultaneously asserted.
[0143] Clause 14: The apparatus of any one of clauses 8 to 13, further comprising: a non-volatile memory; and a read-only memory of the MCM; wherein the processor is configured to: receive a firmware image including the PCIe PHY configuration data package; write the firmware image to the non-volatile memory; receive firmware authentication information from the read-only memory; authenticate the firmware image using the firmware authentication information; and selectively prevent execution of the firmware image and selectively prevent distribution of the PHY data package based on the authentication.
[0144] Clause 15: A method, comprising: simultaneously transmitting, by a processor of a multichip module (MCM) having a leader tile having a plurality of physical data lane leader circuits (leader PHYs) and one or more follower tiles each having a respective plurality of physical data lane follower circuits (follower PHYs), a broadcast read instruction to a leader tile register located on the leader tile and to one or more follower tile registers respectively located on the one or more follower tiles; receiving, responsive to the broadcast read instruction, a leader tile read result from the leader tile register; receiving, responsive to the broadcast read instruction, one or more follower tile read results from respective ones of the one or more follower tile registers; and performing a bitwise OR operation on the leader tile read result and the one or more follower tile read results to generate a resultant read value.
[0145] Clause 16: The method of clause 15, wherein the broadcast read operation specifies a broadcast read address, the broadcast read address being simultaneously assigned to the leader tile register and the one or more follower tile registers.
[0146] Clause 17: The method of clause 14 or 15, wherein the leader tile register and the one or more follower tile registers are interrupt request registers, the method further comprising: identifying, by the processor, that an interrupt request has been raised based on the resultant read value.
[0147] Clause 18: An apparatus, comprising: a multi-chip module (MCM) having a plurality of physical layer data lane leader circuits (leader PHYs) located on a leader tile of the MCM and a plurality of physical data lane follower circuits (follower PHYs) respectively located on each of one or more follower tiles of the MCM; a leader tile register located on the leader tile; one or more follower tile registers respectively located on the one or more follower tiles; and a processor and a bus coupled to the processor and to the plurality of PCIe PHYs; wherein the processor is configured to: simultaneously transmit a broadcast read instruction to the leader tile register and to the one or more follower tile registers; receive, responsive to the broadcast read instruction, a leader tile read result from the leader tile register; receive, responsive to the broadcast read instruction, one or more follower tile read results from respective ones of the one or more follower tile registers; and perform a bitwise OR operation on the leader tile read result and the one or more follower tile read results to generate a resultant read value.
[0148] Clause 19: The apparatus of clause 18, wherein the broadcast read operation specifies a broadcast read address, the broadcast read address being simultaneously assigned to the leader tile register and the one or more follower tile registers.
[0149] Clause 20: The apparatus of clause 14 or 15, wherein the leader tile register and the one or more follower tile registers are interrupt request registers, the processor further configured to: identify that an interrupt request has been raised based on the resultant read value.
Clause 21: A method, comprising: obtaining, by a processor of a chip comprising a tile and a plurality of physical layer data lane circuits (PHYs) located on the tile, a PHY configuration data package; and writing, by the processor, via a bus coupled to the processor and to the plurality of PHYs, the configuration data package to a proper subset of the plurality of PHYs using a multicast address space of the tile, the multicast address space of the tile comprising at least one address assigned to the proper subset of the plurality of PHYs, the tile also having a non-broadcast address space comprising a plurality of unique addresses each respectively assigned to a corresponding one of the plurality of PHYs, the proper subset of the plurality of PHYs comprising at least two PHYs.
[0150] Clause 22: The method of clause 21 wherein the tile also has a broadcast address space comprising an address assigned to all of the plurality of PHYs.
[0151] Clause 23: A method, comprising: obtaining, by a processor of a multi-chip module (MCM) comprising a plurality of tiles and a plurality of physical layer data lane circuits (PHYs) distributed across the plurality of tiles, a PHY configuration data package; and writing, by the processor, via a bus coupled to the processor and to the plurality of tiles, the configuration data package to a proper subset of the plurality of PHYs using a multicast address space of the MCM, the multicast address space of the MCM comprising at least one address assigned to the proper subset of the plurality of PHYs, the MCM also having a non-broadcast address space comprising a plurality of unique addresses each respectively assigned to a corresponding one of the plurality of PHYs, the proper subset of the plurality of PHYs comprising at least two PHYs.
[0152] Clause 24: The method of clause 23 wherein the MCM also has a broadcast address space comprising an address assigned to all of the plurality of PHYs.
[0153] Clause 25: An apparatus, comprising: a tile comprising a processor, a plurality of physical layer data lane circuits (PHYs) and a bus coupling the processor to each of the plurality of PHYs, the processor configured to: obtain a PHY configuration data package; and write the PHY configuration data package to a proper subset of the plurality of PHYs using a multicast address space of the tile, the multicast address space of the tile comprising at least one address assigned to the proper subset of the plurality of PHYs, the tile also having a non-broadcast address space comprising a plurality of unique addresses each respectively assigned to a corresponding one of the plurality of PHYs, the proper subset of the plurality of PHYs comprising at least two PHYs.
[0154] Clause 26: The apparatus of clause 25 wherein the tile also has a broadcast address space comprising an address assigned to all of the plurality of PHYs.
[0155] Clause 27: An apparatus, comprising: a multi-chip module (MCM) comprising a plurality of tiles, a processor on a leader tile of the plurality of tiles, a plurality of physical layer data lane circuits (PHYs) distributed across the plurality of tiles and a bus coupling the processor to each of the plurality of tiles, the processor configured to: obtain a PHY configuration data package; and write the PHY configuration data package to a proper subset of the plurality of PHYs using a multicast address space of the MCM, the multicast address space of the MCM comprising at least one address assigned to the proper subset of the plurality of PHYs, the MCM also having a non-broadcast address space comprising a plurality of unique addresses each respectively assigned to a corresponding one of the plurality of PHYs, the proper subset of the plurality of PHYs comprising at least two PHYs.
[0156] Clause 28: The apparatus of clause 27 wherein the MCM also has a broadcast address space comprising an address assigned to all of the plurality of PHYs.
[0157] It will be apparent to a person skilled in the art having the benefit of the present disclosure that various modifications, extensions, substitutions and the like to the subject matter described herein are possible. Such changes are also within the scope of this disclosure. It is also noted that, where method steps are described, these steps can be performed in any order unless expressly stated otherwise.

Claims

1. A method, comprising: obtaining, by a processor of a multi-chip module (MCM) having a plurality of tiles and a plurality of physical layer data lane circuits (PHYs) distributed across the plurality of tiles, a PHY configuration data package; and simultaneously writing, by the processor, via a bus coupled to the processor and to the plurality of PHYs, the configuration data package to the plurality of PHYs using a broadcast address space of the MCM, the broadcast address space of the MCM comprising at least one address assigned to all of the plurality of PHYs, the MCM also having a non-broadcast address space comprising a plurality of unique addresses each respectively assigned to a corresponding one of the plurality of PHYs.
2. The method of claim 1, wherein obtaining the PHY configuration data package comprises receiving the PHY configuration data package within a firmware image as part of a firmware update operation.
3. The method of claim 1, further comprising: simultaneously writing, by the processor via a tile-to-tile (T2T) bus coupled to a leader tile of the plurality of tiles and to a follower tile of the plurality of tiles, the PHY configuration data package to at least one follower PHY of the plurality of PHYs located on the follower tile using the broadcast address space of the MCM, at least one of the plurality of PHYs located on the leader tile of the MCM and the non-broadcast address space comprising at least one unique address assigned to the tile-to-tile bus and further comprising at least one unique address respectively assigned to the at least one follower PHY.
4. The method of claim 1, further comprising: simultaneously writing, by the processor via a tile-to-tile (T2T) bus coupled to a leader tile of the MCM and to a plurality of follower tiles of the plurality of tiles, the configuration data package to a plurality of follower PHYs respectively located on the plurality of follower tiles using the broadcast address space of the MCM, a plurality of the plurality of PHYs located on the leader tile of the MCM and the non-broadcast address space comprising at least one unique address assigned to the tile-to-tile bus and further comprising a plurality of unique addresses each respectively assigned to a corresponding one of the plurality of follower PHYs.
5. The method of claim 4, wherein the T2T bus is a Serial Peripheral Interface (SPI) bus having a T2T bus leader located on the leader tile and a plurality of T2T bus followers respectively located on the plurality of follower tiles, and wherein simultaneously writing the configuration data package to the plurality of follower PHYs comprises: simultaneously asserting, by the T2T bus leader, all of a plurality of follower select lines of the T2T bus, the plurality of follower select lines respectively coupled to the plurality of T2T bus followers; and transmitting, by the T2T bus leader, the PHY configuration data package whilst all of the plurality of follower select lines are simultaneously asserted.
6. The method of claim 4, wherein the T2T bus is an SPI bus having a T2T bus leader located on the leader tile and a plurality of T2T bus followers respectively located on the plurality of follower tiles, and wherein simultaneously writing the configuration data package to the plurality of follower PHYs comprises: setting, by the processor, a plurality of follower select bits of a control packet to an active value, each follower select bit respectively corresponding to one of the plurality of T2T bus followers; transmitting, by the processor, the control packet to the T2T bus leader; reading, by the T2T bus leader, the follower select bits of the control packet; simultaneously asserting, by the T2T bus leader, each of a plurality of follower select lines of the T2T bus based on the respective values of the follower select bits; and transmitting, by the T2T bus leader, the PHY configuration data package whilst each of the plurality of follower select lines is simultaneously asserted.
7. The method of claim 1, wherein the obtaining the PHY configuration data package further comprises: receiving a firmware image including the PHY configuration data package; writing, by the processor, the firmware image to a non-volatile memory; retrieving, by the processor, firmware authentication information from a read-only memory of the retimer; authenticating, by the processor, the firmware image using the firmware authentication information; and selectively preventing execution of the firmware image and selectively preventing distribution of the PHY data package based on the authenticating.
8. An apparatus, comprising: a multi-chip module (MCM) having a plurality of tiles and a plurality of physical layer data lane circuits (PHYs) distributed across the plurality of tiles, a processor and a bus coupled to the processor and to the plurality of PHYs, the processor configured to: obtain a PHY configuration data package; and simultaneously write, via the bus, the configuration data package to the plurality of PHYs using a broadcast address space of the MCM, the broadcast address space of the MCM comprising at least one address assigned to all of the plurality of PHYs, the MCM also having a non-broadcast address space comprising a plurality of unique addresses each respectively assigned to a corresponding one of the plurality of PHYs.
9. The apparatus of claim 8, wherein the processor is further configured to receive the PHY configuration data package within a firmware image as part of a firmware update operation.
10. The apparatus of claim 8, further comprising: a tile-to-tile (T2T) bus coupled to a leader tile of the plurality of tiles and to a follower tile of the plurality of tiles; wherein: at least one leader PHY of the plurality of PHYs is located on the leader tile and at least one follower PHY of the plurality of PHYs is located on the follower tile; and the processor is further configured to simultaneously write the configuration data package to the at least one follower PHY via the T2T bus and using the broadcast address space of the MCM, and the non-broadcast address space further comprises at least one unique address assigned to the tile-to-tile bus and at least one address respectively assigned to the at least one follower PHY.
11. The apparatus of claim 8, further comprising: a tile-to-tile (T2T) bus coupled to a leader tile of the plurality of tiles and to a plurality of follower tiles of the plurality of tiles; wherein: at least one leader PHY of the plurality of PHYs is located on the leader tile and at least one follower PHY of the plurality of PHYs is respectively located on each of the plurality of follower tiles such that the MCM comprises a plurality of follower PHYs; the processor is further configured to simultaneously write the configuration data package to the plurality of follower PHYs via the T2T bus and using the broadcast address space of the MCM, and the non-broadcast address space further comprises at least one unique address assigned to the tile-to-tile bus and a plurality of unique addresses each respectively assigned to a corresponding one of the plurality of follower PHYs.
12. The apparatus of claim 11, wherein the T2T bus is a Serial Peripheral Interface (SPI) bus, the SPI bus comprising: a T2T bus leader located on the leader tile and coupled to a plurality of follower select lines; and a plurality of T2T bus followers respectively located on the plurality of follower tiles and respectively coupled to the plurality of follower select lines; wherein the T2T bus leader is configured to: simultaneously assert all of the plurality of follower select lines; and transmit the PHY configuration data package whilst all of the plurality of follower select lines are simultaneously asserted.
13. The apparatus of claim 11, wherein the T2T bus is an SPI bus, the SPI bus comprising: a T2T bus leader located on the leader tile and coupled to a plurality of follower select lines; and a plurality of T2T bus followers respectively located on the plurality of follower tiles and respectively coupled to the plurality of follower select lines; wherein the processor is further configured to: set a plurality of follower select bits of a control packet to an active value, each follower select bit respectively corresponding to one of the plurality of T2T bus followers; and transmit the control packet to the T2T bus leader; wherein the T2T bus leader is further configured to: simultaneously assert each of the plurality of follower select lines based on the respective values of the follower select bits; and transmit the PHY configuration data package whilst each of the plurality of follower select lines is simultaneously asserted.
14. The apparatus of claim 8, further comprising: a non-volatile memory; and a read-only memory of the MCM; wherein the processor is configured to: receive a firmware image including the PHY configuration data package; write the firmware image to the non-volatile memory; receive firmware authentication information from the read-only memory; authenticate the firmware image using the firmware authentication information; and selectively prevent execution of the firmware image and selectively prevent distribution of the PHY data package based on the authentication.
15. A method, comprising: simultaneously transmitting, by a processor of a multi-chip module (MCM) having a leader tile having a plurality of physical data lane leader circuits (leader PHYs) and one or more follower tiles each having a respective plurality of physical data lane follower circuits (follower PHYs), a broadcast read instruction to a leader tile register located on the leader tile and to one or more follower tile registers respectively located on the one or more follower tiles; receiving, responsive to the broadcast read instruction, a leader tile read result from the leader tile register; receiving, responsive to the broadcast read instruction, one or more follower tile read results from respective ones of the one or more follower tile registers; and performing a bitwise OR operation on the leader tile read result and the one or more follower tile read results to generate a resultant read value.
16. The method of claim 15, wherein the broadcast read operation specifies a broadcast read address, the broadcast read address being simultaneously assigned to the leader tile register and the one or more follower tile registers.
17. The method of claim 15, wherein the leader tile register and the one or more follower tile registers are interrupt request registers, the method further comprising: identifying, by the processor, that an interrupt request has been raised based on the resultant read value.
18. An apparatus, comprising: a multi-chip module (MCM) having a plurality of physical layer data lane leader circuits (leader PHYs) located on a leader tile of the MCM and a plurality of physical data lane follower circuits (follower PHYs) respectively located on each of one or more follower tiles of the MCM; a leader tile register located on the leader tile; one or more follower tile registers respectively located on the one or more follower tiles; and a processor and a bus coupled to the processor and to the plurality of PHYs; wherein the processor is configured to: simultaneously transmit a broadcast read instruction to the leader tile register and to the one or more follower tile registers; receive, responsive to the broadcast read instruction, a leader tile read result from the leader tile register; receive, responsive to the broadcast read instruction, one or more follower tile read results from respective ones of the one or more follower tile registers; and perform a bitwise OR operation on the leader tile read result and the one or more follower tile read results to generate a resultant read value.
19. The apparatus of claim 18, wherein the broadcast read operation specifies a broadcast read address, the broadcast read address being simultaneously assigned to the leader tile register and the one or more follower tile registers.
20. The apparatus of claim 18, wherein the leader tile register and the one or more follower tile registers are interrupt request registers, the processor further configured to: identify that an interrupt request has been raised based on the resultant read value.
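
By way of non-limiting illustration only, the select-bit broadcast write of claim 13 and the broadcast read with bitwise OR of claims 15 to 20 may be modeled in software as in the following minimal sketch. The sketch is not part of the claims and does not describe the actual hardware. The names Tile, T2TBusLeader, apply_control_packet, transmit and broadcast_read are hypothetical, as are the register widths and the assumption that bit i of the control packet drives follower select line i.

```python
from dataclasses import dataclass, field
from functools import reduce
from operator import or_


@dataclass
class Tile:
    """Hypothetical model of one tile: one interrupt request register, one PHY config slot."""
    name: str
    irq_register: int = 0
    phy_config: bytes = b""


@dataclass
class T2TBusLeader:
    """Hypothetical model of an SPI-style T2T bus leader with one select line per follower."""
    followers: list
    select_lines: list = field(default_factory=list)

    def apply_control_packet(self, select_bits: int) -> None:
        # Assumption: bit i of the control packet corresponds to follower select line i.
        self.select_lines = [
            bool((select_bits >> i) & 1) for i in range(len(self.followers))
        ]

    def transmit(self, package: bytes) -> None:
        # Every follower whose select line is asserted receives the same package.
        for tile, selected in zip(self.followers, self.select_lines):
            if selected:
                tile.phy_config = package


def broadcast_read(leader: Tile, followers: list) -> int:
    """Combine the leader and follower read results with a bitwise OR."""
    results = [leader.irq_register] + [tile.irq_register for tile in followers]
    return reduce(or_, results, 0)


# Example: broadcast a package to three followers, then poll for interrupts.
leader = Tile("leader")
followers = [Tile(f"follower{i}") for i in range(3)]
bus = T2TBusLeader(followers)
bus.apply_control_packet(0b111)   # assert all three follower select lines
bus.transmit(b"\x01\x02")         # one transfer reaches every selected follower

followers[1].irq_register = 0b0100
resultant_read_value = broadcast_read(leader, followers)
interrupt_raised = resultant_read_value != 0
```

In this model a non-zero resultant read value indicates that at least one tile has raised an interrupt request; identifying which tile raised it would require a further read using the unique, non-broadcast addresses.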
PCT/US2023/078008 2022-10-27 2023-10-27 Firmware broadcast in a multi-chip module WO2024092188A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202263381264P 2022-10-27 2022-10-27
US63/381,264 2022-10-27

Publications (1)

Publication Number Publication Date
WO2024092188A1 (en) 2024-05-02

Family

ID=88965105

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2023/078008 WO2024092188A1 (en) 2022-10-27 2023-10-27 Firmware broadcast in a multi-chip module

Country Status (1)

Country Link
WO (1) WO2024092188A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150095630A1 (en) * 2013-09-30 2015-04-02 Apple Inc. Global configuration broadcast
US9288082B1 (en) 2010-05-20 2016-03-15 Kandou Labs, S.A. Circuits for efficient detection of vector signaling codes for chip-to-chip communication using sums of differences
EP3859542A1 (en) * 2020-01-31 2021-08-04 Infineon Technologies AG Spi broadcast mode

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
REGENSCHEID ANDREW: "BIOS Protection Guidelines for Servers", NATIONAL INSTITUTE OF STANDARDS AND TECHNOLOGY, 1 August 2014 (2014-08-01), XP093082957, Retrieved from the Internet <URL:https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=75d8fd4a96bfe9401ea3310d330359cb23536a39> [retrieved on 20230918], DOI: 10.6028/NIST.SP.800-147B *

Similar Documents

Publication Publication Date Title
US11386033B2 (en) Extending multichip package link off package
US7249209B2 (en) System and method for dynamically allocating inter integrated circuits addresses to multiple slaves
US7340548B2 (en) On-chip bus
TWI331281B (en) Method and apparatus for shared i/o in a load/store fabric
US8812758B2 (en) Mechanism to flexibly support multiple device numbers on point-to-point interconnect upstream ports
US7979592B1 (en) Virtualization bridge device
US8103803B2 (en) Communication between a processor and a controller
US20090003335A1 (en) Device, System and Method of Fragmentation of PCI Express Packets
US10282341B2 (en) Method, apparatus and system for configuring a protocol stack of an integrated circuit chip
EP3465453B1 (en) Reduced pin count interface
US8291146B2 (en) System and method for accessing resources of a PCI express compliant device
US9910814B2 (en) Method, apparatus and system for single-ended communication of transaction layer packets
US20150339253A1 (en) Electronic device with enhanced management data input/output control
US7096290B2 (en) On-chip high speed data interface
US20120324078A1 (en) Apparatus and method for sharing i/o device
WO2024092188A1 (en) Firmware broadcast in a multi-chip module
JP2008502977A (en) Interrupt method for bus controller
US20040098530A1 (en) Flexible data transfer to and from external device of system-on-chip
US20230315591A1 (en) PCIe DEVICE AND COMPUTING SYSTEM INCLUDING THE SAME
US20230350824A1 (en) Peripheral component interconnect express device and operating method thereof
US20220382362A1 (en) Peripheral component interconnect express (pcie) interface device and method of operating the same
WO2024102916A1 (en) Root complex switching across inter-die data interface to multiple endpoints