WO2021190203A1 - Processor clock system, child node circuit in clock system, and electronic device - Google Patents

Processor clock system, child node circuit in clock system, and electronic device

Info

Publication number
WO2021190203A1
Authority
WO
WIPO (PCT)
Prior art keywords
clock
chip
circuit
grid
sub
Prior art date
Application number
PCT/CN2021/076686
Other languages
English (en)
French (fr)
Inventor
尹文
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to EP21775322.7A (EP4095646A4)
Publication of WO2021190203A1
Priority to US17/947,699 (US20230013151A1)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/04 Generating or distributing clock signals or signals derived directly therefrom
    • G06F1/06 Clock generators producing several clock signals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/04 Generating or distributing clock signals or signals derived directly therefrom
    • G06F1/10 Distribution of clock signals, e.g. skew
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/04 Generating or distributing clock signals or signals derived directly therefrom
    • G06F1/08 Clock generators with changeable or programmable clock frequency
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26 Power supply means, e.g. regulation thereof
    • G06F1/32 Means for saving power
    • G06F1/3203 Power management, i.e. event-based initiation of a power-saving mode
    • G06F1/3234 Power saving characterised by the action undertaken
    • G06F1/3237 Power saving characterised by the action undertaken by disabling clock generation or distribution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26 Power supply means, e.g. regulation thereof
    • G06F1/32 Means for saving power
    • G06F1/3203 Power management, i.e. event-based initiation of a power-saving mode
    • G06F1/3234 Power saving characterised by the action undertaken
    • G06F1/324 Power saving characterised by the action undertaken by lowering clock frequency
    • H ELECTRICITY
    • H01 ELECTRIC ELEMENTS
    • H01L SEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
    • H01L23/00 Details of semiconductor or other solid state devices
    • H01L23/48 Arrangements for conducting electric current to or from the solid state body in operation, e.g. leads, terminal arrangements; Selection of materials therefor
    • H01L23/488 Arrangements for conducting electric current to or from the solid state body in operation, e.g. leads, terminal arrangements; Selection of materials therefor consisting of soldered or bonded constructions
    • H01L23/498 Leads, i.e. metallisations or lead-frames on insulating substrates, e.g. chip carriers
    • H01L23/49827 Via connections through the substrates, e.g. pins going through the substrate, coaxial cables
    • H ELECTRICITY
    • H01 ELECTRIC ELEMENTS
    • H01L SEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
    • H01L24/00 Arrangements for connecting or disconnecting semiconductor or solid-state bodies; Methods or apparatus related thereto
    • H01L24/01 Means for bonding being attached to, or being formed on, the surface to be connected, e.g. chip-to-package, die-attach, "first-level" interconnects; Manufacturing methods related thereto
    • H01L24/10 Bump connectors; Manufacturing methods related thereto
    • H01L24/12 Structure, shape, material or disposition of the bump connectors prior to the connecting process
    • H01L24/13 Structure, shape, material or disposition of the bump connectors prior to the connecting process of an individual bump connector
    • H ELECTRICITY
    • H01 ELECTRIC ELEMENTS
    • H01L SEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
    • H01L25/00 Assemblies consisting of a plurality of individual semiconductor or other solid state devices; Multistep manufacturing processes thereof
    • H01L25/18 Assemblies consisting of a plurality of individual semiconductor or other solid state devices; Multistep manufacturing processes thereof the devices being of types provided for in two or more different subgroups of the same main group of groups H01L27/00 - H01L33/00, or in a single subclass of H10K, H10N
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03L AUTOMATIC CONTROL, STARTING, SYNCHRONISATION OR STABILISATION OF GENERATORS OF ELECTRONIC OSCILLATIONS OR PULSES
    • H03L7/00 Automatic control of frequency or phase; Synchronisation
    • H03L7/06 Automatic control of frequency or phase; Synchronisation using a reference signal applied to a frequency- or phase-locked loop
    • H03L7/08 Details of the phase-locked loop
    • H03L7/0802 Details of the phase-locked loop the loop being adapted for reducing power consumption
    • H ELECTRICITY
    • H01 ELECTRIC ELEMENTS
    • H01L SEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
    • H01L2224/00 Indexing scheme for arrangements for connecting or disconnecting semiconductor or solid-state bodies and methods related thereto as covered by H01L24/00
    • H01L2224/01 Means for bonding being attached to, or being formed on, the surface to be connected, e.g. chip-to-package, die-attach, "first-level" interconnects; Manufacturing methods related thereto
    • H01L2224/10 Bump connectors; Manufacturing methods related thereto
    • H01L2224/12 Structure, shape, material or disposition of the bump connectors prior to the connecting process
    • H01L2224/13 Structure, shape, material or disposition of the bump connectors prior to the connecting process of an individual bump connector
    • H01L2224/13001 Core members of the bump connector
    • H01L2224/13005 Structure
    • H01L2224/13009 Bump connector integrally formed with a via connection of the semiconductor or solid-state body
    • H ELECTRICITY
    • H01 ELECTRIC ELEMENTS
    • H01L SEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
    • H01L2224/00 Indexing scheme for arrangements for connecting or disconnecting semiconductor or solid-state bodies and methods related thereto as covered by H01L24/00
    • H01L2224/01 Means for bonding being attached to, or being formed on, the surface to be connected, e.g. chip-to-package, die-attach, "first-level" interconnects; Manufacturing methods related thereto
    • H01L2224/10 Bump connectors; Manufacturing methods related thereto
    • H01L2224/12 Structure, shape, material or disposition of the bump connectors prior to the connecting process
    • H01L2224/13 Structure, shape, material or disposition of the bump connectors prior to the connecting process of an individual bump connector
    • H01L2224/13001 Core members of the bump connector
    • H01L2224/13099 Material
    • H01L2224/131 Material with a principal constituent of the material being a metal or a metalloid, e.g. boron [B], silicon [Si], germanium [Ge], arsenic [As], antimony [Sb], tellurium [Te] and polonium [Po], and alloys thereof
    • H ELECTRICITY
    • H01 ELECTRIC ELEMENTS
    • H01L SEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
    • H01L2224/00 Indexing scheme for arrangements for connecting or disconnecting semiconductor or solid-state bodies and methods related thereto as covered by H01L24/00
    • H01L2224/01 Means for bonding being attached to, or being formed on, the surface to be connected, e.g. chip-to-package, die-attach, "first-level" interconnects; Manufacturing methods related thereto
    • H01L2224/10 Bump connectors; Manufacturing methods related thereto
    • H01L2224/15 Structure, shape, material or disposition of the bump connectors after the connecting process
    • H01L2224/16 Structure, shape, material or disposition of the bump connectors after the connecting process of an individual bump connector
    • H01L2224/161 Disposition
    • H01L2224/16135 Disposition the bump connector connecting between different semiconductor or solid-state bodies, i.e. chip-to-chip
    • H01L2224/16145 Disposition the bump connector connecting between different semiconductor or solid-state bodies, i.e. chip-to-chip the bodies being stacked
    • H ELECTRICITY
    • H01 ELECTRIC ELEMENTS
    • H01L SEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
    • H01L2224/00 Indexing scheme for arrangements for connecting or disconnecting semiconductor or solid-state bodies and methods related thereto as covered by H01L24/00
    • H01L2224/01 Means for bonding being attached to, or being formed on, the surface to be connected, e.g. chip-to-package, die-attach, "first-level" interconnects; Manufacturing methods related thereto
    • H01L2224/10 Bump connectors; Manufacturing methods related thereto
    • H01L2224/15 Structure, shape, material or disposition of the bump connectors after the connecting process
    • H01L2224/16 Structure, shape, material or disposition of the bump connectors after the connecting process of an individual bump connector
    • H01L2224/161 Disposition
    • H01L2224/16135 Disposition the bump connector connecting between different semiconductor or solid-state bodies, i.e. chip-to-chip
    • H01L2224/16145 Disposition the bump connector connecting between different semiconductor or solid-state bodies, i.e. chip-to-chip the bodies being stacked
    • H01L2224/16146 Disposition the bump connector connecting between different semiconductor or solid-state bodies, i.e. chip-to-chip the bodies being stacked the bump connector connecting to a via connection in the semiconductor or solid-state body
    • H ELECTRICITY
    • H01 ELECTRIC ELEMENTS
    • H01L SEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
    • H01L2224/00 Indexing scheme for arrangements for connecting or disconnecting semiconductor or solid-state bodies and methods related thereto as covered by H01L24/00
    • H01L2224/01 Means for bonding being attached to, or being formed on, the surface to be connected, e.g. chip-to-package, die-attach, "first-level" interconnects; Manufacturing methods related thereto
    • H01L2224/10 Bump connectors; Manufacturing methods related thereto
    • H01L2224/15 Structure, shape, material or disposition of the bump connectors after the connecting process
    • H01L2224/16 Structure, shape, material or disposition of the bump connectors after the connecting process of an individual bump connector
    • H01L2224/161 Disposition
    • H01L2224/16151 Disposition the bump connector connecting between a semiconductor or solid-state body and an item not being a semiconductor or solid-state body, e.g. chip-to-substrate, chip-to-passive
    • H01L2224/16221 Disposition the bump connector connecting between a semiconductor or solid-state body and an item not being a semiconductor or solid-state body, e.g. chip-to-substrate, chip-to-passive the body and the item being stacked
    • H01L2224/16225 Disposition the bump connector connecting between a semiconductor or solid-state body and an item not being a semiconductor or solid-state body, e.g. chip-to-substrate, chip-to-passive the body and the item being stacked the item being non-metallic, e.g. insulating substrate with or without metallisation
    • H01L2224/16227 Disposition the bump connector connecting between a semiconductor or solid-state body and an item not being a semiconductor or solid-state body, e.g. chip-to-substrate, chip-to-passive the body and the item being stacked the item being non-metallic, e.g. insulating substrate with or without metallisation the bump connector connecting to a bond pad of the item
    • H ELECTRICITY
    • H01 ELECTRIC ELEMENTS
    • H01L SEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
    • H01L2224/00 Indexing scheme for arrangements for connecting or disconnecting semiconductor or solid-state bodies and methods related thereto as covered by H01L24/00
    • H01L2224/01 Means for bonding being attached to, or being formed on, the surface to be connected, e.g. chip-to-package, die-attach, "first-level" interconnects; Manufacturing methods related thereto
    • H01L2224/10 Bump connectors; Manufacturing methods related thereto
    • H01L2224/15 Structure, shape, material or disposition of the bump connectors after the connecting process
    • H01L2224/17 Structure, shape, material or disposition of the bump connectors after the connecting process of a plurality of bump connectors
    • H01L2224/1701 Structure
    • H01L2224/1703 Bump connectors having different sizes, e.g. different diameters, heights or widths
    • H ELECTRICITY
    • H01 ELECTRIC ELEMENTS
    • H01L SEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
    • H01L2224/00 Indexing scheme for arrangements for connecting or disconnecting semiconductor or solid-state bodies and methods related thereto as covered by H01L24/00
    • H01L2224/01 Means for bonding being attached to, or being formed on, the surface to be connected, e.g. chip-to-package, die-attach, "first-level" interconnects; Manufacturing methods related thereto
    • H01L2224/10 Bump connectors; Manufacturing methods related thereto
    • H01L2224/15 Structure, shape, material or disposition of the bump connectors after the connecting process
    • H01L2224/17 Structure, shape, material or disposition of the bump connectors after the connecting process of a plurality of bump connectors
    • H01L2224/171 Disposition
    • H01L2224/1718 Disposition being disposed on at least two different sides of the body, e.g. dual array
    • H01L2224/17181 On opposite sides of the body
    • H ELECTRICITY
    • H01 ELECTRIC ELEMENTS
    • H01L SEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
    • H01L2224/00 Indexing scheme for arrangements for connecting or disconnecting semiconductor or solid-state bodies and methods related thereto as covered by H01L24/00
    • H01L2224/01 Means for bonding being attached to, or being formed on, the surface to be connected, e.g. chip-to-package, die-attach, "first-level" interconnects; Manufacturing methods related thereto
    • H01L2224/26 Layer connectors, e.g. plate connectors, solder or adhesive layers; Manufacturing methods related thereto
    • H01L2224/31 Structure, shape, material or disposition of the layer connectors after the connecting process
    • H01L2224/32 Structure, shape, material or disposition of the layer connectors after the connecting process of an individual layer connector
    • H01L2224/3205 Shape
    • H01L2224/32052 Shape in top view
    • H01L2224/32053 Shape in top view being non uniform along the layer connector
    • H ELECTRICITY
    • H01 ELECTRIC ELEMENTS
    • H01L SEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
    • H01L2224/00 Indexing scheme for arrangements for connecting or disconnecting semiconductor or solid-state bodies and methods related thereto as covered by H01L24/00
    • H01L2224/01 Means for bonding being attached to, or being formed on, the surface to be connected, e.g. chip-to-package, die-attach, "first-level" interconnects; Manufacturing methods related thereto
    • H01L2224/26 Layer connectors, e.g. plate connectors, solder or adhesive layers; Manufacturing methods related thereto
    • H01L2224/31 Structure, shape, material or disposition of the layer connectors after the connecting process
    • H01L2224/32 Structure, shape, material or disposition of the layer connectors after the connecting process of an individual layer connector
    • H01L2224/321 Disposition
    • H01L2224/32104 Disposition relative to the bonding area, e.g. bond pad
    • H01L2224/32105 Disposition relative to the bonding area, e.g. bond pad the layer connector connecting bonding areas being not aligned with respect to each other
    • H ELECTRICITY
    • H01 ELECTRIC ELEMENTS
    • H01L SEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
    • H01L2224/00 Indexing scheme for arrangements for connecting or disconnecting semiconductor or solid-state bodies and methods related thereto as covered by H01L24/00
    • H01L2224/01 Means for bonding being attached to, or being formed on, the surface to be connected, e.g. chip-to-package, die-attach, "first-level" interconnects; Manufacturing methods related thereto
    • H01L2224/26 Layer connectors, e.g. plate connectors, solder or adhesive layers; Manufacturing methods related thereto
    • H01L2224/31 Structure, shape, material or disposition of the layer connectors after the connecting process
    • H01L2224/32 Structure, shape, material or disposition of the layer connectors after the connecting process of an individual layer connector
    • H01L2224/321 Disposition
    • H01L2224/32104 Disposition relative to the bonding area, e.g. bond pad
    • H01L2224/32106 Disposition relative to the bonding area, e.g. bond pad the layer connector connecting one bonding area to at least two respective bonding areas
    • H ELECTRICITY
    • H01 ELECTRIC ELEMENTS
    • H01L SEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
    • H01L2224/00 Indexing scheme for arrangements for connecting or disconnecting semiconductor or solid-state bodies and methods related thereto as covered by H01L24/00
    • H01L2224/01 Means for bonding being attached to, or being formed on, the surface to be connected, e.g. chip-to-package, die-attach, "first-level" interconnects; Manufacturing methods related thereto
    • H01L2224/26 Layer connectors, e.g. plate connectors, solder or adhesive layers; Manufacturing methods related thereto
    • H01L2224/31 Structure, shape, material or disposition of the layer connectors after the connecting process
    • H01L2224/32 Structure, shape, material or disposition of the layer connectors after the connecting process of an individual layer connector
    • H01L2224/321 Disposition
    • H01L2224/32135 Disposition the layer connector connecting between different semiconductor or solid-state bodies, i.e. chip-to-chip
    • H01L2224/32145 Disposition the layer connector connecting between different semiconductor or solid-state bodies, i.e. chip-to-chip the bodies being stacked
    • H ELECTRICITY
    • H01 ELECTRIC ELEMENTS
    • H01L SEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
    • H01L2224/00 Indexing scheme for arrangements for connecting or disconnecting semiconductor or solid-state bodies and methods related thereto as covered by H01L24/00
    • H01L2224/01 Means for bonding being attached to, or being formed on, the surface to be connected, e.g. chip-to-package, die-attach, "first-level" interconnects; Manufacturing methods related thereto
    • H01L2224/26 Layer connectors, e.g. plate connectors, solder or adhesive layers; Manufacturing methods related thereto
    • H01L2224/31 Structure, shape, material or disposition of the layer connectors after the connecting process
    • H01L2224/32 Structure, shape, material or disposition of the layer connectors after the connecting process of an individual layer connector
    • H01L2224/321 Disposition
    • H01L2224/32151 Disposition the layer connector connecting between a semiconductor or solid-state body and an item not being a semiconductor or solid-state body, e.g. chip-to-substrate, chip-to-passive
    • H01L2224/32221 Disposition the layer connector connecting between a semiconductor or solid-state body and an item not being a semiconductor or solid-state body, e.g. chip-to-substrate, chip-to-passive the body and the item being stacked
    • H01L2224/32225 Disposition the layer connector connecting between a semiconductor or solid-state body and an item not being a semiconductor or solid-state body, e.g. chip-to-substrate, chip-to-passive the body and the item being stacked the item being non-metallic, e.g. insulating substrate with or without metallisation
    • H ELECTRICITY
    • H01 ELECTRIC ELEMENTS
    • H01L SEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
    • H01L2224/00 Indexing scheme for arrangements for connecting or disconnecting semiconductor or solid-state bodies and methods related thereto as covered by H01L24/00
    • H01L2224/01 Means for bonding being attached to, or being formed on, the surface to be connected, e.g. chip-to-package, die-attach, "first-level" interconnects; Manufacturing methods related thereto
    • H01L2224/26 Layer connectors, e.g. plate connectors, solder or adhesive layers; Manufacturing methods related thereto
    • H01L2224/31 Structure, shape, material or disposition of the layer connectors after the connecting process
    • H01L2224/33 Structure, shape, material or disposition of the layer connectors after the connecting process of a plurality of layer connectors
    • H01L2224/331 Disposition
    • H01L2224/3318 Disposition being disposed on at least two different sides of the body, e.g. dual array
    • H01L2224/33181 On opposite sides of the body
    • H ELECTRICITY
    • H01 ELECTRIC ELEMENTS
    • H01L SEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
    • H01L2224/00 Indexing scheme for arrangements for connecting or disconnecting semiconductor or solid-state bodies and methods related thereto as covered by H01L24/00
    • H01L2224/73 Means for bonding being of different types provided for in two or more of groups H01L2224/10, H01L2224/18, H01L2224/26, H01L2224/34, H01L2224/42, H01L2224/50, H01L2224/63, H01L2224/71
    • H01L2224/732 Location after the connecting process
    • H01L2224/73201 Location after the connecting process on the same surface
    • H01L2224/73203 Bump and layer connectors
    • H01L2224/73204 Bump and layer connectors the bump connector being embedded into the layer connector
    • H ELECTRICITY
    • H01 ELECTRIC ELEMENTS
    • H01L SEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
    • H01L2224/00 Indexing scheme for arrangements for connecting or disconnecting semiconductor or solid-state bodies and methods related thereto as covered by H01L24/00
    • H01L2224/73 Means for bonding being of different types provided for in two or more of groups H01L2224/10, H01L2224/18, H01L2224/26, H01L2224/34, H01L2224/42, H01L2224/50, H01L2224/63, H01L2224/71
    • H01L2224/732 Location after the connecting process
    • H01L2224/73251 Location after the connecting process on different surfaces
    • H01L2224/73253 Bump and layer connectors
    • H ELECTRICITY
    • H01 ELECTRIC ELEMENTS
    • H01L SEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
    • H01L2225/00 Details relating to assemblies covered by the group H01L25/00 but not provided for in its subgroups
    • H01L2225/03 All the devices being of a type provided for in the same subgroup of groups H01L27/00 - H01L33/648 and H10K99/00
    • H01L2225/04 All the devices being of a type provided for in the same subgroup of groups H01L27/00 - H01L33/648 and H10K99/00 the devices not having separate containers
    • H01L2225/065 All the devices being of a type provided for in the same subgroup of groups H01L27/00 - H01L33/648 and H10K99/00 the devices not having separate containers the devices being of a type provided for in group H01L27/00
    • H01L2225/06503 Stacked arrangements of devices
    • H01L2225/06513 Bump or bump-like direct electrical connections between devices, e.g. flip-chip connection, solder bumps
    • H ELECTRICITY
    • H01 ELECTRIC ELEMENTS
    • H01L SEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
    • H01L2225/00 Details relating to assemblies covered by the group H01L25/00 but not provided for in its subgroups
    • H01L2225/03 All the devices being of a type provided for in the same subgroup of groups H01L27/00 - H01L33/648 and H10K99/00
    • H01L2225/04 All the devices being of a type provided for in the same subgroup of groups H01L27/00 - H01L33/648 and H10K99/00 the devices not having separate containers
    • H01L2225/065 All the devices being of a type provided for in the same subgroup of groups H01L27/00 - H01L33/648 and H10K99/00 the devices not having separate containers the devices being of a type provided for in group H01L27/00
    • H01L2225/06503 Stacked arrangements of devices
    • H01L2225/06517 Bump or bump-like direct electrical connections from device to substrate
    • H ELECTRICITY
    • H01 ELECTRIC ELEMENTS
    • H01L SEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
    • H01L2225/00 Details relating to assemblies covered by the group H01L25/00 but not provided for in its subgroups
    • H01L2225/03 All the devices being of a type provided for in the same subgroup of groups H01L27/00 - H01L33/648 and H10K99/00
    • H01L2225/04 All the devices being of a type provided for in the same subgroup of groups H01L27/00 - H01L33/648 and H10K99/00 the devices not having separate containers
    • H01L2225/065 All the devices being of a type provided for in the same subgroup of groups H01L27/00 - H01L33/648 and H10K99/00 the devices not having separate containers the devices being of a type provided for in group H01L27/00
    • H01L2225/06503 Stacked arrangements of devices
    • H01L2225/06541 Conductive via connections through the device, e.g. vertical interconnects, through silicon via [TSV]
    • H ELECTRICITY
    • H01 ELECTRIC ELEMENTS
    • H01L SEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
    • H01L2225/00 Details relating to assemblies covered by the group H01L25/00 but not provided for in its subgroups
    • H01L2225/03 All the devices being of a type provided for in the same subgroup of groups H01L27/00 - H01L33/648 and H10K99/00
    • H01L2225/04 All the devices being of a type provided for in the same subgroup of groups H01L27/00 - H01L33/648 and H10K99/00 the devices not having separate containers
    • H01L2225/065 All the devices being of a type provided for in the same subgroup of groups H01L27/00 - H01L33/648 and H10K99/00 the devices not having separate containers the devices being of a type provided for in group H01L27/00
    • H01L2225/06503 Stacked arrangements of devices
    • H01L2225/06555 Geometry of the stack, e.g. form of the devices, geometry to facilitate stacking
    • H01L2225/06565 Geometry of the stack, e.g. form of the devices, geometry to facilitate stacking the devices having the same size and there being no auxiliary carrier between the devices
    • H ELECTRICITY
    • H01 ELECTRIC ELEMENTS
    • H01L SEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
    • H01L24/00 Arrangements for connecting or disconnecting semiconductor or solid-state bodies; Methods or apparatus related thereto
    • H01L24/01 Means for bonding being attached to, or being formed on, the surface to be connected, e.g. chip-to-package, die-attach, "first-level" interconnects; Manufacturing methods related thereto
    • H01L24/10 Bump connectors; Manufacturing methods related thereto
    • H01L24/15 Structure, shape, material or disposition of the bump connectors after the connecting process
    • H01L24/16 Structure, shape, material or disposition of the bump connectors after the connecting process of an individual bump connector
    • H ELECTRICITY
    • H01 ELECTRIC ELEMENTS
    • H01L SEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
    • H01L24/00 Arrangements for connecting or disconnecting semiconductor or solid-state bodies; Methods or apparatus related thereto
    • H01L24/01 Means for bonding being attached to, or being formed on, the surface to be connected, e.g. chip-to-package, die-attach, "first-level" interconnects; Manufacturing methods related thereto
    • H01L24/10 Bump connectors; Manufacturing methods related thereto
    • H01L24/15 Structure, shape, material or disposition of the bump connectors after the connecting process
    • H01L24/17 Structure, shape, material or disposition of the bump connectors after the connecting process of a plurality of bump connectors
    • H ELECTRICITY
    • H01 ELECTRIC ELEMENTS
    • H01L SEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
    • H01L24/00 Arrangements for connecting or disconnecting semiconductor or solid-state bodies; Methods or apparatus related thereto
    • H01L24/01 Means for bonding being attached to, or being formed on, the surface to be connected, e.g. chip-to-package, die-attach, "first-level" interconnects; Manufacturing methods related thereto
    • H01L24/26 Layer connectors, e.g. plate connectors, solder or adhesive layers; Manufacturing methods related thereto
    • H01L24/31 Structure, shape, material or disposition of the layer connectors after the connecting process
    • H01L24/32 Structure, shape, material or disposition of the layer connectors after the connecting process of an individual layer connector
    • H ELECTRICITY
    • H01 ELECTRIC ELEMENTS
    • H01L SEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
    • H01L24/00 Arrangements for connecting or disconnecting semiconductor or solid-state bodies; Methods or apparatus related thereto
    • H01L24/01 Means for bonding being attached to, or being formed on, the surface to be connected, e.g. chip-to-package, die-attach, "first-level" interconnects; Manufacturing methods related thereto
    • H01L24/26 Layer connectors, e.g. plate connectors, solder or adhesive layers; Manufacturing methods related thereto
    • H01L24/31 Structure, shape, material or disposition of the layer connectors after the connecting process
    • H01L24/33 Structure, shape, material or disposition of the layer connectors after the connecting process of a plurality of layer connectors
    • H ELECTRICITY
    • H01 ELECTRIC ELEMENTS
    • H01L SEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
    • H01L25/00 Assemblies consisting of a plurality of individual semiconductor or other solid state devices; Multistep manufacturing processes thereof
    • H01L25/03 Assemblies consisting of a plurality of individual semiconductor or other solid state devices; Multistep manufacturing processes thereof all the devices being of a type provided for in the same subgroup of groups H01L27/00 - H01L33/00, or in a single subclass of H10K, H10N, e.g. assemblies of rectifier diodes
    • H01L25/04 Assemblies consisting of a plurality of individual semiconductor or other solid state devices; Multistep manufacturing processes thereof all the devices being of a type provided for in the same subgroup of groups H01L27/00 - H01L33/00, or in a single subclass of H10K, H10N, e.g. assemblies of rectifier diodes the devices not having separate containers
    • H01L25/065 Assemblies consisting of a plurality of individual semiconductor or other solid state devices; Multistep manufacturing processes thereof all the devices being of a type provided for in the same subgroup of groups H01L27/00 - H01L33/00, or in a single subclass of H10K, H10N, e.g. assemblies of rectifier diodes the devices not having separate containers the devices being of a type provided for in group H01L27/00
    • H01L25/0655 Assemblies consisting of a plurality of individual semiconductor or other solid state devices; Multistep manufacturing processes thereof all the devices being of a type provided for in the same subgroup of groups H01L27/00 - H01L33/00, or in a single subclass of H10K, H10N, e.g. assemblies of rectifier diodes the devices not having separate containers the devices being of a type provided for in group H01L27/00 the devices being arranged next to each other
    • H ELECTRICITY
    • H01 ELECTRIC ELEMENTS
    • H01L SEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
    • H01L25/00 Assemblies consisting of a plurality of individual semiconductor or other solid state devices; Multistep manufacturing processes thereof
    • H01L25/03 Assemblies consisting of a plurality of individual semiconductor or other solid state devices; Multistep manufacturing processes thereof all the devices being of a type provided for in the same subgroup of groups H01L27/00 - H01L33/00, or in a single subclass of H10K, H10N, e.g. assemblies of rectifier diodes
    • H01L25/04 Assemblies consisting of a plurality of individual semiconductor or other solid state devices; Multistep manufacturing processes thereof all the devices being of a type provided for in the same subgroup of groups H01L27/00 - H01L33/00, or in a single subclass of H10K, H10N, e.g. assemblies of rectifier diodes the devices not having separate containers
    • H01L25/065 Assemblies consisting of a plurality of individual semiconductor or other solid state devices; Multistep manufacturing processes thereof all the devices being of a type provided for in the same subgroup of groups H01L27/00 - H01L33/00, or in a single subclass of H10K, H10N, e.g. assemblies of rectifier diodes the devices not having separate containers the devices being of a type provided for in group H01L27/00
    • H01L25/0657 Stacked arrangements of devices
    • HELECTRICITY
    • H01ELECTRIC ELEMENTS
    • H01LSEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
    • H01L2924/00Indexing scheme for arrangements or methods for connecting or disconnecting semiconductor or solid-state bodies as covered by H01L24/00
    • H01L2924/15Details of package parts other than the semiconductor or other solid state devices to be connected
    • H01L2924/151Die mounting substrate
    • H01L2924/153Connection portion
    • H01L2924/1531Connection portion the connection portion being formed only on the surface of the substrate opposite to the die mounting surface
    • H01L2924/15311Connection portion the connection portion being formed only on the surface of the substrate opposite to the die mounting surface being a ball array, e.g. BGA
    • HELECTRICITY
    • H01ELECTRIC ELEMENTS
    • H01LSEMICONDUCTOR DEVICES NOT COVERED BY CLASS H10
    • H01L2924/00Indexing scheme for arrangements or methods for connecting or disconnecting semiconductor or solid-state bodies as covered by H01L24/00
    • H01L2924/15Details of package parts other than the semiconductor or other solid state devices to be connected
    • H01L2924/161Cap
    • H01L2924/1615Shape
    • H01L2924/16152Cap comprising a cavity for hosting the device, e.g. U-shaped cap
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • This application relates to the field of circuits, and in particular to a processor clock system, a sub-node circuit in the clock system, and an electronic device.
  • the clock period is also called the oscillation period, which is defined as the reciprocal of the clock frequency.
  • the clock cycle is the most basic and smallest unit of time in a computer. In one clock cycle, the processor only completes one of the most basic actions. Clock frequency is an important indicator to measure processor performance. In the design of the processor, if the operations completed in the same clock cycle are equivalent, then the shorter the clock cycle, the higher the clock frequency, and the better the performance of the processor.
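The reciprocal relationship between clock period and clock frequency can be illustrated with a short sketch. The function names and the nanosecond unit are assumptions chosen for illustration; they do not come from this application.

```python
def clock_period_ns(frequency_hz: float) -> float:
    """Return the clock period in nanoseconds for a clock frequency in hertz."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    # Period is the reciprocal of frequency; 1e9 converts seconds to nanoseconds.
    return 1e9 / frequency_hz


def clock_frequency_hz(period_ns: float) -> float:
    """Return the clock frequency in hertz for a clock period in nanoseconds."""
    if period_ns <= 0:
        raise ValueError("period must be positive")
    return 1e9 / period_ns
```

Under these definitions a shorter period always maps to a higher frequency, which is the sense in which the text links cycle time to processor performance.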
  • the clock circuit inside the processor usually consists of a phase-locked loop (PLL) and a clock tree or clock network, which is connected to the clock pins of different registers.
  • the present application provides a processor clock system, a sub-node circuit and electronic device in the clock system, which can meet the requirements of application scenarios for clock frequency conversion of the clock circuit.
  • a processor clock system including a phase-locked loop PLL, a clock tree, and a clock grid.
  • the phase-locked loop PLL is used to output the first clock signal clk_1; the clock tree is used to receive the first clock signal clk_1 and output the second clock signal clk_2;
  • the clock grid includes a plurality of nodes, some of the nodes are provided with sub-node circuits, and the sub-node circuits are connected to the clock tree and used to generate a third clock signal clk_3 according to the second clock signal clk_2, wherein the clock grid and the clock tree have a three-dimensional structure.
  • the sub-node circuit includes a resonance and decoupling unit and a gated clock unit; the gated clock unit is used to receive the gating signal G and the second clock signal clk_2, and to output the third clock signal clk_3 to the clock grid and the resonance and decoupling unit.
  • the resonance and decoupling unit includes a decoupling capacitor and a plurality of resonant inductors, and supports the generation of multiple oscillation frequencies at different frequency points.
  • the sub-node circuit performs processing by receiving the second clock signal clk_2 output by the clock tree, and outputs the third clock signal clk_3 to realize the gating function.
  • the sub-node circuit can also support a resonant circuit that generates an oscillation frequency to absorb the peak current in the clock circuit and reduce power consumption. Further, the sub-node circuit can also support the generation of multiple oscillation frequencies of different frequency points to meet the higher requirements of the chip for clock frequency conversion. Therefore, the sub-node circuit can have both gating function and resonance function, and generate oscillation frequencies of different frequency points, thereby improving the performance of the clock circuit.
  • the clock tree is composed of multiple clock buffers in a stacked structure.
  • the processor clock system may also be referred to as a clock system or a clock circuit.
  • the sub-node circuit may be referred to as a gating&resonant&decoupling (GRD) circuit.
  • the gating clock unit is configured to receive the second clock signal clk_2 and the gating signal G, and to output the third clock signal clk_3 to the clock grid and the resonance and decoupling unit.
  • the gating clock unit generates the third clock signal clk_3 according to the second clock signal clk_2 and the gating signal G, thereby realizing the gating function and improving the performance of the clock circuit.
  • the gate control signal G is configured by the system, which can mean that the gate control signal G can come from the control logic circuit inside the chip, or can be generated by the upper layer software through the control logic circuit.
  • the resonance and decoupling unit includes: multiple parallel resonant inductance circuits, multiple switching circuits, and one or more decoupling capacitors, wherein the multiple resonant inductance circuits correspond to the multiple switching circuits one-to-one, and each resonant inductance circuit is provided with a resonant inductor.
  • the sub-node circuit also includes decoupling capacitors to achieve the noise reduction function of power supply noise. Therefore, the sub-node circuit is a circuit that can realize the integrated design of the gating function, the resonance function and the noise reduction function. By setting the sub-node circuit in the clock grid, the performance of the clock circuit can be improved.
  • the multiple resonant inductance circuits correspond to the multiple switching circuits one-to-one, so that turning different switching circuits on or off generates oscillation frequencies at different frequency points, thereby meeting the clock frequency conversion requirements of the application scenario and improving the performance of the clock circuit.
  • a first switch circuit of the plurality of switch circuits is used to receive a switch control signal S, and the switch control signal S is used to control the first switch circuit to be in the on state or the off state, where the first switch circuit is any one of the plurality of switch circuits.
  • the multiple resonant inductor circuits correspond to multiple clock domains.
  • each clock domain includes an independent clock grid.
  • each clock domain corresponds to a synchronization time, and the devices in the domain are synchronized to this time; the synchronization times corresponding to different clock domains are independent of each other.
  • the multiple resonant inductor circuits can correspond one-to-one to multiple clock domains, that is, the resonant frequency generated by each resonant inductor circuit corresponds to one clock domain, so that the clock circuit can be applied to a chip that includes multiple clock domains, which broadens the application range and improves the use efficiency of the clock circuit.
  • one end of the plurality of decoupling capacitors is used to receive the third clock signal clk_3, and the other end of the decoupling capacitor is used to connect to power or ground.
  • the multiple sub-node circuits are further connected to multiple registers through the clock grid, and the distribution positions of the multiple sub-node circuits in the clock grid are determined according to the loads and positions of the multiple registers.
  • the upper-level driving and control of the clock grid comes from the sub-node circuit.
  • the sub-node circuit is usually not set at every node, so the layout of the sub-node circuits in the clock grid needs to be optimized through a suitable algorithm. Determining the distribution positions of the multiple sub-node circuits in the clock grid from the loads and positions of the multiple registers can improve the driving efficiency of the clock circuit.
  • the distribution positions of the multiple sub-node circuits in the clock grid are determined according to a clustering algorithm or a linear regression algorithm, wherein the input of the clustering algorithm or the linear regression algorithm includes the positions of the multiple registers in the clock grid, and the output of the clustering algorithm or the linear regression algorithm includes the distribution positions of the multiple sub-node circuits in the clock grid.
  • the clustering algorithm or linear regression algorithm is used to determine the distribution positions of the multiple sub-node circuits in the clock grid, so that as few sub-node circuits as possible are used and their positions are reasonably distributed, which can improve the driving efficiency of the clock circuit.
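The application names a clustering algorithm without fixing a particular one. As a minimal sketch, the placement step could be approximated with a load-weighted k-means, where register positions and loads are the inputs and the cluster centers are candidate sub-node positions. The choice of k-means, the function name `place_sub_nodes`, and all parameters are assumptions made for illustration, not the method of this application.

```python
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def place_sub_nodes(
    registers: Sequence[Point],
    loads: Sequence[float],
    num_sub_nodes: int,
    iterations: int = 50,
    seed: int = 0,
) -> List[Point]:
    """Load-weighted k-means (an assumed stand-in for the clustering algorithm).

    Returns one (x, y) candidate position per sub-node circuit.
    """
    if len(loads) != len(registers):
        raise ValueError("registers and loads must have the same length")
    if not 0 < num_sub_nodes <= len(registers):
        raise ValueError("need 1 <= num_sub_nodes <= number of registers")

    rng = random.Random(seed)
    centers = rng.sample(list(registers), num_sub_nodes)

    for _ in range(iterations):
        # Assignment step: each register joins its nearest sub-node candidate.
        clusters: List[List[int]] = [[] for _ in centers]
        for idx, (x, y) in enumerate(registers):
            nearest = min(
                range(len(centers)),
                key=lambda c: (x - centers[c][0]) ** 2 + (y - centers[c][1]) ** 2,
            )
            clusters[nearest].append(idx)

        # Update step: move each candidate to the load-weighted centroid.
        new_centers: List[Point] = []
        for c, members in enumerate(clusters):
            total = sum(loads[i] for i in members)
            if not members or total == 0:
                new_centers.append(centers[c])  # keep an empty cluster in place
                continue
            cx = sum(registers[i][0] * loads[i] for i in members) / total
            cy = sum(registers[i][1] * loads[i] for i in members) / total
            new_centers.append((cx, cy))

        if new_centers == centers:
            break  # converged
        centers = new_centers

    return centers
```

In a real flow the returned coordinates would still have to be snapped to actual grid nodes and checked by simulation; the sketch only shows how register positions and loads can drive the placement.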
  • a sub-node circuit in a clock system, wherein the sub-node circuit is arranged in a clock grid of the clock system, and the sub-node circuit includes a gated clock unit and a resonance and decoupling unit:
  • the gating clock unit is used to receive the gating signal G and the second clock signal clk_2 from the clock tree, and output the third clock signal clk_3 to the clock grid and the resonance and decoupling unit;
  • the resonance and decoupling unit is used to receive the third clock signal clk_3, includes a decoupling capacitor and a plurality of resonant inductors, and supports the generation of multiple oscillation frequencies at different frequency points.
  • the sub-node circuit performs processing by receiving the second clock signal clk_2 output by the clock tree, and outputs the third clock signal clk_3 to realize the gating function.
  • the sub-node circuit can also support a resonant circuit that generates an oscillation frequency to absorb the peak current in the clock circuit and reduce power consumption.
  • the sub-node circuit can also support the generation of multiple oscillation frequencies at different frequency points to meet the chip's higher requirements for clock frequency conversion. Therefore, the sub-node circuit can have both the gating function and the resonance function and can generate oscillation frequencies at different frequency points, thereby improving the performance of the clock circuit.
  • the gating clock unit is configured to receive the second clock signal clk_2 and the gating signal G, and to output the third clock signal clk_3 to the clock grid and the resonance and decoupling unit.
  • the resonance and decoupling unit includes: multiple resonant inductance circuits connected in parallel, multiple switching circuits, and one or more decoupling capacitors, wherein the multiple resonant inductance circuits correspond to the multiple switching circuits one-to-one, and each resonant inductance circuit is provided with a resonant inductor.
  • the first switch circuit of the plurality of switch circuits is further used to receive a switch control signal S, and the switch control signal S is used to control the first switch circuit to be in the on state or the off state, where the first switch circuit is any one of the plurality of switch circuits.
  • the multiple resonant inductor circuits have a one-to-one correspondence with multiple clock domains.
  • one end of the decoupling capacitor is used to receive the third clock signal clk_3, and the other end of the decoupling capacitor is used to connect to power or ground.
  • an electronic device including: a first chip, where a plurality of micro solder balls are provided on a lower surface of the first chip, and the first chip is electrically connected to a second chip through the plurality of micro solder balls; the second chip, arranged under the first chip, where a plurality of through silicon vias penetrating the second chip are provided in the second chip, a plurality of solder balls are provided on the lower surface of the second chip, and the plurality of through silicon vias are connected to the plurality of solder balls; and a carrier board, arranged under the second chip, where the second chip is electrically connected to the carrier board through the plurality of through silicon vias and the plurality of solder balls;
  • wherein a clock circuit is provided in the electronic device, and the clock circuit includes: a phase-locked loop PLL for outputting the first clock signal clk_1; a clock tree, used to receive the first clock signal clk_1 and output a second clock signal clk_2; and a clock grid, where the clock grid is connected to a plurality of registers and is used to receive the second clock signal clk_2 and output a third clock signal clk_3 to the plurality of registers; the PLL and the clock tree are arranged in the first chip, and the clock grid is arranged in the second chip.
  • a three-dimensional clock circuit architecture can be provided, that is, the clock circuit can be distributed in two or more chips: the PLL and the clock tree are provided in the first chip, and the clock grid is arranged in the second chip, so that the clock signal can drive the circuits in the chips more flexibly and quickly, and the clock performance can be improved.
  • the clock grid includes multiple nodes, and some of the multiple nodes are provided with the sub-node circuit according to the second aspect or any one of the implementations of the second aspect.
  • the clock tree includes multi-level clock buffers, and the last-level clock buffer in the multi-level clock buffers is connected to the multiple sub-node circuits in the clock grid through the plurality of micro solder balls.
  • an electronic device including: a first chip; a second chip, arranged under the first chip and electrically connected to the first chip, where a plurality of first through silicon vias are provided in the second chip, a plurality of micro solder balls are provided on the lower surface of the second chip, and the second chip is electrically connected to a third chip through the plurality of first through silicon vias and the plurality of micro solder balls; the third chip, arranged under the second chip, where a plurality of second through silicon vias penetrating the third chip are provided in the third chip, a plurality of solder balls are provided on the lower surface of the third chip, and the plurality of second through silicon vias are connected to the plurality of solder balls; and a carrier board, arranged under the third chip, where the third chip is electrically connected to the carrier board through the plurality of second through silicon vias and the plurality of solder balls;
  • wherein a clock circuit is provided in the electronic device, and the clock circuit includes: a phase-locked loop PLL for outputting a first clock signal clk_1; a clock tree for receiving the first clock signal clk_1 and outputting a second clock signal clk_2; and a clock grid, where the clock grid is connected to a plurality of registers and is used for receiving the second clock signal clk_2 and outputting a third clock signal clk_3 to the plurality of registers; the PLL and a part of the clock tree are provided in the first chip, another part of the clock tree is provided in the second chip, and the clock grid is arranged in the third chip.
  • a three-dimensional clock circuit structure can be provided, that is, the clock circuit can be distributed in two or more chips: the PLL and a part of the clock tree are provided in the first chip, another part of the clock tree is arranged in the second chip, and the clock grid is arranged in the third chip, so that the clock signal can drive the circuits in the chips more flexibly and quickly, and the clock performance can be improved.
  • the clock grid includes multiple nodes, and some of the multiple nodes are provided with the sub-node circuit according to the second aspect or any one of the implementations of the second aspect.
  • the clock tree includes multi-level clock buffers, and the last-level clock buffer in the multi-level clock buffers is connected to the multiple sub-node circuits in the clock grid through the plurality of first through silicon vias and the plurality of micro solder balls.
  • a chip system is provided, where the chip system is provided with the clock circuit according to the first aspect or any one of the implementations of the first aspect, or is provided with the sub-node circuit for a clock circuit according to the second aspect or any one of the implementations of the second aspect.
  • Fig. 1 is a schematic diagram of a clock circuit according to an embodiment of the present application.
  • Fig. 2 is a schematic diagram of a clock circuit according to another embodiment of the present application.
  • Fig. 3 is a schematic diagram of a clock circuit according to an embodiment of the present application.
  • FIG. 4 is a schematic diagram of the principle of a sub-node circuit 400 according to an embodiment of the present application.
  • FIG. 5 is a timing diagram of a gated clock control signal according to an embodiment of the present application.
  • FIG. 6 is a schematic diagram of the layout of the sub-node circuit 400 under load balancing according to an embodiment of the present application.
  • FIG. 7 is a schematic diagram of the layout of the sub-node circuit 400 in the case of non-load balancing according to an embodiment of the present application.
  • FIG. 8 is a schematic diagram of the layout of a sub-node circuit 400 according to an embodiment of the present application.
  • FIG. 9 is a schematic flowchart of a method for designing and simulating the layout of a sub-node circuit 400 according to an embodiment of the present application.
  • FIG. 10 is a schematic diagram of a method of extracting a clock circuit model and simulation according to an embodiment of the present application.
  • FIG. 11 is a topological design diagram of an electronic device 1100 according to an embodiment of the present application.
  • FIG. 12 is a topological design diagram of an electronic device 1200 according to another embodiment of the present application.
  • FIG. 13 is a topological design diagram of an electronic device 1300 according to another embodiment of the present application.
  • FIG. 14 is a schematic diagram of the interconnection between the chips of the electronic device 1300 in FIG. 13.
  • Clock skew refers to the time offset of a clock source arriving at the clock terminals of two different registers.
  • the calculation formula of clock skew can be expressed as:
  • $T_{skew} = T_{clk2} - T_{clk1}$ (1)
  • where $T_{skew}$ represents the clock skew, and $T_{clk1}$ and $T_{clk2}$ respectively represent the times when the clock source reaches the clock terminals of two different registers.
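Formula (1) can be applied directly to a set of clock arrival times. The sketch below computes the skew for one register pair and searches for the pair with the largest absolute skew; the register names, the picosecond values, and the helper `worst_case_skew` are hypothetical and only illustrate the definition.

```python
from itertools import combinations
from typing import Dict, Tuple


def clock_skew(t_clk1: float, t_clk2: float) -> float:
    """Formula (1): skew is the arrival time at register 2 minus that at register 1."""
    return t_clk2 - t_clk1


def worst_case_skew(arrival_times: Dict[str, float]) -> Tuple[str, str, float]:
    """Return the register pair with the largest absolute skew and its signed skew."""
    if len(arrival_times) < 2:
        raise ValueError("need at least two registers")
    first, second = max(
        combinations(arrival_times, 2),
        key=lambda pair: abs(clock_skew(arrival_times[pair[0]], arrival_times[pair[1]])),
    )
    return first, second, clock_skew(arrival_times[first], arrival_times[second])


# Hypothetical arrival times in picoseconds.
arrivals_ps = {"r0": 100.0, "r1": 130.0, "r2": 90.0}
print(worst_case_skew(arrivals_ps))
```

The sign of the result shows which register sees the clock edge first; a negative value means the second register of the pair is clocked earlier.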
  • Clock jitter: the deviation of the actual clock from the ideal clock; it does not accumulate over time, and the actual clock sometimes leads and sometimes lags the ideal clock.
  • Differential circuit: its input terminals receive two input signals, and the difference between the two input signals is the effective input signal of the differential circuit.
  • the output of the differential circuit is an amplification of the difference between the two input signals. If an interference signal is present, it affects both input signals equally, so it cancels in the difference and its effective input is zero, thereby rejecting common-mode interference.
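The cancellation of common-mode interference can be checked numerically. The following sketch assumes an ideal differential stage with a fixed gain and adds the same interference sample to both inputs; the sample values and the function name are illustrative assumptions.

```python
from typing import List, Sequence


def differential_output(
    v_plus: Sequence[float], v_minus: Sequence[float], gain: float = 1.0
) -> List[float]:
    """Ideal differential stage: amplify only the difference of the two inputs."""
    if len(v_plus) != len(v_minus):
        raise ValueError("inputs must have the same length")
    return [gain * (p - m) for p, m in zip(v_plus, v_minus)]


# Hypothetical wanted signal and interference that couples equally onto both wires.
signal = [0.0, 0.5, 1.0, 0.5]
interference = [0.2, -0.1, 0.3, 0.05]

v_plus = [s / 2 + n for s, n in zip(signal, interference)]
v_minus = [-s / 2 + n for s, n in zip(signal, interference)]

# The interference term appears in both inputs and drops out of the difference.
recovered = differential_output(v_plus, v_minus)
```

Up to floating-point rounding, `recovered` equals `signal`, because the interference contributes nothing to the difference.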
  • Phase-locked loop (PLL): also called a phase lock loop, which is used to integrate clock signals uniformly so that high-frequency devices work normally.
  • PLL is usually an analog circuit placed inside the chip.
  • the clock signal at its input comes from a clock generating circuit outside the chip, usually a crystal oscillator.
  • the clock signal at the output of the PLL is the clock source of the clock network.
  • the output clock source of the PLL propagates through the clock tree to the clock pins of each sequential circuit (represented as a register in the figure) inside the chip, as the input of each component of the chip.
  • Clock network: the topological structure of the clock circuit inside the chip, usually implemented as a clock tree or a clock grid.
  • Clock grid: a method of implementing a clock network in a two-dimensional mesh (2D mesh) structure.
  • FIG. 1 is a schematic diagram of a clock circuit 100 according to an embodiment of the present application.
  • the clock circuit 100 includes a PLL 110 and a clock tree 120.
  • the PLL 110 is used to output clock signals.
  • the clock tree 120 is used to amplify clock signals to drive subsequent circuits. For example, the register 150 in the subsequent circuit is driven.
  • the clock tree 120 generally includes a multi-level clock buffer 121.
  • Each level of clock buffer includes one or more clock buffers 121.
  • the clock buffer 121 connected to the PLL 110 may be referred to as the clock master buffer 122.
  • the last-stage clock buffer 121 can be used to connect to the register 150.
  • the clock skew is tens of ps (picoseconds) on average.
  • FIG. 2 is a schematic structural diagram of a clock circuit 200 according to another embodiment of the present application.
  • Figure 2 is an improved clock circuit based on Figure 1.
  • the clock circuit 200 sets a clock grid 230 between the clock tree 220 and the register 250.
  • the clock grid 230 is a common type of clock network, and may refer to the topology of a gridded clock network.
  • the clock grid 230 is formed by the connections between the clock tree and the sequential circuits inside the chip (represented as registers 250 in the figure); each sequential circuit in the chip needs to be driven by a clock signal to update its data.
  • the clock network is used to enable clock signals to be transmitted to various sequential devices.
  • this clock grid design can achieve smaller clock skew and clock jitter, so as to increase the clock frequency.
  • the clock topology in Figure 2 is a global grid clock, but the global grid clock also has some shortcomings.
  • global gridding has high requirements for chip metal traces. Metal wire specifications (for example, width, area, or through holes) and the number of layers need to be customized, and the versatility is insufficient.
  • the grid circuit has large resistance-capacitance (RC) parameters and high power consumption.
  • an improved scheme is to introduce a resonant inductor in the clock network.
  • the advantage of this method is to reduce power consumption.
  • the clock network with the resonant inductor can be called a resonant clock network. The resonant clock network uses the on-chip inductance to create an "electric pendulum", thereby forming a resonant circuit (also called an "oscillation loop"). The resonant circuit generates an oscillation frequency and thereby reuses the energy in the clock circuit instead of wasting it in each clock cycle. Therefore, there is no need to use a large number of clock buffers in the clock circuit.
  • the resonant circuit itself is the source of clock generation, so there is no need to use a large number of clock buffers like traditional clock circuits.
  • the resonant circuit needs to excite energy exchange in the initial stage, and when the energy exchange slows down due to the loss of the resonant circuit, it must be excited again.
  • the power required for these excitations is far less than the driving power of the clock buffer of the existing clock network.
  • the problem with this solution is that the inductance in the clock circuit is fixed, so the oscillation frequency cannot change. The solution therefore cannot meet the requirements of a wide-frequency clock and cannot be applied to processors with demanding frequency conversion requirements, for example processors that use a turbo mode or a fast frequency reduction mode.
  • the traditional inductance design alone is not enough to further reduce power consumption.
  • the embodiment of the present application proposes a clock circuit in which a sub-node circuit is provided, and the sub-node circuit can simultaneously realize the resonance function, the decoupling function, and the clock control function. And it can also generate oscillation frequencies of different frequency points, which is suitable for processors with high requirements for frequency conversion and improves the performance of the clock circuit.
  • FIG. 3 is a schematic diagram of a clock circuit 300 according to an embodiment of the present application.
  • the clock circuit 300 includes: a PLL 310, a clock tree 320, and a clock grid 330.
  • a plurality of sub-node circuits 400 are provided in the clock grid 330.
  • the sub-node circuit 400 may be referred to as a gating & resonant & decoupling (GRD) circuit.
  • the PLL 310 is used to output the first clock signal clk_1 to the clock tree 320.
  • the clock tree 320 is connected to the PLL 310, and is used to receive the first clock signal clk_1 and output the second clock signal clk_2.
  • the clock tree 320 includes multiple stages of clock buffers 321, and each stage of the clock buffer 321 includes one or more clock buffers 321. Generally speaking, the higher the number of stages, the greater the number of clock buffers 321.
  • the clock buffer 321 connected to the PLL 310 may be referred to as the clock master buffer 322.
  • the clock buffer 321 in the clock tree 320 can be understood as a node in the clock tree 320, which functions like a clock repeater and is used to drive a larger load.
  • the clock grid 330 includes multiple nodes, some of the multiple nodes are respectively provided with sub-node circuits 400, the multiple sub-node circuits 400 are connected to the clock tree 320, and the multiple sub-node circuits 400 are also connected to a plurality of registers 350 through the clock grid 330.
  • the clock grid 330 usually adopts metal connections of different layers and via connections of upper and lower layers.
  • the multiple sub-node circuits 400 are configured to receive the second clock signal clk_2, and output a third clock signal clk_3 to the clock grid 330, so as to realize the gating function of the clock circuit.
  • the foregoing multiple sub-node circuits 400 output the third clock signal clk_3 to the clock grid 330, which can be understood as the multiple sub-node circuits 400 used to drive and control the multiple registers 350 in the clock grid 330.
  • the sub-node circuit 400 can also support a resonant circuit that generates an oscillation frequency to absorb the peak current in the clock circuit and reduce power consumption.
  • the sub-node circuit 400 can support the generation of multiple oscillation frequencies of different frequency points to meet the higher requirements of the chip for clock frequency conversion.
  • the sub-node circuit 400 also includes a decoupling capacitor Cd to achieve a noise reduction function for power supply noise. Therefore, the sub-node circuit 400 is a circuit that can realize the integrated design of the gating function, the resonance function and the noise reduction function. By providing the sub-node circuit 400 in the clock grid 330, the performance of the clock circuit can be improved.
  • the circuit structure and working principle of the sub-node circuit 400 are described below with reference to FIG. 4.
  • the gating function may refer to the function of turning off or turning on a partial circuit in the clock circuit by using a gating signal.
  • the above-mentioned partial circuit may refer to the circuit part corresponding to the sub-node circuit 400, and may be divided and laid out according to practical needs.
  • the noise reduction function, also referred to as the noise suppression function, reduces power supply noise in the clock circuit.
  • the resonance function can refer to the use of a resonance circuit to absorb the peak current in the clock circuit to achieve the purpose of reducing power consumption.
  • FIG. 4 is a schematic diagram of the principle of a sub-node circuit 400 according to an embodiment of the present application. As shown in FIG. 4, the sub-node circuit 400 includes a gated clock unit 410 and a resonance and decoupling unit 420.
  • the gating clock unit 410 receives the gating signal G and the second clock signal clk_2 output by the clock tree 320, and outputs the third clock signal clk_3 according to the second clock signal clk_2 and the gating signal G, so as to realize the gating function.
  • the aforementioned gating signal G may be configured by the system.
  • the gate control signal G is configured by the system, which can mean that the gate control signal G can come from the control logic circuit inside the chip, or can be generated by the upper layer software through the control logic circuit. It should be understood that other signals configured by the system in the embodiments of the present application may also be interpreted using the above definitions.
  • the gated clock unit 410 is also used to output the third clock signal clk_3 to the resonance and decoupling unit 420, so that the resonance and decoupling unit 420 realizes the resonance function and the noise suppression function for the third clock signal clk_3.
  • the gated clock unit 410 includes a latch 411 and an AND gate 412.
  • the latch 411 is used to receive the gating signal G and the second clock signal clk_2 received from the clock tree, and output the control signal G_L.
  • the AND gate 412 is used to receive the second clock signal clk_2 and the control signal G_L, and output the third clock signal clk_3.
  • the third clock signal clk_3 is used to drive the clock grid 330.
  • the gated clock unit 410 in FIG. 4 is only an example. The gated clock unit 410 can also be implemented with other types of gating devices and connections, as long as it realizes the gating function of the clock circuit.
  • the embodiment of the present application does not limit this.
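The behavior of the latch 411 and the AND gate 412 can be illustrated with a small sample-by-sample simulation. This is a minimal sketch, not the circuit of FIG. 4 itself: it assumes that the latch is transparent while clk_2 is low and holds its value while clk_2 is high (a common arrangement for latch-based clock gating), and the function name `gated_clock` and the sample waveforms are hypothetical.

```python
def gated_clock(clk_2, g):
    """Return (g_l, clk_3) sample lists for a latch-plus-AND clock gate."""
    g_l_state = 0              # value held by the latch (assumed to start at 0)
    g_l, clk_3 = [], []
    for clk, gate in zip(clk_2, g):
        if clk == 0:           # assumption: latch 411 is transparent while clk_2 is low
            g_l_state = gate
        g_l.append(g_l_state)
        clk_3.append(clk & g_l_state)   # AND gate 412: clk_3 = clk_2 AND G_L
    return g_l, clk_3


# Hypothetical waveforms: G changes both while clk_2 is low and while it is high.
clk_2 = [0, 1, 0, 1, 0, 1, 0, 1]
g = [0, 0, 1, 0, 1, 1, 0, 1]
g_l, clk_3 = gated_clock(clk_2, g)
print(g_l)
print(clk_3)
```

Because G_L can only change while clk_2 is low in this sketch, a change of G during the high phase does not truncate or create a clk_3 pulse, which is the usual reason for placing a latch in front of the AND gate.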
  • the resonance and decoupling unit 420 supports the generation of multiple oscillating frequencies at different frequency points to realize the resonance function.
  • the resonance and decoupling unit 420 also includes a decoupling capacitor Cd to realize the noise reduction function of the power supply noise.
  • the resonance and decoupling unit 420 includes a plurality of parallel resonance inductance circuits 421, a plurality of switching circuits 422, and one or more decoupling capacitors Cd.
  • the multiple resonant inductor circuits 421 correspond to the multiple switch circuits 422 one-to-one.
  • a plurality of resonance inductances Lr are respectively provided on the plurality of resonance inductance circuits 421.
  • the inductance values of any two resonant inductors in the plurality of resonant inductors are different.
  • a buffer 430 may be provided between the gated clock unit 410 and the resonance and decoupling unit 420 to increase the driving capability of the signal.
  • the resonance and decoupling unit 420 is also used to receive a plurality of switch control signals S.
  • the multiple switch control signals S correspond to the multiple switch circuits 422 one-to-one.
  • the switch control signal S is used to control the corresponding switch circuit 422 to be in the on state or the off state.
  • the switch control signal S can be configured by the system.
  • the first switch circuit 422 of the plurality of switch circuits 422 is used to receive the switch control signal S, and the switch control signal S is used to control the first switch circuit 422 to be in the on state or the off state; the first switch circuit 422 is any one of the plurality of switch circuits 422.
  • the resonance circuit refers to a circuit that generates an oscillation frequency based on the resonance inductance Lr and the decoupling capacitor Cd.
  • the combination of different resonant inductance Lr and decoupling capacitor Cd can produce different oscillation frequencies.
  • different resonant inductance circuits 421 correspond to different oscillation frequencies.
  • the first switch circuit 422 can be controlled to be turned on by the switch control signal S, so that the resonant inductance circuit 421 corresponding to the first switch circuit 422 is connected to the resonant circuit to generate an oscillation frequency.
  • the first switch circuit 422 is any one of the plurality of switch circuits 422.
  • in one implementation, only one switch circuit 422 of the plurality of switch circuits 422 is in the conducting state at a time. That is, within the same time period, only one resonance inductance circuit 421 of the plurality of resonance inductance circuits 421 is allowed to participate in the resonance circuit.
  • in another implementation, one or more of the plurality of switch circuits 422 may be turned on. That is, within the same time period, one or more of the resonance inductance circuits 421 are allowed to participate in the resonance circuit.
  • one end of the one or more decoupling capacitors Cd is used to receive the third clock signal clk_3, and the other end of the decoupling capacitor Cd is used to connect to power or ground.
  • the resonant inductor Lr in the embodiment of the present application may also be referred to as an inductor Lr, and the decoupling capacitor Cd may also be referred to as a capacitor Cd.
  • each clock domain has a different clock frequency, and each clock domain includes an independent clock grid.
  • each clock domain corresponds to a synchronization time, and the devices in the domain are synchronized to this time; the synchronization times corresponding to different clock domains are independent of each other.
  • the multiple resonant inductive circuits have a one-to-one correspondence with multiple clock domains.
  • multiple resonant inductors Lr may be connected in parallel, each with an independent switch, and the number of inductors connected in parallel may depend on the number of clock domains.
  • the combination of different resonant inductance Lr and decoupling capacitor Cd can produce different oscillation frequencies.
  • the oscillation frequency of the resonance circuit can be calculated according to the following formula (2).
  • f represents the oscillation frequency
  • L represents the inductance value
  • C represents the capacitance value
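For an ideal LC resonance circuit, the standard textbook relation among the three quantities just defined is given below. It is stated here on the assumption that formula (2) is this usual relation; parasitic resistance and coupling between inductors are ignored.

```latex
f = \frac{1}{2\pi\sqrt{L\,C}} \tag{2}
```

Under this relation, increasing the inductance value by a factor of four while keeping the same capacitance halves the oscillation frequency.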
  • one end of the resonant inductor Lr is used to receive the third clock signal clk_3.
  • the other end of the resonant inductor Lr can also be connected to the power supply or the ground through the decoupling capacitor Cd.
  • the structure of the sub-node circuit 400 in FIG. 4 is merely an example. In practice, the sub-node circuit 400 may include more devices or adopt other connection methods. Alternatively, the sub-node circuit 400 may also include fewer modules and devices than those in FIG. 4. The embodiment of the present application does not limit the specific structure of the sub-node circuit 400, as long as it can realize the functions described in the foregoing.
  • the sub-node circuit 400 may utilize multiple resonant inductance circuits 421 to absorb peak current and reduce power consumption.
  • the sub-node circuit 400 may also use multiple resonance inductance circuits 421 and decoupling capacitors Cd in combination, that is, a combination of multiple inductors and capacitors to resonate to generate oscillation frequencies at different frequencies, so as to meet the processor's requirements for clock frequency conversion.
  • when the resonant inductor circuit 421 is disconnected, the decoupling capacitor Cd can be used as a general chip power supply decoupling capacitor to suppress transient noise (also called simultaneous switching noise, SSN).
  • a plurality of decoupling capacitors Cd may be used in parallel to increase the total capacitance, so as to obtain a better suppression effect.
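How the switch control signals S select an oscillation frequency can be sketched numerically. The following is an illustrative calculation only: it assumes ideal components, uses the standard LC relation f = 1/(2π√(LC)), treats several conducting branches as ideal inductors in parallel with no mutual coupling, and uses made-up component values. The function names are hypothetical.

```python
import math


def resonant_frequency(l_henry, c_farad):
    """Oscillation frequency of an ideal LC circuit, in hertz."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_henry * c_farad))


def bank_frequency(inductors, switches, c_farad):
    """Frequency when the inductors whose switch is on are connected in parallel."""
    active = [l for l, on in zip(inductors, switches) if on]
    if not active:
        return None  # no branch connected: Cd only acts as a decoupling capacitor
    l_eq = 1.0 / sum(1.0 / l for l in active)  # ideal parallel combination
    return resonant_frequency(l_eq, c_farad)


# Hypothetical values: two resonant inductors Lr and one decoupling capacitor Cd.
inductors = [1e-9, 4e-9]   # 1 nH and 4 nH
cd = 1e-12                 # 1 pF

f_first = bank_frequency(inductors, [True, False], cd)
f_second = bank_frequency(inductors, [False, True], cd)
f_both = bank_frequency(inductors, [True, True], cd)
print(f_first, f_second, f_both)
```

With only one switch on at a time, each resonant inductor gives its own frequency point; with several switches on, the smaller equivalent inductance gives a further, higher frequency point.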
  • the parasitic parameters and load of the clock network are large, so the power consumption will be greater.
  • when the gated clock unit 410 is not provided, the clock circuit toggles every clock cycle. When the gated clock unit 410 is provided, the clock circuit is controlled by the gating signal G, which reduces clock toggling, or in other words reduces clock switching activity, and thereby reduces power consumption. Adding the gated clock unit 410 to the sub-node circuit 400 therefore reduces clock switching activity and saves switching power. At the same time, because the switching activity at the clock pins is reduced, the internal power consumption of the registers is also reduced.
  • the power consumption reduction principle of the gated clock unit 410 is: by controlling the clock flip rate (denoted as a), the dynamic power consumption of the clock circuit is reduced. For example, formula (3) shows how the power consumption of the clock circuit is calculated.
  • P represents the power consumption of the clock circuit
  • a represents the clock flip rate
  • C represents the equivalent capacitance of the load
  • V represents the operating voltage of the clock circuit.
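The commonly used relation for dynamic (switching) power, written with the symbols defined above, is given below. It is offered as the standard textbook form rather than as a verbatim copy of formula (3); the clock frequency f is an additional symbol that does not appear in the list above and is an assumption of this sketch.

```latex
P = a \, C \, V^{2} f
```

For a fixed load capacitance, operating voltage, and clock frequency, the power consumption is proportional to the clock flip rate a, which is the quantity that the gated clock unit 410 reduces.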
  • FIG. 5 is a timing diagram of the gated clock control signal of the sub-node circuit 400 according to an embodiment of the present application.
  • the second clock signal clk_2 represents the clock signal output by the clock tree 320.
  • the gating signal G is an open-interface signal, meaning that it can be controlled by upper-layer software.
  • the control signal G_L represents the gating signal G after it has been synchronized with the second clock signal clk_2 by the latch 411.
  • the third clock signal clk_3 represents the result of the AND operation on the control signal G_L and the second clock signal clk_2.
  • the third clock signal clk_3 is the output signal of the gated clock unit 410.
  • the upper-level driving and control of the clock grid comes from the sub-node circuits 400.
  • a sub-node circuit 400 is usually not provided at every node. Therefore, a suitable algorithm is needed to optimize the layout of the sub-node circuits 400 in the clock grid.
  • clustering or linear regression algorithms can be used to implement the clock grid layout of the sub-node circuit 400.
  • FIG. 6 is a schematic diagram of the layout of the sub-node circuit 400 under load balancing according to an embodiment of the present application.
  • load balancing usually means that the registers in the same chip are of the same type; in other words, each register in the clock grid has the same or a similar gate load.
  • in this case, a clustering algorithm (for example, the K-means algorithm) or a linear regression algorithm can be used so that, for a fixed number of registers in the clock grid, the smallest number of sub-node circuits 400 drives those registers and the total distance from the sub-node circuits 400 to the registers is the shortest.
  • the above-mentioned distance can refer to a distance on a plane or a three-dimensional distance.
  • the K-means algorithm is an iterative solution clustering analysis algorithm. For a given sample set, K initial centroids can be designated as clusters, and the iteration is repeated until the algorithm converges.
  • FIG. 6 uses the K-means algorithm as an example of the clustering algorithm. It should be understood that other clustering algorithms may also be used to implement the layout of the sub-node circuits 400 in the load-balanced case.
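The use of K-means for placing sub-node circuits can be sketched as follows. This is a minimal illustration under stated assumptions, not the layout procedure of FIG. 6: register positions are plain 2D coordinates, all registers carry equal load, the number of sub-node circuits k is chosen in advance, and the initial centroids are picked deterministically from the input. The function name `kmeans` and the coordinates are hypothetical.

```python
import math


def kmeans(points, k, iterations=100):
    """Plain Lloyd's-algorithm K-means on 2D points; returns (centroids, labels)."""
    # Deterministic seeds taken from the input (an assumption of this sketch).
    centroids = [points[i * len(points) // k] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each register goes to its nearest candidate position.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: move each candidate to the mean position of its registers.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                new_centroids.append(
                    (sum(x for x, _ in members) / len(members),
                     sum(y for _, y in members) / len(members))
                )
            else:
                new_centroids.append(centroids[c])  # keep the centroid of an empty cluster
        if new_labels == labels and new_centroids == centroids:
            break  # converged: nothing changed in this iteration
        labels, centroids = new_labels, new_centroids
    return centroids, labels


# Hypothetical register positions forming two groups on the clock grid.
registers = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
positions, owner = kmeans(registers, k=2)
total_distance = sum(math.dist(p, positions[c]) for p, c in zip(registers, owner))
print(positions, owner, total_distance)
```

Each returned centroid is a candidate position for one sub-node circuit 400, and `total_distance` is the quantity the text asks to keep short. In practice a centroid would still have to be snapped to a legal grid node, which this sketch does not attempt.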
  • the grid equal-division method can be used first to decompose the register circuit, after which the clustering or linear regression algorithm is applied.
  • FIG. 7 is a schematic diagram of the layout of the sub-node circuit 400 in the case of non-load balancing according to an embodiment of the present application.
  • the clock grid can be divided into multiple sub-grids. Within each sub-grid, the equal-division method is first used to decompose the large registers into registers of the smallest granularity, so that the non-load-balanced circuit is normalized into a load-balanced circuit. Then a clustering algorithm (for example, the K-means algorithm) or a linear regression algorithm is used to solve for the layout of the sub-node circuits 400.
  • the above-mentioned register with the smallest granularity may refer to the register of the smallest size, where the smallest size is the smallest register unit that the semiconductor factory can manufacture.
  • the granularity of a register may refer to its size; the larger the register, the greater its load.
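The normalization of a non-load-balanced sub-grid can be illustrated with a short sketch. It assumes that each register is described by a position and a size expressed as an integer multiple of the smallest register, and that the unit registers produced by the decomposition stay at the position of the original register; both are assumptions made for illustration, and the names `decompose` and `centroid` are hypothetical.

```python
def decompose(registers, unit_size=1):
    """Split each (x, y, size) register into size // unit_size unit-load points."""
    units = []
    for x, y, size in registers:
        count = max(1, size // unit_size)
        # Assumption: the unit registers stay at the original register's position.
        units.extend([(x, y)] * count)
    return units


def centroid(points):
    """Mean position of a list of 2D points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


# Hypothetical sub-grid: one smallest-size register and one register three times larger.
registers = [(0.0, 0.0, 1), (4.0, 0.0, 3)]
units = decompose(registers)
driver_position = centroid(units)
print(units, driver_position)
```

After decomposition every point carries the same load, so the same clustering step used in the load-balanced case can be applied to `units`; the resulting position is pulled toward the larger register in proportion to its load.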
  • FIG. 8 is a schematic diagram of the layout of a sub-node circuit 400 according to an embodiment of the present application. As shown in FIG. 8, after the sub-node circuits 400 are laid out, the relative relationship between a sub-node circuit 400 and the registers is as follows: the distances from a given vertex in the figure (a sub-node circuit 400) to the center positions of the 4 surrounding sub-grids (which contain the registers) are equal.
  • FIG. 9 is a schematic flowchart of a method for designing and simulating the circuit layout of a sub-node circuit 400 according to an embodiment of the present application.
  • the method in Figure 9 can be implemented using circuit simulation software. As shown in Figure 9, the method includes:
  • S905 Generate a clock tree according to the location of the clock grid.
  • FIG. 10 is a schematic diagram of a method of extracting a clock circuit model and simulation according to an embodiment of the present application.
  • the method in Figure 10 can be implemented using circuit simulation software.
  • FIG. 10 shows the specific flow of step S10 in FIG. 9. As shown in Figure 10, the method includes:
  • the above-mentioned clock design schematic may be, for example, register transfer level (RTL) netlist code.
  • RTL refers to an abstract level of logic design, which uses a hardware description language to describe the desired function.
  • S104 Grab the clock output connection and the clock buffer step by step.
  • S105 Determine whether the captured output pin is a clock pin of the register.
  • the PVT process corner refers to the process-voltage-temperature corner (PVT corner). It represents the process, voltage, and temperature conditions that need to be met in the simulation design.
  • extracting the parameters of the differential line and the buffer circuit above includes extracting the above parameters through an RLC simulation program with integrated circuit emphasis (SPICE) model.
  • SPICE is a circuit-level analog simulation program.
  • the foregoing construction of the end-to-end clock link netlist includes the construction of the foregoing netlist through the RTL SPICE model.
  • the aforementioned outputting of the clock source excitation may include setting the frequency of the clock source signal, the clock jitter, and the voltage slew rate.
  • clock jitter can refer to the deviation between the timing of a signal event and its ideal position.
  • the slew rate, also called the voltage slew rate, can refer to the change in voltage per unit of time.
  • S110 Judge whether the simulation result meets the requirements. If yes, the simulation ends; if not, modify the clock design, and return to S101 to re-execute the clock design simulation.
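Steps S104 and S105 amount to walking the clock network outward from the clock source until register clock pins are reached. A minimal sketch of such a traversal is given below. The netlist representation (a dictionary mapping each node to the nodes it drives), the pin-naming convention ending in `/CK`, and the function name `trace_clock_paths` are all hypothetical; a real flow would query the netlist database of the EDA or simulation tool instead.

```python
from collections import deque


def trace_clock_paths(fanout, is_register_clk_pin, source):
    """Walk the clock network from `source`; return {register_pin: path of nodes}."""
    paths = {}
    queue = deque([(source, [source])])
    visited = {source}
    while queue:
        node, path = queue.popleft()
        for nxt in fanout.get(node, []):
            if nxt in visited:
                continue
            visited.add(nxt)
            if is_register_clk_pin(nxt):
                paths[nxt] = path + [nxt]          # end point reached (the check in S105)
            else:
                queue.append((nxt, path + [nxt]))  # keep following buffers (S104)
    return paths


# Hypothetical clock network: PLL -> buf0 -> {buf1, buf2} -> register clock pins.
fanout = {
    "pll": ["buf0"],
    "buf0": ["buf1", "buf2"],
    "buf1": ["reg_a/CK"],
    "buf2": ["reg_b/CK", "reg_c/CK"],
}
paths = trace_clock_paths(fanout, lambda name: name.endswith("/CK"), "pll")
for pin, path in paths.items():
    print(pin, " <- ", " -> ".join(path))
```

Each recorded path lists the buffers between the clock source and one register clock pin, which is the information needed to build an end-to-end clock link for simulation.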
  • the clock tree can be designed with a differential circuit.
  • the differential clock tree has strong anti-noise ability, low power consumption, and long driving distance to the load.
  • the embodiment of the present application also proposes an electronic device, which is provided with a three-dimensional clock circuit architecture, so that the clock performance can be improved.
  • the three-dimensional clock architecture may include the clock circuit 300 described above, or the three-dimensional clock circuit architecture may also include other types of clock circuits.
  • the above-mentioned electronic device may be a multi-chip stacked package structure, which allows chips to be stacked in multiple layers and provides signal connections for the chips in the vertical direction by means of through silicon vias (TSVs).
  • the electronic device may be a processor.
  • the above-mentioned three-dimensional clock architecture can be combined with many-core system-on-chip (SoC) architectures such as a 2-dimensional mesh (2D mesh) or a three-dimensional integrated circuit (3DIC), and with new processes.
  • the above-mentioned new processes can include chip-on-wafer-on-substrate (CoWoS) packaging, fan-out packaging (FOP), three-dimensional integrated circuit through silicon via (3DIC TSV, or 3D TSV) packaging, and the like.
  • FIG. 11 is a topological design diagram of an electronic device 1100 according to an embodiment of the present application.
  • the electronic device 1100 adopts the 3D topology structure of 3DIC TSV.
  • 3DIC TSV refers to stacking multiple chips together in one package and using through silicon vias (TSVs) to achieve high-speed, efficient data communication between the chips.
  • the electronic device includes a first chip 1101, a second chip 1102 and a carrier board 1104.
  • the second chip 1102 is disposed under the first chip 1101
  • the carrier board 1104 is disposed under the second chip 1102.
  • the lower surface of the first chip 1101 is provided with a plurality of micro solder balls 1105, and the first chip 1101 is electrically connected to the second chip 1102 through the plurality of micro solder balls 1105.
  • the second chip 1102 is provided with a plurality of through silicon vias 1107 penetrating through the second chip 1102, the lower surface of the second chip 1102 is provided with a plurality of solder balls 1108, the plurality of through silicon vias 1107 are connected to the plurality of solder balls 1108, and the second chip 1102 is electrically connected to the carrier board 1104 through the plurality of through silicon vias 1107 and the plurality of solder balls 1108.
  • a ball grid array 1109 is provided on the lower surface of the carrier board 1104, and the ball grid array 1109 is used to connect the carrier board 1104 to a printed circuit board (PCB) (not shown in FIG. 11).
  • the carrier board 1104 may also be called a package carrier, a substrate, or a package substrate, and provides mechanical support, protection, heat dissipation, and electrical connection channels for the chips. Circuits are provided on the carrier board 1104 to conduct signals between the first chip 1101, the second chip 1102, and the PCB.
  • the carrier board 1104 can be connected to the external pins of the processor through bonding wires, and the ball grid array 1109 of the carrier board 1104 can be used for the connection between the packaged processor and the PCB motherboard.
  • the aforementioned clock circuit (100, 200, 300) can be provided in the electronic device 1100, or other types of clock circuits can be provided.
  • the PLL (110, 210, 310) and the clock tree (110, 220, 320) in the clock circuit can be set in the first chip 1101.
  • the second chip 1102 can be provided with clock grids (230, 330) and registers (150, 250, 350) in the clock circuit.
  • the PLL 310 in the clock circuit 300 and the clock tree 320 can be provided in the first chip 1101, and the clock grid 330 in the clock circuit 300 can be provided in the second chip 1102.
  • the clock grid 330 includes a plurality of registers 350.
  • the clock grid 330 includes a plurality of nodes, and a sub-node circuit 400 may be provided at some, but not necessarily all, of the plurality of nodes.
  • the clock tree 320 includes multiple levels of clock buffers 311, and the last-level clock buffer among them is connected to the plurality of sub-node circuits 400 in the clock grid 330 through the plurality of micro solder balls 1105.
  • the first chip 1101 may include a logic chip or a memory chip (logic&memory die).
  • the second chip 1102 may include a logic die.
  • the electronic device in FIG. 11 may be a processor.
  • the chip architecture design process of the electronic device in FIG. 11 can be described as follows.
  • the chip architecture design process can be implemented by electronic design automation (EDA) software.
  • S501 Generate a clock grid according to the placement density and location of the registers.
  • S502 According to the position of the register on the clock grid, place the sub-node circuit 400 and connect the sub-node circuit 400 to the clock grid node.
  • S503 Insert a through silicon via in the placement position of the sub-node circuit 400, and the through silicon via is connected to the sub-node circuit 400 for connection with the circuit of the first chip.
  • S603 Generate a clock tree according to the location of the PLL and the sink clock buffer.
  • FIG. 12 is a topological design diagram of an electronic device 1200 according to another embodiment of the present application. Compared with FIG. 11, the electronic device 1200 may include more chips. As shown in FIG. 12, the electronic device 1200 includes a first chip 1101, a second chip 1102, a third chip 1103, and a carrier board 1104 from top to bottom. For the sake of brevity, the same or similar content in FIG. 12 and FIG. 11 will not be repeated here.
  • the second chip 1102 is disposed under the first chip 1101 and is electrically connected to the first chip 1101.
  • the first chip 1101 and the second chip 1102 may be bonded and connected in a physical or chemical manner.
  • a plurality of first through silicon vias 1106 are provided in the second chip 1102.
  • the lower surface of the second chip 1102 is provided with a plurality of micro solder balls 1105, and the second chip 1102 is electrically connected to the third chip 1103 through a plurality of first through silicon vias 1106 and the plurality of micro solder balls 1105.
  • the third chip 1103 is provided with a plurality of second through silicon vias (TSVs) 1107 passing through the third chip 1103, and the lower surface of the third chip 1103 is provided with a plurality of solder balls 1108.
  • the second through silicon vias 1107 are connected to the plurality of solder balls 1108, and the third chip 1103 is electrically connected to the carrier board 1104 through the plurality of second through silicon vias 1107 and the plurality of solder balls 1108.
  • a ball grid array 1109 is provided on the lower surface of the carrier board 1104, and the ball grid array 1109 is used to connect the carrier board 1104 and the PCB (not shown in FIG. 12).
  • the clock circuit (100, 200, 300) described above can be provided in the electronic device 1200, or other types of clock circuits can be provided.
  • the PLL (110, 210, 310) in the clock circuit and a part of the clock tree (110, 220, 320) can be set in the first chip 1101.
  • Another part of the clock tree (110, 220, 320) can be set in the second chip 1102.
  • the third chip 1103 can be provided with clock grids (230, 330) and registers (150, 250, 350) in the clock circuit.
  • the PLL 310 in the clock circuit 300 and a part of the clock tree 320 can be provided in the first chip 1101, and another part of the clock tree 320 can be provided in the second chip 1102.
  • the clock grid 330 in the clock circuit 300 can be provided in the third chip 1103.
  • the clock grid 330 includes a plurality of registers 350.
  • the clock grid 330 includes a plurality of nodes, and a sub-node circuit 400 may be provided at some, but not necessarily all, of the plurality of nodes.
  • the clock tree 320 includes multiple levels of clock buffers 311, and the last-level clock buffer among them is connected to the plurality of sub-node circuits 400 in the clock grid 330 through the plurality of first through silicon vias 1106 located in the second chip 1102 and the plurality of micro solder balls 1105.
  • the first chip 1101 and the second chip 1102 may include a logic chip or a memory chip (logic&memory die).
  • the third chip 1103 may include a logic die.
  • the electronic device 1200 in FIG. 12 may be a processor.
  • FIG. 13 is a topological design diagram of an electronic device 1300 according to another embodiment of the present application.
  • the topology design is based on a chiplet architecture.
  • the chiplet structure may also be referred to as a Lego structure, and may adopt a chip-on-wafer-on-substrate (CoWoS) package or a fan-out package (FOP).
  • the electronic device 1300 may be a processor.
  • CoWoS refers to a 2.5-dimensional integrated packaging technology.
  • in this packaging process, the semiconductor chip is first connected to the wafer through a chip-on-wafer (CoW) packaging process, and the CoW chip is then connected to the carrier board to form the CoWoS package.
  • This packaging method can encapsulate multiple chips together, achieving the effects of small package size, low power consumption, and fewer pins.
  • FOP connects chips with different functions and passive components by redistributing circuit layers to reduce the volume of the package. Both cost and performance can be taken into consideration at the same time, and heterogeneous chips can be integrated into a single package.
  • FOP mainly includes two types: fan-out wafer-level packaging (FOWLP) and fan-out panel-level packaging (FOPLP).
  • the electronic device 1300 includes a first chip layer 1301, a second chip layer 1302 and a carrier board 1304.
  • the first chip layer 1301 may include multiple chips, and the multiple chips are interconnected at a high speed through an interposer.
  • as an example for illustration, the first chip layer 1301 includes two central processing unit (CPU) chips (13-1, 13-2) and two IO chips (13-3, 13-4).
  • the clock network inside each CPU chip (13-1, 13-2) can be the clock circuit (100, 200, 300) mentioned above, or other types of clock circuits.
  • the PLL (110, 210, 310) and the clock tree (110, 220, 320) in the clock circuit can be set in the CPU chip (13-1, 13-2).
  • the second chip layer 1302 can be provided with clock grids (230, 330) and registers (150, 250, 350) in the clock circuit.
  • FIG. 14 is a schematic diagram of the interconnection between the CPU chips in FIG. 13.
  • the above-mentioned high-density, high-speed interconnection method may adopt, for example, small input/output (small IO) technology.
  • the first chip, the second chip, and the third chip in the foregoing figures and embodiments are bare dies.
  • the first chip, the second chip, and the third chip may also be referred to as the first die, the second die, and the third die.
  • the first die, the second die, and the third die respectively include one or more layers of die.
  • the clock design scheme of the electronic device 1300 in FIG. 13 includes:
  • S801 Generate a clock grid according to the placement density and location of the registers.
  • the aforementioned clock grid may be one or more clock grids.
  • S803 Determine the position of the PLL according to the chip layout requirements and the positions of the sub-node circuits 400.
  • S804 Generate a clock tree according to the positions of the PLL and the sub-node circuit 400, and determine the positions of the clock buffers at all levels.
  • the cross-chip clock signals are connected to each other through the high-speed interconnection between the chips.
  • the disclosed system, device, and method can be implemented in other ways.
  • the device embodiments described above are merely illustrative. For example, the division of the units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • if the function is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • the technical solution of the present application in essence, or the part that contributes to the existing technology, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions used to make a computer device (which may be a personal computer, a server, a network device, or the like) execute all or part of the steps of the methods described in the various embodiments of the present application.
  • the aforementioned storage media include: a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disk, and other media that can store program code.

Abstract

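A processor clock system, a sub-node circuit (400) in a clock circuit, and an electronic device. The clock system includes a phase-locked loop PLL (310), a clock tree (320), and a clock grid (330). The clock tree (320) is composed of a plurality of clock buffers (321) in a layered structure; the clock tree (320) is used to receive the first clock signal clk_1 output by the phase-locked loop PLL (310) and to output the second clock signal clk_2. A plurality of sub-node circuits (400) are provided at some nodes of the clock grid (330) and are used to generate the third clock signal clk_3 according to the second clock signal clk_2, wherein the clock grid (330) and the clock tree (320) form a three-dimensional structure.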

Description

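Processor clock system, sub-node circuit in clock system, and electronic device
This application claims priority to Chinese patent application No. 202110172075.8, filed with the Chinese Patent Office on February 8, 2021 and entitled "Processor clock system, sub-node circuit in clock system, and electronic device", and to Chinese patent application No. 202010208897.2, filed with the Chinese Patent Office on March 23, 2020 and entitled "A low-power clock design for high-performance processors", both of which are incorporated herein by reference in their entirety.
Technical Field
This application relates to the field of circuits, and in particular to a processor clock system, a sub-node circuit in a clock system, and an electronic device.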
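Background
The clock cycle, also called the oscillation period, is defined as the reciprocal of the clock frequency. The clock cycle is the most basic and smallest unit of time in a computer. Within one clock cycle, the processor completes only one most basic action. Clock frequency is an important indicator of processor performance. In processor design, if the operations completed within one clock cycle are comparable, then a shorter clock cycle and a higher clock frequency mean better processor performance.
In terms of physical implementation, the clock circuit inside a processor usually consists of a phase-locked loop (PLL) and a clock tree or clock network, which is then connected to the clock pins of the various registers.
With the rapid development of computer technology, a typical high-performance processor today integrates on the order of ten billion transistors, and the chip area (die size) approaches 1000 mm². In clock circuit design, as chips grow larger, the clock circuit faces insufficient driving capability, an increasing number of stages, and growing delay and power consumption, so clock skew and clock jitter become increasingly serious. As a result, raising the processor clock frequency becomes ever more difficult, and processor performance struggles to meet growing application demands.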
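Summary
This application provides a processor clock system, a sub-node circuit in a clock system, and an electronic device, which can meet the clock frequency conversion requirements that application scenarios place on the clock circuit.
In a first aspect, a processor clock system is provided, including a phase-locked loop PLL, a clock tree, and a clock grid, wherein the phase-locked loop PLL is used to output a first clock signal clk_1; the clock tree is used to receive the first clock signal clk_1 and output a second clock signal clk_2; and the clock grid includes a plurality of nodes, some of which are provided with sub-node circuits, the sub-node circuits being connected to the clock tree and used to generate a third clock signal clk_3 according to the second clock signal clk_2, wherein the clock grid and the clock tree form a three-dimensional structure.
The sub-node circuit includes a resonance and decoupling unit and a gated clock unit. The gated clock unit is used to receive the gating signal G and the second clock signal clk_2, and to output the third clock signal clk_3 to the clock grid and to the resonance and decoupling unit. The resonance and decoupling unit includes a decoupling capacitor and a plurality of resonant inductors, and supports the generation of oscillation frequencies at multiple different frequency points.
The sub-node circuit receives and processes the second clock signal clk_2 output by the clock tree and outputs the third clock signal clk_3, thereby realizing the gating function. The sub-node circuit can also support a resonance circuit that generates an oscillation frequency, so as to absorb the peak current in the clock circuit and reduce power consumption. Further, the sub-node circuit can support the generation of oscillation frequencies at multiple different frequency points to meet the chip's more demanding requirements for clock frequency conversion. The sub-node circuit can therefore combine the gating function and the resonance function and generate oscillation frequencies at different frequency points, which improves the performance of the clock circuit.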
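In a possible implementation, the clock tree is composed of a plurality of clock buffers in a layered structure.
Optionally, the processor clock system may also be referred to as a clock system or a clock circuit.
Optionally, the sub-node circuit may be referred to as a gating & resonant & decoupling (GRD) circuit.
With reference to the first aspect, in a possible implementation, the gated clock unit is used to receive the second clock signal clk_2 and the gating signal G, and to output the third clock signal clk_3 to the clock grid and to the resonance and decoupling unit.
The gated clock unit generates the third clock signal clk_3 according to the second clock signal clk_2 and the gating signal G, thereby realizing the gating function and improving the performance of the clock circuit.
Here, the gating signal G being configured by the system means that the gating signal G can come from the control logic circuit inside the chip, or can be generated by upper-layer software through the control logic circuit.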
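With reference to the first aspect, in a possible implementation, the resonance and decoupling unit includes a plurality of parallel resonant inductor circuits, a plurality of switch circuits, and one or more decoupling capacitors, wherein the plurality of resonant inductor circuits correspond one-to-one to the plurality of switch circuits, and a resonant inductor is provided on each of the plurality of resonant inductor circuits.
The sub-node circuit also includes a decoupling capacitor to reduce power supply noise. The sub-node circuit is therefore a circuit that integrates the gating function, the resonance function, and the noise reduction function in one design, and providing sub-node circuits in the clock grid can improve the performance of the clock circuit.
The plurality of resonant inductor circuits correspond one-to-one to the plurality of switch circuits, so that turning different switch circuits on or off generates oscillation frequencies at different frequency points, which meets the clock frequency conversion requirements that application scenarios place on the clock circuit and improves the performance of the clock circuit.
With reference to the first aspect, in a possible implementation, a first switch circuit among the plurality of switch circuits is used to receive a switch control signal S, the switch control signal S is used to control the first switch circuit to be in the on state or the off state, and the first switch circuit is any one of the plurality of switch circuits.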
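With reference to the first aspect, in a possible implementation, the plurality of resonant inductor circuits correspond to a plurality of clock domains.
As chips grow larger, the inside of a chip can be logically divided into multiple clock domains; each clock domain has a different clock frequency and includes an independent clock grid. In other words, each clock domain corresponds to one synchronization time, and the devices in the domain are all synchronized to that time; the synchronization times of different clock domains are independent of each other.
Specifically, the plurality of resonant inductor circuits can be placed in one-to-one correspondence with the plurality of clock domains, that is, the resonant frequency generated by each resonant inductor circuit corresponds to one clock domain, so that the clock circuit is applicable to chips that include multiple clock domains, which broadens the application range and improves the utilization efficiency of the clock circuit.
With reference to the first aspect, in a possible implementation, one end of each of the plurality of decoupling capacitors is used to receive the third clock signal clk_3, and the other end of the decoupling capacitor is used to connect to power or ground.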
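With reference to the first aspect, in a possible implementation, the plurality of sub-node circuits are further connected to a plurality of registers through the clock grid, and the distribution positions of the plurality of sub-node circuits in the clock grid are determined according to the loads and positions of the plurality of registers.
The upper-level driving and control of the clock grid comes from the sub-node circuits. To save area and improve driving efficiency, a sub-node circuit is usually not provided at every node, so a suitable algorithm is needed to optimize the layout of the sub-node circuits in the clock grid. Determining the distribution positions of the plurality of sub-node circuits in the clock grid from the loads and positions of the plurality of registers can therefore improve the driving efficiency of the clock circuit.
With reference to the first aspect, in a possible implementation, when the plurality of registers are load-balanced in the clock grid, the distribution positions of the plurality of sub-node circuits in the clock grid are determined according to a clustering algorithm or a linear regression algorithm, wherein the input of the clustering algorithm or linear regression algorithm includes the positions of the plurality of registers in the clock grid, and its output includes the distribution positions of the plurality of sub-node circuits in the clock grid.
Using a clustering algorithm or a linear regression algorithm to determine the distribution positions of the plurality of sub-node circuits in the clock grid achieves the goal of using as few sub-node circuits as possible while allocating their positions sensibly, which can improve the driving efficiency of the clock circuit.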
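According to a second aspect, a sub-node circuit in a clock system is provided. The sub-node circuit is disposed in a clock grid of the clock system and includes a clock gating unit and a resonance and decoupling unit. The clock gating unit is configured to receive a gating signal G and a second clock signal clk_2 from a clock tree, and to output a third clock signal clk_3 to the clock grid and to the resonance and decoupling unit. The resonance and decoupling unit includes a decoupling capacitor and a plurality of resonant inductors, supports generating oscillation frequencies at a plurality of different frequency points, and is configured to receive the third clock signal clk_3.
The sub-node circuit receives the second clock signal clk_2 output by the clock tree, processes it, and outputs the third clock signal clk_3, thereby implementing a gating function. The sub-node circuit can also support a resonant circuit that generates an oscillation frequency, so as to absorb peak current in the clock circuit and reduce power consumption. Further, the sub-node circuit can support oscillation frequencies at a plurality of different frequency points, to meet a chip's more demanding requirements for clock-frequency changes. The sub-node circuit can therefore combine the gating function and the resonance function and generate oscillation frequencies at different frequency points, which improves the performance of the clock circuit.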
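With reference to the second aspect, in a possible implementation, the clock gating unit is configured to receive the second clock signal clk_2 and the gating signal G, and to output the third clock signal clk_3 to the clock grid and to the resonance and decoupling unit.
With reference to the second aspect, in a possible implementation, the resonance and decoupling unit includes a plurality of resonant inductor circuits connected in parallel, a plurality of switch circuits, and one or more decoupling capacitors, where the resonant inductor circuits correspond one-to-one to the switch circuits, and each resonant inductor circuit is provided with a resonant inductor.
With reference to the second aspect, in a possible implementation, a first switch circuit among the plurality of switch circuits is further configured to receive a switch control signal S, which controls whether the first switch circuit is on or off. The first switch circuit is any one of the plurality of switch circuits.
With reference to the second aspect, in a possible implementation, the plurality of resonant inductor circuits correspond one-to-one to a plurality of clock domains.
With reference to the second aspect, in a possible implementation, one end of the decoupling capacitor is configured to receive the third clock signal clk_3, and the other end is configured to be connected to power or ground.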
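According to a third aspect, an electronic device is provided, including a first chip, a second chip, and a substrate. A plurality of micro-bumps are disposed on the lower surface of the first chip, and the first chip is electrically connected to the second chip through the micro-bumps. The second chip is disposed below the first chip; a plurality of through-silicon vias penetrating the second chip are disposed in the second chip, a plurality of solder balls are disposed on the lower surface of the second chip, and the through-silicon vias are connected to the solder balls. The substrate is disposed below the second chip, and the second chip is electrically connected to the substrate through the through-silicon vias and the solder balls. A clock circuit is disposed in the electronic device and includes: a phase-locked loop (PLL), configured to output a first clock signal clk_1; a clock tree, configured to receive the first clock signal clk_1 and output a second clock signal clk_2; and a clock grid, connected to a plurality of registers and configured to receive the second clock signal clk_2 and output a third clock signal clk_3 to the registers. The PLL and the clock tree are disposed in the first chip, and the clock grid is disposed in the second chip.
An electronic device that includes a plurality of stacked and packaged chips may adopt a three-dimensional clock circuit architecture, in which the clock circuit is distributed across two or more chips. With the PLL and the clock tree in the first chip and the clock grid in the second chip, the clock signal can drive the circuits in the chips more flexibly and more quickly, which can improve clock performance.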
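With reference to the third aspect, in a possible implementation, the clock grid includes a plurality of nodes, and some of the nodes are provided with the sub-node circuit according to the second aspect or any implementation of the second aspect.
With reference to the third aspect, in a possible implementation, the clock tree includes multiple stages of clock buffers, and the last stage of clock buffers is connected to a plurality of sub-node circuits in the clock grid through the micro-bumps.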
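According to a fourth aspect, an electronic device is provided, including a first chip, a second chip, a third chip, and a substrate. The second chip is disposed below the first chip and is electrically connected to the first chip; a plurality of first through-silicon vias are disposed in the second chip, a plurality of micro-bumps are disposed on the lower surface of the second chip, and the second chip is electrically connected to the third chip through the first through-silicon vias and the micro-bumps. The third chip is disposed below the second chip; a plurality of second through-silicon vias penetrating the third chip are disposed in the third chip, a plurality of solder balls are disposed on the lower surface of the third chip, and the second through-silicon vias are connected to the solder balls. The substrate is disposed below the third chip, and the third chip is electrically connected to the substrate through the second through-silicon vias and the solder balls. A clock circuit is disposed in the electronic device and includes: a phase-locked loop (PLL), configured to output a first clock signal clk_1; a clock tree, configured to receive the first clock signal clk_1 and output a second clock signal clk_2; and a clock grid, connected to a plurality of registers and configured to receive the second clock signal clk_2 and output a third clock signal clk_3 to the registers. The PLL and one part of the clock tree are disposed in the first chip, another part of the clock tree is disposed in the second chip, and the clock grid is disposed in the third chip.
An electronic device that includes a plurality of stacked and packaged chips may adopt a three-dimensional clock circuit architecture, in which the clock circuit is distributed across two or more chips. With the PLL and one part of the clock tree in the first chip, another part of the clock tree in the second chip, and the clock grid in the third chip, the clock signal can drive the circuits in the chips more flexibly and more quickly, which can improve clock performance.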
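With reference to the fourth aspect, in a possible implementation, the clock grid includes a plurality of nodes, and some of the nodes are provided with the sub-node circuit according to the second aspect or any implementation of the second aspect.
With reference to the fourth aspect, in a possible implementation, the clock tree includes multiple stages of clock buffers, and the last stage of clock buffers is connected to a plurality of sub-node circuits in the clock grid through the first through-silicon vias and the micro-bumps.
According to a fifth aspect, a chip system is provided. The chip system is provided with the clock circuit according to the first aspect or any implementation of the first aspect, or with the sub-node circuit for a clock circuit according to the second aspect or any implementation of the second aspect.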
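Brief Description of the Drawings
Figure 1 is a schematic diagram of a clock circuit 100 according to an embodiment of this application.
Figure 2 is a schematic diagram of a clock circuit 200 according to another embodiment of this application.
Figure 3 is a schematic diagram of a clock circuit 300 according to an embodiment of this application.
Figure 4 is a schematic diagram of the principle of a sub-node circuit 400 according to an embodiment of this application.
Figure 5 is a timing diagram of the clock gating control signals according to an embodiment of this application.
Figure 6 is a schematic layout of sub-node circuits 400 in the load-balanced case according to an embodiment of this application.
Figure 7 is a schematic layout of sub-node circuits 400 in the non-load-balanced case according to an embodiment of this application.
Figure 8 is a schematic diagram of the layout of sub-node circuits 400 according to an embodiment of this application.
Figure 9 is a schematic flowchart of a method for designing and simulating the layout of sub-node circuits 400 according to an embodiment of this application.
Figure 10 is a schematic diagram of a method for extracting a clock circuit model and simulating it according to an embodiment of this application.
Figure 11 is a topology design diagram of an electronic device 1100 according to an embodiment of this application.
Figure 12 is a topology design diagram of an electronic device 1200 according to another embodiment of this application.
Figure 13 is a topology design diagram of an electronic device 1300 according to another embodiment of this application.
Figure 14 is a schematic diagram of the interconnection between chips of the electronic device 1300 in Figure 13.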
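Detailed Description
The technical solutions in this application are described below with reference to the accompanying drawings.
For ease of understanding, several terms used in the solutions of this application are introduced first.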
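Clock skew: the time offset with which one clock source arrives at the clock pins of two different registers. Clock skew can be expressed as:
$$T_{skew} = T_{clk2} - T_{clk1} \tag{1}$$
where $T_{skew}$ denotes the clock skew, and $T_{clk1}$ and $T_{clk2}$ denote the instants at which the clock source arrives at the clock pins of the two registers.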
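As a small worked example of formula (1), the sketch below computes the skew between two registers and the worst-case skew across a group. The arrival times and register names are hypothetical values chosen for illustration; they do not come from the application.

```python
def clock_skew(t_clk1: float, t_clk2: float) -> float:
    """Formula (1): T_skew = T_clk2 - T_clk1, in the unit of the inputs."""
    return t_clk2 - t_clk1


# Hypothetical arrival times (picoseconds) of one clock edge at three registers.
arrival_ps = {"reg_a": 120.0, "reg_b": 145.0, "reg_c": 131.0}

# Skew between two specific registers.
skew_ab = clock_skew(arrival_ps["reg_a"], arrival_ps["reg_b"])

# Worst-case skew in the group is the spread between earliest and latest arrival.
worst_skew = max(arrival_ps.values()) - min(arrival_ps.values())

print(f"skew(reg_a, reg_b) = {skew_ab} ps, worst case = {worst_skew} ps")
```

A skew of a few tens of picoseconds, as in this toy data, is the order of magnitude the description later quotes for typical processor designs.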
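Clock jitter: the deviation of the actual clock edge from the ideal clock edge, sometimes leading and sometimes lagging, that does not accumulate over time.
Differential circuit: a circuit whose input receives two input signals, the difference between which is the effective input signal. The output of a differential circuit is the amplified difference between the two input signals. If an interfering signal is present, it disturbs both input signals equally; after the difference is taken, the effective input from the interference is zero, so common-mode interference is rejected.
Phase-locked loop (PLL): a circuit used to unify and consolidate clock signals so that high-frequency devices work properly. A PLL is usually an analog circuit placed inside the chip. The clock signal at its input comes from a clock generation circuit outside the chip, usually a crystal oscillator, and the clock signal at its output is the clock source signal of the clock network. The output clock source of the PLL propagates through the clock tree to the clock pins of the sequential circuits inside the chip (shown as registers in the figures) and serves as the input of each part of the chip.
Clock network: the topology of the clock circuit inside the chip, usually implemented as a clock tree or a clock grid.
Clock grid: a way of implementing the clock network with a two-dimensional mesh (2D mesh) structure.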
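For ease of understanding, the basic principle of a clock circuit is introduced next. Figure 1 is a schematic diagram of a clock circuit 100 according to an embodiment of this application.
As shown in Figure 1, the clock circuit 100 includes a PLL 110 and a clock tree 120. The PLL 110 is configured to output a clock signal. The clock tree 120 is configured to amplify the clock signal so as to drive the downstream circuits, for example the registers 150 in the downstream circuits.
As shown in Figure 1, the clock tree 120 usually includes multiple stages of clock buffers 121, each stage including one or more clock buffers 121. The clock buffer 121 connected to the PLL 110 may be called the main clock buffer 122. The last stage of clock buffers 121 may be used to connect to the registers 150.
In a conventional clock tree design, the number of clock buffer stages grows as the chip grows. As the number of stages increases, the drive capability of the clock buffers weakens, delay increases, and power consumption rises, so the clock skew and clock jitter at the registers in the downstream circuits also increase significantly. For example, in a typical processor design the average clock skew is several tens of ps (picoseconds).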
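Figure 2 is a schematic structural diagram of a clock circuit 200 according to another embodiment of this application. It is a clock circuit improved on the basis of Figure 1. The clock circuit 200 places a clock grid 230 between the clock tree 220 and the registers 250. The clock grid 230 is a common type of clock network and refers to a meshed clock network topology. The clock grid 230 consists of the connections between the clock tree and the sequential circuits inside the chip (shown as registers 250 in the figure); each sequential circuit in the chip needs to be driven by a clock signal to update its data. The clock network allows the clock signal to reach every sequential device.
Compared with the clock circuit design of Figure 1, this meshed clock design achieves smaller clock skew and clock jitter, which allows a higher clock frequency.
However, the clock topology of Figure 2 is a global clock mesh, which has some drawbacks. For example, a global mesh places high demands on the metal routing of the chip: the metal line specifications (for example, width, area, or vias) and the number of layers need to be customized, which limits generality. The resistance-capacitance (RC) parameters of the meshed circuit are large, so power consumption is high. A global mesh also places high demands on clock buffer drive strength, requiring many clock buffers that occupy a large area.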
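To address these problems, one improvement is to introduce resonant inductors into the clock network. The advantage of this approach is lower power consumption. A clock network with resonant inductors may be called a resonant clock network. It uses on-chip inductors to create an "electric pendulum", forming a resonant circuit (also called an "oscillating loop"). The resonant circuit generates an oscillation frequency and thereby reuses the power in the clock circuit instead of wasting it in every clock cycle.
The resonant circuit itself acts as a clock generation source, so it does not need the large number of clock buffers that a conventional clock circuit uses. The resonant circuit needs its energy exchange to be excited at start-up, and excited again when losses in the resonant circuit slow the energy exchange. The power needed for this excitation is nevertheless far smaller than the drive power of the clock buffers in existing clock networks.
The problem with this solution is that the inductance in the clock circuit is fixed, so the oscillation frequency cannot change. It cannot meet wide-frequency clock requirements and is not suitable for processors that demand clock-frequency changes, for example in overclocking (turbo) mode or fast frequency-reduction mode. In addition, a conventional inductor design alone is not enough to reduce power consumption further.
To improve on this, the embodiments of this application propose a clock circuit provided with sub-node circuits that implement the resonance function, the decoupling function, and the clock control function at the same time. The sub-node circuits can also generate oscillation frequencies at different frequency points, so the clock circuit is suitable for processors that demand clock-frequency changes, and its performance is improved.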
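Figure 3 is a schematic diagram of a clock circuit 300 according to an embodiment of this application. As shown in Figure 3, the clock circuit 300 includes a PLL 310, a clock tree 320, and a clock grid 330. A plurality of sub-node circuits 400 are disposed in the clock grid 330. In the embodiments of this application, the sub-node circuit 400 may be referred to as a gating & resonant & decoupling (GRD) circuit.
The PLL 310 is configured to output a first clock signal clk_1 to the clock tree 320.
The clock tree 320 is connected to the PLL 310 and is configured to receive the first clock signal clk_1 and output a second clock signal clk_2. The clock tree 320 includes multiple stages of clock buffers 321, each stage including one or more clock buffers 321. Generally, the higher the stage, the more clock buffers 321 it contains. The clock buffer 321 connected to the PLL 310 may be called the main clock buffer 322. A clock buffer 321 can be understood as a node in the clock tree 320 that acts like a clock repeater and is used to drive a larger load.
The clock grid 330 includes a plurality of nodes, some of which are each provided with a sub-node circuit 400. The sub-node circuits 400 are connected to the clock tree 320 and are further connected to a plurality of registers 350 through the clock grid 330.
In an actual physical implementation, the clock grid 330 usually uses metal wires on different layers connected by vias between the upper and lower layers.
The sub-node circuits 400 are configured to receive the second clock signal clk_2 and output a third clock signal clk_3 to the clock grid 330, thereby implementing the gating function for the clock circuit. That the sub-node circuits 400 output the third clock signal clk_3 to the clock grid 330 can be understood as the sub-node circuits 400 driving and controlling the registers 350 in the clock grid 330.
The sub-node circuit 400 can also support a resonant circuit that generates an oscillation frequency, so as to absorb peak current in the clock circuit and reduce power consumption. Because the sub-node circuit 400 can support oscillation frequencies at a plurality of different frequency points, it can meet a chip's more demanding requirements for clock-frequency changes. The sub-node circuit 400 further includes a decoupling capacitor Cd to suppress power-supply noise. The sub-node circuit 400 is therefore an integrated design that implements gating, resonance, and noise reduction, and placing sub-node circuits 400 in the clock grid 330 can improve the performance of the clock circuit. The circuit structure and working principle of the sub-node circuit 400 are described next with reference to Figure 4.
It should be noted that, in the embodiments of this application, the gating function refers to using a gating signal to turn a local part of the clock circuit off or on. The local part may be the portion of the circuit that corresponds to a sub-node circuit 400, and it may be partitioned and laid out according to practice. The noise reduction function, also called the noise suppression function, refers to reducing power-supply noise in the clock circuit. The resonance function refers to using a resonant circuit to absorb peak current in the clock circuit so as to reduce power consumption.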
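Figure 4 is a schematic diagram of the principle of a sub-node circuit 400 according to an embodiment of this application. As shown in Figure 4, the sub-node circuit 400 includes a clock gating unit 410 and a resonance and decoupling unit 420.
The clock gating unit 410 receives the second clock signal clk_2 output by the clock tree 320 and a gating signal G, and outputs a third clock signal clk_3 based on them, thereby implementing the gating function. Optionally, the gating signal G may be configured by the system.
That the gating signal G is configured by the system means that G may come from a control logic circuit inside the chip, or may be generated by upper-layer software through the control logic circuit. It should be understood that other system-configured signals in the embodiments of this application can be interpreted in the same way.
The clock gating unit 410 is further configured to output the third clock signal clk_3 to the resonance and decoupling unit 420, so that the resonance and decoupling unit 420 can apply the resonance function and the noise suppression function to the third clock signal clk_3.
Still referring to Figure 4, as a specific example, the clock gating unit 410 includes a latch 411 and an AND gate 412. The latch 411 is configured to receive the gating signal G and the second clock signal clk_2 from the clock tree, and to output a control signal G_L. The AND gate 412 is configured to receive the second clock signal clk_2 and the control signal G_L, and to output the third clock signal clk_3. The third clock signal clk_3 is used to drive the clock grid 330.
It should be understood that the specific structure of the clock gating unit 410 in Figure 4 is only an example. The clock gating unit 410 may also be implemented with other types of gating devices and connections, provided that it implements the gating function for the clock circuit; this is not limited in the embodiments of this application.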
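The resonance and decoupling unit 420 supports generating oscillation frequencies at a plurality of different frequency points, to implement the resonance function. The resonance and decoupling unit 420 further includes a decoupling capacitor Cd to suppress power-supply noise.
Still referring to Figure 4, the resonance and decoupling unit 420 includes a plurality of resonant inductor circuits 421 connected in parallel, a plurality of switch circuits 422, and one or more decoupling capacitors Cd. The resonant inductor circuits 421 correspond one-to-one to the switch circuits 422, and each resonant inductor circuit 421 is provided with a resonant inductor Lr. In some examples, any two of the resonant inductors have different inductance values.
In some examples, a buffer 430 may be placed between the clock gating unit 410 and the resonance and decoupling unit 420 to increase the drive strength of the signal.
Optionally, the resonance and decoupling unit 420 is further configured to receive a plurality of switch control signals S, which correspond one-to-one to the switch circuits 422. Each switch control signal S controls whether the corresponding switch circuit 422 is on or off. The switch control signals S may be configured by the system.
For example, a first switch circuit 422 among the switch circuits 422 is configured to receive a switch control signal S, which controls whether the first switch circuit 422 is on or off. The first switch circuit 422 is any one of the switch circuits 422.
When one of the switch circuits 422 is on, its corresponding resonant inductor circuit 421 is connected into the resonant circuit and produces the corresponding oscillation frequency. Here the resonant circuit is the circuit that generates an oscillation frequency based on a resonant inductor Lr and the decoupling capacitor Cd. Different combinations of resonant inductor Lr and decoupling capacitor Cd produce different oscillation frequencies. In other words, different resonant inductor circuits 421 correspond to different oscillation frequencies.
In some examples, the switch control signal S can turn on the first switch circuit 422 so that the resonant inductor circuit 421 corresponding to the first switch circuit 422 is connected into the resonant circuit to generate an oscillation frequency.
In some examples, only one of the switch circuits 422 is on within a given period, that is, only one of the resonant inductor circuits 421 is allowed to take part in the resonant circuit at a time. In other examples, one or more of the switch circuits 422 may be on within a given period, that is, one or more of the resonant inductor circuits 421 are allowed to take part in the resonant circuit at a time.
Optionally, one end of each of the one or more decoupling capacitors Cd is configured to receive the third clock signal clk_3, and the other end is configured to be connected to power or ground.
It should be noted that, in the embodiments of this application, the resonant inductor Lr may also be called the inductor Lr, and the decoupling capacitor Cd may also be called the capacitor Cd.
Optionally, as chips grow larger, the chip interior can be logically divided into a plurality of clock domains. Each clock domain has a different clock frequency and includes an independent clock grid. In other words, each clock domain corresponds to one synchronization time to which all devices in the domain are synchronized, and the synchronization times of different clock domains are independent of one another.
In some examples, the resonant inductor circuits correspond one-to-one to the clock domains. For example, the resonant inductors Lr may be connected in parallel with independent switches, and the number of parallel inductors may depend on the number of clock domains. Different combinations of resonant inductor Lr and decoupling capacitor Cd produce different oscillation frequencies.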
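Optionally, the oscillation frequency of the resonant circuit can be calculated according to formula (2):
$$f = \frac{1}{2\pi\sqrt{LC}} \tag{2}$$
where $f$ denotes the oscillation frequency, $L$ denotes the inductance, and $C$ denotes the capacitance.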
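The switch-selected frequency points can be illustrated with a short calculation. The sketch below assumes the standard LC resonance relation for formula (2) and treats switched-on inductor branches as ideal parallel inductors; the component values (`LR_VALUES`, `CD`) and function names are hypothetical and are not taken from the application.

```python
import math


def resonant_frequency(l_henry: float, c_farad: float) -> float:
    """Formula (2): f = 1 / (2 * pi * sqrt(L * C))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_henry * c_farad))


def parallel_inductance(inductors):
    """Equivalent inductance of ideal inductors connected in parallel."""
    return 1.0 / sum(1.0 / l for l in inductors)


def tank_frequency(inductors, switches, c_farad):
    """Oscillation frequency for a given switch setting, or None if all are off."""
    enabled = [l for l, on in zip(inductors, switches) if on]
    if not enabled:
        return None  # no inductor branch connected, so no resonance
    return resonant_frequency(parallel_inductance(enabled), c_farad)


# Hypothetical component values: three resonant inductors Lr and one capacitor.
LR_VALUES = [1e-9, 2e-9, 4e-9]  # henries
CD = 10e-12                     # farads

f_first = tank_frequency(LR_VALUES, [True, False, False], CD)
f_third = tank_frequency(LR_VALUES, [False, False, True], CD)
f_two = tank_frequency(LR_VALUES, [True, True, False], CD)

print(f"{f_first / 1e9:.3f} GHz, {f_third / 1e9:.3f} GHz, {f_two / 1e9:.3f} GHz")
```

With one switch on at a time, each inductor gives its own frequency point; quadrupling L halves f. Turning on two branches lowers the equivalent inductance and raises the frequency, which corresponds to the examples where more than one switch circuit may be on at once.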
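As shown in Figure 4, one end of the resonant inductor Lr is configured to receive the third clock signal clk_3. The other end of the resonant inductor Lr may be connected to power or ground through the decoupling capacitor Cd.
It should be noted that the structure of the sub-node circuit 400 in Figure 4 is only an example. In practice, the sub-node circuit 400 may include more devices or use other connections, or it may include fewer modules and devices than shown in Figure 4. The embodiments of this application do not limit the specific structure of the sub-node circuit 400, provided that it implements the functions described above.
In the embodiments of this application, the sub-node circuit 400 can use the resonant inductor circuits 421 to absorb peak current and reduce power consumption. The sub-node circuit 400 can also combine the resonant inductor circuits 421 with the decoupling capacitor Cd, that is, resonance of multiple inductor-capacitor combinations, to generate oscillation frequencies at different frequency points and so meet the processor's need for clock-frequency changes.
In the embodiments of this application, when the resonant inductor circuits 421 are disconnected, the decoupling capacitor Cd can serve as a general-purpose chip power-supply decoupling capacitor for suppressing transient noise (also called simultaneous switching noise, SSN).
Optionally, in a specific example, a plurality of decoupling capacitors Cd may be connected in parallel to increase the capacitance, according to the transient voltage requirements of power and ground, for a better effect.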
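In the embodiments of this application, when the clock circuit uses a hybrid design of clock tree plus clock grid, the parasitic parameters and load of the clock grid are large, so power consumption is higher. Without the clock gating unit 410, the clock circuit toggles in every clock cycle. With the clock gating unit 410, the clock circuit is controlled by the gating signal G, which reduces clock toggling, that is, reduces clock switching activity, and therefore reduces power consumption. Adding the clock gating unit 410 to the sub-node circuit 400 thus reduces clock switching activity and saves switching power. Because switching activity at the clock pins is reduced, the internal power consumption of the registers is reduced as well.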
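As an example, the clock gating unit 410 can typically save 20% to 50% of power consumption. It reduces power by controlling the clock toggle rate (denoted a), which lowers the dynamic power of the clock circuit. Formula (3) shows how the power consumption of the clock circuit is calculated:
$$P = a \times f \times C \times V^{2} \tag{3}$$
where $P$ denotes the power consumption of the clock circuit, $a$ denotes the clock toggle rate, $f$ denotes the clock frequency, $C$ denotes the equivalent load capacitance, and $V$ denotes the operating voltage of the clock circuit.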
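To make formula (3) concrete, the sketch below compares an ungated clock (toggle rate 1.0) with a gated clock that is active 60% of the time. All numeric values are hypothetical, chosen only so that the resulting saving falls inside the 20% to 50% range mentioned above.

```python
def dynamic_power(a: float, f_hz: float, c_farad: float, v_volt: float) -> float:
    """Formula (3): P = a * f * C * V^2, in watts."""
    return a * f_hz * c_farad * v_volt ** 2


# Hypothetical operating point: 2 GHz clock, 1 nF equivalent load, 0.8 V supply.
F_HZ, C_FARAD, V_VOLT = 2e9, 1e-9, 0.8

p_ungated = dynamic_power(1.0, F_HZ, C_FARAD, V_VOLT)  # clock toggles every cycle
p_gated = dynamic_power(0.6, F_HZ, C_FARAD, V_VOLT)    # gated off 40% of the time

# Relative saving depends only on the toggle rate, since f, C and V are unchanged.
saving = 1.0 - p_gated / p_ungated

print(f"ungated {p_ungated:.3f} W, gated {p_gated:.3f} W, saving {saving:.0%}")
```

Because P is linear in a, the fractional saving equals the fraction of cycles in which the clock is gated off.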
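Figure 5 is a timing diagram of the clock gating control signals of the sub-node circuit 400 according to an embodiment of this application. As shown in Figure 5, the second clock signal clk_2 is the clock signal output by the clock tree 320. The gating signal G is a signal on an open interface, meaning that it can be controlled by upper-layer software. The control signal G_L is the signal obtained after the gating signal G is synchronized with the second clock signal clk_2 by the latch 411. The third clock signal clk_3 is the result of ANDing the control signal G_L with the second clock signal clk_2, and it is the output signal of the clock gating unit 410.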
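The relationship among clk_2, G, G_L, and clk_3 can be sketched as a small cycle-by-cycle simulation. The sketch assumes the latch is transparent while clk_2 is low and holds its value while clk_2 is high, which is the conventional latch-based clock gate; the application only states that the latch synchronizes G with clk_2, so this polarity is an assumption, and the sample waveforms are made up.

```python
def gated_clock(clk_2, gate):
    """Simulate a latch plus AND-gate clock gate over sampled waveforms.

    Assumption: the latch passes G while clk_2 is 0 and holds while clk_2 is 1.
    Returns the traces of G_L (latch output) and clk_3 (gated clock).
    """
    g_l = 0
    g_l_trace, clk_3 = [], []
    for clk, g in zip(clk_2, gate):
        if clk == 0:
            g_l = g  # latch is transparent during the low phase
        g_l_trace.append(g_l)
        clk_3.append(clk & g_l)  # AND gate: clk_3 = clk_2 AND G_L
    return g_l_trace, clk_3


# One sample per half cycle. G rises and later falls while clk_2 is high.
clk_2 = [0, 1, 0, 1, 0, 1, 0, 1]
gate = [0, 1, 1, 0, 0, 0, 0, 0]

g_l_trace, clk_3 = gated_clock(clk_2, gate)
print("G_L  :", g_l_trace)
print("clk_3:", clk_3)
```

In this trace, G changes only during high phases of clk_2, yet clk_3 contains exactly one full pulse and no partial ones: the latch defers each change of G to the next low phase, which is why the gate is built from a latch and an AND gate rather than an AND gate alone.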
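The clock grid layout algorithm for the sub-node circuits 400 in the embodiments of this application is described next. The clock grid is driven and controlled from the stage above it by the sub-node circuits 400. To save area and improve drive efficiency, a sub-node circuit 400 is usually not placed at every node, so a suitable algorithm is needed to optimize where sub-node circuits 400 are placed in the clock grid.
In a possible implementation, when the register circuits in the clock grid are load-balanced (uniform), a clustering algorithm or a linear regression algorithm can be used to lay out the sub-node circuits 400 in the clock grid.
For example, Figure 6 is a schematic layout of sub-node circuits 400 in the load-balanced case according to an embodiment of this application. Load balance usually means registers of the same type within one chip, or in other words that the registers in the clock grid have the same or similar gate loading.
As shown in Figure 6, the K-means clustering algorithm (K-means) or a linear regression algorithm can be used so that, for a fixed number of registers in the clock grid, the fewest sub-node circuits 400 drive those registers and the total distance from the sub-node circuits 400 to the registers is shortest. The distance may be a distance in a plane or a three-dimensional distance.
K-means is an iterative clustering analysis algorithm. For a given sample set, K initial centroids are specified as the clusters, and the iteration repeats until the algorithm converges.
Figure 6 uses K-means as the clustering algorithm by way of example. It should be understood that other clustering algorithms may also be used to lay out the sub-node circuits 400 in the load-balanced case.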
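The K-means placement described above can be sketched in a few lines of pure Python. This is a minimal illustration under stated assumptions, not the layout tool the application has in mind: register coordinates are hypothetical, K is fixed by the caller, initial centroids are picked at evenly spaced indices so the run is deterministic, and the search for the smallest workable K is not shown.

```python
import math


def kmeans(points, k, max_iter=100):
    """Cluster 2D points; return (centroids, assignment) after convergence."""
    step = len(points) // k
    centroids = [points[i * step] for i in range(k)]  # deterministic initial centroids
    assignment = []
    for _ in range(max_iter):
        # Assign each register to its nearest centroid (candidate sub-node position).
        new_assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no register changed cluster
        assignment = new_assignment
        # Move each centroid to the mean position of the registers assigned to it.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, assignment


# Hypothetical positions of eight identical registers on the clock grid.
registers = [(1, 1), (1, 2), (2, 1), (2, 2), (8, 8), (8, 9), (9, 8), (9, 9)]

centroids, assignment = kmeans(registers, k=2)
print(centroids, assignment)
```

Each resulting centroid is a candidate position for one sub-node circuit 400, and the assignment lists which registers it would drive. In a real flow the centroid would still need to be snapped to an actual grid node.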
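In a possible implementation, when the register circuits in the clock grid are not load-balanced (non-uniform), a grid-averaging method can first be used to decompose the register circuits, after which a clustering algorithm or a linear regression algorithm is applied.
For example, Figure 7 is a schematic layout of sub-node circuits 400 in the non-load-balanced case according to an embodiment of this application. As shown in Figure 7, when the register circuits in the clock grid are not load-balanced, there are usually customized registers with more functions, or register files. In this case, the clock grid can be divided into a plurality of sub-grids. Within each sub-grid, an even-division method is first used to decompose large registers into registers of the smallest granularity, which normalizes the non-load-balanced circuit into a load-balanced one. A clustering algorithm (for example, K-means) or a linear regression algorithm is then used to solve for the layout of the sub-node circuits 400.
A register of the smallest granularity is a register of the smallest size, where the smallest size is the smallest register unit that the semiconductor fab can manufacture.
Optionally, the size of a register refers to its physical dimensions: the larger the register, the larger its load.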
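The normalization step can be illustrated as follows. The sketch represents each register as a hypothetical `(x, y, load)` tuple, with load expressed as a multiple of the smallest register unit, and replaces it with that many unit registers at the same position. Keeping the unit registers at the same position is a simplification made here; the application does not specify how the decomposed units are distributed inside a sub-grid.

```python
def decompose(registers, unit_load=1):
    """Split each (x, y, load) register into unit registers of equal load."""
    units = []
    for x, y, load in registers:
        count = max(1, round(load / unit_load))  # at least one unit per register
        units.extend([(x, y)] * count)
    return units


def centroid(points):
    """Mean position of a list of 2D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


# Hypothetical sub-grid: one minimum-size register and one register file
# whose load is three times the minimum unit.
sub_grid = [(0.0, 0.0, 1), (4.0, 0.0, 3)]

units = decompose(sub_grid)
print(len(units), centroid(units))
```

After decomposition, the centroid is pulled toward the heavier register, so a clustering pass over the unit registers behaves like a load-weighted placement even though the clustering algorithm itself treats every point equally.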
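Figure 8 is a schematic diagram of the layout of sub-node circuits 400 according to an embodiment of this application. As shown in Figure 8, after the sub-node circuits 400 are laid out, the relationship between a sub-node circuit 400 and the registers is as follows: from a given vertex in the figure (a sub-node circuit 400), the distances to the centers of the four surrounding sub-grids (which contain registers) are equal.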
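Figure 9 is a schematic flowchart of a method for designing and simulating the layout of sub-node circuits 400 according to an embodiment of this application. The method of Figure 9 can be implemented with circuit simulation software. As shown in Figure 9, the method includes the following steps.
S901. Input the processor design netlist.
S902. Perform the clock layout design and place the PLL module.
S903. Place the logic cells.
S904. Define the clock grid region.
S905. Generate the clock tree according to the position of the clock grid.
S906. Insert the gating circuits.
S907. Insert the resonant inductors Lr.
S908. Insert the decoupling capacitors Cd.
S909. Extract the clock circuit model and run the simulation.
S910. Determine whether the simulation result meets the clock requirements.
If not, return to step S904. If so, proceed to step S911.
S911. Perform routing.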
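Figure 10 is a schematic diagram of a method for extracting a clock circuit model and simulating it according to an embodiment of this application. The method of Figure 10 can be implemented with circuit simulation software. Figure 10 details the model extraction, simulation, and check performed in steps S909 and S910 of Figure 9. As shown in Figure 10, the method includes the following steps.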
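S101. Input the clock design schematic.
The clock design schematic may be, for example, register transfer level (RTL) netlist code. RTL is an abstraction level of logic design that uses a hardware description language to describe the intended functionality.
S102. Define the clock source.
S103. Capture the output pin of the clock source.
S104. Capture the clock output connections and clock buffers stage by stage.
S105. Determine whether the captured output pin is the clock pin of a register.
If yes, continue to S106. If no, return to S103 and repeat.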
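Steps S102 to S105 amount to walking the netlist from the clock source, stage by stage, until every branch ends at a register clock pin. The sketch below shows one way to do this with a breadth-first traversal. The netlist representation (a fan-out dictionary) and all pin and buffer names are hypothetical; a real flow would read them from the design database of an EDA tool.

```python
from collections import deque


def trace_clock_paths(fanout, source, register_clock_pins):
    """Return {register clock pin: path from the clock source} by stage-wise search."""
    paths = {}
    queue = deque([[source]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node in register_clock_pins:
            paths[node] = path  # reached a register clock pin: this branch is done
            continue
        for nxt in fanout.get(node, []):
            if nxt not in path:  # guard against accidental loops in the netlist
                queue.append(path + [nxt])
    return paths


# Hypothetical netlist: the PLL output drives a two-level buffer tree.
fanout = {
    "pll.out": ["buf0"],
    "buf0": ["buf1", "buf2"],
    "buf1": ["reg_a.clk", "reg_b.clk"],
    "buf2": ["reg_c.clk"],
}
clock_pins = {"reg_a.clk", "reg_b.clk", "reg_c.clk"}

paths = trace_clock_paths(fanout, "pll.out", clock_pins)
for pin, path in sorted(paths.items()):
    print(pin, "<-", " -> ".join(path))
```

The collected paths are the end-to-end clock links from which a link netlist can later be built for simulation.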
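S106. Define the PVT corners and other simulation conditions.
A PVT corner is a process-voltage-temperature corner. It represents the process, voltage, and temperature conditions that the design must satisfy in simulation.
S107. Extract the differential-line and buffer circuit parameters from the physical parameter library.
For example, extracting the differential-line and buffer circuit parameters includes extracting them through an RLC SPICE (simulation program with integrated circuit emphasis) model. SPICE denotes a circuit-level simulation program.
S108. Build the end-to-end clock link netlist.
For example, building the end-to-end clock link netlist includes building it through an RTL SPICE model.
S109. Input the clock source stimulus and run the simulation.
For example, inputting the clock source stimulus may include setting the frequency, clock jitter, and slew rate of the clock source signal. Clock jitter refers to the deviation between a timing event of the signal and its ideal position. The slew rate is the amount by which the voltage rises per unit of time.
S110. Determine whether the simulation result meets the requirements. If so, the simulation ends. If not, modify the clock design and return to S101 to simulate the clock design again.
Optionally, the clock tree may use a differential circuit design. A differential clock tree has strong noise immunity, low power consumption, and a long drive distance to the load.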
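In addition, the embodiments of this application propose an electronic device provided with a three-dimensional clock circuit architecture, which can improve clock performance. The three-dimensional clock architecture may include the clock circuit 300 described above, or it may include other types of clock circuits. The electronic device may be a multi-chip stacked package, which allows chips to be stacked in multiple layers and uses through-silicon vias (TSVs) to provide vertical signal connections between the chips. In some examples, the electronic device may be a processor.
Optionally, the three-dimensional clock architecture can be combined with many-core systems on chip (SoCs) such as a two-dimensional mesh (2D mesh) or a three-dimensional integrated circuit (3DIC), together with new methods and new processes, to obtain the best benefit, for example smaller clock jitter and clock skew. The new processes may include chip-on-wafer-on-substrate (CoWoS) packaging, fan-out packaging (FOP), and 3DIC through-silicon via (3DIC TSV, 3D TSV) packaging.
For example, Figure 11 is a topology design diagram of an electronic device 1100 according to an embodiment of this application. The electronic device 1100 uses a 3DIC TSV three-dimensional topology. 3DIC TSV means that a plurality of chips are stacked and packaged together, and through-silicon vias (TSVs) provide high-speed, efficient data communication between the chips.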
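As shown in Figure 11, the electronic device includes a first chip 1101, a second chip 1102, and a substrate 1104. The second chip 1102 is disposed below the first chip 1101, and the substrate 1104 is disposed below the second chip 1102.
A plurality of micro-bumps 1105 are disposed on the lower surface of the first chip 1101, and the first chip 1101 is electrically connected to the second chip 1102 through the micro-bumps 1105.
A plurality of through-silicon vias 1107 penetrating the second chip 1102 are disposed in the second chip 1102, and a plurality of solder balls 1108 are disposed on the lower surface of the second chip 1102. The through-silicon vias 1107 are connected to the solder balls 1108, and the second chip 1102 is electrically connected to the substrate 1104 through the through-silicon vias 1107 and the solder balls 1108.
A ball grid array 1109 is disposed on the lower surface of the substrate 1104 and is used to connect the substrate 1104 to a printed circuit board (PCB) (not shown in Figure 11).
The substrate 1104, also called a package carrier or package substrate, provides mechanical support, protection, heat dissipation, and electrical connection paths for the chips. Wiring is provided on the substrate 1104 to carry signals between the first chip 1101, the second chip 1102, and the PCB. The substrate 1104 may be connected to the external pins of the processor through bonding wires, and the ball grid array 1109 of the substrate 1104 may be used to connect the packaged processor to the PCB main board.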
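Optionally, the electronic device 1100 may be provided with one of the clock circuits (100, 200, 300) described above, or with another type of clock circuit. The first chip 1101 may contain the PLL (110, 210, 310) and the clock tree (120, 220, 320) of the clock circuit. The second chip 1102 may contain the clock grid (230, 330) and the registers (150, 250, 350) of the clock circuit.
As shown in Figure 11, as an example, the first chip 1101 may contain the PLL 310 and the clock tree 320 of the clock circuit 300, and the second chip 1102 may contain the clock grid 330 of the clock circuit 300. The clock grid 330 includes a plurality of registers 350. The clock grid 330 includes a plurality of nodes, and each of some of the nodes may or may not be provided with a sub-node circuit 400.
As an example, the clock tree 320 includes multiple stages of clock buffers 321, and the last stage of the clock buffers 321 is connected to a plurality of sub-node circuits 400 in the clock grid 330 through the micro-bumps 1105.
Optionally, the first chip 1101 may include a logic die or a memory die (logic & memory die). The second chip 1102 may include a logic die.
Optionally, the electronic device in Figure 11 may be a processor.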
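As an example, the chip architecture design flow for the electronic device in Figure 11 can be described as follows. The flow can be implemented with electronic design automation (EDA) software.
I). Clock design of the second chip
S501. Generate the clock grid according to the placement density and positions of the registers.
S502. According to the positions of the registers in the clock grid, place the sub-node circuits 400 and connect them to clock grid nodes.
S503. Insert through-silicon vias at the positions of the sub-node circuits 400. The through-silicon vias are connected to the sub-node circuits 400 and are used for connection to the circuits of the first chip.
II). Clock design of the first chip
S601. Place sink clock buffers at the positions of the through-silicon vias. A sink clock buffer is the endpoint of the clock circuit of the first chip.
S602. Place the PLL according to the layout requirements of the first chip and the positions of the sink clock buffers.
S603. Generate the clock tree according to the positions of the PLL and the sink clock buffers.
S604. Connect the buffers at each stage of the clock tree.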
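Figure 12 is a topology design diagram of an electronic device 1200 according to another embodiment of this application. Compared with Figure 11, the electronic device 1200 may include more chips. As shown in Figure 12, the electronic device 1200 includes, from top to bottom, a first chip 1101, a second chip 1102, a third chip 1103, and a substrate 1104. For brevity, content in Figure 12 that is the same as or similar to Figure 11 is not described again.
As shown in Figure 12, the second chip 1102 is disposed below the first chip 1101 and is electrically connected to it. As an example, the first chip 1101 and the second chip 1102 may be bonded together physically or chemically. A plurality of first through-silicon vias 1106 are disposed in the second chip 1102, and a plurality of micro-bumps 1105 are disposed on its lower surface. The second chip 1102 is electrically connected to the third chip 1103 through the first through-silicon vias 1106 and the micro-bumps 1105.
A plurality of second through-silicon vias (TSVs) 1107 penetrating the third chip 1103 are disposed in the third chip 1103, and a plurality of solder balls 1108 are disposed on its lower surface. The second through-silicon vias 1107 are connected to the solder balls 1108, and the third chip 1103 is electrically connected to the substrate 1104 through the second through-silicon vias 1107 and the solder balls 1108.
A ball grid array 1109 is disposed on the lower surface of the substrate 1104 and is used to connect the substrate 1104 to a PCB (not shown in Figure 12).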
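Optionally, the electronic device 1200 may be provided with one of the clock circuits (100, 200, 300) described above, or with another type of clock circuit. The first chip 1101 may contain the PLL (110, 210, 310) and one part of the clock tree (120, 220, 320). The second chip 1102 may contain another part of the clock tree (120, 220, 320). The third chip 1103 may contain the clock grid (230, 330) and the registers (150, 250, 350) of the clock circuit.
As shown in Figure 12, as an example, the first chip 1101 may contain the PLL 310 and one part of the clock tree 320 of the clock circuit 300, and the second chip 1102 contains another part of the clock tree 320. The third chip 1103 may contain the clock grid 330 of the clock circuit 300. The clock grid 330 includes a plurality of registers 350. The clock grid 330 includes a plurality of nodes, and each of some of the nodes may or may not be provided with a sub-node circuit 400.
As an example, the clock tree 320 includes multiple stages of clock buffers 321, and the last stage of the clock buffers 321 is connected to a plurality of sub-node circuits 400 in the clock grid 330 through the first through-silicon vias 1106 in the second chip 1102 and the micro-bumps 1105.
Optionally, the first chip 1101 and the second chip 1102 may include a logic die or a memory die (logic & memory die). The third chip 1103 may include a logic die.
Optionally, the electronic device 1200 in Figure 12 may be a processor.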
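Figure 13 is a topology design diagram of an electronic device 1300 according to another embodiment of this application. The topology is based on a chiplet architecture. The chiplet structure, also called a lego architecture, may use chip-on-wafer-on-substrate (CoWoS) packaging or fan-out packaging (FOP). Optionally, the electronic device 1300 may be a processor.
CoWoS is a 2.5-dimensional integrated packaging technology. Semiconductor chips are first attached to a wafer through a chip-on-wafer (CoW) packaging process, and the CoW chip is then joined to the substrate to form CoWoS. This packaging approach can package a plurality of chips together, achieving a small package volume, low power consumption, and fewer pins.
FOP connects chips with different functions and passive components through a redistribution layer, which reduces the package volume. It balances cost and performance while integrating heterogeneous chips in a single package.
FOP mainly includes two types: fan-out wafer-level packaging (FOWLP) and fan-out panel-level packaging (FOPLP).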
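As shown in Figure 13, the electronic device 1300 includes a first chip layer 1301, a second chip layer 1302, and a substrate 1304. The first chip layer 1301 may include a plurality of chips that are interconnected at high speed through an interposer. Figure 13 uses an example in which the first chip layer 1301 includes two central processing unit (CPU) chips (13-1, 13-2) and two IO chips (13-3, 13-4).
Each of the two CPU chips (13-1, 13-2) has its own independent clock network. The clock network inside each CPU chip (13-1, 13-2) may be one of the clock circuits (100, 200, 300) described above or another type of clock circuit. As an example, the CPU chips (13-1, 13-2) may contain the PLL (110, 210, 310) and the clock tree (120, 220, 320) of the clock circuit. The second chip layer 1302 may contain the clock grid (230, 330) and the registers (150, 250, 350) of the clock circuit.
The clock signals of the CPU chips (13-1, 13-2) are connected through a high-density, high-speed interconnect. For example, Figure 14 is a schematic diagram of the interconnection between the CPU chips in Figure 13. As shown in Figure 14, the high-density, high-speed interconnect may use, for example, small input output (small IO) technology.
It should be noted that the first chip, second chip, and third chip in the foregoing figures and embodiments are bare dies, and may also be referred to as the first die, second die, and third die. In a possible implementation, each of the first die, second die, and third die includes one or more layers of die.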
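The clock design scheme for the electronic device 1300 in Figure 13 includes the following steps.
S801. Generate the clock grid according to the placement density and positions of the registers. There may be one or more clock grids.
S802. According to the positions of the registers in the clock grid, place the sub-node circuits 400 and connect them to clock grid nodes.
S803. Place the PLL according to the chip layout requirements and the positions of the sub-node circuits 400.
S804. Generate the clock tree according to the positions of the PLL and the sub-node circuits 400, and determine the positions of the clock buffers at each stage.
S805. Connect the buffers at each stage of the clock tree.
S806. If a plurality of chips need to share a clock source, connect the cross-chip clock signals through the high-speed chip-to-chip interconnect.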
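A person of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the particular application and design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of this application.
A person skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above may refer to the corresponding processes in the foregoing method embodiments, and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is only a division by logical function, and other divisions are possible in an actual implementation; for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate parts may or may not be physically separate, and the parts shown as units may or may not be physical units; that is, they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of an embodiment.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of this application. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The foregoing is only specific implementations of this application, but the protection scope of this application is not limited thereto. Any variation or replacement that a person skilled in the art can readily conceive within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.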

Claims (19)

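  1. A processor clock system, comprising a phase-locked loop (PLL), a clock tree, and a clock grid, wherein
    the PLL is configured to output a first clock signal clk_1;
    the clock tree is configured to receive the first clock signal clk_1 output by the PLL and to output a second clock signal clk_2; and
    the clock grid comprises a plurality of nodes, some of the plurality of nodes are provided with sub-node circuits, and the sub-node circuits are connected to the clock tree and are configured to generate a third clock signal clk_3 from the second clock signal clk_2, wherein the clock grid and the clock tree form a three-dimensional structure.
  2. The system according to claim 1, wherein the sub-node circuit comprises a resonance and decoupling unit and a clock gating unit, the clock gating unit is configured to receive a gating signal G and the second clock signal clk_2 and to output the third clock signal clk_3 to the clock grid and to the resonance and decoupling unit, and the resonance and decoupling unit comprises a decoupling capacitor and a plurality of resonant inductors and supports generating oscillation frequencies at a plurality of different frequency points.
  3. The system according to claim 2, wherein the resonance and decoupling unit comprises a plurality of resonant inductor circuits connected in parallel, a plurality of switch circuits, and one or more decoupling capacitors, wherein the plurality of resonant inductor circuits correspond one-to-one to the plurality of switch circuits, and each of the plurality of resonant inductor circuits is provided with a resonant inductor.
  4. The system according to claim 3, wherein a first switch circuit among the plurality of switch circuits is configured to receive a switch control signal S, the switch control signal S is used to control whether the first switch circuit is on or off, and the first switch circuit is any one of the plurality of switch circuits.
  5. The system according to claim 3 or 4, wherein the plurality of resonant inductor circuits correspond to a plurality of clock domains.
  6. The system according to any one of claims 3 to 5, wherein one end of the decoupling capacitor is configured to receive the third clock signal clk_3, and the other end of the decoupling capacitor is configured to be connected to power or ground.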
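  7. The system according to any one of claims 1 to 6, wherein
    the plurality of sub-node circuits are further connected to a plurality of registers through the clock grid, and the positions of the plurality of sub-node circuits in the clock grid are determined according to the loads and positions of the plurality of registers.
  8. The system according to claim 7, wherein, when the plurality of registers are load-balanced in the clock grid, the positions of the plurality of sub-node circuits in the clock grid are determined by a clustering algorithm or a linear regression algorithm, wherein an input of the clustering algorithm or linear regression algorithm comprises the positions of the plurality of registers in the clock grid, and an output of the clustering algorithm or linear regression algorithm comprises the positions of the plurality of sub-node circuits in the clock grid.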
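  9. A sub-node circuit in a clock system, wherein the sub-node circuit is disposed in a clock grid of the clock system and comprises a clock gating unit and a resonance and decoupling unit:
    the clock gating unit is configured to receive a gating signal G and a second clock signal clk_2 from a clock tree, and to output a third clock signal clk_3 to the clock grid and to the resonance and decoupling unit; and
    the resonance and decoupling unit supports generating oscillation frequencies at a plurality of different frequency points and is configured to receive the third clock signal clk_3.
  10. The sub-node circuit according to claim 9, wherein the resonance and decoupling unit comprises a plurality of resonant inductor circuits connected in parallel, a plurality of switch circuits, and one or more decoupling capacitors, wherein the plurality of resonant inductor circuits correspond one-to-one to the plurality of switch circuits, and each of the plurality of resonant inductor circuits is provided with a resonant inductor.
  11. The sub-node circuit according to claim 10, wherein a first switch circuit among the plurality of switch circuits is further configured to receive a switch control signal S, the switch control signal S is used to control whether the first switch circuit is on or off, and the first switch circuit is any one of the plurality of switch circuits.
  12. The sub-node circuit according to claim 10 or 11, wherein the plurality of resonant inductor circuits correspond to a plurality of clock domains.
  13. The sub-node circuit according to any one of claims 10 to 12, wherein one end of the decoupling capacitor is configured to receive the third clock signal clk_3, and the other end of the decoupling capacitor is configured to be connected to power or ground.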
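  14. An electronic device, comprising:
    a first chip, wherein a plurality of micro-bumps are disposed on a lower surface of the first chip, and the first chip is electrically connected to a second chip through the plurality of micro-bumps;
    the second chip, disposed below the first chip, wherein a plurality of through-silicon vias penetrating the second chip are disposed in the second chip, a plurality of solder balls are disposed on a lower surface of the second chip, and the plurality of through-silicon vias are connected to the plurality of solder balls; and
    a substrate, disposed below the second chip, wherein the second chip is electrically connected to the substrate through the plurality of through-silicon vias and the plurality of solder balls;
    wherein a clock circuit is disposed in the electronic device, and the clock circuit comprises: a phase-locked loop (PLL), configured to output a first clock signal clk_1; a clock tree, configured to receive the first clock signal clk_1 and output a second clock signal clk_2; and a clock grid, connected to a plurality of registers and configured to receive the second clock signal clk_2 and output a third clock signal clk_3 to the plurality of registers; and
    the PLL and the clock tree are disposed in the first chip, and the clock grid is disposed in the second chip.
  15. The electronic device according to claim 14, wherein the clock grid comprises a plurality of nodes, and some of the plurality of nodes are provided with the sub-node circuit according to any one of claims 9 to 13.
  16. The electronic device according to claim 15, wherein the clock tree comprises multiple stages of clock buffers, and the last stage of the multiple stages of clock buffers is connected to a plurality of sub-node circuits in the clock grid through the plurality of micro-bumps.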
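  17. An electronic device, comprising:
    a first chip;
    a second chip, disposed below the first chip and electrically connected to the first chip, wherein a plurality of first through-silicon vias are disposed in the second chip, a plurality of micro-bumps are disposed on a lower surface of the second chip, and the second chip is electrically connected to a third chip through the plurality of first through-silicon vias and the plurality of micro-bumps;
    the third chip, disposed below the second chip, wherein a plurality of second through-silicon vias penetrating the third chip are disposed in the third chip, a plurality of solder balls are disposed on a lower surface of the third chip, and the plurality of second through-silicon vias are connected to the plurality of solder balls; and
    a substrate, disposed below the third chip, wherein the third chip is electrically connected to the substrate through the plurality of second through-silicon vias and the plurality of solder balls;
    wherein a clock circuit is disposed in the electronic device, and the clock circuit comprises: a phase-locked loop (PLL), configured to output a first clock signal clk_1; a clock tree, configured to receive the first clock signal clk_1 and output a second clock signal clk_2; and a clock grid, connected to a plurality of registers and configured to receive the second clock signal clk_2 and output a third clock signal clk_3 to the plurality of registers; and
    the PLL and one part of the clock tree are disposed in the first chip, another part of the clock tree is disposed in the second chip, and the clock grid is disposed in the third chip.
  18. The electronic device according to claim 17, wherein the clock grid comprises a plurality of nodes, and some of the plurality of nodes are provided with the sub-node circuit according to any one of claims 9 to 13.
  19. The electronic device according to claim 18, wherein the clock tree comprises multiple stages of clock buffers, and the last stage of the multiple stages of clock buffers is connected to a plurality of sub-node circuits in the clock grid through the plurality of first through-silicon vias and the plurality of micro-bumps.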
PCT/CN2021/076686 2020-03-23 2021-02-18 处理器时钟系统、时钟系统中的子节点电路及电子器件 WO2021190203A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP21775322.7A EP4095646A4 (en) 2020-03-23 2021-02-18 PROCESSOR CLOCK SYSTEM, CHILD NODE CIRCUIT IN CLOCK SYSTEM, AND ELECTRONIC DEVICE
US17/947,699 US20230013151A1 (en) 2020-03-23 2022-09-19 Clock circuit in a processor integrated circuit

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN202010208897.2 2020-03-23
CN202010208897 2020-03-23
CN202110172075.8 2021-02-08
CN202110172075.8A CN113434007A (zh) 2020-03-23 2021-02-08 处理器时钟系统、时钟系统中的子节点电路及电子器件

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/947,699 Continuation US20230013151A1 (en) 2020-03-23 2022-09-19 Clock circuit in a processor integrated circuit

Publications (1)

Publication Number Publication Date
WO2021190203A1 true WO2021190203A1 (zh) 2021-09-30

Family

ID=77752858

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/076686 WO2021190203A1 (zh) 2020-03-23 2021-02-18 处理器时钟系统、时钟系统中的子节点电路及电子器件

Country Status (4)

Country Link
US (1) US20230013151A1 (zh)
EP (1) EP4095646A4 (zh)
CN (1) CN113434007A (zh)
WO (1) WO2021190203A1 (zh)

Also Published As

Publication number Publication date
CN113434007A (zh) 2021-09-24
EP4095646A4 (en) 2024-01-24
US20230013151A1 (en) 2023-01-19
EP4095646A1 (en) 2022-11-30

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21775322

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2021775322

Country of ref document: EP

Effective date: 20220825

NENP Non-entry into the national phase

Ref country code: DE