WO2013004836A1 - Test access architecture for interposer-based 3d die stacks

Info

Publication number: WO2013004836A1
Application number: PCT/EP2012/063325
Authority: WO
Inventors: Erik Jan Marinissen; Chun-Chuan Chi
Original assignee: Imec
Priority date: 2011-07-06
Filing date: 2012-07-06
Publication date: 2013-01-10

Abstract

A semiconductor interposer (40) for stacking on top thereof at least two die towers each comprising at least one die (Die 1, Die 2, Die 3), and for interconnecting the die towers by means of at least functional wires (w₀₁, w₂₁, w₃₂) in the interposer (40), comprises test circuitry for post-bond testing of the dies (Die 1, Die 2, Die 3) and of electrical interconnections between the die towers and the interposer (40). The test circuitry comprises a primary port (Port 0) to external I/Os, or to a die different from the dies of the die towers, and a plurality of secondary ports (Port 1, Port 2, Port 3) for stacking the at least two die towers onto. There is a data signal path within the interposer (40) between the primary port (Port 0) and at least one of the plurality of secondary ports (Port 1, Port 2, Port 3). At least one functional wire in the interposer (40) is re-used as part of the test circuitry.

Description

TEST ACCESS ARCHITECTURE FOR INTERPOSER-BASED 3D DIE STACKS

Field of the Invention

The present invention relates in general to integrated circuit (IC) design and testing, and in particular to a test architecture for testing interposer-based 3D die stacks and a method therefor.

Background of the invention

Recent advances in semiconductor processing technology enable the manufacturing of integrated circuits with through-substrate vias (TSVs). A TSV is a conducting nail, typically made of conductive material such as copper or tungsten, that provides an electrical connection through the substrate of a thinned-down semiconductor wafer to its back-side. Typical TSVs have a depth of 50 μm, a diameter of 5 μm, and a minimum pitch of 10 μm. TSVs are used to create vertical inter-die connections that provide higher density and performance at lower power dissipation than conventional approaches such as wire-bonds. TSVs are used for three-dimensional integration in so-called 3D Stacked ICs (3D-SICs). These are vertical die stacks that offer a small footprint and form-factor, which is particularly attractive for hand-held and portable applications.

TSVs may also be used, but do not need to be used, in so-called interposer-based 3D die stacks, in which multiple active dies are placed side-by-side on top of and interconnected through an interposer, such as a semiconductor, e.g. silicon, interposer, for example a passive, an active or a so-called active-lite interposer. A passive interposer only has connection features; an active interposer has, on top of the connection features, some functionality; and an active-lite interposer is an active interposer with limited functionality.

A particular characteristic of an interposer is that it interconnects various dies stacked on top of it. A primary port, generally called Port 0, provides access to another device or sub-device. The primary port may provide access to I/O pins, to a die below the interposer, or to another die beside or above the interposer. For the interconnection between the various dies, conductive layers such as metal layers, also called wires, may be provided in the interposer, for forming horizontal connections; as well as metal layer interconnects and/or optionally TSVs, for providing vertical connections. Interposer-based 3D die stacks do not necessarily reduce the footprint, but offer better cooling options for high-performance computing and communication applications.

Figure 1 depicts a typical interposer-based 3D die stack 10, in which multiple active dies 11, in the example illustrated three active dies, are stacked face-down on top of an interposer 12, e.g. an interposer base, connected through fine-pitch micro-bumps 13. In the die stack illustrated, the interposer 12 provides both horizontal interconnects between the various dies 11 through its multiple metal layers, as well as vertical interconnects to external I/Os through its TSVs 14. In a more general realization, TSVs do not necessarily need to be provided. In a typical configuration, the interposer 12 is face-up, while the TSVs 14 connect its back-side to conductive, e.g. copper, pillars 15 between interposer 12 and package substrate 16. However, it is also possible to flip the interposer 12 face-down. In an interposer-based 3D die stack, the stacked active dies themselves do not require TSVs, unless they evolve into 3D towers of die stacks in their own right.

Like all microelectronic products, these interposer-based 3D die stacks also need to be tested for defects incurred during their many, high-precision, and hence defect-prone manufacturing steps, in order to guarantee sufficient outgoing product quality to the customer. These tests should be both effective and cost efficient.

Marinissen et al., "3D DfT Architecture for Pre-Bond and Post-Bond Testing," in Proceedings IEEE International Conference on 3D System Integration (3DIC), November 2010, describes a test access architecture that supports both pre-bond and post-bond testing of 3D-SICs. The architecture consists of die-level wrappers, which have both a single-bit ('serial') test access mechanism (TAM) for test instructions and low-bandwidth test data, as well as a scalable, multi-bit ('parallel') TAM for high-bandwidth test data. During pre-bond testing, the die wrapper provides test access on all primary I/Os of the die, including the ones which are not probed. During post-bond testing, probe access is on the bottom die, while the wrappers of the various dies in the stack cooperate to elevate test instructions and data up to the circuit-under-test and back down. The die wrapper can be based on either IEEE Std 1149.1 or IEEE Std 1500, both with some specific 3D extensions; in either case the external I/Os are at the bottom die, equipped with an IEEE 1149.1 interface to support board-level interconnect testing. In order to incur as few extra test pins as possible, the stack's serial TAM is multiplexed onto the IEEE 1149.1 pins, whereas the parallel TAM is multiplexed onto functional I/Os. This approach requires active DfT circuitry (flip-flops, multiplexers, and logic gates) at each tier, which, in particular in the case of a passive interposer, e.g. a passive semiconductor interposer, is impossible.

Summary of the invention

It is an object of embodiments of the present invention to provide a good post-bond test architecture for ICs stacked on a passive interposer and a good method for post-bond testing of such ICs stacked on a passive interposer.

The above objective is accomplished by a method and device according to embodiments of the present invention.

In a first aspect, the present invention provides a semiconductor interposer for stacking on top thereof at least two die towers each comprising at least one die, and for interconnecting the die towers by means of at least functional wires in the interposer. The functional wires provide die-to-die conductive, e.g. metal, interconnections between the die towers. The die-to-die interconnections between the die towers are also called horizontal interconnections, as they are largely in the plane of the interposer; however, vias between various metal layers could still form a vertical (perpendicular to the plane of the interposer) component therein. The connections between die towers may furthermore include, but do not need to include, interconnections between a primary port of the interposer and any of the secondary ports of the interposer (see below) when these are located at opposite sides of the interposer. Such port-to-port interconnections are also called vertical interconnects, and may be implemented in any suitable way, such as by means of TSVs. Also the vertical interconnections, although being arranged largely perpendicular to the plane of the interposer, may include a horizontal component (in the plane of the interposer), for example in the form of metal layer wires. The semiconductor interposer according to embodiments of the present invention comprises test circuitry for post-bond testing of the dies and of electrical interconnections between the die towers and the interposer. In accordance with the present invention, at least one functional wire in the interposer is re-used as part of the test circuitry.

An interposer according to embodiments of the present invention may comprise a primary port to external I/Os, or to a die different from the dies of the die towers, and a plurality of secondary ports for stacking the at least two die towers onto. There may be a data signal path within the interposer between the primary port and at least one of the plurality of secondary ports and vice versa. The present invention is particularly advantageous for passive interposers, containing only passive and no active elements such as transistors, switches, gates, flip-flops, etc. However, the present invention is in no way limited to passive interposers. Active and active-lite interposers do contain some form of active components. In active-lite interposers, not all regular active elements and gates can be implemented, but only a subset, e.g. diodes. For such "active-lite" interposers, it is still impossible to implement conventional Design-for-Test infrastructure (comprising scan flip-flops etc.) as can be done in truly active dies. Hence certainly for active-lite interposers, but also for active interposers, the present invention is applicable. In an interposer according to embodiments of the present invention, the test circuitry may comprise (a) one or more extra pins on the interposer's primary port, and (b) extra interconnections from those extra pins through the interposer to the secondary ports.

In an interposer according to embodiments of the present invention, the primary and secondary ports may for example be extended with at least a four-pin Test Access Port, comprising Test Data In (TDI), Test Data Out (TDO), Test Clock (TCK), Test Mode Select (TMS) and optionally Test Reset Not (TRSTN).

An interposer according to embodiments of the present invention may comprise, besides the functional wires, at least one extra wire to form a connection between the primary port and at least one of the plurality of secondary ports, or between the plurality of secondary ports. Such an extra wire is an added wire dedicated to making the TAM wider.

In a second aspect, the present invention provides a post-bond test architecture for testing a plurality of active die towers each comprising at least one die, the die towers being stacked on top of a semiconductor, e.g. silicon, interposer and interconnected by means of at least functional wires in the interposer. The interconnection between the die towers may furthermore include, but does not need to include, vertical interconnects such as TSVs. The test architecture comprises a test access mechanism arranged for performing a sequence of tests. The sequence of tests may comprise tests for testing the active die towers, the interposer, and the interconnects between the interposer and the stacked die towers. The test architecture according to embodiments of the present invention contains DfT elements both in the active dies as well as in the interposer. At least one functional wire in the interposer is re-used as part of the test access mechanism. The test architecture according to embodiments of the present invention is ideally suited for post-bond testing, whereby the active die towers are stacked on top of a semiconductor, e.g. silicon, interposer.

In a test architecture according to embodiments of the present invention, the test access mechanism may comprise a primary port on the interposer to external I/O, or to a die different from the dies of the die towers, and a plurality of secondary ports on the interposer for stacking the at least two die towers onto. There may be a data signal path within the interposer between the primary port and at least one of the plurality of secondary ports and/or vice versa. In particular embodiments, the test access mechanism may comprise (a) one or more extra pins on the interposer's primary port, (b) extra interconnections from those extra pins through the interposer to the secondary ports, and (c) boundary-scan-like wrapper cells for controllability and/or observability in the dies of the die towers.

The active dies (or, if applicable, the 3D towers of stacked active dies) stacked on top of the interposer may be equipped with suitable DfT features, such as for example 3D-enhanced die-level test wrappers, which can be based on either IEEE Std 1149.1 or IEEE Std 1500. These die-level wrappers implement both a single-bit ('serial') TAM for test instructions and low-bandwidth test data, as well as a scalable, multi-bit ('parallel') TAM for high-bandwidth test data. Specific 3D features include (1) TestTurns, (2) optional pre-bond probe pads, and (3) TestElevators to transport test data up and down in the stack (only applicable for non-top dies in 3D towers). At the bottom side of each active die (or 3D tower), the I/Os to the interposer port may pass through a regular IEEE 1149.1 wrapper.

The DfT in the interposer is by definition limited to additional interconnects only. All interposer ports may be extended with a four-pin (or optional five-pin) IEEE 1149.1 Test Access Port (TAP), comprising or consisting of Test Data In (TDI), Test Data Out (TDO), Test Clock (TCK), Test Mode Select (TMS) and (optional) Test Reset Not (TRSTN). These signals may be wired up through the interposer in the same style as is common for TAP signals in Printed Circuit Boards: TCK, TMS, and TRSTN are broadcast from the primary port to all secondary ports, while a serial TAM may be formed by daisy-chaining TDI from the primary port, via the various active die towers at the secondary ports, back to TDO at the primary port. The serial TAM is able to perform interconnect testing of micro-bumps and interposer, as well as basic low-bandwidth testing of the active die towers. The wrapper may comprise test interface signals, an instruction register, and a set of data registers. The instruction register is a register accessed by the test interface signals to load test instructions that control the operation of the wrapper, in particular the instructions control the selection of a data register and control the mode of operation of the selected data register. The selected data register may be accessed by the test interface to shift test data in and out of the wrapper. The set of data registers may comprise for example internal scan registers for testing the die circuitry, boundary scan register for controlling and/or observing the inputs and outputs of the die during testing and a bypass for bypassing the wrapper. Any other user-defined data registers may be included in the set of data registers of the wrapper.
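The TAP wiring described above can be illustrated by a minimal Python sketch that lists the point-to-point connections an interposer with k secondary ports would need. The function name `tap_wiring`, the pin labels such as `P0.TDI`, and the chain order 1, 2, ..., k are assumptions made for illustration only; the source does not prescribe a chain order.

```python
def tap_wiring(k, with_trstn=True):
    """List (source, sink) connections for the TAP signals of an interposer.

    Port 0 is the primary port; ports 1..k are the secondary ports.
    Pin labels and the chain order 1..k are illustrative assumptions.
    """
    control = ["TCK", "TMS"] + (["TRSTN"] if with_trstn else [])
    nets = []
    # Control signals are broadcast from the primary port to every secondary port.
    for signal in control:
        for port in range(1, k + 1):
            nets.append((f"P0.{signal}", f"P{port}.{signal}"))
    # The serial TAM daisy-chains TDI -> die tower 1 -> ... -> die tower k -> TDO.
    previous = "P0.TDI"
    for port in range(1, k + 1):
        nets.append((previous, f"P{port}.TDI"))
        previous = f"P{port}.TDO"
    nets.append((previous, "P0.TDO"))
    return nets


nets = tap_wiring(3)
```

For k = 3 with the optional TRSTN pin, this yields nine broadcast connections and four serial-chain connections.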

The semiconductor, e.g. silicon, interposer may be extended with extra wires, besides the functional wires, and optionally with TSVs, to form connections between its k + 1 ports (where k is the number of die towers stacked on top of the semiconductor, e.g. silicon, interposer) that hook up the k die towers, comprising for instance IEEE 1149.1-compliant dies, to the primary port, as is also common on Printed Circuit Boards. This means that the primary port may be extended with a four- or five-pin JTAG interface. The serial TAM between TDI and TDO makes a daisy-chain tour along all k die towers, while the JTAG serial control signals (TCK, TMS, and optionally TRSTN) are broadcast to all die towers. The added extra wires may be dedicated, hence provided on purpose, for making the TAM wider.

The JTAG-based serial (1-bit) TAM through the interposer and its stacked die towers allows testing the interposer and micro-bump interconnects simultaneously, as if they were an extension of the Printed Circuit Board.

The JTAG-based serial (1-bit) TAM through the interposer can also be used to test the active die towers. The active die towers typically are or comprise complex ICs (microprocessors, SOCs, FPGAs, memories) with a large associated test data volume. In an embodiment of the present invention, the active die towers may comprise a test wrapper adapted for allowing transit-only data passage. This adaptation allows multiple visits to the active die towers. One, e.g. the first, of these visits is used to test that die itself; the other, e.g. subsequent, visits are merely transit-only visits which are used to enable connections to other dies. The first visit does not require modification of the DfT hardware in the active die towers or active die, but the re-visits do. In such embodiments, a test architecture may be provided with at least one multiplexer to switch between normal functional mode and transit mode. In normal functional mode, signals from the die itself are selected as outputs; in test mode, signals from TAM wires which allow testing of the die are selected as outputs; and in transit test mode, signals from TAM wires which perform the re-visit are selected.
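The output selection described for the three modes can be modelled behaviourally as follows. This is only a sketch: the names `WrapperMode` and `select_output` are hypothetical, and a real wrapper would implement the selection in hardware per TAM wire.

```python
from enum import Enum


class WrapperMode(Enum):
    FUNCTIONAL = 0  # normal operation: the die drives its own outputs
    TEST = 1        # the die itself is tested through the TAM
    TRANSIT = 2     # re-visit: TAM data only passes through the die


def select_output(mode, die_signal, test_signal, transit_signal):
    """Model one wrapper output multiplexer for a single TAM wire."""
    if mode is WrapperMode.FUNCTIONAL:
        return die_signal
    if mode is WrapperMode.TEST:
        return test_signal
    if mode is WrapperMode.TRANSIT:
        return transit_signal
    raise ValueError(f"unknown mode: {mode!r}")
```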

In a third aspect, the present invention provides a method for optimizing a basic DfT architecture and for reducing the overall test length for active die towers by identifying functional interposer interconnects that can be combined and reused during testing in a plurality of parallel TAM configurations. The method allows reducing the overall test length for active die towers.

In an embodiment, the functional interposer interconnects are combined and reused during test in a parallel TAM using the "Distribution" architecture. In this case, each active die tower has its own private TAM. The various TAMs can operate in parallel and the overall test length is determined by the die with the longest test length.

In an embodiment, the functional interposer interconnects are combined and reused during test in a parallel TAM using the "Daisy-Chain" architecture. In this case, only one TAM is required, which tours from the primary port, via all active die towers, back to the primary port. The active die towers can be daisy-chained in any order, which allows freedom in constructing the Daisy-chain architecture out of functional interposer connections. A Daisy-chain TAM supports both sequential and parallel test schedules.

In an embodiment, the functional interposer interconnects are combined and reused during test in a parallel TAM using the "Hybrid" architecture, which is a generalization of "Distribution" and "Daisy-chain" architectures. In the Hybrid architecture, there is a plurality of TAMs, and each TAM contains a plurality of active die towers, such that each die is included in exactly one TAM. Different TAMs can operate in parallel, while each individual TAM supports both sequential and parallel test schedules; the overall test length is determined by the TAM with the longest test length.
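By way of illustration, the overall test length of the Distribution, Daisy-chain and Hybrid architectures may be computed along the following lines. The sketch assumes a sequential test schedule within each TAM (the source also allows parallel schedules), and the function name `hybrid_test_length`, the data layout and all numbers are hypothetical.

```python
def hybrid_test_length(tams, lookup):
    """Overall test length of a set of TAMs operating in parallel.

    tams   -- list of (width, [die_tower, ...]) pairs, one per TAM
    lookup -- lookup[die_tower][width] gives that tower's test length
    """
    per_tam = []
    for width, towers in tams:
        # Assumed sequential schedule: towers on one TAM are tested one after another.
        per_tam.append(sum(lookup[tower][width] for tower in towers))
    # The TAMs run in parallel, so the slowest TAM sets the overall test length.
    return max(per_tam)


# Illustrative numbers only (not taken from the source).
lookup = {
    1: {1: 100, 2: 60},
    2: {1: 80, 2: 50},
    3: {1: 200, 2: 120},
}
distribution = [(1, [1]), (1, [2]), (1, [3])]  # one private TAM per tower
daisy_chain = [(2, [1, 2, 3])]                 # a single TAM through all towers
hybrid = [(2, [1, 2]), (1, [3])]               # a mix of both

print(hybrid_test_length(distribution, lookup))  # 200
print(hybrid_test_length(daisy_chain, lookup))   # 230
print(hybrid_test_length(hybrid, lookup))        # 200
```

In this representation the Distribution architecture is the special case with one die tower per TAM, and the Daisy-chain architecture is the special case with a single TAM.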

In an embodiment, the functional interposer interconnects are combined and reused during test in a parallel TAM using a "Multi-Visit TAM" architecture according to embodiments of the present invention, which is a generalization of the Hybrid TAM architecture. In this case, there are a number of TAMs which all start and end at the primary port, and each TAM contains at least one active die tower, for example a plurality of active die towers such that each die tower is included in exactly one TAM. The Multi-Visit TAM architecture allows a TAM to visit a die tower multiple times. One of these visits is used to test the die tower; the other visits are merely transit-only visits, which are used to enable connections to other die towers that otherwise would remain unreachable through existing functional interposer connections. This makes it possible to achieve a higher success rate in finding a valid TAM architecture, and/or to reduce the resulting test length by being able to identify wider TAMs.

In particular embodiments the TAM architecture imposes a partition of the die towers over the various TAMs. A Multi-Visit TAM visits all of its die towers at least once, but is allowed to visit its die towers multiple times, provided that the functional interposer connections that constitute the TAM path are used at most once. This implies that for a Multi-Visit TAM containing k die towers, each die tower can be visited at most k times. To support the Multi-Visit TAM, minor DfT extensions need to be added to the 3D die wrapper of the dies of the die towers. A multiplexer is required per re-visit per TAM wire to switch between normal functional mode and test mode. In normal functional mode, signals from the die itself are selected as outputs; in test mode, the TAM wires which perform the re-visit are selected.
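The validity conditions for a Multi-Visit TAM path can be checked with a short sketch such as the one below. It assumes that a path is given as a sequence of port numbers starting and ending at the primary port 0, and that each ordered pair of ports stands for one functional interposer connection; the function name `check_multi_visit_path` is hypothetical.

```python
from collections import Counter


def check_multi_visit_path(path, towers):
    """Return the number of re-visits per die tower, or None if the path is invalid.

    path   -- port numbers, e.g. [0, 1, 2, 1, 3, 0]; 0 is the primary port
    towers -- the die towers that this TAM must cover
    """
    if len(path) < 3 or path[0] != 0 or path[-1] != 0:
        return None
    hops = list(zip(path, path[1:]))
    # Each interposer connection (ordered port pair) may be used at most once.
    if len(set(hops)) != len(hops) or any(a == b for a, b in hops):
        return None
    visits = Counter(path[1:-1])
    # Every tower of the TAM must be visited at least once, and nothing else.
    if set(visits) != set(towers):
        return None
    # One visit tests the tower; every further visit is a transit-only re-visit.
    return {tower: visits[tower] - 1 for tower in towers}


print(check_multi_visit_path([0, 1, 2, 1, 3, 0], [1, 2, 3]))  # {1: 1, 2: 0, 3: 0}
print(check_multi_visit_path([0, 1, 2, 1, 2, 3, 0], [1, 2, 3]))  # None
```

Under these assumptions a die tower can be entered from at most k distinct ports (the primary port and the k - 1 other towers), which is consistent with the bound of at most k visits per tower.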

A method according to embodiments of the present invention may furthermore comprise adding in the interposer at least one die-to-die interposer wire for testing purposes. By adding and using these one or more extra wires, the TAMs may be widened.

Instead thereof or on top thereof, a method according to embodiments of the present invention may comprise adding at least one die-to-external-I/O and/or vice versa interposer wire for testing purposes. This embodiment also allows widening the TAMs, but will require one or more additional pins, which is expensive.

In another aspect, the present invention provides a method for identifying a TAM configuration which best minimizes the overall interposer-based 3D die stack test length. A longer test length requires deeper test vector storage on the test equipment and hence makes this test equipment significantly more expensive. A longer test length also implies a longer test application time, which makes the test more expensive. As test length contributes in a double way to test cost increase, it is a key performance indicator. The semiconductor interposer has stacked on top thereof at least two die towers each comprising at least one die. The method comprises generating all possible partitions for the at least two die towers according to a set of input parameters, and identifying for each partition the best possible Hybrid TAM for each group of die towers in that partition. A better Hybrid TAM is defined as a TAM of wider width, or, in case of equal width, of fewer re-visits. A Hybrid TAM comprises at least one TAM, each TAM containing at least one active die tower, such that each die tower is included in exactly one TAM. A wider TAM implies equal or smaller test length, which is a main optimization criterion. For Multi-Visit architectures, every re-visit implies more area in the wrapper of the re-visited die, which is a second optimization criterion.
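The partition generation and the ordering of candidate TAMs may be sketched as follows. The names `partitions` and `better_tam`, and the representation of a candidate TAM as a (width, re-visits) pair, are illustrative assumptions rather than the formulation used in the source.

```python
def partitions(towers):
    """Yield every partition of the list `towers` into non-empty groups."""
    if not towers:
        yield []
        return
    first, rest = towers[0], towers[1:]
    for smaller in partitions(rest):
        # Either the first tower forms a group of its own ...
        yield [[first]] + smaller
        # ... or it joins one of the groups of the smaller partition.
        for i, group in enumerate(smaller):
            yield smaller[:i] + [[first] + group] + smaller[i + 1:]


def better_tam(a, b):
    """Return the better of two candidate TAMs given as (width, revisits) pairs.

    Wider is better; at equal width, fewer re-visits is better.
    """
    return max(a, b, key=lambda tam: (tam[0], -tam[1]))


for partition in partitions([1, 2, 3]):
    print(partition)
```

For three die towers this enumerates five partitions, ranging from three single-tower groups to one group containing all towers.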

In a method according to particular embodiments of the present invention, each TAM may use a "Distribution" architecture wherein each active die tower has its own private TAM. In a method according to alternative particular embodiments of the present invention, each TAM may use a "Daisy-chain" architecture wherein only one TAM is required, which tours from a primary port of the semiconductor interposer, via all die towers, back to the primary port.

Generating all possible partitions for k dies according to a set of input parameters may include providing as input parameters a matrix containing the interconnect specification of the interposer in question and a test-length lookup table which provides the test lengths of each die for a range of TAM widths.

In embodiments of the present invention, identifying for each partition the best possible Hybrid TAM may include identifying a Multi-Visit Hybrid TAM which allows the TAM to visit a same die tower multiple times, once for testing that die and the other one or more times for transit-only visits used to enable interconnections to other dies.

In embodiments of the present invention, identifying for each partition the best possible Hybrid TAM for each group of die towers in that partition may comprise a pruning mechanism for reducing the number of possible Hybrid TAMs to be taken into account for the identification. Embodiments of the present invention allow an extensive 'pruning' of the search space; therefore, the computer run-time can be drastically reduced, while still being able to guarantee optimality of the solution (as if it had been run exhaustively over the entire search space).

The pruning mechanism according to embodiments of the present invention may include ignoring paths between die towers that do not result in a better TAM. Ignoring such paths may exploit the fact that the width of a TAM is determined by its narrowest segment.

In particular embodiments, the present invention provides a method for identifying a Multi-Visit TAM configuration that achieves the shortest test length for a given semiconductor, e.g. silicon, interposer architecture, the method comprising (a) generating all possible partitions for k dies according to a set of input parameters, and (b) identifying for each partition the best possible (One- or Multi-Visit) Daisychain TAM for each group of dies in that partition. A better Daisychain TAM is defined as a TAM of wider width, or, in case of equal width, of fewer re-visits. A wider TAM implies equal or smaller test length, which is a main optimization criterion. Every re-visit implies more area in the wrapper of the re-visited die, which is a second optimization criterion.
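One way such pruning could work for One-Visit Daisychain TAMs is sketched below as a depth-first search that abandons a partial path as soon as its narrowest segment is no wider than the best complete TAM found so far. This is an illustrative branch-and-bound sketch, not necessarily the mechanism of the source. It assumes that W[i][j] holds the number of functional wires from port i to port j, with port 0 the primary port; the function name and the matrix values are hypothetical.

```python
def widest_daisy_chain(W, group):
    """Find the widest One-Visit Daisychain TAM 0 -> towers of `group` -> 0.

    Returns (width, tower_order); width is 0 and the order None if no TAM exists.
    """
    best_width, best_order = 0, None

    def extend(order, width, remaining):
        nonlocal best_width, best_order
        # Prune: the narrowest segment so far already rules out an improvement.
        if width <= best_width:
            return
        if not remaining:
            # Close the tour back to the primary port.
            closed = min(width, W[order[-1]][0])
            if closed > best_width:
                best_width, best_order = closed, order[1:]
            return
        for nxt in sorted(remaining):
            segment = W[order[-1]][nxt]
            extend(order + [nxt], min(width, segment), remaining - {nxt})

    extend([0], float("inf"), frozenset(group))
    return best_width, best_order


# Illustrative interconnection matrix for k = 3 (values are made up).
W = [
    [0, 4, 0, 2],
    [0, 0, 3, 1],
    [2, 0, 0, 3],
    [3, 2, 0, 0],
]
print(widest_daisy_chain(W, [1, 2, 3]))  # (3, [1, 2, 3])
```

Because a complete TAM can never be wider than any of its partial paths, discarding a partial path whose width does not exceed the current best cannot remove a strictly better solution.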

The input parameters of step (a) may include a matrix containing the interconnect specification of the interposer in question and a test-length lookup table which provides the test lengths of each die for a range of TAM widths.

In one aspect, the test access architecture is scalable, in the sense that it works for an undetermined number of active dies (or 3D-SICs) stacked on a passive semiconductor, e.g. silicon, interposer.

Particular and preferred aspects of the invention are set out in the accompanying independent and dependent claims. Features from the dependent claims may be combined with features of the independent claims and with features of other dependent claims as appropriate and not merely as explicitly set out in the claims.

For purposes of summarizing the invention and the advantages achieved over the prior art, certain objects and advantages of the invention have been described herein above. Of course, it is to be understood that not necessarily all such objects or advantages may be achieved in accordance with any particular embodiment of the invention. Thus, for example, those skilled in the art will recognize that the invention may be embodied or carried out in a manner that achieves or optimizes one advantage or group of advantages as taught herein without necessarily achieving other objects or advantages as may be taught or suggested herein.

Brief Description of the Drawings

Presently preferred embodiments are described below in conjunction with the appended drawing figures, wherein like reference numerals refer to like elements in the various figures, and wherein:

Fig. 1 shows a typical interposer-based 3D die stack containing a semiconductor, e.g. silicon, interposer base.

Fig.2 illustrates the semiconductor, e.g. silicon, interposer model with k = 3 dies.

Fig.3 presents the interconnection matrix W, which represents the interposer model of Figure 2.

Fig.4 presents the DfT architecture according to embodiments of the present invention.

Fig.5 presents an embodiment of an interposer-based 3D die stack with a Distribution TAM architecture.

Fig.6 shows an embodiment of an interposer-based 3D die stack with a Daisychain TAM architecture.

Fig.7 shows all possible Daisychain TAM configurations represented by a tree.

Fig.8 shows an example of the concept of backtracing after a bottleneck in the tree is located.

Fig.9 shows an embodiment of an interposer-based 3D die stack with a Hybrid TAM configuration according to embodiments of the present invention.

Fig.10 presents an example of all possible partitions for k=3.

Fig.11 presents an embodiment of an interposer-based 3D die stack with Multi-Visit TAM architecture according to embodiments of the present invention.

Fig.12 shows the impact of the Multi-Visit TAM on the 3D die wrapper.

Fig.13 shows a tree representation of possible Multi-Visit Daisychain TAM configurations.

Fig. 14 illustrates the results of adding wires to a daisychain in accordance with embodiments of the present invention.

The drawings are only schematic and are non-limiting. In the drawings, the size of some of the elements may be exaggerated and not drawn to scale for illustrative purposes. The dimensions and the relative dimensions do not necessarily correspond to actual reductions to practice of the invention.

Any reference signs in the claims shall not be construed as limiting the scope.

In the different drawings, the same reference signs refer to the same or analogous elements.

Detailed Description of Illustrative Embodiments

The present invention will be described with respect to particular embodiments and with reference to certain drawings but the invention is not limited thereto, but is only limited by the claims. The drawings described are only schematic and are non-limiting. In the drawings, the size of some of the elements may be exaggerated and not drawn to scale for illustrative purposes. Furthermore, the terms first, second, third and the like in the description are used for distinguishing between similar elements and not necessarily for describing a sequential or chronological order. The terms are interchangeable under appropriate circumstances and the embodiments of the invention can operate in other sequences than described or illustrated herein.

Moreover, the terms top, bottom, over, under and the like in the description are used for descriptive purposes and not necessarily for describing relative positions. The terms so used are interchangeable under appropriate circumstances and the embodiments of the invention described herein can operate in other orientations than described or illustrated herein.

The term "comprising" should not be interpreted as being restricted to the means listed thereafter; it does not exclude other elements or steps. It needs to be interpreted as specifying the presence of the stated features, integers, steps or components as referred to, but does not preclude the presence or addition of one or more other features, integers, steps or components, or groups thereof. Thus, the scope of the expression "a device comprising means A and B" should not be limited to devices consisting of only components A and B. It means that with respect to the present invention, the only relevant components of the device are A and B.

Reference throughout this specification to "one embodiment" or "an embodiment" means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, appearances of the phrases "in one embodiment" or "in an embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment, but may. Furthermore, the particular features, structures or characteristics may be combined in any suitable manner, as would be apparent to one of ordinary skill in the art from this disclosure, in one or more embodiments.

Similarly it should be appreciated that in the description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.

Furthermore, while some embodiments described herein include some but not other features included in other embodiments, combinations of features of different embodiments are meant to be within the scope of the invention, and form different embodiments, as would be understood by those in the art. For example, in the following claims, any of the claimed embodiments can be used in any combination.

It should be noted that the use of particular terminology when describing certain features or aspects of the invention should not be taken to imply that the terminology is being redefined herein to be restricted to include any specific characteristics of the features or aspects of the invention with which that terminology is associated.

In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In other instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.

In the context of the present invention, a through-substrate via (TSV) is a vertical electrical interconnection (via = vertical interconnect access) passing completely through a thinned-down semiconductor wafer or die. TSVs are a technology used to create 3D die stacks and 3D integrated circuits, with high-performance low-dissipation inter-die interconnects.

A 3D die stack comprises two or more chips (integrated circuits) stacked vertically so that they occupy less space and/or have a larger connectivity. In particular stacks, an interposer may be used as an electrical interface between die towers each comprising at least one die, for electrically interconnecting the die towers by means of at least functional wires in the interposer. In the context of the present invention, a functional wire is a wire, e.g. a metal interconnect, which is part of the functional design of the interposer, and which is not dedicatedly added for test purposes.

Different types of interposers exist: a passive interposer contains no active circuitry, but only interconnects: horizontal interconnects between different dies through multiple metal layers, and/or vertical interconnects through TSVs between dies and the package substrate towards the external world. An interposer provides interconnections between two or more ports, which can be connected to the I/Os of other chips and/or external I/Os. In the context of the present invention, a test access mechanism (TAM) provides the means for on-chip test data transport. Test wrappers form an interface between a die and its environment, and connect the terminals of the die to other dies and to the TAM.

In the context of the present invention, test length refers to both the time it takes for testing an interposer / die tower architecture, as well as to memory depth requirements of a tester involved in carrying out the test.

Certain embodiments of the present invention relate to a system and a method for testing a plurality of active die towers stacked on top of a semiconductor, e.g. silicon, interposer, whereby each die tower is electrically interconnected by means of at least functional wires present in the interposer. The test architecture according to embodiments of the present invention has been developed for testing a plurality of active die towers after they have been stacked on top of a semiconductor, e.g. silicon, interposer (post-bond phase). In one embodiment, the semiconductor, e.g. silicon, interposer is a die without active circuits, containing only metal interconnect layers and optionally TSVs. However, the application is not limited to the generic interposer model presented but can also be applied to interposers containing active elements. The semiconductor, e.g. silicon, interposer provides (horizontal) wire interconnects between the active dies (or stacks of dies) stacked on top of it, as well as, optionally, (vertical) TSV-based interconnects via the package substrate to the external world. This describes the set-up in which the interposer is face-up, i.e., the many die-to-die connections ("horizontal interconnects") are through BEOL metal layers only, and only the external I/Os are located at the bottom side of the interposer ("flip-chip") and accessible through TSVs ("vertical interconnects"). However, other set-up scenarios are also possible, and are included in the context of the present invention. The following are cited as examples only, without claiming to be exhaustive, and without being limiting for the present invention:

• The active dies are still on top of the interposer and the external I/Os are still at the bottom of the interposer ("flip-chip"), but the interposer is oriented face-down. In that case not only the "vertical interconnects" need to go through TSVs, but also the "horizontal interconnects" (die-to-die) first go through TSVs to reach the face-down front-side of the interposer, where the BEOL metal layers are located.

• The active dies are still on top of the interposer, but the external I/Os are also at the top-side of the interposer. These external I/Os could for example be wire-bonded to the package substrate. In that scenario, TSVs are not necessary at all, as both the die-to-die interconnects as well as the die-to-external-I/O and/or vice versa interconnects are realized in BEOL metal layers.

In the generic interposer model, the interposer is arranged for having k die towers (or die stacks), each die tower comprising at least one die, stacked on top of itself, connected to the secondary ports, indicated in the drawings as Ports 1 ... k, respectively. The primary port, which is the interposer's connection, e.g. bottom side connection, to the external I/Os, is termed Port 0. The number of interconnects between Port i and Port j (for 0 ≤ i ≤ k and 0 ≤ j ≤ k) is a non-negative integer wᵢⱼ. Consequently, an interposer is fully characterized by the value of k and its interconnection matrix W containing all wᵢⱼ values for 0 ≤ i ≤ k and 0 ≤ j ≤ k. It is to be noted that some of the wᵢⱼ values can be zero, indicating that a connection between Ports i and j simply does not exist in this particular interposer.

Figure 2 shows an example interposer with k = 3. All connections between Ports i and j (for 0 ≤ i ≤ k and 0 ≤ j ≤ k and i ≠ j) are shown, even if some of these connections might have width zero. Figure 3 shows the corresponding interconnection matrix W.
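By way of illustration only, the interconnection matrix W may be held as a (k + 1) × (k + 1) table in which entry W[i][j] is the number of wires from Port i to Port j. The sketch below assumes Python; the numeric widths are hypothetical and are not taken from Figure 3, and the helper name absent_connections is likewise an assumption introduced here.

```python
# Hypothetical interconnection matrix for k = 3 die towers.
# W[i][j] = number of functional wires from Port i to Port j; Port 0 is the primary port.
W = [
    [0, 4, 0, 2],  # from Port 0 (external I/Os)
    [0, 0, 3, 0],  # from Port 1
    [0, 0, 0, 5],  # from Port 2
    [3, 2, 0, 0],  # from Port 3
]

def absent_connections(W):
    """List the ordered port pairs (i, j), i != j, that have width zero."""
    n = len(W)
    return [(i, j) for i in range(n) for j in range(n) if i != j and W[i][j] == 0]

print(absent_connections(W))
```

A zero entry simply records that the corresponding connection does not exist in this particular interposer.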

In a post-bond test of an interposer-based 3D die stack containing a semiconductor, e.g. silicon, interposer, the following items should be testable in accordance with embodiments of the present invention: (1) the stacked die towers each containing one or more dies, (2) the interposer, and (3) the interconnects between interposer and stacked die towers. Even if the stacked die towers and/or the interposer have already been tested for manufacturing defects prior to bonding, they might have been damaged during the stacking process, in which case a re-test is necessary. Post-bond testing also includes the test after packaging, which as final test assures the outgoing product quality to the customer and hence should be able to (re-)test all components of the SIC. In one embodiment it is proposed that the active die towers (or die stacks) are equipped with any suitable DfT architecture, such as for instance a 3D DfT architecture as described in Marinissen et al. "3D DfT Architecture for Pre-Bond and Post-Bond Testing", In Proceedings IEEE International Conference on 3D System Integration (3DIC), November 2010. This implies a 3D-enhanced die wrapper (based on either IEEE Std 1149.1 or IEEE Std 1500) around each die. A suitable DfT architecture:

1. provides test controllability and observability at all interposer-die tower interconnections, for enabling the interconnect + interposer testing.

2. provides test access to all active die towers on top of the interposer via a test interface, e.g. a scalable-width test interface, that is multiplexed onto existing functional inputs/outputs.

The bottom die of each active die tower (or, in case the die tower only consists of one die, that one die) gets, in addition, a DfT architecture such as for example the conventional IEEE Std 1149.1 Boundary Scan ('JTAG') implementation at its bottom-side interface toward the interposer. The semiconductor, e.g. silicon, interposer may be extended with extra wires and optionally TSVs to form connections between its k + 1 ports that hook up the k die towers, e.g. k IEEE 1149.1-compliant dies, to the primary port Port 0, as is also common on Printed Circuit Boards. This means that the primary port Port 0, in particular embodiments, may be extended with a four- or five-pin JTAG interface. The serial TAM between TDI and TDO makes a daisychain tour along all k die towers, while the JTAG serial control signals (TCK, TMS, and optionally TRSTN) are broadcast to all die towers. The JTAG-based serial TAM through the interposer and its stacked die towers allows the interposer and interconnects to be tested simultaneously, as if they were an extension of the Printed Circuit Board. From the domain of board-level interconnect testing, there is a large body of prior work on testing wire interconnects, which can be reused in this setting. This assumes full controllability and observability through digital scan access at both ends of the wires-under-test, which in case of embodiments of the present invention is achieved by means of the suggested DfT in interposer and dies. The Modified Counting Sequence Algorithm (see: P. Goel and M.T. McMahon, "Electronic Chip-in-Place Test", in Proceedings IEEE International Test Conference (ITC), October 1982, pp. 83-90) and the True/Complement Test Algorithm (see: P.T. Wagner, "Interconnect Testing with Boundary Scan", in Proceedings IEEE International Test Conference (ITC), October 1987, pp. 52-57) are two examples of interconnect ATPG algorithms. 
They provide a very cost-effective test, as both detect all (hard) opens and shorts through a set of digital test patterns whose size grows only logarithmically with the number of interconnects.
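The logarithmic growth can be illustrated with a minimal sketch in the spirit of the two cited algorithms; it is not taken from the cited papers. It assumes that every wire is given a unique binary code and that test pattern t drives bit t of each wire's code, so that the number of patterns equals the code length. The function names are hypothetical.

```python
def counting_sequence_codes(n_wires):
    """Unique codes 1..n, wide enough that neither all-zeros nor all-ones is used."""
    bits = (n_wires + 1).bit_length()  # smallest b with 2**b >= n_wires + 2
    return [format(i + 1, f"0{bits}b") for i in range(n_wires)]

def true_complement_codes(n_wires):
    """Unique codes 0..n-1, each followed by its bitwise complement."""
    bits = max(1, (n_wires - 1).bit_length())  # smallest b with 2**b >= n_wires
    codes = [format(i, f"0{bits}b") for i in range(n_wires)]
    flip = str.maketrans("01", "10")
    return [code + code.translate(flip) for code in codes]

for n in (6, 100, 10000):
    print(n, len(counting_sequence_codes(n)[0]), len(true_complement_codes(n)[0]))
```

Under these assumptions, going from 100 to 10,000 wires only doubles the number of patterns, which is why such a test remains inexpensive even for wide interposer interconnects.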

The serial TAM also allows the die towers to be tested. However, these die towers typically are or comprise complex ICs (microprocessors, SOCs, FPGAs, memories), with a large associated test data volume. If this large test data volume has to be applied through a one-bit serial TAM only, the test will take a very long time to execute and hence might become prohibitively expensive. In one embodiment, it is proposed to identify functional interconnects in the interposer that can be reused as parallel TAM. In this context, functional interconnects are understood to be interconnects that are part of the existing functional design of the interposer, and that have not been added specifically for test purposes. In the 3D-enhanced die wrapper design template, the parallel TAM in the die was already meant to connect to functional I/Os; which functional I/Os are selected for that purpose is now determined by the identification of the functional interconnects in the interposer that will serve as parallel TAM. A main benefit is that reusing existing interconnects does not increase the interposer costs.

Figure 4 depicts the proposed DfT architecture for an example interposer-based 3D die stack containing a semiconductor, e.g. silicon, interposer 40. The three stacked dies Die 1, Die 2, Die 3 are equipped with both a 3D-enhanced die wrapper 41, 42, 43 and a bottom IEEE 1149.1-compliant Boundary Scan ('JTAG') wrapper 44, 45, 46. The functional interconnects in the interposer 40 are equal to the ones in Figure 2. Highlighted in bold and dotted lines in the interposer 40 are the additional interconnections to implement the IEEE 1149.1 connectivity. The cost of the additional DfT is negligible. For a medium-sized industrial SOC, the 3D-enhanced die wrapper takes less than 0.04% of the die area, to which the bottom JTAG implementation cost still needs to be added. The interposer 40 may optionally be extended with four or five TSVs and (in the order of) k non-functional (or dedicated for test purposes) metal interconnect wires.

The parallel TAM for the SIC may be implemented by reusing the functional interconnects in the interposer 40. TAM architectures can be classified as Single-Visit TAMs, in which every die is visited exactly once by a single TAM, and Multi-Visit TAMs, in which every die is visited once or more by a single TAM. Three types of Single-Visit TAM architectures are distinguished: (1) Distribution, (2) Daisychain, and (3) Hybrid. The Hybrid architecture is a generalization of Distribution and Daisychain, and hence includes these two types. Only two types of Multi-Visit TAM architectures are considered: Daisychain and Hybrid. Since there are a number of different TAM configurations under a specific TAM architecture, optimization algorithms are proposed in accordance with embodiments of the present invention to search for the TAM configuration that minimizes the overall test length of the die towers.

In one embodiment of the present invention, the test access architecture for 3D-SICs is based on the Distribution TAM architecture. In such a Distribution TAM architecture, each die tower has its own private TAM, and hence all die towers can be tested concurrently. Consequently, the overall test length is determined by the die tower which has the longest test length. Since, in accordance with embodiments of the present invention, functional interconnects in the interposer are reused as TAM wires, the requirement for a SIC to adopt this TAM architecture is that each die tower must have its own external I/Os, i.e., for each Die tower i in the SIC, w₀ᵢ > 0 (external inputs exist) and wᵢ₀ > 0 (external outputs exist). The TAM width for Die tower i is then the minimum of these two widths, min(w₀ᵢ, wᵢ₀). Figure 5 shows an interposer-based 3D die stack with a Distribution TAM architecture, where the bold lines w₀₁, w₁₀, w₀₂, w₂₀, w₀₃, w₃₀ represent the TAM wires. An advantage of the Distribution architecture is its ability to perform concurrent testing of the various dies. Since there is only one possible TAM configuration for this architecture, no optimization algorithm is required. However, as the number of die towers k in a SIC grows, it becomes less and less likely that every die tower indeed has its own external I/Os, and hence the chance to be able to identify a complete Distribution architecture covering all die towers drastically decreases.
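The Distribution requirement and the resulting per-tower TAM widths may be checked as sketched below. This is a minimal illustration only; it assumes the matrix convention W[i][j] = number of wires from Port i to Port j, and the function name and example widths are hypothetical.

```python
def distribution_tam_widths(W):
    """Return {tower: min(w0i, wi0)} or None if some tower lacks external I/Os."""
    widths = {i: min(W[0][i], W[i][0]) for i in range(1, len(W))}
    if any(width == 0 for width in widths.values()):
        return None  # at least one die tower has no private path to and from Port 0
    return widths

# Hypothetical interposer in which every die tower has its own external I/Os.
W_ok = [
    [0, 8, 4, 6],
    [6, 0, 2, 0],
    [4, 0, 0, 3],
    [9, 0, 0, 0],
]
print(distribution_tam_widths(W_ok))  # {1: 6, 2: 4, 3: 6}
```

If any die tower has no wires to or from Port 0, the function returns None, reflecting that a complete Distribution architecture cannot be identified for that interposer.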

In one embodiment of the present invention, the test access architecture for 3D-SICs is based on the Daisychain TAM architecture. In the Daisychain architecture, there is only one TAM, which concatenates all dies in the SIC. Each die wrapper is equipped with a single-bit bypass register so that a die tower can be bypassed if it is not currently under test. This architecture allows for both sequential and parallel test schedules. In a sequential test schedule, only one die tower is tested at a time, while the other die towers are bypassed. In this way, the die towers are tested one by one until all of them have been tested. In a parallel test schedule, multiple or all die towers are tested at the same time with each tested die's scan chains concatenated. An interposer-based 3D die stack with a Daisychain TAM architecture is shown in Figure 6, in which the TAM starts from the primary port Port 0, goes through secondary ports Ports 1, 2, and 3, and ends at the primary port Port 0. For a given interposer, all possible Daisychain TAM configurations can be represented by means of a tree. The nodes in this tree represent ports of the interposer, while the edges represent the interconnects between the ports. A Daisychain TAM configuration is represented by a path from the root of the tree to a leaf node of the tree. A valid Daisychain TAM should always start and end at the primary port Port 0, and hence the root and all leaf nodes should represent Port 0. Furthermore, a Daisychain TAM should include all other, secondary, ports, and hence in every root-to-leaf path of the tree, all ports other than Port 0 should be included exactly once. The width of a particular Daisychain TAM configuration is defined as the minimum of all interconnect widths that are included in the path. For a k-die tower SIC, the tree should include all k! possible Daisychain TAM configurations (permutations) to concatenate all ports. 
Figure 7 shows a tree graph corresponding to the example interposer-based 3D die stack from Figure 6; the selected Daisychain TAM configuration is highlighted with bold edges. An algorithm may be defined for identifying the best, e.g. widest, Daisychain configuration. The input of the algorithm is the interconnection matrix of a SIC; the output is the permutation of die towers that has the widest TAM width. In a Daisychain architecture, all die towers have the same TAM width. A wider TAM width implies that the overall test length remains equal or decreases. The widest Daisychain TAM results in the shortest overall test length. Hence, the procedure of searching for the Daisychain TAM configuration with the shortest overall test length can be viewed as exploring all full root-leaf paths in the tree and selecting the path that achieves the widest TAM width. For large k values, the number of TAM configurations k! becomes huge and hence it will be time-consuming to find the best TAM configuration. For example, if k = 12, then there will be 12! = 479,001,600 paths to explore in the tree. In order to quickly get the best TAM configuration among k! possibilities, use of a 'tree-pruning' mechanism is proposed in accordance with embodiments of the present invention, to avoid having to explore the entire tree. The concept of the proposed algorithm is to prune the part of the tree that cannot create a TAM wider than the current one. This is achieved by exploiting the fact that the width of a Daisychain TAM is determined by its narrowest segment. For example, if the widest TAM found so far has a width of five bits, then any path having a segment with a width equal to or less than five can be ignored. At the beginning of the algorithm, the best TAM width is set to 1, meaning that there is already a single-bit TAM available and only TAMs that improve on it are of interest. 
Starting from the primary port Port 0, all edges to other ports are examined and the widest one is selected, provided that it is also wider than the current best TAM width. By taking this selected edge, the process moves to the examination of the next port. Then, all edges from the current port to other ports are examined again, and the widest one that is wider than the current best TAM width is selected. It should be noted that the edges from the current port to ports which have already been visited are not considered, since a valid Daisychain in Single-Visit mode only visits every port once. If there is no edge to other ports wider than the current best TAM width, then the algorithm traces one level back to the previous port and marks the visited edge to avoid re-visiting. By repeating the above procedure, a Daisychain TAM can be formed. Each time a Daisychain TAM is obtained, the algorithm locates the narrowest edge of such a Daisychain and updates the best TAM width with the width of this edge. This edge is called the 'bottleneck' of the Daisychain, since it limits the TAM width. According to the position of the bottleneck, it is possible to determine from which node the exploration of the tree should continue to obtain other, wider Daisychains. In Figure 8, the cross through an edge represents that that edge has been visited. The bottleneck has been located in wₖₘ after finding a complete Daisychain TAM. Since the algorithm always selects the widest edge from the current node to the next node, if the bottleneck is wₖₘ, the other un-visited edges at the same level (represented by dotted lines) can be ignored, because their widths must be equal to or narrower than the bottleneck and hence cannot construct a wider Daisychain. In the example shown in Figure 8, the algorithm traces back to Node i after locating the bottleneck, and continues to explore the rest of the tree. By repeating the steps described above, the algorithm continues searching for wider Daisychains. 
The search process terminates if the current port is the primary port Port 0 and there are no un-visited edges to other ports wider than the best TAM width. Using the proposed tree-pruning approach in accordance with embodiments of the present invention, many parts of the tree can be ignored, because after a Daisychain is found, the algorithm prunes the tree according to the location of the bottleneck. In addition, the best TAM width increases gradually and any edge that is not wider than it is skipped as well. The proposed algorithm only prunes the part of the tree that cannot construct a better Daisychain TAM, and therefore finds an optimal solution without exhaustively exploring the entire tree.
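The tree-pruning search described above may be sketched as a depth-first search that tries the widest outgoing edge first and abandons any edge that is not wider than the best TAM width found so far. The following is a minimal illustration under stated assumptions, not a definitive implementation: it assumes W[i][j] holds the number of wires from Port i to Port j, and the function name and example widths are hypothetical.

```python
def widest_daisychain(W, towers=None):
    """Return (width, order) of the widest Port 0 -> towers -> Port 0 chain.

    Returns (1, None) if no chain wider than the single-bit serial TAM exists.
    """
    if towers is None:
        towers = range(1, len(W))
    best_width, best_order = 1, None  # a 1-bit serial TAM is assumed to exist already

    def explore(port, remaining, width, order):
        nonlocal best_width, best_order
        if not remaining:
            closed = min(width, W[port][0])  # the chain must end at Port 0
            if closed > best_width:
                best_width, best_order = closed, order
            return
        # Widest edge first; ties are broken by port number for determinism.
        for nxt in sorted(remaining, key=lambda j: (-W[port][j], j)):
            if W[port][nxt] <= best_width:
                break  # this edge and all narrower siblings cannot improve the result
            explore(nxt, remaining - {nxt}, min(width, W[port][nxt]), order + [nxt])

    explore(0, frozenset(towers), float("inf"), [])
    return best_width, best_order

# Hypothetical interposer with k = 3.
W = [
    [0, 4, 0, 2],
    [0, 0, 3, 0],
    [0, 0, 0, 5],
    [3, 2, 0, 0],
]
print(widest_daisychain(W))  # (3, [1, 2, 3])
```

In this example the chain Port 0, 1, 2, 3, 0 has bottleneck width 3; once it is found, the remaining edge from Port 0 to Port 3 (width 2) is pruned without exploring its subtree.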

In one embodiment of the present invention, the test access architecture for 3D-SICs is based on the Hybrid TAM architecture which allows a number of TAMs, ranging from one to k, where each TAM includes at least one die tower, such that each die tower is included in exactly one TAM and the union of all TAMs includes all die towers. The Hybrid TAM architecture can be thought of as a partition of the set of die towers into a number of subsets, through which Daisychain TAMs are created. The overall test length for a Hybrid TAM architecture is determined by the maximum test length of the constituting Daisychain TAMs. The Hybrid TAM architecture is a more generic TAM definition that includes Distribution TAMs, Daisychain TAMs, and other TAM configurations, and hence increases the chances to find a TAM configuration for a given interposer. Taking the interposer-based 3D die stack in Figure 9 as an example, although there is no Distribution or Daisychain solution for this interposer 90, there is a Hybrid TAM solution. Two TAM chains can be created, one of which consists of only secondary port Port 1, while the other one consists of secondary ports Ports 3 and 2, as shown in the figure by bold lines. The number of possible configurations N(k) of the Hybrid TAM architecture for k dies can be calculated by the following recursive equations.

$$N(0) = 1 \tag{1}$$

$$N(k) = k! \times H(k) \tag{3}$$

where

$$H(0) = 1 \tag{4}$$

$$H(k) = \frac{1}{k} \sum_{i=1}^{k} i \cdot H(k-i) \tag{5}$$

From Equations 1 through 5, it can be seen that, for a k-die tower SIC, the solution space of the Hybrid TAM is H(k) times larger than that of the Daisychain TAM; for example, H(12) ≈ 26. This gives the Hybrid TAM a higher success rate for finding a TAM solution. However, finding the best TAM configuration within such a large solution space could take a prohibitively long compute time. A Hybrid TAM optimization algorithm is therefore proposed, which uses the Daisychain optimization algorithm described above as a sub-procedure. The inputs of the algorithm include the interconnection matrix of a SIC and a test length lookup table which provides the test length of each die tower versus a range of TAM widths; the output is the Hybrid TAM configuration that achieves the shortest overall test length. Unlike in the case of the Daisychain TAM, this Hybrid TAM optimization needs to take into account the test lengths of the die towers, because there may be more than one TAM and a wider total summed TAM width does not necessarily imply a shorter overall test length. The overall test length of a Hybrid TAM configuration is determined by the TAM that has the longest test length.
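The size of the Hybrid solution space can be checked numerically. The sketch below counts the configurations directly, as the number of ways to split k die towers into groups and to order the towers within each group; the recurrence it uses (fix the group that contains the first tower) is an assumption introduced here for illustration, and the function names are hypothetical.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb, factorial

@lru_cache(maxsize=None)
def hybrid_count(k):
    """Number of Hybrid TAM configurations N(k) for k die towers."""
    if k == 0:
        return 1
    # The group holding tower 1 has i members: choose the other i - 1 towers,
    # order the group in i! ways, and configure the remaining k - i towers.
    return sum(comb(k - 1, i - 1) * factorial(i) * hybrid_count(k - i)
               for i in range(1, k + 1))

def hybrid_ratio(k):
    """H(k) = N(k) / k!, the growth relative to the k! Daisychain configurations."""
    return Fraction(hybrid_count(k), factorial(k))

print(hybrid_count(3), float(hybrid_ratio(12)))
```

For k = 3 this gives 13 configurations, and for k = 12 the ratio to 12! is approximately 26, in line with the value stated above.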

Before describing the proposed algorithm, two terms are explained, partition and group, using Figure 10. Figure 10 assumes k = 3, and shows all five possible partitions of Dies 1, 2, and 3. As shown in the figure, a group consists of die towers, and can be viewed as a TAM chain; a partition consists of groups, and can be viewed as a Hybrid TAM configuration consisting of multiple TAM chains. At the beginning of the proposed algorithm, the current best test length is set to infinity, meaning that there is no Hybrid TAM solution initially. There are two main steps in the algorithm. The first step is to generate all possible partitions of the k die towers in the given SIC. Then, in the second step, using the partitions generated in the first step as an input, the groups in each partition are optimized by the Daisychain optimization algorithm presented previously, to determine the permutation of the die towers within each group that forms a TAM chain achieving the widest TAM width. To optimize groups in a partition, the algorithm proceeds from the smallest to the largest group, and if any of the groups fails to form a TAM chain or the test length of such a TAM chain is longer than the current best test length, the processing of that partition is terminated and the algorithm continues with the next partition. By processing groups in smallest-group-first order, a significant amount of compute time can be saved, because if a smaller group already meets the termination condition, the larger groups in the same partition do not need to be processed. After all partitions are processed, the Hybrid TAM configuration with the shortest overall test length is selected. Since all possibilities are taken into account, the algorithm guarantees an optimal solution. Although the proposed algorithm needs to invoke the Daisychain optimization algorithm many times, the proposed Daisychain optimization algorithm is so efficient that the total compute time is still within a reasonable scale.
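The two steps may be sketched as follows. This is a hedged illustration only: a brute-force permutation search stands in for the pruned Daisychain search to keep the example short, the test length of a chain is assumed to be the sum of the test lengths of its die towers at the chain's width (a sequential schedule), and all names, widths, and the test-length model are hypothetical.

```python
from itertools import permutations
from math import ceil, inf

def set_partitions(items):
    """Yield every partition of `items` as a list of groups."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in set_partitions(rest):
        for i in range(len(part)):  # add `first` to an existing group
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        yield [[first]] + part      # or give `first` a group of its own

def widest_chain(W, group):
    """Brute-force stand-in for the pruned search: widest Port 0 -> group -> Port 0."""
    best_width, best_order = 0, None
    for order in permutations(group):
        path = (0, *order, 0)
        width = min(W[a][b] for a, b in zip(path, path[1:]))
        if width > best_width:
            best_width, best_order = width, order
    return best_width, best_order

def best_hybrid(W, length_of):
    """Return (overall test length, [(chain order, width), ...]) of the best partition."""
    best_length, best_config = inf, None
    for partition in set_partitions(list(range(1, len(W)))):
        config, overall = [], 0
        for group in sorted(partition, key=len):  # smallest groups first
            width, order = widest_chain(W, group)
            if width == 0:
                break                             # group cannot form a TAM chain
            length = sum(length_of(t, width) for t in order)
            if length >= best_length:
                break                             # cannot beat the current best
            config.append((order, width))
            overall = max(overall, length)
        else:
            best_length, best_config = overall, config
    return best_length, best_config

# Hypothetical interposer and test data volumes (bits per die tower).
W = [[0, 4, 0, 6], [4, 0, 0, 0], [5, 0, 0, 0], [0, 0, 8, 0]]
volume = {1: 400, 2: 300, 3: 500}
print(best_hybrid(W, lambda t, w: ceil(volume[t] / w)))
```

For this hypothetical interposer neither a Distribution nor a single Daisychain exists, yet the partition into the chains (1) and (3, 2) is found, with an overall test length set by the longer of the two chains.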

In one embodiment of the present invention, the test access architecture for 3D-SICs is based on the Multi-Visit TAM architecture. The three architectures described previously (Distribution, Daisychain and Hybrid) allow a TAM to visit a die tower only once, which is the concept of Single-Visit TAMs. The concept of Multi-Visit TAMs allows a TAM to visit the same die tower multiple times. One of these visits is used to test that die tower itself; the other visits are merely transit-only visits, which are used to enable connections to other die towers through existing functional interposer connections. The objective of the Multi-Visit TAM concept is, for given interposer designs, to achieve a higher success rate for finding a valid TAM architecture and to reduce the resulting test length by being able to identify wider TAMs. The proposed Multi-Visit TAM architecture is a generalization of the Hybrid TAM architecture. Hence, there are a plurality of TAMs which all start and end at primary port Port 0, and each TAM contains at least one active die tower, such that each die tower is included in exactly one TAM. In other words: the TAM architecture imposes a partition of the die towers over the various TAMs. A Multi-Visit TAM visits all of its die towers at least once, but is allowed to visit its die towers multiple times, provided that the functional interposer connections that constitute the TAM path are used at most once. This implies that for a Multi-Visit TAM containing k die towers, a die tower can be visited at most k times. Figure 11 shows an example of an interposer-based 3D die stack with three die towers each containing a single die. 
The interposer connections are such that the connections between the primary port Port 0 and the secondary port Port 2, between the primary port Port 0 and the secondary port Port 3, and between the secondary Ports 2 and 3 are missing (i.e., w₀₂ = w₂₀ = w₀₃ = w₃₀ = w₂₃ = w₃₂ = 0); in Figure 11, these connections are not drawn. Due to these missing interposer connections, the Distribution, Daisychain, and Hybrid architectures cannot succeed in constructing a valid TAM architecture for this example interposer-based 3D die stack. However, the Multi-Visit TAM is capable of doing so. The Multi-Visit TAM architecture in this case consists of a single TAM which starts at the primary port Port 0, subsequently includes Dies 1, 2, 1, 3, and 1, and finally ends at primary port Port 0. The TAM connections which are new under the Multi-Visit TAM scheme according to embodiments of the present invention are indicated in Figure 11 in dotted lines.
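The validity rule for such a path (start and end at Port 0, every interposer connection used at most once) and the resulting TAM width can be illustrated with the short sketch below. The matrix reproduces only the zero pattern described for Figure 11; the non-zero widths and the function name are hypothetical.

```python
def multi_visit_width(W, path):
    """Check a Multi-Visit TAM path and return its width (narrowest connection)."""
    if path[0] != 0 or path[-1] != 0:
        raise ValueError("a TAM must start and end at Port 0")
    hops = list(zip(path, path[1:]))
    if len(set(hops)) != len(hops):
        raise ValueError("an interposer connection is used more than once")
    width = min(W[a][b] for a, b in hops)
    if width == 0:
        raise ValueError("the path uses a connection that does not exist")
    return width

# Port 0 connects only to Port 1; Ports 2 and 3 connect only to Port 1.
W = [
    [0, 4, 0, 0],
    [4, 0, 5, 3],
    [0, 3, 0, 0],
    [0, 2, 0, 0],
]
print(multi_visit_width(W, [0, 1, 2, 1, 3, 1, 0]))  # 2
```

Each directed connection is traversed once, and Die 1 appears three times in the path: once for its own test and twice as a transit-only visit.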

Figure 12 illustrates an example of how to design the 3D test wrapper of Die 1 if the Multi-Visit TAM configuration of Figure 11 is adopted. In order to support the Multi-Visit TAM, DfT extensions need to be added to the conventional 3D die wrapper. As shown in Figure 12, the first TAM visit to Die 1 is considered a "normal" visit and the TAM wires are connected to the internal test resources (Wrapper Boundary Register, scan chains, or Bypasses) so that Die 1 itself can be tested or participate otherwise in the test. Subsequent visits of the TAM to Die 1 (highlighted in bold in Figure 12) are considered transit-only visits and the TAM is simply directed to the wires of the destination port without entering the internal test resources of Die 1. A multiplexer is required per re-visit per TAM wire to switch between normal functional mode and test mode. In normal functional mode, signals from the die itself are selected as outputs; in test mode, the TAM wires which perform the re-visit are selected. In the example of Figure 12, Die 1 needs two extra multiplexers of min(w₀₁, w₂₁, w₃₁, w₁₀, w₁₂, w₁₃) bits (the same as the TAM width); Dies 2 and 3 do not need extra DfT, since there is no re-visit to them.
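The number of extra multiplexers per die follows directly from the TAM path: every visit after the first is a transit-only visit and requires one multiplexer of TAM width. A minimal sketch, with a hypothetical function name:

```python
from collections import Counter

def transit_visits(path):
    """Map each die in a TAM path to its number of transit-only re-visits."""
    visits = Counter(port for port in path if port != 0)
    return {die: count - 1 for die, count in visits.items()}

print(transit_visits([0, 1, 2, 1, 3, 1, 0]))  # {1: 2, 2: 0, 3: 0}
```

For the path of Figure 11 this yields two re-visits for Die 1 and none for Dies 2 and 3, matching the two extra multiplexers stated above.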

A method for optimizing the Multi-Visit parallel TAM architecture is also proposed in accordance with embodiments of the present invention. The proposed algorithm identifies a Multi-Visit TAM configuration that achieves the shortest test length. The inputs for the algorithm include a matrix containing the interconnect specification of the interposer in question and a test-length lookup table which provides the test lengths of each die tower for a range of TAM widths. The proposed algorithm comprises or consists of two steps. The first step generates all possible partitions for k die towers. In the second step, for each partition it identifies the best possible (One- or Multi-Visit) Daisychain TAM for each group of die towers in that partition. A better Daisychain TAM is defined as a TAM of wider width, or, in case of equal width, with fewer re-visits. A wider TAM implies equal or smaller test length, which is a main optimization criterion. Every re-visit implies more area in the wrapper of the re-visited die tower, which constitutes a second optimization criterion. Step 1 can be accomplished by straightforward enumeration, which was explained previously using Figure 10, where all possible partitions for k = 3 are illustrated. This step needs to be executed only once and the resulting partitions for different values of k can be saved in a file for future use. The tree for the One-Visit Daisychain TAM, as described previously with reference to Figure 7, can be extended to represent the Multi-Visit Daisychain TAM. Part of the resulting tree is shown in Figure 13. This drawing indicates that the complexity of the Multi-Visit tree is much higher than that of the One-Visit tree. Multi-Visit TAMs have different TAM lengths depending on the number of re-visits used. A 'tree-pruning' mechanism may be employed to avoid having to explore the entire tree, so that the compute time of the optimization procedure can be reduced. 
The concept of tree-pruning is to ignore the paths that do not result in a better TAM, which can be accomplished by exploiting the fact that the width of a Daisychain TAM is determined by its narrowest segment. For example, if the current best TAM has a width of 10 bits, then any edge narrower than 10 bits can be ignored during tree exploration, because a TAM including these edges is narrower than the current best TAM. At the start of the Daisychain TAM optimization, the best TAM width is initialized to 1, meaning that there is already a single-bit ("serial") TAM available and only TAMs which improve on it are of interest. Starting from the root, primary port Port 0, the algorithm selects the widest edge w₀ᵢ, 1 ≤ i ≤ k, which must also be wider than the best TAM width. Taking the selected edge, the algorithm moves to the next port, Port i. Then, from secondary port Port i, the algorithm again selects the widest edge wᵢⱼ, 1 ≤ j ≤ k, to the next Port j. If there is no path from the current port to the next port, the algorithm backtracks to the previous port and tries another edge. After all ports have been visited by repeating the above operations, the algorithm tries to find a path to primary port Port 0 to end this TAM. If indeed a Daisychain TAM is found, the best TAM width is updated. As the exploration of the tree proceeds, the best TAM width grows larger, and therefore increasingly more edges can be pruned in the subsequent exploration steps. It should be noted that if the best TAM width equals 1, the algorithm ignores all edges equal to or smaller than the best TAM width. This is because there is already a single-bit test interface available, and hence all 1-bit edges can be pruned. However, if the best TAM width is larger than 1, the algorithm ignores only the edges smaller than the best TAM width, since Daisychain TAMs with the same width may have different TAM lengths (resulting from different numbers of re-visits). 
In case of multiple TAMs with the same width, the algorithm selects the one with the shortest length as the best. Priority is given to not-yet-visited ports during the exploration of the tree, i.e., the algorithm performs a re-visit only when there is no path from the current port to a not-yet-visited port. This can minimize the number of re-visits, which in turn minimizes the resulting TAM length and the area cost induced in the die wrapper. For a given partition, its die groups are optimized by the algorithm described above to determine their Multi-Visit Daisychain TAM configurations. Then, the test length of the partition is calculated with the provided test-length lookup table. After exhaustively processing all possible partitions, the one with the shortest test length is selected as the best Multi-Visit TAM configuration. Although the problem complexity of Multi-Visit optimization is much higher than that of Single-Visit optimization, experimental results show that the Multi-Visit algorithm according to embodiments of the present invention is efficient and can deal with large k values. This algorithm guarantees the optimum solution since all possible TAM configurations are considered.
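
By way of illustration only, the pruned exploration described above may be sketched in Python as follows. The function name `best_daisychain`, the matrix layout `w[i][j]` (number of functional wires from Port i to Port j, with Port 0 as primary port), and the example matrix `W` are assumptions made for this sketch; they are not taken from the figures. The sketch uses each directed interconnect at most once, tries not-yet-visited ports before re-visits, and prunes any edge that cannot lead to a wider, or equally wide but shorter, Daisychain.

```python
def best_daisychain(w):
    """Search for the widest closed Multi-Visit chain Port 0 -> ... -> Port 0.

    w[i][j] is the number of functional wires from Port i to Port j
    (0 means no connection). Returns (width, path); path is None when
    nothing better than the assumed 1-bit serial TAM was found.
    """
    n = len(w)
    best = {"width": 1, "path": None}  # a 1-bit serial TAM is assumed to exist

    def usable(width, hops):
        # Pruning rule: only chains that are wider than the current best,
        # or equally wide with fewer hops (fewer re-visits), are of interest.
        if best["path"] is None:
            return width > 1
        if width != best["width"]:
            return width > best["width"]
        return hops < len(best["path"]) - 1

    def explore(port, visited, used, width, path):
        # All secondary ports visited: try to close the chain at Port 0.
        if len(visited) == n - 1 and w[port][0] > 0:
            closed = min(width, w[port][0])
            if usable(closed, len(path)):
                best["width"], best["path"] = closed, path + [0]
        candidates = [j for j in range(1, n)
                      if j != port and w[port][j] > 0 and (port, j) not in used]
        # Not-yet-visited ports first, then the widest edge first.
        candidates.sort(key=lambda j: (j in visited, -w[port][j]))
        for j in candidates:
            new_width = min(width, w[port][j])
            # At least one more hop is needed to return to Port 0.
            if usable(new_width, len(path) + 1):
                explore(j, visited | {j}, used | {(port, j)},
                        new_width, path + [j])

    explore(0, frozenset(), frozenset(), float("inf"), [0])
    return best["width"], best["path"]


# Hypothetical interposer with three secondary ports. No One-Visit chain
# exists here; a re-visit of Port 1 is needed to reach Port 3.
W = [
    [0, 4, 0, 0],
    [0, 0, 4, 3],
    [0, 3, 0, 0],
    [5, 0, 0, 0],
]
print(best_daisychain(W))
```

In this hypothetical example the only closed chain is Port 0, 1, 2, 1, 3, 0, whose narrowest segment is 3 bits wide.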

Another approach to increase the TAM width is to add test-dedicated interconnects. An algorithm is proposed to determine where and how many wires should be added. The inputs of the algorithm include the interconnection matrix of the target SIC, a test-length lookup table which lists the test lengths of each die versus a range of TAM widths, and a user-specified maximum number of allowed extra wires, w_max. The output is a Single- or Multi-Visit TAM configuration, together with the positions and exact number of wires that should be added. The maximum number of allowed extra wires, w_max, is user-defined, and may depend on cost, for example on the ratio between cost and improvement in test length.

Adding external I/Os may significantly increase product costs. Nevertheless, in accordance with embodiments of the present invention, it may also be implemented, besides or on top of adding interconnects between die towers. Hereto, die-to-external-I/O and/or vice versa interposer wires (so-called "vertical wires") may be added in the interposer, for the purpose of widening the TAMs. This solution may prove particularly relevant if the other solutions provided herein do not sufficiently reduce the test time. Although the proposed adding-wire technique can be applied to both Daisychain and Hybrid TAM architectures, as an example only, adding wires to the Hybrid TAM is considered here, because the Daisychain architecture is merely a special case of the Hybrid architecture.

The proposed algorithm consists of two steps and the first step is to generate all possible partitions for the target k-die SIC, which is the same as the first step described with respect to Hybrid TAM.
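
By way of illustration only, the enumeration of all partitions in the first step may be sketched in Python as follows. The function name `partitions` and the list-of-lists representation of a partition are assumptions made for this sketch.

```python
def partitions(towers):
    """Yield every way to split the list `towers` into non-empty groups."""
    if not towers:
        yield []
        return
    first, rest = towers[0], towers[1:]
    for part in partitions(rest):
        # Place `first` into each existing group in turn ...
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        # ... or give it a group of its own.
        yield [[first]] + part


for p in partitions([1, 2, 3]):
    print(p)
```

For three die towers the generator yields five partitions, ranging from a single group containing all towers to three groups of one tower each. Because the result depends only on k, it can be computed once and stored for later use.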

Then, the second step is to optimize each group in each partition. For each group, the corresponding tree for the die towers in the group is explored exhaustively to determine the best Daisychain TAM configuration under the specified maximum number of allowed extra wires w_max. The tree that needs to be explored can be Single-Visit (as in Fig. 7) or Multi-Visit (as in Fig. 13), depending on the desired TAM architecture.

During the tree exploration, every possible Daisychain TAM is considered, including 0-bit Daisychains. Since extra wires are allowed, an original 0-bit Daisychain may be perfectly suitable for serving as the initial Daisychain for adding wires. Each time a complete Daisychain has been identified, a range of extra wires from one to w_max is added to the Daisychain and its resulting test length is calculated. To add wires to a Daisychain, every bottleneck position (a segment with the narrowest width) needs to be widened to make the effective TAM width wider. Figure 14 shows an example of adding wires to Daisychains.

The Daisychain of Figure 14(a) (Daisychain 1) is initially a 0-bit Daisychain, but it may be a good initial Daisychain for adding wires because its bottleneck lies on only one segment, and hence the Daisychain can be made one bit wider with each extra wire. The Daisychain of Figure 14(b) (Daisychain 2) is initially a one-bit Daisychain, and there is also one bottleneck position. If no extra wire is allowed, Daisychain 2 is certainly better than Daisychain 1. However, with 4 extra wires, Daisychain 1 can become 4-bit wide, while Daisychain 2 can only become 3-bit wide. In addition, Daisychain 2 can only be extended to 4-bit at most, because its external input is 4-bit wide. Making Daisychain 2 wider than 4-bit would therefore require extra external I/Os, which may induce significant additional cost and is prohibited in the algorithm according to embodiments of the present invention. From this example, it can be seen that for different numbers of allowed extra wires, the best Daisychain TAM configuration may be different.
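
By way of illustration only, the effect of a wire budget on the width of a Daisychain may be sketched in Python as follows. The function name `widened_width` and the representation of a Daisychain as a list of `(width, widenable)` pairs are assumptions made for this sketch; segments attached to external I/Os are marked as not widenable. The segment widths in the example are hypothetical and are not read from Figure 14.

```python
def widened_width(segments, extra_wires):
    """Return the widest effective TAM width reachable with a wire budget.

    segments: list of (width, widenable) pairs along the Daisychain.
    Every segment narrower than the target width must be widened, and
    segments that are not widenable cap the achievable width.
    """
    fixed = [w for w, ok in segments if not ok]
    cap = min(fixed) if fixed else None
    width = min(w for w, _ in segments)
    while True:
        target = width + 1
        if cap is not None and target > cap:
            return width  # would need extra external I/Os
        # Wires needed to raise every widenable bottleneck to `target`.
        cost = sum(target - w for w, ok in segments if ok and w < target)
        if cost > extra_wires:
            return width
        width = target


# Hypothetical chains: external segments are 4 bits wide and fixed.
chain_a = [(4, False), (0, True), (6, True), (4, False)]  # one 0-bit bottleneck
chain_b = [(4, False), (1, True), (1, True), (4, False)]  # two 1-bit bottlenecks
for budget in (0, 4):
    print(budget, widened_width(chain_a, budget), widened_width(chain_b, budget))
```

With these hypothetical widths, the second chain is the wider one when no wires may be added, whereas the first chain becomes the wider one with a budget of four wires, because each of its extra wires is spent on a single bottleneck.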

After a Daisychain in a tree has been processed, a corresponding table which records the test lengths versus a range of extra wires can be obtained. Table I shows an example table for Daisychain x, which indicates that Daisychain x can achieve a test length of 200, 190, 190, and 100 clock cycles for 0, 1, 2, and w_max extra wires, respectively. It is observed that, at some stages, more extra wires do not necessarily shorten the test length. This is because the provided TAM width cannot always match the number of scan chains of the die towers perfectly. With this table, it is possible to know the benefit that can be obtained with each extra wire, and adding redundant wires that do not shorten the test length can be avoided.
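
By way of illustration only, identifying the wire counts that actually shorten the test length may be sketched in Python as follows. The function name `useful_wire_counts` is an assumption made for this sketch, and the example assumes w_max = 3 so that the four test lengths quoted above for Daisychain x can be indexed by 0 to 3 extra wires.

```python
def useful_wire_counts(test_lengths):
    """Return the wire counts at which the test length strictly improves.

    test_lengths[n] is the test length (clock cycles) with n extra wires.
    """
    useful = [0]
    best = test_lengths[0]
    for n, length in enumerate(test_lengths[1:], start=1):
        if length < best:
            useful.append(n)
            best = length
    return useful


# Test lengths quoted for Daisychain x, assuming w_max = 3 for illustration.
daisychain_x = [200, 190, 190, 100]
print(useful_wire_counts(daisychain_x))
```

Under that assumption, the second extra wire is redundant because it leaves the test length at 190 clock cycles.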

Table I: test length vs. extra wires for a daisychain

For an entire tree of a certain group, there are many Daisychain TAM configurations, and their corresponding tables can be combined into a single table to show which Daisychain in this group is the best under a specific number of extra wires. Table II shows such an example table for Group y, which consists of a number of die towers. The table shows the respective shortest test lengths that can be achieved for 0, 1, 2, and w_max extra wires, as well as the corresponding Daisychain configuration IDs that achieve such test lengths. These IDs are one-to-one mapped to specific Daisychain TAM configurations. Therefore, once an ID is selected, the exact ordering of die towers inside the Daisychain can be determined. For example, ID #1 may correspond to the Daisychain consisting of Dies 1, 2, and 3 (an ordered sequence), while ID #2 may correspond to another Daisychain consisting of Dies 2, 1, and 3.
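
By way of illustration only, combining the per-Daisychain tables of one group into a single group table may be sketched in Python as follows. The function name `combine_tables`, the dictionary layout, the tie-breaking on the smaller ID, and all numbers in the example are assumptions made for this sketch. The same column-wise minimum can in principle be applied one level up, when selecting among partitions.

```python
def combine_tables(tables):
    """Pick, per number of extra wires, the configuration with the
    shortest test length.

    tables: dict mapping a configuration ID to a list of test lengths,
            indexed by the number of extra wires (all lists equally long).
    Returns a list of (test_length, config_id), one entry per wire count.
    """
    columns = len(next(iter(tables.values())))
    combined = []
    for col in range(columns):
        # Ties are broken in favour of the smaller ID (an arbitrary choice).
        best_id = min(tables, key=lambda cid: (tables[cid][col], cid))
        combined.append((tables[best_id][col], best_id))
    return combined


# Hypothetical tables for two Daisychain configurations of one group.
group_y = {
    1: [200, 190, 190, 100],
    2: [180, 180, 150, 150],
}
print(combine_tables(group_y))
```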

Table II: test lengths vs. extra wires for a group

A number of groups can form a partition, which corresponds to the Hybrid TAM architecture according to embodiments of the present invention. Combining several tables for groups like the one in Table II, the table for a partition can further be created. Table III shows an example table for Partition z. Like the tables shown earlier, the best test lengths are listed for 0, 1, 2, and w_max extra wires. In addition, the corresponding Hybrid TAM configuration ID is also stored in the table. The ID is coded as g.d, in which g indicates the group in the partition and d indicates the Daisychain in the group that can achieve the listed shortest test lengths. For example, Table III indicates that Partition z has three groups, each consisting of a number of die towers. If w_max extra wires are to be added, then the best TAM configuration for Partition z is: Group 1 configured as Daisychain #1; Group 2 configured as Daisychain #12; Group 3 configured as Daisychain #9. This Hybrid TAM configuration can achieve a test length of 120 clock cycles.

Table III: test lengths vs. extra wires for a partition

For each possible partition, there is a corresponding table like the one in Table III. Combining all of those tables, one overall table can finally be created, which shows the best Hybrid TAM configurations under different number of allowed extra wires, as shown in Table IV. In the table, the additional information, Partition ID, indicates the partition that can achieve the shortest test length; the Hybrid TAM configuration ID further specifies the Daisychain configurations of each group. For example, if 5 extra wires are to be added, then the best TAM configuration comes from Partition 2, which consists of two groups. Group 1 is configured as Daisychain #6, while Group 2 is configured as Daisychain #7. The resulting test length is then 140 clock cycles.

Overall

# Extra Wires               0     1     2     3     4     5
Test Length (Clock Cycles)  220   200   170   160   150   140   110
Partition ID                3     1     3     1     1     2     6
Hybrid TAM Config. ID       1.9   1.2   1.9   1.1   1.2   1.6   1.4
                                  2.6         2.2   2.2   2.7   2.6
                                  3.2         3.4   3.4         3.5
                                                                4.1

Table IV: overall table of test lengths vs. extra wires

Once the overall table for the target SIC is obtained, the algorithm automatically selects the TAM configuration that can achieve the shortest test length under the specified number of allowed extra wires. Users can also manually select the TAM configuration according to the information provided by the overall table, so that the solution is most cost-efficient for particular cases.

The foregoing description details certain embodiments of the invention. It will be appreciated, however, that no matter how detailed the foregoing appears in text, the invention may be practiced in many ways. It should be noted that the use of particular terminology when describing certain features or aspects of the invention should not be taken to imply that the terminology is being re-defined herein to be restricted to including any specific characteristics of the features or aspects of the invention with which that terminology is associated.

While the above detailed description has shown, described, and pointed out novel features of the invention as applied to various embodiments, it will be understood that various omissions, substitutions, and changes in the form and details of the device or process illustrated may be made by those skilled in the technology without departing from the invention as defined by the appended claims.

Claims

1.- A semiconductor interposer (40) for stacking on top thereof at least two die towers each comprising at least one die (Die 1, Die 2, Die 3), and for interconnecting the die towers by means of at least functional wires (w₀₁, w₂₁, w₃₂) in the interposer (40), the semiconductor interposer (40) comprising test circuitry for post-bond testing of the dies (Die 1, Die 2, Die 3) and of electrical interconnections between the die towers and the interposer (40),

wherein at least one functional wire in the interposer (40) is re-used as part of the test circuitry.

2.- An interposer (40) according to claim 1, comprising

a primary port (Port 0) to external I/Os, or to a die different from the dies of the die towers to be stacked,

a plurality of secondary ports (Port 1, Port 2, Port 3) for stacking the at least two die towers onto.

3.- An interposer (40) according to claim 2, wherein the test circuitry comprises

(a) one or more extra pins on the interposer's primary port (Port 0), and

(b) extra interconnections from those extra pins through the interposer (40) to the secondary ports (Port 1, Port 2, Port 3).

4. - An interposer (40) according to claim 3, wherein the primary and secondary ports are extended with at least a four-pin Test Access Port, comprising Test Data In (TDI), Test Data Out (TDO), Test Clock (TCK), Test Mode Select (TMS) and optionally Test Reset Not (TRSTN).

5. - An interposer (40) according to any of the previous claims, comprising, besides the functional wires (w₀₁, w₂₁, w₃₂), at least one extra wire to form a connection between the primary port (Port 0) and at least one of the plurality of secondary ports (Port 1, Port 2, Port 3), or between the plurality of secondary ports (Port 1, Port 2, Port 3).

6. - A post-bond test architecture for testing a plurality of active die towers each comprising at least one die (Die 1, Die 2, Die 3), the die towers stacked on top of a semiconductor interposer (40) and interconnected by means of at least functional wires (w₀₁, w₂₁, w₃₂) in the interposer (40), the test architecture comprising a test access mechanism arranged for performing a sequence of tests, the test architecture comprising DfT elements both in the dies (Die 1, Die 2, Die 3) as well as in the interposer (40), wherein at least one functional wire in the interposer (40) is re-used as part of the test access mechanism.

7. - A test architecture according to claim 6, wherein the test access mechanism comprises

a primary port (Port 0) on the interposer (40) to external I/O, or to a die different from the dies (Die 1, Die 2, Die 3) of the die towers, and

a plurality of secondary ports (Port 1, Port 2, Port 3) on the interposer (40) for stacking the at least two die towers onto.

8. - A test architecture according to claim 7, wherein the test access mechanism comprises

(a) one or more extra pins on the interposer's primary port (Port 0),

(b) extra interconnections from those extra pins through the interposer (40) to the secondary ports (Port 1, Port 2, Port 3), and

(c) boundary-scan-like wrapper cells for controllability and/or observability in the dies (Die 1, Die 2, Die 3).

9. - A test architecture according to any of claims 7 or 8, wherein the interposer ports are extended with at least a four-pin Test Access Port, comprising Test Data In (TDI), Test Data Out (TDO), Test Clock (TCK), Test Mode Select (TMS) and optionally Test Reset Not (TRSTN).

10.- A test architecture according to any of claims 7 to 9, comprising, besides the functional wires (w₀₁, w₂₁, w₃₂), at least one extra wire to form a connection between the primary port (Port 0) and at least one of the plurality of secondary ports (Port 1, Port 2, Port 3), or between the plurality of secondary ports (Port 1, Port 2, Port 3).

11.- A test architecture according to any of claims 6 to 10, wherein the active die towers comprise a test wrapper adapted for allowing transit-only data passage.

12. - A test architecture according to claim 11, provided with at least one multiplexer to switch between normal functional mode and transit test mode.

13. - A method for optimizing a basic DfT architecture for post-bond testing of at least two die towers, each comprising at least one die, stacked onto a semiconductor interposer, the method comprising identifying functional interposer interconnects that can be combined and reused during testing in at least one TAM configuration.

14.- Method according to claim 13, wherein identifying functional interposer interconnects includes taking into account visiting a particular die tower for test purposes, and revisiting that die tower for transit-only visits to enable connections to other dies.

15.- Method according to claim 14, the method imposing a partitioning of the die towers over a plurality of TAMs.

16.- Method according to any of claims 14 or 15, wherein a TAM visits all of its dies at least once, but is allowed to visit its dies multiple times, provided that the functional interposer interconnects that constitute the TAM path are used at most once.

17.- Method according to any of claims 13 to 16, wherein the functional interposer interconnects are combined and reused during test in a parallel TAM using a "Hybrid" architecture comprising a plurality of TAMs, and each TAM contains at least one active die tower, such that each die tower is included in exactly one TAM.

18. - Method according to claim 17, wherein the functional interposer interconnects are combined and reused during test in a parallel TAM using a "Distribution" architecture wherein each active die tower has its own private TAM.

19. - Method according to claim 17, wherein the functional interposer interconnects are combined and reused during test in a parallel TAM using a "Daisy-Chain" architecture wherein only one TAM is required, which tours from a primary port of the semiconductor interposer, via all die towers, back to the primary port.

20. - Method according to any of claims 13 to 19, furthermore comprising adding in the interposer at least one die-to-die interposer wire for testing purposes.

21. - Method according to any of claims 13 to 20, furthermore comprising adding at least one die-to-external-I/O and/or vice versa interposer wire for testing purposes.

22.- A method for identifying a TAM configuration that achieves the shortest test length for a given semiconductor interposer architecture, the semiconductor interposer having stacked on top thereof at least two die towers each comprising at least one die, the method comprising

generating all possible partitions for the at least two die towers according to a set of input parameters, and

identifying for each partition the best possible Hybrid TAM for each group of die towers in that partition, a better Hybrid TAM being defined as a TAM of wider width, or, in case of equal width, of less re-visits, a Hybrid TAM comprising at least one TAM, each TAM containing at least one active die tower, such that each die tower is included in exactly one TAM.

23.- A method according to claim 22, wherein each TAM uses a "Distribution" architecture wherein each active die tower has its own private TAM.

24.- A method according to claim 22, wherein each TAM uses a "Daisy-chain" architecture wherein only one TAM is required, which tours from a primary port of the semiconductor interposer, via all die towers, back to the primary port.

25. - A method according to any of claims 22 to 24, wherein generating all possible partitions for k dies according to a set of input parameters includes providing as input parameters a matrix containing the interconnect specification of the interposer in question and a test-length lookup table which provides the test lengths of each die for a range of TAM widths.

26. - A method according to any of claims 22 to 25, wherein identifying for each partition the best possible Hybrid TAM includes identifying a Multi-Visit Hybrid TAM which allows the TAM to visit a same die tower multiple times, once for testing that die and the other one or more times for transit-only visits used to enable interconnections to other dies.

27. - A method according to any of claims 22 to 26, wherein identifying for each partition the best possible Hybrid TAM for each group of die towers in that partition comprises a pruning mechanism for reducing the number of possible Hybrid TAMs to be taken into account for the identification.

28. - A method according to claim 27, wherein the pruning mechanism includes ignoring paths between die towers that do not result in a better TAM.

29. - A method according to claim 28, wherein ignoring paths that do not result in a better TAM exploits the fact that the width of a TAM is determined by its narrowest segment.