CN117542402A - Fault-tolerant repairing method, stacked chip and storage medium - Google Patents

Fault-tolerant repairing method, stacked chip and storage medium

Info

Publication number
CN117542402A
Authority
CN
China
Prior art keywords
silicon channel
repair
wafer
sub
circuit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311464166.4A
Other languages
Chinese (zh)
Inventor
赵毅
李少白
胡坤梅
莫晓霖
陈钰文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual filed Critical Individual
Priority to CN202311464166.4A priority Critical patent/CN117542402A/en
Publication of CN117542402A publication Critical patent/CN117542402A/en
Pending legal-status Critical Current

Links

Classifications

    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11C STATIC STORES
    • G11C29/00 Checking stores for correct operation; Subsequent repair; Testing stores during standby or offline operation
    • G11C29/04 Detection or location of defective memory elements, e.g. cell construction details, timing of test signals
    • G11C29/08 Functional testing, e.g. testing during refresh, power-on self testing [POST] or distributed testing
    • G11C29/12 Built-in arrangements for testing, e.g. built-in self testing [BIST] or interconnection details
    • G11C29/44 Indication or identification of errors, e.g. for repair
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11C STATIC STORES
    • G11C29/00 Checking stores for correct operation; Subsequent repair; Testing stores during standby or offline operation
    • G11C29/04 Detection or location of defective memory elements, e.g. cell construction details, timing of test signals
    • G11C29/08 Functional testing, e.g. testing during refresh, power-on self testing [POST] or distributed testing
    • G11C29/12 Built-in arrangements for testing, e.g. built-in self testing [BIST] or interconnection details
    • G11C29/18 Address generation devices; Devices for accessing memories, e.g. details of addressing circuits
    • G11C29/24 Accessing extra cells, e.g. dummy cells or redundant cells

Abstract

The application provides a fault-tolerant repair method, a stacked chip and a storage medium. The method is applied to a control circuit of a fault-tolerant repair circuit, which further comprises a detection circuit and a plurality of repair circuits. The detection circuit is electrically connected with the silicon channel members of the stacked chip; the input end of each repair circuit is connected between a first wafer of the stacked chip and at least one silicon channel member; the output end of each repair circuit is connected between the first wafer and a second wafer of the stacked chip; and the control circuit is in communication connection with the detection circuit and the repair circuits. The method comprises: grouping the silicon channel members evenly according to a first number; dividing the silicon channel members in each member group into regular silicon channel members and redundant silicon channel members; acquiring state information of the member group sent by the detection circuit to confirm the damaged number of damaged regular silicon channel members in the member group; and controlling the repair circuit to activate the damaged number of redundant silicon channel members. The electrical connection between the first wafer and the second wafer is thereby kept unaffected.

Description

Fault-tolerant repairing method, stacked chip and storage medium
Technical Field
The present disclosure relates to the field of integrated circuits, and in particular, to a fault tolerance repairing method, stacked chips, and a storage medium.
Background
Stacked chips are a research hotspot of current chip packaging technology: stacking a plurality of chips can effectively improve chip performance and reduce the overall cost of the chips. However, owing to the size, density and process of the chips, various defects may occur in the stacked chips when TSV (Through-Silicon Via) technology is used to form the vertical inter-chip interconnection channels during three-dimensional stacking, thereby affecting the electrical connection between the chips.
Disclosure of Invention
The fault-tolerant repair method, the stacked chip and the storage medium provided by the present application aim to solve the problem that, in the existing three-dimensional stacking process, various defects arise in the stacked chips when the vertical interconnection channels between chips are formed using TSV technology, which in turn affects the electrical connection between the chips.
In a first aspect, the present application provides a fault-tolerant repair method applied to a control circuit of a fault-tolerant repair circuit, the method is used for repairing a plurality of silicon channel members of a stacked chip, the stacked chip includes at least one first wafer and one second wafer, the first wafer is internally provided with the plurality of silicon channel members, the first wafer is stacked on the second wafer, the fault-tolerant repair circuit further includes a detection circuit and a plurality of repair circuits, the detection circuit is electrically connected with the silicon channel members, an input end of the repair circuit is connected between the first wafer and at least one silicon channel member, an output end of the repair circuit is connected between the first wafer and the second wafer, and the control circuit is respectively in communication connection with the detection circuit and the repair circuit; the method comprises the following steps:
Grouping a plurality of the silicon channel members in an average manner according to a first number to obtain a first number of member groups;
dividing the silicon channel members within each of said member groups into a first sub-number of regular silicon channel members and a second sub-number of redundant silicon channel members;
acquiring state information of the member group sent by the detection circuit;
confirming, based on the state information, the damaged number of damaged regular silicon channel members in the member group;
controlling the repair circuit to activate the damaged number of redundant silicon channel members.
In some embodiments, the dividing the silicon channel members within each of the member groups into a first sub-number of regular silicon channel members and a second sub-number of redundant silicon channel members comprises: calculating a repair rate of the member group according to the first sub-number and the second sub-number, wherein the formula of the repair rate is as follows:
[formula not reproduced in the source text]
wherein P is the repair rate, N is the number of silicon channel members in the member group, the term whose symbol is lost in the source text corresponds to selecting n failed silicon channel members among the N silicon channel members, and F is the failure rate of a silicon channel member; and updating the first sub-number and the second sub-number so that the repair rate of the member group meets a preset repair condition.
In some embodiments, the updating the first sub-number and the second sub-number includes: acquiring a preset proportion according to the first sub-number and the second sub-number; and updating the preset proportion based on a dichotomy (bisection method) to complete the updating of the first sub-number and the second sub-number.
In some embodiments, the repair circuit includes a plurality of first multiplexing modules and a plurality of second multiplexing modules; the first multiplexing module is connected between the upper surface of the first wafer and the silicon channel member, and the second multiplexing module is connected between the first wafer and the second wafer; after said grouping of the plurality of silicon channel members in average by a first number to obtain said first number of member groups, further comprising: and acquiring the utilization rate of the first multiplexing module and the second multiplexing module in the component group.
In some embodiments, the upper surface of the first wafer includes a plurality of input channels and the upper surface of the second wafer includes a plurality of output channels; the controlling the repair circuit to activate the damaged number of redundant silicon channel members includes: acquiring first channel identifiers of the input channels through which the damaged number of regular silicon channel members are connected with the upper surface of the first wafer; acquiring second channel identifiers of the output channels through which the damaged number of regular silicon channel members are connected with the upper surface of the second wafer; and determining, according to the first channel identifiers and the second channel identifiers, a target input channel and a target output channel that can be used for connection, and electrically connecting the target input channel and the target output channel with the redundant silicon channel members so as to activate the damaged number of redundant silicon channel members.
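The channel-matching step can be illustrated with a minimal Python sketch. The data layout (a dict mapping a member name to an input/output channel identifier pair), the member names and the function name `plan_rerouting` are hypothetical; the application does not specify how identifiers are stored or how a spare is chosen, so the first-come pairing used here is only an assumption.

```python
def plan_rerouting(damaged, spares):
    """Pair each damaged regular member's channels with one spare member.

    damaged: dict mapping a damaged regular member name to its
             (input_channel_id, output_channel_id) pair (hypothetical layout).
    spares:  list of unused redundant member names in the same group.
    Returns a list of (input_channel_id, spare_name, output_channel_id).
    """
    if len(damaged) > len(spares):
        raise ValueError("not enough redundant members in this group")
    plan = []
    for (name, (in_id, out_id)), spare in zip(sorted(damaged.items()), spares):
        # Route the target input channel through the spare member to the
        # target output channel, bypassing the damaged member `name`.
        plan.append((in_id, spare, out_id))
    return plan


routes = plan_rerouting(
    {"tsv3": ("in3", "out3"), "tsv1": ("in1", "out1")},
    ["spare0", "spare1"],
)
```

Each returned triple names the input channel, the redundant member that is activated, and the output channel that the repair circuit would connect.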
In some embodiments, the detection circuit includes a state detection module and a register module, an input end of the state detection module is electrically connected with the silicon channel member, an output end of the state detection module is electrically connected with the register module, and the state detection module is used for outputting a detection signal of the silicon channel member to the register module; the obtaining the state information of the component group sent by the detection circuit includes: and acquiring the detection signals stored by the register modules corresponding to the component groups, and acquiring the state information of the component groups according to the detection signals.
In a second aspect, the present application provides a stacked chip, including:
at least one first wafer, including an upper surface and a lower surface opposite to the upper surface, wherein a plurality of silicon channel members are arranged between the upper surface and the lower surface to form electrical conduction between the upper surface and the lower surface;
the second wafer comprises an upper surface and a lower surface arranged opposite to the upper surface, and the lower surface of the first wafer is stacked on the upper surface of the second wafer;
the fault-tolerant repair circuit comprises a control circuit, a detection circuit and a plurality of repair circuits, wherein the detection circuit is electrically connected with the silicon channel components, one side of an input end of the repair circuit is electrically connected with the upper surface of the first wafer, the other side of the input end of the repair circuit is electrically connected with one side of at least one silicon channel component, one side of an output end of the repair circuit is electrically connected with the other side of the silicon channel components, the other side of the output end of the repair circuit is electrically connected with the second wafer, and the control circuit is respectively in communication connection with the detection circuit and the repair circuit;
The control circuit comprises a processor, a memory and a computer program stored on the memory and executable by the processor, wherein the memory stores a strategy model, and the computer program realizes the fault-tolerant repair method provided by any embodiment of the application when being executed by the processor.
In some embodiments, the repair circuit includes a plurality of first multiplexing modules and a plurality of second multiplexing modules; the first multiplexing module is connected between the upper surface of the first wafer and the silicon channel member, and the second multiplexing module is connected between the first wafer and the second wafer; the detection circuit comprises a state detection module and a registering module, wherein the input end of the state detection module is electrically connected with the silicon channel component, the output end of the state detection module is electrically connected with the registering module, and the state detection module is used for outputting a detection signal of the silicon channel component to the registering module.
In a third aspect, the present application provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements the fault-tolerant repair method provided in any embodiment of the present application.
The present application provides a fault-tolerant repair method, a stacked chip and a storage medium. The method is applied to a control circuit of a fault-tolerant repair circuit and is used for repairing a plurality of silicon channel members of the stacked chip. The stacked chip comprises at least one first wafer and at least one second wafer; the plurality of silicon channel members are arranged in the first wafer, and the first wafer is stacked on the second wafer. The fault-tolerant repair circuit further comprises a detection circuit and a plurality of repair circuits: the detection circuit is electrically connected with the silicon channel members, an input end of the repair circuit is connected between the first wafer and at least one silicon channel member, an output end of the repair circuit is connected between the first wafer and the second wafer, and the control circuit is respectively in communication connection with the detection circuit and the repair circuit. The method comprises: grouping the plurality of silicon channel members evenly according to a first number to obtain a first number of member groups; dividing the silicon channel members within each member group into a first sub-number of regular silicon channel members and a second sub-number of redundant silicon channel members; acquiring state information of the member group sent by the detection circuit; confirming the damaged number of damaged regular silicon channel members in the member group according to the state information; and controlling the repair circuit to activate the damaged number of redundant silicon channel members.
By dividing the silicon channel members into a first number of member groups, dividing the silicon channel members in each member group into regular and redundant silicon channel members, confirming the damaged number of damaged regular silicon channel members from the state information sent by the detection circuit, and finally controlling the repair circuit to activate the same number of redundant silicon channel members, the method ensures that the electrical connection between the stacked first wafer and second wafer is not affected.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic block diagram of a stacked chip according to an embodiment of the present application;
FIG. 2 is a schematic block diagram of another stacked chip provided in an embodiment of the present application;
FIG. 3 is a schematic flow chart of steps of a fault-tolerant repair method provided in an embodiment of the present application;
FIG. 4 is a schematic flow chart of steps of a method for partitioning components provided in an embodiment of the present application;
FIG. 5 is a schematic flow chart of steps of an activation method provided in an embodiment of the present application;
FIG. 6 is a schematic block diagram of a control circuit provided in an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all, of the embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
The flow diagrams depicted in the figures are merely illustrative and not necessarily all of the elements and operations/steps are included or performed in the order described. For example, some operations/steps may be further divided, combined, or partially combined, so that the order of actual execution may be changed according to actual situations.
It is to be understood that the terminology used in the description of the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be understood that, in order to clearly describe the technical solutions of the embodiments of the present application, the words "first", "second", etc. are used in the embodiments of the present application to distinguish identical or similar items having substantially the same function and effect. For example, the first wafer and the second wafer are named only to distinguish between different wafers, and no order of precedence is implied. It will be appreciated by those of skill in the art that the words "first," "second," and the like do not limit the quantity or the order of execution, and that items labeled "first" and "second" are not necessarily different.
It should also be understood that the term "and/or" as used in this specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations.
In order to facilitate understanding of the embodiments of the present application, some of the words referred to in the embodiments of the present application are briefly described below.
1. TSV (through-silicon via) technology: TSVs are vertical vias made between chips and between wafers. TSV technology realizes vertical electrical interconnection through the silicon vias by filling them with conductive substances such as copper, tungsten or polysilicon. It is currently the only vertical electrical interconnection technology and is one of the key technologies for realizing 3D advanced packaging.
The TSV technology has the following advantages:
1) High-density integration: advanced packaging can greatly improve the integration level of electronic components and reduce the geometric size and weight of the package. It overcomes the shortcomings of existing 2D-SiP (system in package) and PoP (package-on-package) systems and meets the requirements of microelectronic products for multifunction and miniaturization.
2) Improved electrical performance: TSV technology can greatly shorten the length of the electrical interconnections, so that problems such as signal delay in SoC (two-dimensional system on chip) technology can be well solved and the electrical performance is improved.
3) Multi-function integration: through TSV interconnection, chips with different functions (such as radio frequency, memory, logic, digital and MEMS chips) can be integrated together to realize multifunctional electronic components.
4) Reducing the manufacturing cost: although the current TSV three-dimensional integration technology is relatively costly in terms of process, manufacturing costs can be reduced at the overall level of components.
2. Bisection method: the bisection method (dichotomy) repeatedly divides an interval into two halves. Let $[a, b]$ be a closed interval of $\mathbb{R}$. Successive bisection creates the interval sequence $([a_n, b_n])$ as follows: $a_0 = a$, $b_0 = b$, and for any natural number $n$, $[a_{n+1}, b_{n+1}]$ equals either $[a_n, c_n]$ or $[c_n, b_n]$, where $c_n$ is the midpoint of $[a_n, b_n]$.
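The interval sequence defined above can be generated with a short Python sketch. The definition only states that each new interval is one of the two halves; the rule used here for choosing the half (keep the half on which a function `f` changes sign) is the usual root-finding convention and is an assumption, as are the function name and the example function.

```python
def bisection_intervals(f, a, b, steps):
    """Return the intervals [a_0, b_0], ..., [a_steps, b_steps]."""
    if f(a) * f(b) > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    intervals = [(a, b)]
    for _ in range(steps):
        c = (a + b) / 2  # midpoint c_n of [a_n, b_n]
        if f(a) * f(c) <= 0:
            b = c  # the sign change lies in [a_n, c_n]
        else:
            a = c  # the sign change lies in [c_n, b_n]
        intervals.append((a, b))
    return intervals


# Example: narrow down the square root of 2 inside [1, 2].
intervals = bisection_intervals(lambda x: x * x - 2, 1.0, 2.0, 3)
```

Each step halves the interval width, so the width after n steps is (b - a) / 2^n.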
Some embodiments of the present application are described in detail below with reference to the accompanying drawings. The following embodiments and features of the embodiments may be combined with each other without conflict.
Stacked chips are a research hotspot of current chip packaging technology: stacking a plurality of chips can effectively improve chip performance and reduce the overall cost of the chips. However, owing to the size, density and process of the chips, various defects arise in the stacked chips when TSV technology is used to form the vertical interconnection channels between the chips in the three-dimensional stacking process, which affects the electrical connection between the chips.
In order to solve the above-mentioned problems, an embodiment of the present application provides a stacked chip. Referring to FIG. 1, FIG. 1 is a schematic block diagram of the structure of the stacked chip provided in the embodiment of the present application. As shown in FIG. 1, the stacked chip 10 includes at least a first wafer 11, a second wafer 12, silicon channel members 13, and a fault-tolerant repair circuit 14. The first wafer 11 includes an upper surface and a lower surface opposite to the upper surface, and a plurality of silicon channel members 13 are disposed between the upper surface and the lower surface to form electrical conduction between the upper surface and the lower surface. The second wafer 12 includes an upper surface and a lower surface disposed opposite to the upper surface, and the lower surface of the first wafer 11 is stacked on the upper surface of the second wafer 12. The fault-tolerant repair circuit 14 includes a control circuit 141, a detection circuit 142, and a plurality of repair circuits 143. The detection circuit 142 is electrically connected with the silicon channel members 13; one side of an input end of the repair circuit 143 is electrically connected with the upper surface of the first wafer 11, and the other side of the input end of the repair circuit 143 is electrically connected with one side of at least one silicon channel member 13; one side of an output end of the repair circuit 143 is electrically connected with the other side of the silicon channel member 13, and the other side of the output end of the repair circuit 143 is electrically connected with the second wafer 12. The control circuit 141 is respectively in communication connection with the detection circuit 142 and the repair circuit 143, and is configured to implement the fault-tolerant repair method provided in any one of the embodiments of the present application.
Specifically, a plurality of first wafers 11 provided with silicon channel members 13 are stacked on the upper surface of the second wafer 12, and the fault-tolerant repair circuit 14 is provided, so that it can be ensured in real time that there are sufficient silicon channel members 13 between the first wafer 11 and the second wafer 12 for effective electrical connection.
In some embodiments, the control circuit 141 includes a microcontroller unit (MCU), which can control the repair circuit 143 to activate the corresponding silicon channel members 13 based on the detection information of the silicon channel members 13 sent by the detection circuit 142, so as to ensure that there are sufficient silicon channel members 13 between the first wafer 11 and the second wafer 12 for effective electrical connection.
It should be noted that in some embodiments, 1 first wafer 11 is stacked on the second wafer 12, and in some embodiments, 2 first wafers 11 are stacked on the second wafer 12. The number of the first wafers 11 is set according to the user's needs, and the embodiment of the present application is not limited.
In some embodiments, please refer to fig. 2, fig. 2 is a schematic block diagram of another stacked chip provided in an embodiment of the present application. The repair circuit 143 includes a plurality of first multiplexing modules 143a and a plurality of second multiplexing modules 143b. The first multiplexing module 143a is connected between the upper surface of the first wafer 11 and the silicon channel members 13 (TSVS in fig. 2 denotes a plurality of silicon channel members 13), and the second multiplexing module 143b is connected between the first wafer 11 and the second wafer 12; the detection circuit 142 includes a state detection module 142a and a register module 142b, wherein an input end of the state detection module 142a is electrically connected to the silicon channel member 13, an output end of the state detection module 142a is electrically connected to the register module 142b, and the state detection module 142a is configured to output a detection signal of the silicon channel member 13 to the register module 142b.
The first multiplexing module 143a and the second multiplexing module 143b are, for example, multiplexers (MUXes). Each of them may be connected to a single silicon channel member 13 in one-to-one fashion, or to a plurality of silicon channel members 13 at the same time. The method provided by the embodiment of the application aims at improving the utilization rate of the first multiplexing module 143a and the second multiplexing module 143b.
For example, the state detection module 142a is a comparator and the register module 142b is a register. The comparator determines whether the signal of the silicon channel member 13 is greater than a threshold voltage; if so, it outputs a high-level detection signal, and otherwise it outputs a low-level detection signal. This helps the control circuit 141 quickly determine whether the silicon channel member 13 is damaged.
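The comparator-plus-register behaviour can be modelled in a few lines of Python. This is only a software sketch of the described hardware: the voltage values, the threshold, the function names, and the convention that a low bit marks a damaged member are all assumptions, since the text does not state which level indicates damage.

```python
def detect_states(voltages, threshold):
    """Model the comparator: 1 (high) if the signal exceeds the threshold."""
    return [1 if v > threshold else 0 for v in voltages]


def damaged_indices(register_bits):
    """Assumption: a low bit (0) in the register marks a damaged member."""
    return [i for i, bit in enumerate(register_bits) if bit == 0]


# Hypothetical measured signals for four silicon channel members, in volts.
bits = detect_states([0.9, 0.1, 0.8, 0.05], threshold=0.5)
bad = damaged_indices(bits)
```

The control circuit would read `bits` from the register and use the count of damaged members to decide how many redundant members to activate.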
The embodiment of the present application provides a stacked chip, by which it is possible to ensure in real time that there are sufficient silicon channel members 13 between the first wafer 11 and the second wafer 12 for effective electrical connection.
Referring to fig. 3, fig. 3 is a schematic flowchart illustrating steps of a fault-tolerant repair method according to an embodiment of the present application. The fault-tolerant repair method is applied to the control circuit of the fault-tolerant repair circuit provided by any embodiment of the application and is used for repairing the silicon channel component of the stacked chip provided by any embodiment of the application.
As shown in FIG. 3, the fault-tolerant repair method specifically includes steps S101 to S105.
S101, carrying out average grouping on a plurality of silicon channel components according to a first number to obtain a first number of component groups.
Specifically, when the first wafer of the provided stacked chip includes a plurality of silicon channel members, the control circuit first groups the silicon channel members evenly according to the first number, in order to facilitate the subsequent division into redundant and regular silicon channel members. For example, when there are 100 silicon channel members and every 10 silicon channel members form one group, 10 member groups are obtained. The control circuit then acquires the state information from the detection circuit for each member group, confirms the number of damaged silicon channel members in each member group, and quickly completes the repair of the silicon channel members to ensure the electrical performance of the stacked wafers.
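The even grouping of step S101 can be sketched as follows, using the 100-member example from the text. Grouping members by contiguous index and the function name `group_members` are assumptions; the application does not say which members end up in which group.

```python
def group_members(member_ids, group_count):
    """Split the members evenly into `group_count` contiguous groups."""
    if group_count <= 0 or len(member_ids) % group_count != 0:
        raise ValueError("members cannot be divided evenly into groups")
    size = len(member_ids) // group_count
    return [member_ids[i * size:(i + 1) * size] for i in range(group_count)]


# 100 silicon channel members, identified here by index, in 10 groups of 10.
groups = group_members(list(range(100)), 10)
```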
S102, dividing the silicon channel components in each component group into a first sub-number of conventional silicon channel components and a second sub-number of redundant silicon channel components.
Specifically, the regular silicon channel members represent silicon channel members that normally operate during use of the stacked chips, and the redundant silicon channel members represent spare silicon channel members during use of the stacked chips. By providing each component group with a corresponding redundant silicon channel component by the control circuit, fault tolerance of the stacked chips can be improved.
Illustratively, the control circuit divides each member group according to a grouping ratio Ngr:Ngs, where Ngr represents the first sub-number and Ngs represents the second sub-number. A multiplexer can then be configured for each member group. The multiplexer ensures transmission from the input signal to the output signal by avoiding defective TSV paths within the group. The second sub-number of redundant silicon channel members and the number of multiplexers used within the member group together determine the multiplexer utilization rate of the member group. The cost of the stacked chip can be minimized by optimizing the first sub-number and the second sub-number.
It should be noted that, even when the total number of redundant TSVs used is the same, different first and second sub-numbers correspond to different repair rates and different multiplexer usage. The method provided herein therefore trades off a high repair rate against the hardware consumption of the multiplexers in order to select the optimal first and second sub-numbers.
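A minimal Python sketch of dividing one member group by a ratio Ngr:Ngs and then steering signals around defective members is given below. The function names, the choice of the last members as spares, and the assumption that the spares themselves are healthy are all illustrative and not taken from the application.

```python
def split_group(group, n_regular, n_redundant):
    """Divide a group into regular members and redundant (spare) members."""
    if n_regular + n_redundant != len(group):
        raise ValueError("Ngr + Ngs must equal the group size")
    return group[:n_regular], group[n_regular:]


def select_active(regular, redundant, damaged):
    """Return the members that carry signals once damaged ones are bypassed."""
    healthy = [m for m in regular if m not in damaged]
    needed = len(regular) - len(healthy)  # damaged number of regular members
    if needed > len(redundant):
        raise ValueError("group cannot be repaired with the available spares")
    # Activate exactly as many spares as there are damaged regular members.
    return healthy + redundant[:needed]


regular, redundant = split_group(list(range(10)), 8, 2)  # Ngr:Ngs = 8:2
active = select_active(regular, redundant, damaged={2, 5})
```

The number of active members always equals Ngr, which mirrors the requirement that the number of signal paths between the wafers stays unchanged after repair.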
In some embodiments, referring to fig. 4, fig. 4 is a schematic flowchart illustrating steps of a method for partitioning components according to an embodiment of the present application.
As shown in FIG. 4, the provided member partitioning method includes steps S102a to S102b.
S102a, calculating the repair rate of the member group according to the first sub-number and the second sub-number, wherein the formula of the repair rate is as follows:
[formula not reproduced in the source text]
wherein P is the repair rate, N is the number of silicon channel members in the member group, the term whose symbol is lost in the source text corresponds to selecting n failed silicon channel members among the N silicon channel members, and F is the failure rate of a silicon channel member.
S102b, updating the first sub-number and the second sub-number so that the repair rate of the member group meets the preset repair condition.
Since TSVs exist in arrays in the actual manufacturing process, a mathematical relationship between the number of TSVs in one block and the final repair rate is established based on probability theory, so that the number of redundant TSVs in a TSV block can be calculated with the repair rate as the constraint.
First, based on the failure rate of a single TSV, the expected number of failed TSVs within a layer can be calculated. Then, based on the expected number of failures of the single-layer TSVs, the number of TSV chains and the limit on the TSV block size are calculated. The repair rate for n failed TSVs within a layer is calculated as shown in the above formula. As long as P is sufficiently large, it can be assumed that the maximum number of failed TSVs is n. A larger P indicates a higher repair rate, and the final goal of the model is to achieve a higher repair rate. The purpose of the repair method provided by the present application is to make the repair rate as large as possible with the minimum number of redundant silicon channel members, that is, to reduce the cost of the stacked chip by reducing the number of redundant silicon channel members.
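The repair rate formula itself is not reproduced in this text. As a hedged illustration, the sketch below assumes that the members fail independently with probability F and that a group is repairable when at most n of its N members fail, that is, P = sum over i from 0 to n of C(N, i) * F^i * (1 - F)^(N - i). This cumulative binomial form is an assumption consistent with the definitions of P, N, n and F above, not a confirmed reproduction of the formula in the application.

```python
from math import comb


def repair_rate(N, n, F):
    """Probability that at most n of N members fail (assumed binomial model)."""
    return sum(comb(N, i) * F**i * (1 - F) ** (N - i) for i in range(n + 1))


# Example: a group of 10 members, tolerating up to 2 failures, F = 1 %.
p_example = repair_rate(10, 2, 0.01)
```

Under this assumed model, P grows with n for fixed N and F, which matches the statement that a sufficiently large P lets n be treated as the maximum number of failures.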
Illustratively, updating the first sub-quantity and the second sub-quantity includes: acquiring a preset proportion according to the first sub-quantity and the second sub-quantity; the preset proportion is updated based on the dichotomy to complete the updating of the first sub-quantity and the second sub-quantity.
The preset proportion of the first sub-number to the second sub-number is updated by the dichotomy (bisection), and the corresponding system repair rate is then calculated using the updated preset proportion. If the updated repair rate meets the preset repair condition, for example is larger than the preset repair rate, the second sub-number is updated to 1/2 of its previous value; if not, no new second sub-number is set, and the final preset proportion of the first sub-number to the second sub-number is obtained.
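A minimal sketch of this halving update is given below. It assumes a cumulative binomial repair rate (at most as many failures as there are redundant members), and it keeps the last second sub-number that still satisfies the condition, which is one possible interpretation of the stopping rule. The function names are hypothetical.

```python
from math import comb


def group_repair_rate(first_sub, second_sub, F):
    # Assumed model: the group is repairable if at most second_sub of its
    # first_sub + second_sub members fail, each independently with rate F.
    N = first_sub + second_sub
    return sum(comb(N, i) * F**i * (1 - F) ** (N - i) for i in range(second_sub + 1))


def halve_second_sub_number(first_sub, second_sub, F, target):
    """Halve second_sub while the halved value still exceeds the target rate."""
    while second_sub > 1:
        candidate = second_sub // 2
        if group_repair_rate(first_sub, candidate, F) > target:
            second_sub = candidate  # condition met: accept the halved value
        else:
            break  # condition not met: stop and keep the previous value
    return second_sub
```

For example, with 8 regular members, 8 initial redundant members and F = 1 %, a preset repair rate of 0.999 lets the search go down to 2 redundant members, while a stricter 0.9999 stops at 4.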
It should be noted that, in some embodiments, the preset proportion may be calculated by a genetic algorithm or by the Gauss-Newton method. The embodiments of the present application do not limit the optimization algorithm used for the preset proportion.
Illustratively, as shown in FIG. 2, the repair circuit includes a plurality of first multiplexing modules and a plurality of second multiplexing modules; the first multiplexing module is connected between the upper surface of the first wafer and the silicon channel member, and the second multiplexing module is connected between the first wafer and the second wafer. After the plurality of silicon channel members are evenly grouped by the first number to obtain the first number of member groups, the method further includes: acquiring the usage rates of the first multiplexing module and the second multiplexing module in the member group. From these usage rates, the usage of the silicon channel members in the current stacked chip can be judged, and the first sub-number and the second sub-number can then be adjusted accordingly.
It should be noted that, in some embodiments, updating the first sub-number and the second sub-number to make the repair rate of the component group meet the preset repair condition includes: updating the first sub-quantity and the second sub-quantity to enable the repair rate of the component group to meet the preset repair condition and the use rate of the component group to meet the preset use condition.
The grouping ratio is selected by a search algorithm, such as the dichotomy. First, the maximum allowed multiplexer usage Cmax is set. Starting from an initial total redundancy ratio of the silicon channel members Rinit = 100%, the multiplexer usage is calculated for the grouping ratios of that redundancy ratio (for example 1:1, 2:2, 3:3), the grouping ratio whose usage is closest to Cmax, namely the ratio of the first sub-number to the second sub-number, is found, and the search for new grouping ratios within 100% stops. The system repair rate corresponding to this ratio of the first sub-number to the second sub-number is then calculated. If the repair rate meets the preset repair condition, for example is larger than the preset repair rate, the new total redundancy ratio Rnew is set to 1/2 of the previous value, the multiplexer consumption is calculated for the grouping ratios belonging to 50% (for example 2:1, 4:2, 8:4, ...), the grouping ratio closest to Cmax is found, and the search for new grouping ratios within Rnew (50%) stops. The system repair rate corresponding to this ratio is calculated again; if it is larger than the preset repair rate, the new total redundancy ratio Rnew is again set to 1/2 of the previous value; if not, no new total redundancy ratio of the silicon channel members is set, and the final first sub-number and second sub-number are obtained. This reduces the cost of the stacked chip and improves its performance.
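The search described above can be sketched as follows. Several parts are assumptions made only for illustration: the repair rate uses a cumulative binomial model, the multiplexer usage model `toy_mux_usage` is a made-up placeholder (the application does not give a cost formula), and the limits `max_gs` and `min_ratio` are hypothetical bounds that keep the search finite.

```python
from fractions import Fraction
from math import comb


def repair_rate(gr, gs, F):
    # Assumed model: at most gs of the gr + gs members fail independently.
    N = gr + gs
    return sum(comb(N, i) * F**i * (1 - F) ** (N - i) for i in range(gs + 1))


def search_grouping(mux_usage, c_max, F, target, max_gs=4, min_ratio=Fraction(1, 16)):
    """Halve the total redundancy ratio while the repair rate stays above target."""
    ratio = Fraction(1)  # Rinit = 100 %
    best = None
    while ratio >= min_ratio:
        # Grouping ratios gr:gs that share the current redundancy ratio gs/gr.
        candidates = [(int(gs / ratio), gs) for gs in range(1, max_gs + 1)]
        # Pick the grouping whose multiplexer usage is closest to Cmax.
        gr, gs = min(candidates, key=lambda c: abs(mux_usage(*c) - c_max))
        if repair_rate(gr, gs, F) <= target:
            break  # condition not met: keep the previous result
        best = (gr, gs)
        ratio /= 2  # Rnew = 1/2 of the previous redundancy ratio
    return best


def toy_mux_usage(gr, gs):
    # Hypothetical cost model: each regular signal needs a (gs + 1)-input mux.
    return gr * (gs + 1)
```

With `toy_mux_usage`, Cmax = 12 and F = 1 %, the sketch visits 3:3, then 4:2, then 4:1, and stops when 8:1 no longer meets a preset repair rate of 0.998.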
S103, acquiring state information of the component group sent by the detection circuit.
Specifically, the control circuit tests each of the conventional and redundant silicon channel members in the member group by acquiring, through the detection circuit, state information such as whether the voltages of the conventional and redundant silicon channel members are higher than a threshold voltage. It can thereby be confirmed whether any damaged conventional silicon channel member exists in the member group, so that the stacked chip can be repaired in time when one does.
In some embodiments, referring to fig. 2, the detection circuit includes a state detection module and a register module, an input end of the state detection module is electrically connected with the silicon channel member, an output end of the state detection module is electrically connected with the register module, and the state detection module is used for outputting a detection signal of the silicon channel member to the register module. Acquiring the state information of the member group sent by the detection circuit includes: acquiring the detection signals stored by the register module corresponding to the member group, and obtaining the state information of the member group according to the detection signals.
Through the state detection module and the register module, a comparator can judge whether the signal of the silicon channel member 3 is larger than the threshold voltage; if so, the detection signal is output as a high-level signal, and if not, as a low-level signal. This helps the control circuit quickly determine whether the silicon channel member is damaged.
Illustratively, a NAND gate, which has a preset threshold voltage, is employed as the state detection module. The control circuit drives one input of the NAND gate to a high level (the high level does not affect the output of the NAND gate, so the output is determined by the output end of the silicon channel member) and applies an input signal to the end of the silicon channel member connected with the upper surface of the first wafer. Different states of the silicon channel member result in different voltage signals at the output of the NAND gate. The input signal passes through the silicon channel member (with RC delay) and reaches the second chip. By observing the output of the NAND gate, the voltage values at the head and tail ends of the silicon channel member can be compared: if the voltage value is higher than the preset threshold voltage of the NAND gate, the output state information is a low-level signal, for example test_result=0 (no defect); otherwise test_result=1 (the NAND gate outputs a high level, and the silicon channel member is defective).
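The NAND-gate detection rule can be modelled at the logic level as in the sketch below. The function names and the voltage values are hypothetical; the sketch only reproduces the stated behavior that a far-end voltage above the threshold yields test_result=0 and anything else yields test_result=1.

```python
def nand(a, b):
    """Two-input NAND gate on boolean levels."""
    return not (a and b)


def detect_member(far_end_voltage, threshold_voltage):
    """Return 0 for a good member and 1 for a defective one."""
    # One NAND input is tied high by the control circuit, so the output
    # depends only on whether the member delivers a logic-high level.
    member_level_high = far_end_voltage > threshold_voltage
    return int(nand(True, member_level_high))


good = detect_member(far_end_voltage=1.0, threshold_voltage=0.6)
bad = detect_member(far_end_voltage=0.1, threshold_voltage=0.6)
```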
S104, confirming the damaged quantity of the damaged conventional silicon channel components in the component group according to the state information.
Specifically, based on the state information of each conventional and redundant silicon channel member in the member group tested by the detection circuit, the control circuit confirms the number of damaged conventional silicon channel members in the member group as the damaged number. If the damaged conventional silicon channel members are not repaired in time, the corresponding channels in the stacked chip remain electrically disconnected. Confirming the damaged number ensures that the provided method can locate the damaged conventional silicon channel members in time.
S105, controlling the repair circuit to activate the damaged number of redundant silicon channel components, and completing the repair of the silicon channel components of the stacked chips.
Specifically, when the control circuit determines that a conventional silicon channel member in a member group of the stacked chip has failed, a corresponding number of redundant silicon channel members in the member group need to be activated in time by the repair circuit. In the method, the silicon channel members are divided into a first number of member groups, the silicon channel members in each member group are divided into a first sub-number of conventional silicon channel members and a second sub-number of redundant silicon channel members, the damaged number of damaged conventional silicon channel members in the member group is confirmed according to the state information of the member group sent by the detection circuit, and finally the repair circuit is controlled to activate the damaged number of redundant silicon channel members. This ensures that the electrical connections formed when the first wafer and the second wafer are stacked are not affected.
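Steps S104 and S105 can be sketched together as counting the damaged conventional members and pairing each with a working redundant member. The function name `plan_repair`, the use of 1 to mark a damaged member, and the in-order pairing are assumptions for illustration.

```python
def plan_repair(regular_states, redundant_states):
    """Map each damaged regular member to a healthy redundant member.

    States use 0 for a good member and 1 for a damaged one (assumed encoding).
    Returns None when the group has too few healthy redundant members.
    """
    damaged = [i for i, s in enumerate(regular_states) if s == 1]
    spares = [j for j, s in enumerate(redundant_states) if s == 0]
    if len(damaged) > len(spares):
        return None  # damaged number exceeds the usable redundancy
    return dict(zip(damaged, spares))


plan = plan_repair([0, 1, 0, 1], [0, 0])
```

Here the damaged number is 2, so two redundant members are activated; a group with more damaged members than healthy spares cannot be repaired under this scheme.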
In some embodiments, the upper surface of the first wafer includes a plurality of input channels and the upper surface of the second wafer includes a plurality of output channels; referring to fig. 5, fig. 5 is a schematic flowchart illustrating steps of an activation method according to an embodiment of the present application.
Referring to fig. 5, the provided activation method includes steps S105a to S105c.
S105a, acquiring first channel identifiers of the input channels through which the damaged number of conventional silicon channel members are connected with the upper surface of the first wafer.
S105b, acquiring second channel identifiers of the output channels through which the damaged number of conventional silicon channel members are connected with the upper surface of the second wafer.
S105c, determining a target input channel and a target output channel which can be used for connection according to the first channel identification and the second channel identification, and electrically connecting the target input channel and the target output channel with the redundant silicon channel components so as to activate the damaged number of the redundant silicon channel components.
By arranging an electrical transfer channel for the silicon channel members in the detection circuit, for example a path module, the control circuit can send control signals to the path module according to the identification information of the silicon channel members, for example the 5th input channel and the 3rd output channel used when the corresponding redundant silicon channel member is activated. The path module steers the input signals away from the paths corresponding to defective silicon channel members and selects good target input channels and target output channels, planning a path from the input signal through the redundant silicon channel member to the output signal. This prevents redundant silicon channel members from being activated on damaged channels.
Illustratively, the defective conventional silicon channel members are repaired according to the detection results in the silicon channel member state register group. After the detection circuit stores the detection result of each silicon channel member in the group into the state register group, each silicon channel member is repaired. The repair requires a total number of clock cycles equal to the first sub-number plus the second sub-number. The repair module further includes a path module composed of a first sub-number of multiplexers. The control signals of the multiplexers are given by a control module, which consists of a signal line count enable module, a failed silicon channel member accumulator, and a fault-tolerant overflow comparator. First, the state of the first silicon channel member enters the signal line count enable module; if the state is 0, the latch chain is enabled and refreshes its latched value with the output of the failed silicon channel member accumulator (if the state of the first silicon channel member is 0, the output of the accumulator is 00). The values of the silicon channel member state register group enter the control circuit in sequence, and each submodule outputs its value according to these rules. The repair of the silicon channel members can then be completed.
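A behavioral sketch of this control module is given below. It is an interpretation, not the circuit in the application: it assumes that each multiplexer select value is the number of failed members counted so far (the accumulator output), that signal i is then routed through member i + select, and that the overflow comparator flags a group with more failures than redundant members. The name `mux_selects` is hypothetical.

```python
def mux_selects(states, first_sub):
    """Scan member states (0 good, 1 failed) once, one member per clock cycle.

    Returns (selects, overflow): selects[i] is the shift applied to signal i,
    so signal i uses member i + selects[i]; overflow is True when the group
    cannot carry all first_sub signals.
    """
    failed = 0     # failed silicon channel member accumulator
    selects = []   # latched select values, one per multiplexer
    for state in states:
        if state == 0:
            if len(selects) < first_sub:
                selects.append(failed)  # latch the current accumulator value
        else:
            failed += 1
    overflow = len(selects) < first_sub
    return selects, overflow


selects, overflow = mux_selects([0, 1, 0, 0, 0, 0], first_sub=4)
```

In this example the second member has failed, so the first signal keeps its own member (select 0) and the remaining three signals each shift by one member, consuming one of the two redundant members.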
The application provides a fault-tolerant repair method, which is characterized in that silicon channel components are divided into a first number of component groups, then the silicon channel components in the component groups are divided into a first sub-number of conventional silicon channel components and a second sub-number of redundant silicon channel components, so that the damage number of the damaged conventional silicon channel components in the component groups is confirmed according to state information of the component groups sent by a detection circuit, and finally a repair circuit is controlled to activate the damaged number of the redundant silicon channel components. Thereby ensuring that electrical connections during the stacking of the first and second wafers are not affected.
The application provides a fault-tolerant repair method which is applied to a control circuit of a fault-tolerant repair circuit. As shown in fig. 6, fig. 6 is a schematic block diagram of a control circuit according to an embodiment of the present application.
The control circuit may include a processor, a memory, and a network interface, among others. The processor, memory and network interface are connected by a system bus, such as an I2C (Inter-integrated Circuit) bus.
Specifically, the processor may be a Micro-controller unit (MCU), a central processing unit (Central Processing Unit, CPU), a digital signal processor (Digital Signal Processor, DSP), or the like.
Specifically, the memory may be a Flash chip, a read-only memory (ROM) disk, an optical disk, a USB flash drive (U-disk), a removable hard disk, or the like.
The network interface is used for network communication such as transmitting assigned tasks and the like. It will be appreciated by those skilled in the art that the structure shown in fig. 6 is merely a block diagram of a portion of the structure associated with the present application and is not limiting of the control circuitry to which the present application is applied, and that a particular control circuitry may include more or fewer components than shown, or may combine certain components, or have a different arrangement of components.
The processor is configured to run a computer program stored in the memory, and implement corresponding steps in any fault-tolerant repair method provided in the embodiments of the present application when the computer program is executed.
Illustratively, the provided control circuit is configured to perform the following steps.
Evenly grouping a plurality of silicon channel members by a first number to obtain the first number of member groups.
Dividing the silicon channel members within each of the member groups into a first sub-number of conventional silicon channel members and a second sub-number of redundant silicon channel members.
Acquiring the state information of the member group sent by the detection circuit.
Confirming, according to the state information, the damaged number of damaged conventional silicon channel members in the member group.
Controlling the repair circuit to activate the damaged number of redundant silicon channel members, thereby completing the repair of the silicon channel members of the stacked chip.
In some embodiments, the dividing the silicon channel members within each of the member groups into a first sub-number of regular silicon channel members and a second sub-number of redundant silicon channel members comprises: calculating a repair rate of the component group according to the first sub-number and the second sub-number; wherein, the formula of the repair rate is as follows:
wherein P is the repair rate, N is the number of silicon channel members in the member group, the combination term (selecting n silicon channel members among the N silicon channel members) gives the corresponding failure probability, and F is the failure rate of a silicon channel member; updating the first sub-number and the second sub-number so that the repair rate of the member group meets a preset repair condition.
In some embodiments, the updating the first sub-quantity and the second sub-quantity includes: acquiring a preset proportion according to the first sub-quantity and the second sub-quantity; updating the preset proportion based on a dichotomy to complete updating of the first sub-quantity and the second sub-quantity.
In some embodiments, the repair circuit includes a plurality of first multiplexing modules and a plurality of second multiplexing modules; the first multiplexing module is connected between the upper surface of the first wafer and the silicon channel member, and the second multiplexing module is connected between the first wafer and the second wafer; after said grouping of the plurality of silicon channel members in average by a first number to obtain said first number of member groups, further comprising: and acquiring the utilization rate of the first multiplexing module and the second multiplexing module in the component group.
In some embodiments, the upper surface of the first wafer includes a plurality of input channels and the upper surface of the second wafer includes a plurality of output channels; the controlling the repair circuit to activate the damaged number of redundant silicon channel members includes: acquiring first channel identifications of input channels of the damaged number of conventional silicon channel members connected with the upper surface of the first wafer; obtaining second channel identifiers of output channels of the damaged number of conventional silicon channel components connected with the upper surface of the second wafer; and determining a target input channel and a target output channel which can be used for connection according to the first channel identifier and the second channel identifier, and electrically connecting the target input channel and the target output channel with the redundant silicon channel components so as to activate the damaged number of redundant silicon channel components.
In some embodiments, the detection circuit includes a state detection module and a register module, an input end of the state detection module is electrically connected with the silicon channel member, an output end of the state detection module is electrically connected with the register module, and the state detection module is used for outputting a detection signal of the silicon channel member to the register module; the obtaining the state information of the component group sent by the detection circuit includes: and acquiring the detection signals stored by the register modules corresponding to the component groups, and acquiring the state information of the component groups according to the detection signals.
It should be noted that, for convenience and brevity of description, specific working processes of the control circuit described above may refer to corresponding processes in the foregoing fault-tolerant repair method embodiment, and are not described herein again.
An embodiment of the present application further provides a computer readable storage medium storing a computer program, where the computer program includes program instructions, and a processor executes the program instructions to implement the steps of the fault-tolerant repair method provided in the foregoing embodiments. For example, when the computer program is loaded by a processor, the following steps may be performed:
Evenly grouping a plurality of silicon channel members by a first number to obtain the first number of member groups.
Dividing the silicon channel members within each of the member groups into a first sub-number of conventional silicon channel members and a second sub-number of redundant silicon channel members.
Acquiring the state information of the member group sent by the detection circuit.
Confirming, according to the state information, the damaged number of damaged conventional silicon channel members in the member group.
Controlling the repair circuit to activate the damaged number of redundant silicon channel members, thereby completing the repair of the silicon channel members of the stacked chip.
The specific implementation of each operation above may be referred to the previous embodiments, and will not be described herein.
The computer readable storage medium may be an internal storage unit of the computer device of the foregoing embodiment, for example, a hard disk or a memory of the computer device. The computer readable storage medium may also be an external storage device of a computer device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), etc. provided on the computer device.
Because the computer program stored in the computer readable storage medium can execute any fault-tolerant repair method provided by the embodiments of the present application, it can achieve the beneficial effects of any such method; for details, see the previous embodiments, which are not repeated here.
The foregoing embodiment numbers of the present application are merely for description and do not represent the advantages or disadvantages of the embodiments. While the invention has been described with reference to certain preferred embodiments, it will be understood by those skilled in the art that various changes and equivalent substitutions may be made without departing from the scope of the invention. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A fault-tolerant repair method, characterized in that the method is applied to a control circuit of a fault-tolerant repair circuit and is used for repairing a plurality of silicon channel members of a stacked chip; the stacked chip comprises at least one first wafer and a second wafer, the plurality of silicon channel members are arranged in the first wafer, and the first wafer is stacked on the second wafer; the fault-tolerant repair circuit further comprises a detection circuit and a plurality of repair circuits, the detection circuit is electrically connected with the silicon channel members, the input end of each repair circuit is connected between the first wafer and at least one silicon channel member, the output end of each repair circuit is connected between the first wafer and the second wafer, and the control circuit is communicatively connected with the detection circuit and the repair circuits respectively; the method comprises the following steps:
evenly grouping the plurality of silicon channel members by a first number to obtain the first number of member groups;
dividing the silicon channel members within each of said member groups into a first sub-number of regular silicon channel members and a second sub-number of redundant silicon channel members;
acquiring state information of the component group sent by the detection circuit;
confirming the number of damages of the conventional silicon channel members damaged in the member group based on the state information;
and controlling the repair circuit to activate the damaged number of redundant silicon channel members to finish the repair of the silicon channel members of the stacked chips.
2. The method of claim 1, wherein the dividing the silicon channel members within each of the member groups into a first sub-number of regular silicon channel members and a second sub-number of redundant silicon channel members comprises:
calculating a repair rate of the component group according to the first sub-number and the second sub-number; wherein, the formula of the repair rate is as follows:
wherein P is the repair rate, N is the number of silicon channel members in the member group, the combination term (selecting n silicon channel members among the N silicon channel members) gives the corresponding failure probability, and F is the failure rate of the silicon channel members;
updating the first sub-number and the second sub-number to enable the repair rate of the component group to meet a preset repair condition.
3. The method of claim 2, wherein the updating the first sub-quantity and the second sub-quantity comprises:
acquiring a preset proportion according to the first sub-quantity and the second sub-quantity;
updating the preset proportion based on a dichotomy to complete updating of the first sub-quantity and the second sub-quantity.
4. The method of claim 2, wherein the repair circuit comprises a plurality of first multiplexing modules and a plurality of second multiplexing modules; the first multiplexing module is connected between the upper surface of the first wafer and the silicon channel member, and the second multiplexing module is connected between the first wafer and the second wafer; after said grouping of the plurality of silicon channel members in average by a first number to obtain said first number of member groups, further comprising:
and acquiring the utilization rate of the first multiplexing module and the second multiplexing module in the component group.
5. The method of claim 4, wherein updating the first sub-quantity and the second sub-quantity such that the repair rate of the group of components meets a preset repair condition comprises:
Updating the first sub-number and the second sub-number so that the repair rate of the component group meets a preset repair condition and the use rate of the component group meets a preset use condition.
6. The method of claim 1, wherein the upper surface of the first wafer comprises a plurality of input channels and the upper surface of the second wafer comprises a plurality of output channels; the controlling the repair circuit to activate the damaged number of redundant silicon channel members includes:
acquiring first channel identifications of input channels of the damaged number of conventional silicon channel members connected with the upper surface of the first wafer;
obtaining second channel identifiers of output channels of the damaged number of conventional silicon channel components connected with the upper surface of the second wafer;
and determining a target input channel and a target output channel which can be used for connection according to the first channel identifier and the second channel identifier, and electrically connecting the target input channel and the target output channel with the redundant silicon channel components so as to activate the damaged number of redundant silicon channel components.
7. The method of claim 1, wherein the detection circuit comprises a state detection module and a register module, an input of the state detection module is electrically connected to the silicon channel member, an output of the state detection module is electrically connected to the register module, and the state detection module is configured to output a detection signal of the silicon channel member to the register module; the obtaining the state information of the component group sent by the detection circuit includes:
And acquiring the detection signals stored by the register modules corresponding to the component groups, and acquiring the state information of the component groups according to the detection signals.
8. A stacked chip, comprising:
at least one first wafer, including an upper surface and a lower surface opposite to the upper surface, wherein a plurality of silicon channel members are arranged between the upper surface and the lower surface to form electrical conduction between the upper surface and the lower surface;
the second wafer comprises an upper surface and a lower surface arranged opposite to the upper surface, and the lower surface of the first wafer is stacked on the upper surface of the second wafer;
the fault-tolerant repair circuit comprises a control circuit, a detection circuit and a plurality of repair circuits, wherein the detection circuit is electrically connected with the silicon channel components, one side of an input end of the repair circuit is electrically connected with the upper surface of the first wafer, the other side of the input end of the repair circuit is electrically connected with one side of at least one silicon channel component, one side of an output end of the repair circuit is electrically connected with the other side of the silicon channel components, the other side of the output end of the repair circuit is electrically connected with the second wafer, and the control circuit is respectively in communication connection with the detection circuit and the repair circuit;
Wherein the control circuit comprises a processor, a memory, and a computer program stored on the memory and executable by the processor, the memory storing a policy model, wherein the computer program when executed by the processor implements the fault tolerant repair method of any one of claims 1 to 7.
9. The stacked chip of claim 8, wherein the repair circuit comprises a plurality of first multiplexing modules and a plurality of second multiplexing modules; the first multiplexing module is connected between the upper surface of the first wafer and the silicon channel member, and the second multiplexing module is connected between the first wafer and the second wafer; the detection circuit comprises a state detection module and a registering module, wherein the input end of the state detection module is electrically connected with the silicon channel component, the output end of the state detection module is electrically connected with the registering module, and the state detection module is used for outputting a detection signal of the silicon channel component to the registering module.
10. A computer readable storage medium, characterized in that the computer readable storage medium has stored thereon a computer program which, when executed by a processor, causes the processor to implement the fault-tolerant repair method according to any one of claims 1 to 7.
CN202311464166.4A 2023-11-06 2023-11-06 Fault-tolerant repairing method, stacked chip and storage medium Pending CN117542402A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311464166.4A CN117542402A (en) 2023-11-06 2023-11-06 Fault-tolerant repairing method, stacked chip and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311464166.4A CN117542402A (en) 2023-11-06 2023-11-06 Fault-tolerant repairing method, stacked chip and storage medium

Publications (1)

Publication Number Publication Date
CN117542402A (en) 2024-02-09

Family

ID=89791025

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311464166.4A Pending CN117542402A (en) 2023-11-06 2023-11-06 Fault-tolerant repairing method, stacked chip and storage medium

Country Status (1)

Country Link
CN (1) CN117542402A (en)

Similar Documents

Publication Publication Date Title
CN107431061B (en) Method and circuit for communication in multi-die package
Jiang et al. On effective through-silicon via repair for 3-D-stacked ICs
Jiang et al. Yield enhancement for 3D-stacked memory by redundancy sharing across dies
US9804221B2 (en) Configurable vertical integration
EP3104277B1 (en) Mixed redundancy scheme for inter-die interconnects in a multichip package
US9666562B2 (en) 3D integrated circuit
US8754704B2 (en) Through-silicon via self-routing circuit and routing method thereof
Lo et al. Architecture of ring-based redundant TSV for clustered faults
WO2018175634A1 (en) Semiconductor layered device with data bus
CN112562767B (en) On-chip software definition interconnection network device and method
CN102592647B (en) Semiconductor device, the method distributing chip id and the method that chip id is set
WO2010000625A1 (en) Microprocessor interface with dynamic segment sparing and repair
US20190229095A1 (en) Integrated wafer-level processing system
Wang et al. A new cellular-based redundant TSV structure for clustered faults
Grecu et al. NoC interconnect yield improvement using crosspoint redundancy
CN117542402A (en) Fault-tolerant repairing method, stacked chip and storage medium
US7159047B2 (en) Network with programmable interconnect nodes adapted to large integrated circuits
JP2022548603A (en) Redundancy Scheme for Multichip Stacked Devices
CN103986672A (en) Method and system for reconstructing on-chip network topological structure
Ouyang et al. A TSV fault-tolerant scheme based on failure classification in 3D-NoC
Hsieh et al. Fault-tolerant mesh for 3D network on chip
CN115171748A (en) Stack structure, memory device and chip gating method
JP7239099B2 (en) TSV Error Tolerant Router Device for 3D Network-on-Chip
Maity et al. A cost-effective repair scheme for clustered TSV defects in 3D ICs
Abdallah et al. A low-overhead fault tolerant technique for TSV-based interconnects in 3D-IC systems

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination