CN116256620B - Chiplet integrated chip detection method and device, electronic equipment and storage medium - Google Patents


Publication number
CN116256620B
Authority
CN
China
Prior art keywords
fault
type
register
core particle
core
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310538211.XA
Other languages
Chinese (zh)
Other versions
CN116256620A (en
Inventor
王嘉诚
张少仲
张栩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongcheng Hualong Computer Technology Co Ltd
Original Assignee
Zhongcheng Hualong Computer Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhongcheng Hualong Computer Technology Co Ltd filed Critical Zhongcheng Hualong Computer Technology Co Ltd
Priority to CN202310538211.XA priority Critical patent/CN116256620B/en
Publication of CN116256620A publication Critical patent/CN116256620A/en
Application granted granted Critical
Publication of CN116256620B publication Critical patent/CN116256620B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G11INFORMATION STORAGE
    • G11CSTATIC STORES
    • G11C29/00Checking stores for correct operation ; Subsequent repair; Testing stores during standby or offline operation
    • G11C29/56External testing equipment for static stores, e.g. automatic test equipment [ATE]; Interfaces therefor
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01RMEASURING ELECTRIC VARIABLES; MEASURING MAGNETIC VARIABLES
    • G01R31/00Arrangements for testing electric properties; Arrangements for locating electric faults; Arrangements for electrical testing characterised by what is being tested not provided for elsewhere
    • G01R31/28Testing of electronic circuits, e.g. by signal tracer
    • G01R31/2851Testing of integrated circuits [IC]
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The embodiment of the invention relates to the technical field of integrated circuits, in particular to a detection method and device of a Chiplet integrated chip, electronic equipment and a storage medium. The method comprises the following steps: obtaining a core particle grouping result of the Chiplet integrated chip; respectively carrying out fault detection on each core particle group to determine fault core particles; checking a first type register and a second type register in the fault core particle respectively to determine a fault area of the fault core particle according to a checking result; it is determined whether the fault area contains a built-in micro control unit to determine the fault point. The scheme not only can improve the detection efficiency of the Chiplet integrated chip, but also can improve the position accuracy of the fault point.

Description

Chiplet integrated chip detection method and device, electronic equipment and storage medium
Technical Field
The embodiment of the invention relates to the technical field of integrated circuits, in particular to a detection method and device for a Chiplet (core particle) integrated chip, electronic equipment and a storage medium.
Background
A Chiplet integrated chip is formed from a plurality of core particles (dies) connected by high-speed serial interfaces, and each core particle in turn includes a plurality of chiplets with different functions.
Each core particle, which is composed of a plurality of chiplets, needs to be tested after the Chiplet integrated chip fails or before it is put into use. The existing approach first tests the integrated chip formed by all the core particles and then tests each core particle individually, which is inefficient.
Therefore, a new detection method for Chiplet integrated chips is needed.
Disclosure of Invention
In order to solve the problem that the existing detection method of the Chiplet integrated chip is low in efficiency, the embodiment of the invention provides a detection method, a detection device, electronic equipment and a storage medium of the Chiplet integrated chip.
In a first aspect, an embodiment of the present invention provides a method for detecting a Chiplet integrated chip, including:
obtaining a core particle grouping result of the Chiplet integrated chip;
respectively carrying out fault detection on each core particle group to determine fault core particles;
checking a first type register and a second type register in the fault core particle respectively to determine a fault area of the fault core particle according to a checking result; the first type of registers are registers whose values do not flip (change) while the chip is running, and the second type of registers are readable and writable registers;
and determining whether the fault area contains a built-in micro control unit or not so as to conduct fault troubleshooting on the fault area and determine a fault point.
In a second aspect, an embodiment of the present invention further provides a detection apparatus for a Chiplet integrated chip, including:
the acquisition unit is used for acquiring a core particle grouping result of the Chiplet integrated chip;
the detection unit is used for respectively carrying out fault detection on each core particle group and determining fault core particles;
the verification unit is used for verifying the first type register and the second type register in the fault core particle respectively so as to determine a fault area of the fault core particle according to a verification result; the first type of registers are registers whose values do not flip (change) while the chip is running, and the second type of registers are readable and writable registers;
and the troubleshooting unit is used for determining whether the fault area contains a built-in micro control unit or not so as to troubleshoot the fault area and determine a fault point.
In a third aspect, an embodiment of the present invention further provides an electronic device, including a memory and a processor, where the memory stores a computer program, and when the processor executes the computer program, the method described in any embodiment of the present specification is implemented.
In a fourth aspect, embodiments of the present invention also provide a computer-readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform a method according to any of the embodiments of the present specification.
The embodiment of the invention provides a detection method, a detection device, electronic equipment and a storage medium of a Chiplet integrated chip, wherein the detection method, the detection device, the electronic equipment and the storage medium firstly acquire a core particle grouping result of the Chiplet integrated chip; then, respectively carrying out fault detection on each core particle group to determine fault core particles; finally, after the fault core particle is determined, the first type register and the second type register in the fault core particle can be respectively checked, so that a fault area of the fault core particle is determined according to a checking result; and determining whether the fault area contains a built-in micro control unit or not so as to conduct fault investigation on the fault area and determine a fault point. Therefore, the scheme not only can improve the detection efficiency of the Chiplet integrated chip, but also can improve the position accuracy of the fault point.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a detection method for a Chiplet integrated chip according to an embodiment of the present invention;
FIG. 2 is a hardware architecture diagram of an electronic device according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a detection apparatus for a Chiplet integrated chip according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments, and all other embodiments obtained by those skilled in the art without making any inventive effort based on the embodiments of the present invention are within the scope of protection of the present invention.
As mentioned above, each core particle, which is made up of a plurality of chiplets, needs to be tested after the Chiplet integrated chip fails or before it is used. The existing approach first tests the integrated chip formed by all the core particles and then tests each core particle individually, which is inefficient.
To solve this technical problem, the inventors propose first grouping the core particles in the Chiplet integrated chip and then testing each core particle group separately, which greatly improves detection efficiency compared with testing each core particle individually. In addition, after a faulty core particle is determined, each first-type register and each second-type register it contains can be checked separately to determine the fault area of the faulty core particle from the check results; it is then determined whether the fault area contains a built-in micro control unit, so that the fault area can be troubleshot and the fault point determined.
Specific implementations of the above concepts are described below.
Referring to FIG. 1, an embodiment of the present invention provides a method for detecting a Chiplet integrated chip, including:
step 100: obtaining a core particle grouping result of the Chiplet integrated chip;
step 102: respectively carrying out fault detection on each core particle group to determine fault core particles;
step 104: checking a first type register and a second type register in the fault core particle respectively to determine a fault area of the fault core particle according to a checking result; the first type of registers are registers whose values do not flip (change) while the chip is running, and the second type of registers are readable and writable registers;
step 106: and determining whether the fault area contains a built-in micro control unit or not so as to conduct fault investigation on the fault area and determine a fault point.
In the embodiment of the invention, firstly, a core particle grouping result of a Chiplet integrated chip is obtained; then, respectively carrying out fault detection on each core particle group to determine fault core particles; finally, after the fault core particle is determined, the first type register and the second type register in the fault core particle can be respectively checked, so that a fault area of the fault core particle is determined according to a checking result; and determining whether the fault area contains a built-in micro control unit or not so as to conduct fault investigation on the fault area and determine a fault point. Therefore, the scheme not only can improve the detection efficiency of the Chiplet integrated chip, but also can improve the position accuracy of the fault point.
The manner in which the individual steps shown in fig. 1 are performed is described below.
For step 100:
in some embodiments, the core particles of the Chiplet integrated chip may be grouped as follows:
determining a failure index of each core particle in the Chiplet integrated chip;
taking each core particle whose failure index is greater than a first threshold as a core particle group on its own;
partitioning the remaining core particles into regions according to the positions, in the Chiplet integrated chip, of the core particles whose failure index is greater than the first threshold;
for each region, performing: placing the core particles in the current region whose failure index is less than a second threshold into one core particle group, and calculating the correlation of the remaining ungrouped core particles so as to group them according to the correlation; wherein the second threshold is less than the first threshold.
In some embodiments, the failure index may be calculated by the following formula:
[Formula image SMS_1: not reproduced in this text]
where F is the failure index of the current core particle. The other quantities in the formula, whose symbols appear only as images in the source, are, in order: the importance level of the current core particle; the failure probability of the current core particle; the influence coefficient of the first-type registers; the number of first-type registers; the influence coefficient of the second-type registers; the number of second-type registers; the influence coefficient of the micro control unit; and the number of micro control units built into the current core particle.
In the embodiment of the present invention, the importance level of each core particle in the Chiplet integrated chip may be predetermined and classified as first level (relatively important), second level (moderately important) or third level (generally important). If the importance level of a core particle is the first level, the importance-level term in the formula takes the value 1; likewise, it takes the value 2 for the second level and 3 for the third level. It can be seen that the more important a core particle is, the greater its failure index, and the more likely it is to form a core particle group on its own or to be placed in a group containing as few core particles as possible.
The influence coefficients of the first-type registers, the second-type registers and the micro control unit on the operation of the core particle can be set empirically. In this embodiment, the numbers of first-type registers, second-type registers and micro control units also need to be considered: when these numbers are large, fault detection is difficult, so such a core particle should as far as possible form a group on its own or be placed in a group with fewer core particles. The source states a condition on the three influence coefficients at this point, but its symbols (image placeholders SMS_18, SMS_10 and SMS_13) are not reproduced in this text. By calculating the failure index with this formula, the core particles can be grouped with comprehensive consideration of how likely each core particle is to fail, its importance level, and the numbers of its first-type registers, second-type registers and micro control units, which can greatly improve the efficiency of subsequent troubleshooting.
Then, each core particle whose failure index is greater than the first threshold is placed in a core particle group of its own, and the positions of these core particles in the Chiplet integrated chip serve as dividing lines for partitioning the remaining core particles into regions. In this way, core particles that are physically close and exchange data directly can be placed in the same group, which improves the efficiency and accuracy of subsequent fault detection: adjacent core particles are easy to connect in series for testing, and core particles with direct data interaction may already be connected, which saves considerable time.
In this embodiment, even after partitioning, a region may still contain many core particles. Putting all of them into a single core particle group would make the probability that the group contains a fault too high, which makes it harder to locate the faulty core particle afterwards. Therefore, the core particles in each region whose failure index is less than the second threshold can be placed in one core particle group, and the correlation of the remaining ungrouped core particles is calculated so that they can be grouped according to the correlation, which improves the efficiency and accuracy of subsequent detection.
In this embodiment, the correlation of the remaining ungrouped core particles may be determined from the distance between core particles, whether they exchange data, the frequency of that data interaction, and its traffic volume. Core particles that are close to each other and exchange data directly are thereby placed in the same group, which improves the efficiency and accuracy of subsequent fault detection: adjacent core particles are easy to connect in series for testing, and core particles with direct data interaction may already be connected, which saves considerable time.
For step 102:
In this step, the core particles in each core particle group are electrically connected to one another, and a test signal is input to the first core particle of each group. Only the output signal of the last core particle in each group then needs to be checked to determine whether the group contains a fault. This removes the need to test the other core particles separately, which simplifies the procedure and greatly improves detection efficiency.
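As a rough illustration of this chained test, the sketch below models each core particle as a function from an input signal to an output signal. This modelling, the integer signals, and the names `group_has_fault` and `find_faulty_groups` are assumptions for illustration; the source does not specify how the expected output is obtained, so here it is simply passed in.

```python
from typing import Callable, List, Sequence

# Hypothetical model: a core particle maps an input signal to an output signal.
CoreParticle = Callable[[int], int]


def group_has_fault(group: Sequence[CoreParticle], test_signal: int, expected: int) -> bool:
    """Feed the test signal to the first core particle and check only the
    output of the last one in the chain."""
    signal = test_signal
    for core in group:
        signal = core(signal)
    return signal != expected


def find_faulty_groups(
    groups: Sequence[Sequence[CoreParticle]],
    test_signal: int,
    expected_outputs: Sequence[int],
) -> List[int]:
    """Return the indexes of the groups whose final output is wrong."""
    return [
        i
        for i, (group, expected) in enumerate(zip(groups, expected_outputs))
        if group_has_fault(group, test_signal, expected)
    ]


def healthy(x: int) -> int:
    return (x + 1) & 0xFF  # stand-in behaviour of a working core particle


def stuck_at_zero(x: int) -> int:
    return 0  # stand-in behaviour of a faulty core particle


groups = [[healthy, healthy, healthy], [healthy, stuck_at_zero, healthy]]
faulty = find_faulty_groups(groups, test_signal=1, expected_outputs=[4, 4])
print(faulty)
```

Only groups flagged here would need their core particles examined further, which is where the efficiency gain over per-core-particle testing comes from.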
For step 104:
After the faulty core particle is determined, each chiplet of the faulty core particle needs to be screened for faults to determine the fault area.
The configuration registers commonly found in a chip are mainly of two types: first-type registers, whose values do not change during chip operation once they have been configured, and second-type registers, whose values are rewritten when the dedicated wr_en write enable signal is valid. The fault area can therefore be determined by checking the first-type and second-type registers contained in the faulty core particle.
In some embodiments, the first type of register is verified by the following verification scheme:
reading the value of each first-type register in every clock cycle of a set clock period;
comparing, in each clock cycle, the value of each first-type register with its pre-acquired initial configuration value;
if the values are the same in every clock cycle of the set clock period, determining that the first-type register has no fault;
if the values differ in at least one clock cycle of the set clock period, determining that the first-type register is faulty.
In this embodiment, since the value of a first-type register does not change after configuration, its value can be compared with its initial configuration value in each clock cycle of the set clock period. If the two are the same in every clock cycle, the first-type register is considered fault-free; if they differ in at least one clock cycle, the first-type register is determined to be faulty.
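The first-type register check can be sketched as follows. The data layout (one dictionary of register values per sampled clock cycle) and the register names are assumptions made for illustration only.

```python
from typing import Dict, List


def check_first_type_registers(
    initial: Dict[str, int],
    samples: List[Dict[str, int]],
) -> Dict[str, bool]:
    """Return, for each first-type register, True if it is faulty.

    `initial` holds the initial configuration values; `samples` holds the
    values read back in each clock cycle of the set clock period.
    """
    faulty = {name: False for name in initial}
    for cycle_values in samples:
        for name, init_value in initial.items():
            # A single mismatch in any cycle marks the register as faulty.
            if cycle_values[name] != init_value:
                faulty[name] = True
    return faulty


result = check_first_type_registers(
    initial={"CFG0": 0x01, "CFG1": 0xFF},
    samples=[
        {"CFG0": 0x01, "CFG1": 0xFF},
        {"CFG0": 0x01, "CFG1": 0xFE},  # CFG1 flipped in the second cycle
    ],
)
print(result)
```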
In some embodiments, the second type of register is verified by the following verification scheme:
acquiring a check standard table for the second-type registers; the check standard table contains the initial configuration value of each second-type register and the theoretical value of each second-type register in each target clock cycle;
in each target clock cycle, performing:
reading the actual value of each second-type register in the current target clock cycle, and writing the actual value into the check standard table;
determining whether the faulty core particle generates a valid write enable signal in the current target clock cycle;
and determining the check result of each second-type register based on the actual value of each second-type register in the current target clock cycle, whether a valid write enable signal was generated, the theoretical value of each second-type register for the current target clock cycle in the check standard table, and the actual value of each second-type register in the previous target clock cycle.
In this embodiment, a check standard table may be determined in advance; it contains the initial configuration value of each second-type register and the theoretical value of each second-type register while the faulty core particle is running. Checking starts after every second-type register has been configured. In the first target clock cycle, the actual value of each second-type register is read and written into the check standard table, and it is determined whether the faulty core particle generates a valid write enable signal in that cycle. The check result of each second-type register for the first target clock cycle is then determined from its actual value in the first target clock cycle, whether a valid write enable signal was generated, its theoretical value for that cycle in the check standard table, and its actual value in the previous target clock cycle (which, for the first cycle, is its initial configuration value). In the second target clock cycle the same steps are repeated: the actual values are read and written into the table, the write enable signal is checked, and the check result is determined from the actual value in the second target clock cycle, the write enable result, the theoretical value for that cycle, and the actual value in the previous target clock cycle (i.e. the actual value in the first target clock cycle). Proceeding in the same way yields the check results of every second-type register for all target clock cycles within the set clock period.
In some embodiments, the step of determining the verification result of each second type register based on the actual value of each second type register in the current target clock cycle, the generated result of the write enable signal, the theoretical value of each second type register in the verification standard table in the current target clock cycle, and the actual value of each second type register in the last target clock cycle may include:
for each second type of register, performing:
judging whether the actual value of the second-type register in the current target clock cycle is consistent with its theoretical value for the current target clock cycle in the check standard table;
if they are consistent, determining that the second-type register has no fault in the current target clock cycle;
if they are inconsistent, then: if the actual value of the second-type register in the current target clock cycle differs from its actual value in the previous target clock cycle and no valid write enable signal is generated, determining that the second-type register has a suspected fault in the current target clock cycle; otherwise, determining that the second-type register is faulty in the current target clock cycle.
In this embodiment, the following is performed for each second-type register. First, it is determined whether the actual value of the register in the current target clock cycle is consistent with its theoretical value for that cycle in the check standard table; if so, the register has no fault in the current target clock cycle. If not, the actual value in the current target clock cycle is compared with the actual value in the previous target clock cycle, and it is determined whether a valid write enable signal was generated. If the two actual values differ and no valid write enable signal was generated in the current target clock cycle, the register flipped abnormally in this cycle for a reason other than the write enable signal. Such a flip may be caused by abnormal discharge and can only be recovered by restarting or reconfiguring the chip, so the check result is a suspected fault. In all other cases, the register is determined to be faulty in the current target clock cycle. These other cases are: 1) the actual value in the current target clock cycle is the same as in the previous target clock cycle, and a valid write enable signal is generated; 2) the actual value is the same as in the previous target clock cycle, and no valid write enable signal is generated; 3) the actual value differs from that of the previous target clock cycle, and a valid write enable signal is generated. The remaining combination, in which the actual value differs from that of the previous target clock cycle and no valid write enable signal is generated, is the suspected-fault case described above.
After the check results of every first-type register and every second-type register have been determined, any second-type register with a suspected fault must be tested again on its own after the chip has been restarted and reconfigured, to determine whether it is truly faulty. The fault area can then be determined from the faulty first-type and second-type registers.
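The per-cycle decision rule for a second-type register described above can be sketched in Python as follows. The `Result` enum, the function names and the list-based representation of the check standard table are assumptions for illustration; the restart-and-retest of suspected faults is not modelled.

```python
from enum import Enum
from typing import List


class Result(Enum):
    OK = "no fault"
    SUSPECTED = "suspected fault"
    FAULT = "fault"


def classify(actual: int, theoretical: int, previous_actual: int,
             write_enable_valid: bool) -> Result:
    """Classify one second-type register in one target clock cycle."""
    if actual == theoretical:
        return Result.OK
    # Value changed without a valid write enable: abnormal flip, suspected fault.
    if actual != previous_actual and not write_enable_valid:
        return Result.SUSPECTED
    return Result.FAULT


def check_register(initial: int, theoretical: List[int], actual: List[int],
                   write_enables: List[bool]) -> List[Result]:
    """Check one register over consecutive target clock cycles.

    For the first cycle, the "previous actual value" is the initial
    configuration value.
    """
    results = []
    previous = initial
    for theo, act, wr in zip(theoretical, actual, write_enables):
        results.append(classify(act, theo, previous, wr))
        previous = act
    return results


flip = check_register(initial=0, theoretical=[0, 5, 5], actual=[0, 5, 7],
                      write_enables=[False, True, False])
missed_write = check_register(initial=0, theoretical=[0, 5], actual=[0, 0],
                              write_enables=[False, True])
print([r.value for r in flip], [r.value for r in missed_write])
```

In the first example the register changes in the third cycle without a write enable, so it is only suspected; in the second, a write was enabled but the value never changed, which the rule treats as a fault.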
For step 106:
Because the chiplets in the faulty core particle communicate with one another, when a chiplet in the middle of the data path fails, the chiplets downstream of it receive erroneous data and compute on that data, so the downstream chiplets may also be identified as part of the fault area. Troubleshooting is therefore still needed to confirm the specific fault point and improve the accuracy of its location.
In some embodiments, when the fault area contains a built-in micro control unit, the fault point is determined by performing fault investigation on the fault area in the following manner:
for each micro control unit, performing: receiving, by an intermediate module, test data of the current micro control unit from the SPI (Serial Peripheral Interface) interface of the chiplet containing the current micro control unit, and performing protocol conversion on the test data; one end of the intermediate module is connected to the SPI master device interface of the chiplet, and the other end is connected to a USB interface of a PC;
receiving, by the PC, the protocol-converted test data output by the intermediate module, so as to determine from the protocol-converted test data whether the micro control unit is a fault point;
and troubleshooting the parts of the fault area other than the micro control unit to determine all fault points.
In this embodiment, a chiplet containing a micro control unit is usually tested with a dedicated emulator, but designing such an emulator is very time-consuming and labor-intensive. Connecting the chiplet to a PC for testing is an alternative, but the chiplet has no USB interface and so cannot be connected to the PC directly. An intermediate module can therefore be provided to perform protocol conversion on the test data of the chiplet's micro control unit and send it to the PC, which determines whether the micro control unit is a fault point.
In the conventional approach, a chiplet containing a micro control unit outputs its test data through an RS232 serial port, but the transmission rate of the serial port is too low for real-time processing. The SPI interface of the chiplet is usually connected to a flash memory to configure and initialize the chiplet, and it is usually idle once configuration and initialization are complete. The SPI interface can therefore be used to connect to the intermediate module, which ensures an adequate transmission rate.
Thus, when the fault area contains a built-in micro control unit, the following is performed for each chiplet in the fault area that contains a micro control unit: the SPI master device interface of the current chiplet is connected to the intermediate module, and the other end of the intermediate module is connected to the USB interface of the PC; the intermediate module performs protocol conversion on the test data of the current chiplet's micro control unit and sends it to the PC, which determines whether the micro control unit is a fault point. After the micro control units have been checked, the remaining parts of the fault area are troubleshot to determine the final fault points.
In an embodiment of the present invention, the intermediate module may at least include the following two structures:
Structure 1: the intermediate module comprises an FPGA chip.
Structure 2: the intermediate module comprises an FPGA chip and a CYUSB3014 chip.
The two structures are described below.
First, the first structure will be described.
In the first structure, the FPGA chip includes an SPI parsing unit and a USB conversion unit. The SPI parsing unit emulates an SPI slave device interface, so that the FPGA chip acts as an SPI slave device; the USB conversion unit then converts the test data parsed by the SPI parsing unit into the USB protocol. A single FPGA chip thus performs the whole protocol conversion from the SPI slave device interface to the USB interface. The SPI slave device emulated by the FPGA chip is connected to the SPI interface of the chiplet together with a flash chip, and the other end of the FPGA chip is connected to the PC through a USB interface. Furthermore, because the USB interface and the SPI interface run at different rates, a buffer must be provided between the SPI parsing unit and the USB conversion unit to buffer the data.
In the second structure, the intermediate module performs protocol conversion in the following manner:
receiving test data by using an FPGA chip, and converting the test data from SPI protocol data into GPIF II (General Programmable Interface II) protocol data;
and receiving the test data after protocol conversion by using a CYUSB3014 chip, and performing secondary protocol conversion on the test data after protocol conversion to convert the test data into USB protocol data.
In this embodiment, an SPI slave device is emulated by an FPGA chip and is connected to the SPI interface of the chiplet together with a flash chip. Because the PC has no SPI interface and cannot be connected directly to an SPI device, this embodiment uses a CYUSB3014 chip to provide a USB port for connection to the PC. The FPGA chip is connected to the CYUSB3014 chip through the GPIF II programmable parallel interface provided by the CYUSB3014. GPIF II is a parallel bus interface standard used to connect the CYUSB3014 chip to the FPGA chip; its bus width is at most 32 bits and its operating frequency is at most 100 MHz, giving a maximum bandwidth of 3.2 Gbps. The CYUSB3014 chip does have an SPI interface, but it is an SPI master device (Master) interface and therefore cannot be connected directly to the SPI master device interface of the chiplet. Thus, the present invention uses an FPGA chip to emulate an SPI slave device connected to the chiplet: one end of the FPGA chip emulates an SPI slave device and is connected to the chiplet, and the other end is connected to the CYUSB3014 chip through the GPIF II programmable parallel interface. The software in the FPGA chip therefore needs to perform the protocol conversion between the SPI interface and the GPIF II interface. Furthermore, because the transmission rates of the SPI interface and the GPIF II interface differ, a buffer needs to be provided between the SPI parsing unit and the GPIF II conversion unit for data buffering.
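The quoted maximum bandwidth follows directly from the stated bus width and operating frequency, assuming one full-width transfer per clock cycle (an assumption of this check, not a statement from the text):

$$
32\ \text{bit} \times 100\ \text{MHz} = 32 \times 100 \times 10^{6}\ \text{bit/s} = 3.2 \times 10^{9}\ \text{bit/s} = 3.2\ \text{Gbps}
$$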
As shown in fig. 2 and fig. 3, an embodiment of the present invention provides a detection device for a Chiplet integrated chip. The device embodiments may be implemented by software, by hardware, or by a combination of hardware and software. In terms of hardware, fig. 2 is a hardware architecture diagram of the electronic device in which the detection device for a Chiplet integrated chip provided by an embodiment of the present invention is located; in addition to the processor, memory, network interface and nonvolatile memory shown in fig. 2, the electronic device in which the device of this embodiment is located may include other hardware, such as a forwarding chip responsible for processing messages. Taking a software implementation as an example, as shown in fig. 3, the device in the logical sense is formed by the CPU of the electronic device in which it is located reading the corresponding computer program from the nonvolatile memory into the memory and running it.
As shown in fig. 3, a detection device for a Chiplet integrated chip provided in this embodiment includes:
an obtaining unit 301, configured to obtain a core particle grouping result of the Chiplet integrated chip;
a detection unit 302, configured to perform fault detection on each core particle group, and determine a faulty core particle;
a verification unit 303, configured to verify the first type register and the second type register in the fault core particle respectively, so as to determine a fault area of the fault core particle according to a verification result; the first type of registers are registers whose values do not flip while the chip is running, and the second type of registers are readable and writable registers;
and the troubleshooting unit 304 is used for determining whether the fault area contains a built-in micro control unit or not so as to troubleshoot the fault area and determine a fault point.
In one embodiment of the present invention, when the fault area includes the built-in micro control unit, the troubleshooting unit 304 performs troubleshooting on the fault area to determine a fault point by:
for each micro control unit, performing: receiving, by using an intermediate module, test data of the current micro control unit from the serial peripheral interface (SPI) of the chiplet containing the current micro control unit, and performing protocol conversion on the test data; one end of the intermediate module is connected to the SPI master device interface of the chiplet, and the other end of the intermediate module is connected to the USB interface of the PC;
the PC is used for receiving the test data after protocol conversion output by the intermediate module, so as to determine whether the micro control unit is a fault point according to the test data after protocol conversion;
and performing fault investigation on other fault areas except the micro control unit to determine all fault points.
In one embodiment of the present invention, in the troubleshooting unit 304, the intermediate module performs protocol conversion by:
receiving test data by using an FPGA chip, and converting the test data from SPI protocol data to GPIF II protocol data;
and receiving the test data after protocol conversion by using a CYUSB3014 chip, and performing secondary protocol conversion on the test data after protocol conversion to convert the test data into USB protocol data.
In one embodiment of the present invention, for the obtaining unit 301, the Chiplet integrated chip is grouped into core particle groups by:
determining a failure index of each core particle in the Chiplet integrated chip;
taking each core particle whose failure index is greater than a first threshold as a separate core particle group;
partitioning the remaining core particles into regions according to the positions, in the Chiplet integrated chip, of the core particles whose failure index is greater than the first threshold;
for each region, performing: dividing the core particles in the current region whose failure index is less than a second threshold into one core particle group, and calculating the correlation of the remaining ungrouped core particles so as to group them according to the correlation; wherein the second threshold is less than the first threshold.
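The grouping steps above can be sketched as follows. The text does not say how regions are derived from the positions of the high-index core particles or how correlation is measured, so this sketch assumes a nearest-neighbour rule (Manhattan distance) for partitioning and a caller-supplied `correlated` predicate with greedy grouping. The `Core` type, the function name and both rules are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Core:
    name: str
    x: int                 # assumed grid position on the package
    y: int
    failure_index: float   # F, computed elsewhere


def group_cores(cores: List[Core], t1: float, t2: float,
                correlated: Callable[[Core, Core], bool]) -> List[List[Core]]:
    if not t2 < t1:
        raise ValueError("the second threshold must be less than the first")
    # Each core above the first threshold forms its own group.
    high = [c for c in cores if c.failure_index > t1]
    groups = [[c] for c in high]
    rest = [c for c in cores if c.failure_index <= t1]

    # Assumed partitioning rule: a core belongs to the region of the nearest high-index core.
    regions: Dict[int, List[Core]] = {}
    for c in rest:
        key = 0
        if high:
            key = min(range(len(high)),
                      key=lambda i: abs(c.x - high[i].x) + abs(c.y - high[i].y))
        regions.setdefault(key, []).append(c)

    for members in regions.values():
        # Cores below the second threshold in a region share one group.
        low = [c for c in members if c.failure_index < t2]
        if low:
            groups.append(low)
        # Assumed correlation rule: join the first group whose first member is correlated.
        mid_groups: List[List[Core]] = []
        for c in members:
            if c.failure_index < t2:
                continue
            for g in mid_groups:
                if correlated(g[0], c):
                    g.append(c)
                    break
            else:
                mid_groups.append([c])
        groups.extend(mid_groups)
    return groups
```

Isolating the highest-index core particles first means the most likely fault sources are tested on their own, while low-index core particles can be screened together in a single pass.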
In one embodiment of the present invention, in the obtaining unit 301, the failure index is calculated by the following formula:
[Formula: Figure SMS_19]
wherein F is the failure index of the current core particle; [symbol: Figure SMS_20] is the importance level of the current core particle; [symbol: Figure SMS_23] is the failure probability of the current core particle; [symbol: Figure SMS_25] is the influence coefficient of the first type of register; [symbol: Figure SMS_22] is the number of registers of the first type; [symbol: Figure SMS_24] is the influence coefficient of the second type of register; [symbol: Figure SMS_26] is the number of registers of the second type; [symbol: Figure SMS_27] is the influence coefficient of the micro control unit; and [symbol: Figure SMS_21] is the number of micro control units built into the current core particle.
In one embodiment of the present invention, in the verification unit 303, the first type of register is verified by the following verification method:
reading the value of each first-type register in each clock cycle within a set period;
comparing, in each clock cycle, the value of each first-type register with the pre-acquired initial configuration value of that register;
if the values are the same throughout the set period, determining that the first-type register has no fault;
if the values differ within the set period, determining that the first-type register has failed.
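The first-type register check above amounts to comparing every sampled value against the initial configuration. A minimal sketch follows, assuming the register values have already been read out as one dictionary per clock cycle; the function name, the register names and the data layout are hypothetical.

```python
def check_first_type_registers(initial, samples):
    """Return {register name: True if faulty} for registers that should never change.

    initial: dict mapping register name to its initial configuration value.
    samples: iterable of dicts, one per sampled clock cycle, with the same keys.
    """
    faulty = {name: False for name in initial}
    for snapshot in samples:
        for name, expected in initial.items():
            if snapshot[name] != expected:
                faulty[name] = True  # a single mismatch marks the register as failed
    return faulty


initial_values = {"CFG0": 0x01, "CFG1": 0xFF}
sampled = [
    {"CFG0": 0x01, "CFG1": 0xFF},
    {"CFG0": 0x01, "CFG1": 0xFE},  # CFG1 flipped a bit in this cycle
    {"CFG0": 0x01, "CFG1": 0xFF},
]
result = check_first_type_registers(initial_values, sampled)
```

In this example `CFG1` is reported as failed even though it returns to its initial value afterwards, because a register of the first type is not expected to change at any point during operation.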
In one embodiment of the present invention, the second type of register in the verification unit 303 is verified by the following verification method:
acquiring a check standard table of the second type register; the check standard table comprises the initial configuration value of each second-type register and the theoretical value of each second-type register in each target clock cycle;
in each target clock cycle, performing:
reading the actual value of each second-type register in the current target clock cycle, and writing the actual value into the check standard table;
determining whether the fault core particle generates a valid write enable signal in the current target clock cycle;
and determining the verification result of each second-type register based on the actual value of each second-type register in the current target clock cycle, the generation result of the write enable signal, the theoretical value of each second-type register in the current target clock cycle in the check standard table, and the actual value of each second-type register in the last target clock cycle.
In one embodiment of the present invention, when determining the verification result of each second-type register based on the actual value of each second-type register in the current target clock cycle, the generation result of the write enable signal, the theoretical value of each second-type register in the current target clock cycle in the check standard table, and the actual value of each second-type register in the last target clock cycle, the verification unit 303 is configured to perform the following:
for each second type of register, performing:
judging whether the actual value of the second-type register in the current target clock cycle is consistent with the theoretical value of the second-type register in the current target clock cycle in the check standard table;
if they are consistent, determining that the second-type register has no fault in the current target clock cycle;
if they are inconsistent, and the actual value of the second-type register in the current target clock cycle differs from its actual value in the last target clock cycle while no write enable signal has been generated, determining that the second-type register has a suspected fault in the current target clock cycle; otherwise, determining that the second-type register has failed in the current target clock cycle.
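The three-way decision for a second-type register can be written as a small function. This is one reading of the translated text, in which the branch for a value that changed without a write enable applies only after the comparison with the theoretical value has failed; the `Verdict` enum and the function name are hypothetical.

```python
from enum import Enum


class Verdict(Enum):
    OK = "no fault"
    SUSPECTED = "suspected fault"
    FAULT = "fault"


def check_second_type_register(actual, theoretical, previous_actual, write_enable):
    """Classify one readable and writable register for the current target clock cycle.

    actual:          value read in the current target clock cycle
    theoretical:     expected value from the check standard table
    previous_actual: value read in the last target clock cycle
    write_enable:    True if a valid write enable signal was generated
    """
    if actual == theoretical:
        return Verdict.OK
    # Mismatch with the table: the value changed although nothing was written.
    if actual != previous_actual and not write_enable:
        return Verdict.SUSPECTED
    return Verdict.FAULT
```

For example, a register that reads `0x10` when the table expects `0x20`, previously read `0x20`, and saw no write enable is classified as a suspected fault under this reading.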
It will be appreciated that the structure illustrated in the embodiments of the present invention does not constitute a specific limitation on the detection device for a Chiplet integrated chip. In other embodiments of the invention, the detection device for a Chiplet integrated chip may include more or fewer components than illustrated, or certain components may be combined, certain components may be split, or the components may be arranged differently. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
Since the information exchange and execution processes between the modules of the above device are based on the same concept as the method embodiments of the present invention, reference may be made to the description of the method embodiments for the specific content, which is not repeated here.
The embodiment of the invention also provides electronic equipment, which comprises a memory and a processor, wherein the memory stores a computer program, and when the processor executes the computer program, the detection method of the Chiplet integrated chip in any embodiment of the invention is realized.
The embodiment of the invention also provides a computer readable storage medium, and the computer readable storage medium is stored with a computer program, when the computer program is executed by a processor, the processor is caused to execute the detection method of the Chiplet integrated chip in any embodiment of the invention.
Specifically, a system or apparatus may be provided with a storage medium on which software program code realizing the functions of any of the above embodiments is stored, and a computer (or CPU or MPU) of the system or apparatus may be caused to read out and execute the program code stored in the storage medium.
In this case, the program code itself read from the storage medium may realize the functions of any of the above-described embodiments, and thus the program code and the storage medium storing the program code form part of the present invention.
Examples of the storage medium for providing the program code include a floppy disk, a hard disk, a magneto-optical disk, an optical disk (e.g., CD-ROM, CD-R, CD-RW, DVD-ROM, DVD-RAM, DVD-RW, DVD+RW), a magnetic tape, a nonvolatile memory card, and a ROM. Alternatively, the program code may be downloaded from a server computer via a communication network.
Further, it should be apparent that the functions of any of the above-described embodiments may be implemented not only by executing the program code read out by the computer, but also by causing an operating system or the like operating on the computer to perform part or all of the actual operations based on the instructions of the program code.
Further, it is understood that the program code read out from the storage medium may be written into a memory provided in an expansion board inserted into a computer or into a memory provided in an expansion module connected to the computer, and then a CPU or the like mounted on the expansion board or the expansion module may be caused to perform part or all of the actual operations based on instructions of the program code, thereby realizing the functions of any of the above embodiments.
It is noted that relational terms such as first and second, and the like, are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises an element.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (8)

1. The detection method of the Chiplet integrated chip is characterized by comprising the following steps of:
obtaining a core particle grouping result of the Chiplet integrated chip;
respectively carrying out fault detection on each core particle group to determine fault core particles;
verifying a first type register and a second type register in the fault core particle respectively, to determine a fault area of the fault core particle according to a verification result; the first type of registers are registers whose values do not flip while the chip is running, and the second type of registers are readable and writable registers;
determining whether the fault area contains a built-in micro control unit or not so as to conduct fault troubleshooting on the fault area and determine a fault point;
the Chiplet integrated chip is grouped into core particle groups in the following way:
determining a failure index of each core particle in the Chiplet integrated chip;
taking each core particle whose failure index is greater than a first threshold as a separate core particle group;
partitioning the remaining core particles into regions according to the positions, in the Chiplet integrated chip, of the core particles whose failure index is greater than the first threshold;
for each region, performing: dividing the core particles in the current region whose failure index is less than a second threshold into one core particle group, and calculating the correlation of the remaining ungrouped core particles so as to group them according to the correlation; wherein the second threshold is less than the first threshold;
the failure index is calculated by the following formula:
[Formula: Figure QLYQS_1]
wherein F is the failure index of the current core particle; [symbol: Figure QLYQS_4] is the importance level of the current core particle; [symbol: Figure QLYQS_6] is the failure probability of the current core particle; [symbol: Figure QLYQS_8] is the influence coefficient of the first type of register; [symbol: Figure QLYQS_3] is the number of registers of the first type; [symbol: Figure QLYQS_5] is the influence coefficient of the second type of register; [symbol: Figure QLYQS_7] is the number of registers of the second type; [symbol: Figure QLYQS_9] is the influence coefficient of the micro control unit; and [symbol: Figure QLYQS_2] is the number of micro control units built into the current core particle.
2. The method according to claim 1, wherein when the fault area contains a built-in micro control unit, the fault point is determined by performing fault investigation on the fault area in the following manner:
for each micro control unit, performing: receiving, by using an intermediate module, test data of the current micro control unit from the serial peripheral interface (SPI) of the chiplet containing the current micro control unit, and performing protocol conversion on the test data; one end of the intermediate module is connected to the SPI master device interface of the chiplet, and the other end of the intermediate module is connected to the USB interface of the PC;
the PC is utilized to receive the test data after protocol conversion output by the intermediate module, so as to determine whether the micro control unit is a fault point according to the test data after protocol conversion;
and performing fault investigation on other fault areas except the micro control unit to determine all fault points.
3. The method of claim 2, wherein the intermediate module performs protocol conversion by:
receiving the test data by using an FPGA chip, and converting the test data from SPI protocol data into general programmable interface GPIF II protocol data;
and receiving the test data after protocol conversion by using a CYUSB3014 chip, and performing secondary protocol conversion on the test data after protocol conversion to convert the test data into USB protocol data.
4. The method of claim 1, wherein the second type of register is verified by the following verification means:
acquiring a check standard table of the second type register; the check standard table comprises the initial configuration value of each second-type register and the theoretical value of each second-type register in each target clock cycle;
in each target clock cycle, performing:
reading the actual value of each second type register in the current target clock period, and writing the actual value into the check standard table;
determining whether the fault core particle generates a valid write enable signal within the current target clock cycle;
and determining the verification result of each second-type register based on the actual value of each second-type register in the current target clock cycle, the generation result of the write enable signal, the theoretical value of each second-type register in the current target clock cycle in the check standard table, and the actual value of each second-type register in the last target clock cycle.
5. The method of claim 4, wherein determining the verification result of each second-type register based on the actual value of each second-type register in the current target clock cycle, the generation result of the write enable signal, the theoretical value of each second-type register in the current target clock cycle in the check standard table, and the actual value of each second-type register in the last target clock cycle comprises:
for each second type of register, performing:
judging whether the actual value of the second-type register in the current target clock cycle is consistent with the theoretical value of the second-type register in the current target clock cycle in the check standard table;
if they are consistent, determining that the second-type register has no fault in the current target clock cycle;
if they are inconsistent, and the actual value of the second-type register in the current target clock cycle differs from its actual value in the last target clock cycle while no write enable signal has been generated, determining that the second-type register has a suspected fault in the current target clock cycle; otherwise, determining that the second-type register has failed in the current target clock cycle.
6. A detection device for a Chiplet integrated chip, comprising:
the acquisition unit is used for acquiring a core particle grouping result of the Chiplet integrated chip;
the detection unit is used for respectively carrying out fault detection on each core particle group and determining fault core particles;
the verification unit is used for verifying the first type register and the second type register in the fault core particle respectively so as to determine a fault area of the fault core particle according to a verification result; the first type of registers are registers whose values do not flip while the chip is running, and the second type of registers are readable and writable registers;
the troubleshooting unit is used for determining whether the fault area contains a built-in micro control unit or not so as to troubleshoot the fault area and determine a fault point;
for the acquisition unit, the Chiplet integrated chip is grouped into core particle groups in the following way:
determining a failure index of each core particle in the Chiplet integrated chip;
taking each core particle whose failure index is greater than a first threshold as a separate core particle group;
partitioning the remaining core particles into regions according to the positions, in the Chiplet integrated chip, of the core particles whose failure index is greater than the first threshold;
for each region, performing: dividing the core particles in the current region whose failure index is less than a second threshold into one core particle group, and calculating the correlation of the remaining ungrouped core particles so as to group them according to the correlation; wherein the second threshold is less than the first threshold;
in the acquisition unit, the failure index is calculated by the following formula:
[Formula: Figure QLYQS_10]
wherein F is the failure index of the current core particle; [symbol: Figure QLYQS_13] is the importance level of the current core particle; [symbol: Figure QLYQS_14] is the failure probability of the current core particle; [symbol: Figure QLYQS_16] is the influence coefficient of the first type of register; [symbol: Figure QLYQS_12] is the number of registers of the first type; [symbol: Figure QLYQS_15] is the influence coefficient of the second type of register; [symbol: Figure QLYQS_17] is the number of registers of the second type; [symbol: Figure QLYQS_18] is the influence coefficient of the micro control unit; and [symbol: Figure QLYQS_11] is the number of micro control units built into the current core particle.
7. An electronic device comprising a memory and a processor, the memory having stored therein a computer program, the processor implementing the method of any of claims 1-5 when the computer program is executed.
8. A computer readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method of any of claims 1-5.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310538211.XA CN116256620B (en) 2023-05-15 2023-05-15 Chiplet integrated chip detection method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN116256620A (en) 2023-06-13
CN116256620B (en) 2023-07-14

Family

ID=86686517

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310538211.XA Active CN116256620B (en) 2023-05-15 2023-05-15 Chiplet integrated chip detection method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116256620B (en)

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111381147A (en) * 2018-12-29 2020-07-07 北京灵汐科技有限公司 Many-core chip testing method, many-core chip testing device and many-core chip testing equipment
CN112595966A (en) * 2021-03-03 2021-04-02 南京邮电大学 IEEE standard based Chiplet circuit testing method
CN114578217A (en) * 2022-05-06 2022-06-03 南京邮电大学 Controllable Chiplet serial test circuit
CN115020266A (en) * 2022-08-04 2022-09-06 南京邮电大学 2.5D chip bound test circuit
CN115047322A (en) * 2022-08-17 2022-09-13 中诚华隆计算机技术有限公司 Method and system for identifying fault chip of intelligent medical equipment
CN115112979A (en) * 2022-06-27 2022-09-27 集睿致远(厦门)科技有限公司 ESD event detection method and device, electronic equipment and storage medium
CN115128438A (en) * 2022-09-02 2022-09-30 中诚华隆计算机技术有限公司 Chip internal fault monitoring method and device
CN115166493A (en) * 2022-09-06 2022-10-11 中诚华隆计算机技术有限公司 Chip internal detection obstacle removing method
CN115576738A (en) * 2022-12-08 2023-01-06 中诚华隆计算机技术有限公司 Method and system for realizing equipment fault determination based on chip analysis
CN115622666A (en) * 2022-12-06 2023-01-17 北京超摩科技有限公司 Fault channel replacement method for transmission of data link between core particles and core particles
CN115617739A (en) * 2022-09-27 2023-01-17 南京信息工程大学 Chip based on Chiplet architecture and control method
WO2023023975A1 (en) * 2021-08-25 2023-03-02 华为技术有限公司 Chip, chip manufacturing method, and related apparatus
TW202311977A (en) * 2021-08-16 2023-03-16 美商高通公司 Systems and methods for sleep clock edge-based global counter synchronization in a chiplet system

Also Published As

Publication number Publication date
CN116256620A (en) 2023-06-13

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant