WO2023244210A1 - Quick error detection and symbolic initial states in pre-silicon verification and debugging - Google Patents

Quick error detection and symbolic initial states in pre-silicon verification and debugging

Info

Publication number: WO2023244210A1
Application number: PCT/US2022/033266
Authority: WO, WIPO (PCT)
Prior art keywords: registers, instructions, subset, register file, sequence
Other languages: French (fr)
Inventors: Flavio M. DE PAULA, Shanchih WEN
Original Assignee: Woodpecker Technologies Pte. Ltd
Application filed by: Woodpecker Technologies Pte. Ltd
Priority to: PCT/US2022/033266
Publication of: WO2023244210A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/22 Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/22 Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
    • G06F11/26 Functional testing
    • G06F11/273 Tester hardware, i.e. output processing circuits

Definitions

  • This application relates generally to integrated circuits (ICs), particularly to methods, systems, and non-transitory computer-readable media for effective pre-silicon verification and debugging of integrated circuits (e.g., a multicore processor system).
  • ICs integrated circuits
  • An electronic device oftentimes integrates a system on a chip (SoC) with a power management integrated circuit (PMIC), communication ports, external memory or storage, and other peripheral function modules on a main logic board.
  • SoC system on a chip
  • PMIC power management integrated circuit
  • the SoC includes one or more microprocessor or central processing unit (CPU) cores, memory, input/output ports, and secondary storage in a single package.
  • QED Quick Error Detection
  • SQED symbolic QED
  • BMCs bounded model checkers
  • SQED simplifies the input constraints and the output properties based on an instruction set architecture (ISA) of the CPU and a self-consistency property of the instructions executed on the CPU, respectively.
  • ISA instruction set architecture
  • SQED further simplifies setting the input constraints and checking the output properties, thereby expediting pre-silicon verification and debugging.
  • SQED techniques are applied to verify a single central processing unit (CPU) core and can be extended easily to check a multi-core processor system.
  • SQED can also be used to verify and debug logic circuits that include combination logic or sequential logic circuit elements and do not form any processor core.
  • a method is implemented at a computer device for pre-silicon verification and debugging of a digital hardware system (e.g., SoC) having a register file.
  • the method includes executing formal analysis of the digital hardware system, and the formal analysis includes a sequence of first instructions.
  • the method further includes duplicating the sequence of first instructions to a sequence of second instructions, implementing the sequence of first instructions and the sequence of second instructions, and storing output values of the sequences of first and second instructions into a first set of registers and a second set of registers of the register file, respectively.
  • Each output value is stored in a respective register of the register file and has a predefined number of bits.
  • the method further includes determining whether the sequence of first instructions are properly implemented by the digital hardware system based on a subset of each output value stored in the register file.
  • the subset of each output value includes a first number of fixed bits in a respective output value, and the first number is less than the predefined number.
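The subset comparison described above can be sketched as follows. This is a minimal illustration rather than the claimed implementation: the register file is modeled as a Python list of integers, and the names `WIDTH`, `FIXED_BITS`, and `outputs_consistent` are hypothetical. The sketch also assumes that the fixed bits are the least significant ones, which is only one possible choice.

```python
WIDTH = 32        # predefined number of bits per output value (assumed)
FIXED_BITS = 1    # number of fixed bits that are compared (assumed: the LSB)

def outputs_consistent(register_file, fixed_bits=FIXED_BITS):
    """Compare only the low `fixed_bits` bits of each original register with its duplicate."""
    assert 0 < fixed_bits < WIDTH
    half = len(register_file) // 2
    mask = (1 << fixed_bits) - 1
    first_set = register_file[:half]    # outputs of the first (original) instructions
    second_set = register_file[half:]   # outputs of the second (duplicate) instructions
    return all((a & mask) == (b & mask) for a, b in zip(first_set, second_set))

matching = [5, 8, 3, 0, 5, 8, 3, 0]
lsb_mismatch = [5, 8, 3, 0, 5, 9, 3, 0]
upper_bit_mismatch = [5, 8, 3, 0, 5, 10, 3, 0]
```

In this sketch, a difference confined to bits outside the compared subset (as in `upper_bit_mismatch` with one fixed bit) is not flagged, while widening the subset to two bits exposes it.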
  • a method for pre-silicon verification and debugging of a digital hardware system having a register file.
  • the method includes executing formal analysis of the digital hardware system, and the formal analysis includes a sequence of first instructions.
  • the method further includes duplicating the sequence of first instructions to a sequence of second instructions and implementing the sequence of first instructions and the sequence of second instructions.
  • the method further includes storing output values of the sequences of first and second instructions into a first set of registers and a second set of registers of the register file, respectively. Each output value is stored in a respective register of the register file and has a predefined number of bits.
  • the method further includes determining whether the sequence of first instructions are properly implemented by the digital hardware system based on the output values stored in the register file.
  • executing formal analysis of the digital hardware system further includes initializing each of a subset of bits of the first set of registers of the register file and a corresponding subset of bits of the second set of registers of the register file to a respective SIS in a duplicated manner, and the subset of bits of the first set of registers are less than all bits of the first set of registers.
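The duplicated partial initialization can be pictured with a small sketch. A formal tool treats the selected bits symbolically; here each "symbolic" bit is simply enumerated over both of its values, which is a brute-force stand-in and not the method of the application. The function name `duplicated_initial_states`, the mask parameter, and the choice to fix all non-symbolic bits to 0 are assumptions made for illustration.

```python
from itertools import product

def duplicated_initial_states(num_regs_per_set, symbolic_mask):
    """Yield register files whose masked bits take every value, mirrored in both sets.

    Bits outside `symbolic_mask` are fixed to 0 (an assumption of this sketch).
    """
    bit_positions = [i for i in range(symbolic_mask.bit_length())
                     if (symbolic_mask >> i) & 1]
    per_register_values = []
    for bits in product((0, 1), repeat=len(bit_positions)):
        value = 0
        for pos, bit in zip(bit_positions, bits):
            value |= bit << pos
        per_register_values.append(value)
    for first_set in product(per_register_values, repeat=num_regs_per_set):
        # The second set starts with the same values as the first set.
        yield list(first_set) + list(first_set)

# Only the LSB of each of two registers per set is treated as symbolic.
states = list(duplicated_initial_states(num_regs_per_set=2, symbolic_mask=0b1))
```

With one symbolic bit per register and two registers per set, the enumeration covers four starting states instead of every combination of full-width register values.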
  • a computer system includes one or more processors and memory having instructions stored thereon, which when executed by the one or more processors cause the processors to perform the above methods for pre-silicon verification and debugging of a digital hardware system.
  • a non-transitory computer-readable medium having instructions stored thereon, which when executed by one or more processors cause the processors to perform the above methods for pre-silicon verification and debugging of a digital hardware system.
  • Figure 1A is a block diagram of an example system module in a typical electronic device, in accordance with some embodiments.
  • Figure 1B is a block diagram of an example computer system that executes pre-silicon verification and debugging of a digital hardware system, in accordance with some embodiments.
  • FIG. 2 is a block diagram of a multithreading and multi-core processor system, in accordance with some embodiments.
  • Figure 3 is a block diagram of a symbolic QED formal verification testbench, in accordance with some embodiments.
  • FIG. 4 is a block diagram showing a QED component including a QED module and a fetch unit, in accordance with some embodiments.
  • Figure 5 is a pseudo code for a QED module, in accordance with some embodiments.
  • Figure 6 is a block diagram of a computer system configured to execute pre-silicon verification and debugging of a digital hardware system, in accordance with some embodiments.
  • Figure 7 is a data structure of an example register file configured to store output values of instructions of a sequence of instructions, in accordance with some embodiments.
  • Figure 8A is an example sequence of original instructions
  • Figure 8B is an example sequence of executed instructions that is generated from the sequence of original instructions in Figure 8A via QED transformation implemented by a QED module, in accordance with some embodiments.
  • Figure 9 is an example process of pre-silicon verification and debugging implemented based on output values of instructions, in accordance with some embodiments.
  • Figure 10A is a data structure of a portion of a register file in which all values stored in registers are initialized to 0, in accordance with some embodiments.
  • Figures 10B-10E are four example data structures of a portion of a register file in which a subset of (not all) registers are set to symbolic initial states (SIS), in accordance with some embodiments.
  • SIS symbolic initial states
  • Figure 11 is a flow diagram of a method for pre-silicon verification and debugging of a digital hardware system (e.g., a processor), in accordance with some embodiments.
  • Figure 12 is a flow diagram of another method for pre-silicon verification and debugging of a digital hardware system, in accordance with some embodiments.
  • FIG. 1A is a block diagram of an example system module 100 in a typical electronic device, in accordance with some implementations.
  • System module 100 in this electronic device includes at least one or more processors 102, memory modules 104 for storing programs, instructions and data, an input/output (I/O) controller 106, one or more communication interfaces such as network interfaces 108, and one or more communication buses 130 for interconnecting these components.
  • I/O controller 106 allows one or more processors 102 to communicate with an I/O device (e.g., a keyboard, a mouse or a touch screen) via a universal serial bus interface.
  • network interfaces 108 include one or more interfaces for Wi-Fi, Ethernet and Bluetooth networks, each allowing the electronic device to exchange data with an external source, e.g., a server or another electronic device.
  • communication buses 130 include circuitry (sometimes called a chipset) that interconnects and controls communications among various system components included in system module 100.
  • the one or more processors 102 of the system module 100 are integrated in a system on a chip (SoC).
  • memory modules 104 include high-speed random access memory, such as DRAM, SRAM, DDR RAM or other random access solid state memory devices.
  • memory modules 104 include non-volatile memory, such as one or more magnetic disk storage devices, optical disk storage devices, flash memory devices, or other non-volatile solid-state storage devices.
  • memory modules 104, or alternatively the non-volatile memory device(s) within memory modules 104, include a non-transitory computer readable storage medium.
  • memory slots are reserved on system module 100 for receiving memory modules 104. Once inserted into the memory slots, memory modules 104 are integrated into system module 100.
  • system module 100 further includes one or more components selected from:
  • a memory controller 110 that controls communication between one or more processors 102 and memory components, including memory modules 104, in the electronic device;
  • SSDs 112 that apply integrated circuit assemblies to store data in the electronic device, and in many implementations, are based on NAND or NOR memory configurations;
  • a hard drive 114 that is a conventional data storage device used for storing and retrieving digital information based on electromechanical magnetic disks
  • a power supply connector 116 that includes one or more direct current (DC) power supply interfaces each of which is configured to receive a distinct DC supply voltage;
  • DC direct current
  • a graphics module 120 that generates a feed of output images to one or more display devices according to their desirable image/video formats
  • a sound module 122 that facilitates the input and output of audio signals to and from the electronic device under control of computer programs.
  • communication buses 130 also interconnect and control communications among various system components including components 110-122.
  • non-transitory computer readable storage media can be used, as new data storage technologies are developed for storing information in the non-transitory computer readable storage media in the memory modules 104 and in SSDs 112.
  • These new non-transitory computer readable storage media include, but are not limited to, those manufactured from biological materials, nanowires, carbon nanotubes and individual molecules, even though the respective data storage technologies are currently under development and yet to be commercialized.
  • Figure 1B is a block diagram of an example computer system 150 that executes pre-silicon verification and debugging of a digital hardware system, in accordance with some embodiments.
  • the computer system 150 is integrated into another electronic system such as a router.
  • the computer system 150 is implemented via discrete elements or one or more integrated components.
  • the computer system 150 executes any of a number of operating systems, and stores a program including a plurality of control instructions executed to implement pre-silicon verification and debugging of the digital hardware system including, but not limited to, each processor 102, an SoC integrating a plurality of processors 102, memory controller 110, memory integrated circuit, PMIC 118, graphics module 120, sound module 122, or a combination thereof.
  • Computer system 150 includes a processor 152, a memory 154, a storage device 156, and one or more input/output structures 158.
  • One or more input/output structures 158 provide input/output operations for computer system 150, and optionally include a display.
  • One or more buses 160 typically interconnect the components 152, 154, 156, and 158.
  • Processor 152 may be a single-core or multi-core processor.
  • computer system 150 includes accelerators that are formed in an SoC.
  • Processor 152 is configured to execute instructions for pre-silicon verification and debugging. Such instructions are stored in memory 154 or storage device 156. Data and/or information are received and output using one or more input/output structures 158 of computer system 150.
  • memory 154 includes a computer-readable medium, such as volatile or nonvolatile memory, which stores data.
  • Storage device 156 provides storage for the instructions for pre-silicon verification and debugging in computer system 150.
  • Storage device 156 is optionally a flash memory device, a disk drive, an optical disk device, a tape device, or any other device employing magnetic, optical, or other information recording technologies.
  • Each processor 102, SoC integrating a plurality of processors 102, memory controller 110, memory integrated circuit, PMIC 118, graphics module 120, or sound module 122 of a system module 100 includes a respective integrated circuit.
  • computer system 150 applies pre-silicon verification and debugging techniques to verify its functions and debug errors occurring in its operation.
  • Formal property checkers i.e., bounded model checkers (BMCs)
  • the BMCs include a plurality of input constraints and a plurality of output properties.
  • the input constraints are entered manually, and the output properties are automatically generated based on design specifications.
  • the BMCs are configured to return example operation failures that correspond to inputs complying with the input constraints and erroneous output properties, which need to be troubleshot by the verification engineers.
  • BMC is limited by manually entering the input constraints and iteratively correcting errors associated with operation failures.
  • symbolic quick error detection is applied to facilitate operation of the formal property checkers (i.e., the BMCs).
  • the input constraints are created based on an instruction set architecture (ISA) of the integrated circuit (e.g., a CPU), and the output properties include self-consistency properties of instructions executed on the integrated circuit.
  • two sets of instructions original and duplicate instructions
  • the CPU’s register file is associated with a memory address space and includes two portions.
  • the register file and memory address space are entirely reserved for the two sets of instructions of SQED, and split into two halves.
  • Original instructions use a first half of the register file
  • duplicate instructions use a second half of the register file.
  • the duplicate instructions are the exact same operations implemented in the same sequence as the original instructions, except the duplicate and original instructions use their separate halves of the register file and memory address space.
  • the input constraints to be defined for the BMCs include a valid set of instructions as defined in the ISA.
  • a QED module is automatically added to a verification environment to duplicate the original instructions to the duplicate instructions. Once the same number of original and duplicate instructions are executed, program consistency requires that the first and second halves of the register file contain the same output values. Stated another way, an equality and universal self-consistency property are required in error detection by duplicate instructions for validation (EDDI-V) for the BMCs.
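The duplication and the equality check described above can be illustrated with a toy model. Everything here is a hedged sketch: the two-operation instruction format, the eight-entry register file, the helper names, and the injected bug (an `add` that miscomputes when it directly follows a `sub`) are hypothetical and are not taken from the application. Memory address space is not modeled.

```python
NUM_REGS = 8            # toy register file size (assumed)
HALF = NUM_REGS // 2    # originals use r0..r3, duplicates use r4..r7

def duplicate(instr):
    """Map an original instruction onto the second half of the register file."""
    op, rd, rs1, rs2 = instr
    return (op, rd + HALF, rs1 + HALF, rs2 + HALF)

def run(sequence, regs, buggy=False):
    """Execute (op, rd, rs1, rs2) tuples in order, updating `regs` in place."""
    prev_op = None
    for op, rd, rs1, rs2 in sequence:
        if op == "add":
            result = regs[rs1] + regs[rs2]
            if buggy and prev_op == "sub":
                result += 1   # hypothetical sequence-dependent logic bug
        elif op == "sub":
            result = regs[rs1] - regs[rs2]
        else:
            raise ValueError(f"unknown op: {op}")
        regs[rd] = result
        prev_op = op
    return regs

def self_consistent(originals, initial_half, buggy=False):
    """Run originals, then their duplicates, and compare the two register halves."""
    regs = list(initial_half) + list(initial_half)   # both halves start identical
    run(originals + [duplicate(i) for i in originals], regs, buggy)
    return regs[:HALF] == regs[HALF:]

program = [("add", 0, 1, 2), ("sub", 3, 0, 1)]
```

In this toy run the duplicated `add` executes right after the original `sub`, so the injected bug fires only in the duplicate stream and the two halves diverge. No expected result for `program` has to be written down, because only equality between the halves is checked.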
  • EDDI-V is intended to detect bugs that are activated by a particular sequence of instructions and lead to an incorrect output value, and therefore, requires that the instruction sequence follow a correct control flow and that execution not hang. Hang detection is addressed by CFTSS-V (Control Flow Tracking by Software Signatures for Validation) which, if applied, augments EDDI-V to cover control flow errors.
  • FIG. 2 is a block diagram of a multithreading and multi-core processor system 200, in accordance with some embodiments.
  • An example of the processor system 200 is an OpenSPARC T2 SoC, which is an open source version of an UltraSPARC T2 microprocessor.
  • the UltraSPARC T2 microprocessor belongs to a scalable processor architecture (SPARC) family, and includes a 500-million-transistor SoC with 8 processor cores 202 (64 hardware threads), a private level 1 (L1) cache, 8 banks of shared level 2 (L2) cache 204, 4 memory controllers 206, a crossbar-based interconnect 208, and various I/O controllers 210.
  • Logic bug scenarios are simulated for the OpenSPARC T2 SoC.
  • These simulated scenarios represent a variety of “difficult” bug scenarios that are extracted from commercial multicore SoCs and can take an extended time (days to weeks) to identify.
  • These bug scenarios include bugs in the processor cores 202, bugs in components external to the processor cores, and bugs related to power management.
  • a length for bug traces corresponds to the number of instructions in the trace found by the BMC tool (not including duplicate instructions created by the QED modules). For bugs that are only found by executing instructions on multiple processor cores 202, the number of instructions for each core 202 may be different. For example, one core 202 could have a bug trace that is 3 instructions long, while another core has a bug trace that is 1 instruction long. The length of the longest bug trace (a length of 3 instructions in this example) is identified for the multicore processor system 200, because all cores must complete execution of the instructions to activate and detect the bug (and the cores execute the instructions in parallel).
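The length rule for multicore traces amounts to taking a maximum over the per-core counts of original instructions. A minimal sketch, with a hypothetical function name and core labels:

```python
def system_bug_trace_length(per_core_lengths):
    """Length reported for the whole system: the longest per-core trace."""
    return max(per_core_lengths.values())

# Per-core counts of original instructions (duplicates are not counted).
example = {"core_a": 3, "core_b": 1}
reported = system_bug_trace_length(example)
```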
  • FIG. 3 is a block diagram of a symbolic QED formal verification testbench 300, in accordance with some embodiments.
  • the computer system 150 provides a device under test (DUT) wrapper 302 including a DUT memory 304 and a QED module 306.
  • the DUT wrapper 302 is coupled to a design of one or more processors 102 applied in a device under test (DUT) 308, e.g., a system module 100 or a processor system 200.
  • the DUT memory 304 is coupled to the processor core(s) 102 and QED module 306, and configured to store instructions executed by the DUT 308 and data generated in the verification and debugging process.
  • the processors 102 include a register file 310 having a plurality of registers for storing output values of the instructions that are implemented for pre-silicon verification and debugging of a digital hardware system.
  • the register file 310 is a special small memory that is inside or closest to the processor(s) 102, and has the fastest access time and the smallest capacity among all memories or caches available to the processor(s) 102.
  • the register file 310 includes a first set of registers and a second set of registers. In an example, the plurality of registers of the register file are divided into two halves corresponding to the first and second sets of registers.
  • the QED module 306 is instantiated inside the DUT wrapper 302 and interfaces with the DUT memory 304 and a plurality of monitoring points inside the design of the processor core(s) 102. As such, the QED module 306 of the computer system 200 is configured to implement a pre-silicon verification and debugging process to detect errors in the design of the one or more processors 102 of the DUT 308.
  • the DUT memory 304 includes a memory array 312 having a plurality of memory units for storing output values of the instructions that are implemented for pre-silicon verification and debugging of the digital hardware system.
  • the memory array 312 includes a first set of memory units and a second set of memory units. In an example, the memory array 312 is divided into two halves corresponding to the first and second sets of memory units.
  • the symbolic QED formal verification testbench 300 further includes a set of scripts, e.g., determined based on the design of the one or more processors 102.
  • the symbolic QED formal verification testbench 300 is established in a formal property checker tool, such as VC-Formal from Synopsys or JasperGold from Cadence.
  • the QED module 306 is an RTL module configured to be instantiated inside the DUT 308.
  • the QED module 306 includes a plurality of formal verification constraints and output properties based on the three main QED transformations, which include EDDI-V (Error Detection by Duplicate Instructions for Validation), CFCSS-V (Control Flow Checking by Software Signatures for Validation), and CFTSS-V (Control Flow Tracking by Software Signatures for Validation).
  • When the QED module 306 determines that any of the SQED properties is violated, the QED module 306 generates an example which is used by the verification engineer to debug the design of the one or more processors 102. Once a bug is identified and fixed, the verification and debugging process is repeated until the QED module 306 cannot find any example up to a depth and/or a time limit set in the scripts. The depth is optionally measured by a number of clock cycles for which a sequence of instructions is executed.
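The role of the depth limit can be shown with a brute-force search that stands in for a bounded model checker. This is an illustrative sketch only: real tools explore the design symbolically, depth is counted here in instructions rather than clock cycles, and `find_counterexample` and `toy_check` are hypothetical names.

```python
from itertools import product

def find_counterexample(check, alphabet, max_depth):
    """Return the shortest sequence over `alphabet` for which `check` fails, or None."""
    for depth in range(1, max_depth + 1):
        for candidate in product(alphabet, repeat=depth):
            if not check(candidate):
                return list(candidate)
    return None

def toy_check(sequence):
    # Hypothetical property: it fails whenever "sub" is directly followed by "add".
    return all(not (a == "sub" and b == "add")
               for a, b in zip(sequence, sequence[1:]))

shallow = find_counterexample(toy_check, ["add", "sub"], max_depth=1)
deeper = find_counterexample(toy_check, ["add", "sub"], max_depth=3)
```

Here the search at depth 1 returns nothing even though a failing sequence of length 2 exists, which is why a run that finds no example is only meaningful relative to the depth that was set.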
  • the symbolic QED formal verification testbench 300 is not limited to verifying and debugging processors. In some embodiments, the symbolic QED formal verification testbench 300 is applied to verify and debug a digital integrated circuit distinct from the one or more processors 102, i.e., digital logic that does not form any processor core. Thus, the symbolic QED formal verification testbench 300 is broadly applied to pre-silicon verification and debugging of a digital hardware system.
  • FIG. 4 is a block diagram showing a QED component 400 including a QED module 306 and a fetch unit 402, in accordance with some embodiments.
  • the QED module 306 is configured to implement a pre-silicon verification and debugging process and I/O modules coupled to the QED module 306.
  • inputs to the QED module 306 include:
  • instruction_in, which is the instruction from the fetch unit 402 to be executed by the processor core 102;
  • target_address, which contains the address of the next instruction to execute when the processor executes a control-flow instruction;
  • committed, which is a signal from the processor core 102 to indicate if the instruction fetched has been committed (i.e., the result written to a register or memory).
  • outputs from the QED module 306 include:
  • PC, which is the address of the next instruction to fetch;
  • PC_override, which determines if the processor core 102 should use the PC from the QED module 306 or the PC from the fetch unit 402;
  • instruction_override, which determines whether the processor core 102 should use the modified instruction from the QED module 306 or the instruction from the fetch unit 402; and
  • qed_ready, which is set to qed_ready_i if the committed input signal is true, and false otherwise.
  • FIG. 5 is a pseudo code 500 for a QED module 306, in accordance with some embodiments.
  • the QED module 306 has one or more of the following internal variables:
  • qed_ready_i, which signals when both the original and duplicated registers should have the same values (under bug-free conditions). Initially, qed_ready_i is set to false, and is only set to true when both the original and duplicate instructions have been executed.
  • the QED module 306 starts in an ORIG_MODE mode.
  • the QED module switches to a DUP_MODE mode, loads the address stored in qed_rewind_address into PC, and sets PC_override_i to 1 (and as long as enable is true, PC_override is also set to 1).
  • the processor core 102 then re-executes instructions starting from the address stored in qed_rewind_address.
  • In the DUP_MODE mode, the duplicated instruction is output as instruction_out, and instruction_override_i is set to 1 so the processor core 102 executes the duplicated instruction instead of the original instruction from the fetch unit 402.
  • the processor core 102 executes the control flow instruction, and the QED module 306 stores the address of the next instruction to execute (i.e., the target of the control flow instruction) in qed_rewind_address and then returns to the ORIG_MODE mode.
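The mode switching described above can be approximated by a small state machine. This sketch is not the pseudo code of Figure 5: the trigger for leaving the original mode (reaching a control-flow instruction), the one-instruction-per-step timing, and the class and function names are assumptions, and the register remapping of duplicated instructions is omitted.

```python
ORIG_MODE, DUP_MODE = "ORIG_MODE", "DUP_MODE"

class QedModuleSketch:
    """Toy model of the two modes; signal names follow the description above."""

    def __init__(self, start_address=0):
        self.current_mode = ORIG_MODE
        self.qed_rewind_address = start_address
        self.qed_ready_i = False

    def step(self, pc, is_control_flow, target_address):
        """Return (next_pc, PC_override_i, instruction_override_i) for one instruction."""
        if self.current_mode == ORIG_MODE:
            if is_control_flow:
                # Assumed trigger: rewind and replay the block as duplicates.
                self.current_mode = DUP_MODE
                self.qed_ready_i = False
                return self.qed_rewind_address, 1, 0
            return pc + 1, 0, 0
        if is_control_flow:
            # Duplicates are done: remember the next block start, return to ORIG_MODE.
            self.qed_rewind_address = target_address
            self.current_mode = ORIG_MODE
            self.qed_ready_i = True
            return target_address, 0, 0
        return pc + 1, 0, 1   # a duplicated instruction replaces the fetched one

def trace(module, branch_at, target, steps):
    """Record (pc, mode before the step, PC_override_i, instruction_override_i)."""
    pc, log = 0, []
    for _ in range(steps):
        mode = module.current_mode
        next_pc, pc_override, instr_override = module.step(pc, pc == branch_at, target)
        log.append((pc, mode, pc_override, instr_override))
        pc = next_pc
    return log

module = QedModuleSketch()
log = trace(module, branch_at=2, target=5, steps=6)
```

In this run, addresses 0 and 1 execute as originals, the branch at address 2 triggers the rewind, addresses 0 and 1 are replayed with the instruction override set, and the branch target 5 becomes the new rewind address.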
  • FIG. 6 is a block diagram of a computer system 150 configured to execute pre-silicon verification and debugging of a digital hardware system, in accordance with some embodiments.
  • the computer system 150 typically includes one or more processing units (CPUs) 152, one or more network interfaces 602, memory 154, and one or more communication buses 160 for interconnecting these components (sometimes called a chipset).
  • the computer system 150 includes one or more input devices 158A that facilitate user input, such as a keyboard, a mouse, a voice-command input unit or microphone, a touch screen display, a touch-sensitive input pad, a gesture capturing camera, or other input buttons or controls.
  • the computer system 150 also includes one or more output devices 158B that enable presentation of user interfaces and display content, including one or more speakers and/or one or more visual displays.
  • Memory 154 includes high-speed random access memory, such as DRAM, SRAM, DDR RAM, or other random access solid state memory devices; and, optionally, includes non-volatile memory, such as one or more magnetic disk storage devices, one or more optical disk storage devices, one or more flash memory devices, or one or more other non-volatile solid state storage devices. Memory 154, optionally, includes one or more storage devices remotely located from one or more processing units 152. Memory 154, or alternatively the non-volatile memory within memory 154, includes a non-transitory computer readable storage medium. In some embodiments, memory 154, or the non-transitory computer readable storage medium of memory 154, stores the following programs, modules, and data structures, or a subset or superset thereof:
  • Operating system 604 including procedures for handling various basic system services and for performing hardware dependent tasks
  • Network communication module 606 for connecting each computer system 150 to other devices (e.g., a server, client device, or external storage) via one or more network interfaces 602 (wired or wireless) and one or more communication networks, such as the Internet, other wide area networks, local area networks, metropolitan area networks, and so on;
  • User interface module 608 for enabling presentation of information (e.g., a graphical user interface for user application(s), widgets, websites and web pages thereof, and/or games, audio and/or video content, text, etc.) at the computer system 150 via one or more output devices 158B (e.g., displays, speakers, etc.);
  • Input processing module 610 for detecting one or more user inputs or interactions from one of the one or more input devices 158A and interpreting the detected input or interaction;
  • Web browser module 612 for navigating, requesting (e.g., via HTTP), and displaying websites and web pages thereof, including a web interface for logging into a user account associated with the computer system 150 or another electronic device, controlling the computer system or electronic device if associated with the user account, and editing and reviewing settings and data that are associated with the user account;
  • One or more user applications 614 for execution by the computer system 150 (e.g., games, social network applications, smart home applications, data analysis and visualization applications, and/or other web or non-web based applications);
  • the one or more user applications 614 include a BMC application for returning example operation failures of a digital hardware system in view of input constraints and output properties;
  • DUT wrapper 302 for creating a symbolic QED formal verification testbench or environment in which a design of a digital hardware system (e.g., a processor 102) is verified and debugged, where the DUT wrapper 302 includes a DUT memory 304 for storing instructions executed by the DUT 308 and data generated in a pre-silicon verification and debugging process and a QED module 306 for implementing a pre-silicon verification and debugging process based on a combination of original and duplicate instructions; and
  • Device settings 620 including common device settings (e.g., service tier, device model, storage capacity, processing capabilities, communication capabilities, etc.) of the computer system 150;
  • User account information 622 for the one or more user applications 614 (e.g., user names, security questions, account history data, user preferences, and predefined account settings);
  • Network parameters 624 for the one or more communication networks 108 e.g., IP address, subnet mask, default
  • the QED module 306 duplicates original instructions into duplicate instructions, and stores output values of the original and duplicate instructions into two separate sets of registers of the register file 310, respectively. Alternatively, in some embodiments, the QED module 306 stores the output values of the original and duplicate instructions into two separate sets of memory units of the memory array 312 of the DUT memory 304. Further, in some embodiments, the QED module 306 determines whether the instructions are properly implemented by the digital hardware system based on a subset of each output value (e.g., a least significant bit (LSB) of each output value), which is stored in the register file 310 or memory array 312.
  • the QED module 306 initializes a subset of a first set of registers of the register file 310 and a corresponding subset of a second set of registers of the register file 310 to a respective SIS in a duplicated manner, e.g., initializes the LSB of each register of the register file 310 to the respective SIS.
  • the QED module 306 initializes a subset of a first set of memory units of the memory array 312 and a corresponding subset of a second set of memory units of the memory array 312 to a respective SIS in a duplicated manner, e.g., initializes the LSB of each memory unit of the memory array 312 to the respective SIS.
  • Each of the above identified elements may be stored in one or more of the previously mentioned memory devices, and corresponds to a set of instructions for performing a function described above.
  • the above identified modules or programs i.e., sets of instructions
  • memory 154 optionally stores a subset of the modules and data structures identified above.
  • memory 154 optionally stores additional modules and data structures not described above.
  • Figure 7 is a data structure of an example register file 310 configured to store output values of a sequence of instructions, in accordance with some embodiments.
  • a QED module 306 of the computer system 150 is applied to verify and debug a design of a digital hardware system (e.g., an integrated circuit).
  • the sequence of instructions includes a sequence of first instructions 702 (e.g., an ordered sequence of instructions A, B, C, and D), and the QED module 306 duplicates the sequence of first instructions to a sequence of second instructions 704 (e.g., an ordered sequence of instructions A’, B’, C’, and D’).
  • the sequence of first instructions 702 (also called original instructions) and the sequence of second instructions 704 (also called duplicate instructions) are mixed with each other to form a comprehensive sequence of instructions 706 in which the first instructions in the sequence 702 keep their original order with respect to each other and the second instructions in the sequence 704 keep their original order with respect to each other.
  • the sequence of second instructions 704 follows the sequence of first instructions 702.
  • the comprehensive sequence 706 includes an ordered sequence of instructions A, B, C, D, A’, B’, C’, and D’.
  • the sequence of first instructions 702 follows the sequence of second instructions 704.
  • the comprehensive sequence 706 includes an ordered sequence of instructions A’, B’, C’, D’, A, B, C, and D. Additionally or alternatively, in some embodiments, the sequence of first instructions 702 interleaves with the sequence of second instructions 704. In an example, the comprehensive sequence 706 includes an ordered sequence of instructions A, B,
  • the QED module 306 in conjunction with a BMC application stimulates the design of the digital hardware system with the comprehensive sequence of instructions 706. During this stimulation, a plurality of input properties are assigned to the digital hardware system, and a plurality of output properties are defined based on output values expected from execution of the instructions by the digital hardware system. The input and output properties are part of the code of the QED module 306.
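  • As a rough illustration of the three orderings of the comprehensive sequence 706 described above, the following Python sketch builds each ordering from a list of original instruction names. The helper names (`duplicate`, `originals_first`, `duplicates_first`, `interleaved`, `keeps_relative_order`) and the strict one-to-one alternation used for interleaving are assumptions made for this example only.

```python
from typing import List


def duplicate(original: List[str]) -> List[str]:
    """Name each duplicate by appending a prime mark, e.g. A -> A'."""
    return [name + "'" for name in original]


def originals_first(original: List[str]) -> List[str]:
    """Duplicates follow the originals: A, B, C, D, A', B', C', D'."""
    return list(original) + duplicate(original)


def duplicates_first(original: List[str]) -> List[str]:
    """Originals follow the duplicates: A', B', C', D', A, B, C, D."""
    return duplicate(original) + list(original)


def interleaved(original: List[str]) -> List[str]:
    """One possible interleaving (assumed): each original is followed by its duplicate."""
    sequence: List[str] = []
    for orig_name, dup_name in zip(original, duplicate(original)):
        sequence.extend([orig_name, dup_name])
    return sequence


def keeps_relative_order(sequence: List[str], sub: List[str]) -> bool:
    """True if the items of `sub` appear in `sequence` in the same relative order."""
    remaining = iter(sequence)
    return all(item in remaining for item in sub)
```

  • In this sketch, every ordering satisfies the requirement that the first instructions keep their order with respect to each other and the second instructions keep their order with respect to each other, which `keeps_relative_order` can be used to confirm.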
  • the register file 310 includes a first set of registers 310A and a second set of registers 310B. Each of the first and second sets of registers 310A and 310B includes the same number (e.g., 16) of registers 708, and each register 708 has a predefined number of bits (e.g., 32 bits).
  • Each of the first set of registers 310A is distinct from the second set of registers 310B, and each of the second set of registers 310B is distinct from the first set of registers 310A.
  • each set of registers 310A or 310B includes contiguous registers that form a respective register block in the register file 310.
  • the first and second sets of registers 310A and 310B are immediately adjacent to each other or physically separated from each other by one or more registers.
  • at least one set of registers 310A or 310B includes non-contiguous registers that form two or more register blocks in the register file 310.
  • the first set of registers 310A interleaves with the second set of registers 310B.
  • Each register in the first set of registers 310A is immediately adjacent to one or two registers of the second set of registers 310B, and each register in the second set of registers 310B is immediately adjacent to one or two registers of the first set of registers 310A.
  • the register file 310 is entirely reserved to implement the comprehensive sequence of instructions 706.
  • the first and second sets of registers 310A and 310B correspond to a first half and a second half of the register file 310.
  • the first set of registers 310A of the register file 310 store output values of the first instructions 702, and the second set of registers 310B of the register file 310 store output values of the second instructions 704.
  • the registers 708 (R1, R2, R3, R4, ..., and RN) of the first set of registers 310A store output values generated by the first instructions 702 according to a first storage order
  • the registers 708 (R1’, R2’, R3’, R4’, ..., and RN’) of the second set of registers 310B store output values generated by the second instructions 704 according to a second storage order that matches the first storage order.
  • Such a register file 310 is part of the digital hardware system (e.g., one or more processors 310 in Figure 3).
  • the QED module 306 is coupled to the register file 310, and is configured to determine whether the sequence of first instructions 702 are properly implemented by the digital hardware system based on a subset of each output value stored in the register file 310.
  • the subset of each output value includes a first number of fixed bits in a respective output value, and the first number is less than the predefined number.
  • the QED module 306 compares a subset of each of the output values stored in registers R1, R2, R3, R4, ..., and RN of the first set of registers 310A and a subset of a corresponding output value stored in registers R1’, R2’, R3’, R4’, ..., or RN’ of the second set of registers 310B to decide whether the first instructions 702 are properly implemented.
  • the subset of each of the output values is less than the entire output value.
  • the QED module 306 can reduce a computational demand and detect errors at a faster rate in spite of a risk of missing an error that occurs to any remaining bit of the output values stored in registers R1, R2, R3, R4, ..., and RN of the first set of registers 310A.
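  • The tradeoff described above can be sketched in Python as a masked comparison between paired registers. The function name `subset_matches` and the representation of each set of registers as a list of integers are assumptions for illustration; the mask selects the fixed bits that are compared.

```python
from typing import Sequence

FULL_WIDTH_MASK = (1 << 32) - 1  # all 32 bits of a register
LSB_MASK = 0x1                   # only the least significant bit


def subset_matches(first_set: Sequence[int],
                   second_set: Sequence[int],
                   mask: int = LSB_MASK) -> bool:
    """Compare only the masked bits of each register pair (R1 with R1', and so on)."""
    if len(first_set) != len(second_set):
        raise ValueError("both sets of registers must have the same number of registers")
    return all((a & mask) == (b & mask) for a, b in zip(first_set, second_set))
```

  • For instance, a pair of values that differs only in bit 1 passes the LSB-only comparison but fails the full-width comparison, which is the kind of error the reduced comparison risks missing.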
  • Figure 8A is an example sequence of original instructions 702
  • Figure 8B is an example sequence of executed instructions 706 that is generated from the sequence of original instructions 702 in Figure 8A via QED transformation implemented by a QED module 306, in accordance with some embodiments.
  • the QED module 306 has two modes including an original mode (ORIG MODE) and a duplicate mode (DUP MODE).
  • the QED module 306 starts in ORIG MODE.
  • In ORIG MODE, when a control flow instruction is fetched, the QED module 306 switches to DUP MODE.
  • In DUP MODE, the QED module 306 generates the duplicate instructions 704 from the original instructions 702. After all the original and duplicate instructions 702 and 704 finish execution, the corresponding registers 708 of the register file 310 should be equal.
  • the first set of registers 310A of the register file 310 has 16 registers for storing 16 output values, and so does the second set of registers 310B.
  • the first set of registers 310A has the first 16 registers (R1, R2, R3, R4, R5, etc.) for storing first 16 output values generated by the original instructions 702, and the second set of registers 310B has the second distinct 16 registers (R17, R18, R19, R20, R21, etc.) for storing second 16 output values that are generated by the duplicate instructions 704 and correspond to the first 16 output values.
  • the first and second sets of registers 310A and 310B are immediately adjacent to each other.
  • the first and second sets of registers 310A and 310B are separate from each other. Additionally, in some embodiments, the first and second sets of registers 310A and 310B interleave with each other. If the instructions 706 are properly implemented by the design of the digital hardware system, each of the first 16 output values stored in registers R1, R2, R3, R4, R5, etc. is equal to a respective one of the second 16 output values stored in registers R17, R18, R19, R20, R21, etc., respectively.
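  • A minimal Python sketch of the duplication step is given below for the 16-plus-16 register split described above. The `Instr` tuple, the three-register instruction shape, and the choice to remap source registers as well as the destination register by an offset of 16 are assumptions made for this example; the text above only states that the duplicate results land in registers R17 onward.

```python
from typing import List, NamedTuple

DUPLICATE_OFFSET = 16  # R1..R16 hold original results, R17..R32 hold duplicate results


class Instr(NamedTuple):
    op: str   # mnemonic, e.g. "add"
    rd: int   # destination register index
    rs1: int  # first source register index
    rs2: int  # second source register index


def to_duplicate(instr: Instr) -> Instr:
    """Remap every register of an original instruction into the second set (assumed mapping)."""
    return Instr(instr.op,
                 instr.rd + DUPLICATE_OFFSET,
                 instr.rs1 + DUPLICATE_OFFSET,
                 instr.rs2 + DUPLICATE_OFFSET)


def qed_transform(originals: List[Instr]) -> List[Instr]:
    """Emit the originals followed by their duplicates, each group in its original order."""
    return list(originals) + [to_duplicate(instr) for instr in originals]
```

  • With this mapping, `add R1, R2, R3` is duplicated as `add R17, R18, R19`, so a correct design leaves R1 equal to R17 once both instructions have executed from equal starting states.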
  • Figure 9 is an example process 900 of pre-silicon verification and debugging implemented based on output values of instructions, in accordance with some embodiments.
  • a sequence of first instructions 702 are duplicated to and combined with a sequence of second instructions 704.
  • the first and second instructions 702 and 704 are executed to generate a plurality of first output values 902 (e.g., a value stored in register R2) and a plurality of second output values 904 (e.g., a value stored in register R2’), respectively.
  • a register file 310 includes a first set of registers 310A and a second set of registers 310B that is distinct from the first set of registers 310A.
  • the first and second sets of registers 310A and 310B include the same number of registers 708 for storing the first or second output values 902 and 904, respectively.
  • Each register 708 has a predefined number of bits (e.g., 32 bits).
  • the first and second output values 902 and 904 generated from the first and second instructions 702 and 704 are stored in the first and second sets of registers 310A and 310B of the register file 310 according to the same order, respectively.
  • two output values 902 and 904 are generated from first and second instructions 702 and 704 that correspond to each other, and stored in two distinct registers 708A and 708B located in the first and second set of registers 310A and 310B, respectively.
  • the QED module 306 compares the output values stored in the first and second sets of registers 310A and 310B to determine whether the sequence of first instructions 702 are properly implemented by the digital hardware system. Particularly, only a subset of each output value (i.e., a first number of fixed bits 906 in a respective output value) is used for comparison. That said, for each respective first output value 902 (e.g., stored in register R2) of the sequence of first instructions 702, the QED module 306 identifies a respective second output value 904 (e.g., R2’) in register R2’ of the second set of registers 310B.
  • the respective second output value 904 corresponds to a second instruction (e.g., instruction D’ in Figure 7) that is duplicated from a first instruction (e.g., instruction D in Figure 7) corresponding to the respective first output value 902 (e.g., stored in register R2).
  • Each of the first number of fixed bits 906A in the respective first output value 902 is compared with a respective one of the first number of fixed bits 906B in the respective second output value 904.
  • each output value R2 or R2’ has 32 bits
  • the last 7 bits of the output values stored in registers 708A and 708B (R2 and R2’) are compared.
  • each register 708 of the register file 310 is an integer register, and the first number is equal to 1.
  • the subset of the respective first output value includes a least significant bit (LSB). Only the LSBs of the output values stored in registers R2 and R2’ are compared.
  • each register 708 of the register file 310 is a floating point register.
  • the predefined number of bits include one or more mantissa bits, one or more exponent bits, and a sign bit.
  • the one or more mantissa bits include the LSB of the respective output value.
  • the subset of the respective first output value 902 always includes the sign bit. Further, in some situations, the subset of the respective first output value 902 has two bits including the sign bit and one of the exponent bit(s). Further, in some situations, the subset of the respective first output value 902 has three bits including the sign bit, one of the exponent bit(s), and the LSB of the one or more mantissa bits.
  • the first number is equal to 4 or above, and for each first output value 902, the subset of the respective first output value 902 includes the sign bit, at least one mantissa bit, and at least one exponent bit.
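  • The bit selections for floating point registers can be sketched as masks over a 32-bit value. The sketch below assumes the IEEE 754 single-precision layout (sign in bit 31, exponent in bits 23 to 30, mantissa in bits 0 to 22), which is not specified in the text above; the constant and function names are likewise illustrative.

```python
import struct

SIGN_BIT = 1 << 31      # sign bit (assumed IEEE 754 single-precision layout)
EXPONENT_LSB = 1 << 23  # lowest exponent bit
MANTISSA_LSB = 1 << 0   # lowest mantissa bit, which is also the LSB of the value

SIGN_ONLY = SIGN_BIT
SIGN_AND_EXPONENT = SIGN_BIT | EXPONENT_LSB
SIGN_EXPONENT_MANTISSA = SIGN_BIT | EXPONENT_LSB | MANTISSA_LSB


def float_bits(value: float) -> int:
    """Return the 32-bit pattern of a value encoded as a single-precision float."""
    return struct.unpack(">I", struct.pack(">f", value))[0]


def selected_bits_equal(first: float, second: float, mask: int) -> bool:
    """Compare only the masked bits of two floating point output values."""
    return (float_bits(first) & mask) == (float_bits(second) & mask)
```

  • Under these assumptions, 1.0 and -1.0 differ in the sign bit and are flagged by any of the three masks, whereas 1.0 and 1.5 differ only in an upper mantissa bit and pass all three.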
  • a processor 102 includes a plurality of register files 310.
  • the processor 102 implements RISC-V floating-point instructions.
  • the register files 310 include a first register file that stores output values of the floating-point instructions, and a second register file that stores integer values. Further, in some situations, the processor 102 also implements RISC-V vector instructions, and further includes a third register file.
  • the QED module 306 determines that the sequence of first instructions 702 are properly implemented by the digital hardware system, i.e., the design of the digital hardware system is correct. Alternatively, in some situations, in accordance with a determination that the subset 906A of a first output value 902 of one of the sequence of first instructions 702 is distinct from the subset 906B of a corresponding second output value 904, the QED module 306 detects an error with implementation of the sequence of first instructions 702.
  • the corresponding second output value 904 is generated by a corresponding one of the sequence of second instructions 704 that is duplicated from the one of the sequence of first instructions 702.
  • the QED module 306 detects a mismatch 908 in at least one of the first number of fixed bits 906A in the first output value generated by the sequence of first instructions 702 and a respective one of the first number of fixed bits 906B in a corresponding second output value generated by the sequence of second instructions 704.
  • the QED module 306 determines that the sequence of first instructions 702 are improperly implemented by the digital hardware system and aborts comparing any remaining bits in this mismatching first output value with any remaining bits in the respective second output value or comparing any remaining first output values with any respective second output values.
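  • The abort-on-first-mismatch behavior can be sketched as an early return from a nested loop over register pairs and selected bit positions. The function name `first_mismatch` and its return convention (a pair index and a bit position, or `None`) are assumptions for illustration.

```python
from typing import Optional, Sequence, Tuple


def first_mismatch(first_values: Sequence[int],
                   second_values: Sequence[int],
                   bit_positions: Sequence[int]) -> Optional[Tuple[int, int]]:
    """Return (pair index, bit position) of the first mismatching selected bit, else None.

    The early return means no remaining bits and no remaining output
    values are compared once a mismatch has been found.
    """
    for index, (first, second) in enumerate(zip(first_values, second_values)):
        for bit in bit_positions:
            if (first >> bit) & 1 != (second >> bit) & 1:
                return index, bit
    return None
```

  • A result of `None` corresponds to the case in which every selected bit matches, and any other result corresponds to a detected mismatch 908.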
  • the register file 310 has 16 32-bit registers (indexed as 0 to 15), and is divided such that the first set of registers 310A includes registers 0 to 7 for storing first output values generated from the original instructions 702 and the second set of registers 310B includes registers 8 to 15 for storing second output values generated from the duplicate instructions 704.
  • Each i-th register of the first set of registers 310A is uniquely paired with a respective (i+8)-th register of the second set of registers 310B, wherein i is from 0 to 7.
  • the QED module 306 tracks execution of the instructions 702 and 704, and asserts a signal eddi ready in accordance with a determination that the original instructions 702 and corresponding duplicate instructions 704 are executed.
  • An EDDI property refers to an output property of corresponding output values in the sets of registers 310A and 310B being matched, and is used for error detection by duplicate instructions (EDDI). Each output value is stored in a register 708 of a register file 310.
  • Each first output value 902 stored in a register 708 of the first set of registers 310A of the register file 310 is compared with a corresponding second output value 904 stored in a corresponding register 708 of the second set of registers 310B of the register file 310, e.g., when the signal eddi ready is asserted.
  • such a comparison is implemented on an entire width of the registers 708, e.g., 32 bits.
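  • A Python model of the EDDI property for the 16-register example is sketched below. It pairs register i with register i + 8, which follows from registers 0 to 7 and registers 8 to 15 forming the two sets, and treats the property as trivially satisfied until the ready signal is asserted. The function name `eddi_holds` and the boolean `eddi_ready` parameter are illustrative stand-ins, not identifiers from the design.

```python
from typing import Sequence

REGISTER_WIDTH = 32
HALF = 8  # registers 0..7 hold original results, registers 8..15 hold duplicate results
FULL_WIDTH_MASK = (1 << REGISTER_WIDTH) - 1


def eddi_holds(register_file: Sequence[int],
               eddi_ready: bool,
               mask: int = FULL_WIDTH_MASK) -> bool:
    """Check that each register i matches register i + 8 on the masked bits.

    The check only applies once eddi_ready is asserted, i.e. once the
    originals and their duplicates have all executed.
    """
    if not eddi_ready:
        return True
    return all((register_file[i] & mask) == (register_file[i + HALF] & mask)
               for i in range(HALF))
```

  • Passing a narrower `mask` (e.g., `0x1`) turns the same check into the reduced comparison on a subset of bits.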
  • the DUT 308 has bugs that affect only certain bits of the registers that do not overlap the selected subset 906, and the bugs go undetected by the QED module 306. Such bugs whose manifestations affect only certain bits of the registers tend to be single instruction bugs, rather than sequence-dependent bugs.
  • the single instruction bugs are not main targets of SQED in which the EDDI-V properties are checked.
  • SQED has an extension, si-SQED, which is relied upon to directly address the single instruction bugs.
  • QEDDI is applied as the first line of attack, i.e., the QEDDI variant of EDDI-V property is executed first to detect and fix one or more bugs in the DUT 308.
  • a full-width EDDI-V is executed to detect any remaining bugs that QEDDI has missed.
  • the full-width EDDI-V or EDDI-V of remaining bits is executed, in accordance with a determination that QEDDI does not identify any bug.
  • the full-width EDDI-V or EDDI-V of remaining bits is executed for thoroughness regardless of whether QEDDI identifies any bugs, i.e., even after QEDDI identifies some bugs.
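  • The staged flow described above can be sketched as two passes over the register pairs: a quick pass on the selected bits, then a pass on the remaining bits that runs either only when the quick pass finds nothing or unconditionally. The function `staged_check` and its `always_run_full` flag are assumptions introduced for this example.

```python
from typing import List, Sequence, Tuple


def staged_check(pairs: Sequence[Tuple[int, int]],
                 quick_mask: int,
                 width: int = 32,
                 always_run_full: bool = False) -> Tuple[List[int], List[int]]:
    """Return (pairs failing the quick check, pairs failing on the remaining bits)."""
    remaining_mask = ((1 << width) - 1) & ~quick_mask
    # Stage 1: compare only the selected subset of bits.
    quick_failures = [i for i, (a, b) in enumerate(pairs) if (a ^ b) & quick_mask]
    remaining_failures: List[int] = []
    # Stage 2: compare the remaining bits, if requested or if stage 1 was clean.
    if always_run_full or not quick_failures:
        remaining_failures = [i for i, (a, b) in enumerate(pairs)
                              if (a ^ b) & remaining_mask]
    return quick_failures, remaining_failures
```

  • In practice the second stage would follow a fix-and-rerun cycle for any bugs found by the first stage; the sketch only models the order in which the comparisons are made.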
  • a magnitude of the first number and/or locations of the first number of bits 906 selected in the output values determine an error detection capability of QEDDI (e.g., including an error detection speed). It is a tradeoff that verification engineers should make during the course of setting up SQED EDDI-V properties and QEDDI variations based on the DUT 308 and the register file 310.
  • two distinct register files 310 include integer registers and floating-point registers, respectively. Comparing the LSBs is a good choice for the register file 310 having integer registers.
  • register bits are split among mantissa, exponent, and sign bits.
  • the mantissa bits include the LSB, and the LSB is selected for comparison.
  • the sign and exponent bits are not the LSB; however, they are selected for comparison to improve a bug detection capability.
  • the LSB, the sign bit, and one or more exponent bits are selected for comparison.
  • Figure 10A is a data structure 1000 of a portion of a register file 310 in which all values stored in the registers 708 are initialized to 0, in accordance with some embodiments
  • Figures 10B-10E are four example data structures 1010, 1020, 1030, and 1040 of a portion of a register file 310 in which a subset of (not all) registers are set to symbolic initial states (SIS, i.e., represented by “x” in Figures 10B-10E), in accordance with some embodiments.
  • a register file 310 is configured to store output values of a sequence of first instructions 702.
  • Each register 708 in the register file 310 has a predefined number of bits (e.g., 32 bits).
  • the sequence of first instructions 702 are duplicated to a sequence of second instructions 704.
  • Output values of the sequence of first instructions 702 and the sequence of second instructions 704 are stored according to the same order in a first set of registers 310A and a second set of registers 310B of the register file 310, respectively.
  • the first and second sets of registers 310A and 310B do not share any register 708, and are either immediately adjacent to or separate from each other.
  • Each register 708 in the register file 310 is optionally initialized with an initial value (e.g., 0 or 1) or set to an SIS.
  • the initial value is defined in an “assume property” statement in SystemVerilog.
  • the register 708 when a register 708 is not defined with any initial value, the register 708 is automatically set to the SIS.
  • the register 708 is set to the SIS, when no initial value of 0 or 1 is assigned to the value stored in the register 708, and the register 708 has no constraint.
  • a formal property checker tool is allowed to assign any initial value (0 or 1) to the register 708 that is set to the SIS.
  • the SIS corresponds to a symbolic value indicating that the formal property checker tool may try all possible values for the symbolic value in an evaluation process.
  • each and every bit of the first and second sets of registers 310A and 310B are initialized to a respective predefined value of 0 or 1 in a duplicated manner. Referring to Figure 10A, all bits in the register file 310 are initialized to 0.
  • not all of the bits in the register file 310 are initialized to 0 or 1. Only a subset (less than all bits) of the registers 708 of the register file 310 are assigned with 0, e.g., using the “assume property” statements in SystemVerilog. Bits that are left out in the “assume property” statements are not assigned with any initial value, and become symbolic (i.e., are set to SIS).
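  • To make the notion of a symbolic bit concrete, the Python sketch below enumerates every initial value that a checker would be free to consider for one register when some bits are constrained to 0 and the rest are left unconstrained. A real formal tool reasons about these values symbolically rather than by enumeration; the function name `candidate_initial_values` and the explicit enumeration are assumptions for illustration.

```python
from itertools import product
from typing import Iterator, Set


def candidate_initial_values(width: int, zero_bits: Set[int]) -> Iterator[int]:
    """Yield every value of a `width`-bit register whose bits in `zero_bits` are 0.

    Bits not listed in `zero_bits` carry no constraint, so they are
    symbolic and may take either value.
    """
    symbolic_bits = [bit for bit in range(width) if bit not in zero_bits]
    for assignment in product((0, 1), repeat=len(symbolic_bits)):
        value = 0
        for bit, bit_value in zip(symbolic_bits, assignment):
            value |= bit_value << bit
        yield value
```

  • Constraining every bit yields the single value 0, as in Figure 10A, while leaving k bits unconstrained yields 2^k candidate values.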
  • the initial states of the register file 310 and register contents are a subset of constraints to be set properly for SQED runs.
  • a DUT state needs to be initialized to an EDDI-V consistent state, i.e., with the two half spaces (i.e., the sets of registers 310A and 310B of the register file 310) initialized to equal states.
  • the following “assume property” statement enforces the above equal state constraints for the first and second sets of registers 310A and 310B of the register file 310 (a similar constraint can be written for memory halves).
  • all bits of the registers 708 of the register file 310 are initialized to 0.
  • all bits of the registers 708 of the register file 310 are initialized to 1.
  • each of the registers 708 of the register file 310 is initialized to 1 or 0, not necessarily all to 1 and all to 0. All bits are initialized, and no bit is left undefined. Conversely, in some embodiments, no bit of the register file is initialized to 1 or 0. All bits of the registers 708 of the register file 310 start with a symbolic initial state (SIS), i.e., they could be any value.
  • the QED module 306 enforces that equal symbolic state is applied to each register pair, i.e., any two registers located in the same locations of the first and second sets of registers 310A and 310B of the register file 310.
  • the following instructions use these non-zero values stored in these registers 708 as source operands, start to have non-zero inputs, and allow additional bug hunting to happen during deeper clock cycles.
  • the non-zero values are available in the register file 310 during the initial clock cycles, and bugs could be detected within the initial clock cycles without waiting to enter the deeper clock cycles.
  • a QED module 306 builds a search tree for formal property check (i.e., bounded model check).
  • a size of a root of this search tree depends on a size of a state space in the initial clock cycles. Each level of the tree grows exponentially in size by how many possible input combinations the QED module 306 can apply to the inputs in each cycle.
  • the root includes all possible states (i.e., initial values) of the register file 310 and memory, while with all 0s initial state, the root is many orders of magnitude smaller.
  • the QED module 306 that uses a breadth-first search algorithm can get to deeper clock cycles faster when it starts with a smaller root (having all 0 initial values in the register file 310) compared to a larger root (having at least a subset of the register file 310 set to SIS).
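  • A back-of-the-envelope model of the search tree size is sketched below. It assumes that only the register file contributes to the root (memory and other state are ignored), that each register pair shares one symbolic value so its symbolic bits are counted once, and that every node has the same number of children per cycle. These simplifications and the function names are assumptions for this example.

```python
def root_state_count(num_register_pairs: int, symbolic_bits_per_register: int) -> int:
    """Number of distinct initial register-file states under the duplicated constraint."""
    return 2 ** (num_register_pairs * symbolic_bits_per_register)


def level_size(root_states: int, input_combinations_per_cycle: int, depth: int) -> int:
    """Number of nodes at a given clock-cycle depth, assuming uniform branching."""
    return root_states * input_combinations_per_cycle ** depth
```

  • For 8 pairs of 32-bit registers, an all-0s start gives a root of 1 state, an LSB-only SIS gives 2^8 states, and a fully symbolic start gives 2^256 states, which illustrates why a breadth-first search reaches deeper cycles sooner from the smaller root.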
  • bugs are activated to cause failure (e.g., incorrect value in register file or memory) in a short instruction sequence, and get detected faster with SQED when the register file 310 is initialized with SIS.
  • bugs are activated to cause failure in longer instruction sequences (e.g., deeper clock cycles), and get detected faster with SQED when all of the register file 310 is initialized with 0.
  • SQED verification steps are optionally set up to do some runs with SIS followed by runs with all 0 initial values in the first and second sets of registers 310A and 310B of the register file 310.
  • selective SIS is applied to allow SIS for a subset of registers or register bits while starting the rest of the registers or bits from all 0s. This keeps the exponential size growth of the root of the search tree under control, while allowing faster detection of bugs that need longer instruction sequences for detection.
  • Selective SIS is implemented in at least one of two independent dimensions, within register bits and/or among registers.
  • the QED module 306 initializes each and every bit of a subset of registers 708 (e.g., 708C and 708D) in the first set of registers 310A of the register file 310 and a corresponding subset of registers 708 (e.g., 708C’ and 708D’) in the second set of registers 310B of the register file 310 to a respective SIS.
  • the QED module 306 also initializes each of a remaining subset of registers 708 (e.g., 708E and 708F) in the first set of registers 310A of the register file 310 and a corresponding remaining subset of registers 708 (e.g., 708E’ and 708F’) in the second set of registers 310B of the register file 310 to 0. Further, in some embodiments, each bit of a single register of the first set of registers 310A of the register file 310 and a corresponding single register of the second set of registers 310B of the register file 310 are set to the SIS.
  • each bit of a subset of the predefined number of bits in each of a subset of registers of the register file is initialized to a respective SIS.
  • the subset of registers include registers 708C, 708D and 708E of the first set of registers 310A of the register file 310
  • the subset of the predefined number of bits include two or more bits, i.e., bits 5 and 27 of bits 0-31 in registers 708C and 708D, bits 1-3 and 20 in register 708E.
  • the corresponding bits of the second set of registers 310B of the register file 310 are also initialized to the SIS.
  • if there is any remaining bit in the predefined number of bits in each of the subset of registers, the QED module 306 initializes a remaining subset of the predefined number of bits in each of the subset of registers (e.g., 708C-708E) to 0. In some embodiments, if there is any remaining register (e.g., 708F) in the first set of registers 310A of the register file 310, the QED module 306 initializes all bits of the remaining subset of registers to 0.
  • the QED module 306 initializes a least significant bit (LSB) of each register of the register file to an SIS.
  • the QED module 306 initializes each of a subset of fixed bits (bits 4, 5, 16, 27) of each of a subset of registers 708 (e.g., 708C, 708D, 708E) in the first set of registers 310A of the register file 310 and a corresponding subset of fixed bits (bits 4, 5, 16, 27) of each of a corresponding subset of registers 708 (e.g., 708C’, 708D’, 708E’) in the second set of registers 310B of the register file 310 to a respective SIS.
  • the QED module 306 initializes each remaining bit of the first set of registers 310A of the register file 310 and a corresponding remaining bit of the second set of registers 310B of the register file 310 to 0 or 1 in a duplicated manner.
  • each of a subset of bits of the first set of registers 310A of the register file 310 and a corresponding subset of bits of the second set of registers 310B of the register file 310 is initialized to a respective SIS in a duplicated manner.
  • the subset of bits of the first set of registers 310A are less than all bits of the first set of registers 310A.
  • each of the subset of bits is not initialized to a predefined value (e.g., 0 or 1) in an “assume property” statement, and automatically set to the respective SIS.
  • any bit that is initialized with 0 can be initialized to 1, except that bits in the first and second sets of registers 310A and 310B are initialized to 0 or 1 in a duplicated manner (i.e., each bit in the first set of registers 310A and a corresponding bit in the second set of registers 310B are jointly set to the same value of 0 or 1).
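  • The two dimensions of selective SIS (which registers, and which bits within them) can be sketched as a per-register bit mask, from which duplicated initial states are generated. In the sketch below, registers are identified by their index within the first set, non-symbolic bits start at 0, and the second set is an exact copy of the first; the function names and the explicit enumeration are assumptions for illustration.

```python
from itertools import product
from typing import Dict, Iterator, List


def build_sis_masks(num_registers: int, sis_bits: Dict[int, List[int]]) -> List[int]:
    """Build one mask per register of the first set; set bits are symbolic."""
    masks = [0] * num_registers
    for register, bits in sis_bits.items():
        for bit in bits:
            masks[register] |= 1 << bit
    return masks


def duplicated_initial_states(masks: List[int]) -> Iterator[List[int]]:
    """Yield register-file images (first set, then second set) with equal halves."""
    positions = [(register, bit)
                 for register, mask in enumerate(masks)
                 for bit in range(mask.bit_length())
                 if (mask >> bit) & 1]
    for assignment in product((0, 1), repeat=len(positions)):
        half = [0] * len(masks)
        for (register, bit), bit_value in zip(positions, assignment):
            half[register] |= bit_value << bit
        yield half + half  # the second set mirrors the first set
```

  • For example, marking bits 5 and 27 of two registers and bits 1 to 3 and 20 of a third register as symbolic, with a fourth register left at 0, gives 8 symbolic bits and therefore 2^8 duplicated initial states.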
  • the QED module 306 duplicates original instructions into duplicate instructions, and stores output values of the original and duplicate instructions into two separate sets of memory units of the memory array 312 of the DUT memory 304. When the output values are stored in the memory array 312, the QED module 306 determines whether the instructions are properly implemented by the digital hardware system based on a subset of each output value (e.g., a least significant bit (LSB) of each output value), which is stored in the memory array 312.
  • the QED module 306 initializes a subset of a first set of memory units of the memory array 312 and a corresponding subset of a second set of memory units of the memory array 312 to a respective SIS in a duplicated manner, e.g., initializes the LSB of each memory unit of the memory array 312 to the respective SIS.
  • The initialization schemes of Figures 10A-10E are similarly applicable to the first set of memory units of the memory array 312 and the second set of memory units of the memory array 312.
  • Figure 11 is a flow diagram of a method 1100 for pre-silicon verification and debugging of a digital hardware system (e.g., a processor 102), in accordance with some embodiments.
  • the method 1100 is described as being implemented by a computer system 150 (specifically, a QED module 306 of the computer system 150).
  • the method 1100 is, optionally, governed by instructions that are stored in a non-transitory computer readable storage medium and that are executed by one or more processors of the computer system.
  • Each of the operations shown in Figure 11 may correspond to instructions stored in a computer memory or non-transitory computer readable storage medium (e.g., memory 154 of the computer system 150 in Figures 1B and 6).
  • the computer readable storage medium may include a magnetic or optical disk storage device, solid state storage devices such as Flash memory, or other non-volatile memory device or devices.
  • the instructions stored on the computer readable storage medium may include one or more of: source code, assembly language code, object code, or other instruction format that is interpreted by one or more processors. Some operations in the method 1100 may be combined and/or the order of some operations may be changed.
  • the computer system 150 executes (1102) formal analysis of the digital hardware system including a register file, and the formal analysis includes a sequence of first instructions 702.
  • the sequence of first instructions 702 are duplicated (1104) to a sequence of second instructions 704.
  • the computer system 150 implements (1106) the sequence of first instructions 702 and the sequence of second instructions 704, and stores (1108) output values of the sequences of first and second instructions 702 and 704 into a first set of registers 310A and a second set of registers 310B of the register file 310, respectively.
  • the first and second sets of registers 310A and 310B are entirely distinct from each other, and do not overlap at all.
  • the first and second sets of registers have the same number of registers.
  • Each output value is (1112) stored in a respective register 708 of the register file 310 and has a predefined number of bits (e.g., 32 bits).
  • the computer system 150 determines (1114) whether the sequence of first instructions 702 are properly implemented by the digital hardware system based on a subset of each output value stored in the register file 310.
  • the subset of each output value includes (1116) a first number of fixed bits (e.g., an LSB) 906 in a respective output value, and the first number is less than the predefined number.
  • the computer system 150 identifies a respective second output value 904 in the second set of registers 310B.
  • the respective second output value 904 corresponds to a second instruction 704 that is duplicated from a first instruction 702 corresponding to the respective first output value 902.
  • Each of the first number of fixed bits 906A in the respective first output value 902 is compared with a respective one of the first number of fixed bits 906B in the respective second output value 904.
  • the computer system 150 determines (1118) that the sequence of first instructions 702 are properly implemented by the digital hardware system.
  • the computer system 150 detects a mismatch in at least one of the first number of fixed bits 906A in the first output value 902 and a respective one of the first number of fixed bits 906B in a corresponding second output value 904 of the sequence of second instructions 704. In accordance with a determination that the sequence of first instructions 702 are improperly implemented by the digital hardware system, the computer system aborts comparing any remaining bit of the first output value 902 with any respective remaining bit of the second output value 904.
  • the computer system 150 detects (1120) an error with implementation of the sequence of first instructions 702.
  • the corresponding second output value 904 is generated by a corresponding one of the sequence of second instructions 704 that is duplicated from the one of the sequence of first instructions 702.
  • each register 708 of the register file 310 is an integer register. The first number is equal to 1.
  • the subset of the respective first output value 902 includes a least significant bit (LSB).
  • each register 708 of the register file 310 is a floating point register.
  • the predefined number of bits include one or more mantissa bits, one or more exponent bits, and a sign bit.
  • the one or more mantissa bits include the LSB of the respective output value.
  • the subset of the respective first output value 902 always includes the sign bit.
  • the subset of the respective first output value 902 has two bits including the sign bit and one of the exponent bit(s). Further, in some situations, the subset of the respective first output value 902 has three bits including the sign bit, one of the exponent bit(s), and the LSB of the one or more mantissa bits. Alternatively, in some situations, the first number is equal to 4 or above, and for each first output value 902, the subset of the respective first output value 902 includes the sign bit, at least one mantissa bit, and at least one exponent bit.
  • the subset of each output value includes a first subset and each output value includes a second subset that is supplemental to the first subset.
  • the computer system 150 determines whether the sequence of first instructions 702 are properly implemented by the digital hardware system based on at least the second subset of each output value. In some situations, if the first subset of each output value is correct, the computer system 150 continues to check all bits of each output value.
  • the first set of registers 310A and the second set of registers 310B of the register file 310 are initialized in a duplicated manner, such that the first set of registers 310A and the second set of registers 310B have equal initial states to be used to implement the sequence of first instructions 702 and the sequence of second instructions 704.
  • each and every bit of the first and second sets of registers 310A and 310B is initialized to a respective predefined value of 0 or 1 in a duplicated manner. For example, all bits of the first and second sets of registers 310A and 310B of the register file 310 are initialized to 0.
  • each and every bit of the first and second sets of registers 310A and 310B of the register file 310 is initialized to a respective symbolic initial state (SIS).
  • the computer system 150 initializes each and every bit of a subset of registers in the first set of registers 310A of the register file 310 and a corresponding subset of registers in the second set of registers 310B of the register file 310 to a respective SIS, and initializes each and every bit of a remaining subset of registers in the first set of registers 310A of the register file 310 and a corresponding remaining subset of registers in the second set of registers 310B of the register file 310 to a respective predefined value of 0 or 1. Further, in some embodiments, each and every bit of a single register of the first set of registers 310A of the register file 310 and a corresponding single register of the second set of registers 310B of the register file 310 are initialized to an SIS.
  • the computer system 150 initializes each bit of a subset of the predefined number of bits in each of a subset of registers of the register file 310 to a respective SIS, and initializes a remaining subset of the predefined number of bits in each of the subset of registers of the register file 310 to 0.
  • the computer system initializes a least significant bit (LSB) of each register of the register file 310 to a SIS.
  • the computer system 150 initializes each of a subset of fixed bits of each of a subset of registers in the first set of registers 310A of the register file 310 and a corresponding subset of fixed bits of each of a corresponding subset of registers in the second set of registers 310B of the register file 310 to a respective SIS.
  • the computer system 150 also initializes each remaining bit of the first set of registers 310A of the register file 310 and a corresponding remaining bit of the second set of registers 310B of the register file 310 to 0 or 1 in a duplicated manner.
  • an LSB of a first register of the first set of registers 310A and an LSB of a corresponding register of the second set of registers 310B of the register file 310 are initialized to respective SIS in a duplicated manner.
  • the sequences of first and second instructions 702 and 704 are interleaved into an interleaved sequence of instructions 706 without varying internal orders of the sequences of first and second instructions 702 and 704.
  • the sequence of first instructions 702 and the sequence of second instructions 704 are implemented sequentially according to an order of instructions in the interleaved sequence of instructions 706.
  • Figure 12 is a flow diagram of another method 1200 for pre-silicon verification and debugging of a digital hardware system, in accordance with some embodiments.
  • the method 1200 is described as being implemented by a computer system 150 (e.g., a QED module 306).
  • the method 1200 is, optionally, governed by instructions that are stored in a non-transitory computer readable storage medium and that are executed by one or more processors of the computer system.
  • Each of the operations shown in Figure 12 may correspond to instructions stored in a computer memory or non-transitory computer readable storage medium (e.g., memory 154 of the computer system 150 in Figures 1B and 6).
  • the computer readable storage medium may include a magnetic or optical disk storage device, solid state storage devices such as Flash memory, or other nonvolatile memory device or devices.
  • the instructions stored on the computer readable storage medium may include one or more of: source code, assembly language code, object code, or other instruction format that is interpreted by one or more processors. Some operations in the method 1200 may be combined and/or the order of some operations may be changed.
  • the computer system 150 executes (1202) formal analysis of the digital hardware system, and the formal analysis includes a sequence of first instructions.
  • the sequence of first instructions is duplicated (1204) to a sequence of second instructions.
  • the computer system 150 implements (1206) the sequence of first instructions and the sequence of second instructions, and stores (1208) output values of the sequences of first and second instructions into a first set of registers 310A and a second set of registers 310B of the register file 310, respectively.
  • the first and second sets of registers are distinct from each other, and have the same number of registers.
  • Each output value is stored (1212) in a respective register of the register file and has a predefined number of bits.
  • the computer system 150 determines (1214) whether the sequence of first instructions are properly implemented by the digital hardware system based on the output values stored in the register file.
  • Each of a subset of bits of the first set of registers 310A of the register file and a corresponding subset of bits of the second set of registers 310B of the register file 310 is (1216) initialized to a respective SIS in a duplicated manner.
  • the subset of bits of the first set of registers 310A are less than (1218) all bits of the first set of registers 310A.
  • the computer system 150 initializes each and every bit of a subset of registers in the first set of registers 310A of the register file 310 and a corresponding subset of registers in the second set of registers 310B of the register file 310 to a respective SIS, and initializes each and every bit of a remaining subset of registers in the first set of registers 310A of the register file 310 and a corresponding remaining subset of registers in the second set of registers 310B of the register file 310 to 0. Further, in some embodiments, each and every bit of a single register of the first set of registers 310A of the register file 310 and a corresponding single register of the second set of registers 310B of the register file 310 are initialized to an SIS.
  • the computer system 150 initializes each bit of a subset of the predefined number of bits in each of a subset of registers of the register file 310 to a respective SIS, and initializes a remaining subset of the predefined number of bits in each of the subset of registers of the register file 310 to 0.
  • the computer system initializes a least significant bit (LSB) of each register of the register file 310 to a SIS.
  • the computer system 150 initializes each of a subset of fixed bits of each of a subset of registers in the first set of registers 310A of the register file 310 and a corresponding subset of fixed bits of each of a corresponding subset of registers in the second set of registers 310B of the register file 310 to a respective SIS.
  • the computer system 150 also initializes each remaining bit of the first set of registers 310A of the register file 310 and a corresponding remaining bit of the second set of registers 310B of the register file 310 to 0 or 1 in a duplicated manner.
  • an LSB of a first register of the first set of registers 310A and an LSB of a corresponding register of the second set of registers 310B of the register file 310 are initialized to respective SIS in a duplicated manner.
  • only a subset of each output value stored in the register file 310 is applied to determine whether the sequence of first instructions 702 are properly implemented by the digital hardware system.
  • the subset of each output value includes a first number of fixed bits in a respective output value, and the first number is less than the predefined number.
  • the QED module 306 stores output values of the original and duplicate instructions into two separate sets of memory units of the memory array 312 of the DUT memory 304.
  • the computer system 150 executes formal analysis of the digital hardware system, and the formal analysis includes a sequence of first instructions 702.
  • the sequence of first instructions 702 are duplicated to a sequence of second instructions 704.
  • the computer system 150 implements the sequence of first instructions 702 and the sequence of second instructions 704, and stores output values of the sequences of first and second instructions 702 and 704 into a first set of memory units and a second set of memory units of the memory array 312, respectively.
  • the first and second sets of memory units of the memory array 312 are entirely distinct from each other, and do not overlap at all.
  • the first and second sets of memory units of the memory array 312 have the same number of memory units.
  • Each output value is stored in a memory unit of the memory array 312 and has a predefined number of bits (e.g., 32 bits).
  • the computer system 150 determines whether the sequence of first instructions 702 are properly implemented by the digital hardware system based on a subset of each output value stored in the memory array 312.
  • the subset of each output value includes a first number of fixed bits (e.g., a LSB) 906 in a respective output value, and the first number is less than the predefined number.
  • the computer system 150 executes formal analysis of the digital hardware system, and the formal analysis includes a sequence of first instructions.
  • the sequence of first instructions is duplicated to a sequence of second instructions.
  • the computer system 150 implements the sequence of first instructions and the sequence of second instructions, and stores output values of the sequences of first and second instructions into a first set of memory units and a second set of memory units of the memory array 312, respectively.
  • the first and second sets of memory units are distinct from each other, and have the same number of memory units.
  • Each output value is stored in a respective memory unit of the memory array 312 and has a predefined number of bits.
  • the computer system 150 determines whether the sequence of first instructions are properly implemented by the digital hardware system based on the output values stored in the memory array 312.
  • Each of a subset of bits of the first set of memory units of the memory array 312 and a corresponding subset of bits of the second set of memory units of the memory array 312 is initialized to a respective SIS in a duplicated manner.
  • the subset of bits of the first set of memory units are less than all bits of the first set of memory units.
  • the term “if” is, optionally, construed to mean “when” or “upon” or “in response to determining” or “in response to detecting” or “in accordance with a determination that,” depending on the context.
  • the phrase “if it is determined” or “if [a stated condition or event] is detected” is, optionally, construed to mean “upon determining” or “in response to determining” or “upon detecting [the stated condition or event]” or “in response to detecting [the stated condition or event]” or “in accordance with a determination that [a stated condition or event] is detected,” depending on the context.
  • stages that are not order dependent may be reordered and other stages may be combined or broken out. While some reordering or other groupings are specifically mentioned, others will be obvious to those of ordinary skill in the art, so the ordering and groupings presented herein are not an exhaustive list of alternatives. Moreover, it should be recognized that the stages can be implemented in hardware, firmware, software or any combination thereof.

Abstract

This application is directed to pre-silicon verification and debugging of a digital hardware system. A computer device executes formal analysis of the digital hardware system having a register file by a sequence of first instructions that are duplicated to a sequence of second instructions. The computer device implements both the first and second instructions and stores output values of the first and second instructions into a first set of registers and a second set of registers of the register file, respectively. Each output value has a predefined number of bits. The computer device determines whether the instructions are properly implemented by the digital hardware system based on a subset of each output value stored in the register file. The subset of each output value includes a first number of fixed bits in a respective output value, and the first number is less than the predefined number.

Description

Quick Error Detection and Symbolic Initial States in Pre-Silicon Verification and Debugging
TECHNICAL FIELD
[0001] This application relates generally to integrated circuits (ICs), particularly to methods, systems, and non-transitory computer-readable media for effective pre-silicon verification and debugging of integrated circuits (e.g., a multicore processor system).
BACKGROUND
[0002] An electronic device oftentimes integrates a system on a chip (SoC) with a power management integrated circuit (PMIC), communication ports, external memory or storage, and other peripheral function modules on a main logic board. The SoC includes one or more microprocessor or central processing unit (CPU) cores, memory, input/output ports, and secondary storage in a single package. Quick Error Detection (QED) techniques are applied for pre-silicon verification and debugging and for post-silicon validation and diagnostics of integrated circuits in different portions of the electronic device (e.g., the SoC). Effectiveness of these techniques has been demonstrated on both open-source designs and commercial designs. It would be beneficial to improve the QED techniques to provide more effective and efficient solutions for silicon verification, debugging, validation, and diagnostics, particularly when the integrated circuits and systems become more and more complex in functionality.
SUMMARY
[0003] Various embodiments of this application are directed to symbolic QED (SQED), which is used in pre-silicon verification and debugging. SQED is based on formal property checkers (also called bounded model checkers (BMCs)) that use mathematical methods to algorithmically verify functions of integrated circuits and involve manually writing a large number of input constraints and output properties. SQED simplifies the input constraints and the output properties based on an instruction set architecture (ISA) of the CPU and a self-consistency property of the instructions executed on the CPU, respectively. Particularly, in this application, SQED further simplifies setting the input constraints and checking the output properties, thereby expediting pre-silicon verification and debugging. Such SQED techniques are applied to verify a single central processing unit (CPU) core and can be extended easily to check a multi-core processor system. SQED can also be used to verify and debug logic circuits that include combinational logic or sequential logic circuit elements and do not form any processor core.
[0004] In one aspect, a method is implemented at a computer device for pre-silicon verification and debugging of a digital hardware system (e.g., SoC) having a register file. The method includes executing formal analysis of the digital hardware system, and the formal analysis includes a sequence of first instructions. The method further includes duplicating the sequence of first instructions to a sequence of second instructions, implementing the sequence of first instructions and the sequence of second instructions, and storing output values of the sequences of first and second instructions into a first set of registers and a second set of registers of the register file, respectively. Each output value is stored in a respective register of the register file and has a predefined number of bits. The method further includes determining whether the sequence of first instructions are properly implemented by the digital hardware system based on a subset of each output value stored in the register file. The subset of each output value includes a first number of fixed bits in a respective output value, and the first number is less than the predefined number.
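The subset-based check described in this aspect can be illustrated with a minimal Python sketch. It assumes each register is modeled as a plain integer and that the fixed bits are selected by a bit mask (an LSB-only mask by default); the names `fixed_bits` and `first_mismatch` are hypothetical and do not come from this application.

```python
from typing import List, Optional


def fixed_bits(value: int, mask: int) -> int:
    """Keep only the fixed bit positions selected by mask (hypothetical helper)."""
    return value & mask


def first_mismatch(first_regs: List[int],
                   second_regs: List[int],
                   mask: int = 0x1) -> Optional[int]:
    """Compare only the masked bits of corresponding registers.

    Returns the index of the first register pair whose selected bits differ,
    or None if every pair agrees on the selected bits. The loop stops at the
    first mismatch, so remaining registers and bits are never compared.
    """
    if len(first_regs) != len(second_regs):
        raise ValueError("both register sets must have the same number of registers")
    for index, (a, b) in enumerate(zip(first_regs, second_regs)):
        if fixed_bits(a, mask) != fixed_bits(b, mask):
            return index  # error detected; abort further comparison
    return None
```

With the default mask, only one bit per register pair is compared instead of all 32. The cost of that shortcut is visible in the sketch: the pairs (4, 6) and (7, 5) agree on the LSB, so an LSB-only check passes them, whereas a full-width mask such as `0xFFFFFFFF` reports the difference.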
[0005] In another aspect, a method is implemented for pre-silicon verification and debugging of a digital hardware system having a register file. The method includes executing formal analysis of the digital hardware system, and the formal analysis includes a sequence of first instructions. The method further includes duplicating the sequence of first instructions to a sequence of second instructions and implementing the sequence of first instructions and the sequence of second instructions. The method further includes storing output values of the sequences of first and second instructions into a first set of registers and a second set of registers of the register file, respectively. Each output value is stored in a respective register of the register file and has a predefined number of bits. The method further includes determining whether the sequence of first instructions are properly implemented by the digital hardware system based on the output values stored in the register file. Particularly, executing formal analysis of the digital hardware system further includes initializing each of a subset of bits of the first set of registers of the register file and a corresponding subset of bits of the second set of registers of the register file to a respective SIS in a duplicated manner, and the subset of bits of the first set of registers are less than all bits of the first set of registers.
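To make the effect of a partial SIS concrete, the following Python sketch enumerates every concrete initialization permitted when only some bits are symbolic. This is an illustration under stated assumptions: a formal tool would reason about symbolic bits directly rather than enumerate them, non-symbolic bits are assumed to be 0, and `concrete_initial_states` and `sis_masks` are hypothetical names.

```python
from itertools import product
from typing import Iterator, List, Tuple


def concrete_initial_states(sis_masks: List[int],
                            width: int = 32
                            ) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield every (first_set, second_set) initialization allowed by the masks.

    sis_masks[i] marks the bits of register i that are symbolic (free).
    All other bits are initialized to 0. The second set is always a copy
    of the first set, which models initialization "in a duplicated manner".
    """
    # Collect the (register, bit) positions that are symbolic.
    positions = [(reg, bit)
                 for reg, mask in enumerate(sis_masks)
                 for bit in range(width)
                 if (mask >> bit) & 1]
    # Each symbolic bit may independently be 0 or 1.
    for assignment in product((0, 1), repeat=len(positions)):
        first = [0] * len(sis_masks)
        for (reg, bit), value in zip(positions, assignment):
            first[reg] |= value << bit
        yield first, list(first)
```

With k symbolic bits there are 2^k concrete initial states, so marking only the LSB of two registers as symbolic yields 4 states, whereas making every bit of three 32-bit registers symbolic would yield 2^96.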
[0006] In another aspect, a computer system includes one or more processors and memory having instructions stored thereon, which when executed by the one or more processors cause the processors to perform the above methods for pre-silicon verification and debugging of a digital hardware system.
[0007] In yet another aspect, a non-transitory computer-readable medium has instructions stored thereon, which when executed by one or more processors cause the processors to perform the above methods for pre-silicon verification and debugging of a digital hardware system.
[0008] These illustrative embodiments and implementations are mentioned not to limit or define the disclosure, but to provide examples to aid understanding thereof. Additional embodiments are discussed in the Detailed Description, and further description is provided there. Other implementations and advantages may be apparent to those skilled in the art in light of the descriptions and drawings in this specification.
BRIEF DESCRIPTION OF THE DRAWINGS
[0009] For a better understanding of various described implementations, reference should be made to the Detailed Description below, in conjunction with the following drawings. Like reference numerals refer to corresponding parts throughout the drawings.
[0010] Figure 1A is a block diagram of an example system module in a typical electronic device, in accordance with some embodiments.
[0011] Figure 1B is a block diagram of an example computer system that executes pre-silicon verification and debugging of a digital hardware system, in accordance with some embodiments.
[0012] Figure 2 is a block diagram of a multithreading and multi-core processor system, in accordance with some embodiments.
[0013] Figure 3 is a block diagram of a symbolic QED formal verification testbench, in accordance with some embodiments.
[0014] Figure 4 is a block diagram showing a QED component including a QED module and a fetch unit, in accordance with some embodiments.
[0015] Figure 5 is a pseudo code for a QED module, in accordance with some embodiments.
[0016] Figure 6 is a block diagram of a computer system configured to execute pre-silicon verification and debugging of a digital hardware system, in accordance with some embodiments.
[0017] Figure 7 is a data structure of an example register file configured to store output values of instructions of a sequence of instructions, in accordance with some embodiments.
[0018] Figure 8A is an example sequence of original instructions, and Figure 8B is an example sequence of executed instructions that is generated from the sequence of original instructions in Figure 8A via QED transformation implemented by a QED module, in accordance with some embodiments.
[0019] Figure 9 is an example process of pre-silicon verification and debugging implemented based on output values of instructions, in accordance with some embodiments.
[0020] Figure 10A is a data structure of a portion of a register file in which all values stored in registers are initialized to 0, in accordance with some embodiments, and Figures 10B-10E are four example data structures of a portion of a register file in which a subset of (not all) registers are set to symbolic initial states (SIS), in accordance with some embodiments.
[0021] Figure 11 is a flow diagram of a method for pre-silicon verification and debugging of a digital hardware system (e.g., a processor), in accordance with some embodiments.
[0022] Figure 12 is a flow diagram of another method for pre-silicon verification and debugging of a digital hardware system, in accordance with some embodiments.
DESCRIPTION OF IMPLEMENTATIONS
[0023] Reference will now be made in detail to specific embodiments, examples of which are illustrated in the accompanying drawings. In the following detailed description, numerous non-limiting specific details are set forth in order to assist in understanding the subject matter presented herein. But it will be apparent to one of ordinary skill in the art that various alternatives may be used without departing from the scope of claims and the subject matter may be practiced without these specific details. For example, it will be apparent to one of ordinary skill in the art that the subject matter presented herein can be implemented on many types of electronic devices with digital video capabilities.
[0024] Figure 1A is a block diagram of an example system module 100 in a typical electronic device, in accordance with some implementations. System module 100 in this electronic device includes at least one or more processors 102, memory modules 104 for storing programs, instructions and data, an input/output (I/O) controller 106, one or more communication interfaces such as network interfaces 108, and one or more communication buses 130 for interconnecting these components. In some implementations, I/O controller 106 allows one or more processors 102 to communicate with an I/O device (e.g., a keyboard, a mouse or a touch screen) via a universal serial bus interface. In some implementations, network interfaces 108 include one or more interfaces for Wi-Fi, Ethernet and Bluetooth networks, each allowing the electronic device to exchange data with an external source, e.g., a server or another electronic device. In some implementations, communication buses 130 include circuitry (sometimes called a chipset) that interconnects and controls communications among various system components included in system module 100. In some embodiments, the one or more processors 102 of the system module 100 are integrated in a system on a chip (SoC).
[0025] In some implementations, memory modules 104 include high-speed random access memory, such as DRAM, SRAM, DDR RAM or other random access solid state memory devices. In some implementations, memory modules 104 include non-volatile memory, such as one or more magnetic disk storage devices, optical disk storage devices, flash memory devices, or other non-volatile solid-state storage devices. In some implementations, memory modules 104, or alternatively the non-volatile memory device(s) within memory modules 104, include a non-transitory computer readable storage medium. In some implementations, memory slots are reserved on system module 100 for receiving memory modules 104. Once inserted into the memory slots, memory modules 104 are integrated into system module 100.
[0026] In some implementations, system module 100 further includes one or more components selected from:
• a memory controller 110 that controls communication between one or more processors 102 and memory components, including memory modules 104, in electronic device;
• solid-state drives (SSDs) 112 that apply integrated circuit assemblies to store data in the electronic device, and in many implementations, are based on NAND or NOR memory configurations;
• a hard drive 114 that is a conventional data storage device used for storing and retrieving digital information based on electromechanical magnetic disks;
• a power supply connector 116 that includes one or more direct current (DC) power supply interfaces each of which is configured to receive a distinct DC supply voltage;
• power management integrated circuit (PMIC) 118 that modulates the distinct DC supply voltages received via the DC power supply interfaces to other desired internal supply voltages, e.g., 5V, 3.3V or 1.8V, as required by various components or circuits (e.g., processor cores in one or more processors 102) within electronic device;
• a graphics module 120 that generates a feed of output images to one or more display devices according to their desirable image/video formats; and
• a sound module 122 that facilitates the input and output of audio signals to and from the electronic device under control of computer programs.
[0027] It is noted that communication buses 130 also interconnect and control communications among various system components including components 110-122. One skilled in the art knows that other non-transitory computer readable storage media can be used, as new data storage technologies are developed for storing information in the non-transitory computer readable storage media in the memory modules 104 and in SSDs 112. These new non-transitory computer readable storage media include, but are not limited to, those manufactured from biological materials, nanowires, carbon nanotubes and individual molecules, even though the respective data storage technologies are currently under development and yet to be commercialized.
[0028] Figure 1B is a block diagram of an example computer system 150 that executes pre-silicon verification and debugging of a digital hardware system, in accordance with some embodiments. In some embodiments, the computer system 150 is integrated into another electronic system such as a router. In some embodiments, the computer system 150 is implemented via discrete elements or one or more integrated components. The computer system 150 executes any of a number of operating systems, and stores a program including a plurality of control instructions executed to implement pre-silicon verification and debugging of the digital hardware system including, but not limited to, each processor 102, an SoC integrating a plurality of processors 102, memory controller 110, memory integrated circuit, PMIC 118, graphics module 120, sound module 122, or a combination thereof.
[0029] Computer system 150 includes a processor 152, a memory 154, a storage device 156, and one or more input/output structures 158. One or more input/output structures 158 provide input/output operations for computer system 150, and optionally include a display. One or more buses 160 typically interconnect the components 152, 154, 156, and 158. Processor 152 may be a single-core or multi-core processor. Additionally, in some embodiments, computer system 150 includes accelerators that are formed in an SoC. Processor 152 is configured to execute instructions for pre-silicon verification and debugging. Such instructions are stored in memory 154 or storage device 156. Data and/or information are received and output using the one or more input/output structures 158 of computer system 150. Specifically, memory 154 includes a computer-readable medium, such as volatile or nonvolatile memory, which stores data. Storage device 156 provides storage for the instructions for pre-silicon verification and debugging in computer system 150. Storage device 156 is optionally a flash memory device, a disk drive, an optical disk device, a tape device, or any other device employing magnetic, optical, or other information recording technologies.
[0030] Each processor 102, SoC integrating a plurality of processors 102, memory controller 110, memory integrated circuit, PMIC 118, graphics module 120, or sound module 122 of a system module 100 includes a respective integrated circuit. During the course of designing the respective integrated circuit, computer system 150 applies pre-silicon verification and debugging techniques to verify its functions and debug errors occurring in its operation. Formal property checkers (i.e., bounded model checkers (BMCs)) are based on mathematical models, and configured to prove the functional correctness of a design of an integrated circuit. The BMCs include a plurality of input constraints and a plurality of output properties. In some embodiments, the input constraints are entered manually, and the output properties are automatically generated based on design specifications. The BMCs are configured to return example operation failures that correspond to inputs complying with the input constraints and erroneous output properties, which need to be troubleshot by the verification engineers. In some embodiments, BMC is limited by manually entering the input constraints and iteratively correcting errors associated with operation failures.
[0031] Conversely, in some embodiments, symbolic quick error detection (SQED) is applied to facilitate operation of the formal property checkers (i.e., the BMCs). The input constraints are created based on an instruction set architecture (ISA) of the integrated circuit (e.g., a CPU), and the output properties include self-consistency properties of instructions executed on the integrated circuit. In SQED, two sets of instructions (original and duplicate instructions) are generated and executed on the CPU. The CPU’s register file is associated with a memory address space and includes two portions. In some embodiments, the register file and memory address space are entirely reserved for the two sets of instructions of SQED, and split into two halves. Original instructions use a first half of the register file, and duplicate instructions use a second half of the register file. The duplicate instructions are the exact same operations implemented in the same sequence as the original instructions, except the duplicate and original instructions use their separate halves of the register file and memory address space. The input constraints to be defined for the BMCs include a valid set of instructions as defined in the ISA. A QED module is automatically added to a verification environment to duplicate the original instructions to the duplicate instructions. Once the same number of original and duplicate instructions are executed, program consistency requires that the first and second halves of the register file contain the same output values. Stated another way, an equality and universal self-consistency property are required in error detection by duplicate instructions for validation (EDDI-V) for the BMCs. EDDI-V is intended to detect bugs that are activated by a particular sequence of instructions and lead to an incorrect output value, and therefore, requires that the instruction sequence follow a correct control flow and that execution not hang. 
Hang detection is addressed by CFTSS-V (Control Flow Tracking by Software Signatures for Validation) which, if applied, augments EDDI-V to cover control flow errors.
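The EDDI-V self-consistency check described above can be illustrated with a small behavioral model. The following is a minimal Python sketch, not the QED module itself: the two-opcode toy instruction set, the 16-register file size, and the names `duplicate`, `execute`, and `eddi_v_consistent` are all assumptions introduced for illustration. Original instructions use the first half of the register file, duplicates use the second half, and the two halves are compared once both sequences have run.

```python
from typing import List, Tuple

NUM_REGS = 16          # assumed register-file size
HALF = NUM_REGS // 2   # originals use registers 0..7, duplicates use 8..15

# Toy instruction format (an assumption): (opcode, dest, src1, src2).
Instr = Tuple[str, int, int, int]


def duplicate(instr: Instr) -> Instr:
    """Same operation, with every register index moved into the second half."""
    op, rd, rs1, rs2 = instr
    return (op, rd + HALF, rs1 + HALF, rs2 + HALF)


def execute(regs: List[int], instr: Instr) -> None:
    """Execute one toy instruction on a 32-bit register file."""
    op, rd, rs1, rs2 = instr
    if op == "add":
        regs[rd] = (regs[rs1] + regs[rs2]) & 0xFFFFFFFF
    elif op == "xor":
        regs[rd] = regs[rs1] ^ regs[rs2]
    else:
        raise ValueError(f"unknown opcode {op}")


def eddi_v_consistent(initial_half: List[int], program: List[Instr]) -> bool:
    """Run originals then duplicates; report whether both halves agree."""
    if len(initial_half) != HALF:
        raise ValueError("initial state must cover exactly one half")
    regs = initial_half + initial_half  # both halves start from identical values
    for instr in program:
        execute(regs, instr)
    for instr in program:
        execute(regs, duplicate(instr))
    return regs[:HALF] == regs[HALF:]
```

On a bug-free model such as this one the check always holds; a formal tool would instead search for an instruction sequence and initial state under which it fails on the real design.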
[0032] Figure 2 is a block diagram of a multithreading and multi-core processor system 200, in accordance with some embodiments. An example of the processor system 200 is an OpenSPARC T2 SoC, which is an open source version of an UltraSPARC T2 microprocessor. The UltraSPARC T2 microprocessor belongs to a scalable processor architecture (SPARC) family, and includes a 500-million-transistor SoC with 8 processor cores 202 (64 hardware threads), a private level 1 (L1) cache, 8 banks of shared level 2 (L2) cache 204, 4 memory controllers 206, a crossbar-based interconnect 208, and various I/O controllers 210. Logic bug scenarios are simulated for the OpenSPARC T2 SoC. These simulated scenarios represent a variety of “difficult” bug scenarios that are extracted from commercial multicore SoCs and can take an extended time (days to weeks) to identify. These bug scenarios include bugs in the processor cores 202, bugs in components external to the processor cores, and bugs related to power management.
[0033] In some embodiments, for Symbolic QED, the length of a bug trace corresponds to the number of instructions in the trace found by the BMC tool (not including duplicate instructions created by the QED modules). For bugs that are only found by executing instructions on multiple processor cores 202, the number of instructions for each core 202 may be different. For example, one core 202 could have a bug trace that is 3 instructions long, while another core has a bug trace that is 1 instruction long. The length of the longest bug trace (a length of 3 instructions in this example) is identified for the multicore processor system 200, because all cores must complete execution of the instructions to activate and detect the bug (and the cores execute the instructions in parallel).
[0034] Figure 3 is a block diagram of a symbolic QED formal verification testbench 300, in accordance with some embodiments. The computer system 150 provides a device under test (DUT) wrapper 302 including a DUT memory 304 and a QED module 306. The DUT wrapper 302 is coupled to a design of one or more processors 102 applied in a device under test (DUT) 308, e.g., a system module 100 or a processor system 200. The DUT memory 304 is coupled to the processor core(s) 102 and QED module 306, and configured to store instructions executed by the DUT 308 and data generated in the verification and debugging process. In some embodiments, the processors 102 include a register file 310 having a plurality of registers for storing output values of the instructions that are implemented for pre-silicon verification and debugging of a digital hardware system. In some embodiments, the register file 310 is a special small memory that is inside or closest to the processor(s) 102, and has the fastest access time and the smallest capacity among all memories or caches available to the processor(s) 102. The register file 310 includes a first set of registers and a second set of registers. In an example, the plurality of registers of the register file are divided into two halves corresponding to the first and second sets of registers.
[0035] The QED module 306 is instantiated inside the DUT wrapper 302 and interfaces with the DUT memory 304 and a plurality of monitoring points inside the design of the processor core(s) 102. As such, the QED module 306 of the computer system 150 is configured to implement a pre-silicon verification and debugging process to detect errors in the design of the one or more processors 102 of the DUT 308. In some embodiments, the DUT memory 304 includes a memory array 312 having a plurality of memory units for storing output values of the instructions that are implemented for pre-silicon verification and debugging of the digital hardware system. The memory array 312 includes a first set of memory units and a second set of memory units. In an example, the memory array 312 is divided into two halves corresponding to the first and second sets of memory units.
[0036] In addition to the QED module 306, the symbolic QED formal verification testbench 300 further includes a set of scripts, e.g., determined based on the design of the one or more processors 102. In some embodiments, the symbolic QED formal verification testbench 300 is established in a formal property checker tool, such as VC-Formal from Synopsys or JasperGold from Cadence. In some embodiments, the QED module 306 is an RTL module configured to be instantiated inside the DUT 308. The QED module 306 includes a plurality of formal verification constraints and output properties based on the three main QED transformations, which include EDDI-V (Error Detection by Duplicate Instructions for Validation), CFCSS-V (Control Flow Checking by Software Signatures for Validation), and CFTSS-V (Control Flow Tracking by Software Signatures for Validation).
[0037] In some embodiments, on the symbolic QED formal verification testbench 300, all possible sequences of inputs are automatically checked in an exhaustive manner within a BMC bound. Once a set of scripts is in place and the testbench 300 is established, the script associated with each of a plurality of SQED properties is executed. If the QED module 306 determines that any of the SQED properties is violated, the QED module 306 generates an example which is used by the verification engineer to debug the design of the one or more processors 102. Once a bug is identified and fixed, the verification and debugging process is repeated until the QED module 306 cannot find any example up to a depth and/or a time limit set in the scripts. The depth is optionally measured by a number of clock cycles for which a sequence of instructions is executed.
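The idea of exhaustively checking every input sequence up to a bound can be shown with an explicit-state toy. The sketch below is an assumption-laden illustration in Python: commercial checkers work symbolically rather than by enumeration, and the function `bounded_check`, the saturating-counter "design", and its injected bug are hypothetical. It returns the shortest input sequence that violates a property, or `None` if no violation exists within the bound.

```python
from itertools import product
from typing import Callable, Iterable, Optional, Tuple, TypeVar

S = TypeVar("S")  # design state
I = TypeVar("I")  # one input symbol (for example, one instruction)


def bounded_check(init: S,
                  step: Callable[[S, I], S],
                  holds: Callable[[S], bool],
                  inputs: Iterable[I],
                  bound: int) -> Optional[Tuple[I, ...]]:
    """Try every input sequence of length 0..bound, shortest first."""
    alphabet = tuple(inputs)
    for depth in range(bound + 1):
        for seq in product(alphabet, repeat=depth):
            state = init
            violated = not holds(state)
            for symbol in seq:
                state = step(state, symbol)
                violated = violated or not holds(state)
            if violated:
                return seq  # counterexample for the engineer to debug
    return None


def buggy_counter(count: int, op: str) -> int:
    """Toy design: a counter meant to saturate at 2 (injected bug: it reaches 3)."""
    if op == "inc":
        return min(count + 1, 3)
    return max(count - 1, 0)


def never_above_two(count: int) -> bool:
    """Output property of the toy design."""
    return count <= 2
```

With a bound of 2 no violation is reachable, so the search returns `None`; raising the bound to 3 exposes the three-step counterexample, which mirrors how the depth limit set in the scripts determines which bugs can be found.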
[0038] It is noted that the symbolic QED formal verification testbench 300 is not limited to verifying and debugging processors. In some embodiments, the symbolic QED formal verification testbench 300 is applied to verify and debug a digital integrated circuit distinct from the one or more processors 102, i.e., digital logic that does not form any processor core. Thus, the symbolic QED formal verification testbench 300 is broadly applied to pre-silicon verification and debugging of a digital hardware system.
[0039] Figure 4 is a block diagram showing a QED component 400 including a QED module 306 and a fetch unit 402, in accordance with some embodiments. The QED module 306 is configured to implement a pre-silicon verification and debugging process, and I/O modules are coupled to the QED module 306. In some embodiments, inputs to the QED module 306 include:
1) enable, which disables the QED module 306 if 0;
2) instruction_in, which is the instruction from the fetch unit 402 to be executed by the processor core 102;
3) target_address, which contains the address of the next instruction to execute when the processor executes a control-flow instruction; and
4) committed, which is a signal from the processor core 102 to indicate if the instruction fetched has been committed (i.e., the result written to a register or memory).
[0040] In some embodiments, outputs from the QED module 306 include:
1) PC, which is the address of the next instruction to fetch;
2) PC_override, which determines if the processor core 102 should use the PC from the QED module 306 or the PC from the fetch unit 402;
3) instruction_out, which is the modified instruction computed by the QED module 306;
4) instruction_override, which determines whether the processor core 102 should use the modified instruction from the QED module 306 or the instruction from the fetch unit 402; and
5) qed_ready, which is set to qed_ready_i if the committed input signal is true, and false otherwise.
[0041] Figure 5 is a pseudo code 500 for a QED module 306, in accordance with some embodiments. The QED module 306 has one or more of the following internal variables:
1) current_mode, which tracks whether the QED module 306 is executing original instructions (ORIG_MODE) or duplicate instructions (DUP_MODE);
2) qed_rewind_address, which holds the address of the first instruction in the sequence of original instructions;
3) PC_override_i and instruction_override_i, which are internal versions of PC_override and instruction_override (the only difference is that when the enable signal is set to 0, then both PC_override and instruction_override are also set to 0, disabling the QED module 306); and
4) qed_ready_i, which signals when both the original and duplicated registers should have the same values (under bug-free conditions). Initially, qed_ready_i is set to false, and is only set to true when both the original and duplicate instructions have been executed.
[0042] In some situations, the QED module 306 starts in ORIG_MODE. When a control flow instruction is fetched, the QED module 306 switches to DUP_MODE, loads the address stored in qed_rewind_address into PC, and sets PC_override_i to 1 (and as long as enable is true, PC_override is also set to 1). The processor core 102 then re-executes instructions starting from the address stored in qed_rewind_address. In some embodiments, in DUP_MODE, the duplicated instruction is output as instruction_out, and instruction_override_i is set to 1 so the processor core 102 executes the duplicated instruction instead of the original instruction from the fetch unit 402. After all the duplicate instructions finish execution, the corresponding registers (e.g., two sets of registers of a register file) should be equal, and so once the results are written to registers (the committed signal from the processor core 102 is true), qed_ready is set to true. This time, the processor core 102 executes the control flow instruction, and the QED module 306 stores the address of the next instruction to execute (i.e., the target of the control flow instruction) in qed_rewind_address and then returns to ORIG_MODE.
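The mode switching described in this paragraph can be traced with a step-level model. The following Python sketch is a simplification under stated assumptions, not the pseudo code of Figure 5: it models one fetched instruction per step, treats every instruction as committed immediately, and uses hypothetical names (`QedModule`, `QedOutputs`, `run_block`) and a placeholder duplication function that only appends a prime mark to the instruction text.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

ORIG_MODE, DUP_MODE = "ORIG_MODE", "DUP_MODE"


@dataclass
class QedOutputs:
    pc: Optional[int]
    pc_override: bool
    instruction_out: Optional[str]
    instruction_override: bool
    qed_ready: bool


class QedModule:
    def __init__(self, start_address: int, duplicate: Callable[[str], str]):
        self.current_mode = ORIG_MODE
        self.qed_rewind_address = start_address
        self.duplicate = duplicate  # hypothetical register-remapping function

    def step(self, enable: bool, instruction_in: str, is_control_flow: bool,
             target_address: Optional[int], committed: bool) -> QedOutputs:
        pc, instruction_out = None, None
        pc_override_i = instruction_override_i = qed_ready_i = False
        if self.current_mode == ORIG_MODE:
            if is_control_flow:
                # Rewind: replay the block as duplicates before the branch runs.
                self.current_mode = DUP_MODE
                pc, pc_override_i = self.qed_rewind_address, True
        elif is_control_flow:
            # All duplicates have been issued; both register sets should match.
            qed_ready_i = True
            self.qed_rewind_address = target_address
            self.current_mode = ORIG_MODE
        else:
            instruction_out = self.duplicate(instruction_in)
            instruction_override_i = True
        # The enable input gates both override outputs.
        return QedOutputs(pc, enable and pc_override_i, instruction_out,
                          enable and instruction_override_i,
                          qed_ready_i and committed)


# address -> (instruction text, is_control_flow, branch target)
Program = Dict[int, Tuple[str, bool, Optional[int]]]


def run_block(program: Program, start: int) -> Tuple[List[str], int]:
    """Drive the model with a trivial fetch loop; return executed trace and ready count."""
    qed = QedModule(start, duplicate=lambda text: text + "'")
    executed: List[str] = []
    ready_count, pc = 0, start
    while pc in program:
        text, is_cf, target = program[pc]
        out = qed.step(True, text, is_cf, target, committed=True)
        ready_count += out.qed_ready
        if out.pc_override:
            pc = out.pc  # branch is held back; fetch restarts at the rewind address
            continue
        executed.append(out.instruction_out if out.instruction_override else text)
        pc = target if is_cf else pc + 1
    return executed, ready_count
```

For a block holding instructions A and B followed by a jump, the executed trace under this model is A, B, A', B', and then the jump, with the ready indication raised once.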
[0043] Figure 6 is a block diagram of a computer system 150 configured to execute pre-silicon verification and debugging of a digital hardware system, in accordance with some embodiments. The computer system 150, typically, includes one or more processing units (CPUs) 152, one or more network interfaces 602, memory 154, and one or more communication buses 160 for interconnecting these components (sometimes called a chipset). The computer system 150 includes one or more input devices 158A that facilitate user input, such as a keyboard, a mouse, a voice-command input unit or microphone, a touch screen display, a touch-sensitive input pad, a gesture capturing camera, or other input buttons or controls. The computer system 150 also includes one or more output devices 158B that enable presentation of user interfaces and display content, including one or more speakers and/or one or more visual displays.
[0044] Memory 154 includes high-speed random access memory, such as DRAM, SRAM, DDR RAM, or other random access solid state memory devices; and, optionally, includes non-volatile memory, such as one or more magnetic disk storage devices, one or more optical disk storage devices, one or more flash memory devices, or one or more other non-volatile solid state storage devices. Memory 154, optionally, includes one or more storage devices remotely located from one or more processing units 152. Memory 154, or alternatively the non-volatile memory within memory 154, includes a non-transitory computer readable storage medium. In some embodiments, memory 154, or the non-transitory computer readable storage medium of memory 154, stores the following programs, modules, and data structures, or a subset or superset thereof:
• Operating system 604 including procedures for handling various basic system services and for performing hardware dependent tasks;
• Network communication module 606 for connecting each computer system 150 to other devices (e.g., a server, client device, or external storage) via one or more network interfaces 602 (wired or wireless) and one or more communication networks, such as the Internet, other wide area networks, local area networks, metropolitan area networks, and so on;
• User interface module 608 for enabling presentation of information (e.g., a graphical user interface for user application(s), widgets, websites and web pages thereof, and/or games, audio and/or video content, text, etc.) at the computer system 150 via one or more output devices 158B (e.g., displays, speakers, etc.);
• Input processing module 610 for detecting one or more user inputs or interactions from one of the one or more input devices 158A and interpreting the detected input or interaction;
• Web browser module 612 for navigating, requesting (e.g., via HTTP), and displaying websites and web pages thereof, including a web interface for logging into a user account associated with the computer system 150 or another electronic device, controlling the computer system or electronic device if associated with the user account, and editing and reviewing settings and data that are associated with the user account;
• One or more user applications 614 for execution by the computer system 150 (e.g., games, social network applications, smart home applications, data analysis and visualization applications, and/or other web or non-web based applications), where in some embodiments, the one or more user applications 614 include a BMC application for returning example operation failures of a digital hardware system in view of input constraints and output properties;
• Device under test (DUT) wrapper 302 for creating a symbolic QED formal verification testbench or environment in which a design of a digital hardware system (e.g., a processor 102) is verified and debugged, where the DUT wrapper 302 includes a DUT memory 304 for storing instructions executed by the DUT 308 and data generated in a pre-silicon verification and debugging process and a QED module 306 for implementing a pre-silicon verification and debugging process based on a combination of original and duplicate instructions; and
• One or more databases 618 for storing at least data including one or more of:
  o Device settings 620 including common device settings (e.g., service tier, device model, storage capacity, processing capabilities, communication capabilities, etc.) of the computer system 150;
  o User account information 622 for the one or more user applications 614, e.g., user names, security questions, account history data, user preferences, and predefined account settings; and
  o Network parameters 624 for the one or more communication networks 108, e.g., IP address, subnet mask, default gateway, DNS server and host name.
[0045] In some embodiments, the QED module 306 duplicates original instructions into duplicate instructions, and stores output values of the original and duplicate instructions into two separate sets of registers of the register file 310, respectively. Alternatively, in some embodiments, the QED module 306 stores the output values of the original and duplicate instructions into two separate sets of memory units of the memory array 312 of the DUT memory 304. Further, in some embodiments, the QED module 306 determines whether the instructions are properly implemented by the digital hardware system based on a subset of each output value (e.g., a least significant bit (LSB) of each output value), which is stored in the register file 310 or memory array 312. In some embodiments, the QED module 306 initializes a subset of a first set of registers of the register file 310 and a corresponding subset of a second set of registers of the register file 310 to a respective SIS in a duplicated manner, e.g., initializes the LSB of each register of the register file 310 to the respective SIS. Alternatively, in some embodiments, the QED module 306 initializes a subset of a first set of memory units of the memory array 312 and a corresponding subset of a second set of memory units of the memory array 312 to a respective SIS in a duplicated manner, e.g., initializes the LSB of each memory unit of the memory array 312 to the respective SIS.
[0046] Each of the above identified elements may be stored in one or more of the previously mentioned memory devices, and corresponds to a set of instructions for performing a function described above. The above identified modules or programs (i.e., sets of instructions) need not be implemented as separate software programs, procedures, modules or data structures, and thus various subsets of these modules may be combined or otherwise re-arranged in various embodiments. In some embodiments, memory 154, optionally, stores a subset of the modules and data structures identified above. Furthermore, memory 154, optionally, stores additional modules and data structures not described above.
[0047] Figure 7 is a data structure of an example register file 310 configured to store output values of a sequence of instructions, in accordance with some embodiments. A QED module 306 of the computer system 150 is applied to verify and debug a design of a digital hardware system (e.g., an integrated circuit). Specifically, the sequence of instructions includes a sequence of first instructions 702 (e.g., an ordered sequence of instructions A, B, C, and D), and the QED module 306 duplicates the sequence of first instructions to a sequence of second instructions 704 (e.g., an ordered sequence of instructions A’, B’, C’, and D’). The sequence of first instructions 702 (also called original instructions) and the sequence of second instructions 704 (also called duplicate instructions) are mixed with each other to form a comprehensive sequence of instructions 706 in which the first instructions in the sequence 702 keep their original order with respect to each other and the second instructions in the sequence 704 keep their original order with respect to each other. In some embodiments, the sequence of second instructions 704 follows the sequence of first instructions 702. For example, the comprehensive sequence 706 includes an ordered sequence of instructions A, B, C, D, A’, B’, C’, and D’. Alternatively, in some embodiments, the sequence of first instructions 702 follows the sequence of second instructions 704. For example, the comprehensive sequence 706 includes an ordered sequence of instructions A’, B’, C’, D’, A, B, C, and D. Additionally or alternatively, in some embodiments, the sequence of first instructions 702 interleaves with the sequence of second instructions 704. In an example, the comprehensive sequence 706 includes an ordered sequence of instructions A, B, … [remainder of the sequence given only in inline image imgf000016_0001].
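The ordering rule for the comprehensive sequence 706 (originals keep their relative order, duplicates keep theirs) can be checked mechanically. The Python sketch below is illustrative only; the function name `preserves_relative_order` is hypothetical, and it assumes every instruction label is unique and that original and duplicate labels do not overlap.

```python
from typing import Sequence


def preserves_relative_order(comprehensive: Sequence[str],
                             first: Sequence[str],
                             second: Sequence[str]) -> bool:
    """True if `comprehensive` is a merge of `first` and `second` that keeps each one's order."""
    first_set, second_set = set(first), set(second)
    seen_first = [x for x in comprehensive if x in first_set]
    seen_second = [x for x in comprehensive if x in second_set]
    return (len(comprehensive) == len(first) + len(second)
            and seen_first == list(first)
            and seen_second == list(second))


originals = ["A", "B", "C", "D"]
duplicates = ["A'", "B'", "C'", "D'"]

# Originals first, duplicates first, and one hypothetical interleaving
# (not necessarily the interleaving shown in the source figure).
orig_then_dup = originals + duplicates
dup_then_orig = duplicates + originals
interleaved = ["A", "B", "A'", "B'", "C", "D", "C'", "D'"]
```

All three orderings above satisfy the rule, whereas a sequence that swaps A and B among the originals does not.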
[0048] The QED module 306 in conjunction with a BMC application stimulates the design of the digital hardware system with the comprehensive sequence of instructions 706. During this stimulation, a plurality of input properties are assigned to the digital hardware system, and a plurality of output properties are defined based on output values expected from execution of the instructions by the digital hardware system. The input and output properties are part of the code of the QED module 306. The register file 310 includes a first set of registers 310A and a second set of registers 310B. Each of the first and second sets of registers 310A and 310B includes the same number (e.g., 16) of registers 708, and each register 708 has a predefined number of bits (e.g., 32 bits). Each register of the first set of registers 310A is distinct from every register of the second set of registers 310B, and vice versa. In some embodiments, each set of registers 310A or 310B includes contiguous registers that form a respective register block in the register file 310. The first and second sets of registers 310A and 310B are immediately adjacent to each other or physically separated from each other by one or more registers. In some embodiments, at least one set of registers 310A or 310B includes non-contiguous registers that form two or more register blocks in the register file 310. In an example, the first set of registers 310A interleaves with the second set of registers 310B. Each register in the first set of registers 310A is immediately adjacent to one or two registers of the second set of registers 310B, and each register in the second set of registers 310B is immediately adjacent to one or two registers of the first set of registers 310A. In some embodiments, the register file 310 is entirely reserved to implement the comprehensive sequence of instructions 706. 
The first and second sets of registers 310A and 310B correspond to a first half and a second half of the register file 310.
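The two layouts described above (contiguous halves and interleaved sets) each induce a pairing between a register of the first set and its counterpart in the second set. A minimal Python sketch follows; the function name `pair_registers`, the layout labels, and the even/odd convention for the interleaved case are assumptions chosen for illustration.

```python
from typing import List, Tuple


def pair_registers(num_regs: int, layout: str = "halves") -> List[Tuple[int, int]]:
    """Return (first_set_index, second_set_index) pairs for a register file."""
    if num_regs % 2:
        raise ValueError("register file must hold an even number of registers")
    half = num_regs // 2
    if layout == "halves":
        # First set is registers 0..half-1, second set is half..num_regs-1.
        return [(i, i + half) for i in range(half)]
    if layout == "interleaved":
        # Assumed convention: even indices form the first set, odd the second.
        return [(2 * i, 2 * i + 1) for i in range(half)]
    raise ValueError(f"unknown layout {layout!r}")
```

For a 16-register file, `pair_registers(16)` pairs register 0 with 8 through register 7 with 15, while `pair_registers(16, "interleaved")` pairs 0 with 1, 2 with 3, and so on. In both cases every register belongs to exactly one pair.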
[0049] Specifically, in some embodiments, the first set of registers 310A of the register file 310 stores output values of the first instructions 702, and the second set of registers 310B of the register file 310 stores output values of the second instructions 704. For example, the registers 708 (R1, R2, R3, R4, ..., and RN) of the first set of registers 310A store output values generated by the first instructions 702 according to a first storage order, and the registers 708 (R1’, R2’, R3’, R4’, ..., and RN’) of the second set of registers 310B store output values generated by the second instructions 704 according to a second storage order that matches the first storage order. Such a register file 310 is part of the digital hardware system (e.g., one or more processors 102 in Figure 3).
[0050] The QED module 306 is coupled to the register file 310, and is configured to determine whether the sequence of first instructions 702 is properly implemented by the digital hardware system based on a subset of each output value stored in the register file 310. The subset of each output value includes a first number of fixed bits in a respective output value, and the first number is less than the predefined number. Specifically, the QED module 306 compares a subset of each of the output values stored in registers R1, R2, R3, R4, ..., and RN of the first set of registers 310A and a subset of a corresponding output value stored in registers R1’, R2’, R3’, R4’, ..., or RN’ of the second set of registers 310B to decide whether the first instructions 702 are properly implemented. The subset of each of the output values is less than the entire output value. By these means, the QED module 306 can reduce computational demand and detect errors at a faster rate, at the risk of missing an error that occurs in any remaining bit of the output values stored in registers R1, R2, R3, R4, ..., and RN of the first set of registers 310A.
[0051] Figure 8A is an example sequence of original instructions 702, and Figure 8B is an example sequence of executed instructions 706 that is generated from the sequence of original instructions 702 in Figure 8A via QED transformation implemented by a QED module 306, in accordance with some embodiments. The QED module 306 has two modes including an original mode (ORIG_MODE) and a duplicate mode (DUP_MODE). The QED module 306 starts in ORIG_MODE. When a control flow instruction is fetched, the QED module 306 switches to DUP_MODE. In DUP_MODE, the QED module 306 generates the duplicate instructions 704 from the original instructions 702. After all the original and duplicate instructions 702 and 704 finish execution, the corresponding registers 708 of the register file 310 should be equal.
[0052] In this example, the first set of registers 310A of the register file 310 has 16 registers for storing 16 output values, as does the second set of registers 310B. The first set of registers 310A has the first 16 registers (R1, R2, R3, R4, R5, etc.) for storing first 16 output values generated by the original instructions 702, and the second set of registers 310B has the second distinct 16 registers (R17, R18, R19, R20, R21, etc.) for storing second 16 output values that are generated by the duplicate instructions 704 and correspond to the first 16 output values. In some embodiments, the first and second sets of registers 310A and 310B are immediately adjacent to each other. Alternatively, in some embodiments, the first and second sets of registers 310A and 310B are separate from each other. Additionally, in some embodiments, the first and second sets of registers 310A and 310B interleave with each other. If the instructions 706 are properly implemented by the design of the digital hardware system, each of the first 16 output values stored in registers R1, R2, R3, R4, R5, etc. is equal to a respective one of the second 16 output values stored in registers R17, R18, R19, R20, R21, etc., respectively.
[0053] Figure 9 is an example process 900 of pre-silicon verification and debugging implemented based on output values of instructions, in accordance with some embodiments. As explained above, a sequence of first instructions 702 are duplicated to and combined with a sequence of second instructions 704. The first and second instructions 702 and 704 are executed to generate a plurality of first output values 902 (e.g., a value stored in register R2) and a plurality of second output values 904 (e.g., a value stored in register R2’), respectively. A register file 310 includes a first set of registers 310A and a second set of registers 310B that is distinct from the first set of registers 310A. The first and second sets of registers 310A and 310B include the same number of registers 708 for storing the first or second output values 902 and 904, respectively. Each register 708 has a predefined number of bits (e.g., 32 bits). The first and second output values 902 and 904 generated from the first and second instructions 702 and 704 are stored in the first and second sets of registers 310A and 310B of the register file 310 according to the same order, respectively. In this example shown in Figure 9, two output values 902 and 904 are generated from first and second instructions 702 and 704 that correspond to each other, and stored in two distinct registers 708A and 708B located in the first and second set of registers 310A and 310B, respectively.
[0054] The QED module 306 compares the output values stored in the first and second sets of registers 310A and 310B to determine whether the sequence of first instructions 702 is properly implemented by the digital hardware system. Particularly, only a subset of each output value (i.e., a first number of fixed bits 906 in a respective output value) is used for comparison. That is, for each respective first output value 902 (e.g., stored in register R2) of the sequence of first instructions 702, the QED module 306 identifies a respective second output value 904 (e.g., stored in register R2’) of the second set of registers 310B. The respective second output value 904 (e.g., stored in register R2’) corresponds to a second instruction (e.g., instruction D’ in Figure 7) that is duplicated from a first instruction (e.g., instruction D in Figure 7) corresponding to the respective first output value 902 (e.g., stored in register R2). Each of the first number of fixed bits 906A in the respective first output value 902 is compared with a respective one of the first number of fixed bits 906B in the respective second output value 904. For example, in Figure 9, while each output value R2 or R2’ has 32 bits, only the last 7 bits of the output values stored in registers 708A and 708B (R2 and R2’) are compared.
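Comparing only a fixed subset of bits amounts to masking both values before testing equality. The Python sketch below illustrates this for the 7-bit example; the function name `subset_matches` and the sample register values are assumptions, and the choice of the lowest-order bits reflects the "last 7 bits" wording of the example.

```python
def subset_matches(first_value: int, second_value: int, num_bits: int) -> bool:
    """Compare only the lowest `num_bits` bits of two register values."""
    mask = (1 << num_bits) - 1
    return (first_value & mask) == (second_value & mask)


r2 = 0x12345678                  # hypothetical value in register R2
r2_dup_ok = 0x12345678           # duplicate agrees in every bit
r2_dup_low = 0x12345679          # differs in bit 0: caught by a 7-bit compare
r2_dup_high = r2 ^ (1 << 20)     # differs only in bit 20: missed by a 7-bit compare
```

The last value shows the trade-off: a difference confined to bit 20 passes the 7-bit comparison although a full 32-bit comparison would flag it.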
[0055] Only the subset of each output value (i.e., a first number of fixed bits 906 in a respective output value) is used for comparison. In an example not shown, each register 708 of the register file 310 is an integer register, and the first number is equal to 1. For each first output value 902, the subset of the respective first output value includes a least significant bit (LSB). Only the LSBs of the output values stored in registers R2 and R2’ are compared. Alternatively, in some embodiments, each register 708 of the register file 310 is a floating point register. For each first output value, the predefined number of bits includes one or more mantissa bits, one or more exponent bits, and a sign bit. The one or more mantissa bits include the LSB of the respective output value. In some embodiments, the subset of the respective first output value 902 always includes the sign bit. Further, in some situations, the subset of the respective first output value 902 has two bits including the sign bit and one of the exponent bit(s). Further, in some situations, the subset of the respective first output value 902 has three bits including the sign bit, one of the exponent bit(s), and the LSB of the one or more mantissa bits. Alternatively, in some situations, the first number is equal to 4 or above, and for each first output value 902, the subset of the respective first output value 902 includes the sign bit, at least one mantissa bit, and at least one exponent bit.
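For floating point registers, the three-bit subset (sign bit, one exponent bit, mantissa LSB) can be extracted from the raw bit pattern. The Python sketch below assumes 32-bit IEEE 754 single-precision registers and picks the least significant exponent bit; both are assumptions, since the text does not fix the format or which exponent bit is used, and the function names are hypothetical.

```python
import struct
from typing import Tuple


def float32_bits(value: float) -> int:
    """Raw 32-bit pattern of `value` encoded as IEEE 754 single precision."""
    return struct.unpack(">I", struct.pack(">f", value))[0]


def float_subset(value: float) -> Tuple[int, int, int]:
    """Return (sign bit, exponent LSB, mantissa LSB) of a single-precision value."""
    bits = float32_bits(value)
    sign = (bits >> 31) & 1          # bit 31
    exponent_lsb = (bits >> 23) & 1  # lowest of the 8 exponent bits (assumed choice)
    mantissa_lsb = bits & 1          # bit 0, also the LSB of the whole value
    return (sign, exponent_lsb, mantissa_lsb)


def float_subset_matches(first_value: float, second_value: float) -> bool:
    """Compare two floating point outputs on the three-bit subset only."""
    return float_subset(first_value) == float_subset(second_value)
```

A sign error such as 1.0 versus -1.0 is caught by this subset, while 1.0 versus 1.5 is not, because those two values differ only in a mantissa bit outside the subset.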
[0056] In an example, a processor 102 includes a plurality of register files 310. The processor 102 implements RISC-V floating-point instructions. The register files 310 include a first register file that stores output values of the floating-point instructions, and a second register file that stores integer values. Further, in some situations, the processor 102 also implements RISC-V vector instructions, and further includes a third register file.
[0057] In some situations, in accordance with a determination that the first number of fixed bits 906A in each first output value 902 matches the first number of fixed bits 906B in the respective second output value 904, the QED module 306 determines that the sequence of first instructions 702 is properly implemented by the digital hardware system, i.e., the design of the digital hardware system is correct. Alternatively, in some situations, in accordance with a determination that the subset 906A of a first output value 902 of one of the sequence of first instructions 702 is distinct from the subset 906B of a corresponding second output value 904, the QED module 306 detects an error with implementation of the sequence of first instructions 702. It is noted that the corresponding second output value 904 is generated by a corresponding one of the sequence of second instructions 704 that is duplicated from the one of the sequence of first instructions 702.
[0058] Specifically, in some embodiments, for a first output value of the sequence of first instructions 702 stored in the first set of registers 310A of the register file 310, the QED module 306 detects a mismatch 908 in at least one of the first number of fixed bits 906A in the first output value generated by the sequence of first instructions 702 and a respective one of the first number of fixed bits 906B in a corresponding second output value generated by the sequence of second instructions 704. The QED module 306 determines that the sequence of first instructions 702 is improperly implemented by the digital hardware system and aborts comparing any remaining bits in this mismatching first output value with any remaining bits in the respective second output value or comparing any remaining first output values with any respective second output values.
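The abort-on-first-mismatch behavior can be sketched as a nested scan that returns as soon as one compared bit differs. The Python below is illustrative only: the name `first_mismatch`, the LSB-first scan order, and the choice of the lowest-order bits as the compared subset are assumptions.

```python
from typing import Optional, Sequence, Tuple


def first_mismatch(first_values: Sequence[int],
                   second_values: Sequence[int],
                   num_bits: int) -> Optional[Tuple[int, int]]:
    """Return (register index, bit index) of the first differing compared bit, else None."""
    for reg, (a, b) in enumerate(zip(first_values, second_values)):
        for bit in range(num_bits):
            if (a >> bit) & 1 != (b >> bit) & 1:
                # Abort here: remaining bits and remaining registers are not compared.
                return (reg, bit)
    return None
```

A result of `None` corresponds to the determination that the instructions are properly implemented with respect to the compared bits; any other result identifies where the comparison stopped.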
[0059] When the QED module 306 completes execution of a sequence of original instructions 702 and their corresponding duplicate instructions 704, data stored in the first set of registers 310A are equal to data stored in the second set of registers 310B of the register file 310 if the digital hardware system is designed properly without any error for implementing the original instructions 702. Data stored in registers 708 of the register file 310 are compared in pairs. The division of the register file 310 into two separate sets of registers 310A and 310B can be done in many ways. In some embodiments, the register file 310 is split in the middle based on register index. For example, the register file 310 has 16 32-bit registers (indexed as 0 to 15), and is divided such that the first set of registers 310A includes registers 0 to 7 for storing first output values generated from the original instructions 702 and the second set of registers 310B includes registers 8 to 15 for storing second output values generated from the duplicate instructions 704. Each i-th register of the first set of registers 310A is uniquely paired with a respective (i+8)-th register of the second set of registers 310B, wherein i is from 0 to 7. The QED module 306 tracks execution of the instructions 702 and 704, and asserts a signal eddi_ready in accordance with a determination that the original instructions 702 and corresponding duplicate instructions 704 are executed.
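The index-based split and pairing described in [0059] can be modelled in a few lines of Python. This is a minimal sketch under stated assumptions, not the QED module 306 itself: the names `NUM_REGS`, `HALF`, `pair_of`, and `eddi_consistent` are hypothetical, and the register file is assumed to be a plain list of 16 integers captured after the original and duplicate instructions have executed.

```python
from typing import List

NUM_REGS = 16          # size of the example register file (indices 0 to 15)
HALF = NUM_REGS // 2   # registers 0..7 hold original results, 8..15 hold duplicates


def pair_of(i: int) -> int:
    """Return the index of the duplicate register paired with original register i."""
    if not 0 <= i < HALF:
        raise ValueError("i must index the first set of registers")
    return i + HALF


def eddi_consistent(arf: List[int]) -> bool:
    """True when every original register equals its paired duplicate register."""
    if len(arf) != NUM_REGS:
        raise ValueError("expected a 16-entry register file")
    return all(arf[i] == arf[pair_of(i)] for i in range(HALF))
```

In this model, a register file whose two halves hold the same values is consistent, and a single differing pair makes `eddi_consistent` return `False`.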
[0060] An EDDI property refers to an output property of corresponding output values in the sets of registers 310A and 310B being matched, and is used for error detection by duplicate instructions (EDDI). Each output value is stored in a register 708 of a register file 310. Each EDDI property can be written in SystemVerilog language as:
logic [15:0] [31:0] arf;
for (i = 0; i < 8; i++) begin
assert property (@(posedge clk) (eddi_ready) |-> (arf[i] == arf[i+8]));    (1)
end
Each first output value 902 stored in a register 708 of the first set of registers 310A of the register file 310 is compared with a corresponding second output value 904 stored in a corresponding register 708 of the second set of registers 310B of the register file 310, e.g., when the signal eddi_ready is asserted. In some embodiments, such a comparison is implemented on an entire width of the registers 708, e.g., 32 bits. Conversely, in some embodiments associated with quick EDDI (QEDDI), the QED module 306 compares a subset of bits of each register 708, rather than the entire width of the registers 708. For example, the QED module 306 compares only the least significant bit (LSB) of each register pair:
for (i = 0; i < 8; i++) begin
assert property (@(posedge clk) (eddi_ready) |-> (arf[i][0] == arf[i+8][0]));
end
When the number of bits compared in each property is reduced, a cone of influence (COI) of the EDDI property is reduced, thereby making it easier for a BMC application to analyze the EDDI property and find possible failures in the design of the digital hardware system. As a search space for the possible failures is reduced, an exponential growth of analyzing each extra clock cycle is controlled. Particularly, in deeper clock cycles, it becomes feasible to analyze and check the DUT 308 for more clock cycles within a finite amount of time. This variation of EDDI-V is called quick EDDI (QEDDI).
[0061] In some embodiments, the DUT 308 has bugs that affect only certain bits of the registers that do not overlap the selected subset 906, and the bugs go undetected by the QED module 306. Such bugs whose manifestations affect only certain bits of the registers tend to be single instruction bugs, rather than sequence-dependent bugs. The single instruction bugs are not main targets of SQED in which the EDDI-V properties are checked. SQED has an extension, si-SQED, which is relied upon to directly address the single instruction bugs. In some embodiments, QEDDI is applied as the first line of attack, i.e., the QEDDI variant of EDDI-V property is executed first to detect and fix one or more bugs in the DUT 308. A full-width EDDI-V is executed to detect any remaining bugs that QEDDI has missed. In some embodiments, the full-width EDDI-V or EDDI-V of remaining bits is executed, in accordance with a determination that QEDDI does not identify any bug. Alternatively, in some embodiments, the full-width EDDI-V or EDDI-V of remaining bits is executed for thoroughness regardless of whether QEDDI identifies any bugs, i.e., even after QEDDI identifies some bugs.
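The relationship between QEDDI and full-width EDDI-V in [0060] and [0061] can be illustrated with a masked comparison. The sketch below is only a model of the comparison step on one concrete register-file snapshot; a formal tool checks the property over all reachable states. The names `mismatching_pairs` and `two_stage_check` are hypothetical, and the 16-register, 32-bit layout of the earlier example is assumed.

```python
from typing import List

HALF = 8                 # assumed: 16 registers split into two halves of 8
FULL_MASK = 0xFFFFFFFF   # compare all 32 bits (full-width EDDI-V)
LSB_MASK = 0x1           # compare only bit 0 (the QEDDI example)


def mismatching_pairs(arf: List[int], mask: int) -> List[int]:
    """Indices i whose selected bits differ from those of register i + HALF."""
    return [i for i in range(HALF) if (arf[i] ^ arf[i + HALF]) & mask]


def two_stage_check(arf: List[int]) -> List[int]:
    """Run the cheap LSB check first; fall back to full width if it finds nothing."""
    quick = mismatching_pairs(arf, LSB_MASK)
    return quick if quick else mismatching_pairs(arf, FULL_MASK)
```

A mismatch confined to bit 3 of one register pair is invisible to the LSB-only check and is reported only by the full-width pass, which is the situation [0061] describes for bugs that do not overlap the selected subset 906.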
[0062] In some embodiments, a magnitude of the first number and/or locations of the first number of bits 906 selected in the output values determine an error detection capability of QEDDI (e.g., including an error detection speed). It is a tradeoff that verification engineers should make during the course of setting up SQED EDDI-V properties and QEDDI variations based on the DUT 308 and the register file 310. For example, two distinct register files 310 include integer registers and floating-point registers, respectively. Comparing the LSBs is a good choice for the register file 310 having integer registers. For the register file 310 having floating-point registers, register bits are split among mantissa, exponent, and sign bits. In some embodiments, the mantissa bits include the LSB, and the LSB is selected for comparison. Alternatively, in some embodiments, the sign and exponent bits, although not the LSB, are selected for comparison to improve a bug detection capability. Additionally, in some embodiments, the LSB, the sign bit, and one or more exponent bits are selected for comparison.
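The bit-selection choices in [0062] can be expressed as comparison masks. The following sketch assumes the IEEE 754 single-precision layout for a 32-bit floating-point register (sign in bit 31, exponent in bits 23 to 30, mantissa in bits 0 to 22); the document does not fix a layout, so these positions, and the names `make_mask` and `selected_bits_match`, are assumptions for illustration.

```python
from typing import Iterable

SIGN_BIT = 31                   # assumed IEEE 754 single-precision layout
EXPONENT_BITS = range(23, 31)   # bits 23..30
MANTISSA_BITS = range(0, 23)    # bits 0..22, bit 0 being the LSB


def make_mask(bits: Iterable[int]) -> int:
    """Build a 32-bit mask with a 1 at every selected bit position."""
    mask = 0
    for b in bits:
        mask |= 1 << b
    return mask


INTEGER_MASK = make_mask([0])                            # LSB only
FLOAT_MASK = make_mask([0, EXPONENT_BITS[0], SIGN_BIT])  # LSB, one exponent bit, sign


def selected_bits_match(a: int, b: int, mask: int) -> bool:
    """True when a and b agree on every bit selected by mask."""
    return ((a ^ b) & mask) == 0
```

Under these assumptions, the encodings of 1.0 (0x3F800000) and -1.0 (0xBF800000) agree in the LSB, so an LSB-only comparison treats them as matching, while a mask that also covers the sign bit reports the difference.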
[0063] Figure 10A is a data structure 1000 of a portion of a register file 310 in which all values stored in the registers 708 are initialized to 0, in accordance with some embodiments, and Figures 10B-10E are four example data structures 1010, 1020, 1030, and 1040 of a portion of a register file 310 in which a subset of (not all) registers are set to symbolic initial states (SIS, i.e., represented by “x” in Figures 10B-10E), in accordance with some embodiments. A register file 310 is configured to store output values of a sequence of first instructions 702. Each register 708 in the register file 310 has a predefined number of bits (e.g., 32 bits). In some embodiments, the sequence of first instructions 702 are duplicated to a sequence of second instructions 704. Output values of the sequence of first instructions 702 and the sequence of second instructions 704 are stored according to the same order in a first set of registers 310A and a second set of registers 310B of the register file 310, respectively. The first and second sets of registers 310A and 310B do not share any register 708, and are either immediately adjacent to or separate from each other. Each register 708 in the register file 310 is optionally initialized with an initial value (e.g., 0 or 1) or set to an SIS. The initial value is defined in an “assume property” statement in SystemVerilog. In some embodiments, when a register 708 is not defined with any initial value, the register 708 is automatically set to the SIS. Independently of initial values of bits of each register 708 of the register file 310, values stored in the first and second sets of registers 310A and 310B of the register file 310 match each other in pairs according to the following equal state constraints (i.e., the EDDI property):
for (i = 0; i < 8; i++) begin
assume property (@(posedge clk) (sqed_init) |-> (arf[i] == arf[i+8]));    (1)
end
[0064] The register 708 is set to the SIS, when no initial value of 0 or 1 is assigned to the value stored in the register 708, and the register 708 has no constraint. A formal property checker tool is allowed to assign any initial value (0 or 1) to the register 708 that is set to the SIS. The SIS corresponds to a symbolic value indicating that the formal property checker tool may try all possible values for the symbolic value in an evaluation process. In some embodiments, each and every bit of the first and second sets of registers 310A and 310B are initialized to a respective predefined value of 0 or 1 in a duplicated manner. Referring to Figure 10A, all bits in the register file 310 are initialized to 0. Conversely, in some embodiments, referring to Figures 10B-10E, not all of the bits in the register file 310 are initialized to 0 or 1. Only a subset (less than all bits) of the registers 708 of the register file 310 are assigned with 0, e.g., using the “assume property” statements in SystemVerilog. Bits that are left out in the “assume property” statements are not assigned with any initial value, and become symbolic (i.e., are set to SIS).
[0065] The initial states of the register file 310 and register contents are a subset of constraints to be set properly for SQED runs. When the EDDI-V property is being checked, a DUT state needs to be initialized to an EDDI-V consistent state, i.e., with the two half spaces (i.e., the sets of registers 310A and 310B of the register file 310) initialized to equal states. For example, by implementing a logic that asserts the sqed_init signal in cycle 1 of the formal verification run (de-asserted otherwise), the following “assume property” statement enforces the above equal state constraints for the first and second sets of registers 310A and 310B of the register file 310 (a similar constraint can be written for memory halves).
[0066] Referring to Figure 10A, in some embodiments, all bits of the registers 708 of the register file 310 are initialized to 0. The following “assume property” statement is applied to set all 32 bits of 16 registers of the register file 310 to 0:
for (i = 0; i < 16; i++) begin
assume property (@(posedge clk) (sqed_init) |-> (arf[i] == 32'b0));
end
[0067] In some embodiments not shown, all bits of the registers 708 of the register file 310 are initialized to 1. Alternatively, in some embodiments not shown, each of the registers 708 of the register file 310 is initialized to 1 or 0, not necessarily all to 1 and all to 0. All bits are initialized, and no bit is left undefined. Conversely, in some embodiments, no bit of the register file is initialized to 1 or 0. All bits of the registers 708 of the register file 310 start with a symbolic initial state (SIS), i.e., they could be any value. When analysis starts from an SIS, the QED module 306 explores all possible initial values in the registers 708. Given the EDDI-V consistency constraint (i.e., the above equal state constraint exemplified in statement (1)), the QED module 306 enforces that equal symbolic state is applied to each register pair, i.e., any two registers located in the same locations of the first and second sets of registers 310A and 310B of the register file 310.
[0068] Under some circumstances, referring to Figure 10A, all registers of the first and second sets of registers 310A and 310B of the register file 310 are initialized to 0. Non-zero values are written to corresponding registers, when arithmetic or logical instructions are executed by the DUT 308 that have a non-zero immediate field. For example, an ADDI instruction in RISC-V ISA writes a sum of a first value in a source register and a second value of an immediate field of the ADDI instruction to a destination register. As the QED module 306 explores possible sequences of instructions with possible instructions and possible values of instruction fields, non-zero values are generated and written to the register file 310 after a few initial clock cycles. Once these non-zero values are stored in the registers 708 of the register file 310, the following instructions use these non-zero values stored in these registers 708 as source operands, start to have non-zero inputs, and allow additional bug hunting to happen during deeper clock cycles. With some bits of the register file 310 initialized to SIS, the non-zero values are available in the register file 310 during the initial clock cycles, and bugs could be detected within the initial clock cycles without waiting to enter the deeper clock cycles.
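The effect described in [0068] can be shown on a toy model. The sketch below is not an EDDI check: for simplicity it compares a deliberately buggy ADDI model against a reference model, purely to count how many instructions must run before a bug that needs a non-zero source operand is activated. The bug, the single-register machine, and the names `addi_reference`, `addi_buggy`, and `shortest_failing_length` are all hypothetical.

```python
import itertools
from typing import Iterable, Optional

MASK32 = 0xFFFFFFFF


def addi_reference(src: int, imm: int) -> int:
    """Reference ADDI: destination = source + immediate (32-bit wraparound)."""
    return (src + imm) & MASK32


def addi_buggy(src: int, imm: int) -> int:
    """Hypothetical DUT bug: the LSB is flipped whenever the source is non-zero."""
    result = (src + imm) & MASK32
    return result ^ 0x1 if src != 0 else result


def shortest_failing_length(initial_values: Iterable[int],
                            max_len: int = 3) -> Optional[int]:
    """Shortest ADDI sequence (immediates 0 or 1) on which the two models diverge."""
    starts = list(initial_values)
    for length in range(1, max_len + 1):
        for start in starts:
            for seq in itertools.product((0, 1), repeat=length):
                good = bad = start
                for imm in seq:
                    good = addi_reference(good, imm)
                    bad = addi_buggy(bad, imm)
                if good != bad:
                    return length
    return None
```

Starting from an all-zero register, `shortest_failing_length([0])` needs two instructions, because the first one only produces the non-zero operand. When the LSB of the register is symbolic, modelled here as the start set `[0, 1]`, one instruction is enough.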
[0069] Application of the SIS is associated with a trade-off between a runtime and a bug detection time. A QED module 306 builds a search tree for formal property check (i.e., bounded model check). A size of a root of this search tree depends on a size of a state space in the initial clock cycles. Each level of the tree grows exponentially in size by how many possible input combinations the QED module 306 can apply to the inputs in each cycle. With SIS, the root includes all possible states (i.e., initial values) of the register file 310 and memory, while with all 0s initial state, the root is many orders of magnitude smaller. As a result, the QED module 306 that uses a breadth-first search algorithm can get to deeper clock cycles faster when it starts with a smaller root (having all 0 initial values in the register file 310) compared to a larger root (having at least a subset of the register file 310 set to SIS).
[0070] In some situations, bugs are activated to cause failure (e.g., incorrect value in register file or memory) in a short instruction sequence, and get detected faster with SQED when the register file 310 is initialized with SIS. Conversely, in some situations, bugs are activated to cause failure in longer instruction sequences (e.g., deeper clock cycles), and get detected faster with SQED when the entire register file 310 is initialized with 0. SQED verification steps are optionally set up to do some runs with SIS followed by runs with all 0 initial values in the first and second sets of registers 310A and 310B of the register file 310.
[0071] In some embodiments, selective SIS is applied to allow SIS for a subset of registers or register bits while starting the rest of the registers or bits from all 0s. This keeps the exponential size growth of the root of the search tree under control, while allowing faster detection of bugs that need longer instruction sequences for detection. 
Selective SIS is implemented in at least one of two independent dimensions, within register bits and/or among registers. Specifically, in some embodiments, referring to Figure 10B, the QED module 306 initializes each and every bit of a subset of registers 708 (e.g., 708C and 708D) in the first set of registers 310A of the register file 310 and a corresponding subset of registers 708 (e.g., 708C’ and 708D’) in the second set of registers 310B of the register file 310 to a respective SIS. The QED module 306 also initializes each of a remaining subset of registers 708 (e.g., 708E and 708F) in the first set of registers 310A of the register file 310 and a corresponding remaining subset of registers 708 (e.g., 708E’ and 708F’) in the second set of registers 310B of the register file 310 to 0. Further, in some embodiments, each bit of a single register of the first set of registers 310A of the register file 310 and a corresponding single register of the second set of registers 310B of the register file 310 is set to the SIS.
[0072] In an example, only one register (e.g., a register corresponding to i=15) is left to be initialized to the SIS as follows:
for (i = 0; i < 15; i++) begin
assume property (@(posedge clk) (sqed_init) |-> (arf[i] == 32'b0));
end
[0073] In some embodiments, referring to Figure 10C, each bit of a subset of the predefined number of bits in each of a subset of registers of the register file is initialized to a respective SIS. For example, the subset of registers include registers 708C, 708D and 708E of the first set of registers 310A of the register file 310, and the subset of the predefined number of bits include two or more bits, i.e., bits 5 and 27 of bits 0-31 in registers 708C and 708D, bits 1-3 and 20 in register 708E. The corresponding bits of the second set of registers 310B of the register file 310 are also initialized to the SIS. In some embodiments, if there is any remaining bit in the predefined number of bits in each of the subset of registers, the QED module 306 initializes a remaining subset of the predefined number of bits in each of the subset of registers (e.g., 708C-708E) to 0. In some embodiments, if there is any remaining register (e.g., 708F) in the first set of registers 310A of the register file 310, the QED module 306 initializes all bits of the remaining subset of registers to 0.
[0074] In an example, the subset of registers includes only one register (e.g., a register corresponding to i=15), and the subset of the predefined number of bits includes only 1 bit (e.g., LSB). Only the LSB of register 15 is left to be uninitialized to 0 or 1, i.e., set to the SIS, as follows:
for (i = 0; i < 16; i++) begin
if (i == 15)
assume property (@(posedge clk) (sqed_init) |-> (arf[i][31:1] == 31'b0));
else
assume property (@(posedge clk) (sqed_init) |-> (arf[i] == 32'b0));
end
[0075] In some embodiments, referring to Figure 10D, the QED module 306 initializes a least significant bit (LSB) of each register of the register file to an SIS. The initialization constraint is written as follows:
logic [15:0] [31:0] arf;
for (i = 0; i < 16; i++) begin
assume property (@(posedge clk) (sqed_init) |-> (arf[i][31:1] == 31'b0));
end
[0076] In some embodiments, referring to Figure 10E, the QED module 306 initializes each of a subset of fixed bits (bits 4, 5, 16, 27) of each of a subset of registers 708 (e.g., 708C, 708D, 708E) in the first set of registers 310A of the register file 310 and a corresponding subset of fixed bits (bits 4, 5, 16, 27) of each of a corresponding subset of registers 708 (e.g., 708C’, 708D’, 708E’) in the second set of registers 310B of the register file 310 to a respective SIS. The QED module 306 initializes each remaining bit of the first set of registers 310A of the register file 310 and a corresponding remaining bit of the second set of registers 310B of the register file 310 to 0 or 1 in a duplicated manner.
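The trade-off in [0069] through [0071] can be made concrete by counting the initial states each selective-SIS pattern admits. This is a minimal sketch that considers only the register file (memory is ignored) and assumes the equal state constraint holds, so symbolic bits in the second set of registers add no freedom beyond their counterparts in the first set. The function name `initial_state_count` and the mask encoding are hypothetical.

```python
from typing import List

HALF = 8  # registers in the first set; the second set mirrors it


def initial_state_count(sis_masks: List[int]) -> int:
    """Number of EDDI-consistent initial register-file states.

    sis_masks[i] has a 1 for every bit of first-set register i left symbolic.
    The paired register in the second set must hold the same value, so it
    contributes no additional freedom.
    """
    if len(sis_masks) != HALF:
        raise ValueError("expected one mask per first-set register")
    free_bits = sum(bin(m).count("1") for m in sis_masks)
    return 2 ** free_bits


ALL_ZERO = [0] * HALF                  # Figure 10A style: no symbolic bits
LSB_ONLY = [0x1] * HALF                # Figure 10D style: LSB of every register
FIXED_BITS = (1 << 4) | (1 << 5) | (1 << 16) | (1 << 27)
FIG_10E_STYLE = [FIXED_BITS] * 3 + [0] * (HALF - 3)   # bits 4, 5, 16, 27 of three registers
FULL_SIS = [0xFFFFFFFF] * HALF         # every bit symbolic
```

With these patterns the root of the search tree holds 1 state for the all-zero case, 2^8 states when only the LSBs are symbolic, 2^12 states for the Figure 10E style pattern, and 2^256 states when every bit is symbolic, which shows how selective SIS bounds the size of the root.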
[0077] In various embodiments of this application, during pre-silicon verification and debugging, each of a subset of bits of the first set of registers 310A of the register file 310 and a corresponding subset of bits of the second set of registers 310B of the register file 310 is initialized to a respective SIS in a duplicated manner. The subset of bits of the first set of registers 310A are less than all bits of the first set of registers 310A. In SystemVerilog, each of the subset of bits is not initialized to a predefined value (e.g., 0 or 1) in an “assume property” statement, and automatically set to the respective SIS. Locations of the subset of bits of the first set of registers 310A of the register file 310 are flexible, except that the subset of bits of the first set of registers 310A and the corresponding subset of bits of the second set of registers 310B are set to the SIS in a duplicated manner. Additionally, referring to Figures 10A-10E, any bit that is initialized with 0 can be initialized to 1, except that bits in the first and second sets of registers 310A and 310B are initialized to 0 or 1 in a duplicated manner (i.e., each bit in the first set of registers 310A and a corresponding bit in the second set of registers 310B are jointly set to the same value of 0 or 1).
[0078] It is noted that various embodiments described with reference to Figures 7, 8A-8B, 9, and 10A-10E are directed to the first set of registers 310A and the second set of registers 310B of the register file 310. As explained above, in some embodiments, the QED module 306 duplicates original instructions to duplicate instructions, and stores output values of the original and duplicate instructions into two separate sets of memory units of the memory array 312 of the DUT memory. 
When the output values are stored in the memory array 312, the QED module 306 determines whether the instructions are properly implemented by the digital hardware system based on a subset of each output value (e.g., a least significant bit (LSB) of each output value), which is stored in memory array 312. Likewise, when the output values are stored in the memory array 312, the QED module 306 initializes a subset of a first set of memory units of the memory array 312 and a corresponding subset of a second set of memory units of the memory array 312 to a respective SIS in a duplicated manner, e.g., initializes the LSB of each memory unit of the memory array 312 to the respective SIS. Different embodiments described with reference to Figures 7, 8A-8B, 9, and 10A-10E are similarly applicable to the first set of memory units of the memory array 312 and the second set of memory units of the memory array 312.
[0079] Figure 11 is a flow diagram of a method 1100 for pre-silicon verification and debugging of a digital hardware system (e.g., a processor 102), in accordance with some embodiments. For convenience, the method 1100 is described as being implemented by a computer system 150 (specifically, a QED module 306 of the computer system 150). The method 1100 is, optionally, governed by instructions that are stored in a non-transitory computer readable storage medium and that are executed by one or more processors of the computer system. Each of the operations shown in Figure 11 may correspond to instructions stored in a computer memory or non-transitory computer readable storage medium (e.g., memory 154 of the computer system 150 in Figures 1B and 6). The computer readable storage medium may include a magnetic or optical disk storage device, solid state storage devices such as Flash memory, or other non-volatile memory device or devices. The instructions stored on the computer readable storage medium may include one or more of: source code, assembly language code, object code, or other instruction format that is interpreted by one or more processors. Some operations in the method 1100 may be combined and/or the order of some operations may be changed.
[0080] The computer system 150 executes (1102) formal analysis of the digital hardware system including a register file, and the formal analysis includes a sequence of first instructions 702. The sequence of first instructions 702 are duplicated (1104) to a sequence of second instructions 704. The computer system 150 implements (1106) the sequence of first instructions 702 and the sequence of second instructions 704, and stores (1108) output values of the sequences of first and second instructions 702 and 704 into a first set of registers 310A and a second set of registers 310B of the register file 310, respectively. The first and second sets of registers 310A and 310B are entirely distinct from each other, and do not overlap at all. The first and second sets of registers have the same number of registers. Each output value is (1112) stored in a respective register 708 of the register file 310 and has a predefined number of bits (e.g., 32 bits). The computer system 150 determines (1114) whether the sequence of first instructions 702 are properly implemented by the digital hardware system based on a subset of each output value stored in the register file 310. The subset of each output value includes (1116) a first number of fixed bits (e.g., an LSB) 906 in a respective output value, and the first number is less than the predefined number.
[0081] In some embodiments, for each respective first output value 902 of the sequence of first instructions 702 stored in the first set of registers 310A of the register file 310, the computer system 150 identifies a respective second output value 904 in the second set of registers 310B. The respective second output value 904 corresponds to a second instruction 704 that is duplicated from a first instruction 702 corresponding to the respective first output value 902. Each of the first number of fixed bits 906A in the respective first output value 902 is compared with a respective one of the first number of fixed bits 906B in the respective second output value 904.
[0082] In some embodiments, in accordance with a determination that the first number of fixed bits in each first output value matches the first number of fixed bits in the respective second output value, the computer system 150 determines (1118) that the sequence of first instructions 702 are properly implemented by the digital hardware system.
[0083] In some embodiments, for a first output value 902 of the sequence of first instructions 702 stored in the first set of registers 310A of the register file 310, the computer system 150 detects a mismatch in at least one of the first number of fixed bits 906A in the first output value 902 and a respective one of the first number of fixed bits 906B in a corresponding second output value 904 of the sequence of second instructions 704. In accordance with a determination that the sequence of first instructions 702 are improperly implemented by the digital hardware system, the computer system aborts comparing any remaining bit of the first output value 902 with any respective remaining bit of the second output value 904.
[0084] In some embodiments, in accordance with a determination that the subset of a first output value 902 of the sequence of first instructions 702 is distinct from the subset of a corresponding second output value 904, the computer system 150 detects (1120) an error with implementation of the sequence of first instructions 702. The corresponding second output value 904 is generated by a corresponding one of the sequence of second instructions 704 that is duplicated from the one of the sequence of first instructions 702.
[0085] In some embodiments, each register 708 of the register file 310 is an integer register. The first number is equal to 1. For each first output value 902, the subset of the respective first output value 902 includes a least significant bit (LSB). Alternatively, in some embodiments, each register 708 of the register file 310 is a floating point register. For each register 708, the predefined number of bits include one or more mantissa bits, one or more exponent bits, and a sign bit. The one or more mantissa bits include the LSB of the respective output value. In some embodiments, the subset of the respective first output value 902 always includes the sign bit. Further, in some situations, the subset of the respective first output value 902 has two bits including the sign bit and one of the exponent bit(s). Further, in some situations, the subset of the respective first output value 902 has three bits including the sign bit, one of the exponent bit(s), and the LSB of the one or more mantissa bits. Alternatively, in some situations, the first number is equal to 4 or above, and for each first output value 902, the subset of the respective first output value 902 includes the sign bit, at least one mantissa bit, and at least one exponent bit.
[0086] In some embodiments, the subset of each output value includes a first subset and each output value includes a second subset that is supplemental to the first subset. After determining whether the sequence of first instructions 702 are properly implemented based on the first subset of each output value, the computer system 150 determines whether the sequence of first instructions 702 are properly implemented by the digital hardware system based on at least the second subset of each output value. In some situations, if the first subset of each output value is correct, the computer system 150 continues to check all bits of each output value.
[0087] In some embodiments, the first set of registers 310A and the second set of registers 310B of the register file 310 are initialized in a duplicated manner, such that the first set of registers 310A and the second set of registers 310B have equal initial states to be used to implement the sequence of first instructions 702 and the sequence of second instructions 704. In some embodiments, each and every bit of the first and second sets of registers 310A and 310B are initialized to a respective predefined value of 0 or 1 in a duplicated manner. For example, all bits of the first and second sets of registers 310A and 310B of the register file 310 are initialized to 0. In some embodiments, each and every bit of the first and second sets of registers 310A and 310B of the register file 310 is initialized to a respective symbolic initial state (SIS).
[0088] Referring to Figure 10B, in some embodiments, the computer system 150 initializes each and every bit of a subset of registers in the first set of registers 310A of the register file 310 and a corresponding subset of registers in the second set of registers 310B of the register file 310 to a respective SIS, and initializes each and every bit of a remaining subset of registers in the first set of registers 310A of the register file 310 and a corresponding remaining subset of registers in the second set of registers 310B of the register file 310 to a respective predefined value of 0 or 1. Further, in some embodiments, each and every bit of a single register of the first set of registers 310A of the register file 310 and a corresponding single register of the second set of registers 310B of the register file 310 are initialized to an SIS.
[0089] Referring to Figure 10C, in some embodiments, the computer system 150 initializes each bit of a subset of the predefined number of bits in each of a subset of registers of the register file 310 to a respective SIS, and initializes a remaining subset of the predefined number of bits in each of the subset of registers of the register file 310 to 0.
[0090] In some embodiments, the computer system 150 initializes a least significant bit (LSB) of each register of the register file 310 to an SIS.
[0091] In some embodiments, the computer system 150 initializes each of a subset of fixed bits of each of a subset of registers in the first set of registers 310A of the register file 310 and a corresponding subset of fixed bits of each of a corresponding subset of registers in the second set of registers 310B of the register file 310 to a respective SIS. The computer system 150 also initializes each remaining bit of the first set of registers 310A of the register file 310 and a corresponding remaining bit of the second set of registers 310B of the register file 310 to 0 or 1 in a duplicated manner. Further, in an example, an LSB of a first register of the first set of registers 310A and an LSB of a corresponding register of the second set of registers 310B of the register file 310 are initialized to respective SIS in a duplicated manner.
[0092] Referring to Figure 7, in some embodiments, the sequences of first and second instructions 702 and 704 are interleaved to an interleaved sequence of instructions 706 without varying internal orders of the sequences of first and second instructions 702 and 704. The sequence of first instructions 702 and the sequence of second instructions 704 are implemented sequentially according to an order of instructions in the interleaved sequence of instructions 706.
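The interleaving described in [0092] preserves the internal order of each sequence. A minimal Python sketch of that property follows; the pattern-string interface, the function names `interleave` and `project`, and the instruction strings are hypothetical and are used only to make the ordering constraint concrete.

```python
from typing import List, Sequence, Tuple


def interleave(first: Sequence[str], second: Sequence[str],
               pattern: str) -> List[Tuple[str, str]]:
    """Merge two instruction sequences following a pattern such as "ABAB".

    'A' takes the next original instruction and 'B' takes the next duplicate
    instruction, so the internal order of each sequence is never changed.
    """
    if (pattern.count("A") != len(first)
            or pattern.count("B") != len(second)
            or len(pattern) != len(first) + len(second)):
        raise ValueError("pattern must consume both sequences exactly once")
    it_a, it_b = iter(first), iter(second)
    return [(tag, next(it_a) if tag == "A" else next(it_b)) for tag in pattern]


def project(merged: List[Tuple[str, str]], tag: str) -> List[str]:
    """Recover one of the two sequences from the interleaved sequence."""
    return [instr for t, instr in merged if t == tag]
```

For any valid pattern, projecting the merged sequence back onto tag "A" or tag "B" returns the original or duplicate sequence unchanged, which is the invariant stated in [0092].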
[0093] It should be understood that the particular order in which the operations in Figure 11 have been described is merely exemplary and is not intended to indicate that the described order is the only order in which the operations could be performed. One of ordinary skill in the art would recognize various ways to implement pre-silicon verification and debugging of the digital hardware system as described herein. Additionally, it should be noted that details of other processes described with respect to Figures 1-10 and 12 are also applicable in an analogous manner to the method 1100 described above with respect to Figure 11.
[0094] Figure 12 is a flow diagram of another method 1200 for pre-silicon verification and debugging of a digital hardware system, in accordance with some embodiments. For convenience, the method 1200 is described as being implemented by a computer system 150 (e.g., a QED module 306). The method 1200 is, optionally, governed by instructions that are stored in a non-transitory computer readable storage medium and that are executed by one or more processors of the computer system. Each of the operations shown in Figure 12 may correspond to instructions stored in a computer memory or non-transitory computer readable storage medium (e.g., memory 154 of the computer system 150 in Figures 1B and 6). The computer readable storage medium may include a magnetic or optical disk storage device, solid state storage devices such as Flash memory, or other non-volatile memory device or devices. The instructions stored on the computer readable storage medium may include one or more of: source code, assembly language code, object code, or other instruction format that is interpreted by one or more processors. Some operations in the method 1200 may be combined and/or the order of some operations may be changed.
[0095] The computer system 150 executes (1202) formal analysis of the digital hardware system, and the formal analysis includes a sequence of first instructions. The sequence of first instructions is duplicated (1204) to a sequence of second instructions. The computer system 150 implements (1206) the sequence of first instructions and the sequence of second instructions, and stores (1208) output values of the sequences of first and second instructions into a first set of registers 310A and a second set of registers 310B of the register file 310, respectively. The first and second sets of registers are distinct from each other, and have the same number of registers. Each output value is stored (1212) in a respective register of the register file 310 and has a predefined number of bits. The computer system 150 determines (1214) whether the sequence of first instructions is properly implemented by the digital hardware system based on the output values stored in the register file 310. Each of a subset of bits of the first set of registers 310A of the register file 310 and a corresponding subset of bits of the second set of registers 310B of the register file 310 is (1216) initialized to a respective SIS in a duplicated manner. The subset of bits of the first set of registers 310A is less than (1218) all bits of the first set of registers 310A.
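The flow of the method 1200 can be approximated by the following Python sketch, offered under stated assumptions rather than as the disclosed implementation. Exhaustive enumeration of the symbolic bits stands in for a formal analysis tool, the register count N, the bit width WIDTH, the add-only instruction format, and the injected fault are all hypothetical, and only the LSB of one register is treated as an SIS.

```python
from itertools import product

N = 2        # registers per set (hypothetical size)
WIDTH = 4    # predefined number of bits per register (kept small on purpose)
MASK = (1 << WIDTH) - 1

# First instructions as (dst, src_a, src_b), meaning dst = src_a + src_b.
FIRST = [(1, 0, 0), (0, 1, 1)]
# Duplicate each instruction onto the second register set (index + N).
SECOND = [(d + N, a + N, b + N) for d, a, b in FIRST]


def execute(program, regs, faulty):
    """Run the program on a toy model of the design under test."""
    for step, (d, a, b) in enumerate(program):
        value = (regs[a] + regs[b]) & MASK
        if faulty and step == 2:
            value ^= 1  # contrived fault: third executed instruction flips the LSB
        regs[d] = value
    return regs


def check(faulty):
    """Return a counterexample assignment of the symbolic bits, or None."""
    symbolic_bits = [(0, 0)]  # (register, bit): only the LSB of register 0 is symbolic
    for assignment in product((0, 1), repeat=len(symbolic_bits)):
        first_set = [0] * N  # every non-symbolic bit is initialized to 0
        for (reg, bit), val in zip(symbolic_bits, assignment):
            first_set[reg] |= val << bit
        regs = first_set + list(first_set)  # duplicated initialization of both sets
        execute(FIRST + SECOND, regs, faulty)
        if regs[:N] != regs[N:]:
            return assignment
    return None


print(check(faulty=False))
print(check(faulty=True))
```

Because only a subset of bits is symbolic, the search space in this toy model is two assignments instead of 2^(N x WIDTH), which mirrors the motivation for initializing fewer than all bits to an SIS.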
[0096] Referring to Figure 10B, in some embodiments, the computer system 150 initializes each and every bit of a subset of registers in the first set of registers 310A of the register file 310 and a corresponding subset of registers in the second set of registers 310B of the register file 310 to a respective SIS, and initializes each and every bit of a remaining subset of registers in the first set of registers 310A of the register file 310 and a corresponding remaining subset of registers in the second set of registers 310B of the register file 310 to 0. Further, in some embodiments, each and every bit of a single register of the first set of registers 310A of the register file 310 and a corresponding single register of the second set of registers 310B of the register file 310 are initialized to an SIS.
[0097] Referring to Figure 10C, in some embodiments, the computer system 150 initializes each bit of a subset of the predefined number of bits in each of a subset of registers of the register file 310 to a respective SIS, and initializes a remaining subset of the predefined number of bits in each of the subset of registers of the register file 310 to 0.
[0098] In some embodiments, the computer system 150 initializes a least significant bit (LSB) of each register of the register file 310 to an SIS.
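The three initialization patterns of paragraphs [0096]-[0098] can be pictured with a small Python sketch. The representation is an assumption made for illustration: a bit is either the constant 0 or a string naming a symbolic bit, bit index 0 is taken to be the LSB, and the helper names (init_set, whole_registers, bit_fields, lsb_only, duplicate) are hypothetical.

```python
from typing import Iterable, List, Set, Tuple, Union

Bit = Union[int, str]        # constant 0/1, or the name of a symbolic bit
Position = Tuple[int, int]   # (register index, bit index), bit 0 being the LSB


def init_set(n_regs: int, width: int, symbolic: Set[Position]) -> List[List[Bit]]:
    """Build one register set: listed positions are symbolic, all others are 0."""
    return [[f"s{r}_{b}" if (r, b) in symbolic else 0 for b in range(width)]
            for r in range(n_regs)]


def whole_registers(regs: Iterable[int], width: int) -> Set[Position]:
    """Figure 10B style: every bit of the selected registers is symbolic."""
    return {(r, b) for r in regs for b in range(width)}


def bit_fields(regs: Iterable[int], bits: Iterable[int]) -> Set[Position]:
    """Figure 10C style: selected bits of the selected registers are symbolic."""
    bits = list(bits)
    return {(r, b) for r in regs for b in bits}


def lsb_only(n_regs: int) -> Set[Position]:
    """Paragraph [0098] style: only the LSB of each register is symbolic."""
    return {(r, 0) for r in range(n_regs)}


def duplicate(first_set: List[List[Bit]]) -> List[List[Bit]]:
    """The second set reuses the same symbols, giving equal initial states."""
    return [list(reg) for reg in first_set]


first = init_set(2, 4, lsb_only(2))
second = duplicate(first)
print(first, second)
```

Sharing the symbol names between the two sets is how this sketch expresses initialization "in a duplicated manner": whatever value a solver would pick for a symbol applies to both sets at once.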
[0099] In some embodiments, the computer system 150 initializes each of a subset of fixed bits of each of a subset of registers in the first set of registers 310A of the register file 310 and a corresponding subset of fixed bits of each of a corresponding subset of registers in the second set of registers 310B of the register file 310 to a respective SIS. The computer system 150 also initializes each remaining bit of the first set of registers 310A of the register file 310 and a corresponding remaining bit of the second set of registers 310B of the register file 310 to 0 or 1 in a duplicated manner. Further, in an example, an LSB of a first register of the first set of registers 310A and an LSB of a corresponding register of the second set of registers 310B of the register file 310 are initialized to a respective SIS in a duplicated manner.
[00100] In some embodiments, only a subset of each output value stored in the register file 310 is applied to determine whether the sequence of first instructions 702 is properly implemented by the digital hardware system. The subset of each output value includes a first number of fixed bits in a respective output value, and the first number is less than the predefined number.
[00101] It should be understood that the particular order in which the operations in Figure 12 have been described is merely exemplary and is not intended to indicate that the described order is the only order in which the operations could be performed. One of ordinary skill in the art would recognize various ways to implement pre-silicon verification and debugging of the digital hardware system as described herein. Additionally, it should be noted that details of other processes described above with respect to Figures 1-11 are also applicable in an analogous manner to the method 1200 described above with respect to Figure 12.
[00102] Various embodiments described with reference to Figures 11 and 12 are directed to the register file 310. As explained above, in some embodiments, the QED module 306 stores output values of the original and duplicate instructions into two separate sets of memory units of the memory array 312 of the DUT memory 304. In one aspect, when the output values are stored in the memory array 312, the computer system 150 executes formal analysis of the digital hardware system, and the formal analysis includes a sequence of first instructions 702. The sequence of first instructions 702 is duplicated to a sequence of second instructions 704. The computer system 150 implements the sequence of first instructions 702 and the sequence of second instructions 704, and stores output values of the sequences of first and second instructions 702 and 704 into a first set of memory units and a second set of memory units of the memory array 312, respectively. The first and second sets of memory units of the memory array 312 are entirely distinct from each other, and do not overlap at all. The first and second sets of memory units of the memory array 312 have the same number of memory units. Each output value is stored in a memory unit of the memory array 312 and has a predefined number of bits (e.g., 32 bits). The computer system 150 determines whether the sequence of first instructions 702 is properly implemented by the digital hardware system based on a subset of each output value stored in the memory array 312. The subset of each output value includes a first number of fixed bits (e.g., an LSB) 906 in a respective output value, and the first number is less than the predefined number.
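The subset-based determination can be sketched in Python as a masked comparison. This is a minimal illustration under assumptions: output values are modeled as plain integers, the function name compare_subset and the default mask selecting only the LSB are hypothetical, and the early return stands in for abandoning the comparison of any remaining bits once a mismatch is found.

```python
from typing import Optional, Sequence


def compare_subset(first_vals: Sequence[int], second_vals: Sequence[int],
                   mask: int = 0b1) -> Optional[int]:
    """Compare only the bits selected by `mask` in corresponding output values.

    Returns the index of the first mismatching pair, or None if every
    selected bit matches. The default mask selects only the LSB.
    """
    for idx, (a, b) in enumerate(zip(first_vals, second_vals)):
        if (a ^ b) & mask:
            # A selected bit differs: report it and stop without
            # examining the remaining bits or the remaining pairs.
            return idx
    return None


# Hypothetical 32-bit output values of original and duplicate instructions.
originals = [0x00000004, 0x00000002]
duplicates = [0x00000004, 0x00000003]
print(compare_subset(originals, duplicates))              # LSB-only check
print(compare_subset(originals, duplicates, 0xFFFFFFFF))  # full-width check
```

In this toy model a narrow mask cannot see a difference confined to the unselected bits, so a second pass with a complementary mask, in the spirit of the supplemental second subset recited in the claims, could follow the first check.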
[00103] In another aspect, the computer system 150 executes formal analysis of the digital hardware system, and the formal analysis includes a sequence of first instructions. The sequence of first instructions is duplicated to a sequence of second instructions. The computer system 150 implements the sequence of first instructions and the sequence of second instructions, and stores output values of the sequences of first and second instructions into a first set of memory units and a second set of memory units of the memory array 312, respectively. The first and second sets of memory units are distinct from each other, and have the same number of memory units. Each output value is stored in a respective memory unit of the memory array 312 and has a predefined number of bits. The computer system 150 determines whether the sequence of first instructions is properly implemented by the digital hardware system based on the output values stored in the memory array 312. Each of a subset of bits of the first set of memory units of the memory array 312 and a corresponding subset of bits of the second set of memory units of the memory array 312 is initialized to a respective SIS in a duplicated manner. The subset of bits of the first set of memory units is less than all bits of the first set of memory units.
[00104] Different embodiments described with reference to Figures 11 and 12 are similarly applicable to the first set of memory units of the memory array 312 and the second set of memory units of the memory array 312.
[00105] The terminology used in the description of the various described implementations herein is for the purpose of describing particular implementations only and is not intended to be limiting. As used in the description of the various described implementations and the appended claims, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will also be understood that the term “and/or” as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items. It will be further understood that the terms “includes,” “including,” “comprises,” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. Additionally, it will be understood that, although the terms “first,” “second,” etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another.
[00106] As used herein, the term “if” is, optionally, construed to mean “when” or “upon” or “in response to determining” or “in response to detecting” or “in accordance with a determination that,” depending on the context. Similarly, the phrase “if it is determined” or “if [a stated condition or event] is detected” is, optionally, construed to mean “upon determining” or “in response to determining” or “upon detecting [the stated condition or event]” or “in response to detecting [the stated condition or event]” or “in accordance with a determination that [a stated condition or event] is detected,” depending on the context.
[00107] The foregoing description, for purpose of explanation, has been described with reference to specific embodiments. However, the illustrative discussions above are not intended to be exhaustive or to limit the claims to the precise forms disclosed. Many modifications and variations are possible in view of the above teachings. The embodiments were chosen and described in order to best explain principles of operation and practical applications, to thereby enable others skilled in the art.
[00108] Although various drawings illustrate a number of logical stages in a particular order, stages that are not order dependent may be reordered and other stages may be combined or broken out. While some reordering or other groupings are specifically mentioned, others will be obvious to those of ordinary skill in the art, so the ordering and groupings presented herein are not an exhaustive list of alternatives. Moreover, it should be recognized that the stages can be implemented in hardware, firmware, software or any combination thereof.

Claims

What is claimed is:
1. A method for pre-silicon verification and debugging of a digital hardware system, comprising: executing formal analysis of the digital hardware system including a register file, the formal analysis including a sequence of first instructions; duplicating the sequence of first instructions to a sequence of second instructions; implementing the sequence of first instructions and the sequence of second instructions; storing output values of the sequences of first and second instructions into a first set of registers and a second set of registers of the register file, respectively, each output value stored in a respective register of the register file and having a predefined number of bits; and determining whether the sequence of first instructions are properly implemented by the digital hardware system based on a subset of each output value stored in the register file, the subset of each output value including a first number of fixed bits in a respective output value, the first number less than the predefined number.
2. The method of claim 1, determining whether the sequence of first instructions are properly implemented by the digital hardware system further comprising: for each respective first output value of the sequence of first instructions stored in the first set of registers of the register file: identifying a respective second output value in the second set of registers, the respective second output value corresponding to a second instruction that is duplicated from a first instruction corresponding to the respective first output value; and comparing each of the first number of fixed bits in the respective first output value with a respective one of the first number of fixed bits in the respective second output value.
3. The method of claim 2, determining whether the sequence of first instructions are properly implemented by the digital hardware system further comprising: in accordance with a determination that the first number of fixed bits in each first output value matches the first number of fixed bits in the respective second output value, determining that the sequence of first instructions are properly implemented by the digital hardware system.
4. The method of claim 1, determining whether the sequence of first instructions are properly implemented by the digital hardware system further comprising: for a first output value of the sequence of first instructions stored in the first set of registers of the register file, detecting a mismatch in at least one of the first number of fixed bits in the first output value and a respective one of the first number of fixed bits in a corresponding second output value of the sequence of second instructions; determining that the sequence of first instructions are improperly implemented by the digital hardware system; and aborting comparing any remaining bit of the first output value with any respective remaining bit of the second output value.
5. The method of claim 1, further comprising: in accordance with a determination that the subset of a first output value of the sequence of first instructions is distinct from the subset of a corresponding second output value, detecting an error with implementation of the sequence of first instructions; wherein the corresponding second output value is generated by a corresponding one of the sequence of second instructions that is duplicated from the one of the sequence of first instructions.
6. The method of any of claims 1-5, wherein: each register of the register file is an integer register; the first number is equal to 1; and for each first output value, the subset of the respective first output value includes a least significant bit (LSB).
7. The method of any of claims 1-5, wherein: each register of the register file is a floating point register; the predefined number of bits include at least one mantissa bit, at least one exponent bit, and a sign bit; the first number is equal to 1, 2, or 3; and for each first output value, the subset of the respective first output value includes the first number of bits selected from the mantissa, exponent, and sign bits.
8. The method of claim 1, wherein the subset of each output value includes a first subset and each output value includes a second subset that is supplemental to the first subset, further comprising: after determining whether the sequence of first instructions are properly implemented based on the first subset of each output value, determining whether the sequence of first instructions are properly implemented by the digital hardware system based on the second subset of each output value.
9. The method of any of claims 1-8, executing formal analysis of the digital hardware system further comprising: initiating the first set of registers and the second set of registers of the register file in a duplicated manner, such that the first set of registers and the second set of registers have equal initial states to be used to implement the sequence of first instructions and the sequence of second instructions.
10. The method of any of claims 1-9, further comprising: initializing each and every bit of the first and second sets of registers of the register file to a respective predefined value of 0 or 1.
11. The method of any of claims 1-9, further comprising: initializing each and every bit of the first and second sets of registers of the register file to a respective symbolic initial state (SIS).
12. The method of any of claims 1-9, further comprising: initializing each bit of a subset of registers in the first set of registers of the register file and a corresponding subset of registers in the second set of registers of the register file to a respective SIS; and initializing each bit of a remaining subset of registers in the first set of registers of the register file and a corresponding remaining subset of registers in the second set of registers of the register file to a respective predefined value of 0 or 1.
13. The method of claim 12, wherein each bit of a single register of the first set of registers of the register file and a corresponding single register of the second set of registers of the register file are initialized to an SIS.
14. The method of any of claims 1-9, further comprising: initializing each bit of a subset of the predefined number of bits in each of a subset of registers of the register file to a respective SIS; and initializing a remaining subset of the predefined number of bits in each of the subset of registers of the register file to 0.
15. The method of any of claims 1-9, further comprising: initializing a least significant bit (LSB) of each register of the register file to an SIS.
16. The method of any of claims 1-9, further comprising: initializing each of a subset of fixed bits of each of a subset of registers in the first set of registers of the register file and a corresponding subset of fixed bits of each of a corresponding subset of registers in the second set of registers of the register file to a respective SIS; and initializing each remaining bit of the first set of registers of the register file and a corresponding remaining bit of the second set of registers of the register file to a respective predefined value of 0 or 1 in a duplicated manner.
17. The method of claim 16, further comprising: initializing an LSB of a first register of the first set of registers and an LSB of a corresponding register of the second set of registers to an SIS in a duplicated manner.
18. The method of any of claims 1-17, further comprising: interleaving the sequences of first and second instructions to an interleaved sequence of instructions without varying internal orders of the sequences of first and second instructions, wherein the sequence of first instructions and the sequence of second instructions are implemented sequentially according to an order of instructions in the interleaved sequence of instructions.
19. A method for pre-silicon verification and debugging of a digital hardware system, comprising: executing formal analysis of the digital hardware system including a register file, the formal analysis including a sequence of first instructions; duplicating the sequence of first instructions to a sequence of second instructions; implementing the sequence of first instructions and the sequence of second instructions; storing output values of the sequences of first and second instructions into a first set of registers and a second set of registers of the register file, respectively, each output value stored in a respective register of the register file and having a predefined number of bits; determining whether the sequence of first instructions are properly implemented by the digital hardware system based on the output values stored in the register file; wherein executing formal analysis of the digital hardware system further includes initializing each of a subset of bits of the first set of registers of the register file and a corresponding subset of bits of the second set of registers of the register file to a respective SIS in a duplicated manner, and the subset of bits of the first set of registers are less than all bits of the first set of registers.
20. The method of claim 19, initializing each of the subset of bits of the first set of registers of the register file and the corresponding subset of bits of the second set of registers of the register file to the respective SIS in the duplicated manner further comprising: initializing each bit of a subset of registers in the first set of registers of the register file and a corresponding subset of registers in the second set of registers of the register file to the respective SIS; and initializing each bit of a remaining subset of registers in the first set of registers of the register file and a corresponding remaining subset of registers in the second set of registers of the register file to a respective predefined value of 0 or 1.
21. The method of claim 20, wherein each bit of a single register of the first set of registers of the register file and a corresponding single register of the second set of registers of the register file are initialized to an SIS.
22. The method of claim 19, initializing each of the subset of bits of the first set of registers of the register file and the corresponding subset of bits of the second set of registers of the register file to the respective SIS in the duplicated manner further comprising: initializing each bit of a subset of the predefined number of bits in each of a subset of registers of the register file to the respective SIS; and initializing a remaining subset of the predefined number of bits in each of the subset of registers of the register file to 0.
23. The method of claim 19, initializing each of the subset of bits of the first set of registers of the register file and the corresponding subset of bits of the second set of registers of the register file to the respective SIS in the duplicated manner further comprising: initializing a least significant bit (LSB) of each register of the register file to an SIS.
24. The method of claim 19, initializing each of the subset of bits of the first set of registers of the register file and the corresponding subset of bits of the second set of registers of the register file to the respective SIS in the duplicated manner further comprising: initializing each of a subset of fixed bits of each of a subset of registers in the first set of registers of the register file and a corresponding subset of fixed bits of each of a corresponding subset of registers in the second set of registers of the register file to the respective SIS; and initializing each remaining bit of the first set of registers of the register file and a corresponding remaining bit of the second set of registers of the register file to a respective predefined value of 0 or 1 in a duplicated manner.
25. The method of claim 24, further comprising: initializing an LSB of a first register of the first set of registers and an LSB of a corresponding register of the second set of registers to the SIS in a duplicated manner.
26. The method of any of claims 19-25, wherein all of the predefined number of bits of each output value stored in the register file are compared to determine whether the sequence of first instructions are properly implemented by the digital hardware system.
27. The method of any of claims 19-25, wherein only a subset of each output value stored in the register file is applied to determine whether the sequence of first instructions are properly implemented by the digital hardware system, the subset of each output value including a first number of fixed bits in a respective output value, the first number less than the predefined number.
28. The method of claim 27, determining whether the sequence of first instructions are properly implemented by the digital hardware system further comprising: for each respective first output value of the sequence of first instructions stored in the first set of registers of the register file: identifying a respective second output value in the second set of registers, the respective second output value corresponding to a second instruction that is duplicated from a first instruction corresponding to the respective first output value; and comparing each of the first number of fixed bits in the respective first output value with a respective one of the first number of fixed bits in the respective second output value.
29. The method of claim 27, determining whether the sequence of first instructions are properly implemented by the digital hardware system further comprising: in accordance with a determination that the first number of fixed bits in each first output value matches the first number of fixed bits in the respective second output value, determining that the sequence of first instructions are properly implemented by the digital hardware system.
30. The method of claim 27, determining whether the sequence of first instructions are properly implemented by the digital hardware system further comprising: for a first output value of the sequence of first instructions stored in the first set of registers of the register file, detecting a mismatch in at least one of the first number of fixed bits in the first output value and a respective one of the first number of fixed bits in a corresponding second output value of the sequence of second instructions; determining that the sequence of first instructions are improperly implemented by the digital hardware system; and aborting comparing any remaining bit of the first output value with any respective remaining bit of the second output value.
31. The method of claim 27, further comprising: in accordance with a determination that the subset of a first output value of the sequence of first instructions is distinct from the subset of a corresponding second output value, detecting an error with implementation of the sequence of first instructions; wherein the corresponding second output value is generated by a corresponding one of the sequence of second instructions that is duplicated from the one of the sequence of first instructions.
32. The method of any of claims 27-31, wherein: each register of the register file is an integer register; the first number is equal to 1; and for each first output value, the subset of the respective first output value includes a least significant bit (LSB).
33. The method of any of claims 27-31, wherein: each register of the register file is a floating point register; the predefined number of bits include at least one mantissa bit, at least one exponent bit, and a sign bit; the first number is equal to 1, 2, or 3; and for each first output value, the subset of the respective first output value includes the first number of bits selected from the mantissa, exponent, and sign bits.
34. The method of any of claims 27-33, wherein the subset of each output value includes a first subset and each output value includes a second subset that is supplemental to the first subset, further comprising: after determining whether the sequence of first instructions are properly implemented based on the first subset of each output value, determining whether the sequence of first instructions are properly implemented by the digital hardware system based on the second subset of each output value.
35. The method of any of claims 19-34, further comprising: interleaving the sequences of first and second instructions to an interleaved sequence of instructions without varying internal orders of the sequences of first and second instructions, wherein the sequence of first instructions and the sequence of second instructions are implemented sequentially according to an order of instructions in the interleaved sequence of instructions.
36. A computer system, comprising: one or more processors; and memory having instructions stored thereon, which when executed by the one or more processors cause the processors to perform a method of any of claims 1-35.
37. A non-transitory computer-readable medium, having instructions stored thereon, which when executed by one or more processors cause the processors to perform a method of any of claims 1-35.

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/US2022/033266 WO2023244210A1 (en) 2022-06-13 2022-06-13 Quick error detection and symbolic initial states in pre-silicon verification and debugging

Publications (1)

Publication Number Publication Date
WO2023244210A1 true WO2023244210A1 (en) 2023-12-21

Family

ID=89191622

Country Status (1)

Country Link
WO (1) WO2023244210A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110087861A1 (en) * 2009-10-12 2011-04-14 The Regents Of The University Of Michigan System for High-Efficiency Post-Silicon Verification of a Processor
US20140257739A1 (en) * 2013-03-07 2014-09-11 International Business Machines Corporation Implementing random content of program loops in random test generation for processor verification
US20170076116A1 (en) * 2015-09-11 2017-03-16 Freescale Semiconductor, Inc. Model-Based Runtime Detection of Insecure Behavior for System on Chip with Security Requirements
US20180165393A1 (en) * 2015-06-06 2018-06-14 The Board Of Trustees Of The Leland Stanford Junior University SYSTEM-LEVEL VALIDATION OF SYSTEMS-ON-A-CHIP (SoC)
US20200201778A1 (en) * 2018-12-20 2020-06-25 International Business Machines Corporation Methods and systems for verifying out-of-order page fault detection

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22947017

Country of ref document: EP

Kind code of ref document: A1