US20060059286A1 - Multi-core debugger - Google Patents

Multi-core debugger Download PDF

Info

Publication number
US20060059286A1
US20060059286A1 US11/042,476 US4247605A US2006059286A1 US 20060059286 A1 US20060059286 A1 US 20060059286A1 US 4247605 A US4247605 A US 4247605A US 2006059286 A1 US2006059286 A1 US 2006059286A1
Authority
US
United States
Prior art keywords
signal
interrupt
processor
core
processor cores
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US11/042,476
Inventor
Michael Bertone
David Carlson
Richard Kessler
Philip Dickinson
Muhammad Hussain
Trent Parker
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Cavium LLC
Original Assignee
Cavium Networks LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority to US60921104P priority Critical
Application filed by Cavium Networks LLC filed Critical Cavium Networks LLC
Priority to US11/042,476 priority patent/US20060059286A1/en
Assigned to CAVIUM NETWORKS reassignment CAVIUM NETWORKS ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: DICKINSON, PHILIP H., HUSSAIN, MUHAMMAD R., PARKER, TRENT, BERTONE, MICHAEL S., CARLSON, DAVID A., KESSLER, RICHARD E.
Publication of US20060059286A1 publication Critical patent/US20060059286A1/en
Assigned to CAVIUM NETWORKS, INC., A DELAWARE CORPORATION reassignment CAVIUM NETWORKS, INC., A DELAWARE CORPORATION MERGER (SEE DOCUMENT FOR DETAILS). Assignors: CAVIUM NETWORKS, A CALIFORNIA CORPORATION
Application status is Abandoned legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0806Multiuser, multiprocessor or multiprocessing cache systems
    • G06F12/084Multiuser, multiprocessor or multiprocessing cache systems with a shared cache
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/36Preventing errors by testing or debugging software
    • G06F11/362Software debugging
    • G06F11/3632Software debugging of specific synchronisation aspects
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0806Multiuser, multiprocessor or multiprocessing cache systems
    • G06F12/0813Multiuser, multiprocessor or multiprocessing cache systems with a network or matrix configuration
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0806Multiuser, multiprocessor or multiprocessing cache systems
    • G06F12/0815Cache consistency protocols
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0806Multiuser, multiprocessor or multiprocessing cache systems
    • G06F12/0815Cache consistency protocols
    • G06F12/0831Cache consistency protocols using a bus scheme, e.g. with bus monitoring or watching means
    • G06F12/0835Cache consistency protocols using a bus scheme, e.g. with bus monitoring or watching means for main memory peripheral accesses (e.g. I/O or DMA)
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0875Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with dedicated cache, e.g. instruction or stack
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0891Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches using clearing, invalidating or resetting means
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14Handling requests for interconnection or transfer
    • G06F13/20Handling requests for interconnection or transfer for access to input/output bus
    • G06F13/24Handling requests for interconnection or transfer for access to input/output bus using interrupt
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003Arrangements for executing specific machine instructions
    • G06F9/30007Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/3001Arithmetic instructions
    • G06F9/30014Arithmetic instructions with variable precision
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30098Register arrangements
    • G06F9/3012Organisation of register space, e.g. banked or distributed register file
    • G06F9/30138Extension of register space, e.g. register cache
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3824Operand accessing
    • G06F9/383Operand prefetching
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0804Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with main memory updating
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/60Details of cache memory
    • G06F2212/601Reconfiguration of cache memory
    • G06F2212/6012Reconfiguration of cache memory of operating mode, e.g. cache mode or local memory mode
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/60Details of cache memory
    • G06F2212/6022Using a prefetch buffer or dedicated prefetch cache

Abstract

In a multi-core processor, a high-speed interrupt-signal interconnect allows more than one of the processors to be interrupted at substantially the same time. For example, a global signal interconnect is coupled to each of the multiple processors, each processor being configured to selectively provide an interrupt signal, or pulse thereon. Preferably, each of the processor cores is capable of pulsing the global signal interconnect during every clock cycle to minimize delay between a triggering event and its respective interrupt signal. Each of the multiple processors also senses, or samples the global signal interconnect, preferably during the same cycle within which the pulse was provided, to determine the existence of an interrupt signal. Upon sensing an interrupt signal, each of the multiple processors responds to it substantially simultaneously. For example, an interrupt signal sampled by each of the multiple processors causes each processor to invoke a debug handler routine.

Description

    RELATED APPLICATION
  • This application claims the benefit of U.S. Provisional Application No. 60/609,211, filed on Sep. 10, 2004. The entire teachings of the above application are incorporated herein by reference.
  • BACKGROUND OF THE INVENTION
  • Complex computer systems and programs rarely work exactly as designed. During the development of a new computer system, unexpected errors or bugs may be discovered by thorough testing and exhaustive execution of a variety of programs and applications. The source or cause of an error is often not apparent from the error itself, many times an error manifests itself by locking the target system for no apparent reason. Thus, tracking down the source of the error can be problematic.
  • Software and system developers commonly use tools referred to as debuggers to identify the source of unexpected errors and to assist in their resolution. A debugger is a software program used to break (i.e., interrupt) program execution at one or more locations in an application program. Once interrupted, a user is presented with a debugger command prompt for entering debugger commands that will allow for setting breakpoints, displaying or changing memory, single stepping, and so forth. Often, processors include onboard features accessible by a debugger to facilitate access to and operation of the processor during debugging.
  • One of the most difficult tasks facing designers of embedded systems today is emulating and debugging embedded hardware and software in a real-world environment. Embedded systems are growing more complex, offering increasingly higher levels of performance, and using larger software programs than ever before. To meet the challenges of dealing with embedded systems, engineers and programmers seek advanced tools to enable their performance of appropriate levels of debugging.
  • Tracking down problems is particularly challenging when the target system includes a multi-core processor. Multi-core processors include two or more processor cores that are each capable of simultaneously executing independent programs.
  • Using standard debug features that may be provided with the individual processor cores of the multi-core processor, can provide insight into operation of the individual processor cores. Assessing operation of parallel applications being developed and executed on the multi-core processor system by debugging an individual processor core will generally be inadequate. Namely, if an operation of a first processor is interrupted as described above, the other processors will continue to operate, thereby changing the state of the system with each subsequent clock cycle as measured from the moment of interrupt.
  • SUMMARY OF THE INVENTION
  • A multi-core processor includes a global interrupt capability that selectively breaks operation of more than one of the multiple processor cores at substantially the same time, usually within a few clock cycles. A global interrupt-signal interconnect is coupled to each of the plurality of independent processor cores. Each of the processor cores includes an interrupt-signal sensor for sampling an interrupt signal on the global-signal interconnect and an interrupt-signal generator for selectively providing an interrupt signal. Each processor core respectively interrupts its execution of instructions in response to sampling an interrupt signal on the global interrupt-signal interconnect.
  • The respective interrupt-signal generator of each of the plurality of independent processor cores is coupled to the global interrupt-signal interconnect. Outputs from the respective interrupt-signal generators can be coupled together and further to the global interrupt-signal interconnect in a wired-OR configuration. Thus, each of the processor cores can individually assert an interrupt signal on the same global interrupt-signal interconnect.
  • The multi-core processor can further include an interface adapted to connect to an external device. For example, the interface can be defined by a Joint Test Action Group (JTAG) interface. In some embodiments, more than one global interrupt-signal interconnect includes are provided. In such a configuration, each of the global interrupt-signal interconnects can represent a different interrupt signal. Additionally, information that may be relevant to debugging the multi-core processor can be provided by a combination of signals asserted on the multiple interrupt-signal interconnects.
  • In some embodiments, the plurality of independent processor cores resides on a single semiconductor die. The independent processor cores can be Reduced Instruction Set Computer (RISC) processors. Alternatively or in addition, each of the multiple independent processor cores includes a respective register storing information configurable according to the sampled interrupt signal.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The foregoing and other objects, features and advantages of the invention will be apparent from the following more particular description of preferred embodiments of the invention, as illustrated in the accompanying drawings in which like reference characters refer to the same parts throughout the different views. The drawings are not necessarily to scale, emphasis instead being placed upon illustrating the principles of the invention.
  • FIG. 1 is a block diagram of a security appliance including a network services processor according to the principles of the present invention;
  • FIG. 2 is a block diagram of the network services processor shown in FIG. 1;
  • FIGS. 3A and 3B are block diagrams illustrating exemplary embodiments of a multi-core debug architecture;
  • FIG. 4 is a more detailed block diagram of one of the processor cores shown in FIGS. 3A and 3B;
  • FIG. 5 is a schematic block diagram of a debug register;
  • FIG. 6 is a schematic block diagram of the Multi Core Debug (MCD) register;
  • FIG. 7A is a schematic diagram of an exemplary Test Access Port (TAP) controller;
  • FIG. 7B is a more-detailed block diagram illustrating the interconnection of the TAPs among the multiple processor cores shown in FIGS. 3A and 3B; and
  • FIG. 8 is a more-detailed block diagram of the debug architecture within one of the processor cores shown in FIGS. 3A and 3B.
  • DETAILED DESCRIPTION OF THE INVENTION
  • A description of preferred embodiments of the invention follows.
  • Applications for multi-core processors are as limitless as applications that use a single microprocessor. Some applications that are particularly well suited for multi-core processors include telecommunications and networking. Having multiple processor cores enables a single sizeable task to be broken down into several smaller, more manageable subtasks, each subtask being executed on a different core processor. Breaking down large tasks in this way typically simplifies the overall processing of complex, high-speed data manipulations, such as those used in data security.
  • A debugging system for multi-core processors is provided to facilitate debugging these parallel applications executing on several independent processor cores. This is accomplished, at least in part, by generating internal trigger events from one or more of the multiple processor cores. These multiple trigger events can be transmitted to an external debug console using a debug interface having relatively few I/O signal lines. Preferably the debug interface is separate from the processor core's memory interface (e.g., the Dynamic Random Access Memory (DRAM) interface) to avoid interference with the parallel application. A separate debug interface also allows a majority of the hardware for the debug interface to remain useable during normal processing of the multi-core processors.
  • Combining multiple processor cores in a single system leads to a closer placement of cores with respect to each other. Reducing separation between the processor cores generally reduces propagation delay, thereby increasing communication speed between them. In some embodiments, the processor cores are provided within the same central processing unit and are interconnected using cables. Alternatively or in addition, some of the processor cores can be interconnected in the same socket, e.g., plugged into a common processor socket on a motherboard. In some applications, the multiple processors are provided together on the same semiconductor die.
  • With different processor cores operating cooperatively to implement a common function, such as packet processing in a high-speed packet processor, it may be necessary to examine the state of more than one of the multiple processor cores during any debugging activity. Thus, it would be beneficial to interrupt the multiple processor cores at substantially the same time, thereby allowing examination of the register contents and memory values attributable to any of the multiple processor cores. Once interrupted, operation of the multiple processor cores can be stepped sequentially, in unison according to operation in a debug mode. This special class of fast interrupts are referred to herein as Multi-Core Debug (MCD) interrupts.
  • To facilitate the very fast debug interrupt, a separate, high-speed interrupt-signal interconnect is provided. This separate signal interconnect allows for substantially simultaneous interruption of more than one of the multiple processor cores. For example, a global signal interconnect is coupled to each of the processor cores. Each of the processor cores, in turn, is configured to selectively provide an interrupt signal, or pulse, on the global signal interconnect. Preferably, each of the processor cores is capable of pulsing the global signal interconnect during any cycle of the processor clock. Additionally, each of the processor cores samples the global signal interconnect to determine whether any processor core has provided an interrupt signal.
  • Each of the multiple processor cores is connected to the global signal interconnect, with each core being capable of independently pulsing the signal interconnect. Once pulsed, the processor cores sampling the signal interconnect receive the interrupt substantially simultaneously. Using a logical OR configuration of the contributed pulses from all of the multiple processor cores provides the desired functionality (i.e., the global signal interconnect is asserted if any one of the interconnected processor cores asserts the interconnect).
  • Each of the processor cores includes respective debug circuitry with supporting extensions that enable concurrent multi-Core debugging. The debug circuitry is responsive to the global signal interconnect being asserted.
  • More generally, a multi-core processor architecture includes multiple global signal interconnects. Each of the multiple interconnects is independently configured as the single interconnect described above. Thus, each of the global signal interconnects can be asserted (i.e., pulsed) by any of the multiple processor cores as described above.
  • FIG. 1 is a block diagram of an exemplary security appliance 102 that includes a network services processor 100 according to the principles of the present invention. The network services processor 100 is a multi-core processor. The security appliance 102 can be a standalone system that switches packets received at one Ethernet port (Gig E) to another Ethernet port (Gig E). Preferably, the security appliance 102 also performs one or more security functions related to the received packets prior to forwarding the packets. For example, the security appliance 102 can be used to perform security processing on packets received from a Wide Area Network (WAN) 102 prior to forwarding the processed packets to a Local Area Network (LAN) 103. Exemplary network services processors 100 adapted to perform such security processing can include hardware packet processing, buffering, work scheduling, ordering, synchronization, and coherence support to accelerate packet processing tasks according to the principles of the present invention.
  • The network services processor 100 generally processes higher layer protocols. For example, the network services processor 100 processes one ore more of the Open System Interconnection (OSI) network L2-L7 layer protocols encapsulated in received packets. As is well-known to those skilled in the art, the OSI reference model defines seven network protocol layers: Layers 1-7 (referred to herein as L1-L7). The physical layer (L1) represents an actual physical interface. Namely, the electrical and physical attributes that enable a device to be connected to a transmission medium. The data link layer (L2) performs data framing. The network layer (L3) formats the data into packets. The transport layer (L4) handles end to end transport. The session layer (L5) manages communications between devices, for example, whether communication is half-duplex or full-duplex. The presentation layer (L6) manages data formatting and presentation, for example, syntax, control codes, special graphics and character sets. The application layer (L7) permits communication between users, for example, file transfer and electronic mail.
  • To support multiple interconnects, the network services processor 100 includes a number of interfaces. For example, the network services processor 100 includes a number of Ethernet Media Access Control interfaces with standard Reduced Gigabyte Media Independent Interface (RGMII) connections to off-chip destinations using physical interfaces (PHYs) 104 a, 104 b.
  • In operation, the network services processor 100 receives packets from one or more external destinations at one ore more respective Ethernet ports (Gig E) through the physical interfaces PHY 104 a, 104 b. The network services processor 100 then selectively performs L7-L2 network protocol processing on the received packets forwarding processed packets through the physical interfaces 104 a, 104 b. The processed packets may be forwarded to another “hop” in the network, to their final destination, or through a local communications bus for further processing by a host processor. The local communications bus can be any one of a number of industry standard busses, such as a Peripheral Component Interconnect (PCI) bus 106 or a PCI Extended (PCI-X). Other PC busses include Integrated Systems Architecture (ISA), Extended ISA (EISA), Micro Channel, VL-bus, NuBus, TURBOchannel, VMEbus, MULTIBUS, STD bus, and proprietary busses. Further, the network protocol processing can include processing of network security protocols such as Firewall, Application Firewall, Virtual Private Network (VPN) including IP Security (IPSec) and/or Secure Sockets Layer (SSL), Intrusion detection System (IDS) and Anti-virus (AV).
  • A DRAM controller in the network services processor 100 controls access to an external Dynamic Random Access Memory (DRAM) 108 that is coupled to the network services processor 100. The DRAM 108 stores data packets received from the PHY interfaces 104 a, 104 b from a local communications bus, such as the PCI-X interface 106 for processing by the network services processor 100. In one embodiment, the DRAM interface supports 64 or 128 bit Double Data Rate II Synchronous Dynamic Random Access Memory (DDR II SDRAM) operating at speeds up to and including 800 MHz.
  • A boot bus 110 can be provided, such that the necessary boot code is accessible allowing the network services processor 100 to execute the boot code upon power-on and/or reset. Generally, the boot code is stored in a memory, such as a flash memory 112. Application code can also be loaded into the network services processor 100 over the boot bus 110. For example, application code can be loaded from a device 114 implementing the Compact Flash standard, or from another high-volume device, such as a disk, attached via the PCI bus.
  • A miscellaneous I/O interface 116 offers auxiliary interfaces such as General Purpose Input/Output (GPIO), Flash, IEEE 802 two-wire Management Interface (MDIO), Universal Asynchronous Receiver-Transmitters (UARTs), and serial interfaces.
  • The network services processor 100 can include another memory controller for controlling Low latency DRAM 118. The low latency DRAM 118 can be used for Internet Services and Security applications, thereby allowing fast lookups, including the string-matching that may be required for Intrusion Detection System (IDS) or Anti Virus (AV) applications.
  • FIG. 2 is a more-detailed block diagram of an exemplary network services processor 100, such as the one shown in FIG. 1. As discussed above, the network services processor 100 can be adapted to deliver high application performance by including multiple processor cores 202. Network operations can be categorized into data plane operations and control plane operations. A data plane operation includes packet operations for forwarding packets. A control plane operation includes processing of portions of complex higher level protocols such as Internet Protocol Security (IPSec), Transmission Control Protocol (TCP) and Secure Sockets Layer (SSL). Advantageously, in such a network application, selective processor cores 202 can be dedicated to performing respective data plane or control plane operations. A data plane operation can include processing of other portions of these complex higher level protocols.
  • A packet input unit 214 can be used to allocate and create a work queue entry for each packet. The work queue entry, in turn contains a pointer to a buffered packet temporarily stored in memory, such as Level-2 cache 212 or DRAM 108 (FIG. 1).
  • Packet Input/Output processing is performed by a respective interface unit 210 a, 210 b, a packet input unit (Packet Input) 214, and a packet output unit (PKO) 218. The input controller 214 and interface units 210 a, 210 b can perform parsing of received packets and checking of results to offload the processor cores 202.
  • A packet is received by any one of the interface units 210 a, 210 b (generally 210) through a predefined interface, such as a System Packet Interface SPI-4.2 (e.g., SPI-4 phase 2 standard of the Optical Internetworking Forum) or an RGMII interface. A packet can also be received by a PCI interface 224. The interface unit 210 a, 210 b handles L2 network protocol pre-processing of the received packet by checking various fields in the L2 network protocol header included in the received packet. After the interface unit 210 has performed L2 network protocol processing, the packet is forwarded to the packet input unit 214. The pre-processed packet can be forwarded over an input/output (I/O) bus, such as I/O bus 225. The packet input unit 214 can be used to perform additional pre-processing, such as pre-processing of L3 and L4 network protocol headers included in the received packet. The pre-processing can include checksum checks for Transmission Control Protocol (TCP)/User Datagram Protocol (UDP) (L3 network protocols).
  • The packet input unit 214 writes packet data into buffers in Level-2 cache 212 or DRAM 108 (FIG. 1) in a format that is convenient to higher-layer software executed in at least one processor core 202 for further processing of higher level network protocols. The packet input unit 214 can support a programmable buffer size and can distribute packet data across multiple buffers to support large packet input sizes.
  • The Packet order/work (POW) module (unit) 228 queues and schedules work (i.e., packet processing operations) for the processor cores 202. Work can be defined to be any task to be performed by a processor core 202 that is identified by an entry on a work queue. The task can include packet processing operations, for example, packet processing operations for L4-L7 layers to be performed on a received packet identified by a work queue entry on a work queue. Each separate packet processing operation is a piece of the work to be performed by a processor core 202 on the received packet stored in memory. For example, the work can be the processing of a received Firewall/Virtual Private Network (VPN) packet. The processing of a Firewall/VPN packet includes the following separate packet processing operations (i.e., pieces of work): (1) defragmentation to reorder fragments in the received packet; (2) IPSec decryption (3) IPSec encryption; and (4) Network Address Translation (NAT) or TCP sequence number adjustment prior to forwarding the packet.
  • The POW module 228 selects (i.e., schedules) work for a processor core 202 and returns a pointer to the work queue entry that describes the work to the processor core 202. Each piece of work (i.e., a packet processing operation) has an associated group identifier and a tag.
  • Prior to describing the operation of the processor cores 202 in further detail, the other modules in the processor core 202 will be described. After the packet has been processed by the processor cores 202, a packet output unit (PKO) 218 reads the packet data from Level-2 cache 212 or DRAM 108 (FIG. 1), performs L4 network protocol post-processing (e.g., generates a TCP/UDP checksum), forwards the packet through the interface unit 210 and frees the Level-2 cache 212 or DRAM 108 locations used to store the packet.
  • The network services processor 100 can also include application specific co-processors that offload the processor cores 202 so that the network services processor 100 achieves a high-throughput. The application specific co-processors can include a DFA co-processor 244 that performs Deterministic Finite Automata (DFA) and a compression/decompression co-processor 208 that performs compression and decompression. Other co-processors include a Random Number Generator (RNG) 246 and a timer unit 242. The timer unit 242 is particularly useful for TCP applications.
  • Each processor core 202 can include a dual-issue, superscalar processor with a respective instruction cache 206, a respective Level-1 data cache 204, and respective built-in hardware acceleration (e.g., a crypto acceleration module) 200 for cryptography algorithms with direct access to low latency memory over the low latency memory bus 230. The low-latency, direct-access path to low-latency memory 118 (FIG. 1) that bypasses the Level-2 cache memory 212 and can be directly accessed from both the processor cores 202 and a DFA co-processor 244.
  • The network services processor 100 also includes a memory subsystem. The memory subsystem includes the respective Level-1 data cache memory 204 of each of the processor cores 202, respective instruction cache 206 in each of the processor cores 202, a Level-2 cache memory 212, a DRAM controller 216 for external DRAM memory 108 (FIG. 1), and an interface, such as a low-latency bus 230 to external low latency memory (not shown).
  • The memory subsystem is configured to support the multiple processor cores 202 and can be tuned to deliver both the high-throughput and the low-latency required by memory-intensive, content-networking applications. Level-2 cache memory 212 and external DRAM memory 108 (FIG. 1) are shared by all of the processor cores 202 and I/O co-processor devices.
  • Each of the processor cores 202 can be coupled to the Level-2 cache by a, local bus, such as a coherent memory bus 234. Thus, the coherent memory bus 234 can represent the communication channel for memory and I/O transactions between the processor cores 202, an I/O Bridge (IOB) 232, and the Level-2 cache and controller 212.
  • A Free-Pool Allocator (FPA) 236 maintains pools of pointers to free memory in Level-2 cache memory 212 and DRAM 108. A bandwidth efficient (Last-In-First-Out (LIFO)) stack is implemented for each free pointer pool.
  • The I/O Bridge 232 manages the overall protocol and arbitration and provides coherent I/O partitioning. The I/O Bridge 232 includes a bridge 238 and a Fetch-and-Add Unit (FAU) 240. The bridge 238 includes buffer queues for storing information to be transferred between the I/O bus 225, coherent memory bus 234, the packet input unit 214 and the packet output unit 218.
  • The Fetch-and-Add Unit 240 includes a 2 kilobyte (KB) register file supporting read, write, atomic fetch-and-add, and atomic update operations. The Fetch-and-Add Unit 240 can be accessed from both the cores 202 and the packet output unit 218. The registers store highly-used values and thus reduce traffic to access these values. Registers in the Fetch-and-Add Unit 240 are used to maintain lengths of the output queues that are used for forwarding processed packets through the packet output unit 218.
  • The PCI interface controller 224 has a Direct Memory Access (DMA) engine that allows the processor cores 202 to move data asynchronously between local memory in the network services processor 100 and remote (PCI) memory (not shown) in both directions.
  • In some embodiments, a key memory (KEY) 248 is provided. The key memory 248 is a protected memory coupled to the I/O Bus 225 that can be written/read by the processor cores 202. For example, the key memory can include error checking and correction. ECC will report single and double bit errors and repair single bit errors. The memory is a single-port memory that can be provided withy write precedence. In some embodiments, the key memory 248 can be used to temporarily store Loads, Stores, and I/O pre-fetches.
  • An Miscellaneous Input/Output (MIO) unit 226 can also be coupled to the I/O bus 225 to provide interface support for one or more external devices. For example, the MIO unit 226 can support one or more interfaces to a Universal Asynchronous Receiver/Transmitter (UART), to a boot bus, to a General Purpose Input/Output (GPIO) interface for communicating with peripheral devices (not shown), and more generally to a Field-Programmable Gate Array (FPGA) for interfacing with external devices. For example, an FPGA can be used to interface to external Ternary Content-Addressable Memory (TCAM) hardware providing fast-lookup performance. In particular, the MIO 226 can provide an interface to an external debugger console described below.
  • The processor core 202 supports multiple operational modes including: user mode, kernel mode, and debug mode. User mode is most often employed when executing applications programs (e.g., the internal flow of program control). Kernel mode is typically used for handling exceptions and operating system kernel functions, including management of any related coprocessor and Input/Output (I/O) device access. Debug mode is a special operational mode typically used by software developers to examine variables and memory locations, to stop code execution at predefined break points, and to step through the code one line or unit at a time, usually while monitoring variables and memory locations. Debug mode is also different from other operational modes in that there are substantially no restrictions on access to coprocessors, memory areas. Additionally, while in Debug mode, the usual exceptions like address error and interrupt are masked.
  • A multi-core processor 100 configured for debugging parallel applications executing on more than one independent processor cores is shown in FIGS. 3A and 3B. For example, the multi-core processor 100 includes three separate global signal interconnects: MCD_0, MCD_1, and MCD_2. Each of the three global signal interconnects MCD_0, MCD_1, and MCD_2 is coupled to each of the multiple processor cores 202 a, 202 b, . . . 202 n (generally 202). Each processor core 202, in turn, includes circuitry configured to assert a global interrupt signal on one or more of the global signal interconnects. Preferably, each of the processor cores 202 is configured to independently and selectively assert (i.e., pulse) an interrupt on one or more of the global signal interconnects MCD_0, MCD_1, and MCD_2.
  • Each processor core 202 also includes sensing circuitry configured to sample each of the global signal interconnects to determine the presence of an asserted interrupt. Preferably, each of the processor cores 202 independently samples the global signal interconnects MCD_0, MCD_1, and MCD_2 to determine whether an interrupt has been asserted, and on which of the several global signal interconnects the interrupt has been asserted—interrupts can be asserted on more than one of the global signal interconnects at a time. The global signal interconnects MCD_0, MCD_1, and MCD_2 are preferably sampled continuously, or at least once during each clock cycle to determine the presence of an interrupt. The sensing circuitry can include a register into which the state of the global signal interconnect is latched. For example, a register is configured to store one bit for each of the multiple global signal interconnects, the value of the stored bit indicative of the state of the respective global signal interconnect.
  • Having more than one global signal interconnect MCD_0, MCD_1, and MCD_2 coupled to each of the processor cores 202 can provide additional information. For example, with each of the three wires capable of being independently pulsed between two states (e.g., a logical Low or “0” and a logical High or “1”) can provide information corresponding to one of up to eight different messages (i.e., 3=8).
  • Alternatively, or in addition, the global signal interconnects can be used to communicate with the processor cores 202 once interrupted. An external debug console 325 hosting a debugger application and providing a user interface can be interconnected to one or more of the processor cores 202 to facilitate debugging of the system 100. Preferably, the global signal interconnects are accessible by the debugger. For example, a debugger can assert a pulse on MCD_1 to instruct the processor cores 202 to check their mailbox location (e.g., in main memory) for an instruction from the debugger. The debugger can assert a pulse on MCD_2 to restart all processor cores 202 after a multi-core interrupt. Thus, usage of the global signal interconnects can minimize disruption of the state contained in the processor cores 202 and in the system 100, while the debugger examines it. This capability can be very useful to isolate the cause of bugs in parallel applications.
  • The processor cores 202 are each coupled to the one or more global signal MCD_0, MCD_1, and MCD_2 interconnects in a respective “wired-OR” fashion. Thus, respective interrupt-signal generators of each of the processor cores 202 are all interconnected at a first wired-OR 310 a, further connected to the first global signal interconnect MCD_0. Second and third wired-ORs 310 b, 310 c are provided to similarly interconnect the processor cores 202 to the second and third global signal interconnects MCD_1 and MCD_2, respectively.
  • Thus, should one or more of the processor cores 202 assert a pulse on any of its respective interrupt-signal generator outputs (e.g., to wired-OR 310 a), the pulse will be asserted on the respective global signal interconnect (e.g., MCD_0). Preferably, a pulse can be asserted during any cycle. Using the wired-OR 310 a, 310 b, 310 c (generally 310) provides the desired logic allowing any of the processor cores 202 to drive the global interconnect signal (i.e., the wired-OR providing an output=“1” if any of its inputs=“1”), while also minimizing any corresponding delay. Once a pulse is asserted on the global signal interconnects, all of the processor cores 202 sample it, allowing the cores 202 to be interrupted very quickly—at the same time, or at least within a few cycles of the processor clock. Such a rapid interrupt of all of the processor cores 202 preserves the entire state of the parallel application at the time of the interrupt for examination by the debugger.
  • In other embodiments, each of the processor cores 202 can be interconnected to the global signal interconnects MCD_0, MCD_1, and MCD_2 using combinational logic, such as a logical OR gate. Such logic, however, represents additional complexity generally resulting in a corresponding delay (e.g., a gate delay due to synchronous logic, and/or a rise time delay due to the capacitance of the logic circuitry).
  • Each processor core 202 provides an exception handler. Generally, an exception refers to an error or other special condition detected during normal program execution. The exception handler can interrupt the normal flow of program control in response to receiving an exception. For example, a debug exception handler halts normal operation in response to receiving a debug interrupt. The exception handler then passes control to a debug handler, or software program, that controls operation in debug mode.
  • Some exemplary exception types include a Debug Single Step (DSS) exception resulting in single step execution in debug mode. A general Debug Interrupt (DINT) results in entry of debug mode and can be caused by the assertion of an external interrupt (e.g., EJ_DINT), or by setting a related bit in a debug register. An interrupt can result from assertion of unmasked hardware or software interrupt signal. A debug hardware instruction break matched (DIB) exception results in entry of debug mode when an instruction matches a predetermined instruction breakpoint. Similarly, a debug breakpoint instruction (DBp) results in entry of debug mode upon execution of a special instruction (e.g., a software debug breakpoint instruction, such as the EJTAG “SDBBP” instruction that places a processor into debug mode and fetches associated handler code from memory). A Data Address Break (address only) or Data Value (e.g., DDBL/DDBS) results in entry of debug mode when a particular memory address is accessed, or a particular value is written to/read from memory.
  • Each of the processor cores 202 includes respective onboard debug circuitry 318. As shown in FIG. 3A, each of the multiple processor cores 202 can include a respective core Test Access Port (TAP) 320′, 320″, 320′″ (generally 320) for accessing the respective debug circuitry 318. The core TAPs 320 are connected to one system TAP 330. As shown, each of the respective core TAPs 320 and the system TAP 330 can be interconnected in a daisy chain configuration. Additionally, the debug circuitry 318 of all of the interconnected processor cores 202 can be coupled to the external debug console 325.
  • Once in debug mode, the debug control console can be used to inspect the values stored in registers and memory locations. The debug control console provides a software program that communicates with the onboard debug circuitry 318 to accomplish inspection of stored values, setting of breakpoints, stopping, restarting and sequentially stepping each of the processor cores 202 in unison.
  • Alternatively, or in addition, each of the processor cores 202 can be coupled to the external debug console 325 through one or more Universal Asynchronous Receiver-Transmitter (UART) devices that include receiving and transmitting circuits for asynchronous serial communications, as shown in FIG. 3B. In one embodiment, the multi-core processor 100 includes two UART devices 335 a, 335 b (generally 335) used to control serial data transmission and reception between the processor cores 202 and external devices, such as the external debug console 325. The UART devices 335 can be included within the Miscellaneous I/O unit (FIG. 2). Thus, each processor core 202 can communicate with another device, such as the external debug console 325, through a respective memory bus interface 340 using one or more of the UART devices 335 accessible through the I/O bridge 238. Advantageously, communicating with the external debug console 325 using the UART device 335 removes constraints that would have otherwise been imposed by using a standard interface, such as the JTAG TAP interface (FIG. 3A).
  • The multi-core processor 100 optionally includes a trace buffer 610 (shown in phantom) for selectively monitoring memory transactions of the processor cores 202. For example, the trace buffer 610 is coupled to the coherent memory bus 234 to monitor transactions thereon. Generally, the trace buffer 610 stores information that can be used to assist in any debugging activity. For example, the trace buffer 610 can be configured to store the last “N” transactions on the bus, the N+1st transaction being dumped as a new transaction occurs. Further, when using a single trace buffer 610, identification tags can be used to identify the particular core processor 202 associated with each stored transaction.
  • Beneficially, the trace buffer 610 is also coupled to each of the one or more global signal interconnects MCD_0, MCD_1, and MCD_2, and configured with sensing circuitry sampling any pulses asserted on the global signal interconnects. The trace buffer 610 also includes a trigger that initiates the starting and or stopping of monitoring in response to sampling an interrupt signal on the global signal interconnect. Although a single trace buffer 610 supporting multiple core processors 202 is illustrated, other configurations are possible. For example, multiple trace buffers 610 can be provided with each trace buffer 610 respectively corresponding to one of the multiple core processors 202. Additionally, the trace buffer 610 can be on-chip, as shown, or off-chip and accessible by a probe.
  • Alternatively or in addition, the trace buffer 610 includes circuitry configured to assert a global interrupt signal on one or more of the global signal interconnects MCD_0, MCD_1, and MCD_2. As shown, the trace buffer 610 can be coupled to the global signal interconnects MCD_0, MCD_1, and MCD_2 through the wired-OR circuits 310. In this configuration, the trace buffer 610 can selectively assert a global interrupt signal on one or more of the global signal interconnects MCD_0, MCD_1, and MCD_2, thereby interrupting more than one of the multiple processor cores 202 in response to activity on the coherent memory bus 234.
  • FIG. 4 is a more detailed block diagram of an exemplary processor core 202 shown in FIGS. 3A and 3B. In general, a processor core 202 interprets and executes instructions. In some embodiments the processor core 202 is a Reduced Instruction Set Computing (RISC) processor core 202. In more detail, the processor core 202 includes an execution unit 400, an instruction dispatch unit 402, an instruction fetch unit 404, a load/store unit 416, a Memory Management Unit 406, a system interface 408, a write buffer 420 and security accelerators 200. The processor core 202 also includes debug circuitry 318 allowing debug operations to be performed. The system interface 408 controls access to external memory, that is, memory external to the processor core 202, such as the L2 cache memory described in relation to FIG. 2.
  • Still referring to FIG. 4, the execution unit 400 includes a multiply/divide unit 412 and at least one register file 414. The multiply/divide unit 412 has a 64-bit register-direct multiply. The instruction fetch unit 404 includes Instruction Cache (ICache) 206. The load/store unit 416 includes Data Cache (DCache) 204. A portion of the data cache 204 can be reserved as local scratch pad/local memory 422. In one embodiment, the instruction cache 206 is 32 Kilobytes, the data cache 204 is 8 Kilobytes and the write buffer 420 is 2 Kilobytes. The memory management unit 406 includes a Translation Lookaside Buffer (TLB) 410.
  • In one embodiment, the processor core 202 includes a crypto acceleration module (security accelerators) 200 that includes cryptography acceleration. For example, the cryptography acceleration can include one or more of Triple Data Encryption Standard (3DES), Advanced Encryption Standard (AES), Secure Hash Algorithm (SHA-1l), and Message Digest Algorithm #5 (MD5). The crypto acceleration module 200 communicates by moves to and from the main register file 414 in the execution unit 400. Particular algorithms, such as Rivest, Shamir, Adleman (RSA) and the Diffie-Heilman (DH) can be implemented and are performed in the multiply/divide unit 412.
  • In some embodiments, the multi-core processor 100 (FIG. 2) includes a superscalar processor. A superscalar processor includes a superscalar instruction pipeline that allows more than one instruction to be completed each cycle of the processor's clock period by allowing multiple instructions to be issued simultaneously and dispatched in parallel to multiple execution units 400. The RISC-type processor core 202 has an instruction set architecture that defines instructions by which the programmer interfaces with the RISC-type processor 202. In one embodiment, the superscalar RISC-type core is an extension of the MIPS64 version 2 core. Only load-and-store instructions access external memory; that is, memory external to the processor core 202. In one embodiment, the external memory is accessed over a coherent memory bus 234 (FIG. 2). All other instructions operate on data stored in the register file 414 within the execution unit 414 of the processor core 202. In some embodiments, the superscalar processor can be a dual-issue processor.
  • The instruction pipeline is divided into stages, each stage taking one clock cycle to complete. Thus, in a five stage pipeline, it takes five clock cycles to process each instruction and five instructions can be processed concurrently with each instruction being processed by a different stage of the pipeline in any given clock cycle. Typically, a five stage pipeline includes the following stages: fetch, decode, execute, memory and write back.
  • During the fetch-stage, the instruction fetch unit 404 fetches an instruction from instruction cache 206 at a location in instruction cache 206 identified by a memory address stored in a program counter. During the decode-stage, the instruction fetched in the fetch-stage is decoded by the instruction dispatch unit 402 and the address of the next instruction to be fetched for the issuing context is computed. During the execute-stage, the execution unit 400 performs an operation dependent on the type of instruction. For example, the execution unit 400 begins the arithmetic or logical operation for a register-to-register instruction, calculates the virtual address for a load or store operation, or determines whether the branch condition is true for a branch instruction. During the memory-stage, data is aligned by the load/store unit 416 and transferred to its destination in external memory. During the write back-stage, the result of a register-to-register or load instruction is written back to the register file 414.
  • The system interface 408 is coupled via the coherent memory bus 234 (FIG. 2) to external memory. In one embodiment, the coherent memory bus 243 has 384 bits and includes four separate buses: (i) an Address/Command Bus; (ii) a Store Data Bus; (iii) a Commit/Response control bus; and (iv) a Fill Data bus. All store data is sent to external memory over the coherent memory bus 234 via a write buffer entry in the write buffer 420. In one embodiment, the write buffer 420 has 16 write buffer entries.
  • Store data flows from the load/store unit 416 to the write buffer 420, and from the write buffer 420 through the system interface 408 to external memory. The processor core 202 can generate data to be stored in external memory faster than the system interface 408 can write the store data to the external memory. The write buffer 420 minimizes pipeline stalls by providing a resource for storing data prior to forwarding the data to external memory.
  • The write buffer 420 is also used to aggregate data to be stored in external memory over a coherent memory bus 424 into aligned cache blocks to maximize the rate at which the data can be written to the external memory. Furthermore, the write buffer 420 can also merge multiple stores to the same location in external memory resulting in a single write operation to external memory. The write-merging operation of the write buffer 420 can result in the order of writes to the external memory being different than the order of execution of the store instructions.
  • The processor core 202 also includes an exception control system providing circuitry for identifying and managing exceptions. An exception refers to an interruption or change of the normal flow of program control that occurs when an event or other special condition is detected during execution. Exceptions can be caused by a variety of sources, including boundary cases in data, external events, or even program errors, being generated (i.e., “raised”) by hardware or software. Exemplary hardware exceptions include resets, interrupts and signals from a memory management unit. Hardware exceptions may be generated by an arithmetic logic unit or floating-point unit for numerical errors such as divide by zero, overflow or underflow or instruction decoding errors such as privileged, reserved, trap or undefined instructions. Software exceptions are even more varied. For example, a software exception can refer to any kind of error checking that alters the normal behavior of the program. An exception transfers control from code being executed at the instant of the exception to different code-a routine commonly referred to as an exception handler.
  • A system co-processor can also be provide within the processor core 202 for providing a diagnostic capability, for controlling the operating mode (i.e., kernel, user, and debug), for configuring interrupts as enabled or disabled, and for storing other configuration information.
  • The processor core 202 also includes a Memory Management Unit (MMU) 406 coupled to the instruction fetch unit 404 and the load/store unit 416. The MMU 406 is a hardware device or circuit that supports virtual memory and paging by translating virtual addresses into physical addresses. Thus, the MMU 406 may receive a virtual memory address from program instructions being executed on the processor core 202. The virtual memory address is associated with a read from or a write to physical memory. The MMU 406 translates the virtual address to a physical address to allow a related physical memory location to be accessed by the program.
  • In a multitasking system all processes compete for the use of memory and of the MMU 406. In some memory management architectures, however, each process is allowed to have its own area or configuration of the page table, with a mechanism to switch between different mappings on a process switch. This means that all processes can have the same virtual address space rather than require load-time relocation. To accomplish this task, the MMU 406 can include a Translation Lookaside Buffer (TLB) 410.
  • The debug circuitry 318 on each processor core 202 can include an onboard debug controller. Having an onboard debug controller facilitates operation of the processor core 202 in the debug mode. For example, the debug controller can allow for single-step execution of the processor core 202. Further, the debug controller can support breakpoints, enabling them to transition the processor core 202 into debug mode. For example, the breakpoints can be one or more of instruction breakpoints, data breakpoints, and virtual address breakpoints.
  • In some embodiments, the onboard debug circuitry 318 includes standardized features. For example, the onboard debug circuitry 318 can be compliant with the design philosophy of the Joint Test Action Group (JTAG) interface—a popular standardized interface defined by IEEE Standard 1149.1. In embodiments that utilize processor cores, the onboard controller is referred to is the standard MIPS Enhanced JTAG (EJTAG) debug circuitry 318.
  • Each processor core 202 includes one or more debug registers, each register including one or more pre-defined fields for storing information (e.g., state bits) related to different aspects of debug mode operation. The debug registers 425 can be located in the instruction fetch unit 404. For example, one of the debug registers 425 is a Debug register 500. The Debug register 500 is illustrated in more detail in FIG. 5. The Debug register 500 includes a DM state bit indicative of whether the processor core 202 is operating in debug mode. Other bits include a DBD state bit indicative of whether the last debug exception or exception in Debug Mode occurred in a branch or jump delay slot. A DDBSImpr bit is indicative of an imprecise debug data break store. A DDBLImpr bit is indicative of an imprecise debug data break load. This bit can be implemented for load value breakpoints. A DExcCode bit is set to one when Debug[DExcCode] is valid and should be interpreted.
  • Another one of the debug registers 425 is a Multi-Core Debug (MCD) register 600 is shown in FIG. 6. The MCD register 600 includes dedicated multi-core debug state positions 615, one position being provided for each of the respective global signal interconnects MCD_0, MCD_1, and MCD_2. Similarly, the MCD register 600 includes dedicated mask-disable state positions 605, one position being provided for each of the respective global signal interconnects MCD_0, MCD_1, and MCD_2. When set, the mask-disable bits (one bit for each global signal interconnect) disable the effect of sampling a pulse on the corresponding global signal interconnect.
  • The MCD register 600 also includes respective software-control bit locations 610 for each of the several global MCD wires. For the exemplary multi-core processor 100, the three software-control bit locations 610, referred to as: Pls0, Pls1, or Pls2 are reserved. These software-control bit locations 610 corresponding to the three global signal interconnects: MCD_0, MCD_1, and MCD_2, respectively. Thus, bits written by software into the software control bit locations 610 can be used to pulse any combination of the three global MCD wires.
  • In some embodiments, the debug registers 425 (FIG. 4) include a DEPC register for imprecise debug exceptions and imprecise exceptions in Debug Mode. Imprecise debug data breakpoint are provided for load value compare, otherwise debug data breakpoints are precise. The DEPC register contains an address at which execution should be resumed when returning to Non-Debug Mode.
  • Exception handlers can be entered for debug processing in a number of ways. First, software such as the processor core instruction set and/or the debugger can include a breakpoint instruction. When the breakpoint instruction is executed by the execution unit 400, it causes a specific exception. Alternatively or in addition, a set of trap instructions can be provided. When the trap instructions are executed by the execution unit 400, a specific exception will result, but only when certain register value criteria are also satisfied. Further, a pair of optional Watch registers can be programmed to cause a specific exception on a load, store, or instruction fetch access to a specific word (e.g., a 64-bit double word) in virtual memory. Still further, an optional TLB-based MMU 406 can be programmed to “trap,” or otherwise interrupt program execution on any access, or more specifically, on any store to a page of memory. These exceptions generally refer to interrupting operation on any one of the processor cores 202. To interrupt the other processor cores 202, a pulse must be asserted on one or more of the global signal interconnects MCD—0, MCD —1, MCD —2.
  • In operation, when one or more of the processor cores 202 asserts a pulse on one of the global signal interconnects MCD_0, MCD_1, and MCD_2, the corresponding signal value can be a high state, or logical one. The respective instruction fetch unit 404 of each of the interconnected processor cores 202 samples the one on the global signal interconnect. In response to sampling the one, the instruction fetch unit 404 sets an internal state bit corresponding to the sampled pulse. The internal state bit, or MCD state bit, can be dedicated multi-core debug state positions 615 in the multi-core debug register 600 (i.e., Multi-Core Debug[MCD0, MCD1, MCD2]).
  • If any of multi-core debug state bits 615 are non-zero on a given processor core 202 (and that processor core 202 is not already in debug mode), the onboard debug circuitry 318 requests a debug exception on its respective processor core 202. With all of the multiple processor cores 202 sampling the same pulse and setting their respective bits 615 at substantially the same time, all of the unmasked processor cores 202 are interrupted at substantially the same time. Preferably, this occurs during the same cycle, but it can also occur within a few clock cycles. Software can later clear Multi-Core Debug[MCD0, MCD1, MCD2] bits by overwriting them (e.g., writing a one to them). Such a provision ensures that no further debug interrupts occur after exiting the debug handler.
  • In general, interrupts can be assigned different priority values to ensure the desired results in situations in which more than one type of interrupt occurs. In particular, the MCD interrupts can occur at the same priority level as standard debug interrupts provided within the debug circuitry 318 of each of the processor cores 202. The exception location can also be the same as a debug interrupt, with the multi-core debug bits 615 being similar to the DINT bit of the debug register shown in FIG. 5.
  • The detailed behavior of the bits, however, is different. For example, the DINT bit is read-only, whereas Multi-Core Debug[MCD0, MCD 1, MCD2] bits can be written to, allowing the bits to be cleared by the debug handler. Further, the DINT is cleared when Multi-Core Debug[DExcC] is set, whereas the multi-core debug state bits 615 need not be.
  • There are at least four ways that the global signal interconnects MCD—0, MCD_1, MCD_2 can be pulsed. First, software can cause initiation of a pulse on the global MCD wires. For example, debugger software running on a processor core 202 can write one or more values (e.g., a logical “1”s) to any combination of the software-control state bits 610 of the MCD register 600. When a “1” is written into one or more of these bits 610, the processor core 202 interprets it as an instruction to assert an interrupt signal, or pulse, on the corresponding global signal interconnects.
  • The global signal interconnects can also be pulsed by execution of a special instruction. For example, execution of a software breakpoint instruction, such as the SDBBP instruction, by any one of the processor cores 202 results in that core 202 asserting a pulse on the MCD_0 global signal interconnect. Whether a pulse is actually asserted by a processor core 202 in response to the breakpoint instruction can be further controlled by a global-signal debug bit 618 in the MCD register 600. Thus, a pulse is only asserted in response to the breakpoint instruction when the MCD[GSDB] bit 618 is set.
  • Alternatively or in addition, the initiation of a pulse on the global signal interconnects can result if one or more bits within a particular register are set and a breakpoint match occurs. When these two conditions occur, the hardware (e.g., the debug circuitry 318) pulses one of the global MCD wires (e.g., the MCD_0 wire). An Instruction Breakpoint Control-n register (IBCn, “n” being a numbered reference to a particular instruction breakpoint) stores a value responsive to a match of an executed breakpoint instruction. Similarly, a Data Breakpoint Control-n (DBCn) stores a value responsive to a match of a data transaction. The registers IBCn and DBCn generally include special bits (e.g., BE, TE) that can be used to enable the respective breakpoints.
  • Table 1 below describes an exemplary embodiment in which the detailed behavior on a breakpoint match is defined based on exemplary register values.
    TABLE 1
    Breakpoint Match Behavior
    BE TE Comment
    0 0 Nothing happens on a match
    0 1 MCD0 is pulsed on a match. BS bits are also set
    in IBS/DBS. No direct local exception occurs.
    (This mode may not be used.)
    1 0 A local breakpoint exception occurs due to the
    breakpoint match, causing the Core to enter
    debug mode. MCD0 is not pulsed. BS bits are
    set in IBS/DBS. (This mode will be used when
    debugging, but not multi-Core.)
    1 1 A local breakpoint exception occurs due to the
    breakpoint match, causing the Core to enter
    debug mode. MCD0 is also pulsed. BS bits are
    also set in IBS/DBS. (This mode will be used
    when debugging multi-Core.)
  • An exemplary TAP controller 700 is shown in FIG. 7A. The TAP controller 700 includes one or more registers 705 for storing instruction, data, and control information relating to the TAP interface 320. The registers 705 allow a user to perform a set up for the onboard debug circuitry 318, and provide important status information during a debug session. The size of the registers 705 depends on the specific implementations, but usually they are at least 32 bits.
  • The registers 705 receive information from an external source using the Test Data Input (TDI) input (i.e., pin). The registers also provide information to an external source using the Test Data Output (TDO) output (i.e., pin). Operation of the interface is provided by a TAP controller state machine 710. The TAP controller 700 uses a communications channel, such as a serial communications channel that operates according to a clock signal received on the Test Clock (TCK) input (i.e., pin). Thus, movement of data into and/or out of the registers 705 operates according to the received clock signal. Similarly, operation of the state machine also relies on the received clock.
  • A more detailed interconnection of respective TAP interfaces 320 on each of the multiple processor cores 202 is shown in FIG. 7B. A JTAG interface, referred to as a Test Access Port (TAP) 320′, 320″, 320′″ (generally 320), includes at least four-signal lines: Test Clock (TCK); TMS; Test Data In (TDI); and Test Data Out (TDO). The interface can also include one or more power and ground signal lines (note shown). The JTAG interface is a serial interface that is capable of transferring data according to a clock signal received on the TCK signal line. Operating frequency varies per chip, but is typically defined by a clock signal having a frequency between about 10 MHz to about 100 MHz (i.e., from about 100 nanoseconds to about 10 nanoseconds per bit time).
  • Configuration of each of the respective debug circuitry 318 (FIGS. 3A and 3B) can be performed by manipulating an internal state machine. For example, a debug controller state machine within the debug circuitry 318 can be externally manipulated one bit at a time via the TMS signal line of the TAP 330. Data can then be transferred in and out, one bit at a time, during each TCK clock cycle. The data can be received via the TDI signal line, and transmitted out via the TDO signal line, respectively. Different instruction modes can be loaded into the debug controller 318 to read the core identification (ID), to sample input, to drive (and/or float) output, to manipulate functions, and/or to bypass (pipe TDI to TDO to logically shorten chains of multiple chips).
  • The respective TAP 320 of each of the multiple processor cores 202 a, 202 b . . . 202 n (generally 202) is coupled to the respective TAP 320 of the other multiple processor cores 202 in a serial, or “daisy chain” configuration. Thus, the TCK signal of the first TAP 320′ is serially interconnected to the corresponding TCK signal lines of all of the other TAPs 320. The interconnected TCK signal lines are further connected to a corresponding TCK signal line of a system TAP 330. Typically, the system TAP 330 is interconnected to one of the end of the interconnected processor cores 202 (i.e., processor core 202 n or processor core 202 a, as shown), that end processor core 202 referred to as a “master” processor core 202 a. For the most part, the remaining TAP signal lines are generally interconnected in a similar manner being further connected from the master processor core 202 a corresponding TAP signal lines on the system TAP 330. Interconnection of the TDI and TDO signal lines, however, is different as described in more detail below.
  • In the daisy chain configuration, the TDI signal line of the master processor core 202 a connects to the corresponding TDI signal line of the system TAP 330, the master processor core 202 a receiving data from an external source. The TDO signal line of the master processor core 202 a, however, is connected to the TDI signal line of an adjacent processor core 202 b. Additional processor cores 202 are connected in a similar manner, the TDO signal line of one processor core 202 being interconnected to the TDI signal line of its preceding processor core 202, until the TDO signal line of the last processor core 202 n in the chain is interconnected to the TDO signal line of the system TAP 330.
  • A more-detailed diagram illustrating an alternative embodiment of a processor core 202 including exemplary onboard debug circuitry is shown in FIG. 8. An execution unit 400 (e.g., a combined processor and co-processor) is coupled to a memory (e.g., cache) controller 805 through an MMU 410. The MMU 410 may include a TLB. The memory controller 805 is further coupled to a memory system interface through a bus interface unit 408. Access and control of the onboard debug features is provided through an EJTAG TAP 320. The processing unit 300 includes a number of registers 830 that support debug operation. For example, the processor core 202 includes an MCD register 835 as discussed above; a debug register 836 as also discussed above, a DEPC register 837 and a DESAVE register 838.
  • A debug control register 832 is coupled between the registers 830 of the processing unit 400, the memory controller 805, and externally via the EJTAG TAP 320. A hardware breakpoint unit 825 is also coupled between the registers 830 of the execution unit 400, the memory controller and the MMU 410. The Hardware Breakpoint Unit 825 implements memory-mapped registers that control the instruction and data hardware breakpoints. The memory-mapped region containing the hardware breakpoint registers is accessible to software only in debug mode.
  • The debug features provide compatibility with existing debuggers. The debug circuitry 318 support includes specific extensions that enable concurrent multi-Core debugging. For example, controlling logic can be used to interpret the values of the software-control bit locations 610. Upon interpreting a value indicative of a pulse, the controlling logic can write the interpreted values into the corresponding MCD—0, MCD_1, MCD_2 bit locations of the MCD register. The controlling logic can then pulse the one or more corresponding global MCD wires, according to the corresponding values 615. Once pulsed, the processor cores 202 sample the pulse. The pulse sampling can occur during the next clock cycle after the pulse was written. Once sampled, each of the processor cores 202 that is not masked, will initiate a debug exception handler routine.
  • The debug exception handler can then follow a set of predetermined rules to determine the one or more causes of a given debug exception after reading the Debug and/or Multi-Core Debug registers. For example, the debug exception handler can follow the rules listed in Table 2 below.
    TABLE 2
    Debug Exception Handler Rules
    1. Any of MCD state bit locations (Multi-Core Debug[MCD0, MCD1,
    MCD2]) could be set at any time, indicating that the
    corresponding MCD state bit is set.
    2. If Multi-Core Debug[DExcC] is set, all of Debug[DDBSImpr,
    DDBLImpr, DINT, DIB, DDBS, DDBL, DBp, DSS] will
    be clear, and Debug[DExcCode] will contain a valid
    code. (This is the case for a debug mode exception.)
    3. If none of Debug[DDBSImpr, DDBLImpr, DINT, DIB,
    DDBS, DDBL, DBp, DSS] are set, then the
    exception was either due to MCD*, or Multi-
    Core Debug[DExcC] being set and
    Debug[DExcCode] is valid.
    4. No more than one of Debug[DIB, DDBS, DDBL,
    DBp, DSS] can be set.
    5. If Multi-Core Debug[DExcC] is clear, any
    combination of Debug[DDBLImpr, DINT] may be set.
    6. At least one of Debug[DDBLImpr, DINT, DIB, DDBS,
    DDBL, DBp, DSS] and Multi-Core Debug[MCD0,
    MCD1, MCD2, DExcC] will be set.
  • While this invention has been particularly shown and described with references to preferred embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the invention encompassed by the appended claims.

Claims (23)

1. A multi-core processor comprising:
a plurality of independent processor cores, each processor core executing instructions and operating in parallel to perform work;
each of the plurality of independent processor cores respectively including:
an interrupt-signal sensor; and
an interrupt-signal generator selectively providing an interrupt signal; and
a global interrupt-signal interconnect in electrical communication with each of the plurality of independent processor cores, more than one of the processor cores respectively interrupting its execution of instructions substantially simultaneously responsive to sampling with respective interrupt-signal sensors an interrupt signal on the global interrupt-signal interconnect.
2. The multi-core processor of claim 1, wherein the respective interrupt-signal generator of each of the plurality of independent processor cores is coupled to the global interrupt-signal interconnect.
3. The multi-core processor of claim 2, wherein the respective interrupt-signal generator of each of the plurality of independent processor cores is coupled to the global interrupt-signal interconnect in a wired-OR configuration.
4. The multi-core processor of claim 1, wherein an interrupt signal is provided in response to a write to a register in one of the plurality of independent processor cores.
5. The multi-core processor of claim 1, wherein an interrupt signal is provided in response to execution of a debug breakpoint instruction in one of the plurality of independent processor cores.
6. The multi-core processor of claim 1, wherein an interrupt signal is provided in response to detection of an instruction or data breakpoint match in one of the plurality of independent processor cores.
7. The multi-core processor of claim 1, wherein the global interrupt-signal interconnect comprises a plurality of independent global interrupt-signal interconnects, each of the independent global interrupt-signal interconnects representing a respective interrupt signal.
8. The multi-core processor of claim 1, further comprising a trace buffer coupled to the global interrupt-signal interconnect, the trace buffer being configured to monitor memory transactions of the independent processor cores in response to an interrupt signal on the global interrupt-signal interconnect.
9. The multi-core processor of claim 1, wherein each of the plurality of independent processor cores comprises a respective register storing information, the register configurable according to the sampled interrupt signal.
10. The multi-core processor of claim 1, further comprising a core-processor clock signal for coordinating execution of the instructions, wherein the interrupt-signal sensor samples the global interrupt-signal interconnect during each cycle of the core-processor clock signal.
11. The multi-core processor of claim 10, wherein each processor core respectively interrupts its execution of instructions within three core-processor clock cycles of sampling an interrupt signal on the global interrupt-signal interconnect.
12. The multi-core processor of claim 1, wherein the global interrupt-signal interconnect is used to communicate after the plurality of processor cores are interrupted.
13. A method of debugging a multi-core processor comprising the steps of:
selectively providing an interrupt signal on a global interrupt-signal interconnect, the global interrupt-signal interconnect coupled to each of a plurality of processor cores comprising the multi-core processor;
sampling the provided interrupt signal at each of the plurality of processor cores; and
interrupting execution of more than one of the plurality of processor cores substantially simultaneously responsive to the sensed interrupt signal.
14. The method of claim 13, wherein the interrupt signal is selectively provided by one of the plurality of processor cores.
15. The method of claim 14, wherein the interrupt signal is provided in response to software control.
16. The method of claim 15, wherein the software control comprises software writing a value to a register.
17. The method of claim 14, wherein the interrupt signal is provided in response to execution of a debug breakpoint instruction.
18. The method of claim 14, wherein the interrupt signal is provided in response to a breakpoint match.
19. The method of claim 13, further comprising entering a debug handler routine at each of the interrupted processor cores.
20. The method of claim 19, wherein each of the interrupted processor cores communicates with an external device responsive to entering the debug handler routine.
21. The method of claim 20, wherein each of the plurality of processor cores communicates with the external device using a Joint Test Action Group (JTAG) test access port.
22. The method of claim 20, further comprising using the global interrupt-signal interconnect to communicate after the plurality of processor cores are interrupted.
23. A multi-core processor comprising:
means for selectively providing an interrupt signal on a global interrupt-signal interconnect, the global interrupt-signal interconnect coupled to each of a plurality of processor cores comprising the multi-core processor;
means for sensing the provided interrupt signal at each of the plurality of processor cores; and
means for interrupting execution of more than one of the plurality of processors substantially simultaneously responsive to a sensed interrupt signal.
US11/042,476 2004-09-10 2005-01-25 Multi-core debugger Abandoned US20060059286A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US60921104P true 2004-09-10 2004-09-10
US11/042,476 US20060059286A1 (en) 2004-09-10 2005-01-25 Multi-core debugger

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US11/042,476 US20060059286A1 (en) 2004-09-10 2005-01-25 Multi-core debugger

Publications (1)

Publication Number Publication Date
US20060059286A1 true US20060059286A1 (en) 2006-03-16

Family

ID=38731731

Family Applications (4)

Application Number Title Priority Date Filing Date
US11/015,343 Active US7941585B2 (en) 2004-09-10 2004-12-17 Local scratchpad and data caching system
US11/030,010 Abandoned US20060059316A1 (en) 2004-09-10 2005-01-05 Method and apparatus for managing write back cache
US11/042,476 Abandoned US20060059286A1 (en) 2004-09-10 2005-01-25 Multi-core debugger
US14/159,210 Active US9141548B2 (en) 2004-09-10 2014-01-20 Method and apparatus for managing write back cache

Family Applications Before (2)

Application Number Title Priority Date Filing Date
US11/015,343 Active US7941585B2 (en) 2004-09-10 2004-12-17 Local scratchpad and data caching system
US11/030,010 Abandoned US20060059316A1 (en) 2004-09-10 2005-01-05 Method and apparatus for managing write back cache

Family Applications After (1)

Application Number Title Priority Date Filing Date
US14/159,210 Active US9141548B2 (en) 2004-09-10 2014-01-20 Method and apparatus for managing write back cache

Country Status (2)

Country Link
US (4) US7941585B2 (en)
CN (5) CN101036117B (en)

Cited By (115)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060156099A1 (en) * 2004-12-16 2006-07-13 Sweet James D Method and system of using a single EJTAG interface for multiple tap controllers
US20060253660A1 (en) * 2005-03-30 2006-11-09 Intel Corporation Method and apparatus to provide dynamic hardware signal allocation in a processor
US20080184150A1 (en) * 2007-01-31 2008-07-31 Marc Minato Electronic circuit design analysis tool for multi-processor environments
US20080239956A1 (en) * 2007-03-30 2008-10-02 Packeteer, Inc. Data and Control Plane Architecture for Network Application Traffic Management Device
US20080316922A1 (en) * 2007-06-21 2008-12-25 Packeteer, Inc. Data and Control Plane Architecture Including Server-Side Triggered Flow Policy Mechanism
WO2009008007A3 (en) * 2007-07-09 2009-03-05 Hewlett Packard Development Co Data packet processing method for a multi core processor
US20090083517A1 (en) * 2007-09-25 2009-03-26 Packeteer, Inc. Lockless Processing of Command Operations in Multiprocessor Systems
US20090150695A1 (en) * 2007-12-10 2009-06-11 Justin Song Predicting future power level states for processor cores
US20090150696A1 (en) * 2007-12-10 2009-06-11 Justin Song Transitioning a processor package to a low power state
US7813277B2 (en) 2007-06-29 2010-10-12 Packeteer, Inc. Lockless bandwidth management for multiprocessor networking devices
US7840000B1 (en) * 2005-07-25 2010-11-23 Rockwell Collins, Inc. High performance programmable cryptography system
US20100332909A1 (en) * 2009-06-30 2010-12-30 Texas Instruments Incorporated Circuits, systems, apparatus and processes for monitoring activity in multi-processing systems
US20110153724A1 (en) * 2009-12-23 2011-06-23 Murali Raja Systems and methods for object rate limiting in multi-core system
US20110161630A1 (en) * 2009-12-28 2011-06-30 Raasch Steven E General purpose hardware to replace faulty core components that may also provide additional processor functionality
US20110214023A1 (en) * 2010-02-26 2011-09-01 UltraSoC Technologies Limited Method of Debugging Multiple Processes
US20110225456A1 (en) * 2010-03-10 2011-09-15 Texas Instruments Incorporated Commanded jtag test access port operations
US20110307741A1 (en) * 2010-06-15 2011-12-15 National Chung Cheng University Non-intrusive debugging framework for parallel software based on super multi-core framework
US8111707B2 (en) 2007-12-20 2012-02-07 Packeteer, Inc. Compression mechanisms for control plane—data plane processing architectures
WO2012087894A2 (en) * 2010-12-22 2012-06-28 Intel Corporation Debugging complex multi-core and multi-socket systems
US20120216017A1 (en) * 2009-11-16 2012-08-23 Fujitsu Limited Parallel computing apparatus and parallel computing method
US20130254605A1 (en) * 2006-10-20 2013-09-26 Texas Instruments Incorporated High speed double data rate jtag interface
US20140019644A1 (en) * 2012-07-10 2014-01-16 International Business Machines Corporation Controlling A Plurality Of Serial Peripheral Interface ('SPI') Peripherals Using A Single Chip Select
US8683240B2 (en) 2011-06-27 2014-03-25 Intel Corporation Increasing power efficiency of turbo mode operation in a processor
US8688883B2 (en) 2011-09-08 2014-04-01 Intel Corporation Increasing turbo mode residency of a processor
US20140157051A1 (en) * 2011-05-16 2014-06-05 Zongyou Shao Method and device for debugging a mips-structure cpu with southbridge and northbridge chipsets
US8769316B2 (en) 2011-09-06 2014-07-01 Intel Corporation Dynamically allocating a power budget over multiple domains of a processor
US8799687B2 (en) 2005-12-30 2014-08-05 Intel Corporation Method, apparatus, and system for energy efficiency and energy conservation including optimizing C-state selection under variable wakeup rates
US8832478B2 (en) 2011-10-27 2014-09-09 Intel Corporation Enabling a non-core domain to control memory bandwidth in a processor
US8914650B2 (en) 2011-09-28 2014-12-16 Intel Corporation Dynamically adjusting power of non-core processor circuitry including buffer circuitry
US20150026441A1 (en) * 2005-05-16 2015-01-22 Texas Instruments Incorporated Method and system of inserting marking values used to correlate trace data as between processor cores
US8943334B2 (en) 2010-09-23 2015-01-27 Intel Corporation Providing per core voltage and frequency control
US8943340B2 (en) 2011-10-31 2015-01-27 Intel Corporation Controlling a turbo mode frequency of a processor
US8954770B2 (en) 2011-09-28 2015-02-10 Intel Corporation Controlling temperature of multiple domains of a multi-domain processor using a cross domain margin
US8972763B2 (en) 2011-12-05 2015-03-03 Intel Corporation Method, apparatus, and system for energy efficiency and energy conservation including determining an optimal power state of the apparatus based on residency time of non-core domains in a power saving state
US8984313B2 (en) 2012-08-31 2015-03-17 Intel Corporation Configuring power management functionality in a processor including a plurality of cores by utilizing a register to store a power domain indicator
US9026815B2 (en) 2011-10-27 2015-05-05 Intel Corporation Controlling operating frequency of a core domain via a non-core domain of a multi-domain processor
US9052901B2 (en) 2011-12-14 2015-06-09 Intel Corporation Method, apparatus, and system for energy efficiency and energy conservation including configurable maximum processor current
US9063727B2 (en) 2012-08-31 2015-06-23 Intel Corporation Performing cross-domain thermal control in a processor
US9069555B2 (en) 2011-03-21 2015-06-30 Intel Corporation Managing power consumption in a multi-core processor
US9075556B2 (en) 2012-12-21 2015-07-07 Intel Corporation Controlling configurable peak performance limits of a processor
US9074947B2 (en) 2011-09-28 2015-07-07 Intel Corporation Estimating temperature of a processor core in a low power state without thermal sensor information
US9081577B2 (en) 2012-12-28 2015-07-14 Intel Corporation Independent control of processor core retention states
US9098261B2 (en) 2011-12-15 2015-08-04 Intel Corporation User level control of power management policies
WO2015134103A1 (en) * 2014-03-07 2015-09-11 Cavium, Inc. Method and system for ordering i/o access in a multi-node environment
US9158693B2 (en) 2011-10-31 2015-10-13 Intel Corporation Dynamically controlling cache size to maximize energy efficiency
US9164565B2 (en) 2012-12-28 2015-10-20 Intel Corporation Apparatus and method to manage energy usage of a processor
US9176875B2 (en) 2012-12-14 2015-11-03 Intel Corporation Power gating a portion of a cache memory
US20150346935A1 (en) * 2010-03-09 2015-12-03 Avistar Communications Corporation Scalable high-performance interactive real-time media architectures for virtual desktop environments
US9235252B2 (en) 2012-12-21 2016-01-12 Intel Corporation Dynamic balancing of power across a plurality of processor domains according to power policy control bias
US9239611B2 (en) 2011-12-05 2016-01-19 Intel Corporation Method, apparatus, and system for energy efficiency and energy conservation including balancing power among multi-frequency domains of a processor based on efficiency rating scheme
US9292468B2 (en) 2012-12-17 2016-03-22 Intel Corporation Performing frequency coordination in a multiprocessor system based on response timing optimization
US9323316B2 (en) 2012-03-13 2016-04-26 Intel Corporation Dynamically controlling interconnect frequency in a processor
US9323525B2 (en) 2014-02-26 2016-04-26 Intel Corporation Monitoring vector lane duty cycle for dynamic optimization
US9335803B2 (en) 2013-02-15 2016-05-10 Intel Corporation Calculating a dynamically changeable maximum operating voltage value for a processor based on a different polynomial equation using a set of coefficient values and a number of current active cores
US9335804B2 (en) 2012-09-17 2016-05-10 Intel Corporation Distributing power to heterogeneous compute elements of a processor
US9348401B2 (en) 2013-06-25 2016-05-24 Intel Corporation Mapping a performance request to an operating frequency in a processor
US9348407B2 (en) 2013-06-27 2016-05-24 Intel Corporation Method and apparatus for atomic frequency and voltage changes
US9354689B2 (en) 2012-03-13 2016-05-31 Intel Corporation Providing energy efficient turbo operation of a processor
US9367114B2 (en) 2013-03-11 2016-06-14 Intel Corporation Controlling operating voltage of a processor
US9372524B2 (en) 2011-12-15 2016-06-21 Intel Corporation Dynamically modifying a power/performance tradeoff based on processor utilization
US9372800B2 (en) 2014-03-07 2016-06-21 Cavium, Inc. Inter-chip interconnect protocol for a multi-chip system
US9377836B2 (en) 2013-07-26 2016-06-28 Intel Corporation Restricting clock signal delivery based on activity in a processor
US9377841B2 (en) 2013-05-08 2016-06-28 Intel Corporation Adaptively limiting a maximum operating frequency in a multicore processor
US9395784B2 (en) 2013-04-25 2016-07-19 Intel Corporation Independently controlling frequency of plurality of power domains in a processor system
US9405345B2 (en) 2013-09-27 2016-08-02 Intel Corporation Constraining processor operation based on power envelope information
US9405351B2 (en) 2012-12-17 2016-08-02 Intel Corporation Performing frequency coordination in a multiprocessor system
US9411644B2 (en) 2014-03-07 2016-08-09 Cavium, Inc. Method and system for work scheduling in a multi-chip system
US9423858B2 (en) 2012-09-27 2016-08-23 Intel Corporation Sharing power between domains in a processor package using encoded power consumption information from a second domain to calculate an available power budget for a first domain
US9436245B2 (en) 2012-03-13 2016-09-06 Intel Corporation Dynamically computing an electrical design point (EDP) for a multicore processor
US9459689B2 (en) 2013-12-23 2016-10-04 Intel Corporation Dyanamically adapting a voltage of a clock generation circuit
US9471088B2 (en) 2013-06-25 2016-10-18 Intel Corporation Restricting clock signal delivery in a processor
US9494998B2 (en) 2013-12-17 2016-11-15 Intel Corporation Rescheduling workloads to enforce and maintain a duty cycle
US9495001B2 (en) 2013-08-21 2016-11-15 Intel Corporation Forcing core low power states in a processor
US9513689B2 (en) 2014-06-30 2016-12-06 Intel Corporation Controlling processor performance scaling based on context
US9529532B2 (en) 2014-03-07 2016-12-27 Cavium, Inc. Method and apparatus for memory allocation in a multi-node system
US9547027B2 (en) 2012-03-30 2017-01-17 Intel Corporation Dynamically measuring power consumption in a processor
US9575537B2 (en) 2014-07-25 2017-02-21 Intel Corporation Adaptive algorithm for thermal throttling of multi-core processors with non-homogeneous performance states
US9575543B2 (en) 2012-11-27 2017-02-21 Intel Corporation Providing an inter-arrival access timer in a processor
US9594560B2 (en) 2013-09-27 2017-03-14 Intel Corporation Estimating scalability value for a specific domain of a multicore processor based on active state residency of the domain, stall duration of the domain, memory bandwidth of the domain, and a plurality of coefficients based on a workload to execute on the domain
US9606888B1 (en) * 2013-01-04 2017-03-28 Marvell International Ltd. Hierarchical multi-core debugger interface
US9606602B2 (en) 2014-06-30 2017-03-28 Intel Corporation Method and apparatus to prevent voltage droop in a computer
US9639134B2 (en) 2015-02-05 2017-05-02 Intel Corporation Method and apparatus to provide telemetry data to a power controller of a processor
US9665153B2 (en) 2014-03-21 2017-05-30 Intel Corporation Selecting a low power state based on cache flush latency determination
US9671853B2 (en) 2014-09-12 2017-06-06 Intel Corporation Processor operating by selecting smaller of requested frequency and an energy performance gain (EPG) frequency
US9684360B2 (en) 2014-10-30 2017-06-20 Intel Corporation Dynamically controlling power management of an on-die memory of a processor
US9684517B2 (en) 2013-10-31 2017-06-20 Lenovo Enterprise Solutions (Singapore) Pte. Ltd. System monitoring and debugging in a multi-core processor system
US9703358B2 (en) 2014-11-24 2017-07-11 Intel Corporation Controlling turbo mode frequency operation in a processor
US9710054B2 (en) 2015-02-28 2017-07-18 Intel Corporation Programmable power management agent
US9710041B2 (en) 2015-07-29 2017-07-18 Intel Corporation Masking a power state of a core of a processor
US9710043B2 (en) 2014-11-26 2017-07-18 Intel Corporation Controlling a guaranteed frequency of a processor
US9760158B2 (en) 2014-06-06 2017-09-12 Intel Corporation Forcing a processor into a low power state
US9760136B2 (en) 2014-08-15 2017-09-12 Intel Corporation Controlling temperature of a system memory
US9760160B2 (en) 2015-05-27 2017-09-12 Intel Corporation Controlling performance states of processing engines of a processor
US9823719B2 (en) 2013-05-31 2017-11-21 Intel Corporation Controlling power delivery to a processor via a bypass
US9842082B2 (en) 2015-02-27 2017-12-12 Intel Corporation Dynamically updating logical identifiers of cores of a processor
US9847927B2 (en) 2014-12-26 2017-12-19 Pfu Limited Information processing device, method, and medium
US9874922B2 (en) 2015-02-17 2018-01-23 Intel Corporation Performing dynamic power control of platform devices
US9910481B2 (en) 2015-02-13 2018-03-06 Intel Corporation Performing power management in a multicore processor
US9910470B2 (en) 2015-12-16 2018-03-06 Intel Corporation Controlling telemetry data communication in a processor
US9977477B2 (en) 2014-09-26 2018-05-22 Intel Corporation Adapting operating parameters of an input/output (IO) interface circuit of a processor
US9983644B2 (en) 2015-11-10 2018-05-29 Intel Corporation Dynamically updating at least one power management operational parameter pertaining to a turbo mode of a processor for increased performance
EP3333697A1 (en) * 2016-12-12 2018-06-13 INTEL Corporation Communicating signals between divided and undivided clock domains
US10001822B2 (en) 2015-09-22 2018-06-19 Intel Corporation Integrating a power arbiter in a processor
US10048744B2 (en) 2014-11-26 2018-08-14 Intel Corporation Apparatus and method for thermal management in a multi-chip package
EP3343377A4 (en) * 2015-09-25 2018-09-12 Huawei Technologies Co., Ltd. Debugging method, multi-core processor, and debugging equipment
US10108454B2 (en) 2014-03-21 2018-10-23 Intel Corporation Managing dynamic capacitance using code scheduling
US10146286B2 (en) 2016-01-14 2018-12-04 Intel Corporation Dynamically updating a power management policy of a processor
US10168758B2 (en) 2016-09-29 2019-01-01 Intel Corporation Techniques to enable communication between a processor and voltage regulator
US10185566B2 (en) 2012-04-27 2019-01-22 Intel Corporation Migrating tasks between asymmetric computing elements of a multi-core processor
US10234930B2 (en) 2015-02-13 2019-03-19 Intel Corporation Performing power management in a multicore processor
US10234920B2 (en) 2016-08-31 2019-03-19 Intel Corporation Controlling current consumption of a processor based at least in part on platform capacitance
US10281975B2 (en) 2016-06-23 2019-05-07 Intel Corporation Processor having accelerated user responsiveness in constrained environment
US10289188B2 (en) 2016-06-21 2019-05-14 Intel Corporation Processor having concurrent core and fabric exit from a low power state
US10324519B2 (en) 2016-06-23 2019-06-18 Intel Corporation Controlling forced idle state operation in a processor
US10339023B2 (en) 2014-09-25 2019-07-02 Intel Corporation Cache-aware adaptive thread scheduling and migration

Families Citing this family (50)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7941585B2 (en) * 2004-09-10 2011-05-10 Cavium Networks, Inc. Local scratchpad and data caching system
DK1794979T3 (en) * 2004-09-10 2017-07-24 Cavium Inc Selective copying data structure
US7594081B2 (en) * 2004-09-10 2009-09-22 Cavium Networks, Inc. Direct access to low-latency memory
WO2006079139A1 (en) * 2004-10-12 2006-08-03 Canon Kabushiki Kaisha Concurrent ipsec processing system and method
US20080282034A1 (en) * 2005-09-19 2008-11-13 Via Technologies, Inc. Memory Subsystem having a Multipurpose Cache for a Stream Graphics Multiprocessor
US20070067567A1 (en) * 2005-09-19 2007-03-22 Via Technologies, Inc. Merging entries in processor caches
US8079084B1 (en) 2007-08-10 2011-12-13 Fortinet, Inc. Virus co-processor instructions and methods for using such
US8375449B1 (en) 2007-08-10 2013-02-12 Fortinet, Inc. Circuits and methods for operating a virus co-processor
US8286246B2 (en) * 2007-08-10 2012-10-09 Fortinet, Inc. Circuits and methods for efficient data transfer in a virus co-processing system
US7836283B2 (en) * 2007-08-31 2010-11-16 Freescale Semiconductor, Inc. Data acquisition messaging using special purpose registers
US20090106501A1 (en) * 2007-10-17 2009-04-23 Broadcom Corporation Data cache management mechanism for packet forwarding
CN101272334B (en) 2008-03-19 2010-11-10 杭州华三通信技术有限公司 Method, device and equipment for processing QoS service by multi-core CPU
CN101282303B (en) 2008-05-19 2010-09-22 杭州华三通信技术有限公司 Method and apparatus for processing service packet
JP5202130B2 (en) * 2008-06-24 2013-06-05 株式会社東芝 Cache memory, a computer system, and a memory access method
CN101299194B (en) 2008-06-26 2010-04-07 上海交通大学 Heterogeneous multi-core system thread-level dynamic dispatching method based on configurable processor
US8041899B2 (en) * 2008-07-29 2011-10-18 Freescale Semiconductor, Inc. System and method for fetching information to a cache module using a write back allocate algorithm
US8996812B2 (en) * 2009-06-19 2015-03-31 International Business Machines Corporation Write-back coherency data cache for resolving read/write conflicts
US8595425B2 (en) * 2009-09-25 2013-11-26 Nvidia Corporation Configurable cache for multiple clients
DK2507951T5 (en) * 2009-12-04 2013-12-02 Napatech As Apparatus and method for receiving and storing data packets controlled by a central controller
US8850404B2 (en) * 2009-12-23 2014-09-30 Intel Corporation Relational modeling for performance analysis of multi-core processors using virtual tasks
CN102141905B (en) 2010-01-29 2015-02-25 上海芯豪微电子有限公司 Processor system structure
CN101840328B (en) * 2010-04-15 2014-05-07 华为技术有限公司 Data processing method, system and related equipment
US8683128B2 (en) 2010-05-07 2014-03-25 International Business Machines Corporation Memory bus write prioritization
US8838901B2 (en) 2010-05-07 2014-09-16 International Business Machines Corporation Coordinated writeback of dirty cachelines
CN102279802A (en) * 2010-06-13 2011-12-14 中兴通讯股份有限公司 A method and apparatus for a read operation to improve the efficiency of synchronous dynamic random access memory controller
CN102346661A (en) * 2010-07-30 2012-02-08 国际商业机器公司 Method and system for state maintenance of request queue of hardware accelerator
US8661227B2 (en) * 2010-09-17 2014-02-25 International Business Machines Corporation Multi-level register file supporting multiple threads
CN102149207B (en) * 2011-04-02 2013-06-19 天津大学 Access point (AP) scheduling method for improving short-term fairness of transmission control protocol (TCP) in wireless local area network (WLAN)
US20120297147A1 (en) * 2011-05-20 2012-11-22 Nokia Corporation Caching Operations for a Non-Volatile Memory Array
US9936209B2 (en) * 2011-08-11 2018-04-03 The Quantum Group, Inc. System and method for slice processing computer-related tasks
US8898244B2 (en) * 2011-10-20 2014-11-25 Allen Miglore System and method for transporting files between networked or connected systems and devices
US8473658B2 (en) 2011-10-25 2013-06-25 Cavium, Inc. Input output bridging
US8850125B2 (en) 2011-10-25 2014-09-30 Cavium, Inc. System and method to provide non-coherent access to a coherent memory system
US8560757B2 (en) * 2011-10-25 2013-10-15 Cavium, Inc. System and method to reduce memory access latencies using selective replication across multiple memory ports
US9330002B2 (en) * 2011-10-31 2016-05-03 Cavium, Inc. Multi-core interconnect in a network processor
FR2982683B1 (en) * 2011-11-10 2014-01-03 Sagem Defense Securite Method sequencing on a multicore processor.
US9424327B2 (en) 2011-12-23 2016-08-23 Intel Corporation Instruction execution that broadcasts and masks data values at different levels of granularity
WO2013095607A1 (en) * 2011-12-23 2013-06-27 Intel Corporation Instruction execution unit that broadcasts data values at different levels of granularity
US8625422B1 (en) * 2012-12-20 2014-01-07 Unbound Networks Parallel processing using multi-core processor
US9274826B2 (en) * 2012-12-28 2016-03-01 Futurewei Technologies, Inc. Methods for task scheduling through locking and unlocking an ingress queue and a task queue
US9811467B2 (en) * 2014-02-03 2017-11-07 Cavium, Inc. Method and an apparatus for pre-fetching and processing work for procesor cores in a network processor
US9431105B2 (en) 2014-02-26 2016-08-30 Cavium, Inc. Method and apparatus for memory access management
US10002326B2 (en) * 2014-04-14 2018-06-19 Cavium, Inc. Compilation of finite automata based on memory hierarchy
US10110558B2 (en) 2014-04-14 2018-10-23 Cavium, Inc. Processing of finite automata based on memory hierarchy
US9443553B2 (en) 2014-04-28 2016-09-13 Seagate Technology Llc Storage system with multiple media scratch pads
US8947817B1 (en) 2014-04-28 2015-02-03 Seagate Technology Llc Storage system with media scratch pad
US9971686B2 (en) * 2015-02-23 2018-05-15 Intel Corporation Vector cache line write back processors, methods, systems, and instructions
GB2540948A (en) * 2015-07-31 2017-02-08 Advanced Risc Mach Ltd Apparatus with reduced hardware register set
CN105072050A (en) * 2015-08-26 2015-11-18 联想(北京)有限公司 Data transmission method and data transmission device
US10303372B2 (en) 2015-12-01 2019-05-28 Samsung Electronics Co., Ltd. Nonvolatile memory device and operation method thereof

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5193187A (en) * 1989-12-29 1993-03-09 Supercomputer Systems Limited Partnership Fast interrupt mechanism for interrupting processors in parallel in a multiprocessor system wherein processors are assigned process ID numbers
US5613128A (en) * 1990-12-21 1997-03-18 Intel Corporation Programmable multi-processor interrupt controller system with a processor integrated local interrupt controller
US5778236A (en) * 1996-05-17 1998-07-07 Advanced Micro Devices, Inc. Multiprocessing interrupt controller on I/O bus
US5848164A (en) * 1996-04-30 1998-12-08 The Board Of Trustees Of The Leland Stanford Junior University System and method for effects processing on audio subband data
US6115763A (en) * 1998-03-05 2000-09-05 International Business Machines Corporation Multi-core chip providing external core access with regular operation function interface and predetermined service operation services interface comprising core interface units and masters interface unit
US20020029358A1 (en) * 2000-05-31 2002-03-07 Pawlowski Chester W. Method and apparatus for delivering error interrupts to a processor of a modular, multiprocessor system
US6496880B1 (en) * 1999-08-26 2002-12-17 Agere Systems Inc. Shared I/O ports for multi-core designs
US6539522B1 (en) * 2000-01-31 2003-03-25 International Business Machines Corporation Method of developing re-usable software for efficient verification of system-on-chip integrated circuit designs
US6598178B1 (en) * 1999-06-01 2003-07-22 Agere Systems Inc. Peripheral breakpoint signaler
US6675284B1 (en) * 1998-08-21 2004-01-06 Stmicroelectronics Limited Integrated circuit with multiple processing cores
US6718294B1 (en) * 2000-05-16 2004-04-06 Mindspeed Technologies, Inc. System and method for synchronized control of system simulators with multiple processor cores
US20040264077A1 (en) * 2002-10-02 2004-12-30 Dejan Radosavljevic Protective device with end of life indicator

Family Cites Families (99)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4415970A (en) * 1980-11-14 1983-11-15 Sperry Corporation Cache/disk subsystem with load equalization
JPS6232518B2 (en) * 1982-10-15 1987-07-15 Hitachi Ltd
US4755930A (en) * 1985-06-27 1988-07-05 Encore Computer Corporation Hierarchical cache memory system and method
US5091846A (en) * 1986-10-03 1992-02-25 Intergraph Corporation Cache providing caching/non-caching write-through and copyback modes for virtual addresses and including bus snooping to maintain coherency
US5155831A (en) * 1989-04-24 1992-10-13 International Business Machines Corporation Data processing system with fast queue store interposed between store-through caches and a main memory
US5119485A (en) * 1989-05-15 1992-06-02 Motorola, Inc. Method for data bus snooping in a data processing system by selective concurrent read and invalidate cache operation
US5471593A (en) 1989-12-11 1995-11-28 Branigin; Michael H. Computer processor with an efficient means of executing many instructions simultaneously
US5404483A (en) * 1990-06-29 1995-04-04 Digital Equipment Corporation Processor and method for delaying the processing of cache coherency transactions during outstanding cache fills
US5347648A (en) * 1990-06-29 1994-09-13 Digital Equipment Corporation Ensuring write ordering under writeback cache error conditions
US5432918A (en) * 1990-06-29 1995-07-11 Digital Equipment Corporation Method and apparatus for ordering read and write operations using conflict bits in a write queue
US5404482A (en) * 1990-06-29 1995-04-04 Digital Equipment Corporation Processor and method for preventing access to a locked memory block by recording a lock in a content addressable memory with outstanding cache fills
US5276852A (en) * 1990-10-01 1994-01-04 Digital Equipment Corporation Method and apparatus for controlling a processor bus used by multiple processor components during writeback cache transactions
US5265233A (en) 1991-05-17 1993-11-23 Sun Microsystems, Inc. Method and apparatus for providing total and partial store ordering for a memory in multi-processor system
US6446164B1 (en) * 1991-06-27 2002-09-03 Integrated Device Technology, Inc. Test mode accessing of an internal cache memory
US5408644A (en) * 1992-06-05 1995-04-18 Compaq Computer Corporation Method and apparatus for improving the performance of partial stripe operations in a disk array subsystem
US5590368A (en) * 1993-03-31 1996-12-31 Intel Corporation Method and apparatus for dynamically expanding the pipeline of a microprocessor
US5623633A (en) * 1993-07-27 1997-04-22 Dell Usa, L.P. Cache-based computer system employing a snoop control circuit with write-back suppression
US5551006A (en) * 1993-09-30 1996-08-27 Intel Corporation Low cost writethrough cache coherency apparatus and method for computer systems without a cache supporting bus
US5509129A (en) * 1993-11-30 1996-04-16 Guttag; Karl M. Long instruction word controlling plural independent processor operations
US5623627A (en) * 1993-12-09 1997-04-22 Advanced Micro Devices, Inc. Computer memory architecture including a replacement cache
US5754819A (en) * 1994-07-28 1998-05-19 Sun Microsystems, Inc. Low-latency memory indexing method and structure
GB2292822A (en) * 1994-08-31 1996-03-06 Hewlett Packard Co Partitioned cache memory
US5619680A (en) * 1994-11-25 1997-04-08 Berkovich; Semyon Methods and apparatus for concurrent execution of serial computing instructions using combinatorial architecture for program partitioning
JPH08278916A (en) * 1994-11-30 1996-10-22 Hitachi Ltd Multichannel memory system, transfer information synchronizing method, and signal transfer circuit
JP3872118B2 (en) * 1995-03-20 2007-01-24 富士通株式会社 Cache coherence apparatus
US5737547A (en) * 1995-06-07 1998-04-07 Microunity Systems Engineering, Inc. System for placing entries of an outstanding processor request into a free pool after the request is accepted by a corresponding peripheral device
US5742840A (en) * 1995-08-16 1998-04-21 Microunity Systems Engineering, Inc. General purpose, multiple precision parallel operation, programmable media processor
US6598136B1 (en) * 1995-10-06 2003-07-22 National Semiconductor Corporation Data transfer with highly granular cacheability control between memory and a scratchpad area
DE69734399D1 (en) * 1996-01-24 2006-03-02 Sun Microsystems Inc Method and apparatus for stack caching
US6021473A (en) * 1996-08-27 2000-02-01 Vlsi Technology, Inc. Method and apparatus for maintaining coherency for data transaction of CPU and bus device utilizing selective flushing mechanism
US5897656A (en) * 1996-09-16 1999-04-27 Corollary, Inc. System and method for maintaining memory coherency in a computer system having multiple system buses
US5860158A (en) * 1996-11-15 1999-01-12 Samsung Electronics Company, Ltd. Cache control unit with a cache request transaction-oriented protocol
US6134634A (en) * 1996-12-20 2000-10-17 Texas Instruments Incorporated Method and apparatus for preemptive cache write-back
US5895485A (en) * 1997-02-24 1999-04-20 Eccs, Inc. Method and device using a redundant cache for preventing the loss of dirty data
JP3849951B2 (en) * 1997-02-27 2006-11-22 株式会社日立製作所 Shared memory multiprocessor
US6018792A (en) * 1997-07-02 2000-01-25 Micron Electronics, Inc. Apparatus for performing a low latency memory read with concurrent snoop
US5991855A (en) * 1997-07-02 1999-11-23 Micron Electronics, Inc. Low latency memory read with concurrent pipe lined snoops
US6009263A (en) * 1997-07-28 1999-12-28 Institute For The Development Of Emerging Architectures, L.L.C. Emulating agent and method for reformatting computer instructions into a standard uniform format
US5990913A (en) 1997-07-30 1999-11-23 Intel Corporation Method and apparatus for implementing a flush command for an accelerated graphics port device
US6760833B1 (en) * 1997-08-01 2004-07-06 Micron Technology, Inc. Split embedded DRAM processor
US7076568B2 (en) * 1997-10-14 2006-07-11 Alacritech, Inc. Data communication apparatus for computer intelligent network interface card which transfers data between a network and a storage device according designated uniform datagram protocol socket
US6070227A (en) * 1997-10-31 2000-05-30 Hewlett-Packard Company Main memory bank indexing scheme that optimizes consecutive page hits by linking main memory bank address organization to cache memory address organization
US6026475A (en) * 1997-11-26 2000-02-15 Digital Equipment Corporation Method for dynamically remapping a virtual address to a physical address to maintain an even distribution of cache page addresses in a virtual address space
US6253311B1 (en) * 1997-11-29 2001-06-26 Jp First Llc Instruction set for bi-directional conversion and transfer of integer and floating point data
US6560680B2 (en) * 1998-01-21 2003-05-06 Micron Technology, Inc. System controller with Integrated low latency memory using non-cacheable memory physically distinct from main memory
JP3751741B2 (en) * 1998-02-02 2006-03-01 日本電気株式会社 Multi-processor system
US6073210A (en) 1998-03-31 2000-06-06 Intel Corporation Synchronization of weakly ordered write combining operations using a fencing mechanism
US6643745B1 (en) * 1998-03-31 2003-11-04 Intel Corporation Method and apparatus for prefetching data into cache
JP3708436B2 (en) * 1998-05-07 2005-10-19 インフィネオン テクノロジース アクチエンゲゼルシャフト Cache memory for two-dimensional data field
TW501011B (en) * 1998-05-08 2002-09-01 Koninkl Philips Electronics Nv Data processing circuit with cache memory
US20010054137A1 (en) * 1998-06-10 2001-12-20 Richard James Eickemeyer Circuit arrangement and method with improved branch prefetching for short branch instructions
US6483516B1 (en) * 1998-10-09 2002-11-19 National Semiconductor Corporation Hierarchical texture cache
US6718457B2 (en) * 1998-12-03 2004-04-06 Sun Microsystems, Inc. Multiple-thread processor for threaded software applications
US6526481B1 (en) * 1998-12-17 2003-02-25 Massachusetts Institute Of Technology Adaptive cache coherence protocols
US6563818B1 (en) * 1999-05-20 2003-05-13 Advanced Micro Devices, Inc. Weighted round robin cell architecture
US6279080B1 (en) * 1999-06-09 2001-08-21 Ati International Srl Method and apparatus for association of memory locations with a cache location having a flush buffer
US6188624B1 (en) * 1999-07-12 2001-02-13 Winbond Electronics Corporation Low latency memory sensing circuits
US6606704B1 (en) * 1999-08-31 2003-08-12 Intel Corporation Parallel multithreaded processor with plural microengines executing multiple threads each microengine having loadable microcode
US6401175B1 (en) * 1999-10-01 2002-06-04 Sun Microsystems, Inc. Shared write buffer for use by multiple processor units
US6661794B1 (en) 1999-12-29 2003-12-09 Intel Corporation Method and apparatus for gigabit packet assignment for multithreaded packet processing
US6438658B1 (en) * 2000-06-30 2002-08-20 Intel Corporation Fast invalidation scheme for caches
US6654858B1 (en) * 2000-08-31 2003-11-25 Hewlett-Packard Development Company, L.P. Method for reducing directory writes and latency in a high performance, directory-based, coherency protocol
US6665768B1 (en) * 2000-10-12 2003-12-16 Chipwrights Design, Inc. Table look-up operation for SIMD processors with interleaved memory systems
US6587920B2 (en) * 2000-11-30 2003-07-01 Mosaid Technologies Incorporated Method and apparatus for reducing latency in a memory system
US6662275B2 (en) * 2001-02-12 2003-12-09 International Business Machines Corporation Efficient instruction cache coherency maintenance mechanism for scalable multiprocessor computer system with store-through data cache
US6647456B1 (en) * 2001-02-23 2003-11-11 Nvidia Corporation High bandwidth-low latency memory controller
US6725336B2 (en) * 2001-04-20 2004-04-20 Sun Microsystems, Inc. Dynamically allocated cache memory for a multi-processor unit
US6785677B1 (en) * 2001-05-02 2004-08-31 Unisys Corporation Method for execution of query to search strings of characters that match pattern with a target string utilizing bit vector
GB2378779B (en) 2001-08-14 2005-02-02 Advanced Risc Mach Ltd Accessing memory units in a data processing apparatus
US6877071B2 (en) * 2001-08-20 2005-04-05 Technology Ip Holdings, Inc. Multi-ported memory
US20030110208A1 (en) * 2001-09-12 2003-06-12 Raqia Networks, Inc. Processing data across packet boundaries
US6757784B2 (en) * 2001-09-28 2004-06-29 Intel Corporation Hiding refresh of memory and refresh-hidden memory
US7072970B2 (en) * 2001-10-05 2006-07-04 International Business Machines Corporation Programmable network protocol handler architecture
US6901491B2 (en) * 2001-10-22 2005-05-31 Sun Microsystems, Inc. Method and apparatus for integration of communication links with a remote direct memory access protocol
US6944731B2 (en) * 2001-12-19 2005-09-13 Agere Systems Inc. Dynamic random access memory system with bank conflict avoidance feature
US6789167B2 (en) 2002-03-06 2004-09-07 Hewlett-Packard Development Company, L.P. Method and apparatus for multi-core processor integrated circuit having functional elements configurable as core elements and as system device elements
US7200735B2 (en) * 2002-04-10 2007-04-03 Tensilica, Inc. High-performance hybrid processor with configurable execution units
GB2388447B (en) * 2002-05-09 2005-07-27 Sun Microsystems Inc A computer system method and program product for performing a data access from low-level code
US6814374B2 (en) * 2002-06-28 2004-11-09 Delphi Technologies, Inc. Steering column with foamed in-place structure
CN1387119A (en) 2002-06-28 2002-12-25 西安交通大学 Tree chain table for fast search of data and its generating algorithm
US6970985B2 (en) * 2002-07-09 2005-11-29 Bluerisc Inc. Statically speculative memory accessing
GB2390950A (en) * 2002-07-17 2004-01-21 Sony Uk Ltd Video wipe generation based on the distance of a display position between a wipe origin and a wipe destination
US6957305B2 (en) 2002-08-29 2005-10-18 International Business Machines Corporation Data streaming mechanism in a microprocessor
US20040059880A1 (en) * 2002-09-23 2004-03-25 Bennett Brian R. Low latency memory access method using unified queue mechanism
US7146643B2 (en) 2002-10-29 2006-12-05 Lockheed Martin Corporation Intrusion detection accelerator
US7093153B1 (en) * 2002-10-30 2006-08-15 Advanced Micro Devices, Inc. Method and apparatus for lowering bus clock frequency in a complex integrated data processing system
US7055003B2 (en) * 2003-04-25 2006-05-30 International Business Machines Corporation Data cache scrub mechanism for large L2/L3 data cache structures
US7133971B2 (en) * 2003-11-21 2006-11-07 International Business Machines Corporation Cache with selective least frequently used or most frequently used cache line replacement
US20050138276A1 (en) * 2003-12-17 2005-06-23 Intel Corporation Methods and apparatus for high bandwidth random access using dynamic random access memory
US7159068B2 (en) * 2003-12-22 2007-01-02 Phison Electronics Corp. Method of optimizing performance of a flash memory
US20050138297A1 (en) * 2003-12-23 2005-06-23 Intel Corporation Register file cache
US7380276B2 (en) * 2004-05-20 2008-05-27 Intel Corporation Processor extensions and software verification to support type-safe language environments running with untrusted code
US7353341B2 (en) * 2004-06-03 2008-04-01 International Business Machines Corporation System and method for canceling write back operation during simultaneous snoop push or snoop kill operation in write back caches
US20060026371A1 (en) 2004-07-30 2006-02-02 Chrysos George Z Method and apparatus for implementing memory order models with order vectors
DK1794979T3 (en) * 2004-09-10 2017-07-24 Cavium Inc Selective copying data structure
US7594081B2 (en) 2004-09-10 2009-09-22 Cavium Networks, Inc. Direct access to low-latency memory
US7941585B2 (en) * 2004-09-10 2011-05-10 Cavium Networks, Inc. Local scratchpad and data caching system
US20060143396A1 (en) * 2004-12-29 2006-06-29 Mason Cabot Method for programmer-controlled cache line eviction policy
US9304767B2 (en) * 2009-06-02 2016-04-05 Oracle America, Inc. Single cycle data movement between general purpose and floating-point registers

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5193187A (en) * 1989-12-29 1993-03-09 Supercomputer Systems Limited Partnership Fast interrupt mechanism for interrupting processors in parallel in a multiprocessor system wherein processors are assigned process ID numbers
US5613128A (en) * 1990-12-21 1997-03-18 Intel Corporation Programmable multi-processor interrupt controller system with a processor integrated local interrupt controller
US5848164A (en) * 1996-04-30 1998-12-08 The Board Of Trustees Of The Leland Stanford Junior University System and method for effects processing on audio subband data
US5778236A (en) * 1996-05-17 1998-07-07 Advanced Micro Devices, Inc. Multiprocessing interrupt controller on I/O bus
US6115763A (en) * 1998-03-05 2000-09-05 International Business Machines Corporation Multi-core chip providing external core access with regular operation function interface and predetermined service operation services interface comprising core interface units and masters interface unit
US6675284B1 (en) * 1998-08-21 2004-01-06 Stmicroelectronics Limited Integrated circuit with multiple processing cores
US6598178B1 (en) * 1999-06-01 2003-07-22 Agere Systems Inc. Peripheral breakpoint signaler
US6496880B1 (en) * 1999-08-26 2002-12-17 Agere Systems Inc. Shared I/O ports for multi-core designs
US6539522B1 (en) * 2000-01-31 2003-03-25 International Business Machines Corporation Method of developing re-usable software for efficient verification of system-on-chip integrated circuit designs
US6718294B1 (en) * 2000-05-16 2004-04-06 Mindspeed Technologies, Inc. System and method for synchronized control of system simulators with multiple processor cores
US20020029358A1 (en) * 2000-05-31 2002-03-07 Pawlowski Chester W. Method and apparatus for delivering error interrupts to a processor of a modular, multiprocessor system
US20040264077A1 (en) * 2002-10-02 2004-12-30 Dejan Radosavljevic Protective device with end of life indicator

Cited By (196)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060156099A1 (en) * 2004-12-16 2006-07-13 Sweet James D Method and system of using a single EJTAG interface for multiple tap controllers
US7650542B2 (en) * 2004-12-16 2010-01-19 Broadcom Corporation Method and system of using a single EJTAG interface for multiple tap controllers
US20060253660A1 (en) * 2005-03-30 2006-11-09 Intel Corporation Method and apparatus to provide dynamic hardware signal allocation in a processor
US7549026B2 (en) * 2005-03-30 2009-06-16 Intel Corporation Method and apparatus to provide dynamic hardware signal allocation in a processor
US9342468B2 (en) * 2005-05-16 2016-05-17 Texas Instruments Incorporated Memory time stamp register external to first and second processors
US20150026441A1 (en) * 2005-05-16 2015-01-22 Texas Instruments Incorporated Method and system of inserting marking values used to correlate trace data as between processor cores
US7840000B1 (en) * 2005-07-25 2010-11-23 Rockwell Collins, Inc. High performance programmable cryptography system
US8799687B2 (en) 2005-12-30 2014-08-05 Intel Corporation Method, apparatus, and system for energy efficiency and energy conservation including optimizing C-state selection under variable wakeup rates
US20130254605A1 (en) * 2006-10-20 2013-09-26 Texas Instruments Incorporated High speed double data rate jtag interface
US10162003B2 (en) 2006-10-20 2018-12-25 Texas Instruments Incorporated DDR TMS/TDI, addressable tap, state machine, and tap state monitor
US9817071B2 (en) 2006-10-20 2017-11-14 Texas Instruments Incorporated TDI/TMS DDR coupled JTAG domain with 6 preset flip flops
US8898528B2 (en) * 2006-10-20 2014-11-25 Texas Instruments Incoporated DDR JTAG interface setting flip-flops in high state at power-up
US20080184150A1 (en) * 2007-01-31 2008-07-31 Marc Minato Electronic circuit design analysis tool for multi-processor environments
US20080239956A1 (en) * 2007-03-30 2008-10-02 Packeteer, Inc. Data and Control Plane Architecture for Network Application Traffic Management Device
US9419867B2 (en) 2007-03-30 2016-08-16 Blue Coat Systems, Inc. Data and control plane architecture for network application traffic management device
US8059532B2 (en) 2007-06-21 2011-11-15 Packeteer, Inc. Data and control plane architecture including server-side triggered flow policy mechanism
US20080316922A1 (en) * 2007-06-21 2008-12-25 Packeteer, Inc. Data and Control Plane Architecture Including Server-Side Triggered Flow Policy Mechanism
US7813277B2 (en) 2007-06-29 2010-10-12 Packeteer, Inc. Lockless bandwidth management for multiprocessor networking devices
WO2009008007A3 (en) * 2007-07-09 2009-03-05 Hewlett Packard Development Co Data packet processing method for a multi core processor
US20100241831A1 (en) * 2007-07-09 2010-09-23 Hewlett-Packard Development Company, L.P. Data packet processing method for a multi core processor
US8799547B2 (en) 2007-07-09 2014-08-05 Hewlett-Packard Development Company, L.P. Data packet processing method for a multi core processor
US8279885B2 (en) 2007-09-25 2012-10-02 Packeteer, Inc. Lockless processing of command operations in multiprocessor systems
US20090083517A1 (en) * 2007-09-25 2009-03-26 Packeteer, Inc. Lockless Processing of Command Operations in Multiprocessor Systems
US20090150695A1 (en) * 2007-12-10 2009-06-11 Justin Song Predicting future power level states for processor cores
US20090150696A1 (en) * 2007-12-10 2009-06-11 Justin Song Transitioning a processor package to a low power state
US8024590B2 (en) 2007-12-10 2011-09-20 Intel Corporation Predicting future power level states for processor cores
US9285855B2 (en) 2007-12-10 2016-03-15 Intel Corporation Predicting future power level states for processor cores
US10261559B2 (en) 2007-12-10 2019-04-16 Intel Corporation Predicting future power level states for processor cores
US8111707B2 (en) 2007-12-20 2012-02-07 Packeteer, Inc. Compression mechanisms for control plane—data plane processing architectures
US9121905B2 (en) * 2009-03-25 2015-09-01 Texas Instruments Incorporated TAP with commandable data register control router and routing circuit
US20150113347A1 (en) * 2009-03-25 2015-04-23 Texas Instruments Incorporated Commanded jtag test access port operations
US10024913B2 (en) 2009-03-25 2018-07-17 Texas Instruments Incorporated Tap commandable data register control router inverted TCK, TMS/TDI imputs
US8959396B2 (en) * 2009-03-25 2015-02-17 Texas Instruments Incorporated Commandable data register control router connected to TCK and TDI
US8407528B2 (en) 2009-06-30 2013-03-26 Texas Instruments Incorporated Circuits, systems, apparatus and processes for monitoring activity in multi-processing systems
US20100332909A1 (en) * 2009-06-30 2010-12-30 Texas Instruments Incorporated Circuits, systems, apparatus and processes for monitoring activity in multi-processing systems
US8549261B2 (en) * 2009-11-16 2013-10-01 Fujitsu Limited Parallel computing apparatus and parallel computing method
US20120216017A1 (en) * 2009-11-16 2012-08-23 Fujitsu Limited Parallel computing apparatus and parallel computing method
US20110153724A1 (en) * 2009-12-23 2011-06-23 Murali Raja Systems and methods for object rate limiting in multi-core system
US8452835B2 (en) * 2009-12-23 2013-05-28 Citrix Systems, Inc. Systems and methods for object rate limiting in multi-core system
US9866463B2 (en) 2009-12-23 2018-01-09 Citrix Systems, Inc. Systems and methods for object rate limiting in multi-core system
US8914672B2 (en) * 2009-12-28 2014-12-16 Intel Corporation General purpose hardware to replace faulty core components that may also provide additional processor functionality
US20110161630A1 (en) * 2009-12-28 2011-06-30 Raasch Steven E General purpose hardware to replace faulty core components that may also provide additional processor functionality
US8112677B2 (en) 2010-02-26 2012-02-07 UltraSoC Technologies Limited Method of debugging multiple processes
US20110214023A1 (en) * 2010-02-26 2011-09-01 UltraSoC Technologies Limited Method of Debugging Multiple Processes
US20150346935A1 (en) * 2010-03-09 2015-12-03 Avistar Communications Corporation Scalable high-performance interactive real-time media architectures for virtual desktop environments
US20110225456A1 (en) * 2010-03-10 2011-09-15 Texas Instruments Incorporated Commanded jtag test access port operations
US8572433B2 (en) * 2010-03-10 2013-10-29 Texas Instruments Incorporated JTAG IC with commandable circuit controlling data register control router
US20110307741A1 (en) * 2010-06-15 2011-12-15 National Chung Cheng University Non-intrusive debugging framework for parallel software based on super multi-core framework
US9348387B2 (en) 2010-09-23 2016-05-24 Intel Corporation Providing per core voltage and frequency control
US9983659B2 (en) 2010-09-23 2018-05-29 Intel Corporation Providing per core voltage and frequency control
US9032226B2 (en) 2010-09-23 2015-05-12 Intel Corporation Providing per core voltage and frequency control
US8943334B2 (en) 2010-09-23 2015-01-27 Intel Corporation Providing per core voltage and frequency control
US9983660B2 (en) 2010-09-23 2018-05-29 Intel Corporation Providing per core voltage and frequency control
US9983661B2 (en) 2010-09-23 2018-05-29 Intel Corporation Providing per core voltage and frequency control
US9939884B2 (en) 2010-09-23 2018-04-10 Intel Corporation Providing per core voltage and frequency control
US8782468B2 (en) * 2010-12-22 2014-07-15 Intel Corporation Methods and tools to debug complex multi-core, multi-socket QPI based system
CN103270504A (en) * 2010-12-22 2013-08-28 英特尔公司 Debugging complex multi-core and multi-socket systems
WO2012087894A2 (en) * 2010-12-22 2012-06-28 Intel Corporation Debugging complex multi-core and multi-socket systems
WO2012087894A3 (en) * 2010-12-22 2013-01-03 Intel Corporation Debugging complex multi-core and multi-socket systems
US20120166882A1 (en) * 2010-12-22 2012-06-28 Binata Bhattacharyya Methods and tools to debug complex multi-core, multi-socket qpi based system
KR101498452B1 (en) * 2010-12-22 2015-03-04 인텔 코포레이션 Debugging complex multi-core and multi-socket systems
US9069555B2 (en) 2011-03-21 2015-06-30 Intel Corporation Managing power consumption in a multi-core processor
US9075614B2 (en) 2011-03-21 2015-07-07 Intel Corporation Managing power consumption in a multi-core processor
US20140157051A1 (en) * 2011-05-16 2014-06-05 Zongyou Shao Method and device for debugging a mips-structure cpu with southbridge and northbridge chipsets
US9846625B2 (en) * 2011-05-16 2017-12-19 Dawning Information Industry Co., Ltd. Method and device for debugging a MIPS-structure CPU with southbridge and northbridge chipsets
US8793515B2 (en) 2011-06-27 2014-07-29 Intel Corporation Increasing power efficiency of turbo mode operation in a processor
US8904205B2 (en) 2011-06-27 2014-12-02 Intel Corporation Increasing power efficiency of turbo mode operation in a processor
US8683240B2 (en) 2011-06-27 2014-03-25 Intel Corporation Increasing power efficiency of turbo mode operation in a processor
US8769316B2 (en) 2011-09-06 2014-07-01 Intel Corporation Dynamically allocating a power budget over multiple domains of a processor
US8775833B2 (en) 2011-09-06 2014-07-08 Intel Corporation Dynamically allocating a power budget over multiple domains of a processor
US9081557B2 (en) 2011-09-06 2015-07-14 Intel Corporation Dynamically allocating a power budget over multiple domains of a processor
US9032126B2 (en) 2011-09-08 2015-05-12 Intel Corporation Increasing turbo mode residency of a processor
US9032125B2 (en) 2011-09-08 2015-05-12 Intel Corporation Increasing turbo mode residency of a processor
US8688883B2 (en) 2011-09-08 2014-04-01 Intel Corporation Increasing turbo mode residency of a processor
US8954770B2 (en) 2011-09-28 2015-02-10 Intel Corporation Controlling temperature of multiple domains of a multi-domain processor using a cross domain margin
US8914650B2 (en) 2011-09-28 2014-12-16 Intel Corporation Dynamically adjusting power of non-core processor circuitry including buffer circuitry
US9074947B2 (en) 2011-09-28 2015-07-07 Intel Corporation Estimating temperature of a processor core in a low power state without thermal sensor information
US9501129B2 (en) 2011-09-28 2016-11-22 Intel Corporation Dynamically adjusting power of non-core processor circuitry including buffer circuitry
US9235254B2 (en) 2011-09-28 2016-01-12 Intel Corporation Controlling temperature of multiple domains of a multi-domain processor using a cross-domain margin
US9176565B2 (en) 2011-10-27 2015-11-03 Intel Corporation Controlling operating frequency of a core domain based on operating condition of a non-core domain of a multi-domain processor
US10248181B2 (en) 2011-10-27 2019-04-02 Intel Corporation Enabling a non-core domain to control memory bandwidth in a processor
US8832478B2 (en) 2011-10-27 2014-09-09 Intel Corporation Enabling a non-core domain to control memory bandwidth in a processor
US9354692B2 (en) 2011-10-27 2016-05-31 Intel Corporation Enabling a non-core domain to control memory bandwidth in a processor
US10037067B2 (en) 2011-10-27 2018-07-31 Intel Corporation Enabling a non-core domain to control memory bandwidth in a processor
US9939879B2 (en) 2011-10-27 2018-04-10 Intel Corporation Controlling operating frequency of a core domain via a non-core domain of a multi-domain processor
US9026815B2 (en) 2011-10-27 2015-05-05 Intel Corporation Controlling operating frequency of a core domain via a non-core domain of a multi-domain processor
US9471490B2 (en) 2011-10-31 2016-10-18 Intel Corporation Dynamically controlling cache size to maximize energy efficiency
US9618997B2 (en) 2011-10-31 2017-04-11 Intel Corporation Controlling a turbo mode frequency of a processor
US8943340B2 (en) 2011-10-31 2015-01-27 Intel Corporation Controlling a turbo mode frequency of a processor
US9292068B2 (en) 2011-10-31 2016-03-22 Intel Corporation Controlling a turbo mode frequency of a processor
US9158693B2 (en) 2011-10-31 2015-10-13 Intel Corporation Dynamically controlling cache size to maximize energy efficiency
US10067553B2 (en) 2011-10-31 2018-09-04 Intel Corporation Dynamically controlling cache size to maximize energy efficiency
US9239611B2 (en) 2011-12-05 2016-01-19 Intel Corporation Method, apparatus, and system for energy efficiency and energy conservation including balancing power among multi-frequency domains of a processor based on efficiency rating scheme
US9753531B2 (en) 2011-12-05 2017-09-05 Intel Corporation Method, apparatus, and system for energy efficiency and energy conservation including determining an optimal power state of the apparatus based on residency time of non-core domains in a power saving state
US8972763B2 (en) 2011-12-05 2015-03-03 Intel Corporation Method, apparatus, and system for energy efficiency and energy conservation including determining an optimal power state of the apparatus based on residency time of non-core domains in a power saving state
US9052901B2 (en) 2011-12-14 2015-06-09 Intel Corporation Method, apparatus, and system for energy efficiency and energy conservation including configurable maximum processor current
US9170624B2 (en) 2011-12-15 2015-10-27 Intel Corporation User level control of power management policies
US9372524B2 (en) 2011-12-15 2016-06-21 Intel Corporation Dynamically modifying a power/performance tradeoff based on processor utilization
US9760409B2 (en) 2011-12-15 2017-09-12 Intel Corporation Dynamically modifying a power/performance tradeoff based on a processor utilization
US9098261B2 (en) 2011-12-15 2015-08-04 Intel Corporation User level control of power management policies
US9535487B2 (en) 2011-12-15 2017-01-03 Intel Corporation User level control of power management policies
US8996895B2 (en) 2011-12-28 2015-03-31 Intel Corporation Method, apparatus, and system for energy efficiency and energy conservation including optimizing C-state selection under variable wakeup rates
US9354689B2 (en) 2012-03-13 2016-05-31 Intel Corporation Providing energy efficient turbo operation of a processor
US9323316B2 (en) 2012-03-13 2016-04-26 Intel Corporation Dynamically controlling interconnect frequency in a processor
US9436245B2 (en) 2012-03-13 2016-09-06 Intel Corporation Dynamically computing an electrical design point (EDP) for a multicore processor
US9547027B2 (en) 2012-03-30 2017-01-17 Intel Corporation Dynamically measuring power consumption in a processor
US10185566B2 (en) 2012-04-27 2019-01-22 Intel Corporation Migrating tasks between asymmetric computing elements of a multi-core processor
US9411770B2 (en) * 2012-07-10 2016-08-09 Lenovo Enterprise Solutions (Singapore) Pte. Ltd. Controlling a plurality of serial peripheral interface (‘SPI’) peripherals using a single chip select
US20140019644A1 (en) * 2012-07-10 2014-01-16 International Business Machines Corporation Controlling A Plurality Of Serial Peripheral Interface ('SPI') Peripherals Using A Single Chip Select
US9760155B2 (en) 2012-08-31 2017-09-12 Intel Corporation Configuring power management functionality in a processor
US9063727B2 (en) 2012-08-31 2015-06-23 Intel Corporation Performing cross-domain thermal control in a processor
US8984313B2 (en) 2012-08-31 2015-03-17 Intel Corporation Configuring power management functionality in a processor including a plurality of cores by utilizing a register to store a power domain indicator
US9235244B2 (en) 2012-08-31 2016-01-12 Intel Corporation Configuring power management functionality in a processor
US10203741B2 (en) 2012-08-31 2019-02-12 Intel Corporation Configuring power management functionality in a processor
US9189046B2 (en) 2012-08-31 2015-11-17 Intel Corporation Performing cross-domain thermal control in a processor
US10191532B2 (en) 2012-08-31 2019-01-29 Intel Corporation Configuring power management functionality in a processor
US9335804B2 (en) 2012-09-17 2016-05-10 Intel Corporation Distributing power to heterogeneous compute elements of a processor
US9342122B2 (en) 2012-09-17 2016-05-17 Intel Corporation Distributing power to heterogeneous compute elements of a processor
US9423858B2 (en) 2012-09-27 2016-08-23 Intel Corporation Sharing power between domains in a processor package using encoded power consumption information from a second domain to calculate an available power budget for a first domain
US9575543B2 (en) 2012-11-27 2017-02-21 Intel Corporation Providing an inter-arrival access timer in a processor
US9176875B2 (en) 2012-12-14 2015-11-03 Intel Corporation Power gating a portion of a cache memory
US9183144B2 (en) 2012-12-14 2015-11-10 Intel Corporation Power gating a portion of a cache memory
US9292468B2 (en) 2012-12-17 2016-03-22 Intel Corporation Performing frequency coordination in a multiprocessor system based on response timing optimization
US9405351B2 (en) 2012-12-17 2016-08-02 Intel Corporation Performing frequency coordination in a multiprocessor system
US9235252B2 (en) 2012-12-21 2016-01-12 Intel Corporation Dynamic balancing of power across a plurality of processor domains according to power policy control bias
US9086834B2 (en) 2012-12-21 2015-07-21 Intel Corporation Controlling configurable peak performance limits of a processor
US9075556B2 (en) 2012-12-21 2015-07-07 Intel Corporation Controlling configurable peak performance limits of a processor
US9671854B2 (en) 2012-12-21 2017-06-06 Intel Corporation Controlling configurable peak performance limits of a processor
US9164565B2 (en) 2012-12-28 2015-10-20 Intel Corporation Apparatus and method to manage energy usage of a processor
US9081577B2 (en) 2012-12-28 2015-07-14 Intel Corporation Independent control of processor core retention states
US9606888B1 (en) * 2013-01-04 2017-03-28 Marvell International Ltd. Hierarchical multi-core debugger interface
US9335803B2 (en) 2013-02-15 2016-05-10 Intel Corporation Calculating a dynamically changeable maximum operating voltage value for a processor based on a different polynomial equation using a set of coefficient values and a number of current active cores
US9996135B2 (en) 2013-03-11 2018-06-12 Intel Corporation Controlling operating voltage of a processor
US9367114B2 (en) 2013-03-11 2016-06-14 Intel Corporation Controlling operating voltage of a processor
US9395784B2 (en) 2013-04-25 2016-07-19 Intel Corporation Independently controlling frequency of plurality of power domains in a processor system
US9377841B2 (en) 2013-05-08 2016-06-28 Intel Corporation Adaptively limiting a maximum operating frequency in a multicore processor
US10146283B2 (en) 2013-05-31 2018-12-04 Intel Corporation Controlling power delivery to a processor via a bypass
US9823719B2 (en) 2013-05-31 2017-11-21 Intel Corporation Controlling power delivery to a processor via a bypass
US9348401B2 (en) 2013-06-25 2016-05-24 Intel Corporation Mapping a performance request to an operating frequency in a processor
US10175740B2 (en) 2013-06-25 2019-01-08 Intel Corporation Mapping a performance request to an operating frequency in a processor
US9471088B2 (en) 2013-06-25 2016-10-18 Intel Corporation Restricting clock signal delivery in a processor
US9348407B2 (en) 2013-06-27 2016-05-24 Intel Corporation Method and apparatus for atomic frequency and voltage changes
US9377836B2 (en) 2013-07-26 2016-06-28 Intel Corporation Restricting clock signal delivery based on activity in a processor
US9495001B2 (en) 2013-08-21 2016-11-15 Intel Corporation Forcing core low power states in a processor
US10310588B2 (en) 2013-08-21 2019-06-04 Intel Corporation Forcing core low power states in a processor
US9405345B2 (en) 2013-09-27 2016-08-02 Intel Corporation Constraining processor operation based on power envelope information
US9594560B2 (en) 2013-09-27 2017-03-14 Intel Corporation Estimating scalability value for a specific domain of a multicore processor based on active state residency of the domain, stall duration of the domain, memory bandwidth of the domain, and a plurality of coefficients based on a workload to execute on the domain
US9684517B2 (en) 2013-10-31 2017-06-20 Lenovo Enterprise Solutions (Singapore) Pte. Ltd. System monitoring and debugging in a multi-core processor system
US9494998B2 (en) 2013-12-17 2016-11-15 Intel Corporation Rescheduling workloads to enforce and maintain a duty cycle
US9459689B2 (en) 2013-12-23 2016-10-04 Intel Corporation Dyanamically adapting a voltage of a clock generation circuit
US9965019B2 (en) 2013-12-23 2018-05-08 Intel Corporation Dyanamically adapting a voltage of a clock generation circuit
US9323525B2 (en) 2014-02-26 2016-04-26 Intel Corporation Monitoring vector lane duty cycle for dynamic optimization
US9372800B2 (en) 2014-03-07 2016-06-21 Cavium, Inc. Inter-chip interconnect protocol for a multi-chip system
US9529532B2 (en) 2014-03-07 2016-12-27 Cavium, Inc. Method and apparatus for memory allocation in a multi-node system
US10169080B2 (en) 2014-03-07 2019-01-01 Cavium, Llc Method for work scheduling in a multi-chip system
US9411644B2 (en) 2014-03-07 2016-08-09 Cavium, Inc. Method and system for work scheduling in a multi-chip system
WO2015134103A1 (en) * 2014-03-07 2015-09-11 Cavium, Inc. Method and system for ordering i/o access in a multi-node environment
US10198065B2 (en) 2014-03-21 2019-02-05 Intel Corporation Selecting a low power state based on cache flush latency determination
US9665153B2 (en) 2014-03-21 2017-05-30 Intel Corporation Selecting a low power state based on cache flush latency determination
US10108454B2 (en) 2014-03-21 2018-10-23 Intel Corporation Managing dynamic capacitance using code scheduling
US9760158B2 (en) 2014-06-06 2017-09-12 Intel Corporation Forcing a processor into a low power state
US10345889B2 (en) 2014-06-06 2019-07-09 Intel Corporation Forcing a processor into a low power state
US9513689B2 (en) 2014-06-30 2016-12-06 Intel Corporation Controlling processor performance scaling based on context
US9606602B2 (en) 2014-06-30 2017-03-28 Intel Corporation Method and apparatus to prevent voltage droop in a computer
US10216251B2 (en) 2014-06-30 2019-02-26 Intel Corporation Controlling processor performance scaling based on context
US10331186B2 (en) 2014-07-25 2019-06-25 Intel Corporation Adaptive algorithm for thermal throttling of multi-core processors with non-homogeneous performance states
US9575537B2 (en) 2014-07-25 2017-02-21 Intel Corporation Adaptive algorithm for thermal throttling of multi-core processors with non-homogeneous performance states
US9760136B2 (en) 2014-08-15 2017-09-12 Intel Corporation Controlling temperature of a system memory
US9990016B2 (en) 2014-08-15 2018-06-05 Intel Corporation Controlling temperature of a system memory
US9671853B2 (en) 2014-09-12 2017-06-06 Intel Corporation Processor operating by selecting smaller of requested frequency and an energy performance gain (EPG) frequency
US10339023B2 (en) 2014-09-25 2019-07-02 Intel Corporation Cache-aware adaptive thread scheduling and migration
US9977477B2 (en) 2014-09-26 2018-05-22 Intel Corporation Adapting operating parameters of an input/output (IO) interface circuit of a processor
US9684360B2 (en) 2014-10-30 2017-06-20 Intel Corporation Dynamically controlling power management of an on-die memory of a processor
US9703358B2 (en) 2014-11-24 2017-07-11 Intel Corporation Controlling turbo mode frequency operation in a processor
US9710043B2 (en) 2014-11-26 2017-07-18 Intel Corporation Controlling a guaranteed frequency of a processor
US10048744B2 (en) 2014-11-26 2018-08-14 Intel Corporation Apparatus and method for thermal management in a multi-chip package
US9847927B2 (en) 2014-12-26 2017-12-19 Pfu Limited Information processing device, method, and medium
US9639134B2 (en) 2015-02-05 2017-05-02 Intel Corporation Method and apparatus to provide telemetry data to a power controller of a processor
US9910481B2 (en) 2015-02-13 2018-03-06 Intel Corporation Performing power management in a multicore processor
US10234930B2 (en) 2015-02-13 2019-03-19 Intel Corporation Performing power management in a multicore processor
US9874922B2 (en) 2015-02-17 2018-01-23 Intel Corporation Performing dynamic power control of platform devices
US9842082B2 (en) 2015-02-27 2017-12-12 Intel Corporation Dynamically updating logical identifiers of cores of a processor
US9710054B2 (en) 2015-02-28 2017-07-18 Intel Corporation Programmable power management agent
US9760160B2 (en) 2015-05-27 2017-09-12 Intel Corporation Controlling performance states of processing engines of a processor
US9710041B2 (en) 2015-07-29 2017-07-18 Intel Corporation Masking a power state of a core of a processor
US10001822B2 (en) 2015-09-22 2018-06-19 Intel Corporation Integrating a power arbiter in a processor
EP3343377A4 (en) * 2015-09-25 2018-09-12 Huawei Technologies Co., Ltd. Debugging method, multi-core processor, and debugging equipment
US9983644B2 (en) 2015-11-10 2018-05-29 Intel Corporation Dynamically updating at least one power management operational parameter pertaining to a turbo mode of a processor for increased performance
US9910470B2 (en) 2015-12-16 2018-03-06 Intel Corporation Controlling telemetry data communication in a processor
US10146286B2 (en) 2016-01-14 2018-12-04 Intel Corporation Dynamically updating a power management policy of a processor
US10289188B2 (en) 2016-06-21 2019-05-14 Intel Corporation Processor having concurrent core and fabric exit from a low power state
US10281975B2 (en) 2016-06-23 2019-05-07 Intel Corporation Processor having accelerated user responsiveness in constrained environment
US10324519B2 (en) 2016-06-23 2019-06-18 Intel Corporation Controlling forced idle state operation in a processor
US10234920B2 (en) 2016-08-31 2019-03-19 Intel Corporation Controlling current consumption of a processor based at least in part on platform capacitance
US10168758B2 (en) 2016-09-29 2019-01-01 Intel Corporation Techniques to enable communication between a processor and voltage regulator
EP3333697A1 (en) * 2016-12-12 2018-06-13 INTEL Corporation Communicating signals between divided and undivided clock domains

Also Published As

Publication number Publication date
US9141548B2 (en) 2015-09-22
US20060059316A1 (en) 2006-03-16
CN101069170B (en) 2012-02-08
CN101040256A (en) 2007-09-19
CN100533372C (en) 2009-08-26
CN101069170A (en) 2007-11-07
CN101128804B (en) 2012-02-01
US20140317353A1 (en) 2014-10-23
CN101128804A (en) 2008-02-20
US20060059310A1 (en) 2006-03-16
CN101053234B (en) 2012-02-29
CN101053234A (en) 2007-10-10
CN101036117A (en) 2007-09-12
CN101036117B (en) 2010-12-08
US7941585B2 (en) 2011-05-10

Similar Documents

Publication Publication Date Title
Yourst PTLsim: A cycle accurate full system x86-64 microarchitectural simulator
US6148381A (en) Single-port trace buffer architecture with overflow reduction
US5530804A (en) Superscalar processor with plural pipelined execution units each unit selectively having both normal and debug modes
US7149926B2 (en) Configurable real-time trace port for embedded processors
JP3846939B2 (en) Data processor
US6530076B1 (en) Data processing system processor dynamic selection of internal signal tracing
EP0974093B1 (en) Debug interface including a compact trace record storage
US6754856B2 (en) Memory access debug facility
US6205560B1 (en) Debug system allowing programmable selection of alternate debug mechanisms such as debug handler, SMI, or JTAG
US20110078350A1 (en) Method for generating multiple serial bus chip selects using single chip select signal and modulation of clock signal frequency
EP0762277B1 (en) Data processor with built-in emulation circuit
US8341604B2 (en) Embedded trace macrocell for enhanced digital signal processor debugging operations
EP0762276B1 (en) Data processor with built-in emulation circuit
EP0391173B1 (en) Debug peripheral for microcomputers, microprocessors and core processor integrated circuits and system using the same
EP0762279A1 (en) Data processor with built-in emulation circuit
US7840842B2 (en) Method for debugging reconfigurable architectures
US6990657B2 (en) Shared software breakpoints in a shared memory system
JP3936175B2 (en) Digital signal processing system
USRE36766E (en) Microprocessor breakpoint apparatus
US6145122A (en) Development interface for a data processor
US5978902A (en) Debug interface including operating system access of a serial/parallel debug port
US6145123A (en) Trace on/off with breakpoint register
ES2678413T3 (en) Procedure and alignment system traces between threads for processor multithreaded
EP0849670B1 (en) Integrated computer providing an instruction trace
US7665002B1 (en) Multi-core integrated circuit with shared debug port

Legal Events

Date Code Title Description
AS Assignment

Owner name: CAVIUM NETWORKS, CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:BERTONE, MICHAEL S.;CARLSON, DAVID A.;KESSLER, RICHARD E.;AND OTHERS;REEL/FRAME:016974/0203;SIGNING DATES FROM 20050914 TO 20050926

AS Assignment

Owner name: CAVIUM NETWORKS, INC., A DELAWARE CORPORATION, CAL

Free format text: MERGER;ASSIGNOR:CAVIUM NETWORKS, A CALIFORNIA CORPORATION;REEL/FRAME:019014/0174

Effective date: 20070205

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION