US20220138329A1 - Microprocessor pipeline circuitry to support cryptographic computing - Google Patents

Microprocessor pipeline circuitry to support cryptographic computing

Info

Publication number
US20220138329A1
Authority
US
United States
Prior art keywords
address
processor
encrypted
pointer
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US17/576,533
Inventor
Michael E. Kounavis
Santosh Ghosh
Sergej Deutsch
Michael D. LeMay
David M. Durham
Stanislav Shwartsman
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Intel Corp
Original Assignee
Intel Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Intel Corp filed Critical Intel Corp
Priority to US17/576,533 priority Critical patent/US20220138329A1/en
Publication of US20220138329A1 publication Critical patent/US20220138329A1/en
Priority to US17/878,322 priority patent/US20220382885A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/0207Addressing or allocation; Relocation with multidimensional access, e.g. row/column, matrix
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/06Addressing a physical block of locations, e.g. base addressing, module addressing, memory dedication
    • G06F12/0646Configuration or reconfiguration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0806Multiuser, multiprocessor or multiprocessing cache systems
    • G06F12/0811Multiuser, multiprocessor or multiprocessing cache systems with multilevel cache hierarchies
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0875Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with dedicated cache, e.g. instruction or stack
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/14Protection against unauthorised use of memory or access to memory
    • G06F12/1408Protection against unauthorised use of memory or access to memory by using cryptography
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/14Protection against unauthorised use of memory or access to memory
    • G06F12/1458Protection against unauthorised use of memory or access to memory by checking the subject access rights
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/14Protection against unauthorised use of memory or access to memory
    • G06F12/1458Protection against unauthorised use of memory or access to memory by checking the subject access rights
    • G06F12/1466Key-lock mechanism
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/10Protecting distributed programs or content, e.g. vending or licensing of copyrighted material ; Digital rights management [DRM]
    • G06F21/12Protecting executable software
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/52Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity ; Preventing unwanted data erasure; Buffer overflow
    • G06F21/54Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity ; Preventing unwanted data erasure; Buffer overflow by adding security routines or objects to programs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/602Providing cryptographic facilities or services
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/62Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6227Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database where protection concerns the structure of data, e.g. records, types, queries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/64Protecting data integrity, e.g. using checksums, certificates or signatures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/70Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer
    • G06F21/71Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure computing or processing of information
    • G06F21/72Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure computing or processing of information in cryptographic circuits
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/70Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer
    • G06F21/78Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure storage of data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/70Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer
    • G06F21/78Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure storage of data
    • G06F21/79Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure storage of data in semiconductor storage media, e.g. directly-addressable memories
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003Arrangements for executing specific machine instructions
    • G06F9/3004Arrangements for executing specific machine instructions to perform operations on memory
    • G06F9/30043LOAD or STORE instructions; Clear instruction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30098Register arrangements
    • G06F9/30101Special purpose registers
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/3017Runtime instruction translation, e.g. macros
    • G06F9/30178Runtime instruction translation, e.g. macros of compressed or encrypted instructions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/32Address formation of the next instruction, e.g. by incrementing the instruction counter
    • G06F9/321Program or instruction counter, e.g. incrementing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Program initiating; Program switching, e.g. by interrupt
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5016Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/06Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols the encryption apparatus using shift registers or memories for block-wise or stream coding, e.g. DES systems or RC4; Hash functions; Pseudorandom sequence generators
    • H04L9/0618Block ciphers, i.e. encrypting groups of characters of a plain text message using fixed encryption transformation
    • H04L9/0631Substitution permutation network [SPN], i.e. cipher composed of a number of stages or rounds each involving linear and nonlinear transformations, e.g. AES algorithms
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/06Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols the encryption apparatus using shift registers or memories for block-wise or stream coding, e.g. DES systems or RC4; Hash functions; Pseudorandom sequence generators
    • H04L9/0618Block ciphers, i.e. encrypting groups of characters of a plain text message using fixed encryption transformation
    • H04L9/0637Modes of operation, e.g. cipher block chaining [CBC], electronic codebook [ECB] or Galois/counter mode [GCM]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/08Key distribution or management, e.g. generation, sharing or updating, of cryptographic keys or passwords
    • H04L9/0816Key establishment, i.e. cryptographic processes or cryptographic protocols whereby a shared secret becomes available to two or more parties, for subsequent use
    • H04L9/0819Key transport or distribution, i.e. key establishment techniques where one party creates or otherwise obtains a secret value, and securely transfers it to the other(s)
    • H04L9/0822Key transport or distribution, i.e. key establishment techniques where one party creates or otherwise obtains a secret value, and securely transfers it to the other(s) using key encryption key
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/08Key distribution or management, e.g. generation, sharing or updating, of cryptographic keys or passwords
    • H04L9/0816Key establishment, i.e. cryptographic processes or cryptographic protocols whereby a shared secret becomes available to two or more parties, for subsequent use
    • H04L9/0819Key transport or distribution, i.e. key establishment techniques where one party creates or otherwise obtains a secret value, and securely transfers it to the other(s)
    • H04L9/083Key transport or distribution, i.e. key establishment techniques where one party creates or otherwise obtains a secret value, and securely transfers it to the other(s) involving central third party, e.g. key distribution center [KDC] or trusted third party [TTP]
    • H04L9/0833Key transport or distribution, i.e. key establishment techniques where one party creates or otherwise obtains a secret value, and securely transfers it to the other(s) involving central third party, e.g. key distribution center [KDC] or trusted third party [TTP] involving conference or group key
    • H04L9/0836Key transport or distribution, i.e. key establishment techniques where one party creates or otherwise obtains a secret value, and securely transfers it to the other(s) involving central third party, e.g. key distribution center [KDC] or trusted third party [TTP] involving conference or group key using tree structure or hierarchical structure
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/08Key distribution or management, e.g. generation, sharing or updating, of cryptographic keys or passwords
    • H04L9/0861Generation of secret information including derivation or calculation of cryptographic keys or passwords
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/08Key distribution or management, e.g. generation, sharing or updating, of cryptographic keys or passwords
    • H04L9/0861Generation of secret information including derivation or calculation of cryptographic keys or passwords
    • H04L9/0869Generation of secret information including derivation or calculation of cryptographic keys or passwords involving random numbers or seeds
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/08Key distribution or management, e.g. generation, sharing or updating, of cryptographic keys or passwords
    • H04L9/0894Escrow, recovery or storing of secret information, e.g. secret key escrow or cryptographic key storage
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/14Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols using a plurality of keys or algorithms
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/32Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/32Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
    • H04L9/3236Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials using cryptographic hash functions
    • H04L9/3242Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials using cryptographic hash functions involving keyed hash functions, e.g. message authentication codes [MACs], CBC-MAC or HMAC
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0893Caches characterised by their organisation or structure
    • G06F12/0897Caches characterised by their organisation or structure with two or more cache hierarchy levels
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • G06F2009/45587Isolation or security of virtual machine instances
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55Detecting local intrusion or implementing counter-measures
    • G06F21/556Detecting local intrusion or implementing counter-measures involving covert channels, i.e. data leakage between processes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/10Providing a specific technical effect
    • G06F2212/1032Reliability improvement, data loss prevention, degraded operation etc
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/10Providing a specific technical effect
    • G06F2212/1041Resource optimization
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/10Providing a specific technical effect
    • G06F2212/1052Security improvement
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/40Specific encoding of data in memory or cache
    • G06F2212/402Encrypted data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2221/00Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/21Indexing scheme relating to G06F21/00 and subgroups addressing additional information or applications relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/2107File encryption
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2209/00Additional information or applications relating to cryptographic mechanisms or cryptographic arrangements for secret or secure communication H04L9/00
    • H04L2209/12Details relating to cryptographic hardware or logic circuitry
    • H04L2209/125Parallelization or pipelining, e.g. for accelerating processing of cryptographic operations
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • This disclosure relates in general to the field of computer systems and, more particularly, to microprocessor pipeline circuitry to support cryptographic computing.
  • Cryptographic computing may refer to solutions for computer system security that employ cryptographic mechanisms inside processor components. Some cryptographic computing systems may involve the encryption and decryption of pointers, keys and data in a processor core using new encrypted memory access instructions.
  • FIG. 1 is a flow diagram of an example process of scheduling microoperations.
  • FIG. 2 is a diagram of an example process of scheduling microoperations based on cryptographic-based instructions.
  • FIG. 3 is a diagram of another example process of scheduling microoperations based on cryptographic-based instructions.
  • FIGS. 4A-4B are diagrams of an example data decryption process in a cryptographic computing system.
  • FIGS. 5A-5C are diagrams of another example data decryption process in a cryptographic computing system.
  • FIGS. 6A-6B are diagrams of an example data encryption process in a cryptographic computing system.
  • FIGS. 7A-7B are diagrams of an example pointer decryption process in a cryptographic computing system.
  • FIGS. 8A-8B are diagrams of an example base address slice decryption process in a cryptographic computing system.
  • FIG. 9 is a flow diagram of an example process of executing cryptographic-based instructions in a cryptographic computing system.
  • FIG. 10 is a block diagram illustrating an example processor core and memory according to at least one embodiment.
  • FIG. 11A is a block diagram of an example in-order pipeline and an example register renaming, out-of-order issue/execution pipeline according to one or more embodiments of this disclosure;
  • FIG. 11B is a block diagram of an example in-order architecture core and register renaming, out-of-order issue/execution architecture core to be included in a processor according to one or more embodiments of this disclosure.
  • FIG. 12 is a block diagram of an example computer architecture according to at least one embodiment.
  • Cryptographic computing may refer to computer system security solutions that employ cryptographic mechanisms inside processor components.
  • Some cryptographic computing systems may involve the encryption and decryption of pointers, keys, and data in a processor core using new encrypted memory access instructions.
  • the microarchitecture pipeline of the processor core may be configured to support such encryption and decryption operations.
  • Some current systems may address security concerns by placing a memory encryption unit in the memory controller; however, such systems may increase latencies because the cryptographic functionality is located in the memory controller rather than in the core. Other systems may provide a pointer authentication solution, but such solutions cannot support multi-tenancy and may otherwise be limited when compared to the cryptographic computing implementations described herein.
  • an execution pipeline of a processor core first maps cryptographic computing instructions into at least one block encryption-based microoperation (μop) and at least one regular, non-encryption-based load/store μop.
  • Load operations performed by load μops may go to a load buffer (e.g., in a memory subsystem of a processor), while store operations performed by store μops may go to a store buffer (e.g., in the same memory subsystem).
  • An in-order or out-of-order execution scheduler is aware of the timings and dependencies associated with the cryptographic computing instructions.
  • the load and store μops are considered dependent on the block encryption μops.
  • the load and store μops may execute in parallel with the encryption of the counter.
  • a counter common to the plurality of load/store μops may be encrypted only once.
  • block encryptions coming from cryptographic computing instructions are scheduled to be executed in parallel with independent μops, which may include μops not coming from cryptographic computing instructions.
  • functional units of the processor core include circuitry for block encryption or counter encryption operations.
  • data decryption may be performed (e.g., on data loaded from a data cache unit) by a decryption unit coupled to or implemented in a load buffer.
  • data encryption may be performed (e.g., on data output from an execution unit) by an encryption unit coupled to or implemented in a store buffer.
  • pointer decryption may be performed by an address generation unit.
  • Any suitable block cipher cryptographic algorithm may be implemented. For example, a small block cipher (e.g., a SIMON or SPECK cipher with a 32-bit block size, or another variable-block-size block cipher), or a tweakable version thereof, may be used.
  • AES: Advanced Encryption Standard
  • AES-XTS: AES tweaked-codebook mode with ciphertext stealing
  • CTR: AES counter mode
  • cryptographic computing may require the linear address for each memory access to be plumbed to the interface with the data cache to enable tweaked encryption and decryption at that interface.
  • For load requests, this may be accomplished by adding a new read port on the load buffer.
  • For stream ciphers (e.g., those using the counter mode), the keystream may be pre-computed as soon as the load buffer entry is created.
  • Data may be encrypted as it is stored into the store buffer or may be encrypted after it exits the store buffer on its way to a Level-1 (L1) cache.
  • a read port may be utilized on the store buffer so that a cryptographic execution unit can read the address.
  • aspects of the present disclosure may provide a good cost/performance trade-off when compared to current systems, as data and pointer encryption and decryption latencies can be hidden behind the execution of other μops. Other advantages will be apparent in light of the present disclosure.
  • FIG. 1 is a flow diagram of an example process 100 of scheduling microoperations.
  • the example process 100 may be implemented by an execution scheduler, such as an out-of-order execution scheduler in certain instances.
  • a sequence of instructions is accessed by an execution scheduler.
  • the instructions may be inside a window of fixed size (e.g., 25 instructions or 50 instructions).
  • the sequence of instructions is mapped to a sequence of microoperations (μops). In typical pipelines, each instruction may be mapped to one or more μops in the sequence.
  • the scheduler detects dependencies between μops and expresses those dependencies in the form of a directed acyclic graph. This may be performed by dependencies logic of the scheduler.
  • two independent μops may be represented as nodes in separate parallel branches in the graph.
  • dependent μops, such as an ADD μop and a following store μop, may be represented as sequential nodes in the same branch of the graph.
  • the acyclic graph may include speculative execution branches in certain instances.
  • the scheduler may annotate the graph with latency and throughput values associated with the execution of the μops, and at 110, the scheduler performs maximal scheduling of at least one subset of independent μops by the functional units of the processor core.
  • the annotation of 108 may be performed by timing logic of the scheduler, and the scheduling of 110 may be performed by scheduling logic of the scheduler.
  • Maximal scheduling may refer to an assignment of independent μops to core functional units that is locally optimal according to some specific objective. For example, the scheduler may perform assignments such that the largest possible number of independent functional units are simultaneously occupied executing independent μop tasks. In certain embodiments, the scheduling performed at 110 may be repeated several times.
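  • By way of illustration only, the following sketch (in Python, with hypothetical class and function names) shows one way a dependency graph of μops could be built and greedily scheduled so that the largest possible number of independent μops occupy free functional units each cycle; it is a conceptual model of the scheduling described above, not the claimed circuitry.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)
class Uop:
    name: str
    latency: int = 1                            # cycles to execute (cf. the timing annotation)
    deps: list = field(default_factory=list)    # edges of the dependency DAG

def maximal_schedule(uops, num_functional_units=4):
    """Greedy 'maximal' scheduling: each cycle, issue as many ready,
    independent uops as there are free functional units."""
    pending, in_flight, done, schedule, cycle = list(uops), {}, set(), [], 0
    while pending or in_flight:
        for u, finish in list(in_flight.items()):       # retire finished uops
            if finish <= cycle:
                done.add(u.name)
                del in_flight[u]
        ready = [u for u in pending if all(d.name in done for d in u.deps)]
        free = num_functional_units - len(in_flight)
        for u in ready[:free]:                          # occupy as many units as possible
            in_flight[u] = cycle + u.latency
            pending.remove(u)
            schedule.append((cycle, u.name))
        cycle += 1
    return schedule

# Two parallel branches (independent uops) plus a dependent chain.
load_a = Uop("load_a", latency=3)
add    = Uop("add", latency=1, deps=[load_a])
store  = Uop("store", latency=1, deps=[add])
load_b = Uop("load_b", latency=3)                       # independent branch
print(maximal_schedule([load_a, add, store, load_b]))
# [(0, 'load_a'), (0, 'load_b'), (3, 'add'), (4, 'store')]
```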
  • FIG. 2 is a diagram of an example process 200 of scheduling microoperations based on cryptographic-based instructions.
  • the example process 200 may be implemented by an execution scheduler, such as an out-of-order execution scheduler in cryptographic computing systems.
  • a sequence of cryptographic-based instructions is accessed. This operation may correspond to operation 102 of the process 100.
  • Cryptographic-based instructions may refer to instructions that are to be executed in cryptographic computing systems or environments, where data is stored in memory in encrypted form and decrypted/encrypted within a processor core.
  • An example cryptographic-based instruction includes an encrypted load and store operation. The sequence of instructions may be within a particular window of fixed size as in process 100.
  • At 204, at least one encryption-based μop and at least one non-encryption-based μop are generated for each instruction accessed at 202. This operation may correspond to operation 104 of the process 100.
  • the encryption-based μop is based on a block encryption scheme.
  • the at least one encryption-based μop may include a data block encryption μop and the at least one non-encryption-based μop may include a regular, unencrypted load or store μop.
  • the at least one encryption-based μop may include a data block decryption μop and the at least one non-encryption-based μop may include a regular, unencrypted load or store μop.
  • the at least one encryption-based μop may include a data pointer encryption μop and the at least one non-encryption-based μop may include a regular, unencrypted load or store μop.
  • the at least one encryption-based μop may include a data pointer decryption μop and the at least one non-encryption-based μop may include a regular, unencrypted load or store μop.
  • the non-encryption-based μops are expressed as dependent upon the (block) encryption-based μops.
  • This operation may correspond to operation 106 of the process 100, and may accordingly be performed by dependencies logic of the scheduler during generation of an acyclic graph.
  • the scheduler may compute dependencies between μops by identifying regular, unencrypted load or store μops that have resulted from the mapping of cryptographic-based instructions into μops as dependent on at least one of a data block encryption μop, a data block decryption μop, a pointer encryption μop, or a pointer decryption μop.
  • encryption or decryption timings are added to an acyclic graph that expresses μop dependencies.
  • This operation may correspond to operation 108 of the process 100, whereby the acyclic graph is annotated by timing logic of a scheduler. In some embodiments, the timings are otherwise implicitly taken into account by the scheduler.
  • the encryption-based μops are scheduled to execute in parallel with independent μops (e.g., those not originating from the cryptographic-based instructions accessed at 202).
  • This operation may correspond to operation 110 of the process 100, whereby the maximal scheduling is performed by scheduling logic of a scheduler. For instance, the scheduling logic that assigns μops to functional units may ensure that data block and pointer encryption/decryption tasks are scheduled to be executed in parallel with other independent μops.
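  • A minimal, standalone sketch of the instruction-to-μop expansion described at 204: a hypothetical encrypted load is split into a pointer decryption μop, a regular load μop expressed as dependent on it, and a data block decryption μop that consumes the loaded ciphertext; the expansion rules and names are assumptions for illustration, not the patent's exact mapping.

```python
from collections import namedtuple

# A uop here is just a name plus the names of the uops it depends on.
Uop = namedtuple("Uop", ["name", "deps"])

def expand_crypto_load(addr_reg, dst_reg):
    """Hypothetical expansion (at 204) of an encrypted load instruction: the
    regular, unencrypted load uop is expressed as dependent on a pointer
    decryption uop, and a data block decryption uop consumes the loaded
    ciphertext before the result reaches the destination register."""
    ptr_dec  = Uop(f"decrypt_pointer({addr_reg})", deps=[])
    load     = Uop(f"load(mem[{addr_reg}])", deps=[ptr_dec.name])
    data_dec = Uop(f"decrypt_block -> {dst_reg}", deps=[load.name])
    return [ptr_dec, load, data_dec]

for uop in expand_crypto_load("r1", "r2"):
    print(uop.name, "depends on:", uop.deps or "nothing")
```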
  • FIG. 3 is a diagram of another example process 300 of scheduling microoperations based on cryptographic-based instructions.
  • a block cipher encryption scheme is utilized, and the mode used for data block and pointer encryption is the counter mode.
  • data are encrypted by being XOR-ed with a pseudorandom value, called the key stream.
  • the key stream may be produced by encrypting counter blocks using a secret key.
  • Counter blocks comprising tweak bits (as well as the bits of a block-by-block increasing counter) may be encrypted with the same key and the resulting encrypted blocks are XOR-ed with the data.
  • key stream generation microoperations can be parallelized with microoperations for the reading of the data from memory.
  • Cryptographic-based instructions may refer to instructions that are to be executed in cryptographic computing systems or environments, where data is stored in memory in encrypted form and decrypted/encrypted within a processor core.
  • An example cryptographic-based instruction includes an encrypted load and store operation. The sequence of instructions may be within a particular window of fixed size as in processes 100, 200.
  • At 304, at least one counter mode encryption-based μop and at least one non-encryption-based μop are generated for each instruction accessed at 302, in a similar manner as described above with respect to 204 of process 200.
  • non-encryption-based μops that can execute in parallel with the encryption of the counter are identified, and the counter common to the identified μops is encrypted once (instead of multiple times).
  • This operation may correspond to operation 106 of the process 100, and may accordingly be performed by dependencies logic of the scheduler during generation of an acyclic graph.
  • the scheduler logic that computes μop dependencies may ensure that regular unencrypted load μops coming from the cryptographic-based instructions are not expressed as dependent on their associated counter encryption μops.
  • the encryption of the counter blocks may proceed independently from the loading of the data.
  • the corresponding μops of these two steps may be represented by nodes of two separate parallel branches in the dependencies graph.
  • the dependencies logic of the scheduler may also identify a plurality of load and store μops coming from the cryptographic-based instructions, the associated data of which need to be encrypted or decrypted with the same counter value and key stream. For these μops, the dependencies logic may schedule the computation of the key stream only once and represent it as a single node in the dependencies graph.
  • encryption or decryption timings are added to an acyclic graph that expresses μop dependencies.
  • This operation may correspond to operation 108 of the process 100, whereby the acyclic graph is annotated by timing logic of a scheduler. In some embodiments, the timings are otherwise implicitly taken into account by the scheduler.
  • the encryption-based μops are scheduled to execute in parallel with independent μops (e.g., those not originating from the cryptographic-based instructions accessed at 302).
  • This operation may correspond to operation 110 of the process 100, whereby the maximal scheduling is performed by scheduling logic of the scheduler. For instance, the scheduling logic that assigns μops to functional units may ensure that data block and pointer encryption/decryption tasks are scheduled to be executed in parallel with other independent μops.
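  • The sketch below illustrates the counter-mode property the scheduler exploits: keystream generation does not depend on the loaded data, and a counter shared by several accesses to the same block is encrypted only once; the hash-based keystream is a toy stand-in for a real block cipher in counter mode, and all names are illustrative.

```python
import hashlib

def toy_keystream(key: bytes, counter: int, tweak: bytes, length: int) -> bytes:
    """Stand-in for encrypting a counter block under 'key', tweaked by pointer
    bits; a real core would use a block cipher (e.g., AES) in counter mode."""
    block = hashlib.sha256(key + tweak + counter.to_bytes(8, "little")).digest()
    return block[:length]

key, tweak, counter = b"\x01" * 16, b"pointer-bits", 42

# The keystream does not depend on the loaded data, so its generation can be
# scheduled in parallel with (or ahead of) the memory read.
ks = toy_keystream(key, counter, tweak, 16)

# Pretend this encrypted block was produced earlier with the same counter/keystream.
block_ct = bytes(p ^ k for p, k in zip(b"hello, cc world!", ks))

# Two load uops touch different halves of the same block: the common counter is
# encrypted only once and the resulting keystream is simply reused.
lo = bytes(c ^ k for c, k in zip(block_ct[:8], ks[:8]))
hi = bytes(c ^ k for c, k in zip(block_ct[8:], ks[8:]))
print(lo + hi)  # b'hello, cc world!'
```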
  • an out-of-order execution scheduler may support the execution of cryptographic-based instructions in cryptographic computing implementations.
  • the following examples describe certain embodiments wherein the functional units of a core support the execution of the microoperations as discussed above.
  • the encryption and decryption of data is done in the load and store buffers, respectively, of a processor core microarchitecture.
  • FIGS. 4A-4B are diagrams of an example data decryption process in a cryptographic computing system.
  • FIG. 4A shows an example system 400 for implementing the example process 450 of FIG. 4B.
  • the system 400 is implemented entirely within a processor as part of a cryptographic computing system.
  • the system 400 may, in certain embodiments, be executed in response to a plurality of μops issued by an out-of-order scheduler implementing the process 200 of FIG. 2.
  • a load buffer 402 includes one or more load buffer entries 404.
  • the load buffer 402 may be implemented in a memory subsystem of a processor, such as in a memory subsystem of a processor core.
  • Each load buffer entry 404 includes a physical address field 406 and a pointer field 408.
  • a state machine servicing load requests obtains data from a data cache unit 412 (which may, in some implementations, be a store buffer), then uses the pointer field 408 (obtained via read port 410) as a tweak in a decryption operation performed on the encrypted data via a decryption unit 414.
  • the decrypted data are then delivered to an execution unit 416 of the processor core microarchitecture.
  • the decryption unit 414 may be implemented inside the load buffer 402 in some embodiments.
  • a data cache unit stores encrypted data (ciphertext) to be decrypted by the decryption unit 414 as described above.
  • the decryption unit 414 accesses the ciphertext to begin fulfilling a load operation.
  • the decryption unit 414 then decrypts the ciphertext at 454 using an active key obtained from a register along with a tweak value, which, in the example shown, is the value of the pointer field 408 (i.e., the data's linear address).
  • the decryption unit 414 provides the decrypted plaintext to an execution unit 416 to fulfill the load operation.
  • the decryption unit 414 sends a wake-up signal to a reservation station of the processor (which may track the status of register contents and support register renaming).
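  • A simplified model of the load path of FIGS. 4A-4B follows: the entry fields 406/408, the data cache lookup, and the pointer-tweaked decryption. The XOR-with-pad cipher is a toy stand-in for a real tweakable block cipher, and the function names are hypothetical.

```python
import hashlib
from dataclasses import dataclass

def toy_tweakable_decrypt(key: bytes, tweak: bytes, ciphertext: bytes) -> bytes:
    """Toy XOR-with-pad stand-in for a tweakable block cipher decryption."""
    pad = hashlib.sha256(key + tweak).digest()[:len(ciphertext)]
    return bytes(c ^ p for c, p in zip(ciphertext, pad))

@dataclass
class LoadBufferEntry:           # mirrors fields 406/408 of load buffer entry 404
    physical_address: int
    pointer: int                 # linear address, used as the tweak

def service_load(entry: LoadBufferEntry, data_cache: dict, active_key: bytes) -> bytes:
    ciphertext = data_cache[entry.physical_address]        # data from the data cache unit
    tweak = entry.pointer.to_bytes(8, "little")            # pointer obtained via the read port
    return toy_tweakable_decrypt(active_key, tweak, ciphertext)   # plaintext to the execution unit

key = b"\x02" * 16
entry = LoadBufferEntry(physical_address=0x1000, pointer=0x7FFF_0040)
# The toy cipher is an involution (XOR), so it can also create the test ciphertext.
cache = {0x1000: toy_tweakable_decrypt(key, entry.pointer.to_bytes(8, "little"), b"plaintext")}
print(service_load(entry, cache, key))   # b'plaintext'
```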
  • FIGS. 5A-5C are diagrams of another example data decryption process in a cryptographic computing system.
  • FIG. 5A shows an example system 500 for implementing the example processes 550, 560 of FIGS. 5B, 5C.
  • the system 500 is implemented entirely within a processor as part of a cryptographic computing system.
  • a counter mode block cipher is used for encryption/decryption of data.
  • the system 500 may be executed, in certain embodiments, in response to a plurality of μops issued by an out-of-order scheduler implementing the process 300 of FIG. 3.
  • a load buffer 502 includes one or more load buffer entries 504.
  • the load buffer 502 may be implemented in a memory subsystem of a processor, such as in a memory subsystem of a processor core.
  • Each load buffer entry 504 includes a physical address field 506, a pointer field 508, and a key stream 510.
  • the key stream generator 512 produces the key stream 510 by encrypting a counter value loaded from the register 522.
  • the pointer field 508 of the load buffer entry 504 tweaks the encryption operation performed by the key stream generator 512.
  • the encryption performed by the key stream generator 512 may be tweaked by other fields, such as, for example, other cryptographic context values.
  • An XOR operation is then performed on the key stream 510 by the XOR unit 518 (which reads the key stream 510 via the read port 514) and encrypted data coming from the data cache unit 516 (which may, in some embodiments, be a store buffer).
  • the decrypted data are then delivered to an execution unit 520 of the processor core microarchitecture.
  • the key stream generator 512 may be implemented outside the load buffer 502 in some embodiments.
  • the XOR unit 518 may be implemented inside the load buffer 502 in some embodiments.
  • a load buffer entry 504 is created.
  • a key stream generator 512 is invoked.
  • the key stream generator 512 uses a key obtained from a register along with a tweak value (which, in the example shown, is the pointer value 508) to generate a key stream 510, which is stored in the load buffer entry 504.
  • the ciphertext associated with the load operation may become available from a data cache unit (or store buffer).
  • the ciphertext is accessed, and at 564, the ciphertext is XOR-ed with the key stream 510.
  • the result of the XOR operation is provided to an execution unit 520 of the processor core microarchitecture to fulfill the load operation.
  • a wake-up signal is sent to a reservation station of the processor.
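  • The following sketch models the load path of FIGS. 5A-5C, where the keystream 510 is pre-computed when the load buffer entry 504 is created and XOR-ed with the ciphertext when it arrives; the keystream derivation is a toy stand-in for counter-mode encryption, and the names are illustrative.

```python
import hashlib
from dataclasses import dataclass

def toy_keystream(key: bytes, tweak: bytes, length: int) -> bytes:
    """Stand-in for the key stream generator 512 (a real design would encrypt a
    counter block with a block cipher, tweaked by the pointer)."""
    return hashlib.sha256(key + tweak).digest()[:length]

@dataclass
class LoadBufferEntry:                 # fields 506 / 508 / 510
    physical_address: int
    pointer: int
    keystream: bytes = b""

def create_entry(phys: int, pointer: int, key: bytes, size: int) -> LoadBufferEntry:
    entry = LoadBufferEntry(phys, pointer)
    # Keystream is precomputed as soon as the entry is created, before the
    # ciphertext is available from the data cache unit.
    entry.keystream = toy_keystream(key, pointer.to_bytes(8, "little"), size)
    return entry

def on_data_arrival(entry: LoadBufferEntry, ciphertext: bytes) -> bytes:
    # XOR unit 518: combine the arriving ciphertext with the stored keystream.
    return bytes(c ^ k for c, k in zip(ciphertext, entry.keystream))

key = b"\x03" * 16
entry = create_entry(0x2000, 0x7FFF_0080, key, 8)
ct = bytes(p ^ k for p, k in zip(b"loadval!", entry.keystream))   # pretend data cache contents
print(on_data_arrival(entry, ct))   # b'loadval!'
```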
  • FIGS. 6A-6B are diagrams of an example data encryption process in a cryptographic computing system.
  • FIG. 6A shows an example system 600 for implementing the example process 650 of FIG. 6B.
  • the system 600 is implemented entirely within a processor as part of a cryptographic computing system.
  • the system 600 may, in certain embodiments, be executed in response to a plurality of μops issued by an out-of-order scheduler implementing the process 200 of FIG. 2.
  • a store buffer 602 includes one or more store buffer entries 604.
  • the store buffer 602 may be implemented in a memory subsystem of a processor, such as in a memory subsystem of a processor core.
  • Each store buffer entry 604 includes a physical address field 606, a pointer field 608, and store data 610 (which is to be stored).
  • a state machine servicing store requests obtains data from a register file 620 (or execution unit), and an encryption unit 612 uses the pointer field 608 as a tweak during an encryption operation performed on the data obtained from the register file 620.
  • the encrypted data are then passed to a data cache unit 630 (or other execution unit of the CPU core microarchitecture).
  • the encryption unit 612 may be implemented outside the store buffer 602 in some embodiments.
  • plaintext data to be encrypted is available from a register file 620.
  • the store buffer entry 604 is populated with a pointer value 608.
  • the plaintext data is accessed from the register file 620 and, at 656, the plaintext data is encrypted by the encryption unit 612 using an active key obtained from a register 640 along with a tweak (which, in the example shown, is the value of the pointer field 608, i.e., the data's linear address) and stored in the store buffer entry 604 as store data 610.
  • the encrypted store data 610 is provided to a data cache unit 630 (or another waiting execution unit, in some implementations).
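  • A mirror-image sketch of the store path of FIGS. 6A-6B: data from the register file is encrypted with the pointer 608 as a tweak and held in the store buffer entry 604 before being forwarded toward the data cache; again, the XOR-with-pad cipher and the names are illustrative stand-ins.

```python
import hashlib
from dataclasses import dataclass

def toy_tweakable_encrypt(key: bytes, tweak: bytes, data: bytes) -> bytes:
    """Toy XOR-with-pad stand-in for the tweakable encryption unit 612."""
    pad = hashlib.sha256(key + tweak).digest()[:len(data)]
    return bytes(d ^ p for d, p in zip(data, pad))

@dataclass
class StoreBufferEntry:                # fields 606 / 608 / 610
    physical_address: int
    pointer: int
    store_data: bytes = b""

def service_store(entry: StoreBufferEntry, register_value: bytes, active_key: bytes) -> bytes:
    tweak = entry.pointer.to_bytes(8, "little")
    entry.store_data = toy_tweakable_encrypt(active_key, tweak, register_value)
    return entry.store_data            # forwarded to the data cache unit / L1

entry = StoreBufferEntry(0x3000, 0x7FFF_00C0)
print(service_store(entry, b"storeme!", b"\x04" * 16).hex())
```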
  • the pointer values used in the encryption and decryption operations may themselves be encrypted for security purposes.
  • the pointer values may be entirely or partially encrypted (that is, only a portion of the bits of the pointer value may be encrypted).
  • the encrypted pointer values may first be decrypted prior to being used in the encryption/decryption operations described above.
  • FIGS. 7A-7B and 8A-8B describe example embodiments for decrypting pointer values prior to use in the encryption/decryption operations.
  • FIGS. 7A-7B are diagrams of an example pointer decryption process in a cryptographic computing system.
  • FIG. 7A shows an example system 700 for implementing the example process 750 of FIG. 7B.
  • the system 700 is implemented entirely within a processor as part of a cryptographic computing system.
  • the system 700 may, in certain embodiments, be executed in response to a plurality of μops issued by an out-of-order scheduler implementing the process 200 of FIG. 2 or the process 300 of FIG. 3.
  • an address generation unit 702 is configured to decrypt parts of a linear address, which are encrypted for security.
  • a decryption unit 704 in the address generation unit 702 accepts as input an encrypted pointer 710 representing a first encoded linear address, along with a key obtained from a register and a context value tweak input (e.g., the tweak input may come from a separate register, or may consist of unencrypted bits of the same linear address).
  • the decryption unit 704 outputs a decrypted subset of the bits of the encrypted pointer 710, which are then passed to address generation circuitry 706 within the address generation unit 702 along with other address generation inputs.
  • the address generation circuitry 706 generates a second effective linear address to be used in a memory read or write operation based on the inputs.
  • the tweak value (which is also described in FIG. 7B as the “context value”) may be available either statically or dynamically; if it is not available statically, it is loaded dynamically from memory.
  • a request to generate an effective address from an encrypted pointer 710 is received by an address generation unit 702.
  • the address generation unit 702 determines at 754 whether a context value is available statically. If it is available statically, then the value is used at 756; if not, the context value is loaded dynamically from a table in memory at 755.
  • the process then proceeds to 756, where the encrypted pointer 710 is decrypted using an active decryption key obtained from a register along with the obtained context value.
  • a decrypted address is output to the address generation circuitry 706, which then generates, at 760, an effective address for use in read/write operations based on the decrypted address (and any other address generation inputs).
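  • A conceptual sketch of the flow of FIGS. 7A-7B: the context (tweak) value is taken statically if available and otherwise loaded from a table, the encrypted pointer bits are decrypted, and an effective address is formed. The 32/32 split between plaintext and encrypted pointer bits and the toy cipher are assumptions for illustration.

```python
import hashlib
from typing import Optional

def toy_decrypt_bits(key: bytes, tweak: bytes, value: int, bits: int) -> int:
    """Toy stand-in for decryption unit 704; real hardware would use a block cipher."""
    pad = int.from_bytes(hashlib.sha256(key + tweak).digest()[:8], "little")
    return (value ^ pad) & ((1 << bits) - 1)

def generate_effective_address(encrypted_pointer: int, key: bytes,
                               static_context: Optional[bytes],
                               context_table: dict, table_index: int) -> int:
    # 754/755/756: use the context value statically if available, otherwise
    # load it dynamically from a table in memory.
    context = static_context if static_context is not None else context_table[table_index]
    upper_plain = encrypted_pointer >> 32          # assumed: upper bits are not encrypted
    enc_slice   = encrypted_pointer & 0xFFFF_FFFF  # assumed: low 32 bits are encrypted
    decrypted   = toy_decrypt_bits(key, context, enc_slice, 32)
    # Address generation circuitry 706 combines the decrypted bits with the other
    # address generation inputs to produce the effective linear address (760).
    return (upper_plain << 32) | decrypted

table = {0: b"dynamic-context"}
print(hex(generate_effective_address(0x1234_DEAD_BEEF, b"\x05" * 16, None, table, 0)))
```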
  • FIGS. 8A-8B are diagrams of an example base address slice decryption process in a cryptographic computing system.
  • FIG. 8A shows an example system 800 for implementing the example process 850 of FIG. 8B.
  • the system 800 is implemented entirely within a processor as part of a cryptographic computing system.
  • the system 800 may, in certain embodiments, be executed in response to a plurality of μops issued by an out-of-order scheduler implementing the process 200 of FIG. 2 or the process 300 of FIG. 3.
  • an address generation unit 802 is configured to decrypt parts of a linear address, as described above with respect to FIGS. 7A-7B.
  • in this embodiment, the bit set that is encrypted (i.e., slice 824) is a slice of the base address within the encoded linear address 820.
  • the upper bits 822 of the encoded linear address 820 may denote the data object size, type, format, or other security information associated with the encoded linear address 820.
  • the encoded linear address 820 also includes an offset 826.
  • a decryption unit 804 in the address generation unit 802 accepts as input the encrypted base address slice 824, along with a key obtained from a register and a context value tweak input (e.g., the tweak input may come from a separate register, or may consist of unencrypted bits of the same linear address).
  • the decryption unit 804 outputs a decrypted base address slice.
  • the decrypted base address slice is then provided to a concatenator/adder unit 806, which concatenates the decrypted base address slice with a set of complementary upper bits from a register or context table entry and the offset 826 to yield an intermediate base address.
  • the set of complementary bits is different from the upper bits 822; it does not convey metadata information (e.g., data object size, type, format, etc.) but instead supplies the missing bits of the effective linear address being constructed, denoting a location in the linear address space.
  • the intermediate base address is then combined with the upper bits 822 by the OR unit 808 to yield a tagged base address.
  • in other embodiments, the upper bits 822 may be combined using an XOR unit, an ADD unit, or a logical AND unit.
  • the upper bits 822 may act as a tweak value and tweak the decryption of the middle slice of the address.
  • the tagged base address is then provided to address generation circuitry 810 in the address generation unit 802, along with other address generation inputs.
  • the address generation circuitry 810 then generates an effective address to be used in a memory read or write operation based on the inputs.
  • the upper bits 822 may be used to determine a number of intermediate lower address bits (e.g., from offset 826) that would be used as a tweak to the encrypted base address 824.
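  • The sketch below walks through the decode of an encoded linear address 820 into a tagged base address as described above; the exact bit-field widths (8 tag bits, a 32-bit encrypted slice, a 24-bit offset) and the toy cipher are assumptions chosen only to make the example concrete.

```python
import hashlib

# Assumed 64-bit encoding, for illustration only:
#   bits 63:56 = upper/tag bits 822, bits 55:24 = encrypted base slice 824, bits 23:0 = offset 826
UPPER_SHIFT, SLICE_SHIFT, SLICE_BITS, OFFSET_BITS = 56, 24, 32, 24

def toy_decrypt_slice(key: bytes, tweak: bytes, value: int) -> int:
    """Toy stand-in for decryption unit 804 (a real design would use a block cipher)."""
    pad = int.from_bytes(hashlib.sha256(key + tweak).digest()[:4], "little")
    return value ^ pad    # 32-bit XOR keeps the toy cipher involutive

def decode_encoded_address(encoded: int, key: bytes, context: bytes, complementary_upper: int) -> int:
    upper  = encoded >> UPPER_SHIFT                               # metadata/tag bits 822
    slice_ = (encoded >> SLICE_SHIFT) & ((1 << SLICE_BITS) - 1)   # encrypted base slice 824
    offset = encoded & ((1 << OFFSET_BITS) - 1)                   # offset 826
    # The upper bits may also act as part of the tweak for the slice decryption.
    base_slice = toy_decrypt_slice(key, context + bytes([upper]), slice_)
    # Concatenator/adder 806: complementary upper bits || decrypted slice || offset.
    intermediate = (complementary_upper << (SLICE_BITS + OFFSET_BITS)) | (base_slice << OFFSET_BITS) | offset
    # OR unit 808: merge the tag bits back in to form the tagged base address.
    return intermediate | (upper << UPPER_SHIFT)

print(hex(decode_encoded_address(0xAB_12345678_000040, b"\x06" * 16, b"ctx", complementary_upper=0)))
```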
  • a Translation Lookaside Buffer may be used that maps linear addresses (which may also be referred to as virtual addresses) to physical addresses.
  • a TLB entry is populated after a page miss where a page walk of the paging structures determines the correct linear to physical memory mapping, caching the linear to physical mapping for fast lookup.
  • a TLB (for example, the data TLB or dTLB) may instead cache the encoded address 820 to physical address mapping, using a Content Addressable Memory (CAM) circuit to match the encrypted/encoded address 820 to the correct physical address translation.
  • CAM Content Addressable Memory
  • the TLB may determine the physical memory mapping prior to the completion of the decryption unit 804 revealing the decrypted linear address, and may immediately proceed with processing the instructions dependent on this cached memory mapping.
  • Other embodiments may instead use one or both of the offset 826 and upper bits 822 of the address 820 as a partial linear address mapping into the TLB (that is, the TLB lookup is performed only against the plaintext subset of the address 820 ), and proceed to use the physical memory translation, if found, verifying the remainder of the decrypted base address ( 824 ) to determine the full linear address is a match (TLB hit) after completion of the decryption 804 .
  • Such embodiments may speculatively proceed with processing and nuke the processor pipeline if the final decrypted linear address match is found to be a false positive hit in the TLB, preventing the execution of dependent instructions, or cleaning up the execution of dependent instructions by returning processor register state and/or memory to its prior state before the TLB misprediction (incorrect memory mapping).
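  • The partial-lookup behavior described in the two preceding items can be pictured with the following sketch, in which the TLB is probed using only the plaintext portion of the encoded address and the hit is later confirmed or squashed once the encrypted slice has been decrypted. The TLB organization, field names, and helper functions are illustrative assumptions, not the dTLB design of this disclosure.

```python
# Sketch: partial TLB lookup on the plaintext subset of an encoded address,
# with verification after the encrypted slice has been decrypted.

class PartialTLB:
    def __init__(self):
        # Maps (upper_bits, offset_page) -> (full_linear_page, physical_page)
        self.entries = {}

    def install(self, upper_bits, offset_page, full_linear_page, physical_page):
        self.entries[(upper_bits, offset_page)] = (full_linear_page, physical_page)

    def partial_lookup(self, upper_bits, offset_page):
        """Probe using only plaintext pointer bits; the result may be a false positive."""
        return self.entries.get((upper_bits, offset_page))

def speculative_translate(tlb, upper_bits, offset_page, decrypt_fn):
    hit = tlb.partial_lookup(upper_bits, offset_page)
    if hit is None:
        return None, "miss"                      # fall back to a page walk
    predicted_linear, physical = hit
    # Dependent instructions may proceed speculatively here...
    actual_linear = decrypt_fn()                 # completes later (decryption unit 804)
    if actual_linear == predicted_linear:
        return physical, "confirmed"
    return None, "nuke"                          # false positive: squash dependent uops

if __name__ == "__main__":
    tlb = PartialTLB()
    tlb.install(upper_bits=0x3, offset_page=0x10, full_linear_page=0xABCDE, physical_page=0x555)
    print(speculative_translate(tlb, 0x3, 0x10, decrypt_fn=lambda: 0xABCDE))   # confirmed
    print(speculative_translate(tlb, 0x3, 0x10, decrypt_fn=lambda: 0xFFFFF))   # nuke
```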
  • a subset of the upper bits 822 indicates address adjustment, which may involve adding an offset value (a power of two) to the effective linear address that is produced by the address generation unit.
  • the offset value may include a bit string where only a single bit is equal to 1 and all other bits are equal to zero.
  • address adjustment may involve subtracting from the effective linear address an offset value, which is a power of two. Adjustment may be included in certain implementations because some memory object allocations cross power of two boundaries.
  • the smallest power-of-two box that contains a memory object allocation is also a unique property of the allocation and may be used for cryptographically tweaking the encryption of the base address 824 associated with the allocation; a sketch of this computation follows the next item.
  • allocations that cross power of two boundaries may be associated with exceedingly large power-of-two boxes. Such large boxes may be polluted with data of other allocations, which, even though cryptographically isolated, may still be accessed by software (e.g., as a result of a software bug).
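  • As a concrete illustration of the power-of-two box mentioned above, the helper below computes the smallest power-of-two-aligned box that contains an allocation. It is an assumed formulation for illustration only, not circuitry of the disclosure.

```python
# Sketch: smallest power-of-two-aligned box containing an allocation.

def smallest_pow2_box(start: int, size: int):
    """Return (box_base, box_size) of the smallest power-of-two aligned box
    that contains the byte range [start, start + size)."""
    end = start + size - 1
    box_size = 1
    # Grow the box until one aligned box of that size covers both endpoints.
    while (start // box_size) != (end // box_size):
        box_size <<= 1
    box_base = (start // box_size) * box_size
    return box_base, box_size

if __name__ == "__main__":
    # An allocation that crosses a small power-of-two boundary needs a much larger box.
    print(smallest_pow2_box(0x0FF0, 0x40))   # crosses 0x1000 -> (0x0, 0x2000)
```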
  • the adjustment may proceed in parallel with the decryption of the base address bits 824 .
  • performing the adjustment involves: (i) passing the upper bits 822 through a decoder circuit; (ii) obtaining the outputs of the decoder circuit; (iii) using those decoder outputs together with a first offset value 826 to form a second offset value to add to the bits of the linear address which are unencrypted; (iv) obtaining a carry out value from this addition; and (v) adding the carry out value to the decrypted address bits 824 once they are produced.
  • a partial TLB lookup process may begin as soon as the adjustment process has produced the linear address bits which are used by the partial TLB lookup process.
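  • Steps (i)-(v) of the adjustment described above, and the way the carry out of the low-order addition is later folded into the decrypted base address bits, might be modeled as follows. The bit widths and the decoder encoding are assumptions made for illustration.

```python
# Sketch of the address-adjustment steps (i)-(v), with assumed widths.
# The decoder maps an assumed subset of upper bits 822 to a power-of-two offset.

LOW_BITS = 16            # unencrypted low address bits (e.g., offset 826)
ADJ_FIELD_MASK = 0x7     # assumed subset of upper bits 822 selecting the adjustment

def decode_adjustment(upper_bits: int) -> int:
    """(i)-(ii) Decode the adjustment field into a power-of-two offset (0 means none)."""
    sel = upper_bits & ADJ_FIELD_MASK
    return 0 if sel == 0 else 1 << (12 + sel)    # hypothetical encoding

def adjusted_low_add(low_bits: int, first_offset: int, upper_bits: int):
    """(iii)-(iv) Add the combined offset to the unencrypted low bits, returning
    the low result and the carry out destined for the decrypted slice."""
    second_offset = first_offset + decode_adjustment(upper_bits)
    total = low_bits + second_offset
    return total & ((1 << LOW_BITS) - 1), total >> LOW_BITS

def fold_carry(decrypted_slice: int, carry_out: int) -> int:
    """(v) Add the carry out to the decrypted base address slice once it is available."""
    return decrypted_slice + carry_out

if __name__ == "__main__":
    low, carry = adjusted_low_add(low_bits=0xFFF0, first_offset=0x20, upper_bits=0x1)
    print(hex(low), carry, hex(fold_carry(0x00ABCDEF, carry)))
```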
  • the tweak value (also described in FIG. 8B as the “context value”) may be available either statically or dynamically—if it is not available statically, it is loaded dynamically from memory.
  • a request to generate an effective address from an encrypted base address slice 824 is received by the address generation unit 802 .
  • the address generation unit 802 determines at 854 whether a context value is available statically. If it is available statically, then the value is used at 856 ; if not, the context value is loaded dynamically from a table in memory at 855 .
  • the encrypted base address slice 824 is decrypted using an active decryption key obtained from a register along with the context value.
  • the address generation unit 802 determines whether both (1) the memory access is being performed with a static context value, and (2) the input context value has its dynamic flag bit cleared.
  • the dynamic flag bit may be a flag bit in the pointer that indicates whether context information is available statically or dynamically. For instance, if an object represented by the pointer is not entirely within the bounds of a statically addressable memory region, then a dynamic flag bit may be set in the pointer.
  • the dynamic flag bit may indicate that context information is to be dynamically obtained, for example, via a pointer context table. In other words, there may be a region of memory in which the upper bits for a base address can be supplied statically from a control register, and allocations outside that region may need to draw their upper bits for the base address dynamically from a table entry in memory.
  • if both are true, the process 850 moves to 860 ; if one or both are not true, then the upper base address bits are loaded dynamically from a table entry in memory at 859 before proceeding to 860 .
  • the operations of 858 can be performed alongside those of 854 , or the operations may be merged.
  • the operations of 859 can be performed alongside those of 855 , or the operations may be merged.
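  • The static-versus-dynamic context selection described in the preceding items (854/855/856 and the dynamic flag bit) can be sketched as below. The flag-bit position, the table index field, and the table layout are hypothetical and chosen only for illustration.

```python
# Sketch of the static/dynamic context selection in process 850.
# Register fields, the flag-bit position, and the table format are assumed.

DYNAMIC_FLAG_BIT = 1 << 63   # assumed position of the dynamic flag bit in the pointer

def resolve_context(pointer, static_context, context_table):
    """Return the context value (tweak) used to decrypt the base address slice 824."""
    if static_context is not None and not (pointer & DYNAMIC_FLAG_BIT):
        # 854/856: the context is supplied statically (e.g., from a control register).
        return static_context
    # 855/859: the context is loaded dynamically from a pointer context table in memory.
    table_index = (pointer >> 48) & 0xFFF        # hypothetical index field in the pointer
    return context_table[table_index]

if __name__ == "__main__":
    table = {0x001: 0xCAFE}
    static_ptr = 0x0001_0000_0000_1000
    dynamic_ptr = static_ptr | DYNAMIC_FLAG_BIT
    print(hex(resolve_context(static_ptr, 0xBEEF, table)))    # -> 0xbeef (static)
    print(hex(resolve_context(dynamic_ptr, 0xBEEF, table)))   # -> 0xcafe (from table)
```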
  • at 860 , the concatenator/adder unit 806 of the address generation unit 802 concatenates the upper base address bits with the decrypted base address slice, and at 862 , adds the offset 826 to the concatenation.
  • the address generation unit 802 recombines tag information from the upper bits 822 with the result of the concatenation/addition of 860 and 862 via the OR unit 808 .
  • the result of the concatenation, addition, and ORing is provided to address generation circuitry 810 in the address generation unit 802 , along with other address generation inputs.
  • the address generation circuitry 810 generates an effective address to be used in a memory read or write operation based on the inputs.
  • FIG. 9 is a flow diagram of an example process 900 of executing cryptographic-based instructions in a cryptographic computing system.
  • the example process 900 may be performed by circuitry of a microprocessor pipeline of a processor (e.g., one or more of the components described above, which may be implemented in a processor configured similar to the processor 1000 of FIG. 10 ) in response to accessing a set of cryptographic-based instructions.
  • in some embodiments, the circuitry of the microprocessor pipeline performs each of the operations described, while in other embodiments the circuitry of the microprocessor pipeline performs only a subset of the operations described.
  • encrypted data stored in a data cache unit of a processor (e.g., data cache unit 412 of FIG. 4A , data cache unit 516 of FIG. 5A , or data cache unit 1030 of FIG. 10 ) is accessed.
  • the encrypted data is decrypted based on a pointer value.
  • the decryption may be performed in a manner similar to that described above with respect to FIGS. 4A-4B , FIGS. 5A-5B , or in another manner.
  • the pointer value or a portion thereof may itself be encrypted.
  • the pointer value may first be decrypted/decoded, for example, in a similar manner to that described above with respect to FIGS. 7A-7B or FIGS. 8A-8B .
  • a cryptographic-based instruction is executed based on data obtained from the decryption performed at 904 .
  • the instruction may be executed on an execution unit of the processor (e.g., execution unit 416 of FIG. 4A , execution unit 520 of FIG. 5A , or execution unit(s) 1016 of FIG. 10 ).
  • a result of the execution performed at 906 is encrypted based on another pointer value.
  • the encryption may be performed in a similar manner to that described above with respect to FIGS. 6A-6B .
  • the encrypted result is stored in a data cache unit of the processor or provided to another execution unit, as illustrated in the sketch below.
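  • A toy end-to-end model of process 900 follows: encrypted data is loaded from the data cache unit, decrypted using one pointer value, operated on, re-encrypted using another pointer value, and stored back. The pointer-tweaked XOR keystream is a stand-in for the actual cipher, and all helper names are assumptions rather than the pipeline circuitry itself.

```python
# Toy end-to-end model of process 900: decrypt-on-load, execute, encrypt-on-store.
# The pointer-tweaked XOR "cipher" is a placeholder, not a real block/stream cipher.

def keystream(pointer: int, key: int, width: int = 8) -> bytes:
    # Stand-in keystream derived from the pointer and key (NOT cryptographically secure).
    return bytes(((pointer >> (8 * (i % 8))) ^ (key >> (8 * (i % 8)))) & 0xFF
                 for i in range(width))

def xor_bytes(data: bytes, ks: bytes) -> bytes:
    return bytes(d ^ k for d, k in zip(data, ks))

def process_900(dcu: dict, load_ptr: int, store_ptr: int, key: int) -> dict:
    encrypted = dcu[load_ptr]                                        # 902: access encrypted data
    plaintext = xor_bytes(encrypted, keystream(load_ptr, key))       # 904: decrypt using a pointer
    result = bytes((b + 1) & 0xFF for b in plaintext)                # 906: execute instruction (toy op)
    dcu[store_ptr] = xor_bytes(result, keystream(store_ptr, key))    # 908/910: encrypt and store
    return dcu

if __name__ == "__main__":
    key = 0x0F0F0F0F0F0F0F0F
    data = bytes(range(8))
    cache = {0x1000: xor_bytes(data, keystream(0x1000, key))}
    cache = process_900(cache, load_ptr=0x1000, store_ptr=0x2000, key=key)
    print(xor_bytes(cache[0x2000], keystream(0x2000, key)).hex())    # incremented plaintext
```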
  • the example processes described above may include additional or different operations, and the operations may be performed in the order shown or in another order.
  • one or more of the operations shown in the flow diagrams are implemented as processes that include multiple operations, sub-processes, or other types of routines.
  • operations can be combined, performed in another order, performed in parallel, iterated, or otherwise repeated or performed in another manner.
  • certain functionality is described herein as being performed by load or store buffers, address generation units, or other certain aspects of a processor, it will be understood that the teachings of the present disclosure may be implemented in other examples by other types of execution units in a processor, including but not limited to separate data block encryption units, separate key stream generation units, or separate data pointer decryption units.
  • FIGS. 10-12 are block diagrams of example computer architectures that may be used in accordance with embodiments disclosed herein.
  • any computer architecture designs known in the art for processors and computing systems may be used.
  • system designs and configurations known in the arts for laptops, desktops, handheld PCs, personal digital assistants, tablets, engineering workstations, servers, network devices, appliances, network hubs, routers, switches, embedded processors, digital signal processors (DSPs), graphics devices, video game devices, set-top boxes, micro controllers, smart phones, mobile devices, wearable electronic devices, portable media players, hand held devices, and various other electronic devices are also suitable for embodiments of computing systems described herein.
  • suitable computer architectures for embodiments disclosed herein can include, but are not limited to, configurations illustrated in FIGS. 10-12 .
  • FIG. 10 is an example illustration of a processor according to an embodiment.
  • Processor 1000 is an example of a type of hardware device that can be used in connection with the implementations above.
  • Processor 1000 may be any type of processor, such as a microprocessor, an embedded processor, a digital signal processor (DSP), a network processor, a multi-core processor, a single core processor, or other device to execute code.
  • a processing element may alternatively include more than one of processor 1000 illustrated in FIG. 10 .
  • Processor 1000 may be a single-threaded core or, for at least one embodiment, the processor 1000 may be multi-threaded in that it may include more than one hardware thread context (or “logical processor”) per core.
  • FIG. 10 also illustrates a memory 1002 coupled to processor 1000 in accordance with an embodiment.
  • Memory 1002 may be any of a wide variety of memories (including various layers of memory hierarchy) as are known or otherwise available to those of skill in the art.
  • Such memory elements can include, but are not limited to, random access memory (RAM), read only memory (ROM), logic blocks of a field programmable gate array (FPGA), erasable programmable read only memory (EPROM), and electrically erasable programmable ROM (EEPROM).
  • Processor 1000 can execute any type of instructions associated with algorithms, processes, or operations detailed herein. Generally, processor 1000 can transform an element or an article (e.g., data) from one state or thing to another state or thing.
  • Code 1004 which may be one or more instructions to be executed by processor 1000 , may be stored in memory 1002 , or may be stored in software, hardware, firmware, or any suitable combination thereof, or in any other internal or external component, device, element, or object where appropriate and based on particular needs.
  • processor 1000 can follow a program sequence of instructions indicated by code 1004 .
  • Each instruction enters a front-end logic 1006 and is processed by one or more decoders 1008 .
  • the decoder may generate, as its output, a microoperation such as a fixed width microoperation in a predefined format, or may generate other instructions, microinstructions, or control signals that reflect the original code instruction.
  • Front-end logic 1006 also includes register renaming logic 1010 and scheduling logic 1012 (which includes a reservation station 1013 ), which generally allocate resources and queue the operation corresponding to the instruction for execution.
  • scheduling logic 1012 includes an in-order or an out-of-order execution scheduler.
  • Processor 1000 can also include execution logic 1014 having a set of execution units 1016 a , . . . , 1016 n , an address generation unit 1017 , etc. Some embodiments may include a number of execution units dedicated to specific functions or sets of functions. Other embodiments may include only one execution unit or one execution unit that can perform a particular function. Execution logic 1014 performs the operations specified by code instructions.
  • back-end logic 1018 can retire the instructions of code 1004 .
  • processor 1000 allows out-of-order execution but requires in-order retirement of instructions.
  • Retirement logic 1020 may take a variety of known forms (e.g., re-order buffers or the like). In this manner, processor 1000 is transformed during execution of code 1004 , at least in terms of the output generated by the decoder, hardware registers and tables utilized by register renaming logic 1010 , and any registers (not shown) modified by execution logic 1014 .
  • Processor 1000 can also include a memory subsystem 1022 , which includes a load buffer 1024 , a decryption unit 1025 , a store buffer 1026 , an encryption unit 1027 , a Translation Lookaside Buffer (TLB) 1028 , a data cache unit (DCU) 1030 , and a Level-2 (L2) cache unit 1032 .
  • the load buffer 1024 processes microoperations for memory/cache load operations
  • the store buffer 1026 processes microoperations for memory/cache store operations.
  • the data stored in the data cache unit 1030 , the L2 cache unit 1032 , and/or the memory 1002 may be encrypted, and may be encrypted (prior to storage) and decrypted (prior to processing by one or more execution units 1016 ) entirely within the processor 1000 as described herein.
  • the decryption unit 1025 may decrypt encrypted data stored in the DCU 1030 , e.g., during load operations processed by the load buffer 1024 as described above
  • the encryption unit 1027 may encrypt data to be stored in the DCU 1030 , e.g., during store operations processed by the store buffer 1026 as described above.
  • the decryption unit 1025 may be implemented inside the load buffer 1024 and/or the encryption unit 1027 may be implemented inside the store buffer 1026 .
  • the Translation Lookaside Buffer (TLB) 1028 maps linear addresses to physical addresses and performs other functionality as described herein.
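  • One way to picture the memory subsystem 1022 described in the preceding items is as a load path that decrypts data on its way to the execution units and a store path that encrypts data on its way to the DCU. The skeleton below is an organizational illustration only, with assumed class and method names and a stand-in cipher.

```python
# Organizational sketch of memory subsystem 1022 (load buffer + decryption unit,
# store buffer + encryption unit, data cache unit). Names are illustrative.

class MemorySubsystem:
    def __init__(self, decrypt_fn, encrypt_fn):
        self.dcu = {}                 # data cache unit 1030 (address -> encrypted bytes)
        self.load_buffer = []         # load buffer 1024
        self.store_buffer = []        # store buffer 1026
        self.decrypt = decrypt_fn     # decryption unit 1025
        self.encrypt = encrypt_fn     # encryption unit 1027

    def load(self, pointer):
        self.load_buffer.append(pointer)
        encrypted = self.dcu[pointer]
        return self.decrypt(encrypted, pointer)     # decrypt before the execution units

    def store(self, pointer, plaintext):
        self.store_buffer.append(pointer)
        self.dcu[pointer] = self.encrypt(plaintext, pointer)   # encrypt before the DCU/L2

if __name__ == "__main__":
    toy = lambda data, ptr: bytes(b ^ (ptr & 0xFF) for b in data)   # stand-in cipher
    mem = MemorySubsystem(decrypt_fn=toy, encrypt_fn=toy)
    mem.store(0x40, b"secret")
    print(mem.dcu[0x40], mem.load(0x40))
```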
  • a processing element may include other elements on a chip with processor 1000 .
  • a processing element may include memory control logic along with processor 1000 .
  • the processing element may include I/O control logic and/or may include I/O control logic integrated with memory control logic.
  • the processing element may also include one or more caches.
  • non-volatile memory such as flash memory or fuses may also be included on the chip with processor 1000 .
  • FIG. 11A is a block diagram illustrating both an example in-order pipeline and an example register renaming, out-of-order issue/execution pipeline according to one or more embodiments of this disclosure.
  • FIG. 11B is a block diagram illustrating both an example embodiment of an in-order architecture core and an example register renaming, out-of-order issue/execution architecture core to be included in a processor according to one or more embodiments of this disclosure.
  • the solid lined boxes in FIGS. 11A-11B illustrate the in-order pipeline and in-order core, while the optional addition of the dashed lined boxes illustrates the register renaming, out-of-order issue/execution pipeline and core. Given that the in-order aspect is a subset of the out-of-order aspect, the out-of-order aspect will be described.
  • a processor pipeline 1100 includes a fetch stage 1102 , a length decode stage 1104 , a decode stage 1106 , an allocation stage 1108 , a renaming stage 1110 , a schedule (also known as a dispatch or issue) stage 1112 , a register read/memory read stage 1114 , an execute stage 1116 , a write back/memory write stage 1118 , an exception handling stage 1122 , and a commit stage 1124 .
  • FIG. 11B shows processor core 1190 including a front end unit 1130 coupled to an execution engine unit 1150 , and both are coupled to a memory unit 1170 .
  • Processor core 1190 and memory unit 1170 are examples of the types of hardware that can be used in connection with the implementations shown and described herein.
  • the core 1190 may be a reduced instruction set computing (RISC) core, a complex instruction set computing (CISC) core, a very long instruction word (VLIW) core, or a hybrid or alternative core type.
  • the core 1190 may be a special-purpose core, such as, for example, a network or communication core, compression engine, coprocessor core, general purpose computing graphics processing unit (GPGPU) core, graphics core, or the like.
  • processor core 1190 and its components represent example architecture that could be used to implement logical processors and their respective components.
  • the front end unit 1130 includes a branch prediction unit 1132 coupled to an instruction cache unit 1134 , which is coupled to an instruction translation lookaside buffer (TLB) unit 1136 , which is coupled to an instruction fetch unit 1138 , which is coupled to a decode unit 1140 .
  • the decode unit 1140 (or decoder) may decode instructions, and generate as an output one or more micro-operations, micro-code entry points, microinstructions, other instructions, or other control signals, which are decoded from, or which otherwise reflect, or are derived from, the original instructions.
  • the decode unit 1140 may be implemented using various different mechanisms.
  • the core 1190 includes a microcode ROM or other medium that stores microcode for certain macroinstructions (e.g., in decode unit 1140 or otherwise within the front end unit 1130 ).
  • the decode unit 1140 is coupled to a rename/allocator unit 1152 in the execution engine unit 1150 .
  • the execution engine unit 1150 includes the rename/allocator unit 1152 coupled to a retirement unit 1154 and a set of one or more scheduler unit(s) 1156 .
  • the scheduler unit(s) 1156 represents any number of different schedulers, including reservation stations, central instruction window, etc.
  • the scheduler unit(s) 1156 is coupled to the physical register file(s) unit(s) 1158 .
  • Each of the physical register file(s) units 1158 represents one or more physical register files, different ones of which store one or more different data types, such as scalar integer, scalar floating point, packed integer, packed floating point, vector integer, vector floating point, status (e.g., an instruction pointer that is the address of the next instruction to be executed), etc.
  • the physical register file(s) unit 1158 comprises a vector registers unit, a write mask registers unit, and a scalar registers unit. These register units may provide architectural vector registers, vector mask registers, and general purpose registers (GPRs). In at least some embodiments described herein, register units 1158 are examples of the types of hardware that can be used in connection with the implementations shown and described herein (e.g., registers 112 ).
  • the physical register file(s) unit(s) 1158 is overlapped by the retirement unit 1154 to illustrate various ways in which register renaming and out-of-order execution may be implemented (e.g., using a reorder buffer(s) and a retirement register file(s); using a future file(s), a history buffer(s), and a retirement register file(s); using register maps and a pool of registers; etc.).
  • the retirement unit 1154 and the physical register file(s) unit(s) 1158 are coupled to the execution cluster(s) 1160 .
  • the execution cluster(s) 1160 includes a set of one or more execution units 1162 and a set of one or more memory access units 1164 .
  • the execution units 1162 may perform various operations (e.g., shifts, addition, subtraction, multiplication) and on various types of data (e.g., scalar floating point, packed integer, packed floating point, vector integer, vector floating point). While some embodiments may include a number of execution units dedicated to specific functions or sets of functions, other embodiments may include only one execution unit or multiple execution units that all perform all functions. Execution units 1162 may also include an address generation unit (AGU) to calculate addresses used by the core to access main memory and a page miss handler (PMH).
  • the scheduler unit(s) 1156 , physical register file(s) unit(s) 1158 , and execution cluster(s) 1160 are shown as being possibly plural because certain embodiments create separate pipelines for certain types of data/operations (e.g., a scalar integer pipeline, a scalar floating point/packed integer/packed floating point/vector integer/vector floating point pipeline, and/or a memory access pipeline that each have their own scheduler unit, physical register file(s) unit, and/or execution cluster—and in the case of a separate memory access pipeline, certain embodiments are implemented in which only the execution cluster of this pipeline has the memory access unit(s) 1164 ). It should also be understood that where separate pipelines are used, one or more of these pipelines may be out-of-order issue/execution and the rest in-order.
  • the set of memory access units 1164 is coupled to the memory unit 1170 , which includes a data TLB unit 1172 coupled to a data cache unit 1174 coupled to a level 2 (L2) cache unit 1176 .
  • the memory access units 1164 may include a load unit, a store address unit, and a store data unit, each of which is coupled to the data TLB unit 1172 in the memory unit 1170 .
  • the instruction cache unit 1134 is further coupled to a level 2 (L2) cache unit 1176 in the memory unit 1170 .
  • the L2 cache unit 1176 is coupled to one or more other levels of cache and eventually to a main memory.
  • a page miss handler may also be included in core 1190 to look up an address mapping in a page table if no match is found in the data TLB unit 1172 .
  • the example register renaming, out-of-order issue/execution core architecture may implement the pipeline 1100 as follows: 1) the instruction fetch 1138 performs the fetch and length decoding stages 1102 and 1104 ; 2) the decode unit 1140 performs the decode stage 1106 ; 3) the rename/allocator unit 1152 performs the allocation stage 1108 and renaming stage 1110 ; 4) the scheduler unit(s) 1156 performs the schedule stage 1112 ; 5) the physical register file(s) unit(s) 1158 and the memory unit 1170 perform the register read/memory read stage 1114 , and the execution cluster 1160 performs the execute stage 1116 ; 6) the memory unit 1170 and the physical register file(s) unit(s) 1158 perform the write back/memory write stage 1118 ; 7) various units may be involved in the exception handling stage 1122 ; and 8) the retirement unit 1154 and the physical register file(s) unit(s) 1158 perform the commit stage 1124 .
  • the core 1190 may support one or more instruction sets (e.g., the x86 instruction set (with some extensions that have been added with newer versions); the MIPS instruction set of MIPS Technologies of Sunnyvale, Calif.; the ARM instruction set (with optional additional extensions such as NEON) of ARM Holdings of Sunnyvale, Calif.), including the instruction(s) described herein.
  • the core 1190 includes logic to support a packed data instruction set extension (e.g., AVX1, AVX2), thereby allowing the operations used by many multimedia applications to be performed using packed data.
  • the core may support multithreading (executing two or more parallel sets of operations or threads), and may do so in a variety of ways including time sliced multithreading, simultaneous multithreading (where a single physical core provides a logical core for each of the threads that physical core is simultaneously multithreading), or a combination thereof (e.g., time sliced fetching and decoding and simultaneous multithreading thereafter such as in the Intel® Hyperthreading technology). Accordingly, in at least some embodiments, multi-threaded enclaves may be supported.
  • while register renaming is described in the context of out-of-order execution, it should be understood that register renaming may be used in an in-order architecture.
  • while the illustrated embodiment of the processor also includes separate instruction and data cache units 1134 / 1174 and a shared L2 cache unit 1176 , alternative embodiments may have a single internal cache for both instructions and data, such as, for example, a Level 1 (L1) internal cache, or multiple levels of internal cache.
  • the system may include a combination of an internal cache and an external cache that is external to the core and/or the processor. Alternatively, all of the cache may be external to the core and/or the processor.
  • FIG. 12 illustrates a computing system 1200 that is arranged in a point-to-point (PtP) configuration according to an embodiment.
  • FIG. 12 shows a system where processors, memory, and input/output devices are interconnected by a number of point-to-point interfaces.
  • one or more of the computing systems or computing devices described herein may be configured in the same or similar manner as computing system 1200 .
  • Processors 1270 and 1280 may be implemented as single core processors 1274 a and 1284 a or multi-core processors 1274 a - 1274 b and 1284 a - 1284 b .
  • Processors 1270 and 1280 may each include a cache 1271 and 1281 used by their respective core or cores.
  • a shared cache (not shown) may be included in either processor or outside of both processors, yet connected with the processors via a P-P interconnect, such that either or both processors' local cache information may be stored in the shared cache if a processor is placed into a low power mode.
  • Processors 1270 and 1280 may also each include integrated memory controller logic (MC) 1272 and 1282 to communicate with memory elements 1232 and 1234 , which may be portions of main memory locally attached to the respective processors.
  • memory controller logic 1272 and 1282 may be discrete logic separate from processors 1270 and 1280 .
  • Memory elements 1232 and/or 1234 may store various data to be used by processors 1270 and 1280 in achieving operations and functionality outlined herein.
  • Processors 1270 and 1280 may be any type of processor, such as those discussed in connection with other figures.
  • Processors 1270 and 1280 may exchange data via a point-to-point (PtP) interface 1250 using point-to-point interface circuits 1278 and 1288 , respectively.
  • Processors 1270 and 1280 may each exchange data with an input/output (I/O) subsystem 1290 via individual point-to-point interfaces 1252 and 1254 using point-to-point interface circuits 1276 , 1286 , 1294 , and 1298 .
  • I/O subsystem 1290 may also exchange data with a high-performance graphics circuit 1238 via a high-performance graphics interface 1239 , using an interface circuit 1292 , which could be a PtP interface circuit.
  • the high-performance graphics circuit 1238 is a special-purpose processor, such as, for example, a high-throughput MIC processor, a network or communication processor, compression engine, graphics processor, GPGPU, embedded processor, or the like.
  • I/O subsystem 1290 may also communicate with a display 1233 for displaying data that is viewable by a human user.
  • any or all of the PtP links illustrated in FIG. 12 could be implemented as a multi-drop bus rather than a PtP link.
  • I/O subsystem 1290 may be in communication with a bus 1220 via an interface circuit 1296 .
  • Bus 1220 may have one or more devices that communicate over it, such as a bus bridge 1218 and I/O devices 1216 .
  • bus bridge 1218 may be in communication with other devices such as a user interface 1212 (such as a keyboard, mouse, touchscreen, or other input devices), communication devices 1226 (such as modems, network interface devices, or other types of communication devices that may communicate through a computer network 1260 ), audio I/O devices 1214 , and/or a data storage device 1228 .
  • Data storage device 1228 may store code and data 1230 , which may be executed by processors 1270 and/or 1280 .
  • any portions of the bus architectures could be implemented with one or more PtP links.
  • the computer system depicted in FIG. 12 is a schematic illustration of an embodiment of a computing system that may be utilized to implement various embodiments discussed herein. It will be appreciated that various components of the system depicted in FIG. 12 may be combined in a system-on-a-chip (SoC) architecture or in any other suitable configuration capable of achieving the functionality and features of examples and implementations provided herein.
  • interaction may be described in terms of a single computing system. However, this has been done for purposes of clarity and example only. In certain cases, it may be easier to describe one or more of the functionalities of a given set of flows by only referencing a single computing system. Moreover, the described system is readily scalable and can be implemented across a large number of components (e.g., multiple computing systems), as well as more complicated/sophisticated arrangements and configurations. Accordingly, the examples provided should not limit the scope or inhibit the broad teachings of the computing system as potentially applied to a myriad of other architectures.
  • ‘at least one of’ refers to any combination of the named elements, conditions, or activities.
  • ‘at least one of X, Y, and Z’ is intended to mean any of the following: 1) at least one X, but not Y and not Z; 2) at least one Y, but not X and not Z; 3) at least one Z, but not X and not Y; 4) at least one X and Y, but not Z; 5) at least one X and Z, but not Y; 6) at least one Y and Z, but not X; or 7) at least one X, at least one Y, and at least one Z.
  • ‘first’, ‘second’, ‘third’, etc. are intended to distinguish the particular nouns (e.g., element, condition, module, activity, operation, claim element, etc.) they modify, but are not intended to indicate any type of order, rank, importance, temporal sequence, or hierarchy of the modified noun.
  • ‘first X’ and ‘second X’ are intended to designate two separate X elements that are not necessarily limited by any order, rank, importance, temporal sequence, or hierarchy of the two elements.
  • Example 1 includes a processor comprising: data cache units storing encrypted data; and a microprocessor pipeline coupled to the data cache units.
  • the microprocessor pipeline comprises circuitry to access and execute a sequence of cryptographic-based instructions based on the encrypted data. Execution of the sequence of cryptographic-based instructions comprises at least one of: decryption of the encrypted data based on a first pointer value; execution of a cryptographic-based instruction based on data obtained from decryption of the encrypted data; encryption of a result of execution of a cryptographic-based instruction, wherein the encryption is based on a second pointer value; and storage of encrypted data in the data cache units, wherein the encrypted data stored in the data cache units is based on an encrypted result of execution of a cryptographic-based instruction.
  • Example 2 includes the subject matter of Example 1, and optionally, wherein the circuitry is further to: generate, for each cryptographic-based instruction, at least one encryption-based microoperation and at least one non-encryption-based microoperation for the cryptographic-based instruction; and schedule the at least one encryption-based microoperation and the at least one non-encryption-based microoperation for execution based on timings of the encryption-based microoperation.
  • Example 3 includes the subject matter of Example 2, and optionally, wherein the encryption-based microoperation is based on a block cipher, and the non-encryption-based microoperation is scheduled as dependent upon the encryption-based microoperation.
  • Example 4 includes the subject matter of Example 2, and optionally, wherein the encryption-based microoperation is based on a counter mode block cipher, and the non-encryption-based microoperation is scheduled to execute in parallel with encryption of a counter.
  • Example 5 includes the subject matter of Example 2, and optionally, wherein the encryption-based microoperation is one of an encryption operation and a decryption operation.
  • Example 6 includes the subject matter of Example 2, and optionally, wherein the non-encryption-based microoperation is one of a load operation and a store operation.
  • Example 7 includes the subject matter of any one of Examples 1-6, and optionally, wherein the circuitry is to decrypt the encrypted data by using the first pointer value as an input to a decryption function.
  • Example 8 includes the subject matter of Example 7, and optionally, wherein the circuitry to decrypt the encrypted data is in a load buffer of the processor.
  • Example 9 includes the subject matter of Example 7, and optionally, wherein the circuitry is to decrypt the encrypted data further by: generating a key stream based on the first pointer value and a counter value; and performing an XOR operation on the key stream and the encrypted data to yield the decrypted data.
  • Example 10 includes the subject matter of any one of Examples 1-6, and optionally, wherein the circuitry is to encrypt the result of the execution of the cryptographic-based instruction by using the second pointer value as an input to an encryption function.
  • Example 11 includes the subject matter of Example 10, and optionally, wherein the circuitry to encrypt the result of the execution of the cryptographic-based instruction is in a store buffer of the processor.
  • Example 12 includes the subject matter of any one of Examples 1-6, and optionally, wherein at least one of the first pointer value and the second pointer value is an effective address based on an encoded linear address that is at least partially encrypted, and the circuitry is further to: access the encoded linear address; decrypt an encrypted portion of the encoded linear address based on a key obtained from a register of the processor; and generate the effective address based on a result of the decryption of the encrypted portion of the encoded linear address.
  • Example 13 includes the subject matter of Example 12, and optionally, wherein the entirety of the encoded linear address is encrypted.
  • Example 14 includes the subject matter of Example 12, and optionally, wherein the circuitry to decrypt the encoded linear address is in an address generation unit of the processor.
  • Example 15 includes a method comprising: accessing a sequence of cryptographic-based instructions to execute on encrypted data stored in data cache units of a processor; and executing the sequence of cryptographic-based instructions by a core of the processor, wherein execution comprises one or more of: decryption of the encrypted data based on a first pointer value; execution of a cryptographic-based instruction based on data obtained from decryption of the encrypted data; encryption of a result of execution of a cryptographic-based instruction, wherein the encryption is based on a second pointer value; and storage of encrypted data in the data cache units, wherein the encrypted data stored in the data cache units is based on an encrypted result of execution of a cryptographic-based instruction.
  • Example 16 includes the subject matter of Example 15, and optionally, wherein executing the sequence of cryptographic-based instructions comprises: generating, for each cryptographic-based instruction, at least one encryption-based microoperation and at least one non-encryption-based microoperation for the cryptographic-based instruction; scheduling the at least one encryption-based microoperation and the at least one non-encryption-based microoperation for execution based on timings of the encryption-based microoperation; and executing the scheduled microoperations.
  • Example 17 includes the subject matter of Example 16, and optionally, wherein the encryption-based microoperation is based on a block cipher, and the non-encryption-based microoperation is scheduled as dependent upon the encryption-based microoperation.
  • Example 18 includes the subject matter of Example 16, and optionally, wherein the encryption-based microoperation is based on a counter mode block cipher, and the non-encryption-based microoperation is scheduled to execute in parallel with encryption of a counter.
  • Example 19 includes the subject matter of Example 16, and optionally, wherein the encryption-based microoperation is one of an encryption operation and a decryption operation, and the non-encryption-based microoperation is one of a load operation and a store operation.
  • Example 20 includes the subject matter of Example 19, and optionally, wherein the encryption operation and decryption operation each utilize a pointer value as a tweak input.
  • Example 21 includes the subject matter of any one of Examples 16-20, and optionally, wherein the decryption is performed by circuitry coupled to or implemented in, a load buffer of the processor.
  • Example 22 includes the subject matter of any one of Examples 16-20, and optionally, wherein the encryption is performed by circuitry coupled to or implemented in, a store buffer of the processor.
  • Example 23 includes the subject matter of any one of Examples 16-20, and optionally, wherein decrypting the encrypted data comprises: generating a key stream based on the first pointer value and a counter value; and performing an XOR operation on the key stream and the encrypted data to yield the decrypted data.
  • Example 24 includes the subject matter of any one of Examples 16-20, and optionally, wherein at least one of the first pointer value and the second pointer value is an effective address based on an encoded linear address that is at least partially encrypted, and the method further comprises: accessing the encoded linear address; decrypting an encrypted portion of the encoded linear address based on a key obtained from a register of the processor; and generating the effective address based on a result of the decryption of the encrypted portion of the encoded linear address.
  • Example 25 includes the subject matter of Example 24, and optionally, wherein the entirety of the encoded linear address is encrypted.
  • Example 26 includes the subject matter of Example 24, and optionally, wherein the decryption of the encoded linear address is by an address generation unit of the processor.
  • Example 27 includes a system comprising: memory storing cryptographic-based instructions, and a processor coupled to the memory.
  • the processor comprises: data cache units storing encrypted data; means for accessing the cryptographic-based instructions, the cryptographic instructions to execute based on the encrypted data; means for decrypting the encrypted data based on a first pointer value; means for executing the cryptographic-based instruction using the decrypted data; means for encrypting a result of the execution of the cryptographic-based instruction based on a second pointer value; and means for storing the encrypted result in the data cache units.
  • Example 28 includes the subject matter of Example 27, and optionally, wherein the means for decrypting the encrypted data comprises a load buffer of the processor.
  • Example 29 includes the subject matter of Example 27, and optionally, wherein the means for encrypting a result of the execution of the cryptographic-based instruction comprises a store buffer of the processor.
  • Example 30 includes the subject matter of any one of Examples 27-29, and optionally, wherein at least one of the first pointer value and the second pointer value is an effective address based on an encoded linear address that is at least partially encrypted, and the processor further comprises additional means for: accessing the encoded linear address; decrypting an encrypted portion of the encoded linear address based on a key obtained from a register of the processor; and generating the effective address based on a result of the decryption of the encrypted portion of the encoded linear address.
  • Example 31 includes the subject matter of Example 30, and optionally, wherein the additional means comprises an address generation unit of the processor.
  • Example 32 includes a processor core supporting the encryption and the decryption of pointers, keys, and data in the core, wherein such encryption and decryption operations are performed by logic and circuitry which is part of the processor microarchitecture pipeline.
  • Example 33 includes the subject matter of Example 32, and optionally, wherein instructions that perform encrypted memory loads and stores are mapped into at least one block encryption μop and at least one regular load/store μop.
  • Example 34 includes the subject matter of Example 32, and optionally, wherein an in-order or out-of-order execution scheduler schedules the execution of encryption, decryption, and load/store μops, and wherein load and store μops are considered as dependent on one of a block encryption μop and a block decryption μop.
  • Example 35 includes the subject matter of Example 34, and optionally, wherein the out-of-order execution scheduler schedules load and store μops to execute in parallel with the encryption of a counter.
  • Example 36 includes the subject matter of Example 32, and optionally, wherein decryption of data is tweaked by a pointer and the decryption takes place in the load buffer.
  • Example 37 includes the subject matter of Example 32, and optionally, wherein encryption of data is tweaked by a pointer and the encryption takes place in the store buffer.
  • Example 38 includes the subject matter of Example 32, and optionally, wherein decryption of a pointer takes place in the address generation unit.
  • Example 39 includes the subject matter of Example 32, and optionally, wherein decryption of a slice of a base takes place in the address generation unit.
  • Example 40 may include a device comprising logic, modules, circuitry, or other means to perform one or more elements of a method described in or related to any of the examples above or any other method or process described herein.

Abstract

In one embodiment, a processor of a cryptographic computing system includes a register to store an encryption key and address generation circuitry to obtain a pointer representing a linear address to be accessed by a read or write operation, the pointer being at least partially encrypted, obtain the key from the register and a context value, decrypt the encrypted portion of the pointer using the key and the context value as a tweak input, and generate an effective address for use in the read or write operation based on an output of the decryption.

Description

    CROSS-REFERENCE TO RELATED APPLICATIONS
  • This Application is a continuation (and claims the benefit of priority under 35 U.S.C. § 120) of U.S. application Ser. No. 16/724,105 filed on Dec. 20, 2019, entitled MICROPROCESSOR PIPELINE CIRCUITRY TO SUPPORT CRYPTOGRAPHIC COMPUTING, which application claims the benefit of and priority from U.S. Provisional Patent Application Ser. No. 62/868,884 entitled “Cryptographic Computing” and filed Jun. 29, 2019. The disclosures of the prior applications are each incorporated herein by reference.
  • TECHNICAL FIELD
  • This disclosure relates in general to the field of computer systems and, more particularly, to microprocessor pipeline circuitry to support cryptographic computing.
  • BACKGROUND
  • Cryptographic computing may refer to solutions for computer system security that employ cryptographic mechanisms inside processor components. Some cryptographic computing systems may involve the encryption and decryption of pointers, keys and data in a processor core using new encrypted memory access instructions.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • To provide a more complete understanding of the present disclosure and features and advantages thereof, reference is made to the following description, taken in conjunction with the accompanying figures, where like reference numerals represent like parts, in which:
  • FIG. 1 is a flow diagram of an example process of scheduling microoperations.
  • FIG. 2 is a diagram of an example process of scheduling microoperations based on cryptographic-based instructions.
  • FIG. 3 is a diagram of another example process of scheduling microoperations based on cryptographic-based instructions.
  • FIGS. 4A-4B are diagrams of an example data decryption process in a cryptographic computing system.
  • FIGS. 5A-5C are diagrams of another example data decryption process in a cryptographic computing system.
  • FIGS. 6A-6B are diagrams of an example data encryption process in a cryptographic computing system.
  • FIGS. 7A-7B are diagrams of an example pointer decryption process in a cryptographic computing system.
  • FIGS. 8A-8B are diagrams of an example base address slice decryption process in a cryptographic computing system.
  • FIG. 9 is a flow diagram of an example process of executing cryptographic-based instructions in a cryptographic computing system.
  • FIG. 10 is a block diagram illustrating an example processor core and memory according to at least one embodiment;
  • FIG. 11A is a block diagram of an example in-order pipeline and an example register renaming, out-of-order issue/execution pipeline according to one or more embodiments of this disclosure;
  • FIG. 11B is a block diagram of an example in-order architecture core and register renaming, out-of-order issue/execution architecture core to be included in a processor according to one or more embodiments of this disclosure; and
  • FIG. 12 is a block diagram of an example computer architecture according to at least one embodiment.
  • DETAILED DESCRIPTION
  • The following disclosure provides various possible embodiments, or examples, for implementation of cryptographic computing. Cryptographic computing may refer to computer system security solutions that employ cryptographic mechanisms inside processor components. Some cryptographic computing systems may involve the encryption and decryption of pointers, keys, and data in a processor core using new encrypted memory access instructions. Thus, the microarchitecture pipeline of the processor core may be configured in such a way to support such encryption and decryption operations.
  • Some current systems may address security concerns by placing a memory encryption unit in the memory controller. However, such systems may increase latencies due to the placement of cryptographic functionality in the memory controller. Other systems may provide a pointer authentication solution. However, these solutions cannot support multi-tenancy and may otherwise be limited when compared to the cryptographic computing implementations described herein.
  • In some embodiments of the present disclosure, an execution pipeline of a processor core first maps cryptographic computing instructions into at least one block encryption-based microoperation (μop) and at least one regular, non-encryption-based load/store μop. Load operations performed by load μops may go to a load buffer (e.g., in a memory subsystem of a processor), while store operations performed by store μops may go to store buffer (e.g., in the same memory subsystem). An in-order or out-of-order execution scheduler is aware of the timings and dependencies associated with the cryptographic computing instructions. In some embodiments, the load and store μops are considered as dependent on the block encryption μops. In embodiments where a counter mode is used, the load and store μops may execute in parallel with the encryption of the counter. In these implementations, a counter common to the plurality of load/store μops may be encrypted only once. In certain embodiments, block encryptions coming from cryptographic computing instructions are scheduled to be executed in parallel with independent μops, which may include μops not coming from cryptographic computing instructions.
  • Further, in some embodiments, functional units include block encryption or counter encryption operations. For example, data decryption may be performed (e.g., on data loaded from a data cache unit) by a decryption unit coupled to or implemented in a load buffer, and data encryption may be performed (e.g., on data output from an execution unit) by an encryption unit coupled to or implemented in a store buffer. As another example, pointer decryption may be performed by an address generation unit. Any suitable block cipher cryptographic algorithm may be implemented. For example, a small block cipher (e.g., a SIMON, or SPECK cipher at a 32-bit block size, or other variable bit size block cipher) or their tweakable versions may be used. The Advanced Encryption Standard (AES) may be implemented in any number of ways to achieve encryption/decryption of a block of data. For example, an AES xor-encrypt-xor (XEX) based tweaked-codebook mode with ciphertext stealing (AES-XTS) may be suitable. In other embodiments, an AES counter (CTR) mode of operation could be implemented.
  • In certain embodiments, cryptographic computing may require the linear address for each memory access to be plumbed to the interface with the data cache to enable tweaked encryption and decryption at that interface. For load requests, that may be accomplished by adding a new read port on the load buffer. In embodiments utilizing stream ciphers, e.g., those using the counter mode, the keystream may be pre-computed as soon as the load buffer entry is created. Data may be encrypted as it is stored into the store buffer or may be encrypted after it exits the store buffer on its way to a Level-1 (L1) cache. In some instances, it may be advantageous to start encrypting the data as soon as its address becomes available (e.g., while it may still be in the store buffer) to minimize the total delay for storing the data. If the data is encrypted outside of the store buffer, then a read port may be utilized on the store buffer so that a cryptographic execution unit can read the address.
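  • The latency-hiding idea described above, in which the keystream is pre-computed as soon as the load buffer entry is created so that only a cheap XOR remains once the data arrives from the cache, could be sketched as follows. The buffer structure, helper names, and the keystream derivation are illustrative placeholders, not the actual microarchitecture.

```python
# Sketch: pre-compute the keystream when the load-buffer entry is allocated, then XOR
# when the (encrypted) data returns from the cache, hiding decryption latency.

def derive_keystream(pointer: int, key: int, width: int = 8) -> bytes:
    # Placeholder for counter-mode keystream generation tweaked by the pointer.
    return bytes(((pointer ^ key) >> (8 * (i % 8))) & 0xFF for i in range(width))

class LoadBufferEntry:
    def __init__(self, pointer: int, key: int):
        self.pointer = pointer
        # Keystream generation starts as soon as the entry is allocated.
        self.keystream = derive_keystream(pointer, key)

    def complete(self, encrypted_data: bytes) -> bytes:
        # When the cache returns the data, only a cheap XOR remains.
        return bytes(d ^ k for d, k in zip(encrypted_data, self.keystream))

if __name__ == "__main__":
    entry = LoadBufferEntry(pointer=0x7000, key=0x5A5A5A5A5A5A5A5A)
    encrypted = bytes(d ^ k for d, k in zip(b"ld data!", entry.keystream))
    print(entry.complete(encrypted))   # b'ld data!'
```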
  • Aspects of the present disclosure may provide a good cost/performance trade-off when compared to current systems, as data and pointer encryption and decryption latencies can be hidden behind the execution of other μops. Other advantages will be apparent in light of the present disclosure.
  • FIG. 1 is a flow diagram of an example process 100 of scheduling microoperations. The example process 100 may be implemented by an execution scheduler, such as an out-of-order execution scheduler in certain instances. At 102, a sequence of instructions is accessed by an execution scheduler. The instructions may be inside a window of fixed size (e.g., 25 instructions or 50 instructions). At 104, the sequence of instructions is mapped to a sequence of microoperations (μops). In typical pipelines, each instruction may be mapped to one or more μops in the sequence. At 106, the scheduler detects dependencies between μops and expresses those dependencies in the form of a directed acyclic graph. This may be performed by dependencies logic of the scheduler. As an example, two independent μops, an XOR μop and a load μop, may be represented as nodes in separate parallel branches in the graph. Conversely, dependent μops such as an ADD μop and a following store μop may be represented as sequential nodes in the same branch of the graph. The acyclic graph may include speculative execution branches in certain instances.
  • At 108, the scheduler may annotate the graph with latency and throughput values associated with the execution of the μops, and at 110, the scheduler performs maximal scheduling of at least one subset of independent μops by the functional units of the processor core. The annotation of 108 may be performed by timing logic of the scheduler and the scheduling of 110 may be performed by scheduling logic of the scheduler. Maximal scheduling may refer to the assignment of independent μops to core functional units that are locally optimal according to some specific objective. For example, the scheduler may perform assignments such that the largest possible number of independent functional units are simultaneously occupied to execute independent μop tasks. In certain embodiments, the scheduling performed at 110 may be repeated several times.
  • FIG. 2 is a diagram of an example process 200 of scheduling microoperations based on cryptographic-based instructions. The example process 200 may be implemented by an execution scheduler, such as an out-of-order execution scheduler in cryptographic computing systems. At 202, a sequence of cryptographic-based instructions is accessed. This operation may correspond to operation 102 of the process 100. Cryptographic-based instructions may refer to instructions that are to be executed in cryptographic computing systems or environments, where data is stored in memory in encrypted form and decrypted/encrypted within a processor core. An example cryptographic-based instruction includes an encrypted load and store operation. The sequence of instructions may be within a particular window of fixed size as in process 100.
  • At 204, at least one encryption-based μop and at least one non-encryption based μop are generated for each instruction accessed at 202. This operation may correspond to operation 104 of the process 100. In some embodiments, the encryption-based μop is based on a block encryption scheme. The at least one encryption-based μop may include a data block encryption μop and the at least one non-encryption based μop may include a regular, unencrypted load or store μop. As another example, the at least one encryption-based μop may include a data block decryption μop and the at least one non-encryption based μop may include a regular, unencrypted load or store μop. As yet another example, the at least one encryption-based μop may include a data pointer encryption μop and the at least one non-encryption-based μop may include a regular, unencrypted load or store μop. As yet another example, the at least one encryption-based μop may include a data pointer decryption μop and the non-encryption-based μop may include a regular, unencrypted load or store μop.
  • At 206, the non-encryption based μops are expressed as dependent upon the (block) encryption-based μops. This operation may correspond to operation 106 of the process 100, and may accordingly be performed by dependencies logic of the scheduler during generation of an acyclic graph. As an example, in some embodiments, the scheduler may compute dependencies between μops by identifying regular, unencrypted load or store μops that have resulted from the mapping of cryptographic-based instructions into μops as dependent on at least one of a data block encryption μop, a data block decryption μop, a pointer encryption μop, or a pointer decryption μop.
  • At 208, encryption or decryption timings are added to an acyclic graph that expresses μop dependencies. This operation may correspond to operation 108 of the process 100, whereby the acyclic graph is annotated by timing logic of a scheduler. In some embodiments, the timings are otherwise implicitly taken into account by the scheduler. At 210, the encryption-based μops are scheduled to execute in parallel with independent μops (e.g., those not originating from the cryptographic-based instructions accessed at 202). This operation may correspond to operation 110 of the process 100, whereby the maximal scheduling is performed by scheduling logic of a scheduler. For instance, the scheduling logic that assigns μops to functional units may ensure that data block and pointer encryption/decryption tasks are scheduled to be executed in parallel with other independent μops.
  • FIG. 3 is a diagram of another example process 300 of scheduling microoperations based on cryptographic-based instructions. In particular, in the example, shown, a block cipher encryption scheme is utilized, and the mode used for data block and pointer encryption is the counter mode. In the counter mode, data are encrypted by being XOR-ed with an almost random value, called the key stream. The key stream may be produced by encrypting counter blocks using a secret key. Counter blocks comprising tweak bits (as well as the bits of a block-by-block increasing counter) may be encrypted with the same key and the resulting encrypted blocks are XOR-ed with the data. Using the counter mode, key stream generation microoperations can be parallelized with microoperations for the reading of the data from memory.
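  • The counter-mode operation described above, in which counter blocks comprising tweak bits and a block-by-block counter are encrypted to form a key stream that is XOR-ed with the data, might be sketched as follows. The toy block cipher, the block size, and the counter-block layout are assumptions made for illustration only.

```python
# Sketch of counter-mode encryption: counter blocks (tweak || counter) are encrypted
# to form a keystream, which is XOR-ed with the data; decryption uses the same routine.
# The toy block cipher below is a placeholder, not a real cipher.

BLOCK = 8  # bytes per counter block (assumed)

def toy_block_encrypt(block: bytes, key: int) -> bytes:
    return bytes((b ^ ((key >> (8 * (i % 8))) & 0xFF) ^ i) & 0xFF
                 for i, b in enumerate(block))

def keystream(tweak: int, start_counter: int, nblocks: int, key: int) -> bytes:
    out = b""
    for i in range(nblocks):
        counter_block = tweak.to_bytes(4, "little") + (start_counter + i).to_bytes(4, "little")
        out += toy_block_encrypt(counter_block, key)
    return out

def ctr_xcrypt(data: bytes, tweak: int, counter: int, key: int) -> bytes:
    ks = keystream(tweak, counter, (len(data) + BLOCK - 1) // BLOCK, key)
    return bytes(d ^ k for d, k in zip(data, ks))   # same routine encrypts and decrypts

if __name__ == "__main__":
    key, tweak = 0x1122334455667788, 0xABCD
    ct = ctr_xcrypt(b"pipeline data!!", tweak, counter=0, key=key)
    print(ctr_xcrypt(ct, tweak, counter=0, key=key))   # b'pipeline data!!'
```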
  • At 302, a sequence of cryptographic-based instructions is accessed. Cryptographic-based instructions may refer to instructions that are to be executed in cryptographic computing systems or environments, where data is stored in memory in encrypted form and decrypted/encrypted within a processor core. Examples of cryptographic-based instructions include encrypted load and store operations. The sequence of instructions may be within a particular window of fixed size as in processes 100, 200.
  • At 304, at least one counter mode encryption-based μop and at least one non-encryption based μop are generated for each instruction accessed at 302, in a similar manner as described above with respect to 204 of process 200.
  • At 306, non-encryption-based μops that can execute in parallel with the encryption of the counter are identified, and the counter common to the identified μops is encrypted once (instead of multiple times). This operation may correspond to operation 106 of the process 100, and may accordingly be performed by dependencies logic of the scheduler during generation of an acyclic graph. As an example, the scheduler logic that computes μop dependencies may ensure that regular unencrypted load μops coming from the cryptographic-based instructions are not expressed as dependent on their associated counter encryption μops. In the counter mode, the encryption of the counter blocks may proceed independently from the loading of the data. Hence, the corresponding μops of these two steps may be represented by nodes of two separate parallel branches in the dependencies graph. These branches would merge in a node representing the XOR operation, which adds the encrypted counter to the loaded data, according to the counter mode specification. In some implementations, the dependencies logic of the scheduler may also identify a plurality of load and store μops coming from the cryptographic-based instructions, the associated data of which need to be encrypted or decrypted with the same counter value and key stream. For these μops, the dependencies logic may schedule the computation of the key stream only once and represent it as a single node in the dependencies graph.
  • At 308, encryption or decryption timings are added to an acyclic graph that expresses μop dependencies. This operation may correspond to operation 108 of the process 100, whereby the acyclic graph is annotated by timing logic of a scheduler. In some embodiments, the timings are otherwise implicitly taken into account by the scheduler. At 310, the encryption-based μops are scheduled to execute in parallel with independent μops (e.g., those not originating from the cryptographic-based instructions accessed at 302). This operation may correspond to operation 110 of the process 100, whereby the maximal scheduling is performed by scheduling logic of the scheduler. For instance, the scheduling logic that assigns μops to functional units may ensure that data block and pointer encryption/decryption tasks are scheduled to be executed in parallel with other independent μops.
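  • As a hedged illustration of the counter-mode behavior described for process 300 (hashlib.sha256 stands in for the block cipher, and all names and values are hypothetical), the sketch below generates the key stream independently of the data, so it can be computed while the encrypted data is still being loaded and computed only once for accesses that share the same counter value and tweak; the plaintext is recovered at the XOR merge node.

```python
# Counter-mode sketch (process 300): the key stream depends only on the key, the
# pointer tweak, and the counter, not on the loaded data.
import hashlib

def key_stream(key: bytes, tweak: bytes, counter: int, length: int) -> bytes:
    out, block = b"", 0
    while len(out) < length:
        counter_block = tweak + counter.to_bytes(8, "little") + block.to_bytes(8, "little")
        out += hashlib.sha256(key + counter_block).digest()   # stand-in for block encryption
        block += 1
    return out[:length]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

key = b"k" * 16
pointer_tweak = (0x7FFC_1234_5678).to_bytes(8, "little")
counter = 42

data = b"hello, encrypted"                                  # plaintext the program sees
ks = key_stream(key, pointer_tweak, counter, len(data))     # runs in parallel with the load
ciphertext = xor(data, ks)                                  # what the data cache unit would hold
assert xor(ciphertext, ks) == data                          # XOR merge node recovers the plaintext
```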
  • The above descriptions have described how an out-of-order-execution scheduler may support the execution of cryptographic-based instructions in cryptographic computing implementations. The following examples describe certain embodiments wherein the functional units of a core support the execution of the microoperations as discussed above. In some of the example embodiments described below, the encryption and decryption of data is done in the load and store buffers, respectively, of a processor core microarchitecture.
  • FIGS. 4A-4B are diagrams of an example data decryption process in a cryptographic computing system. In particular, FIG. 4A shows an example system 400 for implementing the example process 450 of FIG. 4B. In certain embodiments, the system 400 is implemented entirely within a processor as part of a cryptographic computing system. The system 400 may, in certain embodiments, be executed in response to a plurality of μops issued by an out-of-order scheduler implementing the process 200 of FIG. 2.
  • Referring to the example system 400 of FIG. 4A, a load buffer 402 includes one or more load buffer entries 404. The load buffer 402 may be implemented in a memory subsystem of a processor, such as in a memory subsystem of a processor core. Each load buffer entry 404 includes a physical address field 406 and a pointer field 408. In the example shown, a state machine servicing load requests obtains data from a data cache unit 412 (which may, in some implementations, be a store buffer), then uses the pointer field 408 (obtained via read port 410) as a tweak in a decryption operation performed on the encrypted data via a decryption unit 414. The decrypted data are then delivered to an execution unit 416 of the processor core microarchitecture. Although shown as being implemented outside (and coupled to) the load buffer 402, the decryption unit 414 may be implemented inside the load buffer 402 in some embodiments.
  • Referring now to the example process 450 of FIG. 4B, a data cache unit (or store buffer) stores encrypted data (ciphertext) to be decrypted by the decryption unit 414 as described above. At 452, the decryption unit 414 accesses the ciphertext to begin fulfilling a load operation. The decryption unit 414 then decrypts the ciphertext at 454 using an active key obtained from a register along with a tweak value, which, in the example shown, is the value of the pointer field 408 (i.e., the data's linear address). At 456, the decryption unit 414 provides the decrypted plaintext to an execution unit 416 to fulfill the load operation. Finally, at 458, the decryption unit 414 sends a wake-up signal to a reservation station of the processor (which may track the status of register contents and support register renaming).
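  • The following sketch mirrors the FIG. 4 load path; the field names are illustrative and a simple XOR-with-a-hash construction stands in for the cipher, so it is a sketch under those assumptions rather than the patented implementation.

```python
# FIG. 4 load path, sketched: obtain ciphertext (452), decrypt with the pointer as
# the tweak (454), deliver plaintext (456), and signal the reservation station (458).
import hashlib
from dataclasses import dataclass

@dataclass
class LoadBufferEntry:
    physical_address: int
    pointer: int                      # linear address used as the tweak

def tweaked_decrypt(key: bytes, tweak: int, ciphertext: bytes) -> bytes:
    pad = hashlib.sha256(key + tweak.to_bytes(8, "little")).digest()
    return bytes(c ^ p for c, p in zip(ciphertext, pad))     # stand-in cipher only

def service_load(entry: LoadBufferEntry, data_cache: dict, key: bytes):
    ciphertext = data_cache[entry.physical_address]              # 452
    plaintext = tweaked_decrypt(key, entry.pointer, ciphertext)  # 454
    wake_reservation_station = True                              # 458
    return plaintext, wake_reservation_station                   # 456: to the execution unit
```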
  • FIGS. 5A-5C are diagrams of another example data decryption process in a cryptographic computing system. In particular, FIG. 5A shows an example system 500 for implementing the example processes 550, 560 of FIGS. 5B, 5C. In certain embodiments, the system 500 is implemented entirely within a processor as part of a cryptographic computing system. In the examples shown in FIGS. 5A-5B, a counter mode block cipher is used for encryption/decryption of data. The system 500 may be executed, in certain embodiments, in response to a plurality of μops issued by an out-of-order scheduler implementing the process 300 of FIG. 3.
  • Referring to the example system 500 of FIG. 5A, a load buffer 502 includes one or more load buffer entries 504. The load buffer 502 may be implemented in a memory subsystem of a processor, such as in a memory subsystem of a processor core. Each load buffer entry 504 includes a physical address field 506, a pointer field 508, and a key stream 510. In the example shown, since the counter mode is being used, the key stream generator 512 produces the key stream 510 by encrypting a counter value loaded from the register 522. The pointer field 508 of the load buffer entry 504 tweaks the encryption operation performed by the key stream generator 512. The encryption performed by the key stream generator 512 may be tweaked by other fields, such as, for example, other cryptographic context values. An XOR operation is then performed on the key stream 510 by the XOR unit 518 (which reads the key stream 510 via the read port 514) and encrypted data coming from the data cache unit 516 (which may, in some embodiments, be a store buffer). The decrypted data are then delivered to an execution unit 520 of the processor core microarchitecture. Although shown as being implemented inside the load buffer 502, the key stream generator 512 may be implemented outside the load buffer 502 in some embodiments. Further, although shown as being implemented outside (and coupled to) the load buffer 502, the XOR unit 518 may be implemented inside the load buffer 502 in some embodiments.
  • Referring now to the example process 550 of FIG. 5B, at 552, a load buffer entry 504 is created. At 554, a key stream generator 512 is invoked. The key stream generator 512 uses a key obtained from a register along with a tweak value (which, in the example shown, is the pointer value 508) to generate a key stream 510, which is stored in the load buffer entry 504.
  • Referring now to the example process 560 of FIG. 5C (which may execute independently from the process 550 of FIG. 5B), the ciphertext associated with the load operation may become available from a data cache unit (or store buffer). At 562, the ciphertext is accessed, and at 564, the ciphertext is XOR-ed with the key stream 510. At 566, the result of the XOR operation is provided to an execution unit 520 of the processor core microarchitecture to fulfill the load operation. Finally, at 568, a wake-up signal is sent to a reservation station of the processor.
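  • A sketch of the decoupling described for FIGS. 5B-5C follows; the field names and the hash-based key stream are stand-ins, not the patented construction. Process 550 fills the entry's key stream as soon as the entry exists, and process 560 only has to XOR once the ciphertext arrives.

```python
# FIG. 5 load path, sketched as two independent steps sharing a load buffer entry.
import hashlib
from dataclasses import dataclass

@dataclass
class LoadBufferEntry:
    physical_address: int
    pointer: int
    key_stream: bytes = b""

def process_550(entry: LoadBufferEntry, key: bytes, counter: int) -> None:
    """552/554: create the entry and invoke the key stream generator."""
    counter_block = entry.pointer.to_bytes(8, "little") + counter.to_bytes(8, "little")
    entry.key_stream = hashlib.sha256(key + counter_block).digest()[:16]

def process_560(entry: LoadBufferEntry, ciphertext: bytes):
    """562-568: XOR the arriving ciphertext with the stored key stream, deliver
    the result to the execution unit, and send the wake-up signal."""
    plaintext = bytes(c ^ k for c, k in zip(ciphertext, entry.key_stream))
    return plaintext, True
```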
  • FIGS. 6A-6B are diagrams of an example data encryption process in a cryptographic computing system. In particular, FIG. 6A shows an example system 600 for implementing the example process 650 of FIG. 6B. In certain embodiments, the system 600 is implemented entirely within a processor as part of a cryptographic computing system. The system 600 may, in certain embodiments, be executed in response to a plurality of μops issued by an out-of-order scheduler implementing the process 200 of FIG. 2.
  • Referring to the example system 600 shown in FIG. 6A, a store buffer 602 includes one or more store buffer entries 604. The store buffer 602 may be implemented in a memory subsystem of a processor, such as in a memory subsystem of a processor core. Each store buffer entry 604 includes a physical address field 606, a pointer field 608, and store data 610 (which is to be stored). In the example shown, a state machine servicing store requests obtains data from a register file 620 (or execution unit), and an encryption unit 612 uses the pointer field 608 as a tweak during an encryption operation performed on the data obtained from the register file 620. The encrypted data are then passed to a data cache unit 630 (or other execution unit of the CPU core microarchitecture). Although shown as being implemented inside the store buffer 602, the encryption unit 612 may be implemented outside the store buffer 602 in some embodiments.
  • Referring now to the example process 650 of FIG. 6B, plaintext data to be encrypted is available from a register file 620. At 652, the store buffer entry 604 is populated with a pointer value 608. At 654, the plaintext data is accessed from the register file 620, and at 656, the plaintext data is encrypted by the encryption unit 612 using an active key obtained from a register 640 along with a tweak (which, in the example shown, is the value of the pointer field 608 (i.e., the data's linear address)) and stored in the store buffer entry 604 as store data 610. At 658, the encrypted store data 610 is provided to a data cache unit 630 (or another waiting execution unit, in some implementations).
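  • A corresponding sketch of the FIG. 6 store path appears below, again with illustrative names and a stand-in cipher: plaintext from the register file is encrypted under the pointer tweak, kept as store data, and drained to the data cache unit.

```python
# FIG. 6 store path, sketched: populate the entry (652), read the plaintext (654),
# encrypt with the pointer as the tweak (656), and hand off to the data cache (658).
import hashlib
from dataclasses import dataclass

@dataclass
class StoreBufferEntry:
    physical_address: int
    pointer: int
    store_data: bytes = b""

def tweaked_encrypt(key: bytes, tweak: int, plaintext: bytes) -> bytes:
    pad = hashlib.sha256(key + tweak.to_bytes(8, "little")).digest()
    return bytes(p ^ k for p, k in zip(plaintext, pad))      # stand-in cipher only

def service_store(entry: StoreBufferEntry, register_value: bytes, key: bytes, data_cache: dict) -> None:
    entry.store_data = tweaked_encrypt(key, entry.pointer, register_value)  # 656
    data_cache[entry.physical_address] = entry.store_data                   # 658
```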
  • In some implementations, the pointer values used in the encryption and decryption operations may themselves be encrypted for security purposes. The pointer values may be entirely or partially encrypted (that is, only a portion of the bits of the pointer value may be encrypted). In these instances, the encrypted pointer values may first be decrypted prior to being used in the encryption/decryption operations described above. FIGS. 7A-7B and 8A-8B describe example embodiments for decrypting pointer values prior to use in the encryption/decryption operations.
  • FIGS. 7A-7B are diagrams of an example pointer decryption process in a cryptographic computing system. In particular, FIG. 7A shows an example system 700 for implementing the example process 750 of FIG. 7B. In certain embodiments, the system 700 is implemented entirely within a processor as part of a cryptographic computing system. The system 700 may, in certain embodiments, be executed in response to a plurality of μops issued by an out-of-order scheduler implementing the process 200 of FIG. 2 or the process 300 of FIG. 3.
  • Referring to the example system 700 shown in FIG. 7A, an address generation unit 702 is configured to decrypt parts of a linear address, which are encrypted for security. A decryption unit 704 in the address generation unit 702 accepts as input an encrypted pointer 710 representing a first encoded linear address, along with a key obtained from a register and a context value tweak input (e.g., the tweak input may come from a separate register, or may consist of unencrypted bits of the same linear address). The decryption unit 704 outputs a decrypted subset of the bits of the encrypted pointer 710, which are then passed to address generation circuitry 706 within the address generation unit 702 along with other address generation inputs. The address generation circuitry 706 generates a second effective linear address to be used in a memory read or write operation based on the inputs.
  • Referring now to the example process 750 shown in FIG. 7B, the tweak value (which is also described in FIG. 7B as the “context value”) may be available either statically or dynamically—if it is not available statically, it is loaded dynamically from memory. At 752, a request to generate an effective address from an encrypted pointer 710 is received by an address generation unit 702. The address generation unit 702 determines at 754 whether a context value is available statically. If it is available statically, then the value is used at 756; if not, the context value is loaded dynamically from a table in memory at 755. The process then proceeds to 756, where the encrypted pointer 710 is decrypted using an active decryption key obtained from a register along with the obtained context value. At 758, a decrypted address is output to the address generation circuitry 706, which then generates, at 760, an effective address for use in read/write operations based on the decrypted address (and any other address generation inputs).
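  • The sketch below follows the FIG. 7B flow under stated assumptions (hypothetical names; a toy XOR construction in place of the pointer cipher): the context value is taken statically when available or loaded from a table otherwise, the encrypted pointer is decrypted with that context as the tweak, and the effective address is generated from the result.

```python
# FIG. 7 pointer decryption in the address generation unit, sketched.
import hashlib
from typing import Optional

def decrypt_pointer(encrypted_pointer: int, key: bytes, context: int) -> int:
    pad = int.from_bytes(hashlib.sha256(key + context.to_bytes(8, "little")).digest()[:8], "little")
    return encrypted_pointer ^ pad                      # stand-in for the pointer cipher

def generate_effective_address(encrypted_pointer: int, key: bytes,
                               static_context: Optional[int],
                               context_table: dict, table_index: int,
                               displacement: int = 0) -> int:
    # 754/755: use the static context if available, otherwise load it from memory
    context = static_context if static_context is not None else context_table[table_index]
    decrypted = decrypt_pointer(encrypted_pointer, key, context)    # 756/758
    return (decrypted + displacement) & (2**64 - 1)                 # 760: effective address
```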
  • FIGS. 8A-8B are diagrams of an example base address slice decryption process in a cryptographic computing system. In particular, FIG. 8A shows an example system 800 for implementing the example process 850 of FIG. 8B. In certain embodiments, the system 800 is implemented entirely within a processor as part of a cryptographic computing system. The system 800 may, in certain embodiments, be executed in response to a plurality of μops issued by an out-of-order scheduler implementing the process 200 of FIG. 2 or the process 300 of FIG. 3.
  • Referring to the example system 800 shown in FIG. 8A, an address generation unit 802 is configured to decrypt parts of a linear address, as described above with respect to FIGS. 7A-7B. However, in the example shown, the bit set that is encrypted (i.e., slice 824) occupies a middle slice of an encoded linear address 820 rather than the entire address being encrypted as in the examples described above with respect to FIGS. 7A-7B. The upper bits 822 of the encoded linear address 820 may denote the data object size, type, format, or other security information associated with the encoded linear address 820. The encoded linear address 820 also includes an offset 826.
  • In the example shown, a decryption unit 804 in the address generation unit 802 accepts as input the encrypted base address slice 824, along with a key obtained from a register and a context value tweak input (e.g., the tweak input may come from a separate register, or may consist of unencrypted bits of the same linear address). The decryption unit 804 outputs a decrypted base address slice. The decrypted base address slice is then provided to a concatenator/adder unit 806, which concatenates the decrypted base address slice with a set of complementary upper bits from a register or context table entry and the offset 826 to yield an intermediate base address. In certain embodiments, the set of complementary bits is different from the upper bits 822, and the set of complementary bits does not convey metadata information (e.g., data object size, type, format, etc.) but instead includes the missing bits of the effective linear address that is constructed, denoting a location in the linear address space.
  • The intermediate base address is then combined with the upper bits 822 by the OR unit 808 to yield a tagged base address. In other embodiments, the upper bits 822 may be combined using an XOR unit, an ADD unit or a logical AND unit. In yet other embodiments, the upper bits 822 may act as a tweak value and tweak the decryption of the middle slice of the address. The tagged base address is then provided to address generation circuitry 810 in the address generation unit 802, along with other address generation inputs. The address generation circuitry 810 then generates an effective address to be used in a memory read or write operation based on the inputs. In one embodiment, the upper bits 822 may be used to determine a number of intermediate lower address bits (e.g., from offset 826) that would be used as a tweak to the encrypted base address 824.
  • For embodiments with an encrypted base address, a Translation Lookaside Buffer (TLB) may be used that maps linear addresses (which may also be referred to as virtual addresses) to physical addresses. A TLB entry is populated after a page miss where a page walk of the paging structures determines the correct linear to physical memory mapping, caching the linear to physical mapping for fast lookup. As an optimization, a TLB (for example, the data TLB or dTLB) may instead cache the encoded address 820 to physical address mapping, using a Content Addressable Memory (CAM) circuit to match the encrypted/encoded address 820 to the correct physical address translation. In this way, the TLB may determine the physical memory mapping prior to the completion of the decryption unit 804 revealing the decrypted linear address, and may immediately proceed with processing the instructions dependent on this cached memory mapping. Other embodiments may instead use one or both of the offset 826 and upper bits 822 of the address 820 as a partial linear address mapping into the TLB (that is, the TLB lookup is performed only against the plaintext subset of the address 820), and proceed to use the physical memory translation, if found, verifying the remainder of the decrypted base address (824) to determine the full linear address is a match (TLB hit) after completion of the decryption 804. Such embodiments may speculatively proceed with processing and nuke the processor pipeline if the final decrypted linear address match is found to be a false positive hit in the TLB, preventing the execution of dependent instructions, or cleaning up the execution of dependent instructions by returning processor register state and/or memory to its prior state before the TLB misprediction (incorrect memory mapping).
  • In some embodiments, a subset of the upper bits 822 indicates address adjustment, which may involve adding an offset value (which is a power of two) to the effective linear address that is produced by the address generation unit. The offset value may include a bit string where only a single bit is equal to 1 and all other bits are equal to zero. In some other embodiments, address adjustment may involve subtracting from the effective linear address an offset value, which is a power of two. Adjustment may be included in certain implementations because some memory object allocations cross power of two boundaries. In some embodiments, the smallest power-of-two box that contains a memory object allocation is also a unique property of the allocation and may be used for cryptographically tweaking the encryption of the base address 824 associated with the allocation. If address adjustment is not supported, allocations that cross power of two boundaries may be associated with exceedingly large power-of-two boxes. Such large boxes may be polluted with data of other allocations, which, even though cryptographically isolated, may still be accessed by software (e.g., as a result of a software bug). The adjustment may proceed in parallel with the decryption of the base address bits 824. In certain embodiments, performing the adjustment involves: (i) passing the upper bits 822 through a decoder circuit; (ii) obtaining the outputs of the decoder circuit; (iii) using those decoder outputs together with a first offset value 826 to form a second offset value to add to the bits of the linear address which are unencrypted; (iv) obtaining a carry out value from this addition; and (v) adding the carry out value to the decrypted address bits 824 once they are produced. In other embodiments, a partial TLB lookup process may begin as soon as the adjustment process has produced the linear address bits which are used by the partial TLB lookup process.
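  • As a worked example of the power-of-two box property mentioned above (the function and variable names are illustrative, not the patented circuitry), the sketch below finds the smallest power-of-two-aligned, power-of-two-sized box containing an allocation and shows how an allocation that crosses a boundary falls into a disproportionately large box, which is the case address adjustment avoids.

```python
# Smallest power-of-two box containing [base, base + size): the box is aligned to
# its own size, and it is a unique property of the allocation usable as a tweak.
def smallest_pow2_box(base: int, size: int):
    box_size = 1
    while True:
        box_base = base & ~(box_size - 1)            # align base down to the box size
        if base + size <= box_base + box_size:       # does the allocation fit?
            return box_base, box_size
        box_size <<= 1

for base, size in [(0x1000, 0x30), (0x1FF8, 0x10)]:
    box_base, box_size = smallest_pow2_box(base, size)
    print(hex(base), hex(size), "->", hex(box_base), hex(box_size))
# 0x1000 0x30 -> 0x1000 0x40    (fits in a small box)
# 0x1ff8 0x10 -> 0x0 0x4000     (crosses 0x2000, so the box balloons)
```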
  • Referring now to the example process 850 shown in FIG. 8B, as in FIG. 7B, the tweak value (also described in FIG. 8B as the “context value”) may be available either statically or dynamically—if it is not available statically, it is loaded dynamically from memory. In particular, at 852, a request to generate an effective address from an encrypted base address slice 824 is received by an address generation unit 802. The address generation unit 802 determines at 854 whether a context value is available statically. If it is available statically, then the value is used at 856; if not, the context value is loaded dynamically from a table in memory at 855. At 856, the encrypted base address slice 824 is decrypted using an active decryption key obtained from a register along with the context value.
  • At 858, the address generation unit 802 determines whether both (1) the memory access is being performed with a static context value, and (2) the input context value has its dynamic flag bit cleared. The dynamic flag bit may be a flag bit in the pointer that indicates whether context information is available statically or dynamically. For instance, if an object represented by the pointer is not entirely within the bounds of a statically addressable memory region, then a dynamic flag bit may be set in the pointer. The dynamic flag bit may indicate that context information is to be dynamically obtained, for example, via a pointer context table. In other words, there may be a region of memory in which the upper bits for a base address can be supplied statically from a control register, and allocations outside that region may need to draw their upper bits for the base address dynamically from a table entry in memory.
  • If both of the conditions are true at 858, the process 850 moves to 860; if one or both are not true, then the upper base address bits are loaded dynamically from a table entry in memory at 859 before proceeding to 860. In some cases, the operations of 858 can be performed alongside those of 854, or the operations may be merged. Likewise, in some cases, the operations of 859 can be performed alongside those of 855, or the operations may be merged.
  • At 860, the concatenator/adder unit 806 of the address generation unit 802 concatenates the upper base address bits with the decrypted base address slice, and at 862, adds the offset 826 to the concatenation. At 864, the address generation unit 802 recombines tag information from the upper bits 822 with the result of the concatenation/addition of 860 and 862 via the OR unit 808. The result of the concatenation, addition, and ORing is provided to address generation circuitry 810 in the address generation unit 802, along with other address generation inputs. At 866, the address generation circuitry 810 generates an effective address to be used in a memory read or write operation based on the inputs.
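  • A sketch of the FIG. 8B reconstruction follows, with assumed field widths that the description does not fix (tag in bits 63:48, encrypted base-address slice in bits 47:24, offset in bits 23:0) and a stand-in slice cipher.

```python
# FIG. 8B address reconstruction, sketched with assumed field widths.
import hashlib

def decrypt_slice(slice_bits: int, key: bytes, context: int) -> int:
    pad = int.from_bytes(hashlib.sha256(key + context.to_bytes(8, "little")).digest()[:3], "little")
    return slice_bits ^ pad                               # 24-bit stand-in decryption

def effective_address(encoded: int, key: bytes, context: int, upper_base_bits: int) -> int:
    tag = (encoded >> 48) & 0xFFFF                        # upper bits 822
    encrypted_slice = (encoded >> 24) & 0xFFFFFF          # encrypted slice 824
    offset = encoded & 0xFFFFFF                           # offset 826
    slice_plain = decrypt_slice(encrypted_slice, key, context)        # 856
    base = (upper_base_bits << 48) | (slice_plain << 24)  # 860: concatenate upper bits and slice
    address = base + offset                               # 862: add the offset
    # 864: recombine the tag via the OR unit; in a real design the tag would land
    # in otherwise-unused upper address bits
    return address | (tag << 48)
```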
  • FIG. 9 is a flow diagram of an example process 900 of executing cryptographic-based instructions in a cryptographic computing system. The example process 900 may be performed by circuitry of a microprocessor pipeline of a processor (e.g., one or more of the components described above, which may be implemented in a processor configured similar to the processor 1000 of FIG. 10) in response to accessing a set of cryptographic-based instructions. In some embodiments, the circuitry of the microprocessor pipeline performs each of the operations described, while in other embodiments, the circuitry of the microprocessor pipeline performs only a subset of the operations described.
  • At 902, encrypted data stored in a data cache unit of a processor (e.g., data cache unit 412 of FIG. 4A, data cache unit 516 of FIG. 5A, or data cache unit 1030 of FIG. 10) is accessed.
  • At 904, the encrypted data is decrypted based on a pointer value. The decryption may be performed in a manner similar to that described above with respect to FIGS. 4A-4B, FIGS. 5A-5B, or in another manner. In some instances, the pointer value or a portion thereof may itself be encrypted. In these instances, the pointer value may first be decrypted/decoded, for example, in a similar manner to that described above with respect to FIGS. 7A-7B or FIGS. 8A-8B.
  • At 906, a cryptographic-based instruction is executed based on data obtained from the decryption performed at 904. The instruction may be executed on an execution unit of the processor (e.g., execution unit 416 of FIG. 4A, execution unit 520 of FIG. 5A, or execution unit(s) 1016 of FIG. 10).
  • At 908, a result of the execution performed at 906 is encrypted based on another pointer value. The encryption may be performed in a similar manner to that described above with respect to FIGS. 6A-6B.
  • At 910, the encrypted result is stored in a data cache unit of the processor or another execution unit.
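  • The end-to-end property that process 900 relies on can be illustrated as follows (a toy cipher and hypothetical names, not the patented construction): data encrypted on the store path under one pointer only decrypts correctly on the load path when the same pointer is presented as the tweak.

```python
# Round trip of process 900, sketched: encrypt-on-store and decrypt-on-load are
# the same XOR-with-a-hash stand-in, keyed by the processor key and pointer tweak.
import hashlib

def bind_to_pointer(key: bytes, pointer: int, data: bytes) -> bytes:
    pad = hashlib.sha256(key + pointer.to_bytes(8, "little")).digest()
    return bytes(d ^ p for d, p in zip(data, pad))

key, ptr_a, ptr_b = b"k" * 16, 0x1000, 0x2000
stored = bind_to_pointer(key, ptr_a, b"register result")          # 908/910: encrypt and store
assert bind_to_pointer(key, ptr_a, stored) == b"register result"  # 902/904: same pointer decrypts
assert bind_to_pointer(key, ptr_b, stored) != b"register result"  # a different pointer does not
```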
  • The example processes described above may include additional or different operations, and the operations may be performed in the order shown or in another order. In some cases, one or more of the operations shown in the flow diagrams are implemented as processes that include multiple operations, sub-processes, or other types of routines. In some cases, operations can be combined, performed in another order, performed in parallel, iterated, or otherwise repeated or performed in another manner. Further, although certain functionality is described herein as being performed by load or store buffers, address generation units, or other certain aspects of a processor, it will be understood that the teachings of the present disclosure may be implemented in other examples by other types of execution units in a processor, including but not limited to separate data block encryption units, separate key stream generation units, or separate data pointer decryption units.
  • FIGS. 10-12 are block diagrams of example computer architectures that may be used in accordance with embodiments disclosed herein. Generally, any computer architecture designs known in the art for processors and computing systems may be used. In an example, system designs and configurations known in the art for laptops, desktops, handheld PCs, personal digital assistants, tablets, engineering workstations, servers, network devices, appliances, network hubs, routers, switches, embedded processors, digital signal processors (DSPs), graphics devices, video game devices, set-top boxes, micro controllers, smart phones, mobile devices, wearable electronic devices, portable media players, hand held devices, and various other electronic devices, are also suitable for embodiments of computing systems described herein. Generally, suitable computer architectures for embodiments disclosed herein can include, but are not limited to, configurations illustrated in FIGS. 10-12.
  • FIG. 10 is an example illustration of a processor according to an embodiment. Processor 1000 is an example of a type of hardware device that can be used in connection with the implementations above. Processor 1000 may be any type of processor, such as a microprocessor, an embedded processor, a digital signal processor (DSP), a network processor, a multi-core processor, a single core processor, or other device to execute code. Although only one processor 1000 is illustrated in FIG. 10, a processing element may alternatively include more than one of processor 1000 illustrated in FIG. 10. Processor 1000 may be a single-threaded core or, for at least one embodiment, the processor 1000 may be multi-threaded in that it may include more than one hardware thread context (or “logical processor”) per core.
  • FIG. 10 also illustrates a memory 1002 coupled to processor 1000 in accordance with an embodiment. Memory 1002 may be any of a wide variety of memories (including various layers of memory hierarchy) as are known or otherwise available to those of skill in the art. Such memory elements can include, but are not limited to, random access memory (RAM), read only memory (ROM), logic blocks of a field programmable gate array (FPGA), erasable programmable read only memory (EPROM), and electrically erasable programmable ROM (EEPROM).
  • Processor 1000 can execute any type of instructions associated with algorithms, processes, or operations detailed herein. Generally, processor 1000 can transform an element or an article (e.g., data) from one state or thing to another state or thing.
  • Code 1004, which may be one or more instructions to be executed by processor 1000, may be stored in memory 1002, or may be stored in software, hardware, firmware, or any suitable combination thereof, or in any other internal or external component, device, element, or object where appropriate and based on particular needs. In one example, processor 1000 can follow a program sequence of instructions indicated by code 1004. Each instruction enters a front-end logic 1006 and is processed by one or more decoders 1008. The decoder may generate, as its output, a microoperation such as a fixed width microoperation in a predefined format, or may generate other instructions, microinstructions, or control signals that reflect the original code instruction. Front-end logic 1006 also includes register renaming logic 1010 and scheduling logic 1012 (which includes a reservation station 1013), which generally allocate resources and queue the operation corresponding to the instruction for execution. In some embodiments, the scheduling logic 1012 includes an in-order or an out-of-order execution scheduler.
  • Processor 1000 can also include execution logic 1014 having a set of execution units 1016a, . . . , 1016n, an address generation unit 1017, etc. Some embodiments may include a number of execution units dedicated to specific functions or sets of functions. Other embodiments may include only one execution unit or one execution unit that can perform a particular function. Execution logic 1014 performs the operations specified by code instructions.
  • After completion of execution of the operations specified by the code instructions, back-end logic 1018 can retire the instructions of code 1004. In one embodiment, processor 1000 allows out of order execution but requires in order retirement of instructions. Retirement logic 1020 may take a variety of known forms (e.g., re-order buffers or the like). In this manner, processor 1000 is transformed during execution of code 1004, at least in terms of the output generated by the decoder, hardware registers and tables utilized by register renaming logic 1010, and any registers (not shown) modified by execution logic 1014.
  • Processor 1000 can also include a memory subsystem 1022, which includes a load buffer 1024, a decryption unit 1025, a store buffer 1026, an encryption unit 1027, a Translation Lookaside Buffer (TLB) 1028, a data cache unit (DCU) 1030, and a Level-2 (L2) cache unit 1032. The load buffer 1024 processes microoperations for memory/cache load operations, while the store buffer 1026 processes microoperations for memory/cache store operations. In cryptographic computing systems, the data stored in the data cache unit 1030, the L2 cache unit 1032, and/or the memory 1002 may be encrypted, and may be encrypted (prior to storage) and decrypted (prior to processing by one or more execution units 1016) entirely within the processor 1000 as described herein. Accordingly, the decryption unit 1025 may decrypt encrypted data stored in the DCU 1030, e.g., during load operations processed by the load buffer 1024 as described above, and the encryption unit 1027 may encrypt data to be stored in the DCU 1030, e.g., during store operations processed by the store buffer 1026 as described above. In some embodiments, the decryption unit 1025 may be implemented inside the load buffer 1024 and/or the encryption unit 1027 may be implemented inside the store buffer 1026. The Translation Lookaside Buffer (TLB) 1028 maps linear addresses to physical addresses and performs other functionality as described herein.
  • Although not shown in FIG. 10, a processing element may include other elements on a chip with processor 1000. For example, a processing element may include memory control logic along with processor 1000. The processing element may include I/O control logic and/or may include I/O control logic integrated with memory control logic. The processing element may also include one or more caches. In some embodiments, non-volatile memory (such as flash memory or fuses) may also be included on the chip with processor 1000.
  • FIG. 11A is a block diagram illustrating both an example in-order pipeline and an example register renaming, out-of-order issue/execution pipeline according to one or more embodiments of this disclosure. FIG. 11B is a block diagram illustrating both an example embodiment of an in-order architecture core and an example register renaming, out-of-order issue/execution architecture core to be included in a processor according to one or more embodiments of this disclosure. The solid lined boxes in FIGS. 11A-11B illustrate the in-order pipeline and in-order core, while the optional addition of the dashed lined boxes illustrates the register renaming, out-of-order issue/execution pipeline and core. Given that the in-order aspect is a subset of the out-of-order aspect, the out-of-order aspect will be described.
  • In FIG. 11A, a processor pipeline 1100 includes a fetch stage 1102, a length decode stage 1104, a decode stage 1106, an allocation stage 1108, a renaming stage 1110, a schedule (also known as a dispatch or issue) stage 1112, a register read/memory read stage 1114, an execute stage 1116, a write back/memory write stage 1118, an exception handling stage 1122, and a commit stage 1124.
  • FIG. 11B shows processor core 1190 including a front end unit 1130 coupled to an execution engine unit 1150, and both are coupled to a memory unit 1170. Processor core 1190 and memory unit 1170 are examples of the types of hardware that can be used in connection with the implementations shown and described herein. The core 1190 may be a reduced instruction set computing (RISC) core, a complex instruction set computing (CISC) core, a very long instruction word (VLIW) core, or a hybrid or alternative core type. As yet another option, the core 1190 may be a special-purpose core, such as, for example, a network or communication core, compression engine, coprocessor core, general purpose computing graphics processing unit (GPGPU) core, graphics core, or the like. In addition, processor core 1190 and its components represent example architecture that could be used to implement logical processors and their respective components.
  • The front end unit 1130 includes a branch prediction unit 1132 coupled to an instruction cache unit 1134, which is coupled to an instruction translation lookaside buffer (TLB) unit 1136, which is coupled to an instruction fetch unit 1138, which is coupled to a decode unit 1140. The decode unit 1140 (or decoder) may decode instructions, and generate as an output one or more micro-operations, micro-code entry points, microinstructions, other instructions, or other control signals, which are decoded from, or which otherwise reflect, or are derived from, the original instructions. The decode unit 1140 may be implemented using various different mechanisms. Examples of suitable mechanisms include, but are not limited to, look-up tables, hardware implementations, programmable logic arrays (PLAs), microcode read only memories (ROMs), etc. In one embodiment, the core 1190 includes a microcode ROM or other medium that stores microcode for certain macroinstructions (e.g., in decode unit 1140 or otherwise within the front end unit 1130). The decode unit 1140 is coupled to a rename/allocator unit 1152 in the execution engine unit 1150.
  • The execution engine unit 1150 includes the rename/allocator unit 1152 coupled to a retirement unit 1154 and a set of one or more scheduler unit(s) 1156. The scheduler unit(s) 1156 represents any number of different schedulers, including reservation stations, central instruction window, etc. The scheduler unit(s) 1156 is coupled to the physical register file(s) unit(s) 1158. Each of the physical register file(s) units 1158 represents one or more physical register files, different ones of which store one or more different data types, such as scalar integer, scalar floating point, packed integer, packed floating point, vector integer, vector floating point, status (e.g., an instruction pointer that is the address of the next instruction to be executed), etc. In one embodiment, the physical register file(s) unit 1158 comprises a vector registers unit, a write mask registers unit, and a scalar registers unit. These register units may provide architectural vector registers, vector mask registers, and general purpose registers (GPRs). In at least some embodiments described herein, register units 1158 are examples of the types of hardware that can be used in connection with the implementations shown and described herein (e.g., registers 112). The physical register file(s) unit(s) 1158 is overlapped by the retirement unit 1154 to illustrate various ways in which register renaming and out-of-order execution may be implemented (e.g., using a reorder buffer(s) and a retirement register file(s); using a future file(s), a history buffer(s), and a retirement register file(s); using register maps and a pool of registers; etc.). The retirement unit 1154 and the physical register file(s) unit(s) 1158 are coupled to the execution cluster(s) 1160. The execution cluster(s) 1160 includes a set of one or more execution units 1162 and a set of one or more memory access units 1164. The execution units 1162 may perform various operations (e.g., shifts, addition, subtraction, multiplication) and on various types of data (e.g., scalar floating point, packed integer, packed floating point, vector integer, vector floating point). While some embodiments may include a number of execution units dedicated to specific functions or sets of functions, other embodiments may include only one execution unit or multiple execution units that all perform all functions. Execution units 1162 may also include an address generation unit (AGU) to calculate addresses used by the core to access main memory and a page miss handler (PMH).
  • The scheduler unit(s) 1156, physical register file(s) unit(s) 1158, and execution cluster(s) 1160 are shown as being possibly plural because certain embodiments create separate pipelines for certain types of data/operations (e.g., a scalar integer pipeline, a scalar floating point/packed integer/packed floating point/vector integer/vector floating point pipeline, and/or a memory access pipeline that each have their own scheduler unit, physical register file(s) unit, and/or execution cluster—and in the case of a separate memory access pipeline, certain embodiments are implemented in which only the execution cluster of this pipeline has the memory access unit(s) 1164). It should also be understood that where separate pipelines are used, one or more of these pipelines may be out-of-order issue/execution and the rest in-order.
  • The set of memory access units 1164 is coupled to the memory unit 1170, which includes a data TLB unit 1172 coupled to a data cache unit 1174 coupled to a level 2 (L2) cache unit 1176. In one example embodiment, the memory access units 1164 may include a load unit, a store address unit, and a store data unit, each of which is coupled to the data TLB unit 1172 in the memory unit 1170. The instruction cache unit 1134 is further coupled to a level 2 (L2) cache unit 1176 in the memory unit 1170. The L2 cache unit 1176 is coupled to one or more other levels of cache and eventually to a main memory. In addition, a page miss handler may also be included in core 1190 to look up an address mapping in a page table if no match is found in the data TLB unit 1172.
  • By way of example, the example register renaming, out-of-order issue/execution core architecture may implement the pipeline 1100 as follows: 1) the instruction fetch 1138 performs the fetch and length decoding stages 1102 and 1104; 2) the decode unit 1140 performs the decode stage 1106; 3) the rename/allocator unit 1152 performs the allocation stage 1108 and renaming stage 1110; 4) the scheduler unit(s) 1156 performs the schedule stage 1112; 5) the physical register file(s) unit(s) 1158 and the memory unit 1170 perform the register read/memory read stage 1114; the execution cluster 1160 performs the execute stage 1116; 6) the memory unit 1170 and the physical register file(s) unit(s) 1158 perform the write back/memory write stage 1118; 7) various units may be involved in the exception handling stage 1122; and 8) the retirement unit 1154 and the physical register file(s) unit(s) 1158 perform the commit stage 1124.
  • The core 1190 may support one or more instructions sets (e.g., the x86 instruction set (with some extensions that have been added with newer versions); the MIPS instruction set of MIPS Technologies of Sunnyvale, Calif.; the ARM instruction set (with optional additional extensions such as NEON) of ARM Holdings of Sunnyvale, Calif.), including the instruction(s) described herein. In one embodiment, the core 1190 includes logic to support a packed data instruction set extension (e.g., AVX1, AVX2), thereby allowing the operations used by many multimedia applications to be performed using packed data.
  • It should be understood that the core may support multithreading (executing two or more parallel sets of operations or threads), and may do so in a variety of ways including time sliced multithreading, simultaneous multithreading (where a single physical core provides a logical core for each of the threads that physical core is simultaneously multithreading), or a combination thereof (e.g., time sliced fetching and decoding and simultaneous multithreading thereafter such as in the Intel® Hyperthreading technology). Accordingly, in at least some embodiments, multi-threaded enclaves may be supported.
  • While register renaming is described in the context of out-of-order execution, it should be understood that register renaming may be used in an in-order architecture. While the illustrated embodiment of the processor also includes separate instruction and data cache units 1134/1174 and a shared L2 cache unit 1176, alternative embodiments may have a single internal cache for both instructions and data, such as, for example, a Level 1 (L1) internal cache, or multiple levels of internal cache. In some embodiments, the system may include a combination of an internal cache and an external cache that is external to the core and/or the processor. Alternatively, all of the cache may be external to the core and/or the processor.
  • FIG. 12 illustrates a computing system 1200 that is arranged in a point-to-point (PtP) configuration according to an embodiment. In particular, FIG. 12 shows a system where processors, memory, and input/output devices are interconnected by a number of point-to-point interfaces. Generally, one or more of the computing systems or computing devices described herein may be configured in the same or similar manner as computing system 1200.
  • Processors 1270 and 1280 may be implemented as single core processors 1274a and 1284a or multi-core processors 1274a-1274b and 1284a-1284b. Processors 1270 and 1280 may each include a cache 1271 and 1281 used by their respective core or cores. A shared cache (not shown) may be included in either processor or outside of both processors, yet connected with the processors via a P-P interconnect, such that either or both processors' local cache information may be stored in the shared cache if a processor is placed into a low power mode.
  • Processors 1270 and 1280 may also each include integrated memory controller logic (MC) 1272 and 1282 to communicate with memory elements 1232 and 1234, which may be portions of main memory locally attached to the respective processors. In alternative embodiments, memory controller logic 1272 and 1282 may be discrete logic separate from processors 1270 and 1280. Memory elements 1232 and/or 1234 may store various data to be used by processors 1270 and 1280 in achieving operations and functionality outlined herein.
  • Processors 1270 and 1280 may be any type of processor, such as those discussed in connection with other figures. Processors 1270 and 1280 may exchange data via a point-to-point (PtP) interface 1250 using point-to-point interface circuits 1278 and 1288, respectively. Processors 1270 and 1280 may each exchange data with an input/output (I/O) subsystem 1290 via individual point-to-point interfaces 1252 and 1254 using point-to-point interface circuits 1276, 1286, 1294, and 1298. I/O subsystem 1290 may also exchange data with a high-performance graphics circuit 1238 via a high-performance graphics interface 1239, using an interface circuit 1292, which could be a PtP interface circuit. In one embodiment, the high-performance graphics circuit 1238 is a special-purpose processor, such as, for example, a high-throughput MIC processor, a network or communication processor, compression engine, graphics processor, GPGPU, embedded processor, or the like. I/O subsystem 1290 may also communicate with a display 1233 for displaying data that is viewable by a human user. In alternative embodiments, any or all of the PtP links illustrated in FIG. 12 could be implemented as a multi-drop bus rather than a PtP link.
  • I/O subsystem 1290 may be in communication with a bus 1220 via an interface circuit 1296. Bus 1220 may have one or more devices that communicate over it, such as a bus bridge 1218 and I/O devices 1216. Via a bus 1210, bus bridge 1218 may be in communication with other devices such as a user interface 1212 (such as a keyboard, mouse, touchscreen, or other input devices), communication devices 1226 (such as modems, network interface devices, or other types of communication devices that may communicate through a computer network 1260), audio I/O devices 1214, and/or a data storage device 1228. Data storage device 1228 may store code and data 1230, which may be executed by processors 1270 and/or 1280. In alternative embodiments, any portions of the bus architectures could be implemented with one or more PtP links.
  • The computer system depicted in FIG. 12 is a schematic illustration of an embodiment of a computing system that may be utilized to implement various embodiments discussed herein. It will be appreciated that various components of the system depicted in FIG. 12 may be combined in a system-on-a-chip (SoC) architecture or in any other suitable configuration capable of achieving the functionality and features of examples and implementations provided herein.
  • Although this disclosure has been described in terms of certain implementations and generally associated methods, alterations and permutations of these implementations and methods will be apparent to those skilled in the art. For example, the actions described herein can be performed in a different order than as described and still achieve the desirable results. As one example, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve the desired results. In certain implementations, multitasking and parallel processing may be advantageous. Other variations are within the scope of the following claims.
  • The architectures presented herein are provided by way of example only, and are intended to be non-exclusive and non-limiting. Furthermore, the various parts disclosed are intended to be logical divisions only, and need not necessarily represent physically separate hardware and/or software components. Certain computing systems may provide memory elements in a single physical memory device, and in other cases, memory elements may be functionally distributed across many physical devices. In the case of virtual machine managers or hypervisors, all or part of a function may be provided in the form of software or firmware running over a virtualization layer to provide the disclosed logical function.
  • Note that with the examples provided herein, interaction may be described in terms of a single computing system. However, this has been done for purposes of clarity and example only. In certain cases, it may be easier to describe one or more of the functionalities of a given set of flows by only referencing a single computing system. Moreover, the cryptographic computing system described herein is readily scalable and can be implemented across a large number of components (e.g., multiple computing systems), as well as more complicated/sophisticated arrangements and configurations. Accordingly, the examples provided should not limit the scope or inhibit the broad teachings of the computing system as potentially applied to a myriad of other architectures.
  • As used herein, unless expressly stated to the contrary, use of the phrase ‘at least one of’ refers to any combination of the named elements, conditions, or activities. For example, ‘at least one of X, Y, and Z’ is intended to mean any of the following: 1) at least one X, but not Y and not Z; 2) at least one Y, but not X and not Z; 3) at least one Z, but not X and not Y; 4) at least one X and Y, but not Z; 5) at least one X and Z, but not Y; 6) at least one Y and Z, but not X; or 7) at least one X, at least one Y, and at least one Z.
  • Additionally, unless expressly stated to the contrary, the terms ‘first’, ‘second’, ‘third’, etc., are intended to distinguish the particular nouns (e.g., element, condition, module, activity, operation, claim element, etc.) they modify, but are not intended to indicate any type of order, rank, importance, temporal sequence, or hierarchy of the modified noun. For example, ‘first X’ and ‘second X’ are intended to designate two separate X elements that are not necessarily limited by any order, rank, importance, temporal sequence, or hierarchy of the two elements.
  • References in the specification to “one embodiment,” “an embodiment,” “some embodiments,” etc., indicate that the embodiment(s) described may include a particular feature, structure, or characteristic, but every embodiment may or may not necessarily include that particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment.
  • While this specification contains many specific implementation details, these should not be construed as limitations on the scope of any embodiments or of what may be claimed, but rather as descriptions of features specific to particular embodiments. Certain features that are described in this specification in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or variation of a subcombination.
  • Similarly, the separation of various system components and modules in the embodiments described above should not be understood as requiring such separation in all embodiments. It should be understood that the described program components, modules, and systems can generally be integrated together in a single software product or packaged into multiple software products.
  • Thus, particular embodiments of the subject matter have been described. Other embodiments are within the scope of this disclosure. Numerous other changes, substitutions, variations, alterations, and modifications may be ascertained to one skilled in the art and it is intended that the present disclosure encompass all such changes, substitutions, variations, alterations, and modifications as falling within the scope of the appended claims.
  • The following examples pertain to embodiments in accordance with this specification. It will be understood that one or more aspects of certain examples described below may be combined with or implemented in certain other examples, including examples not explicitly indicated.
  • Example 1 includes a processor comprising: data cache units storing encrypted data; and a microprocessor pipeline coupled to the data cache units. The microprocessor pipeline comprises circuitry to access and execute a sequence of cryptographic-based instructions based on the encrypted data. Execution of the sequence of cryptographic-based instructions comprises at least one of: decryption of the encrypted data based on a first pointer value; execution of a cryptographic-based instruction based on data obtained from decryption of the encrypted data; encryption of a result of execution of a cryptographic-based instruction, wherein the encryption is based on a second pointer value; and storage of encrypted data in the data cache units, wherein the encrypted data stored in the data cache units is based on an encrypted result of execution of a cryptographic-based instruction.
  • Example 2 includes the subject matter of Example 1, and optionally, wherein the circuitry is further to: generate, for each cryptographic-based instruction, at least one encryption-based microoperation and at least one non-encryption-based microoperation for the cryptographic-based instruction; and schedule the at least one encryption-based microoperation and the at least one non-encryption-based microoperation for execution based on timings of the encryption-based microoperation.
  • Example 3 includes the subject matter of Example 2, and optionally, wherein the encryption-based microoperation is based on a block cipher, and the non-encryption-based microoperation is scheduled as dependent upon the encryption-based microoperation.
  • Example 4 includes the subject matter of Example 2, and optionally, wherein the encryption-based microoperation is based on a counter mode block cipher, and the non-encryption-based microoperation is scheduled to execute in parallel with encryption of a counter.
  • Example 5 includes the subject matter of Example 2, and optionally, wherein the encryption-based microoperation is one of an encryption operation and a decryption operation.
  • Example 6 includes the subject matter of Example 2, and optionally, wherein the non-encryption-based microoperation is one of a load operation and a store operation.
  • Example 7 includes the subject matter of any one of Examples 1-6, and optionally, wherein the circuitry is to decrypt the encrypted data by using the first pointer value as an input to a decryption function.
  • Example 8 includes the subject matter of Example 7, and optionally, wherein the circuitry to decrypt the encrypted data is in a load buffer of the processor.
  • Example 9 includes the subject matter of Example 7, and optionally, wherein the circuitry is to decrypt the encrypted data further by: generating a key stream based on the first pointer value and a counter value; and performing an XOR operation on the key stream and the encrypted data to yield the decrypted data.
  • Example 10 includes the subject matter of any one of Examples 1-6, and optionally, wherein the circuitry is to encrypt the result of the execution of the cryptographic-based instruction by using the second pointer value as an input to an encryption function.
  • Example 11 includes the subject matter of Example 10, and optionally, wherein the circuitry to encrypt the result of the execution of the cryptographic-based instruction is in a store buffer of the processor.
  • Example 12 includes the subject matter of any one of Examples 1-6, and optionally, wherein at least one of the first pointer value and the second pointer value is an effective address based on an encoded linear address that is at least partially encrypted, and the circuitry is further to: access the encoded linear address; decrypt an encrypted portion of the encoded linear address based on a key obtained from a register of the processor; and generate the effective address based on a result of the decryption of the encrypted portion of the encoded linear address. (A behavioral sketch of this effective-address generation appears after the claims below.)
  • Example 13 includes the subject matter of Example 12, and optionally, wherein the entirety of the encoded linear address is encrypted.
  • Example 14 includes the subject matter of Example 12, and optionally, wherein the circuitry to decrypt the encoded linear address is in an address generation unit of the processor.
  • Example 15 includes a method comprising: accessing a sequence of cryptographic-based instructions to execute on encrypted data stored in data cache units of a processor; and executing the sequence of cryptographic-based instructions by a core of the processor, wherein execution comprises one or more of: decryption of the encrypted data based on a first pointer value; execution of a cryptographic-based instruction based on data obtained from decryption of the encrypted data; encryption of a result of execution of a cryptographic-based instruction, wherein the encryption is based on a second pointer value; and storage of encrypted data in the data cache units, wherein the encrypted data stored in the data cache units is based on an encrypted result of execution of a cryptographic-based instruction.
  • Example 16 includes the subject matter of Example 15, and optionally, wherein executing the sequence of cryptographic-based instructions comprises: generating, for each cryptographic-based instruction, at least one encryption-based microoperation and at least one non-encryption-based microoperation for the cryptographic-based instruction; scheduling the at least one encryption-based microoperation and the at least one non-encryption-based microoperation for execution based on timings of the encryption-based microoperation; and executing the scheduled microoperations.
  • Example 17 includes the subject matter of Example 16, and optionally, wherein the encryption-based microoperation is based on a block cipher, and the non-encryption-based microoperation is scheduled as dependent upon the encryption-based microoperation.
  • Example 18 includes the subject matter of Example 16, and optionally, wherein the encryption-based microoperation is based on a counter mode block cipher, and the non-encryption-based microoperation is scheduled to execute in parallel with encryption of a counter.
  • Example 19 includes the subject matter of Example 16, and optionally, wherein the encryption-based microoperation is one of an encryption operation and a decryption operation, and the non-encryption-based microoperation is one of a load operation and a store operation.
  • Example 20 includes the subject matter of Example 19, and optionally, wherein the encryption operation and decryption operation each utilize a pointer value as a tweak input.
  • Example 21 includes the subject matter of any one of Examples 16-20, and optionally, wherein the decryption is performed by circuitry coupled to, or implemented in, a load buffer of the processor.
  • Example 22 includes the subject matter of any one of Examples 16-20, and optionally, wherein the encryption is performed by circuitry coupled to, or implemented in, a store buffer of the processor.
  • Example 23 includes the subject matter of any one of Examples 16-20, and optionally, wherein decrypting the encrypted data comprises: generating a key stream based on the first pointer value and a counter value; and performing an XOR operation on the key stream and the encrypted data to yield the decrypted data.
  • Example 24 includes the subject matter of any one of Examples 16-20, and optionally, wherein at least one of the first pointer value and the second pointer value is an effective address based on an encoded linear address that is at least partially encrypted, and the method further comprises: accessing the encoded linear address; decrypting an encrypted portion of the encoded linear address based on a key obtained from a register of the processor; and generating the effective address based on a result of the decryption of the encrypted portion of the encoded linear address.
  • Example 25 includes the subject matter of Example 24, and optionally, wherein the entirety of the encoded linear address is encrypted.
  • Example 26 includes the subject matter of Example 24, and optionally, wherein the decryption of the encoded linear address is performed by an address generation unit of the processor.
  • Example 27 includes a system comprising: memory storing cryptographic-based instructions, and a processor coupled to the memory. The processor comprises: data cache units storing encrypted data; means for accessing the cryptographic-based instructions, the cryptographic-based instructions to execute based on the encrypted data; means for decrypting the encrypted data based on a first pointer value; means for executing the cryptographic-based instruction using the decrypted data; means for encrypting a result of the execution of the cryptographic-based instruction based on a second pointer value; and means for storing the encrypted result in the data cache units.
  • Example 28 includes the subject matter of Example 27, and optionally, wherein the means for decrypting the encrypted data comprises a load buffer of the processor.
  • Example 29 includes the subject matter of Example 27, and optionally, wherein the means for encrypting a result of the execution of the cryptographic-based instruction comprises a store buffer of the processor.
  • Example 30 includes the subject matter of any one of Examples 27-29, and optionally, wherein at least one of the first pointer value and the second pointer value is an effective address based on an encoded linear address that is at least partially encrypted, and the processor further comprises additional means for: accessing the encoded linear address; decrypting an encrypted portion of the encoded linear address based on a key obtained from a register of the processor; and generating the effective address based on a result of the decryption of the encrypted portion of the encoded linear address.
  • Example 31 includes the subject matter of Example 30, and optionally, wherein the additional means comprises an address generation unit of the processor.
  • Example 32 includes a processor core supporting the encryption and decryption of pointers, keys, and data in the core, wherein such encryption and decryption operations are performed by logic and circuitry that is part of the processor microarchitecture pipeline.
  • Example 33 includes the subject matter of Example 32, and optionally, wherein instructions that perform encrypted memory loads and stores are mapped into at least one block encryption μop and at least one regular load/store μop.
  • Example 34 includes the subject matter of Example 32, and optionally, wherein an in-order or out-of-order execution scheduler schedules the execution of encryption, decryption, and load/store μops, and wherein load and store μops are considered dependent on one of a block encryption μop and a block decryption μop.
  • Example 35 includes the subject matter of Example 34, and optionally, wherein the out-of-order execution scheduler may schedule load and store μops to execute in parallel with the encryption of a counter.
  • Example 36 includes the subject matter of Example 32, and optionally, wherein decryption of data is tweaked by a pointer and the decryption takes place in the load buffer.
  • Example 37 includes the subject matter of Example 32, and optionally, wherein encryption of data is tweaked by a pointer and the encryption takes place in the store buffer.
  • Example 38 includes the subject matter of Example 32, and optionally, wherein decryption of a pointer takes place in the address generation unit.
  • Example 39 includes the subject matter of Example 32, and optionally, wherein decryption of a slice of a base takes place in the address generation unit.
  • Example 40 may include a device comprising logic, modules, circuitry, or other means to perform one or more elements of a method described in or related to any of the examples above or any other method or process described herein.
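The counter-mode data path described in Examples 9 and 23 (and in claims 10 and 20 below) derives a key stream from the pointer value and a counter, then XORs that key stream with the ciphertext held in the data cache units. The C sketch below is a minimal behavioral model of that path under stated assumptions, not the patented load-buffer or store-buffer circuitry: the 128-bit block size, the pointer/counter packing, and the block_encrypt stand-in (a toy mixing function, not a real cipher) exist only so the example compiles and runs.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>
#include <stdio.h>

#define BLOCK_BYTES 16  /* assumed 128-bit block size */

/* Toy stand-in for the hardware block cipher -- NOT a real cipher; it only
 * exists so this sketch is self-contained and runnable. */
static void block_encrypt(const uint8_t key[BLOCK_BYTES],
                          const uint8_t in[BLOCK_BYTES],
                          uint8_t out[BLOCK_BYTES]) {
    for (int i = 0; i < BLOCK_BYTES; i++)
        out[i] = (uint8_t)(in[i] ^ key[i] ^ (uint8_t)(0x9E * (i + 1)));
}

/* One key-stream block derived from the pointer (used as a tweak) and a
 * per-block counter, as described in Examples 9 and 23. */
static void keystream_block(const uint8_t key[BLOCK_BYTES], uint64_t pointer,
                            uint64_t counter, uint8_t ks[BLOCK_BYTES]) {
    uint8_t ctr_block[BLOCK_BYTES] = {0};
    memcpy(ctr_block, &pointer, sizeof pointer);      /* pointer value as tweak */
    memcpy(ctr_block + 8, &counter, sizeof counter);  /* per-block counter      */
    block_encrypt(key, ctr_block, ks);
}

/* XOR the key stream with data moving to or from the data cache unit.
 * Because XOR is its own inverse, the same routine models both the
 * store-side encryption and the load-side decryption. */
static void ctr_crypt(const uint8_t key[BLOCK_BYTES], uint64_t pointer,
                      uint8_t *data, size_t len) {
    for (size_t off = 0; off < len; off += BLOCK_BYTES) {
        uint8_t ks[BLOCK_BYTES];
        keystream_block(key, pointer, off / BLOCK_BYTES, ks);
        for (size_t i = 0; i < BLOCK_BYTES && off + i < len; i++)
            data[off + i] ^= ks[i];
    }
}

int main(void) {
    uint8_t key[BLOCK_BYTES] = {0x13, 0x57, 0x9b};  /* illustrative key     */
    uint64_t ptr = 0x00007ffc12345678ull;           /* illustrative pointer */
    uint8_t buf[32] = "cryptographic computing demo";
    ctr_crypt(key, ptr, buf, sizeof buf);           /* model of store path  */
    ctr_crypt(key, ptr, buf, sizeof buf);           /* model of load path   */
    printf("%s\n", (char *)buf);                    /* round-trips cleanly  */
    return 0;
}
```

Because the key stream can be computed while the memory access is in flight, this structure also illustrates why, in Examples 4 and 35, the load or store μop may be scheduled in parallel with the encryption of the counter.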

Claims (25)

What is claimed is:
1. A processor comprising:
a register to store an encryption key; and
address generation circuitry to:
obtain a pointer representing a linear address to be accessed by a read or write operation, the pointer being at least partially encrypted;
obtain the key from the register and a context value;
decrypt the encrypted portion of the pointer using the key and the context value as a tweak input; and
generate an effective address for use in the read or write operation based on an output of the decryption.
2. The processor of claim 1, wherein the context value is obtained from another register of the processor.
3. The processor of claim 1, wherein the context value is obtained from bits of the pointer.
4. The processor of claim 1, wherein the context value is obtained from memory.
5. The processor of claim 1, wherein the pointer comprises an encrypted base address, plaintext upper bits, and a plaintext offset, and the address generation circuitry is to generate the effective address by:
decrypting the encrypted base address portion to yield a decrypted base address; and
combining the decrypted base address, the upper bits, and the offset.
6. The processor of claim 5, wherein the address generation circuitry is to generate the effective address by:
concatenating the decrypted base address with a set of complementary upper bits and the offset to yield an intermediate base address; and
combining the upper bits with the intermediate base address.
7. The processor of claim 6, wherein the address generation circuitry is to combine the upper bits with the intermediate base address using one or more of an XOR, ADD, or logical AND function.
8. The processor of claim 1, further comprising:
a data cache unit storing encrypted data; and
memory access circuitry to:
access the encrypted data stored in the data cache unit; and
decrypt the encrypted data based on the key and the effective address.
9. The processor of claim 8, wherein the effective address is used as a tweak input to the decryption.
10. The processor of claim 8, wherein the circuitry is to decrypt the encrypted data by:
generating a key stream based on the effective address and a counter value; and
performing an XOR operation on the key stream and the encrypted data to yield decrypted data.
11. A method comprising:
obtaining a pointer representing a linear address to be accessed by a read or write operation, the pointer being at least partially encrypted;
obtaining a key from a processor register and a context value;
decrypting the encrypted portion of the pointer using the key and the context value as a tweak input; and
generating an effective address for use in the read or write operation based on an output of the decryption.
12. The method of claim 11, wherein the context value is obtained from another processor register.
13. The method of claim 11, wherein the context value is obtained from bits of the pointer.
14. The method of claim 11, wherein the context value is obtained from memory.
15. The method of claim 11, wherein the pointer comprises an encrypted base address, plaintext upper bits, and a plaintext offset, and generating the effective address comprises:
decrypting the encrypted base address portion to yield a decrypted base address; and
combining the decrypted base address, the upper bits, and the offset.
16. The method of claim 15, wherein generating the effective address comprises:
concatenating the decrypted base address with a set of complementary upper bits and the offset to yield an intermediate base address; and
combining the upper bits with the intermediate base address.
17. The method of claim 16, wherein combining the upper bits with the intermediate base address comprises using one or more of an XOR, ADD, or logical AND function.
18. The method of claim 11, further comprising:
accessing encrypted data stored in a data cache unit; and
decrypting the encrypted data based on the key and the effective address.
19. The method of claim 18, wherein the effective address is used as a tweak input to the decryption.
20. The method of claim 18, wherein decrypting the encrypted data comprises:
generating a key stream based on the effective address and a counter value; and
performing an XOR operation on the key stream and the encrypted data to yield decrypted data.
21. A system comprising:
memory; and
a processor coupled to the memory, the processor comprising:
a register to store an encryption key; and
address generation circuitry to:
obtain a pointer representing a linear address to be accessed by a read or write instruction stored in the memory, the pointer being at least partially encrypted;
obtain the key from the register and a context value;
decrypt the encrypted portion of the pointer using the key and the context value as a tweak input; and
generate an effective address for use in the read or write operation based on an output of the decryption.
22. The system of claim 21, wherein the pointer comprises an encrypted base address, plaintext upper bits, and a plaintext offset, and the address generation circuitry is to generate the effective address by:
decrypting the encrypted base address portion to yield a decrypted base address; and
combining the decrypted base address, the upper bits, and the offset.
23. The system of claim 22, wherein the address generation circuitry is to generate the effective address by:
concatenating the decrypted base address with a set of complementary upper bits and the offset to yield an intermediate base address; and
combining the upper bits with the intermediate base address.
24. The system of claim 23, wherein the address generation circuitry is to combine the upper bits with the intermediate base address using one or more of an XOR, ADD, or logical AND function.
25. The system of claim 21, wherein the processor further comprises memory access circuitry to:
access encrypted data stored in the memory; and
decrypt the encrypted data based on the key and the effective address.
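The address generation described in claims 1 and 5-7 (and in Example 12 above) decrypts the encrypted base-address slice of the pointer, using the key from a processor register with the context value as a tweak, and then recombines the result with the plaintext upper bits and offset. The C model below is a behavioral sketch only, not the claimed circuitry: the 16/32/16-bit field split, the slice_decrypt stand-in (a toy function, not a real tweakable cipher), and the choice of XOR for the final combination are assumptions made for illustration.

```c
#include <stdint.h>
#include <stdio.h>

/* Toy stand-in for the tweakable slice cipher applied to the encrypted base
 * address -- NOT a real cipher; included only so the sketch runs. */
static uint32_t slice_decrypt(uint32_t ciphertext, uint64_t key, uint64_t tweak) {
    return ciphertext ^ (uint32_t)(key ^ tweak ^ (key >> 32) ^ (tweak >> 32));
}

/* Assumed 64-bit pointer layout (for illustration only; the claims do not
 * fix these widths):
 *   [63:48] plaintext upper bits
 *   [47:16] encrypted base-address slice
 *   [15:0]  plaintext offset */
uint64_t generate_effective_address(uint64_t encoded_pointer,
                                    uint64_t key, uint64_t context) {
    uint64_t upper     = (encoded_pointer >> 48) & 0xFFFFu;
    uint32_t enc_slice = (uint32_t)((encoded_pointer >> 16) & 0xFFFFFFFFu);
    uint64_t offset    =  encoded_pointer & 0xFFFFu;

    /* Decrypt the encrypted base slice; the key comes from a processor
     * register and the context value serves as the tweak (claim 1). */
    uint64_t base = slice_decrypt(enc_slice, key, context);

    /* Concatenate the decrypted base with complementary (here zero) upper
     * bits and the offset to yield an intermediate base address (claim 6). */
    uint64_t intermediate = (base << 16) | offset;

    /* Combine the plaintext upper bits with the intermediate base address;
     * claim 7 permits an XOR, ADD, or logical AND -- XOR is shown. */
    return intermediate ^ (upper << 48);
}

int main(void) {
    uint64_t key = 0x0123456789abcdefull, context = 0x42;
    uint64_t pointer = 0x7ffc00000000beefull;  /* illustrative encoded pointer */
    printf("effective address: 0x%016llx\n",
           (unsigned long long)generate_effective_address(pointer, key, context));
    return 0;
}
```

Swapping the final XOR for an ADD or logical AND, as claim 7 equally permits, leaves the rest of the sketch unchanged.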
US17/576,533 2019-06-29 2022-01-14 Microprocessor pipeline circuitry to support cryptographic computing Abandoned US20220138329A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US17/576,533 US20220138329A1 (en) 2019-06-29 2022-01-14 Microprocessor pipeline circuitry to support cryptographic computing
US17/878,322 US20220382885A1 (en) 2019-06-29 2022-08-01 Cryptographic computing using encrypted base addresses and used in multi-tenant environments

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US201962868884P 2019-06-29 2019-06-29
US16/724,105 US11321469B2 (en) 2019-06-29 2019-12-20 Microprocessor pipeline circuitry to support cryptographic computing
US17/576,533 US20220138329A1 (en) 2019-06-29 2022-01-14 Microprocessor pipeline circuitry to support cryptographic computing

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US16/724,105 Continuation US11321469B2 (en) 2019-06-29 2019-12-20 Microprocessor pipeline circuitry to support cryptographic computing

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/723,977 Continuation-In-Part US11354423B2 (en) 2019-06-29 2019-12-20 Cryptographic isolation of memory compartments in a computing environment

Publications (1)

Publication Number Publication Date
US20220138329A1 true US20220138329A1 (en) 2022-05-05

Family

ID=70159013

Family Applications (11)

Application Number Title Priority Date Filing Date
US16/709,612 Active 2040-07-09 US11580234B2 (en) 2019-06-29 2019-12-10 Implicit integrity for cryptographic computing
US16/723,927 Active 2040-01-06 US11308225B2 (en) 2019-06-29 2019-12-20 Management of keys for use in cryptographic computing
US16/722,707 Active 2041-01-16 US11416624B2 (en) 2019-06-29 2019-12-20 Cryptographic computing using encrypted base addresses and used in multi-tenant environments
US16/724,105 Active 2040-01-14 US11321469B2 (en) 2019-06-29 2019-12-20 Microprocessor pipeline circuitry to support cryptographic computing
US16/723,977 Active 2040-02-10 US11354423B2 (en) 2019-06-29 2019-12-20 Cryptographic isolation of memory compartments in a computing environment
US16/722,342 Active 2041-08-16 US11829488B2 (en) 2019-06-29 2019-12-20 Pointer based data encryption
US16/723,871 Active 2042-04-08 US11768946B2 (en) 2019-06-29 2019-12-20 Low memory overhead heap management for memory tagging
US16/724,026 Active 2041-04-17 US11620391B2 (en) 2019-06-29 2019-12-20 Data encryption based on immutable pointers
US17/576,533 Abandoned US20220138329A1 (en) 2019-06-29 2022-01-14 Microprocessor pipeline circuitry to support cryptographic computing
US17/833,515 Pending US20220300626A1 (en) 2019-06-29 2022-06-06 Cryptographic isolation of memory compartments in a computing environment
US18/499,133 Pending US20240061943A1 (en) 2019-06-29 2023-10-31 Pointer based data encryption

Family Applications Before (8)

Application Number Title Priority Date Filing Date
US16/709,612 Active 2040-07-09 US11580234B2 (en) 2019-06-29 2019-12-10 Implicit integrity for cryptographic computing
US16/723,927 Active 2040-01-06 US11308225B2 (en) 2019-06-29 2019-12-20 Management of keys for use in cryptographic computing
US16/722,707 Active 2041-01-16 US11416624B2 (en) 2019-06-29 2019-12-20 Cryptographic computing using encrypted base addresses and used in multi-tenant environments
US16/724,105 Active 2040-01-14 US11321469B2 (en) 2019-06-29 2019-12-20 Microprocessor pipeline circuitry to support cryptographic computing
US16/723,977 Active 2040-02-10 US11354423B2 (en) 2019-06-29 2019-12-20 Cryptographic isolation of memory compartments in a computing environment
US16/722,342 Active 2041-08-16 US11829488B2 (en) 2019-06-29 2019-12-20 Pointer based data encryption
US16/723,871 Active 2042-04-08 US11768946B2 (en) 2019-06-29 2019-12-20 Low memory overhead heap management for memory tagging
US16/724,026 Active 2041-04-17 US11620391B2 (en) 2019-06-29 2019-12-20 Data encryption based on immutable pointers

Family Applications After (2)

Application Number Title Priority Date Filing Date
US17/833,515 Pending US20220300626A1 (en) 2019-06-29 2022-06-06 Cryptographic isolation of memory compartments in a computing environment
US18/499,133 Pending US20240061943A1 (en) 2019-06-29 2023-10-31 Pointer based data encryption

Country Status (3)

Country Link
US (11) US11580234B2 (en)
EP (7) EP3757833B1 (en)
CN (7) CN112149188A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11829488B2 (en) 2019-06-29 2023-11-28 Intel Corporation Pointer based data encryption

Families Citing this family (87)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9378560B2 (en) * 2011-06-17 2016-06-28 Advanced Micro Devices, Inc. Real time on-chip texture decompression using shader processors
US11212097B2 (en) * 2015-05-17 2021-12-28 Gideon Samid Mixed unary cryptography
US11843597B2 (en) * 2016-05-18 2023-12-12 Vercrio, Inc. Automated scalable identity-proofing and authentication process
WO2019120038A1 (en) * 2017-12-18 2019-06-27 北京三快在线科技有限公司 Encrypted storage of data
US20210026934A1 (en) 2018-02-02 2021-01-28 Dover Microsystems, Inc. Systems and methods for policy linking and/or loading for secure initialization
WO2019213061A1 (en) 2018-04-30 2019-11-07 Dover Microsystems, Inc. Systems and methods for checking safety properties
US10860709B2 (en) 2018-06-29 2020-12-08 Intel Corporation Encoded inline capabilities
US11106441B2 (en) * 2018-09-14 2021-08-31 Microsoft Technology Licensing, Llc Secure device-bound edge workload delivery
EP3877874A1 (en) 2018-11-06 2021-09-15 Dover Microsystems, Inc. Systems and methods for stalling host processor
US11841956B2 (en) 2018-12-18 2023-12-12 Dover Microsystems, Inc. Systems and methods for data lifecycle protection
US11250165B2 (en) 2019-12-20 2022-02-15 Intel Corporation Binding of cryptographic operations to context or speculative execution restrictions
US11403234B2 (en) 2019-06-29 2022-08-02 Intel Corporation Cryptographic computing using encrypted base addresses and used in multi-tenant environments
US11575504B2 (en) 2019-06-29 2023-02-07 Intel Corporation Cryptographic computing engine for memory load and store units of a microarchitecture pipeline
GB2601666B (en) * 2019-08-06 2023-04-26 Ictk Holdings Co Ltd Processor, processor operation method and electronic device comprising same
US11386101B2 (en) * 2019-08-08 2022-07-12 Cisco Technology, Inc. Systems and methods for fuzzy search without full text
US11546271B2 (en) 2019-08-09 2023-01-03 Oracle International Corporation System and method for tag based request context in a cloud infrastructure environment
US20210042165A1 (en) * 2019-08-09 2021-02-11 Oracle International Corporation System and method for supporting a quota policy language in a cloud infrastructure environment
US20210049036A1 (en) * 2019-08-13 2021-02-18 Facebook Technologies, Llc Capability Space
US11411938B2 (en) 2019-08-19 2022-08-09 Red Hat, Inc. Proof-of-work key wrapping with integrated key fragments
US11316839B2 (en) 2019-08-19 2022-04-26 Red Hat, Inc. Proof-of-work key wrapping for temporally restricting data access
US11303437B2 (en) 2019-08-19 2022-04-12 Red Hat, Inc. Proof-of-work key wrapping with key thresholding
US11424920B2 (en) 2019-08-19 2022-08-23 Red Hat, Inc. Proof-of-work key wrapping for cryptographically controlling data access
US11436352B2 (en) 2019-08-19 2022-09-06 Red Hat, Inc. Proof-of-work key wrapping for restricting data execution based on device capabilities
US11271734B2 (en) * 2019-08-19 2022-03-08 Red Hat, Inc. Proof-of-work key wrapping for verifying device capabilities
US11411728B2 (en) 2019-08-19 2022-08-09 Red Hat, Inc. Proof-of-work key wrapping with individual key fragments
US11294715B2 (en) 2019-08-28 2022-04-05 Marvell Asia Pte, Ltd. System and method for queuing work within a virtualized scheduler based on in-unit accounting of in-unit entries
US11681806B2 (en) * 2019-10-15 2023-06-20 International Business Machines Corporation Protecting against out-of-bounds buffer references
US11263310B2 (en) * 2019-11-26 2022-03-01 Red Hat, Inc. Using a trusted execution environment for a proof-of-work key wrapping scheme that verifies remote device capabilities
US11520878B2 (en) * 2019-11-26 2022-12-06 Red Hat, Inc. Using a trusted execution environment for a proof-of-work key wrapping scheme that restricts execution based on device capabilities
US11176058B2 (en) * 2020-01-22 2021-11-16 Arm Limited Address decryption for memory storage
US11216366B2 (en) * 2020-02-13 2022-01-04 Intel Corporation Security check systems and methods for memory allocations
WO2021162439A1 (en) * 2020-02-14 2021-08-19 Samsung Electronics Co., Ltd. Electronic device performing restoration on basis of comparison of constant value and control method thereof
US11249976B1 (en) 2020-02-18 2022-02-15 Wells Fargo Bank, N.A. Data structures for computationally efficient data promulgation among devices in decentralized networks
US11500981B2 (en) * 2020-03-24 2022-11-15 Microsoft Technology Licensing, Llc Shadow stack enforcement range for dynamic code
US11379579B2 (en) * 2020-03-24 2022-07-05 Microsoft Technology Licensing, Llc Shadow stack violation enforcement at module granularity
US11861364B2 (en) * 2020-03-24 2024-01-02 Microsoft Technology Licensing, Llc Circular shadow stack in audit mode
US11429580B2 (en) 2020-06-25 2022-08-30 Intel Corporation Collision-free hashing for accessing cryptographic computing metadata and for cache expansion
US11070621B1 (en) * 2020-07-21 2021-07-20 Cisco Technology, Inc. Reuse of execution environments while guaranteeing isolation in serverless computing
WO2022051189A1 (en) * 2020-09-01 2022-03-10 Intel Corporation Creating, using, and managing protected cryptography keys
US11494356B2 (en) * 2020-09-23 2022-11-08 Salesforce.Com, Inc. Key permission distribution
US11816228B2 (en) * 2020-09-25 2023-11-14 Advanced Micro Devices, Inc. Metadata tweak for channel encryption differentiation
US11928472B2 (en) 2020-09-26 2024-03-12 Intel Corporation Branch prefetch mechanisms for mitigating frontend branch resteers
US11886332B2 (en) 2020-10-30 2024-01-30 Universitat Politecnica De Valencia Dynamic memory allocation methods and systems
CN112492580B (en) * 2020-11-25 2023-08-18 北京小米移动软件有限公司 Information processing method and device, communication equipment and storage medium
US11604740B2 (en) * 2020-12-01 2023-03-14 Capital One Services, Llc Obfuscating cryptographic material in memory
US11797713B2 (en) 2020-12-16 2023-10-24 International Business Machines Corporation Systems and methods for dynamic control of a secure mode of operation in a processor
US20220197822A1 (en) * 2020-12-23 2022-06-23 Intel Corporation 64-bit virtual addresses having metadata bit(s) and canonicality check that does not fail due to non-canonical values of metadata bit(s)
WO2022133860A1 (en) * 2020-12-24 2022-06-30 Intel Corporation Key management for crypto processors attached to other processing units
WO2022139850A1 (en) * 2020-12-26 2022-06-30 Intel Corporation Cryptographic computing including enhanced cryptographic addresses
US20210117341A1 (en) * 2020-12-26 2021-04-22 Intel Corporation Cache line slot level encryption based on context information
US11669625B2 (en) 2020-12-26 2023-06-06 Intel Corporation Data type based cryptographic computing
US20210120077A1 (en) * 2020-12-26 2021-04-22 Intel Corporation Multi-tenant isolated data regions for collaborative platform architectures
US11755500B2 (en) 2020-12-26 2023-09-12 Intel Corporation Cryptographic computing with disaggregated memory
US11625337B2 (en) 2020-12-26 2023-04-11 Intel Corporation Encoded pointer based data encryption
US11580035B2 (en) 2020-12-26 2023-02-14 Intel Corporation Fine-grained stack protection using cryptographic computing
CN112738219B (en) * 2020-12-28 2022-06-10 中国第一汽车股份有限公司 Program running method, program running device, vehicle and storage medium
EP4248323A1 (en) * 2021-02-12 2023-09-27 Huawei Technologies Co., Ltd. Low overhead active mitigation of security vulnerabilities by memory tagging
US20220261509A1 (en) * 2021-02-13 2022-08-18 Intel Corporation Region-based deterministic memory safety
US11223489B1 (en) 2021-02-23 2022-01-11 Garantir LLC Advanced security control implementation of proxied cryptographic keys
EP4060537A1 (en) * 2021-03-17 2022-09-21 Secure Thingz Limited A method and system for securely provisioning electronic devices
US11218317B1 (en) 2021-05-28 2022-01-04 Garantir LLC Secure enclave implementation of proxied cryptographic keys
US11418329B1 (en) 2021-05-28 2022-08-16 Garantir LLC Shared secret implementation of proxied cryptographic keys
US11868275B2 (en) 2021-06-24 2024-01-09 International Business Machines Corporation Encrypted data processing design including local buffers
US20220414270A1 (en) * 2021-06-24 2022-12-29 International Business Machines Corporation Encrypted data processing design including cleartext register files
US20230029331A1 (en) * 2021-07-26 2023-01-26 Microsoft Technology Licensing, Llc Dynamically allocatable physically addressed metadata storage
WO2023025370A1 (en) * 2021-08-24 2023-03-02 Huawei Technologies Co., Ltd. Control flow integrity
WO2023034586A1 (en) * 2021-09-03 2023-03-09 Dover Microsystems, Inc. Systems and methods for on-demand loading of metadata
US11502827B1 (en) 2021-09-03 2022-11-15 Garantir LLC Exporting remote cryptographic keys
JP2023039697A (en) 2021-09-09 2023-03-22 キオクシア株式会社 memory system
US11372969B1 (en) * 2021-09-17 2022-06-28 Polyverse Corporation Randomized canary and shadow stack for JIT-ROP defense
US20220100911A1 (en) * 2021-12-10 2022-03-31 Intel Corporation Cryptographic computing with legacy peripheral devices
US20220114285A1 (en) * 2021-12-22 2022-04-14 Intel Corporation Data oblivious cryptographic computing
EP4207679A1 (en) * 2021-12-31 2023-07-05 G-Innovations Viet Nam Joint Stock Company Method, mobile equipment, and system for keystream protection
CN114357488B (en) * 2022-01-04 2022-09-16 深圳市智百威科技发展有限公司 Data encryption system and method
US20230251782A1 (en) * 2022-02-10 2023-08-10 Macronix International Co., Ltd. Memory device and associated control method
WO2023164167A2 (en) * 2022-02-25 2023-08-31 Cryptography Research, Inc. Techniques and devices for configurable memory encryption and authentication
US20220179949A1 (en) * 2022-02-28 2022-06-09 Intel Corporation Compiler-directed selection of objects for capability protection
US20220207133A1 (en) 2022-03-16 2022-06-30 Intel Corporation Cryptographic enforcement of borrow checking across groups of pointers
US11836094B2 (en) * 2022-03-21 2023-12-05 Intel Corporation Cryptographic data objects page conversion
US11789737B2 (en) 2022-03-24 2023-10-17 Intel Corporation Capability-based stack protection for software fault isolation
US20220222183A1 (en) * 2022-03-25 2022-07-14 Intel Corporation Tagless implicit integrity with multi-perspective pattern search
CN114968088B (en) * 2022-04-08 2023-09-05 中移互联网有限公司 File storage method, file reading method and device
WO2023212149A1 (en) * 2022-04-28 2023-11-02 Dover Microsystems, Inc. Systems and methods for enforcing encoded policies
US11949593B2 (en) * 2022-05-10 2024-04-02 Cisco Technology, Inc. Stateless address translation at an autonomous system (AS) boundary for host privacy
TWI816456B (en) * 2022-06-30 2023-09-21 新唐科技股份有限公司 Cipher device and cipher method thereof
EP4325387A1 (en) 2022-08-19 2024-02-21 Steen Harbach AG Method for providing a digital key
US20240104013A1 (en) * 2022-09-28 2024-03-28 Intel Corporation Deterministic adjacent overflow detection for slotted memory pointers

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120284461A1 (en) * 2011-05-03 2012-11-08 Qualcomm Incorporated Methods and Apparatus for Storage and Translation of Entropy Encoded Software Embedded within a Memory Hierarchy
US20160092702A1 (en) * 2014-09-26 2016-03-31 David M. Durham Cryptographic ponter address encoding
US20160094552A1 (en) * 2014-09-26 2016-03-31 David M. Durham Creating stack position dependent cryptographic return address to mitigate return oriented programming attacks
US20180095899A1 (en) * 2016-10-01 2018-04-05 Intel Corporation Multi-crypto-color-group vm/enclave memory integrity method and apparatus
US20200380140A1 (en) * 2019-05-31 2020-12-03 Nxp B.V. Probabilistic memory safety using cryptography

Family Cites Families (100)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6570989B1 (en) 1998-04-27 2003-05-27 Matsushita Electric Industrial Co., Ltd. Cryptographic processing apparatus, cryptographic processing method, and storage medium storing cryptographic processing program for realizing high-speed cryptographic processing without impairing security
WO2002003605A1 (en) 2000-07-04 2002-01-10 Koninklijke Philips Electronics N.V. Substitution-box for symmetric-key ciphers
US7684565B2 (en) 2001-01-16 2010-03-23 General Instrument Corporation System for securely communicating information packets
JP4199937B2 (en) 2001-03-06 2008-12-24 株式会社日立製作所 Anti-tamper encryption method
US7043017B2 (en) 2001-09-13 2006-05-09 Freescale Semiconductor, Inc. Key stream cipher device
US6792551B2 (en) * 2001-11-26 2004-09-14 Intel Corporation Method and apparatus for enabling a self suspend mode for a processor
US6694401B2 (en) * 2001-12-28 2004-02-17 Intel Corporation Method and apparatus for executing real-mode interrupts from within extended SMRAM handler
US20030149869A1 (en) 2002-02-01 2003-08-07 Paul Gleichauf Method and system for securely storing and trasmitting data by applying a one-time pad
US7793286B2 (en) * 2002-12-19 2010-09-07 Intel Corporation Methods and systems to manage machine state in virtual machine operations
US7225298B2 (en) 2003-04-11 2007-05-29 Sun Microsystems, Inc. Multi-node computer system in which networks in different nodes implement different conveyance modes
WO2004092968A2 (en) 2003-04-11 2004-10-28 Sun Microsystems, Inc. Multi-node system with global access states
US20070152854A1 (en) * 2005-12-29 2007-07-05 Drew Copley Forgery detection using entropy modeling
JP2007235323A (en) 2006-02-28 2007-09-13 Toshiba Corp Storing/recording method of high confidential information, reproducer utilizing high confidential information, and memory for storing high confidential information
EP1870829B1 (en) 2006-06-23 2014-12-03 Microsoft Corporation Securing software by enforcing data flow integrity
US20080080708A1 (en) 2006-09-29 2008-04-03 Mcalister Donald Kent Key wrapping system and method using encryption
US8155308B1 (en) 2006-10-10 2012-04-10 Marvell International Ltd. Advanced encryption system hardware architecture
US7907723B2 (en) 2006-10-11 2011-03-15 Frank Rubin Device, system and method for fast secure message encryption without key distribution
EP2080314A2 (en) 2006-10-25 2009-07-22 Spyrus, Inc. Method and system for deploying advanced cryptographic algorithms
US7761676B2 (en) 2006-12-12 2010-07-20 Intel Corporation Protecting memory by containing pointer accesses
US20080263117A1 (en) * 2007-04-23 2008-10-23 Gregory Gordon Rose Initial seed management for pseudorandom number generator
US8085934B1 (en) 2007-07-11 2011-12-27 Marvell International Ltd. Reverse cryptographic key expansion
US9424315B2 (en) 2007-08-27 2016-08-23 Teradata Us, Inc. Methods and systems for run-time scheduling database operations that are executed in hardware
JP5044848B2 (en) 2007-12-04 2012-10-10 剣 竜沢 Pi ++ stream cipher encryption method and decryption method, and encryption and decryption algorithm based on pi data
EP2073430B1 (en) 2007-12-21 2013-07-24 Research In Motion Limited Methods and systems for secure channel initialization transaction security based on a low entropy shared secret
US20090172393A1 (en) 2007-12-31 2009-07-02 Haluk Kent Tanik Method And System For Transferring Data And Instructions Through A Host File System
US8879725B2 (en) 2008-02-29 2014-11-04 Intel Corporation Combining instructions including an instruction that performs a sequence of transformations to isolate one transformation
US8675868B1 (en) 2008-07-01 2014-03-18 Maxim Integrated Products, Inc. Encrypting an address-dependent value along with code to prevent execution or use of moved code
KR20110044884A (en) * 2008-07-28 2011-05-02 어드밴스드 마이크로 디바이시즈, 인코포레이티드 Virtualization Advanced Synchronization Capability
US8156385B2 (en) 2009-10-28 2012-04-10 International Business Machines Corporation Systems and methods for backward-compatible constant-time exception-protection memory
US8762343B2 (en) 2009-12-29 2014-06-24 Cleversafe, Inc. Dispersed storage of software
US8683225B2 (en) 2010-05-25 2014-03-25 Via Technologies, Inc. Microprocessor that facilitates task switching between encrypted and unencrypted programs
US9892283B2 (en) 2010-05-25 2018-02-13 Via Technologies, Inc. Decryption of encrypted instructions using keys selected on basis of instruction fetch address
US9060174B2 (en) * 2010-12-28 2015-06-16 Fish Dive, Inc. Method and system for selectively breaking prediction in video coding
EP2506487B1 (en) 2011-03-30 2013-10-23 Nagravision S.A. Method of encryption with bidirectional difference propagation
EP2653992A1 (en) 2012-04-17 2013-10-23 Itron, Inc. Microcontroller configured for external memory decryption
US9904788B2 (en) 2012-08-08 2018-02-27 Amazon Technologies, Inc. Redundant key management
US9037872B2 (en) 2012-12-17 2015-05-19 Advanced Micro Devices, Inc. Hardware based return pointer encryption
KR101795771B1 (en) 2013-03-18 2017-11-09 한국전자통신연구원 System and method for providing compressed encryption and decryption in homomorphic cryptography based on intergers
US9053216B1 (en) * 2013-08-09 2015-06-09 Datto, Inc. CPU register assisted virtual machine screenshot capture timing apparatuses, methods and systems
US10700856B2 (en) 2013-11-19 2020-06-30 Network-1 Technologies, Inc. Key derivation for a module using an embedded universal integrated circuit card
US9213653B2 (en) 2013-12-05 2015-12-15 Intel Corporation Memory integrity
KR101516574B1 (en) 2014-02-21 2015-05-04 한국전자통신연구원 Variable length block cipher apparatus for providing the format preserving encryption, and the method thereof
US9703733B2 (en) 2014-06-27 2017-07-11 Intel Corporation Instructions and logic to interrupt and resume paging in a secure enclave page cache
KR101593169B1 (en) 2014-08-20 2016-02-15 한국전자통신연구원 Feistel-based variable length block cipher apparatus and method thereof
US9830162B2 (en) 2014-12-15 2017-11-28 Intel Corporation Technologies for indirect branch target security
US9852301B2 (en) 2014-12-24 2017-12-26 Intel Corporation Creating secure channels between a protected execution environment and fixed-function endpoints
US9792229B2 (en) 2015-03-27 2017-10-17 Intel Corporation Protecting a memory
IN2015DE01753A (en) 2015-06-11 2015-08-28 Pradeep Varma
US9893881B2 (en) 2015-06-29 2018-02-13 Intel Corporation Efficient sharing of hardware encryption pipeline for multiple security solutions
US10181946B2 (en) 2015-07-20 2019-01-15 Intel Corporation Cryptographic protection of I/O data for DMA capable I/O controllers
US10235176B2 (en) 2015-12-17 2019-03-19 The Charles Stark Draper Laboratory, Inc. Techniques for metadata processing
US9990249B2 (en) 2015-12-24 2018-06-05 Intel Corporation Memory integrity with error detection and correction
GB2547249B (en) 2016-02-12 2019-09-11 Advanced Risc Mach Ltd An apparatus and method for generating signed bounded pointers
US10585809B2 (en) 2016-04-01 2020-03-10 Intel Corporation Convolutional memory integrity
RU2634173C1 (en) 2016-06-24 2017-10-24 Акционерное общество "Лаборатория Касперского" System and detecting method of remote administration application
US20220019698A1 (en) 2016-08-11 2022-01-20 Intel Corporation Secure Public Cloud with Protected Guest-Verified Host Control
US20180082057A1 (en) 2016-09-22 2018-03-22 Intel Corporation Access control
US10261854B2 (en) 2016-09-30 2019-04-16 Intel Corporation Memory integrity violation analysis method and apparatus
US20180095906A1 (en) 2016-09-30 2018-04-05 Intel Corporation Hardware-based shared data coherency
US10805070B2 (en) 2016-10-19 2020-10-13 Index Systems, Llc Systems and methods for multi-region encryption/decryption redundancy
US10387305B2 (en) 2016-12-23 2019-08-20 Intel Corporation Techniques for compression memory coloring
US10469254B2 (en) 2017-03-29 2019-11-05 Intuit Inc. Method and system for hierarchical cryptographic key management
US10536266B2 (en) * 2017-05-02 2020-01-14 Seagate Technology Llc Cryptographically securing entropy for later use
US10877806B2 (en) 2017-06-14 2020-12-29 Intel Corporation Method and apparatus for securely binding a first processor to a second processor
US10657071B2 (en) 2017-09-25 2020-05-19 Intel Corporation System, apparatus and method for page granular, software controlled multiple key memory encryption
US10776525B2 (en) 2017-09-29 2020-09-15 Intel Corporation Multi-tenant cryptographic memory isolation
US10769272B2 (en) 2017-09-29 2020-09-08 Intel Corporation Technology to protect virtual machines from malicious virtual machine managers
US10706164B2 (en) 2017-09-29 2020-07-07 Intel Corporation Crypto-enforced capabilities for isolation
DE102018125786A1 (en) * 2017-11-17 2019-05-23 Intel Corporation Encrypted system memory management
US11082432B2 (en) 2017-12-05 2021-08-03 Intel Corporation Methods and apparatus to support reliable digital communications without integrity metadata
US10929527B2 (en) 2017-12-20 2021-02-23 Intel Corporation Methods and arrangements for implicit integrity
CN110490008B (en) 2018-05-14 2021-08-10 英韧科技(上海)有限公司 Security device and security chip
IT201800005506A1 (en) 2018-05-18 2019-11-18 PROCESSING SYSTEM, RELATED INTEGRATED CIRCUIT AND PROCEDURE
EP3752945A1 (en) 2018-05-21 2020-12-23 Google LLC Automatic generation of patches for security violations
US10871983B2 (en) 2018-05-31 2020-12-22 Intel Corporation Process-based multi-key total memory encryption
US10860709B2 (en) 2018-06-29 2020-12-08 Intel Corporation Encoded inline capabilities
US10785028B2 (en) 2018-06-29 2020-09-22 Intel Corporation Protection of keys and sensitive data from attack within microprocessor architecture
US11630920B2 (en) 2018-06-29 2023-04-18 Intel Corporation Memory tagging for side-channel defense, memory safety, and sandboxing
US10922439B2 (en) 2018-06-29 2021-02-16 Intel Corporation Technologies for verifying memory integrity across multiple memory regions
US11258861B2 (en) * 2018-06-29 2022-02-22 Intel Corporation Secure reporting of platform state information to a remote server
US11188639B2 (en) 2018-07-19 2021-11-30 Intel Corporation System, method and apparatus for automatic program compartmentalization
US11126733B2 (en) 2018-08-27 2021-09-21 Intel Corporation System, apparatus and method for configurable trusted input/output access from authorized software
US20200076585A1 (en) 2018-09-04 2020-03-05 International Business Machines Corporation Storage device key management for encrypted host data
US10802910B2 (en) * 2018-09-17 2020-10-13 Intel Corporation System for identifying and correcting data errors
US11288213B2 (en) 2019-03-29 2022-03-29 Intel Corporation Memory protection with hidden inline metadata
US11398899B2 (en) 2019-05-28 2022-07-26 Shanghai Zhaoxin Semiconductor Co., Ltd. Data processing device and data processing method
US20190319781A1 (en) 2019-06-27 2019-10-17 Intel Corporation Deterministic Encryption Key Rotation
US11403234B2 (en) 2019-06-29 2022-08-02 Intel Corporation Cryptographic computing using encrypted base addresses and used in multi-tenant environments
US20200257827A1 (en) 2019-06-29 2020-08-13 Intel Corporation Memory write for ownership access in a core
US11575504B2 (en) 2019-06-29 2023-02-07 Intel Corporation Cryptographic computing engine for memory load and store units of a microarchitecture pipeline
US20200145187A1 (en) 2019-12-20 2020-05-07 Intel Corporation Bit-length parameterizable cipher
US11250165B2 (en) 2019-12-20 2022-02-15 Intel Corporation Binding of cryptographic operations to context or speculative execution restrictions
US11580234B2 (en) 2019-06-29 2023-02-14 Intel Corporation Implicit integrity for cryptographic computing
US20220382885A1 (en) 2019-06-29 2022-12-01 David M. Durham Cryptographic computing using encrypted base addresses and used in multi-tenant environments
US11411938B2 (en) 2019-08-19 2022-08-09 Red Hat, Inc. Proof-of-work key wrapping with integrated key fragments
US11784786B2 (en) 2020-08-14 2023-10-10 Intel Corporation Mitigating security vulnerabilities with memory allocation markers in cryptographic computing systems
US11580035B2 (en) 2020-12-26 2023-02-14 Intel Corporation Fine-grained stack protection using cryptographic computing
US11669625B2 (en) 2020-12-26 2023-06-06 Intel Corporation Data type based cryptographic computing
US11625337B2 (en) 2020-12-26 2023-04-11 Intel Corporation Encoded pointer based data encryption
US11755500B2 (en) 2020-12-26 2023-09-12 Intel Corporation Cryptographic computing with disaggregated memory


Also Published As

Publication number Publication date
US11580234B2 (en) 2023-02-14
US11620391B2 (en) 2023-04-04
EP3757833B1 (en) 2022-12-07
US11829488B2 (en) 2023-11-28
US11354423B2 (en) 2022-06-07
US11416624B2 (en) 2022-08-16
EP3757856B1 (en) 2023-06-14
EP3757854A1 (en) 2020-12-30
EP3757852B1 (en) 2022-11-02
EP3757856A1 (en) 2020-12-30
EP3757833A1 (en) 2020-12-30
US20200125769A1 (en) 2020-04-23
EP3757852A1 (en) 2020-12-30
US11321469B2 (en) 2022-05-03
US20200145199A1 (en) 2020-05-07
US20200125501A1 (en) 2020-04-23
US20200117810A1 (en) 2020-04-16
US11308225B2 (en) 2022-04-19
US20220300626A1 (en) 2022-09-22
CN112149188A (en) 2020-12-29
US20200125742A1 (en) 2020-04-23
CN112149149A (en) 2020-12-29
US20200125770A1 (en) 2020-04-23
EP3757851A1 (en) 2020-12-30
US20200159676A1 (en) 2020-05-21
CN112149147A (en) 2020-12-29
EP3757850A1 (en) 2020-12-30
US20200125502A1 (en) 2020-04-23
US20240061943A1 (en) 2024-02-22
US11768946B2 (en) 2023-09-26
EP3757851B1 (en) 2023-05-24
EP3757850B1 (en) 2023-05-03
CN112149148A (en) 2020-12-29
CN112149150A (en) 2020-12-29
CN112149145A (en) 2020-12-29
EP3757855A1 (en) 2020-12-30
EP3757854B1 (en) 2023-07-12
CN112149143A (en) 2020-12-29

Similar Documents

Publication Publication Date Title
US11321469B2 (en) Microprocessor pipeline circuitry to support cryptographic computing
US11575504B2 (en) Cryptographic computing engine for memory load and store units of a microarchitecture pipeline
CN107851170B (en) Supporting configurable security levels for memory address ranges
US8819455B2 (en) Parallelized counter tree walk for low overhead memory replay protection
US20210117342A1 (en) Encoded pointer based data encryption
US11250165B2 (en) Binding of cryptographic operations to context or speculative execution restrictions
US20220121447A1 (en) Hardening cpu predictors with cryptographic computing context information
US20200117811A1 (en) Processor hardware and instructions for sha3 cryptographic operations
US20220326957A1 (en) Indirect branch predictor security protection
US20230010948A1 (en) Indirect branch predictor security protection
CN117546168A (en) Cryptographic computation using context information for transient side channel security
EP4020114A1 (en) Time and frequency domain side-channel leakage suppression using integrated voltage regulator cascaded with runtime crypto arithmetic transformations
US20220121578A1 (en) Transient side-channel aware architecture for cryptographic computing
US20240104027A1 (en) Temporal information leakage protection mechanism for cryptographic computing
US20210117341A1 (en) Cache line slot level encryption based on context information

Legal Events

Date Code Title Description
STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION