WO2023016030A1 - Neural network parameter deployment method, AI integrated chip and related apparatus - Google Patents

Neural network parameter deployment method, AI integrated chip and related apparatus

Info

Publication number
WO2023016030A1
WO2023016030A1 PCT/CN2022/094495 CN2022094495W WO2023016030A1 WO 2023016030 A1 WO2023016030 A1 WO 2023016030A1 CN 2022094495 W CN2022094495 W CN 2022094495W WO 2023016030 A1 WO2023016030 A1 WO 2023016030A1
Authority
WO
WIPO (PCT)
Prior art keywords
chip
neural network
storage unit
network code
npu
Prior art date
Application number
PCT/CN2022/094495
Other languages
English (en)
French (fr)
Inventor
Luo Huamin (骆华敏)
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd. (华为技术有限公司)
Publication of WO2023016030A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F 12/14 Protection against unauthorised use of memory or access to memory
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F 21/60 Protecting data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N 3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L 9/08 Key distribution or management, e.g. generation, sharing or updating, of cryptographic keys or passwords

Definitions

  • the present application relates to the field of artificial intelligence, and in particular to a neural network parameter deployment method, an artificial intelligence (AI) integrated chip, an AI computing device, electronic equipment, and a computer-readable storage medium.
  • AI artificial intelligence
  • AI computing has gradually expanded from the cloud to terminals, and smart terminals with AI computing capabilities have gradually become popular in smart phones, public security, car assisted driving, smart homes and other fields.
  • high costs, performance bottlenecks, power consumption bottlenecks, and security risks caused by the high computing power required by existing AI computing are always challenges for smart terminal devices.
  • the existing AI computing system generally includes a neural network processing unit (NPU), a central processing unit (CPU), on-chip static random access memory (SRAM) and off-chip dynamic random access memory (DRAM).
  • NPU: neural network processing unit
  • CPU: central processing unit
  • SRAM: static random access memory
  • DRAM: dynamic random access memory
  • the first aspect of the embodiment of the present application discloses an AI computing device, and the AI computing device includes an AI integrated chip and an off-chip memory.
  • the AI integrated chip includes a CPU, an NPU, a first on-chip storage unit and a second on-chip storage unit.
  • the off-chip memory is used to store the first neural network code and the first weight parameters associated with the NPU.
  • the second on-chip storage unit is used to store the second neural network code and the second weight parameter associated with the NPU, and the NPU is used to process specified data based on the first neural network code, the second neural network code, the first weight parameter and the second weight parameter.
  • the second on-chip storage unit is provided with a permission that only allows the NPU to read, and a permission that only allows the CPU to write.
  • the second on-chip storage unit can be an embedded non-volatile memory. Storing the relatively important second neural network code and second weight parameter in the second on-chip storage unit, and the relatively unimportant first neural network code and first weight parameter in the off-chip memory, improves the security level of the second neural network code and the second weight parameter. The NPU can read the second neural network code and the second weight parameter from the second on-chip storage unit, which reduces the bandwidth requirement for the NPU to access the off-chip memory; setting the second on-chip storage unit so that only the NPU may read it and only the CPU may write it further improves the security level of the second neural network code and the second weight parameter.
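  • As an illustration only (not part of the disclosure), the following Python sketch shows how a deployment tool might split a model into the "first" (off-chip) and "second" (on-chip) sets described above; the names deploy_model, write_off_chip and write_on_chip are hypothetical.

```python
def deploy_model(layers, core_layer_names, write_off_chip, write_on_chip):
    """Split a model and deploy it to the two memories.

    layers: dict mapping layer name -> (code_blob, weight_blob)
    core_layer_names: layers the developer marks as high-value ("second" set)
    write_off_chip / write_on_chip: hypothetical helpers for the two memories
    """
    first, second = {}, {}
    for name, blobs in layers.items():
        (second if name in core_layer_names else first)[name] = blobs

    write_off_chip(first)    # relatively unimportant code/weights -> off-chip DRAM
    write_on_chip(second)    # high-value code/weights -> second on-chip storage (eNVM),
                             # written by the CPU, readable only by the NPU
    return first, second
```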
  • the AI integrated chip further includes a third on-chip storage unit for caching at least part of the first neural network code and the first weight parameters read from the off-chip memory.
  • a third on-chip storage unit is added to cache the first neural network code and the first weight parameter read by the NPU from the off-chip memory.
  • the third on-chip storage unit can be an embedded non-volatile memory. Compared with reading the first neural network code and the first weight parameter from the off-chip memory, the NPU reads them from the third on-chip storage unit faster, which improves the NPU's data processing efficiency.
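  • A minimal sketch (assumed, not from the patent) of the read path the third on-chip storage unit enables: the NPU checks the on-chip cache first and only falls back to the slower off-chip DRAM on a miss.

```python
class ThirdOnChipCache:
    """Toy model of the third on-chip storage unit acting as a cache over DRAM."""

    def __init__(self, dram):
        self.dram = dram    # dict: address -> first neural network code / weight block
        self.cache = {}     # on-chip copy (eNVM), hypothetical model

    def read(self, address):
        if address not in self.cache:          # miss: fetch from off-chip DRAM
            self.cache[address] = self.dram[address]
        return self.cache[address]             # hit: fast on-chip read
```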
  • the CPU is further configured to receive ciphertext data and write the ciphertext data to the second on-chip storage unit, where the ciphertext data is obtained by encrypting the second neural network code and the second weight parameter.
  • the AI manufacturer encrypts the neural network code and weight parameters with an encryption key and sends them to the device manufacturer; the device manufacturer writes the ciphertext neural network code and weight parameters into the chip's built-in eNVM (the above-mentioned second on-chip storage unit), and the one-time programmable (OTP) device in the chip has a built-in decryption key, which is used to decrypt the ciphertext neural network code and weight parameters.
  • the decryption key is negotiated between the AI manufacturer and the chip manufacturer and is implanted into the chip by the chip manufacturer, so the AI manufacturer does not need to send the decryption key to the device manufacturer. After adopting such a chip, the device manufacturer's equipment can normally use the decryption key to decrypt the ciphertext neural network code and weight parameters, while the device manufacturer or any other third-party manufacturer cannot obtain the plaintext neural network code, effectively protecting the intellectual property rights of the AI manufacturer.
  • the AI integrated chip also includes a key storage unit and a decryption unit. The decryption key is stored in the key storage unit before the ciphertext data is written into the second on-chip storage unit, and the key storage unit is set with a permission that only allows the decryption unit to read. The NPU is also used to read the ciphertext data from the second on-chip storage unit and call the decryption unit to decrypt the ciphertext data using the decryption key, so as to obtain the second neural network code and the second weight parameter.
  • the chip manufacturer first writes the decryption key into the key storage unit in the chip, such as an OTP device, which prevents the decryption key from being leaked to the device manufacturer. Only one read port is set for the key storage unit, and that read port is provided to the decryption unit, so that only the decryption unit is allowed to read the decryption key, further avoiding leakage of the decryption key. The ciphertext data can be decrypted when the NPU reads it from the second on-chip storage unit, which prevents device manufacturers from obtaining the plaintext second neural network code and second weight parameters.
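  • The decrypt-on-read path can be pictured with the following Python sketch; it is a simplified model under stated assumptions (dict-backed eNVM, XOR as a stand-in for the real AES/SM4 cipher, hypothetical class names), not the actual hardware design.

```python
class OtpKeyStore:
    """Key storage unit: written once, readable only through its single read port."""

    def __init__(self, key: bytes):
        self._key = key                                   # burned in by the chip vendor

    def read_key(self, caller) -> bytes:
        if not isinstance(caller, DecryptionUnit):        # only the decryption unit is wired
            raise PermissionError("only the decryption unit may read the key")
        return self._key


class DecryptionUnit:
    """Hardware decryption unit that fetches the key from OTP on demand."""

    def __init__(self, otp: OtpKeyStore):
        self._otp = otp

    def decrypt(self, ciphertext: bytes) -> bytes:
        key = self._otp.read_key(self)
        # XOR stand-in; real hardware would run AES/SM4 or a descrambler here.
        return bytes(c ^ key[i % len(key)] for i, c in enumerate(ciphertext))


def npu_read(envm: dict, address: int, decryption_unit: DecryptionUnit) -> bytes:
    """NPU reads ciphertext from the second on-chip storage unit and has it decrypted."""
    ciphertext = envm[address]
    return decryption_unit.decrypt(ciphertext)
```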
  • the AI integrated chip also includes a key storage unit and a decryption unit.
  • the key storage unit is used to store the decryption key.
  • the key storage unit is set with a permission that only allows the decryption unit to read. The CPU is also used to receive the ciphertext data and call the decryption unit to decrypt the ciphertext data using the decryption key.
  • the ciphertext data is obtained by encrypting the second neural network code and the second weight parameter.
  • the CPU is also used to write the decrypted second neural network code and second weight parameter into the second on-chip storage unit.
  • the chip manufacturer first writes the decryption key into the key storage unit in the chip, such as an OTP device.
  • the OTP device is provided with only one read port, and the read port is connected to the decryption unit in the chip.
  • the decryption key is negotiated by the AI manufacturer and the chip manufacturer and implanted into the chip by the chip manufacturer.
  • the device manufacturer purchases the ciphertext neural network code and weight parameters from the AI manufacturer, and purchases the chip with the decryption key implanted from the chip manufacturer.
  • the chip is also equipped with eNVM. After purchasing the ciphertext neural network code and weight parameters, the device manufacturer writes them into the eNVM in the chip, and the decryption unit in the chip decrypts the ciphertext neural network code and weight parameters by calling the decryption key built into the OTP device before they are used. In this way, the AI manufacturer does not need to send the decryption key to the device manufacturer; the chip manufacturer builds it into the chip. After adopting the above chip, the device manufacturer's device can normally use the decryption key to decrypt the ciphertext neural network code and weight parameters, while the device manufacturer or any other third-party manufacturer cannot obtain the plaintext neural network code, which effectively protects the intellectual property rights of the AI manufacturer.
  • the key storage unit includes a one-time programmable OTP device.
  • storing the decryption key in an OTP device prevents the decryption key from being tampered with, and because the OTP device has strong anti-attack and anti-reverse-engineering capabilities, it improves the storage security of the decryption key.
  • the off-chip memory is also used to store the specified data and the processing results of the specified data.
  • the first on-chip storage unit is used to cache the specified data read from the off-chip memory.
  • the specified data read from the off-chip memory is cached by the first on-chip storage unit; for example, the first on-chip storage unit can be an on-chip SRAM. Compared with reading the specified data from the off-chip memory, the NPU reads the specified data from the first on-chip storage unit faster, which speeds up the NPU's reading of the specified data and improves its data processing efficiency.
  • the first on-chip storage unit can be an on-chip SRAM
  • the first on-chip storage unit includes SRAM
  • the off-chip memory includes DRAM
  • the second on-chip storage unit includes any one of eFlash, eNVM, MRAM or RRAM
  • the third on-chip storage unit includes any one of eFlash, eNVM, MRAM or RRAM.
  • the second aspect of the embodiment of the present application discloses an AI integrated chip.
  • the AI integrated chip includes a CPU, an NPU, a first on-chip storage unit, and a second on-chip storage unit.
  • the second on-chip storage unit is used to store neural network codes and weight parameters associated with the NPU, and the NPU is used to process specified data based on the neural network codes and weight parameters.
  • the second on-chip storage unit includes any one of eFlash, eNVM, MRAM or RRAM.
  • a second on-chip storage unit is added to store the neural network code and weight parameters associated with the NPU, which improves the security level of the neural network code and weight parameters. The NPU reads the neural network code and weight parameters from the second on-chip storage unit, which reduces the bandwidth requirement for the NPU to access the off-chip memory and improves the performance of the NPU. Because the neural network code and weight parameters are written into on-chip eFlash, eNVM, MRAM or RRAM for storage, device manufacturers cannot steal the neural network code and weight parameters provided by AI manufacturers.
  • the second on-chip memory unit is provided with only one read port and one write port, the read port is electrically connected to the NPU, and the write port is electrically connected to the CPU.
  • setting the second on-chip storage unit with a permission that only allows the NPU to read and a permission that only allows the CPU to write can further improve the security level of the neural network code and weight parameters stored in the second on-chip storage unit.
  • the CPU is further configured to receive ciphertext data and write the ciphertext data to the second on-chip storage unit, where the ciphertext data is obtained by encrypting the neural network code and weight parameters.
  • the AI manufacturer encrypts the neural network code and weight parameters with an encryption key and sends them to the device manufacturer; the device manufacturer writes the ciphertext neural network code and weight parameters into the chip's built-in eNVM (the above-mentioned second on-chip storage unit), and the one-time programmable (OTP) device in the chip has a built-in decryption key, which is used to decrypt the ciphertext neural network code and weight parameters.
  • the decryption key is negotiated between the AI manufacturer and the chip manufacturer and is implanted into the chip by the chip manufacturer, so the AI manufacturer does not need to send the decryption key to the device manufacturer. After adopting such a chip, the device manufacturer's equipment can normally use the decryption key to decrypt the ciphertext neural network code and weight parameters, while the device manufacturer or any other third-party manufacturer cannot obtain the plaintext neural network code, effectively protecting the intellectual property rights of the AI manufacturer.
  • the AI integrated chip also includes a key storage unit and a decryption unit. The decryption key is stored in the key storage unit before the ciphertext data is written into the second on-chip storage unit, and the key storage unit is set with a permission that only allows the decryption unit to read. The NPU is also used to read the ciphertext data from the second on-chip storage unit and call the decryption unit to decrypt the ciphertext data using the decryption key, so as to obtain the neural network code and weight parameters.
  • the chip manufacturer first writes the decryption key into the key storage unit in the chip, such as an OTP device, which prevents the decryption key from being leaked to the device manufacturer. Only one read port is set for the key storage unit, and that read port is provided to the decryption unit, so that only the decryption unit is allowed to read the decryption key, further avoiding leakage of the decryption key. The ciphertext data can be decrypted when the NPU reads it from the second on-chip storage unit, which prevents device manufacturers from obtaining the plaintext neural network code and weight parameters.
  • the AI integrated chip also includes a key storage unit and a decryption unit.
  • the key storage unit is used to store the decryption key.
  • the key storage unit is set with a permission that only allows the decryption unit to read. The CPU is also used to receive the ciphertext data and call the decryption unit to decrypt the ciphertext data using the decryption key.
  • the ciphertext data is obtained by encrypting the neural network code and weight parameters.
  • the CPU is also used to write the decrypted neural network code and weight parameters into the second on-chip storage unit.
  • the chip manufacturer first writes the decryption key into the key storage unit in the chip, such as an OTP device.
  • the OTP device is provided with only one read port, and the read port is connected to the decryption unit in the chip.
  • the decryption key is negotiated by the AI manufacturer and the chip manufacturer and implanted into the chip by the chip manufacturer.
  • the device manufacturer purchases the ciphertext neural network code and weight parameters from the AI manufacturer, and purchases the chip with the decryption key implanted from the chip manufacturer.
  • the chip is also equipped with eNVM. After purchasing the ciphertext neural network code and weight parameters, the device manufacturer writes them into the eNVM in the chip, and the decryption unit in the chip decrypts the ciphertext neural network code and weight parameters by calling the decryption key built into the OTP device before they are used. In this way, the AI manufacturer does not need to send the decryption key to the device manufacturer; the chip manufacturer builds it into the chip. After adopting the above chip, the device manufacturer's device can normally use the decryption key to decrypt the ciphertext neural network code and weight parameters, while the device manufacturer or any other third-party manufacturer cannot obtain the plaintext neural network code, which effectively protects the intellectual property rights of the AI manufacturer.
  • the key storage unit includes a one-time programmable OTP device.
  • storing the decryption key in an OTP device prevents the decryption key from being tampered with, and because the OTP device has strong anti-attack and anti-reverse-engineering capabilities, it improves the storage security of the decryption key.
  • the embodiment of the present application provides a neural network parameter deployment method, which is applied to an AI computing device, and the AI computing device includes an AI integrated chip and an off-chip memory.
  • the AI integrated chip includes a CPU, an NPU, a first on-chip storage unit and a second on-chip storage unit.
  • the neural network parameter deployment method includes: storing the first neural network code and the first weight parameter associated with the NPU to the off-chip memory; and storing the second neural network code and the second weight parameter associated with the NPU to the second on-chip storage unit; wherein the NPU is used to process the specified data based on the first neural network code, the second neural network code, the first weight parameter and the second weight parameter, and the second on-chip storage unit is set with a permission that only allows the NPU to read and a permission that only allows the CPU to write.
  • the second on-chip storage unit can be an embedded non-volatile memory. Storing the relatively important second neural network code and second weight parameter in the second on-chip storage unit, and the relatively unimportant first neural network code and first weight parameter in the off-chip memory, improves the security level of the second neural network code and the second weight parameter. The NPU can read the second neural network code and the second weight parameter from the second on-chip storage unit, which reduces the bandwidth requirement for the NPU to access the off-chip memory; setting the second on-chip storage unit so that only the NPU may read it and only the CPU may write it further improves the security level of the second neural network code and the second weight parameter.
  • the AI integrated chip also includes a third on-chip storage unit
  • the neural network parameter deployment method further includes: reading the first neural network code and the first weight parameter from the off-chip memory and caching them in the third on-chip storage unit.
  • a third on-chip storage unit is added to cache the first neural network code and the first weight parameter read by the NPU from the off-chip memory.
  • the third on-chip storage unit can be an embedded non-volatile memory. Compared with reading the first neural network code and the first weight parameter from the off-chip memory, the NPU reads them from the third on-chip storage unit faster, which improves the NPU's data processing efficiency.
  • the neural network parameter deployment method further includes: receiving the ciphertext data sent by the encryption device, and writing the ciphertext data to the second on-chip storage unit; wherein the ciphertext data is obtained by encrypting the second neural network code and the second weight parameter.
  • the AI manufacturer encrypts the neural network code and weight parameters with an encryption key and sends them to the device manufacturer; the device manufacturer writes the ciphertext neural network code and weight parameters into the chip's built-in eNVM (the above-mentioned second on-chip storage unit), and the one-time programmable (OTP) device in the chip has a built-in decryption key, which is used to decrypt the ciphertext neural network code and weight parameters.
  • the decryption key is negotiated between the AI manufacturer and the chip manufacturer and is implanted into the chip by the chip manufacturer, so the AI manufacturer does not need to send the decryption key to the device manufacturer. After adopting such a chip, the device manufacturer's equipment can normally use the decryption key to decrypt the ciphertext neural network code and weight parameters, while the device manufacturer or any other third-party manufacturer cannot obtain the plaintext neural network code, effectively protecting the intellectual property rights of the AI manufacturer.
  • the neural network parameter deployment method further includes: when the NPU reads the ciphertext data from the second on-chip storage unit, calling the pre-stored decryption key to decrypt the ciphertext data to obtain the second neural network code and the second weight parameter.
  • the chip manufacturer first writes the decryption key into the key storage unit in the chip, such as an OTP device, which prevents the decryption key from being leaked to the device manufacturer. Only one read port is set for the key storage unit, and that read port is provided to the decryption unit, so that only the decryption unit is allowed to read the decryption key, further avoiding leakage of the decryption key. The ciphertext data can be decrypted when the NPU reads it from the second on-chip storage unit, which prevents device manufacturers from obtaining the plaintext second neural network code and second weight parameters.
  • the neural network parameter deployment method further includes: receiving the ciphertext data sent by the encryption device, and calling the pre-stored decryption key to decrypt the ciphertext data; and writing the decrypted second neural network code and second weight parameter into the second on-chip storage unit; wherein the ciphertext data is obtained by encrypting the second neural network code and the second weight parameter.
  • the chip manufacturer first writes the decryption key into the key storage unit in the chip, such as an OTP device.
  • the OTP device is provided with only one read port, and the read port is connected to the decryption unit in the chip.
  • the decryption key is negotiated by the AI manufacturer and the chip manufacturer and implanted into the chip by the chip manufacturer.
  • the device manufacturer purchases the ciphertext neural network code and weight parameters from the AI manufacturer, and purchases the chip with the decryption key implanted from the chip manufacturer.
  • the chip is also equipped with eNVM. After purchasing the ciphertext neural network code and weight parameters, the device manufacturer writes them into the eNVM in the chip, and the decryption unit in the chip decrypts the ciphertext neural network code and weight parameters by calling the decryption key built into the OTP device before they are used. In this way, the AI manufacturer does not need to send the decryption key to the device manufacturer; the chip manufacturer builds it into the chip. After adopting the above chip, the device manufacturer's device can normally use the decryption key to decrypt the ciphertext neural network code and weight parameters, while the device manufacturer or any other third-party manufacturer cannot obtain the plaintext neural network code, which effectively protects the intellectual property rights of the AI manufacturer.
  • the embodiment of the present application provides a neural network parameter deployment method, including: the first processing end writes the decryption key into the OTP device of the AI integrated chip; the second processing end encrypts the neural network code and weight parameters to obtain the ciphertext data; and the third processing end writes the ciphertext data to the on-chip storage unit of the AI integrated chip; wherein the decryption key is used to decrypt the ciphertext data, and the on-chip storage unit includes any one of eFlash, eNVM, MRAM or RRAM.
  • the chip manufacturer first writes the decryption key into the key storage unit in the chip, such as an OTP device.
  • the OTP device is provided with only one read port, and the read port is connected to the decryption unit in the chip.
  • the decryption key is negotiated by the AI manufacturer and the chip manufacturer and implanted into the chip by the chip manufacturer. The device manufacturer purchases the ciphertext neural network code and weight parameters from the AI manufacturer, and purchases a chip with the decryption key implanted from the chip manufacturer.
  • the chip is also equipped with eFlash, eNVM, MRAM or RRAM. After purchasing the ciphertext neural network code and weight parameters, the device manufacturer writes them into the eFlash, eNVM, MRAM or RRAM in the chip, and the decryption unit in the chip decrypts the ciphertext neural network code and weight parameters by calling the decryption key built into the OTP device before they are used. In this way, the AI manufacturer does not need to send the decryption key to the device manufacturer; the chip manufacturer builds it into the chip.
  • the device manufacturer's device can normally use the decryption key to decrypt the ciphertext neural network code and weight parameters, while the device manufacturer or any other third-party manufacturer cannot obtain the plaintext neural network code, which effectively protects the intellectual property rights of AI manufacturers. Storing the ciphertext data in the on-chip eFlash, eNVM, MRAM or RRAM, compared with the existing scheme of storing the neural network code and weight parameters in the off-chip memory, can further improve the security level of the neural network code and weight parameters.
  • the embodiment of the present application provides a neural network parameter deployment method, including: obtaining ciphertext data obtained by encrypting the neural network code and weight parameters; and writing the ciphertext data to the on-chip storage unit of the AI integrated chip; wherein the OTP device of the AI integrated chip is pre-written with the decryption key for decrypting the ciphertext data, and the on-chip storage unit includes any one of eFlash, eNVM, MRAM or RRAM.
  • the decryption key is written into the OTP device of the AI integrated chip by the chip manufacturer.
  • the decryption key is implanted into the chip by the chip manufacturer after negotiation between the AI manufacturer and the chip manufacturer. The device manufacturer purchases the ciphertext neural network code and weight parameters from the AI manufacturer, and purchases the chip with the decryption key implanted from the chip manufacturer.
  • the chip is also equipped with eFlash, eNVM, MRAM or RRAM. After purchasing the ciphertext neural network code and weight parameters, the device manufacturer writes them into the eFlash, eNVM, MRAM or RRAM in the chip, and the decryption unit in the chip decrypts the ciphertext neural network code and weight parameters by calling the decryption key built into the OTP device before they are used. The AI manufacturer does not need to send the decryption key to the device manufacturer; it is built into the chip by the chip manufacturer, and after adopting the chip, the decryption key can be used to decrypt the ciphertext neural network code and weight parameters.
  • the device manufacturer or any other third-party manufacturer cannot obtain the plaintext neural network code, which effectively protects the intellectual property rights of AI manufacturers. Furthermore, the ciphertext data is stored in the on-chip eFlash, eNVM, MRAM or RRAM; compared with the existing scheme of storing the neural network code and weight parameters in the off-chip memory, this can further improve the security level of the neural network code and weight parameters.
  • the AI integrated chip includes a CPU, an NPU, a first on-chip storage unit, and a second on-chip storage unit.
  • writing the ciphertext data into the on-chip storage unit of the AI integrated chip includes: writing the ciphertext data to the second on-chip storage unit, wherein the second on-chip storage unit is set with a permission that only allows the NPU to read and a permission that only allows the CPU to write.
  • the NPU can read the neural network code and weight parameters from the second on-chip storage unit, which reduces the bandwidth requirement for the NPU to access the off-chip memory; at the same time, setting the second on-chip storage unit so that only the NPU may read it and only the CPU may write it can further improve the security level of the neural network code and weight parameters.
  • the neural network parameter deployment method further includes: when the NPU reads the ciphertext data from the second on-chip storage unit, calling the decryption key stored in the OTP device to decrypt the ciphertext data to obtain the neural network code and weight parameter.
  • the chip manufacturer first writes the decryption key into the key storage unit in the chip, such as an OTP device, which prevents the decryption key from being leaked to the device manufacturer. Only one read port is set for the key storage unit, and that read port is provided to the decryption unit, so that only the decryption unit is allowed to read the decryption key, further avoiding leakage of the decryption key. The ciphertext data can be decrypted when the NPU reads it from the second on-chip storage unit, which prevents device manufacturers from obtaining the plaintext neural network code and weight parameters.
  • writing the ciphertext data to the on-chip storage unit of the AI integrated chip includes: calling the decryption key to decrypt the ciphertext data; and writing the decrypted neural network code and weight parameters to the second on-chip storage unit.
  • the chip manufacturer first writes the decryption key into the key storage unit in the chip, such as an OTP device, which prevents the decryption key from being leaked to the device manufacturer. Only one read port is set for the key storage unit, and that read port is provided to the decryption unit, so that only the decryption unit is allowed to read the decryption key, further avoiding leakage of the decryption key.
  • the ciphertext data is decrypted before it is written into the second on-chip storage unit, which prevents device manufacturers from obtaining the plaintext neural network code and weight parameters.
  • an embodiment of the present application provides a computer-readable storage medium, including computer instructions.
  • when the computer instructions are run on an electronic device, the electronic device is caused to execute the neural network parameter deployment method described in the third aspect.
  • the embodiment of the present application provides an electronic device, including the AI computing device described in the first aspect or including the AI integrated chip described in the second aspect.
  • an embodiment of the present application provides a computer program product, which, when running on a computer, causes the computer to execute the method for deploying neural network parameters as described in the third aspect.
  • the computer-readable storage medium described in the fourth aspect, the electronic device described in the fifth aspect, the computer program product described in the sixth aspect, and the device described in the seventh aspect all correspond to the methods of the second aspect or the third aspect above; therefore, for the beneficial effects they can achieve, reference may be made to the beneficial effects of the corresponding methods provided above, which will not be repeated here.
  • FIG. 1 is a schematic diagram of the architecture of an AI computing system in the prior art
  • FIG. 2 is a schematic structural diagram of an AI computing device provided by an embodiment of the present application.
  • FIG. 3a-3b are schematic diagrams of the architecture of an AI computing device provided by another embodiment of the present application.
  • FIG. 4 is a schematic diagram of decrypting ciphertext data by a decryption unit provided in an embodiment of the present application
  • FIG. 5 is a schematic flowchart of a neural network parameter deployment method provided by an embodiment of the present application.
  • FIG. 6 is a schematic flowchart of a neural network parameter deployment method provided by another embodiment of the present application.
  • FIG. 7 is a schematic flowchart of a neural network parameter deployment method provided in another embodiment of the present application.
  • FIG. 8 is a schematic diagram of a possible electronic device provided by an embodiment of the present application.
  • a schematic diagram of the architecture of an artificial intelligence (AI) computing device provided by an embodiment of the present invention is exemplarily introduced below with reference to FIG. 2 .
  • the AI computing device 100 can be applied to a smart terminal, and the smart terminal can be a smart phone, a tablet computer, a smart car, a smart home device, and the like.
  • the AI computing device 100 may include an AI integrated chip 10 and an off-chip memory 20 .
  • the AI integrated chip 10 may include a central processing unit (Central Processing Unit, CPU) 11, a neural network processing unit (Neural-network Processing Unit, NPU) 12, a first on-chip storage unit 13 and a second on-chip storage unit 14.
  • CPU: central processing unit
  • NPU: neural network processing unit
  • the CPU 11 may include an NPU driver, and the CPU 11 may configure the NPU 12 through the NPU driver.
  • the CPU 11 can configure the NPU 12 to process specified data through the NPU driver, and allocate registers in the NPU 12 for processing specified data.
  • the specified data may refer to data specified by the CPU 11.
  • tasks can be assigned by the CPU 11, and the NPU 12 then executes the corresponding tasks. For example, taking a smart car as an example, each frame of image captured by the camera can be automatically stored in a memory, and each time an image is stored, the CPU 11 can issue an execution command to the NPU 12, instructing the NPU 12 to call the image from the memory for AI model reasoning.
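  • A rough sketch (hypothetical names, not from the patent) of this dispatch pattern: each time a frame is stored, the CPU issues an execution command telling the NPU where to fetch the frame for inference.

```python
def cpu_dispatch_loop(camera_frames, memory, npu_execute):
    """camera_frames: iterable of frame data; memory: dict frame_id -> frame;
    npu_execute: hypothetical callable that makes the NPU fetch the frame and run inference."""
    for frame_id, frame in enumerate(camera_frames):
        memory[frame_id] = frame      # the captured frame is stored first
        npu_execute(frame_id)         # the CPU (via the NPU driver) issues the execution command
```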
  • NPU 12 is a neural network (neural-network, NN) computing processor.
  • NN: neural network
  • the network code required by the NPU 12 in the data processing process can be divided into the first neural network code and the second neural network code, and the weight data required in the data processing process can be divided into the first weight parameter and the second weight parameter.
  • the second neural network code may refer to the neural network code/neural model code with high value in the network code, and the second neural network code may be specified by the developer.
  • the first neural network code may refer to the rest of the network codes except the second neural network code.
  • the second weight parameter may refer to a weight parameter with a high value in the weight data, and the second weight parameter may be specified by a developer.
  • the first weight parameter may refer to other weight parameters in the weight data except the second weight parameter.
  • the NPU 12 can process the specified data based on the first neural network code, the second neural network code, the first weight parameter and the second weight parameter.
  • the capacity of the second on-chip storage unit 14 used by the AI computing device 100 is limited, and may not be able to store all the network codes and weight data required by the NPU 12.
  • therefore, the off-chip memory 20 can be used to store the first neural network code, the first weight parameter, the specified data and the processing result of the specified data, while the second on-chip storage unit 14 is used to store the second neural network code and the second weight parameter, so that the core neural network code and core weight parameters of the NPU 12 are stored inside the AI integrated chip 10, which improves the security level of the core neural network code and core weight parameters.
  • the second on-chip storage unit 14 can be an embedded non-volatile memory
  • the AI integrated chip 10 stores the relatively important second neural network code and second weight parameter by adding the second on-chip storage unit 14, and stores the relatively unimportant first neural network code and first weight parameter in the off-chip memory 20, which improves the security level of the second neural network code and the second weight parameter; because the NPU 12 can read the second neural network code and the second weight parameter from the second on-chip storage unit 14, the bandwidth requirement for the NPU 12 to access the off-chip memory 20 is reduced and the performance of the NPU 12 is improved.
  • the AI integrated chip 10 can also reuse an existing embedded non-volatile memory to store the relatively important second neural network code and second weight parameter, without adding an additional embedded non-volatile memory as the second on-chip storage unit 14.
  • the first on-chip storage unit 13 can be used to cache specified data read from the off-chip memory 20; for example, when the NPU 12 processes first data stored in the off-chip memory 20, the first data can be read from the off-chip memory 20 into the first on-chip storage unit 13, and the first on-chip storage unit 13 caches the first data, which speeds up the NPU's data processing and improves data processing efficiency.
  • the first on-chip storage unit 13 may include a static random access memory (SRAM), and the second on-chip storage unit 14 may include any one of an embedded flash memory (eFlash), an embedded non-volatile memory (eNVM), a magnetic random access memory (MRAM), or a resistive random access memory (RRAM).
  • the second on-chip storage unit 14 may also be other types of embedded storage devices.
  • the off-chip memory 20 may include Dynamic Random Access Memory (Dynamic Random Access Memory, DRAM).
  • the second on-chip storage unit 14 can be designed with a permission that only allows the NPU 12 to read and a permission that only allows the CPU 11 to write.
  • the second on-chip memory unit 14 may only have one read port and one write port.
  • the write port of the second on-chip storage unit 14 is only electrically connected to the CPU 11, so that only the CPU 11 is allowed to write.
  • the read port of the second on-chip storage unit 14 is only electrically connected to the NPU 12, so that only the NPU 12 is allowed to read.
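  • The single-port wiring can be modelled with the toy Python class below (an assumption-laden illustration, not the actual hardware): the write port accepts only the CPU and the read port accepts only the NPU.

```python
class SecondOnChipStorage:
    """Toy model of the second on-chip storage unit 14 with one read and one write port."""

    def __init__(self):
        self._cells = {}

    def write(self, agent: str, address: int, data: bytes):
        if agent != "CPU":
            raise PermissionError("write port is wired to the CPU only")
        self._cells[address] = data

    def read(self, agent: str, address: int) -> bytes:
        if agent != "NPU":
            raise PermissionError("read port is wired to the NPU only")
        return self._cells[address]
```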
  • the second neural network code and the second weight parameter stored in the second on-chip storage unit 14 can optionally undergo cryptographic encryption processing or mathematical-operation scrambling processing, so as to improve the security of the second neural network code and the second weight parameter.
  • the algorithm for encryption processing or scrambling processing can be selected according to actual needs, and this application does not limit it.
  • the encryption algorithm can be a symmetric encryption algorithm (AES algorithm, DES algorithm, SM4 algorithm, etc.) or an asymmetric encryption algorithm.
  • the scrambling algorithm can be a developer-defined algorithm, such as a reversible scrambling algorithm such as a developer-defined mathematical lookup table and reverse order processing.
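  • As an example of the kind of developer-defined reversible scrambling mentioned above (a byte lookup table combined with reverse-order processing), a possible sketch is shown below; the seed and function names are illustrative only.

```python
import random

def make_lut(seed: int):
    """Build a random byte substitution table and its inverse."""
    rng = random.Random(seed)
    lut = list(range(256))
    rng.shuffle(lut)
    inv = [0] * 256
    for i, v in enumerate(lut):
        inv[v] = i
    return lut, inv

def scramble(data: bytes, lut) -> bytes:
    return bytes(lut[b] for b in reversed(data))       # reverse order, then substitute

def descramble(data: bytes, inv) -> bytes:
    return bytes(reversed([inv[b] for b in data]))     # undo substitution, then restore order

lut, inv = make_lut(seed=42)
assert descramble(scramble(b"weight parameters", lut), inv) == b"weight parameters"
```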
  • encryption or scrambling can be performed off-line outside the AI integrated chip 10; for example, an encryption device (a component or device independent of the AI integrated chip 10, such as a computer or server) encrypts or scrambles the second neural network code and the second weight parameter to obtain ciphertext data, and the encryption device can send the ciphertext data to the AI integrated chip 10.
  • Decryption or descrambling of ciphertext data can be performed inside the AI integrated chip 10 .
  • the AI integrated chip 10 may further include a key storage unit 15 , a decryption unit 16 and a third on-chip storage unit 17 .
  • the key storage unit 15 is used to store a decryption key or a descrambling key.
  • the decryption unit 16 is used for decrypting or descrambling the ciphertext data according to the decryption key or the descrambling key stored in the key storage unit 15 .
  • the key storage unit 15 can be a one-time programmable (OTP) device arranged inside the AI integrated chip 10, and the decryption unit 16 can be a hardware circuit arranged inside the AI integrated chip 10 for decrypting or descrambling the ciphertext data.
  • OTP: one-time programmable
  • the third on-chip storage unit 17 is used for caching at least part of the first neural network code and the first weight parameters read from the off-chip memory 20 .
  • when the NPU 12 needs to use the first neural network code and the first weight parameter to process the specified data, part or all of the first neural network code and the first weight parameter stored in the off-chip memory 20 can be read from the off-chip memory 20 into the third on-chip storage unit 17, and the third on-chip storage unit 17 caches the first neural network code and the first weight parameter, which speeds up the NPU 12's reading of the first neural network code and the first weight parameter and improves data processing efficiency.
  • the third on-chip storage unit 17 may be any one of eFlash, eNVM, MRAM, or RRAM.
  • the third on-chip storage unit 17 may also be other types of embedded storage devices.
  • if the second on-chip storage unit 14 can store all the network code and weight data required by the NPU 12, that is, the first neural network code, the second neural network code, the first weight parameter and the second weight parameter are all stored in the second on-chip storage unit 14, then the off-chip memory 20 only needs to store the specified data and the processing results of the specified data, and the third on-chip storage unit 17 can be omitted (that is, there is no need to arrange the third on-chip storage unit 17 to cache the neural network code and weight parameters read from the off-chip memory 20).
  • the AI integrated chip 10 can decrypt or descramble the ciphertext data when the ciphertext data is written into the second on-chip storage unit 14, so that the second neural network code and the second weight parameter are stored in the second on-chip storage unit 14 in plaintext form.
  • when the CPU 11 receives the ciphertext data, it can call the decryption unit 16 to decrypt or descramble the ciphertext data, and then write the decrypted or descrambled second neural network code and second weight parameter into the second on-chip storage unit 14 for storage.
  • the AI integrated chip 10 may also leave the ciphertext data undecrypted when it is written into the second on-chip storage unit 14, storing the second neural network code and the second weight parameter in ciphertext form, and decrypt or descramble the ciphertext data when the NPU 12 reads it from the second on-chip storage unit 14.
  • when the NPU 12 reads the ciphertext data from the second on-chip storage unit 14, it can call the decryption unit 16 to decrypt or descramble the ciphertext data to obtain the second neural network code and the second weight parameter.
  • the decryption unit 16 can automatically read the decryption key or descrambling key stored in the key storage unit 15 .
  • the OTP device includes an OTP memory and an OTP controller.
  • the OTP controller automatically controls the OTP memory to transmit the decryption key or descrambling key to the decryption unit 16, so that the reading of the decryption key or descrambling key is completed automatically inside the AI integrated chip 10.
  • the key storage unit 15 can set the permission that only the decryption unit 16 is allowed to read, so as to avoid the leakage of the decryption key or the descrambling key.
  • the key storage unit 15 is provided with only one read port, and the read port is only electrically connected to the decryption unit 16 .
  • when the second neural network code and the second weight parameter are stored in the second on-chip storage unit 14 in ciphertext, and the NPU 12 reads the ciphertext data from the second on-chip storage unit 14, the NPU 12 can send a call instruction to the decryption unit 16.
  • when the decryption unit 16 receives the call instruction from the NPU 12, it uses the decryption key to decrypt the ciphertext data to obtain the second neural network code and the second weight parameter.
  • the above-mentioned AI computing device adds a second on-chip storage unit to store the second neural network code and the second weight parameter associated with the NPU. Storing the relatively important second neural network code and second weight parameter in the second on-chip storage unit and the relatively unimportant first neural network code and first weight parameter in the off-chip memory improves the security level of the second neural network code and the second weight parameter, and the NPU can read the neural network code and weight parameters from the second on-chip storage unit, which reduces the bandwidth requirement for the NPU to access the off-chip memory.
  • setting the second on-chip storage unit so that only the NPU may read it and only the CPU may write it can further improve the security level of the second neural network code and the second weight parameter, and can prevent the device manufacturer from obtaining the decryption key, the plaintext second neural network code and the plaintext second weight parameter.
  • the following takes the case where the key stored in the key storage unit 15 is a decryption key as an example for illustration.
  • ciphertext data at different input addresses corresponds to different decryption keys, and the input ciphertext data needs to be decrypted using the decryption key associated with its address.
  • the decryption unit 16 can use the decryption key to decrypt the ciphertext data according to the input address of the ciphertext data, and obtain the plaintext data and the output address, wherein the input address is used as the instruction information of the decryption process, and the input address and the output address can be the same.
  • the input address of the ciphertext data may refer to a storage address of the ciphertext data, for example, an address stored in the key storage unit 15 .
  • the decryption unit 16 may directly use the decryption key to decrypt the ciphertext data to obtain plaintext data.
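  • A hedged sketch of the per-address key selection described above: the input address of the ciphertext selects which decryption key is used, and the output address can equal the input address. The key table and XOR cipher are stand-ins for the real key storage unit 15 and decryption algorithm.

```python
def decrypt_by_address(ciphertext: bytes, input_address: int, key_table: dict):
    """Decrypt ciphertext with the key associated with its input address."""
    key = key_table[input_address]             # decryption key tied to this address
    plaintext = bytes(c ^ key[i % len(key)] for i, c in enumerate(ciphertext))
    output_address = input_address             # input and output addresses may be the same
    return plaintext, output_address
```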
  • a neural network parameter deployment method provided by an embodiment of the present application is applied to an AI computing device 100 .
  • the AI computing device 100 includes an AI integrated chip 10 and an off-chip memory 20 .
  • the AI integrated chip 10 includes a CPU 11, an NPU 12, a first on-chip storage unit 13 and a second on-chip storage unit 14.
  • the network code required by the NPU 12 during data processing is divided into the first neural network code and the second neural network code, and the weight data required during data processing can be divided into the first weight parameter and the second weight parameter.
  • the second neural network code may refer to the neural network code/neural model code with high value in the network code, and the second neural network code may be specified by the developer.
  • the first neural network code may refer to the rest of the network codes except the second neural network code.
  • the second weight parameter may refer to a weight parameter with a high value in the weight data, and the second weight parameter may be specified by a developer.
  • the first weight parameter may refer to other weight parameters in the weight data except the second weight parameter.
  • the neural network parameter deployment method may include:
  • Step 50: storing the first neural network code and the first weight parameter in the off-chip memory 20.
  • the first neural network code and the first weight parameters can be written into the off-chip memory 20 by the CPU 11 for storage.
  • the off-chip memory 20 may include DRAM.
  • Step 51: storing the second neural network code and the second weight parameter in the second on-chip storage unit 14.
  • the CPU 11 may write the second neural network code and the second weight parameters into the second on-chip storage unit 14 for storage.
  • the second on-chip storage unit 14 can be any one of eFlash, eNVM, MRAM, or RRAM, and the second on-chip storage unit 14 can be set with a permission that only allows the NPU 12 to read and a permission that only allows the CPU 11 to write, which in turn improves the security of the second neural network code and the second weight parameter stored in the second on-chip storage unit 14.
  • the AI integrated chip 10 may also include a third on-chip storage unit 17.
  • when the NPU 12 reads the first neural network code and the first weight parameter from the off-chip memory 20, the read first neural network code and first weight parameter can be cached in the third on-chip storage unit 17.
  • the third on-chip storage unit 17 can be any one of eFlash, eNVM, MRAM, or RRAM; the third on-chip storage unit 17 speeds up the NPU 12's reading of the first neural network code and the first weight parameter and improves data processing efficiency.
  • the second neural network code and the second weight parameter stored in the second on-chip storage unit 14 can optionally undergo cryptographic encryption processing or mathematical-operation scrambling processing, so as to improve the security of the second neural network code and the second weight parameter.
  • the algorithm for encryption or scrambling can be selected according to actual requirements, which is not limited in this application. Encryption or scrambling can be performed offline outside the AI integrated chip 10; for example, an encryption device independent of the AI integrated chip 10 can be used to encrypt or scramble the second neural network code and the second weight parameter to obtain ciphertext data, and the encryption device can send the ciphertext data to the AI integrated chip 10. Decryption or descrambling of the ciphertext data can be performed inside the AI integrated chip 10.
  • the AI integrated chip 10 can decrypt or descramble the ciphertext data when the ciphertext data is written into the second on-chip storage unit 14, so as to store the second neural network code and the second weight parameter in the second on-chip storage unit 14 in plaintext form.
  • the AI integrated chip 10 may also leave the ciphertext data undecrypted when it is written into the second on-chip storage unit 14, storing the second neural network code and the second weight parameter in the second on-chip storage unit 14 in ciphertext form, and decrypt or descramble the ciphertext data when the NPU 12 reads it from the second on-chip storage unit 14.
  • if the second on-chip storage unit 14 can store all the network code and weight data required by the NPU 12, the first neural network code, the second neural network code, the first weight parameter and the second weight parameter can all be stored in the second on-chip storage unit 14.
  • in the above method, a second on-chip storage unit is added to store the second neural network code and the second weight parameter associated with the NPU. Storing the relatively important second neural network code and second weight parameter in the second on-chip storage unit and the relatively unimportant first neural network code and first weight parameter in the off-chip memory improves the security level of the second neural network code and the second weight parameter, and the NPU can read the neural network code and weight parameters from the second on-chip storage unit, which reduces the bandwidth requirement for the NPU to access the off-chip memory.
  • setting the second on-chip storage unit so that only the NPU may read it and only the CPU may write it can further improve the security level of the second neural network code and the second weight parameter, and can prevent the device manufacturer from obtaining the decryption key, the plaintext second neural network code and the plaintext second weight parameter.
  • a neural network parameter deployment method provided by an embodiment of the present application is applied to the first processing end, the second processing end, and the third processing end.
  • the first processing end may refer to the electronic device on the side of the chip manufacturer
  • the second processing end may refer to the electronic device on the side of the AI manufacturer
  • the third processing end may refer to the electronic device on the side of the equipment manufacturer.
  • the chip manufacturer provides the hardware module of the AI integrated chip 10.
  • the hardware module may include a CPU 11, an NPU 12, a first on-chip storage unit 13, a second on-chip storage unit 14, a key storage unit 15, a decryption unit 16, and the like.
  • the AI manufacturer provides the software module of the AI integrated chip 10, for example, the software module includes the neural network code and weight parameters required by the NPU 12 in the AI integrated chip 10.
  • the device manufacturer obtains a terminal including the AI integrated chip 10 based on the hardware module of the AI integrated chip 10 provided by the chip manufacturer and the software module of the AI integrated chip 10 provided by the AI manufacturer.
  • the neural network parameter deployment method includes:
  • Step 61: the first processing end writes the decryption key or the descrambling key into the key storage unit 15 of the AI integrated chip 10.
  • the chip manufacturer and the AI manufacturer can negotiate a decryption key or a descrambling key, which can be generated by the AI manufacturer and provided to the chip manufacturer, or generated by the chip manufacturer and provided to the AI manufacturer.
  • the key storage unit 15 may be an OTP device, and the chip manufacturer may write the decryption key or the descrambling key into the OTP device of the AI integrated chip 10 through the first processing terminal.
  • the chip manufacturer may provide the hardware module of the AI integrated chip 10 implanted with the decryption key or descrambling key to the device manufacturer.
  • the first processing end may be a burning device, and the chip manufacturer uses the burning device to write the decryption key or descrambling key into the OTP device of the AI integrated chip 10 .
  • Step 62: the second processing end encrypts the neural network code and weight parameters associated with the NPU 12 to obtain ciphertext data.
  • after the AI manufacturer finishes developing the neural network code and weight parameters associated with the NPU 12, it encrypts them (the encryption key used may be the same as the decryption key) to obtain the ciphertext data (an illustrative encryption sketch appears after this list).
  • the AI manufacturer can use a personal computer (PC) or a server to encrypt the neural network code and weight parameters and obtain the ciphertext data.
  • the AI manufacturer can provide the ciphertext data to the device manufacturer online or offline.
  • for example, the equipment manufacturer can purchase the ciphertext data from the AI manufacturer offline, or conduct an online transaction with the AI manufacturer, such as requesting the ciphertext data from the AI manufacturer's cloud server; after the transaction is confirmed, the cloud server sends the ciphertext data to the device manufacturer (a PC or server on the device manufacturer's side).
  • Step 63: the third processing end writes the ciphertext data into the second on-chip storage unit 14 of the AI integrated chip 10.
  • the device manufacturer can write the ciphertext data into the second on-chip storage unit 14 of the AI integrated chip 10 through the third processing end (a burning-tool sketch of this step appears after this list), and when the AI integrated chip 10 later performs AI inference, the decryption unit 16 can use the built-in decryption key or descrambling key to decrypt the ciphertext data and obtain the plaintext neural network code and weight parameters.
  • the second on-chip storage unit 14 may include any one of eFalsh, eNVM, MRAM or RRAM.
  • the third processing end may be a burning device, and the device manufacturer uses the burning device to write the ciphertext data into the second on-chip storage unit 14 of the AI integrated chip 10 .
  • after the AI integrated chip 10 is powered up, the decryption unit 16 can automatically read the decryption key or descrambling key stored in the key storage unit 15.
  • the OTP device includes an OTP memory and an OTP controller.
  • when the AI integrated chip 10 is powered up, the OTP controller automatically controls the OTP memory to transmit the decryption key or descrambling key to the decryption unit 16, so that the decryption unit 16 completes the reading of the decryption key or descrambling key automatically after power-up.
  • the OTP device can be configured so that only the decryption unit 16 is permitted to read it, which avoids leakage of the decryption key or descrambling key.
  • for example, the key storage unit 15 is provided with only one read port, and that read port is electrically connected to the decryption unit 16.
  • if the AI manufacturer needs to upgrade the neural network code and weight parameters, it can encrypt the upgraded neural network code and weight parameters through the second processing end and provide the new ciphertext data to the equipment manufacturer.
  • the device manufacturer can write new ciphertext data into the second on-chip storage unit 14 of the AI integrated chip 10 through the third processing terminal.
  • compared with the existing scheme, in which the equipment manufacturer writes the decryption key and the plaintext neural network code and weight parameters into the AI integrated chip, in the above neural network parameter deployment method the chip manufacturer writes the decryption key into the AI integrated chip and then provides the AI integrated chip storing the decryption key to the equipment manufacturer; the AI manufacturer encrypts the neural network code and weight parameters and provides them to the equipment manufacturer in ciphertext form, and the equipment manufacturer writes the ciphertext data into the second on-chip storage unit. This prevents the equipment manufacturer from obtaining the decryption key or the plaintext neural network code and weight parameters. Moreover, because the ciphertext data is stored in on-chip eFalsh, eNVM, MRAM or RRAM rather than in off-chip memory as in the existing scheme, the security level of the neural network code and weight parameters is further improved.
  • a neural network parameter deployment method provided by an embodiment of the present application is applied to the third processing end.
  • the third processing end may be an electronic device on the side of the device manufacturer.
  • the neural network parameter deployment method includes:
  • Step 70: obtain ciphertext data.
  • the ciphertext data may refer to the data obtained by encrypting the neural network code and weight parameters associated with the NPU 12.
  • the AI manufacturer can encrypt the neural network code and weight parameters to obtain ciphertext data, and provide the ciphertext data to the equipment manufacturer.
  • the AI manufacturer can provide the ciphertext data to the device manufacturer online or offline.
  • for example, the equipment manufacturer can purchase the ciphertext data from the AI manufacturer offline, or conduct an online transaction with the AI manufacturer, such as requesting the ciphertext data from the AI manufacturer's cloud server; after confirming the online transaction, the cloud server sends the ciphertext data to the device manufacturer (a PC or server on the device manufacturer's side).
  • Step 71: write the ciphertext data into the second on-chip storage unit 14 of the AI integrated chip 10.
  • when the device manufacturer obtains the ciphertext data, it can write the ciphertext data into the second on-chip storage unit 14 of the AI integrated chip 10 through the third processing end.
  • the second on-chip storage unit 14 may include any one of eFalsh, eNVM, MRAM or RRAM.
  • the third processing end may be a burning device, and the device manufacturer uses the burning device to write the ciphertext data into the second on-chip storage unit 14 of the AI integrated chip 10 .
  • a decryption key for decrypting ciphertext data is pre-written in the one-time programmable OTP device of the AI integrated chip.
  • the chip manufacturer can write the decryption key or descrambling key into the OTP device of the AI integrated chip 10, and the chip manufacturer then provides the AI integrated chip 10 implanted with the decryption key or descrambling key to the equipment manufacturer .
  • for the decryption key or descrambling key stored in the OTP device, read logic defined by a program in the CPU 11 or the NPU 12 can direct the decryption unit 16 to complete the reading of the decryption key or descrambling key, and the read operation can be made invisible to application software.
  • the OTP device can set the permission that only the decryption unit 16 is allowed to read, so as to avoid the leakage of the decryption key or descrambling key.
  • the key storage unit 15 is provided with only one read port, and the read port is electrically connected to the decryption unit 16 .
  • from the equipment manufacturer's point of view, the decryption key is written into the OTP device of the AI integrated chip by the chip manufacturer, and the AI integrated chip storing the decryption key is then provided to the equipment manufacturer.
  • meanwhile, the neural network code and weight parameters are provided to the device manufacturer in ciphertext form, so the device manufacturer cannot obtain the decryption key or the plaintext neural network code and weight parameters; furthermore, because the ciphertext data is stored in on-chip eFalsh, eNVM, MRAM or RRAM rather than in off-chip memory as in the existing scheme, the security level of the neural network code and weight parameters is further improved.
  • the electronic device 1000 may include a first processor 1001 , a first memory 1002 and a first communication bus 1003 .
  • the first memory 1002 is used to store one or more computer programs 1004 .
  • One or more computer programs 1004 are configured to be executed by the first processor 1001 .
  • the one or more computer programs 1004 include instructions, which can be used to implement, in the electronic device 1000, the neural network parameter deployment method described in FIG. 5.
  • the structure shown in this embodiment does not constitute a specific limitation on the electronic device 1000 .
  • the electronic device 1000 may include more or fewer components than shown, or combine certain components, or separate certain components, or arrange different components.
  • the first processor 1001 may include one or more processing units, for example: the first processor 1001 may include an application processor (application processor, AP), a modem, a graphics processing unit (graphics processing unit, GPU), an image signal processor (image signal processor, ISP), controller, video codec, digital signal processor (digital signal processor, DSP), baseband processor, and/or neural network processor (neural-network processing unit, NPU), etc. Wherein, different processing units may be independent devices, or may be integrated in one or more processors.
  • the first processor 1001 may also be provided with a memory for storing instructions and data.
  • the memory in the first processor 1001 is a cache memory.
  • the memory may store instructions or data that the first processor 1001 has just used or uses cyclically. If the first processor 1001 needs the instructions or data again, they can be fetched directly from this memory, which avoids repeated accesses, reduces the waiting time of the first processor 1001 and thereby improves system efficiency.
  • the first processor 1001 may include one or more interfaces.
  • the interface may include an integrated circuit (inter-integrated circuit, I2C) interface, an integrated circuit built-in audio (inter-integrated circuit sound, I2S) interface, a pulse code modulation (pulse code modulation, PCM) interface, a universal asynchronous transmitter (universal asynchronous receiver/transmitter, UART) interface, mobile industry processor interface (mobile industry processor interface, MIPI), general-purpose input/output (general-purpose input/output, GPIO) interface, SIM interface, and/or USB interface, etc.
  • the first memory 1002 can include high-speed random access memory, and can also include non-volatile memory, such as hard disk, internal memory, plug-in hard disk, smart memory card (Smart Media Card, SMC), secure digital (Secure Digital, SD) card, flash memory card (Flash Card), at least one disk storage device, flash memory device, or other non-volatile solid-state storage devices.
  • This embodiment also provides a computer storage medium that stores computer instructions; when the computer instructions are run on the electronic device, the electronic device executes the above-mentioned related method steps to realize the neural network parameter deployment method shown in FIG. 5.
  • This embodiment also provides a computer program product, which, when running on a computer, causes the computer to execute the above related steps, so as to realize the neural network parameter deployment method as shown in FIG. 5 .
  • the electronic device, computer storage medium and computer program product provided in this embodiment are all used to execute the corresponding method provided above; therefore, for the beneficial effects they can achieve, reference may be made to the beneficial effects of the corresponding method provided above, which are not repeated here.
  • the disclosed devices and methods may be implemented in other ways.
  • the device embodiments described above are schematic.
  • the division of the modules or units is a logical function division.
  • in actual implementation there may be other division methods; for example, multiple units or components may be combined or integrated into another device, or some features may be ignored or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be in electrical, mechanical or other forms.
  • the unit described as a separate component may or may not be physically separated, and a component displayed as a unit may be one physical unit or multiple physical units, that is, it may be located in one place, or may be distributed to multiple different places. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units can be implemented in the form of hardware or in the form of software functional units.
  • if the integrated unit is realized in the form of a software functional unit and sold or used as an independent product, it can be stored in a readable storage medium.
  • based on this understanding, the technical solutions of the embodiments of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solutions, can be embodied in the form of a software product; the software product is stored in a storage medium and includes several instructions for causing a device (which may be a single-chip microcomputer, a chip, etc.) or a processor to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • the aforementioned storage media include: a U disk (USB flash drive), a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, and other media that can store program code.
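
For readers who want a concrete picture of the deployment flow summarized above, the following Python sketch illustrates Steps 61 and 63: the chip manufacturer's burning tool writing the negotiated decryption key into the OTP device, and the device manufacturer's burning tool later writing the purchased ciphertext data into the second on-chip storage unit. The BurnerSession, write_otp and write_envm names are hypothetical stand-ins for whatever production programming tool is actually used; they are illustrative assumptions, not part of this application.

```python
# Hypothetical sketch of Steps 61 and 63: key provisioning by the chip
# manufacturer (first processing end) and ciphertext deployment by the
# device manufacturer (third processing end). The burner interface below
# is an illustrative assumption, not a real API.
from dataclasses import dataclass, field

@dataclass
class BurnerSession:
    """Stands in for a burning tool attached to one AI integrated chip."""
    otp: bytes | None = None                               # one-time programmable key slot
    envm: dict[int, bytes] = field(default_factory=dict)   # second on-chip storage unit

    def write_otp(self, key: bytes) -> None:
        # Step 61: the chip manufacturer writes the negotiated decryption key.
        if self.otp is not None:
            raise RuntimeError("OTP already programmed; it cannot be rewritten")
        if len(key) != 32:
            raise ValueError("expected a 256-bit key")
        self.otp = key

    def write_envm(self, address: int, blob: bytes) -> None:
        # Step 63: the device manufacturer writes the ciphertext data into the
        # second on-chip storage unit (eFalsh / eNVM / MRAM / RRAM).
        self.envm[address] = blob

# Chip manufacturer side, at chip production time
session = BurnerSession()
negotiated_key = bytes(32)      # placeholder for the key agreed with the AI manufacturer
session.write_otp(negotiated_key)

# Device manufacturer side, later in the flow
ciphertext = b"..."             # ciphertext data purchased from the AI manufacturer
session.write_envm(0x0000, ciphertext)
```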
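
Step 62 can likewise be sketched. Here AES-256-GCM from the Python cryptography package stands in for whichever encryption or scrambling algorithm the parties actually negotiate, and the file names and the idea of packing the neural network code and weight parameters into one blob are assumptions made only for the example.

```python
# Illustrative sketch of Step 62: the AI manufacturer (second processing end)
# encrypts the neural network code and weight parameters offline.
# AES-256-GCM is only an example; this application does not fix the algorithm.
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def encrypt_model(code: bytes, weights: bytes, key: bytes) -> bytes:
    """Return nonce || ciphertext for the concatenated model blob."""
    aesgcm = AESGCM(key)                       # key negotiated with the chip manufacturer
    nonce = os.urandom(12)                     # 96-bit nonce, as recommended for GCM
    blob = len(code).to_bytes(8, "big") + code + weights
    return nonce + aesgcm.encrypt(nonce, blob, None)

if __name__ == "__main__":
    key = AESGCM.generate_key(bit_length=256)               # in practice, the negotiated key
    code = open("npu_network_code.bin", "rb").read()        # hypothetical file names
    weights = open("npu_weights.bin", "rb").read()
    open("model_ciphertext.bin", "wb").write(encrypt_model(code, weights, key))
```

The resulting model_ciphertext.bin is what the device manufacturer receives and burns into the chip; only a chip whose OTP device already holds the matching key can recover the plaintext.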
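
Finally, the on-chip behaviour described in the list — a key storage unit whose single read port is wired only to the decryption unit, and decryption performed only when the NPU reads the stored ciphertext — is hardware, but its control flow can be modelled in software. The class and method names below are hypothetical models of the units, not an implementation of the chip; the decrypt-on-write variant would simply invoke the decryption unit inside the CPU write path and store plaintext instead.

```python
# Hypothetical software model of the chip-side units: an OTP key storage unit
# readable only by the decryption unit, and a second on-chip storage unit that
# stores ciphertext and decrypts it only when the NPU reads it.
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

class KeyStorageUnit:
    def __init__(self, key: bytes):
        self._key = key                        # written once by the chip manufacturer

    def read(self, caller: object) -> bytes:
        # Only the decryption unit is allowed to read the key slot.
        if not isinstance(caller, DecryptionUnit):
            raise PermissionError("key storage unit is readable only by the decryption unit")
        return self._key

class DecryptionUnit:
    def __init__(self, key_store: KeyStorageUnit):
        # On power-up the decryption unit fetches the key automatically.
        self._key = key_store.read(self)

    def decrypt(self, blob: bytes) -> bytes:
        nonce, ciphertext = blob[:12], blob[12:]
        return AESGCM(self._key).decrypt(nonce, ciphertext, None)

class SecondOnChipStorageUnit:
    """Write port used only by the CPU, read port used only by the NPU."""
    def __init__(self):
        self._cells = b""

    def cpu_write(self, ciphertext: bytes) -> None:
        self._cells = ciphertext               # kept in ciphertext form

    def npu_read(self, decryption_unit: DecryptionUnit) -> bytes:
        # The NPU invokes the decryption unit only at read time.
        return decryption_unit.decrypt(self._cells)
```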

Abstract

Embodiments of the present application provide an AI computing apparatus, relating to the field of terminals. The AI computing apparatus includes an AI integrated chip and an off-chip memory; the AI integrated chip includes a CPU, an NPU, a first on-chip storage unit and a second on-chip storage unit, and the second on-chip storage unit is configured with a permission allowing only the NPU to read and a permission allowing only the CPU to write. The off-chip memory stores a first neural network code and first weight parameters associated with the NPU, and the second on-chip storage unit stores a second neural network code and second weight parameters associated with the NPU. Embodiments of the present application further provide an AI integrated chip, a neural network parameter deployment method, an electronic device and a computer-readable storage medium. The present application stores the relatively important second neural network code and second weight parameters inside the AI chip, which raises the security level of the second neural network code and second weight parameters and reduces the bandwidth required for the NPU to access the off-chip memory.

Description

神经网络参数部署方法、AI集成芯片及其相关装置
本申请要求于2021年08月11日提交中国专利局,申请号为202110919731.6、申请名称为“神经网络参数部署方法、AI集成芯片及其相关装置”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请涉及人工智能领域,尤其涉及一种神经网络参数部署方法、人工智能(Artificial Intelligence,AI)集成芯片、AI计算装置、电子设备及计算机可读存储介质。
背景技术
随着信息技术与AI技术的演进,AI计算逐渐从云端扩展到终端,具备AI计算能力的智能终端逐渐在智能手机、公共安全、汽车辅助驾驶、智慧家庭等领域普及。然而现有的AI计算所需要的高算力引起的高成本、性能瓶颈、功耗瓶颈、安全隐患等始终是智能终端设备面临的挑战。
如图1所示,现有的AI计算系统一般包括神经网络处理单元(Neural-network Processing Unit,NPU)、中央处理单元(Central Processing Unit,CPU)、片内静态随机访问存储器(Static Random Access Memory,SRAM)及片外动态随机访问存储器(Dynamic Random Access Memory,DRAM)。DRAM用于存储NPU计算所需要的软件代码,权重参数、待处理数据等。NPU的软件代码和权重参数具有极高的商业价值,存储在片外DRAM很难进行保护,容易被破解。
发明内容
有鉴于此,有必要提供一种神经网络参数部署方法、AI集成芯片、AI计算装置、电子设备及计算机可读存储介质,其可提升NPU软件代码和权重参数的安全等级。
本申请实施例第一方面公开了一种AI计算装置,AI计算装置包括AI集成芯片及片外存储器。AI集成芯片包括CPU、NPU、第一片内存储单元及第二片内存储单元。片外存储器用于存储与NPU关联的第一神经网络代码及第一权重参数。第二片内存储单元用于存储与NPU关联的第二神经网络代码及第二权重参数,NPU用于基于第一神经网络代码、第二神经网络代码、第一权重参数及第二权重参数对指定数据进行处理。第二片内存储单元设置有仅允许NPU读取的权限,及仅允许CPU写入的权限。
通过采用该技术方案,增设第二片内存储单元来存储与NPU关联的第二神经网络代码及第二权重参数,例如第二片内存储单元可以是嵌入式非易失存储器,将相对重要的第二神经网络代码与第二权重参数存储在第二片内存储单元,将相对不重要的第一神经网络代码与第一权重参数存储在片外存储器,可提升第二神经网络代码与第二权重参数的安全等级,NPU可从第二片内存储单元读取第二神经网络代码与第二权重参数,可降低NPU访问片外存储器的带宽需求,同时将第二片内存储单元设置为仅允许NPU读取、CPU写入,可以进一步提升第二神经网络代码与第二权重参数的安全等级。
在一些实施例中,AI集成芯片还包括第三片内存储单元,第三片内存储单元用于缓存从片外存储器读取的第一神经网络代码及第一权重参数的至少部分。
通过采用该技术方案,增设第三片内存储单元来缓存NPU从片外存储器读取的第一神经网络代码及第一权重参数,例如第三片内存储器可以是嵌入式非易失存储器,相比NPU从片外存储器读取第一神经网络代码及第一权重参数的速度,NPU从第三片内存储单元读取第一神经网络代码及第一权重参数的速度更快,可以提升NPU的数据处理效率。
在一些实施例中,CPU还用于接收密文数据,并将密文数据写入至第二片内存储单元,密文数据为对第二神经网络代码及第二权重参数进行加密得到的。
通过采用该技术方案,AI厂商将神经网络代码和权重参数使用加密密钥加密后发送给设备厂商,设备厂商将密文的神经网络代码以及权重参数写入芯片的内置eNWM(上述第二片内存储单元)中,而该芯片内的一次性可编程OTP器件中内置有解密密钥,该解密密钥用于对密文的神经网络代码以及权重参数进行解密,该解密密钥是由AI厂商和芯片厂商协商好后并由芯片厂商植入芯片内部的,AI厂商无需将解密密钥发送给设备厂商,而是由芯片厂商内置在芯片中,设备厂商的设备在使用了上述芯片后可以正常使用该解密密钥解密密文的神经网络代码和权重参数,设备厂商或其他任何第三方厂商无法获取明文的神经网络代码,有效保护了AI厂商的知识产权。
在一些实施例中,AI集成芯片还包括密钥存储单元及解密单元,解密密钥在密文数据写入至第二片内存储单元之前存储至密钥存储单元,密钥存储单元设置有仅允许解密单元读取的权限,NPU还用于从第二片内存储单元读取密文数据,并调用解密单元使用解密密钥对密文数据进行解密,得到第二神经网络代码及第二权重参数。
通过采用该技术方案,由芯片厂商将解密密钥先写入至芯片内的密钥存储单元如OTP器件中,避免解密密钥泄露给设备厂商,同时可以仅为密钥存储单元设置一个读端口,该读端口提供给解密单元,实现仅允许解密单元读取解密密钥,进一步避免解密密钥泄露,再者可以在NPU从第二片内存储单元读取密文数据时再对密文数据进行解密处理,避免设备厂商获取明文的第二神经网络代码及第二权重参数。
在一些实施例中,AI集成芯片还包括密钥存储单元及解密单元,密钥存储单元用于存储解密密钥,密钥存储单元设置有仅允许解密单元读取的权限,CPU用于接收密文数据,并调用解密单元使用解密密钥对密文数据进行解密,密文数据为对第二神经网络代码及第二权重参数进行加密得到的,CPU还用于将解密得到的第二神经网络代码及第二权重参数写入至第二片内存储单元。
通过采用该技术方案,由芯片厂商将解密密钥先写入至芯片内的密钥存储单元如OTP器件中,示例性的,该OTP器件仅设置有一个读端口,该读端口连接至芯片内的解密单元,该解密密钥是由AI厂商和芯片厂商协商好后并由芯片厂商植入芯片内部的。设备厂商从AI厂商购买密文的神经网络代码和权重参数,以及从芯片厂商购买植入了解密密钥的芯片,该芯片内还设置有eNVM,设备厂商购买了该密文的神经网络代码和权重参数后,将该密文的神经网络代码和权重参数写入该芯片内的eNVM中,芯片内的解密单元通过调用内置在OTP器件中的解密密钥对密文的神经网络代码和权重参数进行解密后使用。这样,AI厂商无需将解密密钥发送给设备厂商,而是由芯片厂商内置在芯片中,设备厂商的设备在使用了上述芯片后可以正常使用该解密密钥解密密文的神经网络代码和权重参数,设备厂商或其他任何第三方厂商无法获取明文的神经网络代码,有效保护了AI厂商的知识产权。
在一些实施例中,密钥存储单元包括一次性可编程OTP器件。
通过采用该技术方案,使用OTP器件存储解密密钥,由于OTP器件的一次性可编程特性,可以避免解密密钥被篡改,且由于OTP器件具有很强的抗攻击、抗反向工程能力,可提升解密密钥的存储安全性。
在一些实施例中,片外存储器还用于存储指定数据及指定数据的处理结果,第一片内存储单元用于缓存从片外存储器读取的指定数据。
通过采用该技术方案,通过第一片内存储单元来缓存从片外存储器读取的指定数据,例如第一片内存储单元可以是片内SRAM,相比NPU从片外存储器读取指定数据的速度,NPU从第一片内存储单元读取指定数据的速度更快,可以加快NPU读取指定数据的速度,提升NPU的数据处理效率。
在一些实施例中,第一片内存储单元包括SRAM,片外存储器包括DRAM,第二片内存储单元包括eFalsh、eNVM、MRAM或RRAM中的任意一种,第三片内存储单元包括eFalsh、eNVM、MRAM或RRAM中的任意一种。
本申请实施例第二方面公开了一种AI集成芯片,AI集成芯片包括CPU、NPU、第一片内存储单元及第二片内存储单元。第二片内存储单元用于存储与NPU关联的神经网络代码及权重参数,NPU用于基于神经网络代码及权重参数对指定数据进行处理。第二片内存储单元包括eFalsh、eNVM、MRAM或RRAM中的任意一种。
通过采用该技术方案,增设第二片内存储单元来存储与NPU关联的神经网络代码及权重参数,相比现有将神经网络代码及权重参数存储在片外存储器,可提升神经网络代码与权重参数的安全等级,NPU从第二片内存储单元读取神经网络代码与权重参数,可降低NPU访问片外存储器的带宽需求,提升NPU性能,同时将神经网络代码及权重参数写入至片内eFalsh、eNVM、MRAM或RRAM进行储存,设备厂商无法窃取AI厂商所提供的神经网络代码及权重参数。
在一些实施例中,第二片内存储单元仅设置一个读端口与一个写端口,读端口电连接于NPU,写端口电连接于所述CPU。
通过采用该技术方案,将第二片内存储单元设置为仅允许NPU读取的权限及仅允许CPU写入的权限,可以进一步提高第二片内存储单元存储的神经网络代码与权重参数的安全等级。
在一些实施例中,CPU还用于接收密文数据,并将密文数据写入至第二片内存储单元,密文数据为对神经网络代码及权重参数进行加密得到的。
通过采用该技术方案,AI厂商将神经网络代码和权重参数使用加密密钥加密后发送给设备厂商,设备厂商将密文的神经网络代码以及权重参数写入芯片的内置eNWM(上述第二片内存储单元)中,而该芯片内的一次性可编程OTP器件中内置有解密密钥,该解密密钥用于对密文的神经网络代码以及权重参数进行解密,该解密密钥是由AI厂商和芯片厂商协商好后并由芯片厂商植入芯片内部的,AI厂商无需将解密密钥发送给设备厂商,而是由芯片厂商内置在芯片中,设备厂商的设备在使用了上述芯片后可以正常使用该解密密钥解密密文的神经网络代码和权重参数,设备厂商或其他任何第三方厂商无法获取明文的神经网络代码,有效保护了AI厂商的知识产权。
在一些实施例中,AI集成芯片还包括密钥存储单元及解密单元,解密密钥在密文数据写入至第二片内存储单元之前存储至密钥存储单元,密钥存储单元设置有仅允许解密单元读 取的权限,NPU还用于从第二片内存储单元读取密文数据,并调用解密单元使用解密密钥对密文数据进行解密,得到神经网络代码及权重参数。
通过采用该技术方案,由芯片厂商将解密密钥先写入至芯片内的密钥存储单元如OTP器件中,避免解密密钥泄露给设备厂商,同时可以仅为密钥存储单元设置一个读端口,该读端口提供给解密单元,实现仅允许解密单元读取解密密钥,进一步避免解密密钥泄露,再者可以在NPU从第二片内存储单元读取密文数据时再对密文数据进行解密处理,避免设备厂商获取明文的第二神经网络代码及第二权重参数。
在一些实施例中,AI集成芯片还包括密钥存储单元及解密单元,密钥存储单元用于存储解密密钥,密钥存储单元设置有仅允许解密单元读取的权限,CPU用于接收密文数据,并调用解密单元使用解密密钥对密文数据进行解密,密文数据为对神经网络代码及权重参数进行加密得到的,CPU还用于将解密得到的神经网络代码及权重参数写入至第二片内存储单元。
通过采用该技术方案,由芯片厂商将解密密钥先写入至芯片内的密钥存储单元如OTP器件中,示例性的,该OTP器件仅设置有一个读端口,该读端口连接至芯片内的解密单元,该解密密钥是由AI厂商和芯片厂商协商好后并由芯片厂商植入芯片内部的。设备厂商从AI厂商购买密文的神经网络代码和权重参数,以及从芯片厂商购买植入了解密密钥的芯片,该芯片内还设置有eNVM,设备厂商购买了该密文的神经网络代码和权重参数后,将该密文的神经网络代码和权重参数写入该芯片内的eNVM中,芯片内的解密单元通过调用内置在OTP器件中的解密密钥对密文的神经网络代码和权重参数进行解密后使用。这样,AI厂商无需将解密密钥发送给设备厂商,而是由芯片厂商内置在芯片中,设备厂商的设备在使用了上述芯片后可以正常使用该解密密钥解密密文的神经网络代码和权重参数,设备厂商或其他任何第三方厂商无法获取明文的神经网络代码,有效保护了AI厂商的知识产权。
在一些实施例中,密钥存储单元包括一次性可编程OTP器件。
通过采用该技术方案,使用OTP器件存储解密密钥,由于OTP器件的一次性可编程特性,可以避免解密密钥被篡改,且由于OTP器件具有很强的抗攻击、抗反向工程能力,可提升解密密钥的存储安全性。
第三方面,本申请实施例提供一种神经网络参数部署方法,应用于AI计算装置,AI计算装置包括AI集成芯片及片外存储器。AI集成芯片包括CPU、NPU、第一片内存储单元及第二片内存储单元。神经网络参数部署包括:将与NPU关联的第一神经网络代码、第一权重参数存储至片外存储器;将与NPU关联的第二神经网络代码及第二权重参数存储至第二片内存储单元;其中,NPU用于基于第一神经网络代码、第二神经网络代码、第一权重参数及第二权重参数对指定数据进行处理,第二片内存储单元设置有仅允许NPU读取的权限,及仅允许CPU写入的权限。
通过采用该技术方案,增设第二片内存储单元来存储与NPU关联的第二神经网络代码及第二权重参数,例如第二片内存储单元可以是嵌入式非易失存储器,将相对重要的第二神经网络代码与第二权重参数存储在第二片内存储单元,将相对不重要的第一神经网络代码与第一权重参数存储在片外存储器,可提升第二神经网络代码与第二权重参数的安全等级,NPU可从第二片内存储单元读取神经网络代码与权重参数,可降低NPU访问片外存储器的带宽需求,同时将第二片内存储单元设置为仅允许NPU读取、CPU写入,可以进一步提升第二神经网络代码与第二权重参数的安全等级。
在一些实施例中,AI集成芯片还包括第三片内存储单元,神经网络参数部署方法还 包括:从片外存储器读取第一神经网络代码及第一权重参数后缓存在第三片内存储单元中。
通过采用该技术方案,增设第三片内存储单元来缓存NPU从片外存储器读取的第一神经网络代码及第一权重参数,例如第三片内存储器可以是嵌入式非易失存储器,相比NPU从片外存储器读取第一神经网络代码及第一权重参数的速度,NPU从第三片内存储单元读取第一神经网络代码及第一权重参数的速度更快,可以提升NPU的数据处理效率。
在一些实施例中,神经网络参数部署方法还包括:接收加密装置发送的密文数据,并将密文数据写入至第二片内存储单元;其中,密文数据为对第二神经网络代码及第二权重参数进行加密得到的。
通过采用该技术方案,AI厂商将神经网络代码和权重参数使用加密密钥加密后发送给设备厂商,设备厂商将密文的神经网络代码以及权重参数写入芯片的内置eNWM(上述第二片内存储单元)中,而该芯片内的一次性可编程OTP器件中内置有解密密钥,该解密密钥用于对密文的神经网络代码以及权重参数进行解密,该解密密钥是由AI厂商和芯片厂商协商好后并由芯片厂商植入芯片内部的,AI厂商无需将解密密钥发送给设备厂商,而是由芯片厂商内置在芯片中,设备厂商的设备在使用了上述芯片后可以正常使用该解密密钥解密密文的神经网络代码和权重参数,设备厂商或其他任何第三方厂商无法获取明文的神经网络代码,有效保护了AI厂商的知识产权。
在一些实施例中,神经网络参数部署方法还包括:当NPU从第二片内存储单元读取密文数据时,调用预先存储的解密密钥对密文数据进行解密,得到第二神经网络代码及第二权重参数。
通过采用该技术方案,由芯片厂商将解密密钥先写入至芯片内的密钥存储单元如OTP器件中,避免解密密钥泄露给设备厂商,同时可以仅为密钥存储单元设置一个读端口,该读端口提供给解密单元,实现仅允许解密单元读取解密密钥,进一步避免解密密钥泄露,再者可以在NPU从第二片内存储单元读取密文数据时再对密文数据进行解密处理,避免设备厂商获取明文的第二神经网络代码及第二权重参数。
在一些实施例中,神经网络参数部署方法还包括:接收加密装置发送的密文数据,并调用预先存储的解密密钥对密文数据进行解密;将解密得到的第二神经网络代码及第二权重参数写入至第二片内存储单元;其中,密文数据为对第二神经网络代码及第二权重参数进行加密得到的。
通过采用该技术方案,由芯片厂商将解密密钥先写入至芯片内的密钥存储单元如OTP器件中,示例性的,该OTP器件仅设置有一个读端口,该读端口连接至芯片内的解密单元,该解密密钥是由AI厂商和芯片厂商协商好后并由芯片厂商植入芯片内部的。设备厂商从AI厂商购买密文的神经网络代码和权重参数,以及从芯片厂商购买植入了解密密钥的芯片,该芯片内还设置有eNVM,设备厂商购买了该密文的神经网络代码和权重参数后,将该密文的神经网络代码和权重参数写入该芯片内的eNVM中,芯片内的解密单元通过调用内置在OTP器件中的解密密钥对密文的神经网络代码和权重参数进行解密后使用。这样,AI厂商无需将解密密钥发送给设备厂商,而是由芯片厂商内置在芯片中,设备厂商的设备在使用了上述芯片后可以正常使用该解密密钥解密密文的神经网络代码和权重参数,设备厂商或其他任何第三方厂商无法获取明文的神经网络代码,有效保护了AI厂商的知识产权。
第四方面,本申请实施例提供一种神经网络参数部署方法,包括:第一处理端将解 密密钥写入至AI集成芯片的OTP器件;第二处理端对神经网络代码及权重参数进行加密,得到密文数据;第三处理端将密文数据写入至AI集成芯片的片内存储单元;其中,解密密钥用于对密文数据进行解密,片内存储单元包括eFalsh、eNVM、MRAM或RRAM中的任意一种。
通过采用该技术方案,相比现有方案由设备厂商将解密密钥写入至AI集成芯片,改由芯片厂商将解密密钥先写入至芯片内的密钥存储单元如OTP器件中,示例性的,该OTP器件仅设置有一个读端口,该读端口连接至芯片内的解密单元,示例性的,该解密密钥是由AI厂商和芯片厂商协商好后并由芯片厂商植入芯片内部的,设备厂商从AI厂商购买密文的神经网络代码和权重参数,以及从芯片厂商购买植入了解密密钥的芯片,该芯片内还设置有eFalsh、eNVM、MRAM或RRAM,设备厂商购买了该密文的神经网络代码和权重参数后,将该密文的神经网络代码和权重参数写入该芯片内的eFalsh、eNVM、MRAM或RRAM中,芯片内的解密单元通过调用内置在OTP器件中的解密密钥对密文的神经网络代码和权重参数进行解密后使用。这样,AI厂商无需将解密密钥发送给设备厂商,而是由芯片厂商内置在芯片中,设备厂商的设备在使用了上述芯片后可以正常使用该解密密钥解密密文的神经网络代码和权重参数,设备厂商或其他任何第三方厂商无法获取明文的神经网络代码,有效保护了AI厂商的知识产权,再者将密文数据存储在片内eFalsh、eNVM、MRAM或RRAM中,相比现有方案将神经网络代码与权重参数存储在片外存储器,可以进一步提升神经网络代码与权重参数的安全等级。
第五方面,本申请实施例提供一种神经网络参数部署方法,包括:获取密文数据,密文数据为对神经网络代码及权重参数进行加密得到的;将密文数据写入至AI集成芯片的片内存储单元;其中,AI集成芯片的OTP器件中预先写入有对密文数据进行解密的解密密钥,片内存储单元包括eFalsh、eNVM、MRAM或RRAM中的任意一种。
通过采用该技术方案,对于设备厂商而言,解密密钥是有芯片厂商写入至AI集成芯片的OTP器件,例如该解密密钥是由AI厂商和芯片厂商协商好后,由芯片厂商植入芯片内部,设备厂商从AI厂商购买密文的神经网络代码和权重参数,以及从芯片厂商购买植入了解密密钥的芯片,该芯片内还设置有eFalsh、eNVM、MRAM或RRAM,设备厂商购买了该密文的神经网络代码和权重参数后,将该密文的神经网络代码和权重参数写入该芯片内的eFalsh、eNVM、MRAM或RRAM中,芯片内的解密单元通过调用内置在OTP器件中的解密密钥对密文的神经网络代码和权重参数进行解密后使用,AI厂商无需将解密密钥发送给设备厂商,而是由芯片厂商内置在芯片中,设备厂商的设备在使用了上述芯片后可以正常使用该解密密钥解密密文的神经网络代码和权重参数,设备厂商或其他任何第三方厂商无法获取明文的神经网络代码,有效保护了AI厂商的知识产权,再者将密文数据存储在片内eFalsh、eNVM、MRAM或RRAM中,相比现有方案将神经网络代码与权重参数存储在片外存储器,可以进一步提升神经网络代码与权重参数的安全等级。
在一些实施例中,AI集成芯片包括元CPU、NPU、第一片内存储单元及第二片内存储单元,将密文数据写入至AI集成芯片的片内存储单元,包括:将密文数据写入至第二片内存储单元,其中第二片内存储单元设置有仅允许NPU读取的权限,及仅允许CPU写入的权限。
通过采用该技术方案,NPU可从第二片内存储单元读取神经网络代码与权重参数,可降低NPU访问片外存储器的带宽需求,同时将第二片内存储单元设置为仅允许NPU读取、CPU写入,可以进一步提升第二神经网络代码与第二权重参数的安全等级。
在一些实施例中,神经网络参数部署方法还包括:当NPU从第二片内存储单元读取密文数据时,调用OTP器件存储的解密密钥对密文数据进行解密,得到神经网络代码及权重参数。
通过采用该技术方案,由芯片厂商将解密密钥先写入至芯片内的密钥存储单元如OTP器件中,避免解密密钥泄露给设备厂商,同时可以仅为密钥存储单元设置一个读端口,该读端口提供给解密单元,实现仅允许解密单元读取解密密钥,进一步避免解密密钥泄露,再者可以在NPU从第二片内存储单元读取密文数据时再对密文数据进行解密处理,避免设备厂商获取明文的第二神经网络代码及第二权重参数。
在一些实施例中,将密文数据写入至AI集成芯片的片内存储单元,包括:调用解密密钥对密文数据进行解密;将解密得到的神经网络代码及权重参数写入至第二片内存储单元。
通过采用该技术方案,由芯片厂商将解密密钥先写入至芯片内的密钥存储单元如OTP器件中,避免解密密钥泄露给设备厂商,同时可以仅为密钥存储单元设置一个读端口,该读端口提供给解密单元,实现仅允许解密单元读取解密密钥,进一步避免解密密钥泄露,再者可以在将密文数据写入至AI集成芯片的片内存储单元时再对密文数据进行解密处理,避免设备厂商获取明文的第二神经网络代码及第二权重参数。
第六方面,本申请实施例提供一种计算机可读存储介质,包括计算机指令,当计算机指令在电子设备上运行时,使得电子设备执行如第三方面所述的神经网络参数部署方法。
第七方面,本申请实施例提供一种电子设备,包括如第一方面所述的AI计算装置或者包括如第二方面所述的AI集成芯片。
第八方面,本申请实施例提供一种计算机程序产品,当计算机程序产品在计算机上运行时,使得计算机执行如第三方面所述的神经网络参数部署方法。
可以理解地,上述提供的第四方面所述的计算机可读存储介质,第五方面所述的电子设备,第六方面所述的计算机程序产品,第七方面所述的装置均与上述第二方面或第三方面的方法对应,因此,其所能达到的有益效果可参考上文所提供的对应的方法中的有益效果,此处不再赘述。
附图说明
图1为现有技术中的AI计算系统的架构示意图;
图2为本申请一实施例提供的AI计算装置的架构示意图;
图3a~3b为本申请另一实施例提供的AI计算装置的架构示意图;
图4为本申请一实施例提供的解密单元对密文数据进行解密的示意图;
图5为本申请一实施例提供的神经网络参数部署方法的流程示意图;
图6为本申请另一实施例提供的神经网络参数部署方法的流程示意图;
图7为本申请又一实施例提供的神经网络参数部署方法的流程示意图;
图8为本申请实施例提供的一种可能的电子设备。
具体实施方式
需要说明的是,本申请中“至少一个”是指一个或者多个,“多个”是指两个或多于两个。“和/或”,描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B可以表示:单独存在A,同时存在A和B,单独存在B的情况,其中A,B可以是单数或者 复数。本申请的说明书和权利要求书及附图中的术语“第一”、“第二”、“第三”、“第四”等(如果存在)是用于区别类似的对象,而不是用于描述特定的顺序或先后次序。
下面结合图2示例性的介绍本发明一实施例提供的人工智能(Artificial Intelligence,AI)计算装置的架构示意图。
AI计算装置100可以应用于智能终端,智能终端可以是智能手机、平板电脑、智能汽车、智能家居设备等。AI计算装置100可以包括AI集成芯片10及片外存储器20。AI集成芯片10可以包括中央处理单元(Central Processing Unit,CPU)11、神经网络处理单元(Neural-network Processing Unit,NPU)12、第一片内存储单元13及第二片内存储单元14。
CPU 11可以包括NPU驱动,CPU 11可以通过NPU驱动配置NPU 12。例如,CPU 11可以通过NPU驱动配置NPU 12对指定数据进行处理,为处理指定数据对NPU 12内的寄存器进行分配。指定数据可以是指由CPU 11所指定的数据。可以由CPU 11分配任务,NPU 1再执行相应任务。例如,以智能汽车为例,摄像头拍摄的每帧图像可以自动存储在某个存储器中,每次有图像存储进来,CPU 11可以向NPU 12下发执行命令,指示NPU 12从该存储器中调用该图像进行AI模型推理。
NPU 12为神经网络(neural-network,NN)计算处理器,通过借鉴生物神经网络结构,例如借鉴人脑神经元之间传递模式,对输入信息快速处理,还可以不断的自学习。通过NPU 12可以实现智能终端的智能认知等应用,例如:图像识别,人脸识别,语音识别,文本理解等。NPU 12在数据处理过程中所需的网络代码可以被划分为第一神经网络代码与第二神经网络代码,在数据处理过程中所需的权重数据可以被划分为第一权重参数与第二权重参数。第二神经网络代码可以是指该网络代码中具有高价值的神经网络代码/神经模型代码,第二神经网络代码可以由开发者进行指定。第一神经网络代码可以是指该网络代码中除了第二神经网络代码之外的其余代码。第二权重参数可以是指该权重数据中具有高价值的权重参数,第二权重参数可以由开发者进行指定。第一权重参数可以是指该权重数据中除了第二权重参数之外的其余权重参数。NPU 12可基于第一神经网络代码、第二神经网络代码、第一权重参数及第二权重参数对指定数据进行处理。
基于硬件成本考量,AI计算装置100所采用的第二片内存储单元14容量有限,可能无法存储NPU 12所需的所有网络代码和权重数据,可以利用片外存储器20存储第一神经网络代码、第一权重参数、指定数据及指定数据的处理结果,利用第二片内存储单元14存储第二神经网络代码及第二权重参数,使得NPU 12的核心神经网络代码和核心权重参数存储在AI集成芯片10,可提高核心神经网络代码和核心权重参数的安全等级。
例如,第二片内存储单元14可以是嵌入式非易失存储器,AI集成芯片10通过增设第二片内存储单元14来存储相对重要的第二神经网络代码与第二权重参数,将相对不重要的第一神经网络代码与第一权重参数存储在片外存储器20,可提升第二神经网络代码与第二权重参数的安全等级,由于NPU 12可从第二片内存储单元读取第二神经网络代码与第二权重参数,可降低NPU 12访问片外存储器20的带宽需求,提升NPU 12性能。若AI集成芯片10已包含有嵌入式非易失存储器,AI集成芯片10亦可复用该嵌入式非易失存储器来存储相对重要的第二神经网络代码与第二权重参数,无需额外增设一嵌入式非易失存储器作为第二片内存储单元14。
第一片内存储单元13可用于缓存从片外存储器20读取的指定数据,例如当NPU 12对 存储在片外存储器20中的第一数据进行处理时,该第一数据可以从片外存储器20读到第一片内存储单元13,通过第一片内存储单元13缓存该第一数据,可以加快NPU处理数据的速度、提升数据处理效率。
在一些实施例中,第一片内存储单元13可以包括静态随机访问存储器(Static Random Access Memory,SRAM),第二片内存储单元14可以包括嵌入式闪存(Embedded Flash,eFalsh)、嵌入式非易失性存储器(Embedded Non-volatile Memory,eNVM)、磁性随机访问存储器(Magnetic Random Access Memory,MRAM)、或阻性随机访问存储器(Resistive Random Access Memory,RRAM)中的任意一种。第二片内存储单元14也可以为其他类型的嵌入式存储器件。片外存储器20可以包括动态随机访问存储器(Dynamic Random Access Memory,DRAM)。
在一些实施例中,为了提高第二片内存储单元14所存储的数据私密性,可以将第二片内存储单元14设计为仅允许NPU 12读取的权限,及仅允许CPU 11写入的权限。例如,第二片内存储单元14可以仅设置一个读端口与一个写端口。第二片内存储单元14的写端口仅电连接于CPU 11,实现仅允许CPU 11写入的权限。第二片内存储单元14的读端口仅电连接于NPU 12,实现仅允许NPU 12读取的权限。
在一些实施例中,第二片内存储单元14存储的第二神经网络代码与第二权重参数可以选择性进行密码学加密处理或数学运算加扰处理,以提高第二神经网络代码与第二权重参数的安全性。加密处理或加扰处理的算法可以根据实际需求进行选定,本申请对此不做限定,例如加密算法可以是对称加密算法(AES算法、DES算法、SM4算法等)或非对称加密算法。加扰算法可以是开发者自定义的算法,例如开发者自定义的数学查表、倒序处理等可逆的加扰算法。加密处理或加扰处理可以在AI集成芯片10外部离线处理,例如可以由AI厂商(进行神经网络代码与权重参数开发的厂商)采用加密装置(独立于AI集成芯片10之外的元件或者装置,如电脑、服务器等设备)对第二神经网络代码与第二权重参数进行加密或加扰处理,得到密文数据,加密装置可以将密文数据发送给AI集成芯片10。密文数据的解密或解扰处理可以在AI集成芯片10内部进行。
在一些实施例中,如图3a与3b所示,AI集成芯片10还可以包括密钥存储单元15、解密单元16及第三片内存储单元17。密钥存储单元15用于存储解密密钥或解扰密钥。解密单元16用于根据密钥存储单元15存储的解密密钥或解扰密钥对密文数据进行解密或解扰处理。密钥存储单元15可以是设置AI集成芯片10内部的一次性可编程(One Time Programmable,OTP)器件,解密单元16可以是设置在AI集成芯片10内部的用于对密文数据进行解密或解扰处理的硬件电路。
第三片内存储单元17用于缓存从片外存储器20读取的第一神经网络代码及第一权重参数的至少部分。例如,NPU 12在需要使用第一神经网络代码及第一权重参数对指定数据进行处理时,存储在片外存储器20的部分或者全部的第一神经网络代码及第一权重参数可以从片外存储器20读到第三片内存储单元17,通过第三片内存储单元17缓存第一神经网络代码与第一权重参数,可以加快NPU 12读取第一神经网络代码与第一权重参数的速度,提升数据处理效率。第三片内存储单元17可以是eFalsh、eNVM、MRAM、或RRAM中的任意一种。第三片内存储单元17也可以为其他类型的嵌入式存储器件。
在一些实施例中,若第二片内存储单元14能够存储NPU 12所需的所有网络代码和权重数据,即将第一神经网络代码、第二神经网络代码、第一权重参数及第二权重参数存储 在第二片内存储单元14,此时片外存储器20可用于存储指定数据及指定数据的处理结果,第三片内存储单元17亦可被省略(即无需设置第三片内存储单元17缓存从片外存储器20读取的神经网络代码与权重参数)。
如图3a所示,AI集成芯片10可以在密文数据写入第二片内存储单元14时对密文数据进行解密或解扰处理,实现将第二神经网络代码与第二权重参数以明文形式存储在第二片内存储单元14,例如,CPU 11在接收到密文数据时,可以调用解密单元16对密文数据进行解密或解扰处理,再将解密或解扰处理得到的第二神经网络代码与第二权重参数写入至第二片内存储单元14进行存储。
如图3b所示,AI集成芯片10也可以在密文数据文件写入第二片内存储单元14时不进行解密或解扰处理,将第二神经网络代码与第二权重参数以密文形式存储在第二片内存储单元14,NPU 12从第二片内存储单元14读取密文数据时再对密文数据进行解密或解扰处理。例如,NPU从第二片内存储单元14读取密文数据时,可以调用解密单元16对密文数据进行解密或解扰处理,得到第二神经网络代码及第二权重参数。
在一些实施例中,在AI集成芯片10上电后,解密单元16可以自动对存储在密钥存储单元15内的解密密钥或解扰密钥进行读取。例如,OTP器件包括OTP存储器与OTP控制器,当AI集成芯片10上电后,OTP控制器自动控制OTP存储器将解密密钥或解扰密钥传送给解密单元16,使得解密单元16可以在AI集成芯片10上电后自动完成解密密钥或解扰密钥的读取。密钥存储单元15可以设置仅允许解密单元16读取的权限,避免解密密钥或解扰密钥的泄露。例如密钥存储单元15仅设置一个读取端口,该读取端口仅电连接于解密单元16。
举例而言,第二神经网络代码与第二权重参数以密文形式存储在第二片内存储单元14,当NPU 12从第二片内存储单元14读取密文数据时,NPU 12可以发送调用指令至解密单元16,解密单元16接收到NPU 12的调用指令时才使用解密密钥对密文数据进行解密处理,得到第二神经网络代码及第二权重参数。
上述AI计算装置,增设第二片内存储单元来存储与NPU关联的第二神经网络代码及第二权重参数,将相对重要的第二神经网络代码与第二权重参数存储在第二片内存储单元,将相对不重要的第一神经网络代码与第一权重参数存储在片外存储器,可提升第二神经网络代码与第二权重参数的安全等级,NPU可从第二片内存储单元读取神经网络代码与权重参数,可降低NPU访问片外存储器的带宽需求,同时将第二片内存储单元设置为仅允许NPU读取、CPU写入,可以进一步提升第二神经网络代码与第二权重参数的安全等级,且可以避免设备厂商获取解密密钥、明文的第二神经网络代码与第二权重参数。
如图4所示,以存储在密钥存储单元15内的密钥为解密密钥为例进行说明,不同输入地址的密文数据对应不同的解密密钥,输入的密文数据需要采用与其输入地址关联的解密密钥进行解密。解密单元16可以根据密文数据的输入地址,使用解密密钥对密文数据进行解密处理,得到明文数据及输出地址,其中输入地址作为解密处理的指示信息,输入地址与输出地址可以相同。密文数据的输入地址可以是指密文数据的存储地址,例如为存储在密钥存储单元15内的地址。
在一些实施例中,若所有密文数据共用一个解密密钥,解密单元16可以直接使用解密密钥对密文数据进行解密处理,得到明文数据。
参照图5所示,本申请实施例提供的一种神经网络参数部署方法,应用于AI计算装置100。AI计算装置100包括AI集成芯片10及片外存储器20。AI集成芯片10包括CPU 11、 NPU 12、第一片内存储单元13及第二片内存储单元14。
在一些实施例中,NPU 12在数据处理过程中所需的网络代码被划分为第一神经网络代码与第二神经网络代码,在数据处理过程中所需的权重数据可以被划分为第一权重参数与第二权重参数。第二神经网络代码可以是指该网络代码中具有高价值的神经网络代码/神经模型代码,第二神经网络代码可以由开发者进行指定。第一神经网络代码可以是指该网络代码中除了第二神经网络代码之外的其余代码。第二权重参数可以是指该权重数据中具有高价值的权重参数,第二权重参数可以由开发者进行指定。第一权重参数可以是指该权重数据中除了第二权重参数之外的其余权重参数。本实施例中,神经网络参数部署方法可以包括:
步骤50、将第一神经网络代码及第一权重参数存储至片外存储器20。
在一些实施例中,可以通过CPU 11将第一神经网络代码及第一权重参数写入至片外存储器20进行存储。片外存储器20可以包括DRAM。
步骤51,将第二神经网络代码及第二权重参数存储至第二片内存储单元14。
在一些实施例中,可以通过CPU 11将第二神经网络代码及第二权重参数写入至第二片内存储单元14进行存储。第二片内存储单元14可以是eFalsh、eNVM、MRAM、或RRAM中的任意一种,第二片内存储单元14可以设置有仅允许NPU 12读取的权限,及仅允许CPU 11写入的权限,进而可以提高存储在第二片内存储单元14内的第二神经网络代码与第二权重参数的安全性。
在一些实施例中,AI集成芯片10还可以包括第三片内存储单元17,NPU 12从片外存储器20读取第一神经网络代码与第一权重参数时,读取的第一神经网络代码与第一权重参数可以缓存在第三片内存储单元17。第三片内存储单元17可以是eFalsh、eNVM、MRAM、或RRAM中的任意一种,通过第三片内存储单元17可以加快NPU 12读取第一神经网络代码与第一权重参数的速度,提升数据处理效率。
在一些实施例中,第二片内存储单元14存储的第二神经网络代码与第二权重参数可以选择性进行密码学加密处理或数学运算加扰处理,以提高第二神经网络代码与第二权重参数的安全性。加密处理或加扰处理的算法可以根据实际需求进行选定,本申请对此不做限定。加密处理或加扰处理可以在AI集成芯片10外部离线处理,例如可以采用独立于AI集成芯片10之外的加密装置对第二神经网络代码与第二权重参数进行加密或加扰处理,得到密文数据,加密装置可以将密文数据发送给AI集成芯片10。密文数据的解密或解扰处理可以在AI集成芯片10内部进行。
AI集成芯片10可以在密文数据写入第二片内存储单元14时对密文数据进行解密或解扰处理,实现将第二神经网络代码与第二权重参数以明文形式存储在第二片内存储单元14。AI集成芯片10也可以在密文数据文件写入第二片内存储单元14时不进行解密或解扰处理,将第二神经网络代码与第二权重参数以密文形式存储在第二片内存储单元14,NPU 12从第二片内存储单元14读取密文数据时再对密文数据进行解密或解扰处理。
在一些实施例中,若第二片内存储单元14能够存储NPU 12所需的所有网络代码和权重数据,即可以将第一神经网络代码、第二神经网络代码、第一权重参数及第二权重参数存储在第二片内存储单元14。
上述神经网络参数部署方法,增设第二片内存储单元来存储与NPU关联的第二神经网络代码及第二权重参数,将相对重要的第二神经网络代码与第二权重参数存储在第二片内存储 单元,将相对不重要的第一神经网络代码与第一权重参数存储在片外存储器,可提升第二神经网络代码与第二权重参数的安全等级,NPU可从第二片内存储单元读取神经网络代码与权重参数,可降低NPU访问片外存储器的带宽需求,同时将第二片内存储单元设置为仅允许NPU读取、CPU写入,可以进一步提升第二神经网络代码与第二权重参数的安全等级,且可以避免设备厂商获取解密密钥、明文的第二神经网络代码与第二权重参数。
参照图6所示,本申请实施例提供的一种神经网络参数部署方法,应用于第一处理端、第二处理端及第三处理端。第一处理端可以是指芯片厂商侧的电子装置,第二处理端可以是指AI厂商侧的电子装置,第三处理端可以是设备厂商侧的电子装置。芯片厂商提供AI集成芯片10的硬件模块,例如硬件模块可以包括CPU 11、NPU 12、第一片内存储单元13、第二片内存储单元14、密钥存储单元15、解密单元16等。AI厂商提供AI集成芯片10的软件模块,例如软件模块包括AI集成芯片10中的NPU 12所需的神经网络代码及权重参数。设备厂商基于芯片厂商提供的AI集成芯片10的硬件模块与AI厂商提供AI集成芯片10的软件模块加工得到包含AI集成芯片10的终端。本实施例中,神经网络参数部署方法包括:
步骤61、第一处理端将解密密钥或解扰密钥写入至AI集成芯片10的密钥存储单元15。
在一些实施例中,芯片厂商与AI厂商可以协商解密密钥或解扰密钥,可以是AI厂商自己生成后提供给芯片厂商,也可以是芯片厂商生成后提供给AI厂商。密钥存储单元15可以是OTP器件,芯片厂商可以通过第一处理端将解密密钥或解扰密钥写入至AI集成芯片10的OTP器件。芯片厂商可以将植入有解密密钥或解扰密钥的AI集成芯片10的硬件模块提供给设备厂商。例如,第一处理端可以为烧录设备,芯片厂商使用烧录设备将解密密钥或解扰密钥写入至AI集成芯片10的OTP器件。
步骤62、第二处理端对与NPU 12关联的神经网络代码及权重参数进行加密,得到密文数据。
在一些实施例中,当AI厂商完成与NPU 12关联的神经网络代码及权重参数的开发后,AI厂商对神经网络代码及权重参数进行加密(加密所使用的加密密钥可以与解密密钥相同),得到密文数据。例如,AI厂商可以使用个人电脑(Personal Computer,PC)或服务器对神经网络代码及权重参数进行加密,得到密文数据。当AI厂商得到密文数据后,AI厂商可以将密文数据通过线上或线下方式提供给设备厂商。比如,设备厂商可以通过线下方式从AI厂商购买密文数据,设备厂商也可以于AI厂商进行线上交易,如向AI厂商的云服务器请求密文数据,AI厂商的云服务器在确认线上交易后,发送密文数据给设备厂商(设备厂商侧的PC、或服务器等设备)。
步骤63、第三处理端将密文数据写入至AI集成芯片10的第二片内存储单元14。
在一些实施例中,设备厂商可以通过第三处理端将密文数据写入至AI集成芯片10的第二片内存储单元14,进而AI集成芯片10后续在进行AI推理时,解密单元16可以使用内置的解密密钥或解扰密钥对密文数据进行解密得到明文的神经网络代码及权重参数。第二片内存储单元14可以包括eFalsh、eNVM、MRAM或RRAM中的任意一种。例如,第三处理端可以为烧录设备,设备厂商使用烧录设备将密文数据写入至AI集成芯片10的第二片内存储单元14。
在一些实施例中,在AI集成芯片10上电后,解密单元16可以自动对存储在密钥存储单元15内的解密密钥或解扰密钥进行读取。例如,OTP器件包括OTP存储器与OTP控制器,当AI集成芯片10上电后,OTP控制器自动控制OTP存储器将解密密钥或解扰密钥传送给解 密单元16,使得解密单元16可以在AI集成芯片10上电后自动完成解密密钥或解扰密钥的读取。同时,OTP器件可以设置仅允许解密单元16读取的权限,避免解密密钥或解扰密钥的泄露。例如密钥存储单元15仅设置一个读取端口,该读取端口电连接于解密单元16。
在一些实施例中,如果AI厂商需要升级神经网络代码及权重参数时,可以通过第二处理端对升级后的神经网络代码与权重参数进行加密,并将新的密文数据提供给设备厂商,设备厂商可以通过第三处理端将新的密文数据写入至AI集成芯片10的第二片内存储单元14。
上述神经网络参数部署方法,相比现有方案由设备厂商将解密密钥及明文的神经网络代码与权重参数写入至AI集成芯片,改由芯片厂商将解密密钥写入至AI集成芯片,再将储存有解密密钥的AI集成芯片提供给设备厂商,由AI厂商对神经网络代码及权重参数进行加密,AI厂商将神经网络代码及权重参数以密文形式提供给设备厂商,由设备厂商将密文数据写入至第二片内存储单元,可以避免设备厂商获取解密密钥及明文的神经网络代码与权重参数,再者将密文数据存储在片内eFalsh、eNVM、MRAM或RRAM中,相比现有方案将神经网络代码与权重参数存储在片外存储器,可以进一步提升神经网络代码与权重参数的安全等级。
参照图7所示,本申请实施例提供的一种神经网络参数部署方法,应用于第三处理端。第三处理端可以是设备厂商侧的电子设备。本实施例中,神经网络参数部署方法包括:
步骤70、获取密文数据。
在一些实施例中,密文数据可以是指对与NPU 12关联的神经网络代码及权重参数进行加密得到的数据。例如,可以由AI厂商对神经网络代码及权重参数进行加密得到密文数据,并将密文数据提供给设备厂商。
在一些实施例中,当AI厂商得到密文数据后,AI厂商可以将密文数据通过线上或线下方式提供给设备厂商。比如,设备厂商可以通过线下方式从AI厂商购买密文数据,设备厂商也可以与AI厂商进行线上交易,如向AI厂商的云服务器请求密文数据,AI厂商的云服务器在确认线上交易后,发送密文数据给设备厂商(设备厂商侧的PC、或服务器等设备)。
步骤71、将密文数据写入至AI集成芯片10的第二片内存储单元14。
在一些实施例中,当设备厂商得到密文数据是,设备厂商可以通过第三处理端将密文数据写入至AI集成芯片10的第二片内存储单元14。第二片内存储单元14可以包括eFalsh、eNVM、MRAM或RRAM中的任意一种。例如,第三处理端可以为烧录设备,设备厂商使用烧录设备将密文数据写入至AI集成芯片10的第二片内存储单元14。
在一些实施例中,AI集成芯片的一次性可编程OTP器件中预先写入有对密文数据进行解密的解密密钥。例如,可以由芯片厂商将解密密钥或解扰密钥写入至AI集成芯片10的OTP器件,芯片厂商再将植入有解密密钥或解扰密钥的AI集成芯片10提供给设备厂商。
在一些实施例中,存储在OTP器件内的解密密钥或解扰密钥,可以由设置在CPU 11或NPU 12内的程序所制定的读取逻辑控制解密单元16完成对解密密钥或解扰密钥的读取,且读取操作可以设置为对应用软件不可见。同时,OTP器件可以设置仅允许解密单元16读取的权限,避免解密密钥或解扰密钥的泄露。例如密钥存储单元15仅设置一个读取端口,该读取端口电连接于解密单元16。
上述神经网络参数部署方法,对于设备厂商而言,解密密钥是有芯片厂商写入至AI集成芯片的OTP器件,再将储存有解密密钥的AI集成芯片提供给设备厂商,同时AI厂商将神经网络代码及权重参数以密文形式提供给设备厂商,设备厂商无法获取解密密钥及明文 的神经网络代码与权重参数,再者将密文数据存储在片内eFalsh、eNVM、MRAM或RRAM中,相比现有方案将神经网络代码与权重参数存储在片外存储器,可以进一步提升神经网络代码与权重参数的安全等级。
参考图8,为本申请实施例提供的电子设备1000的硬件结构示意图。如图8所示,电子设备1000可以包括第一处理器1001、第一存储器1002及第一通信总线1003。第一存储器1002用于存储一个或多个计算机程序1004。一个或多个计算机程序1004被配置为被该第一处理器1001执行。该一个或多个计算机程序1004包括指令,上述指令可以用于实现在电子设备1000中执行如图5所述的神经网络参数部署方法。
可以理解的是,本实施例示意的结构并不构成对电子设备1000的具体限定。在另一些实施例中,电子设备1000可以包括比图示更多或更少的部件,或者组合某些部件,或者拆分某些部件,或者不同的部件布置。
第一处理器1001可以包括一个或多个处理单元,例如:第一处理器1001可以包括应用处理器(application processor,AP),调制解调器,图形处理器(graphics processing unit,GPU),图像信号处理器(image signal processor,ISP),控制器,视频编解码器,数字信号处理器(digital signal processor,DSP),基带处理器,和/或神经网络处理器(neural-network processing unit,NPU)等。其中,不同的处理单元可以是独立的器件,也可以集成在一个或多个处理器中。
第一处理器1001还可以设置有存储器,用于存储指令和数据。在一些实施例中,第一处理器1001中的存储器为高速缓冲存储器。该存储器可以保存第一处理器1001刚用过或循环使用的指令或数据。如果第一处理器1001需要再次使用该指令或数据,可从该存储器中直接调用。避免了重复存取,减少了第一处理器1001的等待时间,因而提高了系统的效率。
在一些实施例中,第一处理器1001可以包括一个或多个接口。接口可以包括集成电路(inter-integrated circuit,I2C)接口,集成电路内置音频(inter-integrated circuit sound,I2S)接口,脉冲编码调制(pulse code modulation,PCM)接口,通用异步收发传输器(universal asynchronous receiver/transmitter,UART)接口,移动产业处理器接口(mobile industry processor interface,MIPI),通用输入输出(general-purpose input/output,GPIO)接口,SIM接口,和/或USB接口等。
在一些实施例中,第一存储器1002可以包括高速随机存取存储器,还可以包括非易失性存储器,例如硬盘、内存、插接式硬盘,智能存储卡(Smart Media Card,SMC),安全数字(Secure Digital,SD)卡,闪存卡(Flash Card)、至少一个磁盘存储器件、闪存器件、或其他非易失性固态存储器件。
本实施例还提供一种计算机存储介质,该计算机存储介质中存储有计算机指令,当该计算机指令在电子设备上运行时,使得电子设备执行上述相关方法步骤实现如图5所述的神经网络参数部署方法。
本实施例还提供了一种计算机程序产品,当该计算机程序产品在计算机上运行时,使得计算机执行上述相关步骤,以实现如图5所述的神经网络参数部署方法。
其中,本实施例提供的电子设备、计算机存储介质或计算机程序产品均用于执行上文所提供的对应的方法,因此,其所能达到的有益效果可参考上文所提供的对应的方法中的有益效果,此处不再赘述。
通过以上的实施方式的描述,所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,仅以上述各功能模块的划分进行举例说明,实际应用中,可以根据需要而将上述功能 分配由不同的功能模块完成,即将装置的内部结构划分成不同的功能模块,以完成以上描述的全部或者部分功能。
在本申请所提供的几个实施例中,应该理解到,所揭露的装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例是示意性的,例如,该模块或单元的划分,为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个装置,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
该作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是一个物理单元或多个物理单元,即可以位于一个地方,或者也可以分布到多个不同地方。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现。
该集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个可读取存储介质中。基于这样的理解,本申请实施例的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来,该软件产品存储在一个存储介质中,包括若干指令用以使得一个设备(可以是单片机,芯片等)或处理器(processor)执行本申请各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(Read-Only Memory,ROM)、随机存取存储器(Random Access Memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何在本申请揭露的技术范围内的变化或替换,都应涵盖在本申请的保护范围之内。

Claims (19)

  1. 一种人工智能AI计算装置,包括AI集成芯片及片外存储器,所述AI集成芯片包括中央处理单元CPU、神经网络处理单元NPU及第一片内存储单元,其特征在于,所述片外存储器用于存储与所述NPU关联的第一神经网络代码及第一权重参数;
    所述AI集成芯片还包括第二片内存储单元,所述第二片内存储单元用于存储与所述NPU关联的第二神经网络代码及第二权重参数,所述NPU用于基于所述第一神经网络代码、所述第二神经网络代码、所述第一权重参数及所述第二权重参数对指定数据进行处理;
    其中,所述第二片内存储单元设置有仅允许所述NPU读取的权限,及仅允许所述CPU写入的权限。
  2. 如权利要求1所述的AI计算装置,其特征在于,所述AI集成芯片还包括第三片内存储单元,所述第三片内存储单元用于缓存从所述片外存储器读取的所述第一神经网络代码及所述第一权重参数的至少部分。
  3. 如权利要求1或2所述的AI计算装置,其特征在于,所述CPU还用于接收密文数据,并将所述密文数据写入至所述第二片内存储单元,所述密文数据为对所述第二神经网络代码及所述第二权重参数进行加密得到的。
  4. 如权利要求3所述的AI计算装置,其特征在于,所述AI集成芯片还包括密钥存储单元及解密单元,解密密钥在所述密文数据写入至所述第二片内存储单元之前存储至所述密钥存储单元,所述密钥存储单元设置有仅允许所述解密单元读取的权限,所述NPU还用于从所述第二片内存储单元读取所述密文数据,并调用所述解密单元使用所述解密密钥对所述密文数据进行解密,得到所述第二神经网络代码及所述第二权重参数。
  5. 如权利要求1或2所述的AI计算装置,其特征在于,所述AI集成芯片还包括密钥存储单元及解密单元,所述密钥存储单元用于存储解密密钥,所述密钥存储单元设置有仅允许所述解密单元读取的权限,所述CPU用于接收密文数据,并调用所述解密单元使用所述解密密钥对所述密文数据进行解密,所述密文数据为对所述第二神经网络代码及所述第二权重参数进行加密得到的,所述CPU还用于将解密得到的第二神经网络代码及第二权重参数写入至所述第二片内存储单元。
  6. 如权利要求4或5所述的AI计算装置,其特征在于,所述密钥存储单元包括一次性可编程OTP器件。
  7. 如权利要求1至6任一项所述的AI计算装置,其特征在于,所述片外存储器还用于存储所述指定数据及所述指定数据的处理结果,所述第一片内存储单元用于缓存从所述片外存储器读取的所述指定数据。
  8. 如权利要求1至7中任意一项所述的AI计算装置,其特征在于,所述第一片内存储单元包括静态随机访问存储器SRAM,所述片外存储器包括动态随机访问存储器DRAM,所述第二片内存储单元包括嵌入式闪存eFalsh、嵌入式非易失性存储器eNVM、磁性随机访问存储器MRAM或阻性随机访问存储器RRAM中的任意一种,所述第三片内存储单元包括eFalsh、eNVM、MRAM或RRAM中的任意一种。
  9. 一种人工智能AI集成芯片,包括中央处理单元CPU、神经网络处理单元NPU及第一片内存储单元,其特征在于,所述AI集成芯片还包括第二片内存储单元,所述第二片内存储 单元用于存储与所述NPU关联的神经网络代码及权重参数,所述NPU用于基于所述神经网络代码及所述权重参数对指定数据进行处理;
    其中,所述第二片内存储单元包括嵌入式闪存eFalsh、嵌入式非易失性存储器eNVM、磁性随机访问存储器MRAM或阻性随机访问存储器RRAM中的任意一种。
  10. 如权利要求9所述的AI集成芯片,其特征在于,所述第二片内存储单元仅设置一个读端口与一个写端口,所述读端口电连接于所述NPU,所述写端口电连接于所述CPU。
  11. 一种神经网络参数部署方法,应用于人工智能AI计算装置,其特征在于,所述AI计算装置包括AI集成芯片及片外存储器,所述AI集成芯片包括中央处理单元CPU、神经网络处理单元NPU、第一片内存储单元及第二片内存储单元,所述神经网络参数部署包括:
    将与所述NPU关联的第一神经网络代码、第一权重参数存储至所述片外存储器;
    将与所述NPU关联的第二神经网络代码及第二权重参数存储至所述第二片内存储单元;
    其中,所述NPU用于基于所述第一神经网络代码、所述第二神经网络代码、所述第一权重参数及所述第二权重参数对指定数据进行处理,所述第二片内存储单元设置有仅允许所述NPU读取的权限,及仅允许所述CPU写入的权限。
  12. 如权利要求11所述的神经网络参数部署方法,其特征在于,所述AI集成芯片还包括第三片内存储单元,所述神经网络参数部署方法还包括:
    从所述片外存储器读取所述第一神经网络代码及所述第一权重参数后缓存在所述第三片内存储单元中。
  13. 如权利要求11或12所述的神经网络参数部署方法,其特征在于,所述神经网络参数部署方法还包括:
    接收加密装置发送的密文数据,并将所述密文数据写入至所述第二片内存储单元;
    其中,所述密文数据为对所述第二神经网络代码及所述第二权重参数进行加密得到的。
  14. 如权利要求13所述的神经网络参数部署方法,其特征在于,所述神经网络参数部署方法还包括:
    当所述NPU从所述第二片内存储单元读取所述密文数据时,调用预先存储的解密密钥对所述密文数据进行解密,得到所述第二神经网络代码及所述第二权重参数。
  15. 如权利要求11或12所述的神经网络参数部署方法,其特征在于,所述神经网络参数部署方法还包括:
    接收加密装置发送的密文数据,并调用预先存储的解密密钥对所述密文数据进行解密;
    将解密得到的第二神经网络代码及第二权重参数写入至所述第二片内存储单元;
    其中,所述密文数据为对所述第二神经网络代码及所述第二权重参数进行加密得到的。
  16. 一种神经网络参数部署方法,其特征在于,包括:
    第一处理端将解密密钥写入至人工智能AI集成芯片的一次性可编程OTP器件;
    第二处理端对神经网络代码及权重参数进行加密,得到密文数据;
    第三处理端将所述密文数据写入至所述AI集成芯片的片内存储单元;
    其中,所述解密密钥用于对所述密文数据进行解密,所述片内存储单元包括嵌入式闪存eFalsh、嵌入式非易失性存储器eNVM、磁性随机访问存储器MRAM或阻性随机访问存储器RRAM中的任意一种。
  17. 一种神经网络参数部署方法,其特征在于,包括:
    获取密文数据,所述密文数据为对神经网络代码及权重参数进行加密得到的;
    将所述密文数据写入至人工智能AI集成芯片的片内存储单元;
    其中,所述AI集成芯片的一次性可编程OTP器件中预先写入有对所述密文数据进行解密的解密密钥,所述片内存储单元包括嵌入式闪存eFalsh、嵌入式非易失性存储器eNVM、磁性随机访问存储器MRAM或阻性随机访问存储器RRAM中的任意一种。
  18. 一种计算机可读存储介质,其特征在于,所述计算机可读存储介质存储计算机指令,当所述计算机指令在计算机或处理器上运行时,使得所述计算机或处理器执行如权利要求11至权利要求15中任一项所述的神经网络参数部署方法。
  19. 一种电子设备,其特征在于,所述电子设备包括如权利要求1至8中任意一项所述的人工智能AI计算装置,或者如权利要求9或10所述的AI集成芯片。
PCT/CN2022/094495 2021-08-11 2022-05-23 神经网络参数部署方法、ai集成芯片及其相关装置 WO2023016030A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110919731.6 2021-08-11
CN202110919731.6A CN115705301A (zh) 2021-08-11 2021-08-11 神经网络参数部署方法、ai集成芯片及其相关装置

Publications (1)

Publication Number Publication Date
WO2023016030A1 true WO2023016030A1 (zh) 2023-02-16

Family

ID=85180075

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/094495 WO2023016030A1 (zh) 2021-08-11 2022-05-23 神经网络参数部署方法、ai集成芯片及其相关装置

Country Status (2)

Country Link
CN (1) CN115705301A (zh)
WO (1) WO2023016030A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115982110B (zh) * 2023-03-21 2023-08-29 北京探境科技有限公司 文件运行方法、装置、计算机设备及可读存储介质
CN117014260B (zh) * 2023-10-07 2024-01-02 芯迈微半导体(上海)有限公司 一种信道估计滤波系数的加载方法和加载装置

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107086910A (zh) * 2017-03-24 2017-08-22 中国科学院计算技术研究所 一种针对神经网络处理的权重加解密方法和系统
CN107885509A (zh) * 2017-10-26 2018-04-06 杭州国芯科技股份有限公司 一种基于安全的神经网络加速器芯片架构
CN110601814A (zh) * 2019-09-24 2019-12-20 深圳前海微众银行股份有限公司 联邦学习数据加密方法、装置、设备及可读存储介质
US10956584B1 (en) * 2018-09-25 2021-03-23 Amazon Technologies, Inc. Secure data processing
CN113127407A (zh) * 2021-05-18 2021-07-16 南京优存科技有限公司 基于nvm进行ai计算的芯片架构

Also Published As

Publication number Publication date
CN115705301A (zh) 2023-02-17

Similar Documents

Publication Publication Date Title
WO2023016030A1 (zh) 神经网络参数部署方法、ai集成芯片及其相关装置
US7882291B2 (en) Apparatus and method for operating plural applications between portable storage device and digital device
TWI448894B (zh) 數位內容分配的方法及系統
US8127131B2 (en) System and method for efficient security domain translation and data transfer
US7228436B2 (en) Semiconductor integrated circuit device, program delivery method, and program delivery system
KR20150143708A (ko) 스토리지 디바이스 보조 인라인 암호화 및 암호해독
EP2734951A1 (en) Cryptographic information association to memory regions
US9665740B1 (en) Method and system for cryptographically securing a graphics system
EP3809271B1 (en) Secure data transfer apparatus, system and method
KR100798927B1 (ko) 스마트카드 기반의 복제방지 기능을 가진 데이터 저장장치, 그의 데이터 저장 및 전송 방법
US11698973B2 (en) Platform security mechanism
US20210266301A1 (en) Secure application processing systems and methods
US20040117642A1 (en) Secure media card operation over an unsecured PCI bus
CN112052201A (zh) 一种基于Linux内核层实现的USB设备管控方法与系统
WO2018205512A1 (zh) 信息加解密方法、机顶盒、系统及存储介质
CN113496016A (zh) 一种内存的访问方法、系统级芯片和电子设备
KR102218715B1 (ko) 채널별로 데이터를 보호할 수 있는 반도체 장치
CN1373461A (zh) 应用于数据储存的加解密装置
JP3085785U (ja) データ暗号化・復号化装置

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22855016

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2022855016

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2022855016

Country of ref document: EP

Effective date: 20240214

NENP Non-entry into the national phase

Ref country code: DE