CN117280342A - Method for using artificial intelligence model and related device - Google Patents

Method for using artificial intelligence model and related device

Info

Publication number
CN117280342A
CN117280342A (application number CN202180098116.1A)
Authority
CN
China
Prior art keywords
model
chip
encrypted
memory
computer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202180098116.1A
Other languages
Chinese (zh)
Inventor
赵品华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Publication of CN117280342A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F 21/10 Protecting distributed programs or content, e.g. vending or licensing of copyrighted material; Digital rights management [DRM]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Computer Hardware Design (AREA)
  • Technology Law (AREA)
  • Artificial Intelligence (AREA)
  • Computer Security & Cryptography (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Storage Device Security (AREA)

Abstract

A method and an apparatus for using an artificial intelligence (AI) model, which can be applied to the field of artificial intelligence. In the method, a main chip sends an encrypted AI model to an AI chip, where the encrypted AI model is a model obtained by the AI chip encrypting a first AI model with a trusted root; the AI chip decrypts the encrypted AI model with the trusted root to obtain the first AI model, then performs reasoning using the first AI model to obtain a reasoning result, and finally the main chip receives the reasoning result obtained by the AI chip through reasoning with the first AI model. Because the first AI model is encrypted and decrypted with the trusted root, the security of the environment in which the AI model is used is guaranteed, while the high software and hardware costs of encrypting and decrypting the AI model in the prior art are avoided.

Description

Method for using artificial intelligence model and related device
Technical Field
The application relates to the technical field of artificial intelligence, in particular to a use method and a related device of an artificial intelligence model.
Background
Artificial intelligence (AI) is the theory, method, technique and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, sense the environment, acquire knowledge and use the knowledge to obtain optimal results. In other words, artificial intelligence is a branch of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a manner similar to human intelligence. Research in artificial intelligence concerns the design principles and implementation methods of various intelligent machines, so that the machines have the functions of sensing, reasoning and decision-making.
Generally, the design principles and implementation methods of an intelligent machine depend on an AI model obtained through training on big data, and the AI model is stored as a file in the running environment of an AI application. However, as a core asset of the AI application, the AI model needs to be strictly protected against theft in the actual running environment.
One common AI model protection method is implemented purely in software: the trained AI model is encrypted and later decrypted within the AI application, the decrypted AI model is then sent to an AI chip, and the AI chip performs reasoning. Another common approach uses a hardware dongle: after the hardware dongle receives the encrypted AI model sent by the AI application, it decrypts the AI model and sends the decrypted AI model to the AI chip, which then performs reasoning. However, both of these common approaches turn out to be insufficiently secure in use.
Disclosure of Invention
The present application provides a method and an apparatus for using an AI model, which can improve the security of the AI model without increasing hardware cost.
In a first aspect, the present application provides a method for using an AI model, where the method is applied to an AI chip, and the method includes: receiving an encrypted AI model from a main chip; decrypting the encrypted AI model to obtain a first AI model; performing reasoning by using the first AI model to obtain a reasoning result; and sending the reasoning result to the main chip.
In this method, compared with decryption performed by a chip or dongle other than the AI chip, decryption of the AI model is executed on the AI chip and the decrypted AI model is used directly, which avoids the decrypted AI model being intercepted or stolen during transmission; in other words, the risk of leaking the AI model is greatly reduced and security is improved.
With reference to the first aspect, in a first possible implementation manner, the decrypting the encrypted AI model includes: and decrypting the encrypted AI model by using a trusted root stored in a secure storage area of the AI chip.
In this implementation, since the trusted root is stored in the secure storage area of the AI chip and can be read and used only by the AI chip itself, the re-encrypted AI model cannot be decrypted outside the chip; that is, the re-encrypted AI model can only be decrypted and run on the AI chip, which further improves the security of the AI model.
With reference to the first possible implementation manner, in a second possible implementation manner, before the AI chip receives the encrypted AI model, the method further includes: receiving the first AI model from the main chip; encrypting the first AI model by using the trusted root to obtain the encrypted AI model; and sending the encrypted AI model to the main chip.
In this implementation, the AI chip can implement encryption protection of the AI model by itself, a dedicated hardware dongle does not need to be deployed, and the complexity of a hardware-level protection scheme for the AI model is reduced for the user.
In a second aspect, the present application provides a method for using an artificial intelligence AI model, where the method is applied to a main chip, and the method includes: sending an encrypted AI model to an AI chip, where the encrypted AI model is obtained by the AI chip encrypting a first AI model; and receiving an inference result from the AI chip, where the inference result is obtained by the AI chip through inference using the first AI model.
With reference to the second aspect, in a first possible implementation manner, before the sending of the encrypted AI model to the AI chip, the method further includes: sending the first AI model to the AI chip; receiving the encrypted AI model obtained by the AI chip encrypting the first AI model by using a trusted root; and storing the encrypted AI model.
With reference to the first possible implementation manner, in a second possible implementation manner, the trusted root is a trusted root stored in a secure storage area of the AI chip.
In a third aspect, the present application provides an apparatus for using an artificial intelligence AI model, the apparatus being applied to an AI chip side, the apparatus comprising: the receiving module is used for receiving the encrypted AI model from the main chip; the decryption module is used for decrypting the encrypted AI model to obtain a first AI model; the reasoning module is used for reasoning by using the first AI model to obtain a reasoning result; and the sending module is used for sending the reasoning result to the main chip.
With reference to the third aspect, in a first possible implementation manner, the decryption module is configured to decrypt the encrypted AI model, and includes: the decryption module is used for decrypting the encrypted AI model by using the trusted root stored in the secure storage area of the AI chip.
With reference to the first possible implementation manner, in a second possible implementation manner, before the AI chip receives the encrypted AI model, the apparatus further includes an encryption module; the receiving module is further configured to receive the first AI model from the main chip; the encryption module is configured to encrypt the first AI model by using the trusted root to obtain the encrypted AI model; and the sending module is further configured to send the encrypted AI model to the main chip.
In a fourth aspect, the present application provides an apparatus for using an artificial intelligence AI model, the apparatus being applied to a main chip side, the apparatus comprising: a sending module configured to send an encrypted AI model to an AI chip, where the encrypted AI model is a model obtained by the AI chip encrypting the first AI model; and a receiving module configured to receive an inference result from the AI chip, where the inference result is obtained by the AI chip through inference using the first AI model.
With reference to the fourth aspect, in a first possible implementation manner, before the sending module is configured to send the encrypted AI model to the AI chip, the apparatus further includes: a storage module; the sending module is further configured to send the first AI model to the AI chip; the receiving module is further configured to receive the encrypted AI model obtained by encrypting the first AI model by using a trusted root by the AI chip; the storage module is used for storing the encrypted AI model.
With reference to the first possible implementation manner, in a second possible implementation manner, the trusted root is a trusted root stored in a secure storage area of the AI chip.
In a fifth aspect, the present application provides an AI chip comprising a processor coupled to a memory, the processor configured to execute program code in the memory to implement the method of the first aspect or any one of the possible implementations.
In a sixth aspect, the present application provides a chip comprising a processor coupled to a memory, the processor being for executing program code in the memory to implement the method of the second aspect or any one of the possible implementations.
In a seventh aspect, the present application provides a computer readable storage medium having stored therein a computer program or instructions which, when executed by a processor, implement the method of the first or second aspect or any one of the possible implementations thereof.
In an eighth aspect, the present application provides a computer program product comprising computer program code which, when run on a computer, causes the computer to implement a method as in the first aspect or the second aspect or any one of the possible implementations thereof.
Drawings
FIG. 1 is a schematic diagram of a convolutional neural network architecture;
fig. 2 is an application scenario architecture diagram according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a prior art deployment of a dedicated hardware dongle;
FIG. 4 is a schematic flow chart of a method of using an AI model in accordance with one embodiment of the disclosure;
FIG. 5 is a schematic flow chart of a method of using an AI model in accordance with another embodiment of the disclosure;
FIG. 6 is a schematic flow chart of a method of using an AI model in accordance with yet another embodiment of the disclosure;
FIG. 7 is a schematic diagram of an apparatus for using an AI model in accordance with one embodiment of the disclosure;
FIG. 8 is a schematic diagram of an apparatus for using an AI model in accordance with another embodiment of the disclosure;
fig. 9 is a schematic block diagram of an apparatus provided in an embodiment of the present application.
Detailed Description
The implementations of the embodiments of the present application are described in detail below with reference to the accompanying drawings.
For ease of understanding, the "AI model" referred to in the embodiments of the present application will be described with reference to fig. 1.
As described above, the AI model can be regarded as the core of an intelligent machine, and depending on the type of AI model, the intelligent machine can be applied in various fields, such as natural language processing, computer vision, decision and reasoning, man-machine interaction, recommendation and search, and so on. Currently popular AI models include convolutional neural networks (convolutional neural network, CNN), linear regression models, and the like.
The AI model is described below by taking a convolutional neural network as an example. It should be noted that the method of the embodiments of the present application may also be applied to other AI models. Fig. 1 is a schematic diagram of a convolutional neural network architecture. The CNN 100 shown in fig. 1 includes an input layer 110, a convolutional layer/pooling layer 120 (where the pooling layer is optional), and a neural network layer 130.
The convolutional layer/pooling layer 120 shown in fig. 1 may include, for example, layers 121 to 126. In one implementation, layer 121 is a convolutional layer, layer 122 is a pooling layer, layer 123 is a convolutional layer, layer 124 is a pooling layer, layer 125 is a convolutional layer, and layer 126 is a pooling layer; in another implementation, layers 121 and 122 are convolutional layers, layer 123 is a pooling layer, layers 124 and 125 are convolutional layers, and layer 126 is a pooling layer. That is, the output of a convolutional layer may be used as the input of a subsequent pooling layer, or as the input of another convolutional layer to continue the convolution operation.
A convolutional layer in the convolutional layer/pooling layer 120 may comprise a number of convolution operators, also called kernels, which act in image processing as filters that extract specific information from the input image matrix. A convolution operator is essentially a weight matrix, which is usually predefined. During the convolution operation on an image, the weight matrix is usually moved over the input image one pixel at a time (or two pixels at a time, depending on the value of the stride) in the horizontal direction, thereby completing the task of extracting specific features from the image. The size of the weight matrix should be related to the size of the image; note that the depth dimension (depth dimension) of the weight matrix is the same as the depth dimension of the input image, and the weight matrix extends over the entire depth of the input image during the convolution operation. Thus, convolving with a single weight matrix produces a convolved output with a single depth dimension; in most cases, however, a single weight matrix is not used, and multiple weight matrices of the same dimensions are applied instead, with the outputs of the weight matrices stacked to form the depth dimension of the convolved image. Different weight matrices can be used to extract different features from the image: for example, one weight matrix is used to extract image edge information, another weight matrix is used to extract a specific color of the image, and yet another weight matrix is used to blur unwanted noise in the image. The weight matrices have the same dimensions, the feature maps extracted by weight matrices of the same dimensions also have the same dimensions, and the extracted feature maps of the same dimensions are combined to form the output of the convolution operation.
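As a purely illustrative aside (not part of the solution claimed in this application), the convolution operation described above can be sketched in a few lines of Python; the function name conv2d_single_channel and the use of NumPy are assumptions of this sketch only.

```python
import numpy as np

def conv2d_single_channel(image, kernel, stride=1):
    """Slide one weight matrix (kernel) over a single-channel image.

    Each output pixel is the element-wise product of the kernel and the
    image patch it currently covers, summed up -- the feature-extraction
    step described above for one weight matrix and one depth dimension.
    """
    kh, kw = kernel.shape
    ih, iw = image.shape
    oh = (ih - kh) // stride + 1
    ow = (iw - kw) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = image[i * stride:i * stride + kh, j * stride:j * stride + kw]
            out[i, j] = np.sum(patch * kernel)
    return out

# Example: a vertical-edge-extracting weight matrix applied to an 8x8 image.
image = np.random.rand(8, 8)
edge_kernel = np.array([[1, 0, -1],
                        [1, 0, -1],
                        [1, 0, -1]], dtype=float)
feature_map = conv2d_single_channel(image, edge_kernel, stride=1)
print(feature_map.shape)  # (6, 6)
```

Applying several such kernels and stacking their outputs would then produce the multi-channel convolved image described above.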
In practical applications, the weight values in these weight matrices need to be obtained through a large amount of training, and each weight matrix formed by the trained weight values can extract information from the input image, thereby helping the convolutional neural network 100 to make correct predictions.
When the convolutional neural network 100 has multiple convolutional layers, the initial convolutional layers tend to extract more general features, which may also be referred to as low-level features; as the depth of the convolutional neural network 100 increases, the features extracted by the later convolutional layers (e.g., 126) become more complex, such as features with high-level semantics, and features with higher-level semantics are more suitable for the problem to be solved. For example, where the layers 121 to 126 shown in fig. 1 are convolutional layers, the initial convolutional layer may be 121 and the later convolutional layer may be 126.
Since the number of training parameters often needs to be reduced, a pooling layer often needs to be periodically introduced after a convolutional layer. That is, among the layers 121 to 126 illustrated at 120 in fig. 1, one convolutional layer may be followed by one pooling layer, or several convolutional layers may be followed by one or more pooling layers. During image processing, the only purpose of the pooling layer is to reduce the spatial size of the image. The pooling layer may include an average pooling operator and/or a maximum pooling operator for sampling the input image to obtain an image of smaller size. The average pooling operator calculates the average of the pixel values in the image within a particular range. The maximum pooling operator takes the pixel with the largest value within a particular range as the result of maximum pooling. In addition, just as the size of the weight matrix used in the convolutional layer should be related to the image size, the operators in the pooling layer should also be related to the image size. The size of the image output after processing by the pooling layer may be smaller than the size of the image input to the pooling layer, and each pixel in the image output by the pooling layer represents the average value or the maximum value of the corresponding sub-region of the image input to the pooling layer.
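In the same illustrative spirit, a minimal sketch of the pooling operators described above might look as follows; the function name pool2d is an assumption of this sketch, not terminology from this application.

```python
import numpy as np

def pool2d(feature_map, size=2, stride=2, mode="max"):
    """Reduce the spatial size of a feature map with max or average pooling."""
    h, w = feature_map.shape
    oh = (h - size) // stride + 1
    ow = (w - size) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            window = feature_map[i * stride:i * stride + size,
                                 j * stride:j * stride + size]
            out[i, j] = window.max() if mode == "max" else window.mean()
    return out

feature_map = np.arange(16, dtype=float).reshape(4, 4)
print(pool2d(feature_map, mode="max"))   # 2x2 map, each value is the window maximum
print(pool2d(feature_map, mode="avg"))   # 2x2 map, each value is the window average
```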
After processing by the convolutional layer/pooling layer 120, the convolutional neural network 100 is not yet able to output the required output information, because, as described above, the convolutional layer/pooling layer 120 only extracts features and reduces the dimensions of the input data. In order to generate the final output data (the required class information or other relevant information), the convolutional neural network 100 needs to use the neural network layer 130 to generate an output of one class, or of a set of the required number of classes. Therefore, the neural network layer 130 may include multiple hidden layers (131, 132 to 13n as shown in fig. 1) and an output layer 140, where the model parameters contained in the multiple hidden layers may be pre-trained on relevant training data for a specific task type; for example, the task type may include image recognition, image classification, image super-resolution reconstruction, and so on.
After the hidden layers of the neural network layer 130, the final layer of the entire convolutional neural network 100 is the output layer 140. The output layer 140 has a loss function similar to categorical cross-entropy, which is specifically used to calculate the prediction error. Once the forward propagation of the entire convolutional neural network 100 (e.g., propagation from 110 to 140 in fig. 1) is completed, backward propagation (e.g., propagation from 140 to 110 in fig. 1) begins to update the weights and biases of the aforementioned layers, so as to reduce the loss of the convolutional neural network 100, that is, the error between the result output by the convolutional neural network 100 through the output layer and the desired result.
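To make the forward/backward propagation step above concrete, the following generic sketch computes a softmax cross-entropy loss for a toy output layer and applies one gradient-descent update; it is a textbook illustration with assumed toy dimensions, not code from this application.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)          # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Toy "output layer": logits = features @ W + b
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 8))                 # 4 samples, 8 extracted features
labels = np.array([0, 2, 1, 2])                    # desired classes
W = rng.normal(scale=0.1, size=(8, 3))
b = np.zeros(3)

# Forward propagation: prediction and cross-entropy loss.
probs = softmax(features @ W + b)
loss = -np.log(probs[np.arange(4), labels]).mean()

# Backward propagation: gradient of the loss w.r.t. weights and biases,
# followed by one small update that reduces the prediction error.
grad_logits = probs.copy()
grad_logits[np.arange(4), labels] -= 1
grad_logits /= 4
W -= 0.1 * (features.T @ grad_logits)
b -= 0.1 * grad_logits.sum(axis=0)
print(f"loss after forward pass: {loss:.4f}")
```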
It should be noted that the convolutional neural network 100 shown in fig. 1 is only an example of a convolutional neural network; in a specific application, the convolutional neural network may also exist in the form of other network models, for example with multiple convolutional layers/pooling layers in parallel, whose separately extracted features are all input to the neural network layer 130 for processing. The method of the embodiments of the present application can also be applied to CNNs with other structures.
It should be further noted that "AI model" referred to hereinafter refers only to a data processing procedure in the AI model, and does not include model parameters used in the data processing procedure. For example, the AI model refers only to convolution operations, pooling operations, and the like that need to be performed by each layer in the CNN shown in fig. 1, and does not include weights involved in the convolution operations, pooling operations.
With the development of artificial intelligence and deep learning, more and more enterprises deploy AI chips on electronic devices and use AI technology, that is, they implement specific functions, such as fingerprint unlocking, image recognition and voice recognition, through an AI model (such as the CNN shown in fig. 1). In general, in order to improve the competitiveness of an AI model, an enterprise will invest a large amount of resources and manpower to collect and purchase data, so as to enlarge the training scale of the AI model, optimize its training parameters, and so on, thereby enhancing and optimizing the AI model. It can therefore be seen that the AI model has become an asset of the enterprise.
Current artificial intelligence AI applications based on deep neural networks are mainly divided into two phases: training and reasoning. Specifically, the training is to train the initial neural network model into a target neural network model (AI model) through the training process of large data volume so as to be applied to the actual scene; while reasoning is a process of applying a trained AI model to the actual scene.
In general, the AI model is obtained after a huge training investment and is a core asset of the AI application, and it needs to be strictly protected against theft in the actual running environment of the reasoning stage. The AI model is stored as a file in the running environment of the AI application. Protection of the AI model file is realized mainly through encryption and decryption, that is, the AI model file is stored after being encrypted, and the AI reasoning application decrypts the file for use at run time.
Fig. 2 is an application scenario architecture diagram according to an embodiment of the present application. As shown in fig. 2, the application scenario architecture diagram includes: a hardware device 200, the hardware device 200 comprising: a main chip 201 and an AI chip 202, wherein the main chip 201 has an AI application deployed thereon.
As one example, hardware device 200 may be an autopilot box device of a vehicle; the main chip 201 may be a main controller of the vehicle, and the main chip 201 may include an AI application program, for example, an autopilot program; the AI chip 202 may be an AI computing chip. It should be understood that the above description is only exemplary.
The AI model is deployed in the form of a file in the AI application on the main chip 201.
As an alternative implementation manner, when protection of the AI model is implemented by encryption and decryption, the implementation process of encryption and decryption may be performed in the main chip 201, may be performed in the AI chip 202, or may be performed by means of other hardware devices.
In the prior art, encryption and decryption schemes for an AI model mainly have two types: a software-only scheme and a dedicated hardware dongle scheme.
In the pure software scheme, the trained AI model is encrypted and stored as a ciphertext file in the storage (e.g., a disk) of the running environment. When the AI application runs, the ciphertext is read and directly decrypted with the key for use. The key is managed by the AI application itself and may be hard-coded or stored in a configuration file.
For example, the AI application deployed on the main chip 201 carries a trained AI model. The AI application encrypts the AI model using a related program and then stores the encrypted AI model in the storage of the running environment as a ciphertext file, while the key is hard-coded or stored in a file; when the AI application runs, it directly decrypts the encrypted AI model using the key and then uses the decrypted model.
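A minimal sketch of this software-only pattern is shown below, assuming the Python cryptography package (Fernet) purely for illustration; the file names and helper names are hypothetical, and the weaknesses discussed next apply to exactly this kind of code.

```python
from cryptography.fernet import Fernet

# In the software-only scheme the key is managed by the AI application itself;
# in a real deployment it would be a fixed literal hard-coded here or read from
# a configuration file (generated once below only so the sketch can run).
HARD_CODED_KEY = Fernet.generate_key()

def encrypt_model_file(plain_path: str, cipher_path: str) -> None:
    """Encrypt the trained AI model and store it as a ciphertext file."""
    with open(plain_path, "rb") as f:
        model_bytes = f.read()
    with open(cipher_path, "wb") as f:
        f.write(Fernet(HARD_CODED_KEY).encrypt(model_bytes))

def load_model_for_inference(cipher_path: str) -> bytes:
    """At run time the AI application reads the ciphertext and decrypts it itself."""
    with open(cipher_path, "rb") as f:
        return Fernet(HARD_CODED_KEY).decrypt(f.read())
```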
In this scheme, the encryption and decryption process is implemented purely in software, that is, the AI application has to implement it itself, which increases the implementation cost of the AI application; meanwhile, the decryption key is hard-coded or stored in a file and is easy to reverse-engineer; and there is also a security problem when the decrypted AI model is transmitted to the AI chip 202. The overall security of this scheme is therefore low.
In the dedicated hardware dongle scheme, the dongle (also known as a hardware key) is a combined software and hardware encryption product inserted into a parallel port of a computer and is currently a popular identity-authentication security tool; it is about the size of a USB flash drive and can be plugged directly into and pulled out of a universal serial bus (universal serial bus, USB) interface of the computer. Each dongle has an independent product identification code and an independent up-to-date encryption algorithm, and when a user logs in to the platform, the user is allowed to log in normally only after the specific dongle is detected and accurate physical verification is passed.
Fig. 3 is a schematic diagram of a prior art deployment of a dedicated hardware dongle. As shown in fig. 3, the AI application is deployed on a host device, which can be understood as the main chip 201, and a dedicated hardware dongle module is added on the host device. When the hardware dongle module receives the AI model ciphertext from the AI application, that is, when the AI model is to be run, the hardware dongle module decrypts the AI model ciphertext to obtain the AI model plaintext and then sends the AI model plaintext back to the AI application; the AI application sends the received AI model plaintext to the AI computing device for processing, where the AI computing device can be understood as the AI chip 202.
It can be appreciated that, when the dedicated hardware dongle scheme is used, one or more dongles (models from different manufacturers) need to be deployed on the device running the AI application, so the cost is high and the deployment is complex; the dongle has limited capability and its encryption algorithm is weaker than public encryption algorithms, so the security is not very high; and there is also a security problem when the AI model decrypted by the hardware dongle is transmitted to the AI computing device.
As an example, in an edge computing deployment scenario such as a camera, the dongle and the AI model are deployed on a camera that is typically outdoors, where there is a risk of theft.
In view of this, the present application provides a method for using an AI model that achieves strict protection of the AI model. It avoids the problems of the prior art, in which protecting the AI model increases the implementation cost of the AI application, security is low, using at least one dongle on the device where the AI application is deployed increases hardware cost and deployment complexity, and the process of transmitting the decrypted AI model to the AI chip is unsafe.
The following describes the technical solution of the present application in detail by taking an application scenario schematic diagram shown in fig. 2 as an example and through the accompanying drawings and specific embodiments. It should be noted that the following embodiments may be combined with each other, and the same or similar concepts or processes may not be described in detail in some embodiments.
Fig. 4 is a schematic flow chart of a method for using an artificial intelligence AI model provided in one embodiment of the present application. As shown in fig. 4, the method may include S410 to S440. One example of the main chip in this method is the main chip 201, and one example of the AI chip in this method is the AI chip 202.
S410, the main chip sends an encrypted AI model to the AI chip. Accordingly, the AI chip receives the encrypted AI model sent by the main chip, where the encrypted AI model is a model obtained by the AI chip encrypting the first AI model.
For example, when the main chip needs the AI chip to perform reasoning using the first AI model, the main chip may send to the AI chip the encrypted AI model obtained by encrypting the first AI model.
Take as an example the case where the main chip and the AI chip are chips in an autopilot box device, an autopilot program is deployed on the main chip, and the autopilot program is implemented by the first AI model: when the autopilot program runs on the main chip, the main chip may send the encrypted model of the first AI model to the AI chip.
As one example, the first AI model is the convolutional neural network model shown in fig. 1.
S420, the AI chip decrypts the encrypted AI model to obtain a first AI model.
After the AI chip receives the encrypted AI model sent by the main chip, it performs a decryption operation on the encrypted AI model and obtains the first AI model after decryption.
S430, the AI chip uses the first AI model to carry out reasoning so as to obtain a reasoning result.
It will be appreciated that reasoning is the process of applying a trained AI model to an actual scene. For example, various conclusions are inferred from new data using the trained model; that is, operations are performed with the existing neural network model, and the process of obtaining a correct conclusion each time from new input data may also be referred to as prediction or inference.
As an example, in an actual scenario, real operating data, also called field data, needs to be obtained from an operating device and decoded by a CPU with a decoder to obtain the input data required by the AI chip, and the input data is then transmitted to the AI chip over a high-speed serial computer expansion bus (peripheral component interconnect express, PCIe) to perform reasoning using the first AI model.
As another example, the first AI model is an algorithm whose optimal configuration parameters have been determined in advance using big data, and the process of the AI chip performing reasoning with the first AI model is to input the input data into the first AI model and obtain the reasoning result by means of the corresponding algorithm. For example, vehicle running data is obtained in the above manner to form the input data, and the input data is then input into the first AI model to obtain the vehicle condition information required by the user.
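The data path described in the two examples above might be organized roughly as in the following sketch; AiChip, decode_frame and run_inference are hypothetical names standing in for a vendor runtime and are not interfaces defined by this application.

```python
import numpy as np

class AiChip:
    """Hypothetical stand-in for an AI computing chip reachable over PCIe."""
    def load_model(self, model_bytes: bytes) -> None: ...
    def infer(self, input_tensor: np.ndarray) -> np.ndarray: ...

def decode_frame(raw_field_data: bytes) -> np.ndarray:
    """CPU-side decoding of field data into the input layout the chip expects."""
    frame = np.frombuffer(raw_field_data, dtype=np.uint8)
    return (frame.astype(np.float32) / 255.0).reshape(1, -1)

def run_inference(chip: AiChip, raw_field_data: bytes) -> np.ndarray:
    input_tensor = decode_frame(raw_field_data)   # decoded on the CPU
    return chip.infer(input_tensor)               # sent to the AI chip (e.g., over PCIe) for reasoning
```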
S440, the AI chip sends the reasoning result to the main chip. Accordingly, the main chip receives the reasoning result from the AI chip.
Compared with decryption performed by a chip or dongle other than the AI chip, in the method of this embodiment the decryption of the encrypted AI model is executed on the AI chip, and the AI chip directly performs reasoning with the decrypted AI model, which avoids the decrypted AI model being intercepted or stolen during transmission; in other words, the risk of leaking the AI model is greatly reduced and security is improved.
As an example, one implementation of the AI chip to decrypt the encrypted AI model to obtain the first AI model includes: and decrypting the encrypted AI model by using a trusted root stored in a secure storage area of the AI chip.
For example, the AI chip generates a decryption key using the chip's trusted root, and then decrypts the encrypted AI model using the decryption key to obtain the first AI model.
The chip trusted root can be understood as information stored in a secure storage area of the AI chip that is trusted in the chip without any precondition. The content of the secure storage area can only be read by the AI chip, and external devices cannot read it; in other words, the re-encrypted AI model can only be decrypted and run on the AI chip, which further improves the security of the AI model.
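This application does not prescribe a particular cipher or key-derivation scheme. As one possible illustration only, the sketch below assumes the chip firmware derives an AES-GCM key from the trusted root with HKDF and uses it to decrypt the model; the function names and algorithm choices are assumptions of the sketch.

```python
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.kdf.hkdf import HKDF
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def derive_model_key(trusted_root: bytes) -> bytes:
    """Derive a model-protection key from the chip's trusted root.

    The trusted root itself never leaves the secure storage area; only the
    AI chip is assumed to be able to call this function.
    """
    return HKDF(algorithm=hashes.SHA256(), length=32, salt=None,
                info=b"ai-model-protection").derive(trusted_root)

def decrypt_model_on_chip(trusted_root: bytes, nonce: bytes, ciphertext: bytes) -> bytes:
    """S420: recover the first AI model inside the AI chip.

    The recovered plaintext is used directly for reasoning on the chip and
    is never sent back to the main chip.
    """
    key = derive_model_key(trusted_root)
    return AESGCM(key).decrypt(nonce, ciphertext, None)
```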
In this embodiment, the encrypted AI model in the main chip may be obtained in a plurality of ways; one way of obtaining it is described below with reference to fig. 5.
As shown in fig. 5, before S410, one implementation of obtaining the encrypted AI model in the usage method of this embodiment of the present application may include the following steps:
S401, the main chip sends the first AI model to the AI chip. Accordingly, the AI chip receives the first AI model sent by the main chip.
For example, when the main chip needs the AI chip to encrypt the first AI model, the main chip may send the first AI model to the AI chip.
In one alternative, what is stored on the main chip is an AI model obtained by software encryption of the first AI model. In this case, the main chip may first decrypt the software-encrypted AI model and, after obtaining the first AI model, send the first AI model to the AI chip.
In another alternative, the first AI model itself is stored on the main chip. In this case, the main chip may send the first AI model directly to the AI chip.
S402, the AI chip sends to the main chip an encrypted AI model obtained by encrypting the first AI model using a trusted root. Accordingly, the main chip receives the encrypted AI model.
For example, after the AI chip receives the first AI model from the main chip, it generates a key using the trusted root stored in the secure storage area of the AI chip, then encrypts the received first AI model using the key to obtain the encrypted AI model, and returns the encrypted AI model to the main chip.
It should be understood that the trusted root in the AI chip refers to information in the chip that is trusted without any precondition; it may be stored in a secure storage area of the AI chip, and the content of this secure storage area is readable only by the AI chip and cannot be read by an external device. This increases the security of the key, and thus the security of the first AI model can be further improved.
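The encryption step S402 would then be the mirror image of the decryption sketch above; the following illustration makes the same AES-GCM and HKDF assumptions, which are choices of the sketch rather than requirements of this application.

```python
import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.kdf.hkdf import HKDF
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def encrypt_model_on_chip(trusted_root: bytes, first_model: bytes) -> tuple[bytes, bytes]:
    """S402: encrypt the received first AI model under a key derived from the
    chip's trusted root; only the ciphertext (with its nonce) is returned to
    the main chip."""
    key = HKDF(algorithm=hashes.SHA256(), length=32, salt=None,
               info=b"ai-model-protection").derive(trusted_root)
    nonce = os.urandom(12)                     # AES-GCM nonce, stored alongside the ciphertext
    return nonce, AESGCM(key).encrypt(nonce, first_model, None)
```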
S403, the main chip stores the encrypted AI model.
In one implementation, after receiving the encrypted AI model corresponding to the first AI model, the main chip stores the encrypted AI model and records the mapping relationship between the encrypted AI model and the first AI model, so that when the first AI model needs to be used, the encrypted AI model can be retrieved based on the mapping relationship.
Optionally, when the main chip stores the encrypted AI model, the previously stored first AI model may be deleted, so as to avoid wasting storage space on the main chip.
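On the main-chip side, S403 could be as simple as the following sketch: persist the ciphertext, record which original model it corresponds to, and optionally delete the plaintext file. The JSON index and file-naming convention are illustrative assumptions only.

```python
import json
import os
from typing import Optional

MODEL_INDEX = "model_index.json"   # mapping: model name -> encrypted file (illustrative)

def store_encrypted_model(model_name: str, nonce: bytes, ciphertext: bytes,
                          plaintext_path: Optional[str] = None) -> None:
    """S403 on the main chip: keep only the encrypted model plus a lookup record."""
    enc_path = f"{model_name}.enc"
    with open(enc_path, "wb") as f:
        f.write(nonce + ciphertext)            # the nonce is needed again at decryption time
    index = {}
    if os.path.exists(MODEL_INDEX):
        with open(MODEL_INDEX) as f:
            index = json.load(f)
    index[model_name] = enc_path               # record the mapping to the first AI model
    with open(MODEL_INDEX, "w") as f:
        json.dump(index, f)
    if plaintext_path and os.path.exists(plaintext_path):
        os.remove(plaintext_path)              # optional: free the space used by the plaintext model
```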
As an example, the main chip may perform S401, S402, and S403 when the AI application to which the first AI model belongs is installed and run on the main chip for the first time.
When the AI application program is run again, the main chip directly loads the locally stored encrypted AI model onto the AI chip; the AI chip then performs a decryption operation on the encrypted AI model, performs reasoning using the first AI model obtained by decryption to obtain a reasoning result, and returns the reasoning result to the main chip.
In this embodiment, the AI chip may automatically implement encryption protection on the AI model, without deploying a dedicated hardware dongle, etc., and may reduce the complexity of implementing a hardware-level protection scheme for the AI model by a user.
An example of a method of using the AI model is described below with reference to fig. 6, taking the main chip and AI chip as examples of chips in an autopilot box.
S601, installing and starting an automatic driving program for the first time.
The maintenance personnel installs the autopilot program into the autopilot box device of the vehicle, then initiates the autopilot program for the first time, and proceeds to debug. The automatic driving box equipment comprises a main chip and an AI computing chip, an automatic driving program is deployed on the main chip, and the automatic driving program can comprise an AI model encrypted by using a software mode.
S602, the automatic driving program decrypts the AI model to obtain the plaintext of the AI model.
S603, the main chip on which the autopilot program is deployed sends the AI model plaintext to the AI computing chip.
It can be appreciated that if the AI model deployed in the autopilot is unencrypted, S602 may be skipped and S603 may be performed directly after S601.
For example, the autopilot invokes an interface of the AI computing chip, loading the AI model plaintext onto the AI computing chip for subsequent computation.
S604, the AI computing chip generates a unique key and re-encrypts the AI model.
The AI computing chip automatically generates its own key from the trusted root of its chip and encrypts the AI model plaintext with this key to obtain the encrypted AI model.
In this step, the root of trust is stored in a secure memory area of the AI computing chip and cannot be obtained from outside.
S605, the AI computing chip sends the re-encrypted AI model to the main chip.
S606, the main chip on which the autopilot program is deployed deletes the originally stored AI model and key, and stores the re-encrypted AI model.
It will be appreciated that the main chip may also choose not to delete the originally stored AI model and key.
S607, the main chip deployed with the autopilot program sends the re-encrypted AI model to the AI computing chip.
As an alternative, when a maintainer later debugs again or the vehicle owner starts the autopilot program, the main chip on which the autopilot program is deployed can directly load the re-encrypted AI model onto the AI computing chip.
S608, the AI computing chip decrypts the encrypted AI model.
After obtaining the encrypted AI model, the AI computing chip decrypts it using a decryption key generated from its own trusted root, so as to obtain the AI model plaintext, and then performs the corresponding operations.
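Putting the steps of fig. 6 together, the first-run and subsequent-run paths might be orchestrated as in the sketch below; main_chip and ai_chip are hypothetical objects whose methods merely name the steps above, so this is a summary of the flow rather than a concrete driver API.

```python
def first_run(main_chip, ai_chip, plaintext_model: bytes) -> None:
    """First installation and start (S601-S606): one-time conversion to chip-bound encryption."""
    ai_chip.load_plaintext(plaintext_model)                   # S603
    encrypted_model = ai_chip.encrypt_with_trusted_root()     # S604
    main_chip.store_encrypted_model(encrypted_model)          # S605/S606

def every_run(main_chip, ai_chip, field_data: bytes):
    """Every later start (S607-S608 plus reasoning): only ciphertext crosses the chip boundary."""
    ai_chip.load_encrypted(main_chip.load_encrypted_model())  # S607
    ai_chip.decrypt_with_trusted_root()                       # S608: plaintext stays on the AI chip
    return ai_chip.infer(field_data)                          # reasoning result returned to the main chip
```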
In this embodiment, the AI computing chip automatically generates a key from its own trusted root and performs the encryption and decryption operations on the AI model plaintext; that is, protection of the AI model is converted from software-level key encryption protection to chip hardware-level key encryption protection. The trusted root from which the key is generated is stored in the secure storage of the chip and is not available externally, so decryption can only be performed on the AI computing chip, which greatly reduces the risk of leaking the AI model. The AI chip can implement hardware-level key encryption protection of the AI model by itself, which greatly reduces the complexity for an AI user of implementing a hardware-level protection scheme for the AI model; at the same time, no dedicated hardware dongle needs to be deployed, which reduces development and product costs.
FIG. 7 is a schematic diagram of a device for using an artificial intelligence AI model in accordance with one embodiment of the application. It should be understood that the apparatus 700 shown in fig. 7 is merely an example, and that the apparatus 700 of the embodiments of the present application may further include other modules or units. The apparatus 700 may be used to implement the method shown in fig. 4.
For example, the apparatus 700 may include an encryption module 701, a receiving module 702, a decryption module 703, an inference module 704, and a sending module 705. The encryption module 701 and the receiving module 702 are configured to perform S410, the decryption module 703 is configured to perform S420, the inference module 704 is configured to perform S430, and the sending module 705 is configured to perform S440.
FIG. 8 is a schematic diagram of an apparatus for using an artificial intelligence AI model in accordance with another embodiment of the application. It should be understood that the apparatus 800 shown in fig. 8 is merely an example, and that the apparatus 800 of the embodiments of the present application may also include other modules or units. The apparatus 800 may be used to implement the method shown in fig. 5.
For example, the apparatus 800 may include a sending module 801, a receiving module 802, and a storage module 803. The sending module 801 is configured to perform S401, the receiving module 802 is configured to perform S402, and the storage module 803 is configured to perform S403.
It should be understood that the term "module" herein may be implemented in software and/or hardware, and this is not specifically limited. For example, a "module" may be a software program, a hardware circuit, or a combination of both that implements the functions described above. The hardware circuit may include an application specific integrated circuit (application specific integrated circuit, ASIC), an electronic circuit, a processor (e.g., a shared processor, a dedicated processor, or a group processor) and memory for executing one or more software or firmware programs, a combinational logic circuit, and/or other suitable components that support the described functions.
Fig. 9 is a schematic block diagram of an apparatus provided in an embodiment of the present application. The apparatus 900 shown in fig. 9 comprises a memory 901, a processor 902, a communication interface 903, and a bus 904. The memory 901, the processor 902, and the communication interface 903 are communicatively connected to each other via a bus 904.
The memory 901 may be a Read Only Memory (ROM), a static storage device, a dynamic storage device, or a random access memory (random access memory, RAM). The memory 901 may store a program, and the processor 902 is configured to perform the steps of the methods shown in fig. 4 and 5 when the program stored in the memory 901 is executed by the processor 902.
The processor 902 may employ a general-purpose central processing unit (central processing unit, CPU), microprocessor, application specific integrated circuit (application specific integrated circuit, ASIC), or one or more integrated circuits for executing associated programs to perform the methods of the method embodiments of the present application.
The processor 902 may also be an integrated circuit chip with signal processing capabilities. In implementation, various steps of methods in embodiments of the present application may be performed by integrated logic circuitry in hardware or by instructions in software in processor 902.
The processor 902 may also be a general purpose processor, a digital signal processor (digital signal processing, DSP), an application specific integrated circuit (ASIC), a field programmable gate array (field programmable gate array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The methods, steps, and logic blocks disclosed in the embodiments of the present application may be implemented or performed. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The steps of a method disclosed in connection with the embodiments of the present application may be embodied directly as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in the decoding processor. The software modules may be located in a storage medium well known in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 901; the processor 902 reads the information in the memory 901 and, in combination with its hardware, performs the functions that need to be performed by the units included in the apparatus, for example, the steps/functions of the embodiments shown in fig. 4 and fig. 5.
The communication interface 903 uses a transceiver apparatus such as, but not limited to, a transceiver to implement communication between the apparatus 900 and other devices or communication networks.
The bus 904 may include a path for transferring information between various components of the apparatus 900 (e.g., the memory 901, the processor 902, the communication interface 903).
It should be understood that the apparatus 900 shown in the embodiments of the present application may be an electronic device, or may be a chip configured in an electronic device.
It should be appreciated that the processor in the embodiments of the present application may be a central processing unit (central processing unit, CPU), or may be another general purpose processor, a digital signal processor (digital signal processor, DSP), an application specific integrated circuit (application specific integrated circuit, ASIC), a field programmable gate array (field programmable gate array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
It should also be appreciated that the memory in the embodiments of the present application may be a volatile memory or a nonvolatile memory, or may include both volatile and nonvolatile memory. The nonvolatile memory may be a read-only memory (read-only memory, ROM), a programmable ROM (programmable ROM, PROM), an erasable PROM (erasable PROM, EPROM), an electrically erasable PROM (electrically EPROM, EEPROM), or a flash memory. The volatile memory may be a random access memory (random access memory, RAM), which is used as an external cache. By way of example but not limitation, many forms of RAM are available, such as static RAM (static RAM, SRAM), dynamic RAM (dynamic RAM, DRAM), synchronous DRAM (synchronous DRAM, SDRAM), double data rate SDRAM (double data rate SDRAM, DDR SDRAM), enhanced SDRAM (enhanced SDRAM, ESDRAM), synchlink DRAM (synchlink DRAM, SLDRAM), and direct rambus RAM (direct rambus RAM, DR RAM).
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, the above embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product comprises one or more computer instructions or computer programs. When the computer instructions or the computer program are loaded or executed on a computer, the processes or functions described in the embodiments of the present application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired or wireless (e.g., infrared, radio, or microwave) manner. The computer-readable storage medium may be any available medium accessible by a computer, or a data storage device such as a server or a data center that integrates one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, or a magnetic tape), an optical medium (e.g., a DVD), or a semiconductor medium. The semiconductor medium may be a solid state disk.
It should be understood that the term "and/or" is merely an association relationship describing the associated object, and means that three relationships may exist, for example, a and/or B may mean: there are three cases, a alone, a and B together, and B alone, wherein a, B may be singular or plural. In addition, the character "/" herein generally indicates that the associated object is an "or" relationship, but may also indicate an "and/or" relationship, and may be understood by referring to the context.
In the present application, "at least one" means one or more, and "a plurality" means two or more. "at least one of" or the like means any combination of these items, including any combination of single item(s) or plural items(s). For example, at least one (one) of a, b, or c may represent: a, b, c, a-b, a-c, b-c, or a-b-c, wherein a, b, c may be single or plural.
It should be understood that, in various embodiments of the present application, the sequence numbers of the foregoing processes do not mean the order of execution, and the order of execution of the processes should be determined by the functions and internal logic thereof, and should not constitute any limitation on the implementation process of the embodiments of the present application.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, and are not repeated herein.
In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the methods described in the embodiments of the present application. And the aforementioned storage medium includes: various media capable of storing program codes, such as a U disk, a mobile hard disk, a read-only memory, a random access memory, a magnetic disk or an optical disk.
The foregoing is merely specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily think about changes or substitutions within the technical scope of the present application, and the changes and substitutions are intended to be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (11)

  1. A method of using an artificial intelligence AI model, the method being applied to an AI chip, the method comprising:
    receiving an encrypted AI model from a main chip;
    decrypting the encrypted AI model to obtain a first AI model;
    performing reasoning by using the first AI model to obtain a reasoning result;
    and sending the reasoning result to the main chip.
  2. The method of claim 1, wherein decrypting the encrypted AI model comprises:
    and decrypting the encrypted AI model by using a trusted root stored in a secure storage area of the AI chip.
  3. The method of claim 2, wherein prior to the AI chip receiving the encrypted AI model, the method further comprises:
    receiving the first AI model from the main chip;
    encrypting the first AI model by using the trusted root to obtain the encrypted AI model;
    and sending the encrypted AI model to the main chip.
  4. A method of using an artificial intelligence AI model, the method being applied to a main chip, the method comprising:
    sending an encrypted AI model to an AI chip, wherein the encrypted AI model is obtained by encrypting a first AI model by the AI chip;
    and receiving an inference result from the AI chip, wherein the inference result is obtained by the AI chip through inference using the first AI model.
  5. The method of claim 4, wherein prior to the sending the encrypted AI model to the AI chip, the method further comprises:
    transmitting the first AI model to the AI chip;
    receiving the encrypted AI model obtained by encrypting the first AI model by the AI chip by using a trusted root;
    the encrypted AI model is stored.
  6. The method of claim 5, wherein the trusted root is a trusted root stored within a secure storage area of the AI chip.
  7. An apparatus for using an artificial intelligence AI model, the apparatus comprising means for implementing the method of any of claims 1-6.
  8. An artificial intelligence AI chip comprising a processor coupled to a memory, the processor configured to execute program code in the memory to implement the method of any of claims 1-3.
  9. A chip comprising a processor coupled to a memory, the processor for executing program code in the memory to implement the method of any of claims 4 to 6.
  10. A computer readable storage medium, characterized in that the storage medium has stored therein a computer program or instructions which, when executed by a processor, implement the method of any of claims 1 to 6.
  11. A computer program product comprising computer program code for causing a computer to carry out the method according to any one of claims 1 to 6 when said computer program code is run on the computer.
CN202180098116.1A 2021-06-16 2021-06-16 Method for using artificial intelligent model and related device Pending CN117280342A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/100455 WO2022261878A1 (en) 2021-06-16 2021-06-16 Method for using artificial intelligence model and related apparatus

Publications (1)

Publication Number Publication Date
CN117280342A true CN117280342A (en) 2023-12-22

Family

ID=84525880

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202180098116.1A Pending CN117280342A (en) 2021-06-16 2021-06-16 Method for using artificial intelligent model and related device

Country Status (2)

Country Link
CN (1) CN117280342A (en)
WO (1) WO2022261878A1 (en)

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11775347B2 (en) * 2019-10-10 2023-10-03 Baidu Usa Llc Method for implanting a watermark in a trained artificial intelligence model for a data processing accelerator
US11740940B2 (en) * 2019-10-10 2023-08-29 Baidu Usa Llc Method and system for making an artifical intelligence inference using a watermark-inherited kernel for a data processing accelerator
US11645586B2 (en) * 2019-10-10 2023-05-09 Baidu Usa Llc Watermark unit for a data processing accelerator
US11645116B2 (en) * 2019-10-10 2023-05-09 Baidu Usa Llc Method and system for making an artificial intelligence inference using a watermark-enabled kernel for a data processing accelerator
US11676011B2 (en) * 2019-10-24 2023-06-13 International Business Machines Corporation Private transfer learning
CN110995720B (en) * 2019-12-09 2022-09-23 北京天融信网络安全技术有限公司 Encryption method, device, host terminal and encryption chip
CN111783078A (en) * 2020-07-14 2020-10-16 大唐终端技术有限公司 Android platform security chip control system

Also Published As

Publication number Publication date
WO2022261878A1 (en) 2022-12-22

Similar Documents

Publication Publication Date Title
EP3906508B1 (en) Securing systems employing artificial intelligence
DE102015215120B4 (en) METHOD OF USING ONE DEVICE TO UNLOCK ANOTHER DEVICE
CN104145274A (en) Media encryption based on biometric data
CN111507386B (en) Method and system for detecting encryption communication of storage file and network data stream
US11164046B1 (en) Method for producing labeled image from original image while preventing private information leakage of original image and server using the same
CN113841355B (en) Apparatus and system for securely monitoring using a blockchain
CN111191267B (en) Model data processing method, device and equipment
CN115956243A (en) Model protection device and method and computing device
KR20220014315A (en) Data processing system and method
CN112699400B (en) Image information security processing method and device
US20200266992A1 (en) Electronic device and control method thereof
CN117280342A (en) Method for using artificial intelligent model and related device
US20220414198A1 (en) Systems and methods for secure face authentication
CN113839773A (en) LUKS key offline extraction method, terminal equipment and storage medium
CN111177752B (en) Credible file storage method, device and equipment based on static measurement
CN109711207B (en) Data encryption method and device
KR20220059730A (en) Method of operating neural network model using drm package and method of processing data using the same
KR101987752B1 (en) Key distribution processing apparatus for processing the distribution of a session key in an encrypted manner for the electronic control units mounted in a vehicle and operating method thereof
CN112368698A (en) Location data masquerading
CN115331336B (en) NFC digital key mobile equipment adaptation method and device based on card simulation scheme
US20240080192A1 (en) Data processing method, apparatus and system
WO2023040390A1 (en) Model protection method and apparatus
US20240171387A1 (en) Providing Data to be Protected in a Secured Execution Environment of a Data Processing System
US20240169270A1 (en) Model training method and apparatus, electronic device and storage medium
US20240062037A1 (en) Neural network system and operation method for neural network system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination