WO2019224612A1 - Apparatus and method for protecting artificial intelligent agents from information theft - Google Patents

Apparatus and method for protecting artificial intelligent agents from information theft

Info

Publication number
WO2019224612A1
Authority
WO
WIPO (PCT)
Prior art keywords
client
server
sic
value
predetermined threshold
Prior art date
Application number
PCT/IB2019/050365
Other languages
French (fr)
Inventor
Neisarg DAVE
Srinivas KRUTHIVENTI SUBRAHMANYESWARA SAI
Original Assignee
Harman International Industries, Incorporated
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harman International Industries, Incorporated filed Critical Harman International Industries, Incorporated
Publication of WO2019224612A1 publication Critical patent/WO2019224612A1/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 63/00 Network architectures or network communication protocols for network security
    • H04L 63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L 63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F 21/30 Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F 21/31 User authentication
    • G06F 21/316 User authentication by observing the pattern of computer usage, e.g. typical user behaviour
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F 21/60 Protecting data
    • G06F 21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F 21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G06N 20/20 Ensemble learning

Abstract

In at least one embodiment, a system for protecting an artificial intelligence (AI) based apparatus is provided. The system includes a memory device and a server. The server includes the memory device and provides an AI based service for a client. The server is configured to receive a number of inquiries from the client for the AI based service. The server is further configured to generate a first shared information coefficient (SIC) value after receiving the number of inquiries from the client to determine whether the client is providing targeted inquiries with an intention of information theft from the AI based service.

Description

APPARATUS AND METHOD FOR PROTECTING ARTIFICIAL INTELLIGENT AGENTS
FROM INFORMATION THEFT
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application claims the benefit of Indian provisional Application No.
201841019617 filed on May 25, 2018, the disclosure of which is incorporated in its entirety by reference herein.
TECHNICAL FIELD
[0002] Aspects disclosed herein generally relate to an apparatus and method for protecting
Artificial Intelligent Agents from information theft. These aspects and others will be discussed in more detail below.
BACKGROUND
[0003] Technology is changing and evolving at a tremendous pace, and society finds itself at the dawn of next-generation Artificial Intelligence (AI) systems. These systems come at the expense of a complexity that makes them very difficult to analyze. Such systems are being widely explored and adopted in self-driving cars, virtual assistants, industrial robotics, medical research and clinical practice, the military, space exploration, and many other fields. Industry leaders are currently exploring AI systems for user experience, user authentication, user safety, advanced driver assistance systems, and integrating the car environment with home and office environments.
[0004] Most of the AI approaches, especially Deep Learning systems, may require a large amount of data for training. Comprehensive data, with annotations, may be the key for training models for superior results. In fact, much of the cost and labor in building an AI system may be attributed to collecting and annotating the data. These models may assimilate key information from this data (i.e., in a so-called training phase) to respond intelligently to unseen data in the deployment phase. These models may be treated as concentrated sources of information.
[0005] One proof of the existence of such information in AI models is the framework of Knowledge Distillation, in which a “student” neural network is trained with the help of an already trained “teacher” neural network, as set forth in “Distilling the Knowledge in a Neural Network” by Hinton et al., March 9, 2015. In theory, once meaningful information has been extracted from data by a complex model, it may be possible to train a simpler model with the help of the complex model to achieve similar results.
[0006] Many companies like Google, Amazon, Microsoft and various startups provide cloud-based services, such as text translation, image recognition, voice recognition, style transfer, etc. All of these services are backed by AI models, which may have gone through rigorous training on a large amount of data. The AI models include information that is extracted from their training data and offer their intelligence as a paid service to clients. A malicious client can make targeted calls to such services via an AI system to steal this information.
SUMMARY
[0007] In at least one embodiment, a system for protecting an artificial intelligence (AI) based apparatus is provided. The system includes a memory device and a server. The server includes the memory device and provides an AI based service for a client. The server is configured to receive a number of inquiries from the client for the AI based service. The server is further configured to generate a first shared information coefficient (SIC) value after receiving the number of inquiries from the client to determine whether the client is providing targeted inquiries with an intention of information theft from the AI based service.
[0008] In at least another embodiment, a method for protecting an artificial intelligence (AI) based apparatus is provided. The method includes providing an AI based service for a client and receiving a number of inquiries from the client for the AI based service. The method further includes generating a first shared information coefficient (SIC) value after receiving the number of inquiries from the client and determining whether the client is providing targeted inquiries with an intention of information theft from the AI based service based on the first SIC value.
[0009] In at least another embodiment, a computer-program product embodied in a non-transitory computer readable medium that is programmed for protecting an artificial intelligence (AI) based apparatus is provided. The computer-program product includes instructions for providing an AI based service for a plurality of clients and for receiving a number of inquiries from the plurality of clients for the AI based service. The computer-program product further includes instructions for generating a shared information coefficient (SIC) value for each of the plurality of clients after receiving the number of inquiries from the plurality of clients and for selectively determining whether any of the plurality of clients is a malicious client.
BRIEF DESCRIPTION OF THE DRAWINGS
[0010] The embodiments of the present disclosure are pointed out with particularity in the appended claims. However, other features of the various embodiments will become more apparent and will be best understood by referring to the following detailed description in conjunction with the accompanying drawings in which:
[0011] Figure 1 generally depicts one image that is recovered via a model inversion attack and another image that corresponds to the original image of the recovered image;
[0012] Figure 2 generally depicts a typical artificial intelligence (AI) based implementation that is a service-based model;
[0013] Figure 3 generally depicts an apparatus for protecting a host AI model from information theft in accordance with one embodiment;
[0014] Figure 4 generally depicts the apparatus for protecting a plurality of AI agents in accordance with one embodiment;
[0015] Figure 5 generally depicts the apparatus for monitoring queries from the client in accordance with one embodiment;
[0016] Figure 6 generally depicts a method for protecting the host AI model from information theft in accordance with one embodiment;
[0017] Figure 7 generally illustrates a method for creating a dataset for training the apparatus in accordance with one embodiment;
[0018] Figure 8 generally illustrates an example in which the guardian AI model is arranged to classify one or more images of different plants in accordance with one embodiment;
[0019] Figure 9 generally illustrates an example of training the guardian AI model in accordance with one embodiment; and
[0020] Figure 10 generally illustrates an example of training the guardian AI model in accordance with one embodiment.
DETAILED DESCRIPTION
[0021] As required, detailed embodiments of the present invention are disclosed herein; however, it is to be understood that the disclosed embodiments are merely exemplary of the invention that may be embodied in various and alternative forms. The figures are not necessarily to scale; some features may be exaggerated or minimized to show details of particular components. Therefore, specific structural and functional details disclosed herein are not to be interpreted as limiting, but merely as a representative basis for teaching one skilled in the art to variously employ the present invention.
[0022] It is recognized that various electrical devices such as servers, controllers, and clients, etc. as disclosed herein may include various microprocessors, integrated circuits, memory devices (e.g., FLASH, random access memory (RAM), read only memory (ROM), electrically programmable read only memory (EPROM), electrically erasable programmable read only memory (EEPROM), or other suitable variants thereof), and software which co-act with one another to perform operation(s) disclosed herein. In addition, these electrical devices utilize one or more microprocessors to execute a computer-program that is embodied in a non-transitory computer readable medium that is programmed to perform any number of the functions as disclosed. Further, the various electrical devices as provided herein include a housing and various numbers of microprocessors, integrated circuits, and memory devices (e.g., FLASH, random access memory (RAM), read only memory (ROM), electrically programmable read only memory (EPROM), electrically erasable programmable read only memory (EEPROM)) positioned within the housing. The electrical devices also include hardware-based inputs and outputs for receiving and transmitting data, respectively, from and to other hardware-based devices as discussed herein.
[0023] Recent studies have shown that it is possible to steal machine-learning models via prediction Application Programming Interfaces (APIs). In particular, an adversary client with no knowledge of the internal architecture of a host AI can duplicate (i.e., steal) the functionality of the host AI. Examples of this may involve duplicating the functionality of online services such as BigML® and Amazon Machine Learning®. The studies further illustrate that the natural countermeasure of omitting confidence values from the model outputs still admits potentially harmful model extraction attacks.
[0024] In addition, model inversion attacks on an extracted model can be used to recreate training data. Thus, this condition may pose a threat towards the discovery of private data used for training an original model. In general, model inversion attacks exploit confidence values that are revealed along with predictions. Figure 1 generally illustrates the recovery of recognizable images of a human’s face given only the human’s name and access to the machine learning (ML) model. For example, first image 10 depicts an original image of the human and second image 12 depicts a recovered image of the human using a model inversion attack. These attacks may also be extended to voice/audio data, video data, and other multimedia content.
[0025] To minimize information theft and subsequent data extraction from AI models, one or more aspects noted herein provide an apparatus that keeps track of the information provided by the AI services on a server that is transmitted to one or more clients. Various throttling based systems may only regulate the number of queries made by a client. This kind of system does not take into account the entropy of queries; hence, it is most often possible to steal from the host AI by making queries well under a threshold, as set forth in “The Dark Secret at the Heart of AI”, Will KNIGHT, MIT Technology Review, April 11, 2017. As noted above, the apparatus of the present disclosure monitors the amount of information content that is provided from the host AI to a client. Thus, this incorporates the entropy of queries made by a system of clients. Such an apparatus can identify and take appropriate actions against a client making targeted queries with the intention of information theft. Further, the disclosed apparatus provides a mechanism for safeguarding an AI service from attacks from malicious clients.
[0026] Figure 2 generally depicts an artificial intelligence (AI) based apparatus 14 that operates as a service-based model. The apparatus 14 includes a server 16 having at least one controller 18 for executing AI based models (or a host AI model 19). The apparatus 14 generally includes memory 20 to provide and store information for the controller 18 to execute the AI based models. A client 22 (or computing device) is operably coupled to the controller 18 to request a service or operation to be performed by the host AI model 19 of the server 16. As noted above, the apparatus 14 is generally a service-based model in which the server 16 generally responds to requests or queries from the client 22. In one example, the service-based model offered by the host AI model 19 of the server 16 may be an image classification function. In particular, the client 22 may transmit an image as a query to the server 16 or to the host AI model 19 as stored on the memory 20. In response to the query, the server 16 transmits a classification label corresponding to the image (e.g., a cat) back to the client 22.
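By way of illustration only, the following minimal sketch outlines the service-based exchange of Figure 2: the client submits an image query and the server returns only a classification label. The Python framing and the class and method names are assumptions introduced here for clarity and are not part of the disclosure.

```python
# Minimal sketch of the Figure 2 service-based model (illustrative names only).
class HostAIModel:
    """Stands in for the host AI model 19 executed by the controller 18."""
    def classify(self, image):
        # A real deployment would run a trained classifier here.
        return "cat"

class Server:
    """Stands in for the server 16 that answers queries from the client 22."""
    def __init__(self, host_model):
        self.host_model = host_model

    def handle_query(self, image):
        # Only the classification label is returned to the client.
        return self.host_model.classify(image)

server = Server(HostAIModel())
label = server.handle_query(image=b"...raw image bytes...")  # -> "cat"
```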
[0027] Figure 3 generally depicts the apparatus 14 for protecting the host AI model 19 from information theft in accordance with one embodiment. The apparatus 14 includes a guardian controller 24 that is operably coupled to the controller 18 (i.e., to the host AI model 19) and to the client 22. The guardian controller 24 executes a guardian AI model 25 to protect the server 16 (or the host AI model) from a malicious attack from the client 22. The memory 20 provides and stores information for the guardian controller 24 to execute the guardian AI model 25. It is recognized that a single controller may be provided to execute the host AI model 19 for providing a service or operation for the client 22. In addition, the single controller may be configured to execute the guardian AI model 25 for protecting the host AI model 19 from a malicious attack from the client 22. It is recognized that any number of controllers may be used to execute the host AI model 19 and/or the guardian AI model 25.
[0028] In general, the guardian controller 24 maintains its states through recurrence. The guardian controller 24 may employ a Recurrent Neural Network to maintain its states. It is recognized that other types of Neural Networks may be used to maintain states. The guardian controller 24 monitors all incoming inquiries made to the server 16 by the client 22 and updates its states accordingly. In this case, the guardian controller 24 maintains a memory of previous queries by the client 22 and also dynamically updates the memory of the previous queries with any new and incoming queries. The guardian controller 24 also monitors the level of information provided to the client 22 by the server 16 in response to the queries.
[0029] After a predetermined number of queries (or predetermined threshold), the guardian controller 24 generates a first Shared Information Coefficient (SIC) value. The first SIC is a value that is, for example, between 0 and 1. The first SIC value generally corresponds to an indicator that identifies the level of information that has been provided from the server 16 to the client 22. The guardian controller 24 also generates a second SIC value. The second SIC value may correspond to an accuracy of the host AI model 19 that is based on a standard test set when the host AI model 19 is trained only on the queries made by the client 22. This helps the guardian AI model 25 keep track of the information that is shared with each client 22. Generally speaking, the guardian controller 24 generates the second SIC value prior to generating the first SIC value. In other words, the second SIC value is a prior approximation to the first SIC value, and the second SIC value can be generated readily by the guardian AI model 25 because of its training procedure described later. This second SIC value may be used and processed using statistical methods to obtain a good estimate of the first SIC value. In reference back to the first SIC value, if the guardian controller 24 determines that the first SIC value exceeds a predetermined threshold value for a particular client 22, then the guardian controller 24 generates an alert to indicate that the particular client 22 may be a threat or a malicious client 22. In one example, a first SIC value of 0.8 may indicate a greater threat than a first SIC value of 0.2.
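A minimal sketch of such a recurrent guardian is shown below. The use of PyTorch, a GRU cell, a fixed-length query embedding, the layer sizes, and the 0.8 threshold are assumptions made for illustration; the disclosure does not specify a particular network type beyond a Recurrent Neural Network, nor a particular threshold.

```python
# Sketch of a recurrent guardian that keeps a per-client state and emits a
# shared information coefficient (SIC) in [0, 1]. Assumes PyTorch.
import torch
import torch.nn as nn

class GuardianRNN(nn.Module):
    def __init__(self, query_dim=128, state_dim=64):
        super().__init__()
        self.cell = nn.GRUCell(query_dim, state_dim)  # recurrent state update
        self.sic_head = nn.Linear(state_dim, 1)       # maps the state to a SIC value

    def forward(self, query_embedding, saved_state):
        # Fold the new query into the saved state (saved state 27 -> current state 29).
        current_state = self.cell(query_embedding, saved_state)
        # Squash to [0, 1]; higher values indicate more shared information.
        sic = torch.sigmoid(self.sic_head(current_state))
        return sic, current_state

guardian = GuardianRNN()
state = torch.zeros(1, 64)          # one state per client
query = torch.randn(1, 128)         # embedding of an incoming query (placeholder)
sic, state = guardian(query, state)
is_suspicious = sic.item() > 0.8    # example threshold, not taken from the disclosure
```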
[0030] Figure 4 generally depicts the apparatus 14 for protecting a plurality of AI agents 22a - 22n (“22”) in accordance with one embodiment. The apparatus 14 in this case generally includes a plurality of clients 22a - 22n (“22”) that is operably coupled to the server 16. The server 16 includes a plurality of copies of the guardian AI models 25a - 25n (“25”) with each corresponding guardian AI model 25a - 25n being operably coupled to a corresponding client 22a - 22n, respectively. One or more guardian controllers 24 may be provided to include the corresponding guardian AI model 25. Each corresponding guardian AI model 25a - 25n may determine a first SIC value for a corresponding client 22a - 22n. In this regard, the server 16 may be configured to selectively block queries for the corresponding client 22a - 22n, respectively, in the event the SIC for the various clients 22a - 22n exceeds the predetermined threshold value. It is recognized that each client 22a - 22n may provide similar or different queries with the server 16 for a particular service or function to be performed by the host AI model 19. The server 16 may be configured to selectively block queries for any one or more of the clients 22a - 22n. Each guardian AI model 25a - 25n is to be trained and generates a corresponding second SIC value prior to generating the first SIC value. This aspect will be discussed in more detail below in connection with Figure 7.
[0031] Figure 5 generally depicts the apparatus 14 in more detail for monitoring queries from the client 22 in accordance with one embodiment. The guardian AI model 25, when executed, includes a saved state 27 and a current state 29. The saved state 27 generally corresponds to stored previous queries for the clients 22. The current state 29 generally corresponds to previously stored queries and new queries as requested by the clients 22.
[0032] Figure 6 generally depicts a method 50 for protecting the host AI model 19 from information theft in accordance with one embodiment.
[0033] In operation 52, the server 16 receives a query from the client 22 and updates the saved state 27 into the current state 29.
[0034] In operation 54, the server 16 determines when the guardian AI model 25 last computed or generated the SIC for the client 22.
[0035] For example, in operation 56, the server 16 determines whether the last SIC value (i.e., the first SIC value) was generated within a predetermined previous number of queries (e.g., 50 queries ago). If the server 16 determines that the last SIC value (i.e., the first SIC value) was generated within the predetermined previous number of queries, then the method 50 moves to operation 58. If not, then the method 50 moves back to operation 54.
[0036] In operation 58, the server 16 generates the first SIC value (or updates the second SIC value) based on the incoming query.
[0037] In operation 60, the server 16 compares the first SIC value to the predetermined threshold. If the first SIC value exceeds the predetermined threshold, then the method 50 moves to operation 62. If not, then the method 50 moves to operation 64.
[0038] In operation 62, the server 16 generates an alert and blocks the client 22. In this case, the server 16 raises a flag or notification so that a service provider takes appropriate action. The server 16 may be configured to continue to receive queries from the client(s) 22 prior to the flag being set and then ignore the queries after the flag or notification has been set. In another case, the server 16 may disconnect itself from the client(s) 22 and no longer receive the queries.
[0039] In operation 64, the server 16 reads the query and the host AI model 19 executes the desired operation based on the query for the client 22. As noted above, for example, the host AI model 19 may provide image classification for an image that is transmitted from the client 22.
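One reading of operations 52 through 64 is sketched below. The Python framing, the 50-query recompute interval, the 0.8 threshold, and the helper names compute_sic and run_host_model are assumptions introduced for illustration; they stand in for the guardian AI model 25 and the host AI model 19 rather than reproducing the disclosure's implementation.

```python
# Sketch of the per-query flow of method 50 (Figure 6), under the assumptions above.
from dataclasses import dataclass, field

SIC_RECOMPUTE_INTERVAL = 50   # assumed: regenerate the SIC every 50 queries (operation 56)
SIC_THRESHOLD = 0.8           # assumed predetermined threshold (operation 60)

@dataclass
class ClientState:
    saved_state: list = field(default_factory=list)  # memory of previous queries (saved state 27)
    queries_since_sic: int = 0
    sic: float = 0.0
    blocked: bool = False

def handle_query(client, query, compute_sic, run_host_model):
    if client.blocked:
        return None                                         # queries are ignored after the flag is set
    client.saved_state.append(query)                        # operation 52: update the state
    client.queries_since_sic += 1
    if client.queries_since_sic >= SIC_RECOMPUTE_INTERVAL:  # operations 54/56: time to regenerate?
        client.sic = compute_sic(client.saved_state)        # operation 58: generate the SIC value
        client.queries_since_sic = 0
    if client.sic > SIC_THRESHOLD:                          # operation 60: compare to the threshold
        client.blocked = True                               # operation 62: alert and block the client
        return None
    return run_host_model(query)                            # operation 64: serve the query
```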
[0040] Figure 7 generally illustrates a method 80 for creating a dataset to train the apparatus 14 in accordance with one embodiment.
[0041] In operation 82, the server 16 divides the training dataset of the host AI model 19 into sub-chunks of variable sizes. In particular, the server 16 divides the training dataset into sets of blocks of increasing size. For example, assume that there are 1000 inquiries; these can be divided into 6 chunks of 50 queries, 4 chunks of 100 queries, and 1 chunk of 300 queries.
[0042] In operation 83, the server 16, for each chunk of data, proceeds to operations 84, 86, and 88. The server 16 may execute the operations 84, 86, and 88 concurrently (or in a parallel manner) for each chunk of data, or execute the operations 84, 86, and 88 serially for each chunk of data.
[0043] In operation 84, the server 16, for each sub-chunk, trains the host AI model 19 for the corresponding sub-chunk.
[0044] In operation 86, the server 16, for each sub-chunk, computes an accuracy of the host
AI model 19 based on a standard test dataset (i.e., the second SIC value for this sequence of queries).
[0045] In operation 88, the server 16 records the pair formed by this sequence of queries and the resulting SIC as a data point for training the guardian AI model 25.
[0046] In general, all of the collected data points from the method 80 form a dataset for the guardian AI model 25. The method 80 splits the dataset, and tests/trains the guardian AI model 25. As noted above, assume, for example, that there are 1000 inquiries. These can be divided into 6 chunks of 50 queries, 4 chunks of 100 queries, and 1 chunk of 300 queries. The server 16 trains the host AI model 19 individually on each of the noted chunks and obtains a certain accuracy. This creates a dataset of 11 (6 + 4 + 1) data points for the guardian AI model 25. Generally, in practice, since there are millions of data points for the host AI model 19, it is possible to obtain, for example, at least one thousand data points for the guardian AI model 25. Since the host AI model 19 is being trained repeatedly, from scratch, this may appear to consume a lot of time. However, the host AI model 19 is being trained on a limited dataset and hence the individual training times are relatively small in comparison to the amount of time required to train on an entire dataset.
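A compact sketch of this dataset-construction loop follows. The function names train_host_from_scratch and evaluate_on_standard_test_set are placeholders introduced here (they stand in for training and evaluating the host AI model 19), and the chunk sizes simply reuse the 1000-query example above.

```python
# Sketch of method 80 (Figure 7): build (query chunk, SIC) data points for the guardian.
def build_guardian_dataset(training_queries, train_host_from_scratch,
                           evaluate_on_standard_test_set):
    chunk_sizes = [50] * 6 + [100] * 4 + [300]         # 6 + 4 + 1 = 11 chunks (operation 82)
    dataset, start = [], 0
    for size in chunk_sizes:                            # operation 83: handle each chunk
        chunk = training_queries[start:start + size]
        start += size
        host = train_host_from_scratch(chunk)           # operation 84: train on the sub-chunk only
        sic = evaluate_on_standard_test_set(host)       # operation 86: accuracy = second SIC value
        dataset.append((chunk, sic))                    # operation 88: one labeled data point
    return dataset                                      # later split to train/test the guardian
```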
[0047] In general, the server 16 by way of the guardian AI model 25 is configured to inspect queries provided by the client 22 and to determine the level of information that is shared with the client 22. The guardian AI model 25 generates the first SIC value which corresponds to the level of information shared with the client 22. Figure 8 illustrates an example in which the guardian AI model 25 is arranged to classify one or more images of different plants 102 (hereafter “images of different plants 102”). The client 22 provides the images of different plants 102 to the guardian AI model 25 to classify the image of the different plants 102 with a species to which the respective plant belongs. The guardian AI model 25 classifies the images of the different plants 102 with a corresponding species and provides the species type back to the client 22. During this exchange, the server 16 generates the first SIC value which corresponds to the level of information that is shared with the client 22 throughout this exchange.
[0048] Figure 9 generally illustrates an example of training the guardian AI model 25 in accordance with one embodiment. In general, a machine learning model is used to train the guardian AI model 25 with the various species for each of the different plants 102. The guardian AI model 25 generally requires labeled data, i.e., labeled sets of images. Figure 9 illustrates the second SIC values (or labels) and the data (i.e., sets of images) for training the guardian AI model 25.
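For illustration, a supervised training loop of the kind Figure 9 implies might look like the sketch below. It assumes PyTorch, reuses the GuardianRNN sketch given earlier, and assumes each training example is a sequence of query embeddings paired with its second SIC label; none of these specifics are prescribed by the disclosure.

```python
# Sketch of supervised training of the guardian on (query sequence, second SIC) pairs.
import torch

def train_guardian(guardian, labeled_data, epochs=10, lr=1e-3):
    """labeled_data: list of (list of query-embedding tensors, sic_label) pairs."""
    optimizer = torch.optim.Adam(guardian.parameters(), lr=lr)
    loss_fn = torch.nn.MSELoss()
    for _ in range(epochs):
        for query_embeddings, sic_label in labeled_data:
            state = torch.zeros(1, 64)                        # fresh state per example
            for q in query_embeddings:                        # feed the query sequence in order
                sic, state = guardian(q, state)
            loss = loss_fn(sic, torch.tensor([[sic_label]]))  # regress the final SIC estimate
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```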
[0049] As noted above, it is generally necessary to define or establish the SIC by creating a training set for the guardian AI model 25 prior to training it or generating the first SIC value. The second SIC value may also be defined for a set of images as an accuracy of the host AI model 19 on a standard test set if the host AI model 19 is trained on the set of images and predicts the species of the plants. The second SIC value captures how diverse any set of the images is. The greater the diversity, the greater the accuracy of the host AI model 19 when trained upon this set of images.
[0050] Figure 10 generally illustrates the second SIC value and the training data for the guardian AI model 25 in accordance with one embodiment. In general, obtaining a single data point (data and the first SIC value) in the above manner requires a complete training of the host AI model 19. For example, to create a training set of one million data points for the guardian AI model 25, the host AI model 19 has to be trained one million times.
[0051] In general, the host AI model 19 may be trained a large number of times. However, in any given instance, the host AI model 19 may be trained on a small subset of images and hence the individual training time may be small. The total time that is required for creating the training set for the guardian AI model 25 is linearly proportional to the total number of images in all of the subsets, which can be the same as the size of the original training set for the host AI model 19. In this regard, the total setup time for creating the host AI model 19 and the guardian AI model 25 roughly doubles.
[0052] While exemplary embodiments are described above, it is not intended that these embodiments describe all possible forms of the invention. Rather, the words used in the specification are words of description rather than limitation, and it is understood that various changes may be made without departing from the spirit and scope of the invention. Additionally, the features of various implementing embodiments may be combined to form further embodiments of the invention.

Claims

WHAT IS CLAIMED IS:
1. A system for protecting an artificial intelligence (AI) based apparatus, the system comprising:
a memory device; and
a server including the memory device to provide an AI based service for a client, the server being configured to:
receive a number of inquiries from the client for the AI based service; and generate a first shared information coefficient (SIC) value after receiving the number of inquiries from the client to determine whether the client is providing targeted inquiries with an intention of information theft from the AI based service.
2. The system of claim 1, wherein the server is further configured to compare the first SIC value to a predetermined threshold.
3. The system of claim 2, wherein the server is further configured to perform a function corresponding to an inquiry requested by the client in response to the first SIC value being below the predetermined threshold.
4. The system of claim 2, wherein the server is further configured to block any additional inquiries from the client in response to the first SIC value exceeding the predetermined threshold.
5. The system of claim 2, wherein the server is further configured to generate an alert to notify a service provider that the first SIC value has exceeded the predetermined threshold.
6. The system of claim 2, wherein the server is further configured to disconnect from the client in response to the first SIC value exceeding the predetermined threshold.
7. The system of claim 1, wherein the first SIC value corresponds to an indicator of an amount of information shared by the server with the client.
8. The system of claim 1, wherein the first SIC value is a value between 0 and 1.
9. The system of claim 1, wherein the server is further configured to generate a second SIC value which corresponds to an accuracy of a host AI model that is executed by the server.
10. The system of claim 9, wherein the server is further configured to generate the second SIC value prior to the first SIC value to enable the server to monitor an amount of information shared by the server with the client.
11. A method for protecting an artificial intelligence (AI) based apparatus, the method comprising:
providing an AI based service for a client;
receiving a number of inquiries from the client for the AI based service; generating a first shared information coefficient (SIC) value after receiving the number of inquiries from the client; and
determining whether the client is providing targeted inquiries with an intention of information theft from the AI based service based on the first SIC value.
12. The method of claim 11, wherein determining whether the client is providing targeted inquiries includes comparing the first SIC value to a predetermined threshold.
13. The method of claim 12 further comprising executing a function corresponding to an inquiry requested by the client in response to the first SIC value being below the predetermined threshold.
14. The method of claim 12 further comprising blocking any additional inquiries from the client in response to the first SIC value exceeding the predetermined threshold.
15. The method of claim 12 further comprising generating an alert to notify a service provider that the first SIC value has exceeded the predetermined threshold.
16. The method of claim 12 further comprising disconnecting from the client in response to the first SIC value exceeding the predetermined threshold.
17. The method of claim 11, wherein the first SIC value corresponds to an indicator of an amount of information shared by a server with the client.
18. The method of claim 11 further comprising generating the second SIC value prior to the first SIC value to enable a server to monitor an amount of information shared by the server with the client.
19. A computer-program product embodied in a non-transitory computer readable medium that is programmed for protecting an artificial intelligence (AI) based apparatus, the computer-program product comprising instructions for:
providing an AI based service for a plurality of clients;
receiving a number of inquiries from the plurality of clients for the AI based service; generating a shared information coefficient (SIC) value for each of the plurality of clients after receiving the number of inquiries from the plurality of clients; and
selectively determining whether any of the plurality of clients is a malicious client with an intention of information theft from the AI based service based on the SIC value.
20. The computer-program product of claim 19 further comprising instructions for:
comparing the SIC value for each of the clients to a predetermined threshold;
determining that a first client of the plurality of clients is providing targeted inquiries with the intention of information theft in response to the SIC value for the first client being greater than the predetermined threshold; and
determining that any remaining clients of the plurality of clients are not providing targeted inquiries in response to the SIC value for each of the remaining clients being less than the predetermined threshold.
PCT/IB2019/050365 2018-05-25 2019-01-16 Apparatus and method for protecting artificial intelligent agents from information theft WO2019224612A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IN201841019617 2018-05-25
IN201841019617 2018-05-25

Publications (1)

Publication Number Publication Date
WO2019224612A1 true WO2019224612A1 (en) 2019-11-28

Family

ID=65516680

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2019/050365 WO2019224612A1 (en) 2018-05-25 2019-01-16 Apparatus and method for protecting artificial intelligent agents from information theft

Country Status (1)

Country Link
WO (1) WO2019224612A1 (en)

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
HINTON ET AL., DISTILLING THE KNOWLEDGE IN A NEURAL NETWORK, 9 March 2015 (2015-03-09)
MANISH KESARWANI ET AL: "Model Extraction Warning in MLaaS Paradigm", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 20 November 2017 (2017-11-20), XP080838605 *
MIKA JUUTI ET AL: "PRADA: Protecting against DNN Model Stealing Attacks", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 7 May 2018 (2018-05-07), XP080875642 *
WILL KNIGHT: "The Dark Secret at the Heart of AI", MIT TECHNOLOGY REVIEW, 11 April 2017 (2017-04-11)

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19706741

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19706741

Country of ref document: EP

Kind code of ref document: A1