WO2021159819A1 - Machine learning model protection method and device - Google Patents

Machine learning model protection method and device

Info

Publication number
WO2021159819A1
Authority
WO
WIPO (PCT)
Prior art keywords
machine learning
learning model
protection
input parameter
function
Prior art date
Application number
PCT/CN2020/132839
Other languages
French (fr)
Chinese (zh)
Inventor
刘永超
金跃
陈勇
张尧
滕腾
欧航
Original Assignee
支付宝(杭州)信息技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 支付宝(杭州)信息技术有限公司
Publication of WO2021159819A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/10 Protecting distributed programs or content, e.g. vending or licensing of copyrighted material; Digital rights management [DRM]
    • G06F21/12 Protecting executable software
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Definitions

  • This specification relates to the field of computer technology, and in particular to the field of information security.
  • a method for protecting a machine learning model based on a domain-specific language compiler includes: for each of one or more protection strategies for the machine learning model, receiving a user's instruction to call a corresponding function and receiving input parameter values of the function; and, based on the one or more functions respectively for the one or more protection strategies and the corresponding input parameter values, automatically generating machine-executable code for protecting the machine learning model.
  • a machine learning model protection device based on a domain-specific language compiler, including: a receiving unit configured to, for each of one or more protection strategies for the machine learning model, receive a user's instruction to call a corresponding function and receive input parameter values of the function; and a code generation unit configured to automatically generate machine-executable code for protecting the machine learning model, based on the one or more functions respectively for the one or more protection strategies and the corresponding input parameter values.
  • a system for generating a machine learning model including: a machine learning model generation device for generating a machine learning model; and the domain-specific language compiler-based machine learning model protection device according to the embodiments of this specification, which is used to generate machine-executable code for protecting the machine learning model.
  • the domain-specific language (DSL) compiler makes each protection strategy of a machine learning model parameterizable. By setting different input parameters, different machine-executable code is automatically generated for each protection strategy, which gives each machine learning model its own specific protection. Even if an attacker cracks one machine learning model, the executable code corresponding to the protection strategies differs from model to model, so the cost of transferring the attack to other machine learning models is not reduced. As a result, machine learning models are protected more reliably.
  • Figure 1 shows a protection architecture diagram of a machine learning model in one case
  • Figure 2 shows a machine learning model protection method based on a domain-specific language compiler according to an embodiment
  • Figure 3a shows a predefined function according to an embodiment
  • Figure 3b shows function call and fusion according to one embodiment
  • Figure 4 shows a system for generating a machine learning model according to one embodiment.
  • One or more protection strategies can be used to protect the machine learning model, including encryption, computational graph obfuscation, and/or weight data obfuscation.
  • the user can select one or more protection strategies from the group including encryption, calculation graph obfuscation, and weight data obfuscation to form his unique protection logic for the machine learning model.
  • the machine learning model protection device described below can display the currently available protection strategies to the user, and the user selects a specific protection strategy among them to protect the current machine learning model.
  • Figure 1 shows a protection architecture diagram of a machine learning model in one case.
  • the machine learning model generated by artificial intelligence means, that is, the machine-executable program implementing the machine learning model, can be protected by specific protection logic consisting of computational graph obfuscation, weight data obfuscation, and encryption. The final output is machine-executable code for model protection together with a custom model format. Other protection logic can also be formed by selecting different combinations of protection strategies.
  • Fig. 2 shows a machine learning model protection method 100 based on a DSL compiler according to an embodiment.
  • the protection method 100 can perform the following processing.
  • a user's instruction is received to call a corresponding function.
  • the one or more protection strategies can be selected by the user specifically for the current machine learning model, in particular from the group comprising encryption, computational graph obfuscation, and weight data obfuscation. This lets the user specify user-specific protection logic for each machine learning model.
  • These functions are predefined in the DSL compiler for the respective protection strategies. For example, function A represents encryption, function B represents computational graph obfuscation, and function C represents weight data obfuscation.
  • the user can input instructions to sequentially call the functions B, C, and A as the protection logic for the machine learning model.
  • Figure 3a shows an example of a predefined function in the DSL compiler.
  • the input parameter value of the function for each protection strategy is received.
  • the input parameter value can be set by the user according to his needs.
  • the input parameter value can be different for different machine learning models. Referring to the example shown in Figure 1, input parameter values are received for functions B, C, and A respectively.
  • the input parameter value for each protection strategy can be randomly generated, and then the randomly generated input parameter value can be received.
  • although receiving the function call and receiving the function's input parameter values are described separately in processes 110 and 120, it can be understood that they can be executed in the same process.
  • the user's instruction can include the designation of the input parameter value.
  • machine executable codes for protecting the current machine learning model are automatically generated.
  • machine executable code for each protection strategy can be automatically generated based on the corresponding function and input parameter values.
  • machine-executable code implementing the corresponding functionality (i.e., computational graph obfuscation, weight data obfuscation, encryption) is automatically generated for functions B, C, and A and their input parameter values.
  • the protection code can be provided to users together with the machine learning model.
  • multiple functions corresponding to the multiple protection strategies can be selectively fused before the machine-executable code is automatically generated, and the machine-executable code is then generated from the fused functions, which makes the code logic even harder to understand.
  • At least two of the multiple functions corresponding to multiple protection strategies can be fused to generate a fused function; the corresponding machine-executable code can then be automatically generated based on the fused function and the corresponding input parameter values.
  • For those functions that are not fused, the corresponding machine-executable code can still be automatically generated based on the function and the corresponding input parameter values.
  • all of the functions corresponding to the multiple protection strategies are fused to generate a single fused function, and the corresponding machine-executable code is then automatically generated based on the fused function and the corresponding input parameter values.
  • Figure 3a shows the predefined functions E and F in the DSL compiler according to one embodiment.
  • Figure 3b shows the calling and fusion of functions E and F according to one embodiment.
  • the functions E and F may be functions predefined in the DSL compiler corresponding to different protection strategies.
  • the predefined functions E and F are shown in Figure 3a.
  • the user can input the instructions E_func(x,len) and F_func(x,len) to call the functions E and F and input the corresponding parameter values, so that the DSL compiler automatically generates the corresponding machine-executable code.
  • the DSL compiler can first fuse the functions E and F to obtain the fused function shown in Figure 3b, and then generate machine-executable code based on that function.
  • FIG. 4 shows a system 10 for generating a machine learning model according to one embodiment.
  • the system 10 includes a machine learning model generation device 11, which is used to generate a machine learning model, and a DSL compiler-based machine learning model protection device 12, which generates machine executable code for protecting the machine learning model according to different protection strategies.
  • the protection device 12 includes a receiving unit 121 and a code generating unit 122.
  • the receiving unit 121 receives an instruction from a user to call a corresponding function for each protection strategy of one or more protection strategies of the current machine learning model, and receives input parameter values of the function.
  • the called function is predefined and can be stored in the memory 13. It is also conceivable that the memory is part of the protection device 12.
  • the code generation unit 122 automatically generates machine executable code for protecting the current machine learning model based on one or more functions for one or more protection strategies and corresponding input parameter values.
  • the code generation unit 122 automatically generates machine executable code for the protection strategy based on the corresponding function and input parameter value for each protection strategy in one or more protection strategies.
  • the code generation unit 122 fuses at least two of the multiple functions corresponding to the multiple protection strategies to generate a fused function, and then automatically generates the corresponding machine-executable code based on the fused function and the corresponding input parameter values.
  • the system may further include a random number generating unit (not shown) configured to randomly generate an input parameter value for each protection strategy, and the receiving unit 121 receives the randomly generated input parameter value.
  • the random number generating unit can be expected to be a part of the protection device 12.
  • the code generation unit 122 performs the function fusion described above and various processes related to code generation corresponding to the fusion function. It can be expected to add various functional units or modules of the protection device of this specification on the basis of the existing DSL compiler.
  • the above-mentioned receiving unit 121 and code generating unit 122 are implemented as DSL compiler modules by a DSL compiler.
  • although the DSL compiler-based machine learning model protection device 12 is described above as part of the system 10 for generating the machine learning model, it is conceivable to use the protection device 12 as a standalone device.
  • the receiving unit of the protection device 12 can also receive the user's selection of the protection strategy.
  • the protection device 12 can include a display unit, which can display the currently selectable protection strategy and the corresponding instruction to the user, and the user can input the instruction based on his own selection of the protection strategy to call the corresponding function. Further, the display unit can also prompt the user to input the corresponding parameter value for the specific protection strategy selected by the user.
  • the exemplary embodiments of this specification cover both of the following: creating/using the computer program/software of this specification from the beginning, and converting an existing program/software into a computer program/software using this specification by means of an update.
  • a machine-readable medium, such as a computer-readable medium, has computer program code stored thereon; when the computer program code is executed, the computer or the processor executes the method according to the embodiments of this specification.
  • the machine-readable medium is, for example, an optical storage medium or a solid-state medium supplied with or as part of other hardware.
  • the computer program for executing the method according to the various embodiments of the present specification may also be distributed in other forms, for example, via the Internet or other wired or wireless telecommunication systems.
  • the computer program can also be provided on a network such as the World Wide Web and can be downloaded from such a network to the working computer of the data processor.
  • the system according to this specification can be implemented by a memory and a processor.
  • the memory can store computer program codes for running the method procedures according to the various embodiments of this specification; when running the program codes from the memory, the processor executes the procedures according to the various embodiments of this specification.

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Technology Law (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Stored Programmes (AREA)

Abstract

A domain-specific language compiler-based machine learning model protection method and protection device. The protection method (100) comprises: for each of one or more protection policies of the machine learning model, receiving an instruction from a user so as to invoke corresponding functions (110); receiving input parameter values of the functions; and on the basis of one or more functions respectively for the one or more protection policies and the corresponding input parameter values, automatically generating a machine-executable code for protecting the machine learning model (130).

Description

Machine learning model protection method and device
Technical field
This specification relates to the field of computer technology, and in particular to the field of information security.
Background
With the arrival of the intelligent Internet of Things era, more and more artificial intelligence algorithms are being deployed in cloud applications and in applications on terminal devices. The artificial intelligence algorithms used in services such as face-scan payment, face-scan login, unmanned supermarkets, and unmanned banks may come under attack, creating financial risk. Attackers usually do not know the specific structure of the machine learning model or the data features used for training, so they typically mount black-box attacks: they try different inputs, obtain the corresponding outputs, and observe those outputs to infer how the model works and to discover system vulnerabilities.
At present, strategies such as encryption, private model formats, computational graph obfuscation, and weight data obfuscation are commonly used to protect machine learning models. However, there is still a need for more reliable protection of machine learning models.
Summary of the invention
It is desirable to provide a machine learning model protection method and device, based on a domain-specific language compiler, that can protect machine learning models more reliably.
According to one aspect, a machine learning model protection method based on a domain-specific language compiler is provided, comprising: for each of one or more protection strategies for the machine learning model, receiving a user's instruction to call a corresponding function and receiving input parameter values of the function; and, based on the one or more functions respectively for the one or more protection strategies and the corresponding input parameter values, automatically generating machine-executable code for protecting the machine learning model.
According to another aspect, a machine learning model protection device based on a domain-specific language compiler is provided, comprising: a receiving unit configured to, for each of one or more protection strategies for the machine learning model, receive a user's instruction to call a corresponding function and receive input parameter values of the function; and a code generation unit configured to automatically generate machine-executable code for protecting the machine learning model, based on the one or more functions respectively for the one or more protection strategies and the corresponding input parameter values.
According to yet another aspect, a system for generating a machine learning model is provided, comprising: a machine learning model generation device for generating a machine learning model; and the domain-specific language compiler-based machine learning model protection device according to the embodiments of this specification, for generating machine-executable code that protects the machine learning model.
According to the embodiments of each aspect of this specification, the domain-specific language (DSL) compiler makes each protection strategy of a machine learning model parameterizable. By setting different input parameters, different machine-executable code is automatically generated for each protection strategy, which gives each machine learning model its own specific protection. Even if an attacker cracks one machine learning model, the executable code corresponding to the protection strategies differs from model to model, so the cost of transferring the attack to other machine learning models is not reduced. Machine learning models are thereby protected more reliably.
Description of the drawings
Figure 1 shows a machine learning model protection architecture in one scenario;
Figure 2 shows a machine learning model protection method based on a domain-specific language compiler according to one embodiment;
Figure 3a shows predefined functions according to one embodiment;
Figure 3b shows function calling and fusion according to one embodiment;
Figure 4 shows a system for generating a machine learning model according to one embodiment.
Aspects and features of this specification are described with reference to the above drawings. The same or similar reference numerals are generally used to denote the same parts. The drawings are merely schematic and not restrictive. Without departing from the gist of this specification, the size, shape, reference numerals, or appearance of the elements in the drawings may vary and are not limited to what the drawings show.
Detailed description
One or more protection strategies can be used to protect a machine learning model, including encryption, computational graph obfuscation, and/or weight data obfuscation. Preferably, the user can select one or more protection strategies from the group comprising encryption, computational graph obfuscation, and weight data obfuscation to form the user's own protection logic for the machine learning model. The machine learning model protection device described below can display the currently available protection strategies to the user, who selects specific strategies from among them to protect the current machine learning model.
Figure 1 shows a machine learning model protection architecture in one scenario. A machine learning model generated by artificial intelligence means, that is, the machine-executable program implementing the machine learning model, can be protected by specific protection logic consisting of computational graph obfuscation, weight data obfuscation, and encryption. The final output is machine-executable code for model protection together with a custom model format. Other protection logic can also be formed by selecting different combinations of protection strategies.
Figure 2 shows a machine learning model protection method 100 based on a DSL compiler according to one embodiment. The protection method 100 can perform the following processing.
At 110, for each of one or more protection strategies for the current machine learning model, a user's instruction is received to call a corresponding function. The one or more protection strategies can be selected by the user specifically for the current machine learning model, in particular from the group comprising encryption, computational graph obfuscation, and weight data obfuscation. This lets the user specify user-specific protection logic for each machine learning model. The functions are predefined in the DSL compiler for the respective protection strategies. For example, function A represents encryption, function B represents computational graph obfuscation, and function C represents weight data obfuscation. Thus, referring to the example shown in Figure 1, the user can enter instructions that call functions B, C, and A in sequence as the protection logic for the machine learning model. Figure 3a shows an example of functions predefined in the DSL compiler.
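The ordered call of predefined functions at 110 can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the specification does not give the bodies of functions A, B, and C, so the three toy byte transforms, the `PREDEFINED` registry, and `apply_protection_logic` are hypothetical stand-ins, and the XOR step is not real encryption.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins for the predefined functions A, B and C.
# Each takes the model bytes plus one integer parameter and returns new bytes.
def A_encrypt(data: bytes, key: int) -> bytes:
    # Toy XOR step; a real system would use a vetted cipher.
    return bytes(b ^ (key & 0xFF) for b in data)

def B_graph_obfuscate(data: bytes, shift: int) -> bytes:
    # Toy stand-in: rotate the byte sequence left by `shift` positions.
    if not data:
        return data
    s = shift % len(data)
    return data[s:] + data[:s]

def C_weight_obfuscate(data: bytes, offset: int) -> bytes:
    # Toy stand-in: add a constant offset to every byte, modulo 256.
    return bytes((b + offset) % 256 for b in data)

# Registry of predefined functions, keyed by the names used in the text.
PREDEFINED: Dict[str, Callable[[bytes, int], bytes]] = {
    "A": A_encrypt,
    "B": B_graph_obfuscate,
    "C": C_weight_obfuscate,
}

def apply_protection_logic(data: bytes, calls: List[Tuple[str, int]]) -> bytes:
    """Apply the user's (function name, parameter) calls in the order given."""
    for name, param in calls:
        data = PREDEFINED[name](data, param)
    return data

# Protection logic B, C, A, as in the Figure 1 example.
protected = apply_protection_logic(b"model", [("B", 2), ("C", 7), ("A", 0x5A)])
```

Keeping the protection logic as an ordered list of (name, parameter) pairs means the same registry can serve any combination or order of strategies the user selects.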
At 120, the input parameter values of the function for each protection strategy are received. The input parameter values can be set by the user as required. In particular, the input parameter values can differ between machine learning models. Referring to the example shown in Figure 1, input parameter values are received for functions B, C, and A respectively.
In one embodiment, the input parameter values for each protection strategy can be randomly generated, and the randomly generated values are then received.
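Random generation of parameter values could look like the following sketch. The function name `random_parameters`, the 32-bit width, and the choice of Python's `secrets` module as the random source are assumptions for illustration; the specification does not say how the values are drawn.

```python
import secrets
from typing import Dict, List

def random_parameters(strategies: List[str], bits: int = 32) -> Dict[str, int]:
    """Draw one independent random integer parameter per protection strategy."""
    return {name: secrets.randbits(bits) for name in strategies}

# Each model gets its own draw, so the parameters generally differ per model.
params_model_1 = random_parameters(["B", "C", "A"])
params_model_2 = random_parameters(["B", "C", "A"])
```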
Although receiving the function call and receiving the function's input parameter values are described separately as processes 110 and 120, it can be understood that they can be executed in the same process; for example, the corresponding input parameter values are preferably received at the same time as the function call for a given protection strategy. In that case, the user's instruction can include the specification of the input parameter values.
At 130, machine-executable code for protecting the current machine learning model is automatically generated based on the one or more functions respectively for the one or more protection strategies and the corresponding input parameter values.
In one embodiment, for each of the one or more protection strategies, machine-executable code for that protection strategy can be automatically generated based on the corresponding function and input parameter values. Referring to the example shown in Figure 1, this process automatically generates, for functions B, C, and A and their input parameter values, machine-executable code implementing the corresponding functionality (i.e., computational graph obfuscation, weight data obfuscation, encryption). This code constitutes the protection code for the machine learning model. The protection code can be provided to the user together with the machine learning model.
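To make the idea of parameter-dependent code generation concrete, here is a small sketch that uses Python source text as a stand-in for machine-executable code. The template, the `scale` and `offset` parameters, and the helper names are all hypothetical; the point is only that the same predefined function yields different generated code for different parameter values.

```python
from string import Template

# Hypothetical source template for a weight-obfuscation routine.
WEIGHT_TEMPLATE = Template(
    "def protect_weights(weights):\n"
    "    # scale and offset are baked in as literals at generation time\n"
    "    return [w * $scale + $offset for w in weights]\n"
)

def generate_code(scale: int, offset: int) -> str:
    """Generate source text with the parameter values inlined."""
    return WEIGHT_TEMPLATE.substitute(scale=scale, offset=offset)

def load(source: str):
    """Execute generated source and return the function it defines."""
    namespace = {}
    exec(source, namespace)
    return namespace["protect_weights"]

# Two models, two parameter sets, two different pieces of generated code.
code_1 = generate_code(scale=3, offset=11)
code_2 = generate_code(scale=5, offset=2)
protect_1 = load(code_1)
protect_2 = load(code_2)
```

Because the parameters appear as literals in the generated code rather than as runtime arguments, understanding `code_1` does not directly reveal what `code_2` does.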
In another embodiment, when multiple protection strategies are used for the current machine learning model, the functions corresponding to those strategies can be selectively fused before the machine-executable code is automatically generated. The machine-executable code is then generated from the fused functions, which makes the code logic even harder to understand.
Specifically, at least two of the functions respectively corresponding to the multiple protection strategies can be fused to generate a fused function; the corresponding machine-executable code is then automatically generated based on the fused function and the corresponding input parameter values. For functions that are not fused, the corresponding machine-executable code can still be automatically generated based on the function and the corresponding input parameter values.
Preferably, all of the functions respectively corresponding to the multiple protection strategies are fused to generate a single fused function, and the corresponding machine-executable code is then automatically generated based on that fused function and the corresponding input parameter values.
It is also contemplated that the functions corresponding to the multiple protection strategies are fused in groups to generate multiple fused functions, and the corresponding machine-executable code is then generated based on those fused functions and the corresponding input parameters.
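The three fusion variants (fusing some functions, fusing all of them, or fusing in groups) can be sketched with plain function composition. This is an illustrative assumption: the specification does not state how the DSL compiler fuses functions, and `fuse`, `fuse_in_groups`, and the three integer transforms are hypothetical.

```python
from functools import reduce
from typing import Callable, List, Sequence

Transform = Callable[[int], int]

def fuse(functions: Sequence[Transform]) -> Transform:
    """Compose functions left to right into one callable."""
    def fused(x: int) -> int:
        return reduce(lambda acc, f: f(acc), functions, x)
    return fused

def fuse_in_groups(functions: List[Transform], groups: List[List[int]]) -> List[Transform]:
    """Fuse each group of indices; a one-element group leaves that function unfused."""
    return [fuse([functions[i] for i in group]) for group in groups]

# Hypothetical stand-ins for three protection functions.
add_3 = lambda x: x + 3
times_2 = lambda x: x * 2
xor_5 = lambda x: x ^ 5

# All three fused into a single function.
all_fused = fuse_in_groups([add_3, times_2, xor_5], [[0, 1, 2]])
# First two fused, third left on its own.
partly_fused = fuse_in_groups([add_3, times_2, xor_5], [[0, 1], [2]])
```

Applying the resulting functions in order gives the same result as applying the original functions one after another; only the grouping, and hence the visible structure, changes.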
Figure 3a shows functions E and F predefined in the DSL compiler according to one embodiment. Figure 3b shows the calling and fusion of functions E and F according to one embodiment. Functions E and F can be functions predefined in the DSL compiler that correspond to different protection strategies. In the general embodiment, the user can enter the instructions E_func(x,len) and F_func(x,len) to call functions E and F and supply the corresponding parameter values, so that the DSL compiler automatically generates the corresponding machine-executable code. In the fusion embodiment described above, the DSL compiler can first fuse functions E and F to obtain the fused function shown in Figure 3b, and then generate machine-executable code based on that function.
The embodiments have been described above with reference to the DSL compiler-based method for protecting a machine learning model. It can be understood that the processing steps of the methods can be split, rearranged, or combined to achieve the corresponding functions.
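Since the contents of Figures 3a and 3b are not reproduced here, the following sketch only guesses at what fusing E_func(x,len) and F_func(x,len) might look like. The bodies of both functions are hypothetical element-wise loops, the fused function `EF_fused` is an assumed name, and the parameter is spelled `length` to avoid shadowing Python's built-in `len`.

```python
from typing import List

def E_func(x: List[int], length: int) -> None:
    # Hypothetical body: add 1 to each of the first `length` elements, in place.
    for i in range(length):
        x[i] = x[i] + 1

def F_func(x: List[int], length: int) -> None:
    # Hypothetical body: double each of the first `length` elements, in place.
    for i in range(length):
        x[i] = x[i] * 2

def EF_fused(x: List[int], length: int) -> None:
    # One loop performs both element-wise steps, so the boundary
    # between E and F no longer appears in the code.
    for i in range(length):
        x[i] = (x[i] + 1) * 2

# Calling E then F separately.
separate = [1, 2, 3, 4]
E_func(separate, 3)
F_func(separate, 3)

# Calling the fused function once.
fused = [1, 2, 3, 4]
EF_fused(fused, 3)
```

This kind of loop fusion is valid here because both hypothetical steps act on each element independently; functions with cross-element dependencies would need a different fusion scheme.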
Figure 4 shows a system 10 for generating a machine learning model according to one embodiment. The system 10 includes a machine learning model generation device 11 for generating a machine learning model, and a DSL compiler-based machine learning model protection device 12 that generates machine-executable code protecting the machine learning model according to different protection strategies. The protection device 12 includes a receiving unit 121 and a code generation unit 122. For each of one or more protection strategies for the current machine learning model, the receiving unit 121 receives a user's instruction to call a corresponding function and receives the input parameter values of the function. The called functions are predefined and can be stored in a memory 13. It is also conceivable for the memory to be part of the protection device 12. The code generation unit 122 automatically generates machine-executable code for protecting the current machine learning model based on the one or more functions respectively for the one or more protection strategies and the corresponding input parameter values.
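The division of work between the receiving unit 121 and the code generation unit 122 can be sketched as two small Python classes. The class names mirror the units in the text, but their interfaces, the `TEMPLATES` dictionary standing in for the predefined functions in memory 13, and the generated one-line snippets are all assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical predefined templates, standing in for functions held in memory 13.
TEMPLATES: Dict[str, str] = {
    "encrypt": "out = [b ^ {p} for b in data]",
    "weight_obfuscate": "out = [b + {p} for b in data]",
}

@dataclass
class ReceivingUnit:
    """Collects (function name, parameter value) pairs from user instructions."""
    calls: List[Tuple[str, int]] = field(default_factory=list)

    def receive(self, name: str, param: int) -> None:
        # Reject calls to functions that are not predefined.
        if name not in TEMPLATES:
            raise ValueError(f"unknown protection function: {name}")
        self.calls.append((name, param))

class CodeGenerationUnit:
    """Turns the received calls into one snippet of source text per call."""
    def generate(self, calls: List[Tuple[str, int]]) -> List[str]:
        return [TEMPLATES[name].format(p=param) for name, param in calls]

receiver = ReceivingUnit()
receiver.receive("weight_obfuscate", 4)
receiver.receive("encrypt", 9)
generated = CodeGenerationUnit().generate(receiver.calls)
```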
在一个实施例中,该代码生成单元122针对一种或多种保护策略中的每种保护策略,基于相应的函数和输入参数值,自动生成针对所述保护策略的机器可执行代码。In one embodiment, the code generation unit 122 automatically generates machine executable code for the protection strategy based on the corresponding function and input parameter value for each protection strategy in one or more protection strategies.
在另一个实施例中,该代码生成单元122对与多种保护策略分别对应的多个函数中的至少两个函数进行融合,以生成经融合的函数;然后基于经融合的函数以及相应的输入参数值自动生成相应的机器可执行代码。In another embodiment, the code generation unit 122 fuses at least two of the multiple functions corresponding to the multiple protection strategies to generate a fused function; then based on the fused function and the corresponding input The parameter value automatically generates the corresponding machine executable code.
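The division of labor between the receiving unit 121 and the code generation unit 122, and the two code generation embodiments (one piece of code per strategy versus one fused function), can be sketched as follows. This is an illustrative toy, not the patented implementation: the instruction names, the expression templates, and the use of generated Python source as a stand-in for machine-executable code are all assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical registry of predefined functions, keyed by instruction name.
# Each template is an expression over an input {v} and a parameter {p}.
PREDEFINED: Dict[str, str] = {
    "encrypt_func": "({v} ^ {p})",
    "weight_obf_func": "(({v} + {p}) % 256)",
}


@dataclass
class Invocation:
    name: str   # instruction entered by the user
    param: int  # input parameter value for the called function


def receive(instructions: List[Tuple[str, int]]) -> List[Invocation]:
    """Receiving-unit stand-in: check instruction names and collect parameters."""
    calls = []
    for name, param in instructions:
        if name not in PREDEFINED:
            raise KeyError(f"unknown protection function: {name}")
        calls.append(Invocation(name, param))
    return calls


def generate_separate(calls: List[Invocation]) -> str:
    """Emit one generated function per protection strategy."""
    parts = []
    for i, call in enumerate(calls):
        expr = PREDEFINED[call.name].format(v="v", p=call.param)
        parts.append(f"def step_{i}(data):\n    return [{expr} for v in data]\n")
    return "\n".join(parts)


def generate_fused(calls: List[Invocation]) -> str:
    """Emit a single function whose expression nests all strategies in one pass."""
    expr = "v"
    for call in calls:
        expr = PREDEFINED[call.name].format(v=expr, p=call.param)
    return f"def protect(data):\n    return [{expr} for v in data]\n"


calls = receive([("encrypt_func", 90), ("weight_obf_func", 7)])
namespace: dict = {}
exec(generate_separate(calls), namespace)  # defines step_0 and step_1
exec(generate_fused(calls), namespace)     # defines protect
data = [0, 1, 200, 255]
print(namespace["step_1"](namespace["step_0"](data)) == namespace["protect"](data))
```

In this sketch the user supplies only instruction names and parameter values; the generated code is derived entirely from the predefined templates, which mirrors the automatic generation described above.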
In another embodiment, the system may further include a random number generation unit (not shown) that randomly generates the input parameter values for each protection strategy; the receiving unit 121 receives these randomly generated input parameter values. The random number generation unit may be part of the protection device 12.
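A random number generation unit of this kind could be approximated as below. The text does not say which generator is used or how wide the parameters are; Python's `secrets` module and the 32-bit width are assumptions made for this sketch, and the instruction names are hypothetical.

```python
import secrets
from typing import Dict, List


def random_parameters(instructions: List[str], bits: int = 32) -> Dict[str, int]:
    """Draw one unpredictable integer parameter for each invoked function."""
    return {name: secrets.randbits(bits) for name in instructions}


# The receiving unit would take these values in place of user-typed parameters.
params = random_parameters(["encrypt_func", "weight_obf_func"])
for name, value in params.items():
    print(name, hex(value))
```

The `secrets` module is used here instead of `random` because it is intended for security-sensitive values; whether that property matters depends on the protection strategy in question.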
The code generation unit 122 may also perform the function fusion described above and the processing related to generating code for the fused function. The functional units or modules of the protection device of this specification may be added on top of an existing DSL compiler. The receiving unit 121 and the code generation unit 122 described above are implemented by the DSL compiler as modules of the DSL compiler.
Although the DSL-compiler-based machine learning model protection device 12 is described above as part of the system 10 for generating a machine learning model, the protection device 12 may also be used as a standalone device.
The receiving unit of the protection device 12 may also receive the user's selection of protection strategies. In one embodiment, the protection device 12 can include a display unit that shows the user the currently selectable protection strategies and the corresponding instructions; the user can then enter instructions, based on the protection strategies the user has chosen, to call the corresponding functions. The display unit can further prompt the user to enter the corresponding parameter values for the specific protection strategy the user has selected.
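The behavior of such a display unit can be sketched without any real user interface. The three strategy names below are the ones listed in the claims; the instruction names and parameter names are invented for illustration only.

```python
from typing import Dict, List, NamedTuple


class StrategyOption(NamedTuple):
    instruction: str       # hypothetical instruction the user would enter
    parameters: List[str]  # hypothetical parameter names to prompt for


# Strategy names follow the claims; instructions and parameters are assumptions.
OPTIONS: Dict[str, StrategyOption] = {
    "encryption": StrategyOption("encrypt_func", ["key"]),
    "computational graph obfuscation": StrategyOption("graph_obf_func", ["seed"]),
    "weight data obfuscation": StrategyOption("weight_obf_func", ["offset"]),
}


def render_menu() -> str:
    """List the selectable strategies together with their instructions."""
    return "\n".join(
        f"{i}. {name} -> {option.instruction}"
        for i, (name, option) in enumerate(OPTIONS.items(), start=1)
    )


def prompts_for(selected: List[str]) -> List[str]:
    """Build the parameter prompts for the strategies the user selected."""
    prompts = []
    for name in selected:
        option = OPTIONS[name]
        for parameter in option.parameters:
            prompts.append(f"Enter {parameter} for {option.instruction}:")
    return prompts


print(render_menu())
print(prompts_for(["encryption", "weight data obfuscation"]))
```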
It will be understood that the methods and devices of the embodiments of this specification can be implemented by computer programs/software. Such software includes computer program instructions that can be loaded into the working memory of a data processor and that, when run, execute the methods according to the embodiments of this specification.
The exemplary embodiments of this specification cover both computer programs/software that are created for or use this specification from the outset, and existing programs/software that are converted by means of an update into programs/software that use this specification.
According to a further embodiment of this specification, a machine-readable (for example, computer-readable) medium such as a CD-ROM is provided, wherein the readable medium has computer program code stored on it that, when executed, causes a computer or processor to perform the methods according to the embodiments of this specification. The machine-readable medium is, for example, an optical storage medium or a solid-state medium supplied together with or as part of other hardware.
The computer program for performing the methods according to the embodiments of this specification may also be distributed in other forms, for example via the Internet or other wired or wireless telecommunication systems.
The computer program can also be provided over a network such as the World Wide Web and downloaded from such a network to the working computer of the data processor.
It will also be understood that the units of the system and the method flows in the embodiments of this specification can also be implemented by hardware or by a combination of hardware and software.
In one embodiment, the system according to this specification can be implemented by a memory and a processor. The memory can store computer program code for running the method flows according to the embodiments of this specification; when running the program code from the memory, the processor executes those flows.
Specific embodiments of this specification have been described above. Other embodiments are within the scope of the appended claims. In some cases, the actions or steps recited in the claims can be performed in an order different from that in the embodiments and still achieve the desired results. In addition, the processes depicted in the drawings do not necessarily require the particular order shown, or a sequential order, to achieve the desired results. In some implementations, multitasking and parallel processing are also possible or may be advantageous.
It must be pointed out that the embodiments of this specification are described with reference to different subject matter. In particular, some embodiments are described with reference to method-type claims, while others are described with reference to device-type claims. However, a person skilled in the art will gather from the above and the following description that, unless otherwise indicated, in addition to any combination of features belonging to one type of subject matter, any combination of features relating to different subject matter is also considered to be disclosed by this specification. Moreover, all features can be combined to provide synergistic effects that are more than the simple sum of the features.
This specification has been described above with reference to specific embodiments. A person skilled in the art should understand that the technical solutions of this specification can be implemented in various ways without departing from its spirit and essential characteristics. The specific embodiments are merely illustrative and not restrictive. In addition, these embodiments can be combined with one another in any way to achieve the purpose of this specification. The scope of protection of this specification is defined by the appended claims.
The word "comprising" in the description and claims does not exclude the presence of other elements or steps. The functions of the individual elements described in the specification or recited in the claims can also be split or combined and implemented by a corresponding plurality of elements or by a single element.

Claims (13)

  1. A machine learning model protection method based on a domain-specific language compiler, comprising:
    for each of one or more protection strategies of the machine learning model, receiving a user instruction to call a corresponding function, and receiving input parameter values of the function; and
    automatically generating machine-executable code for protecting the machine learning model based on the one or more functions for the one or more protection strategies and the corresponding input parameter values.
  2. The machine learning model protection method according to claim 1, wherein automatically generating machine-executable code for protecting the machine learning model comprises:
    for each of the one or more protection strategies, automatically generating machine-executable code for that protection strategy based on the corresponding function and input parameter values.
  3. The machine learning model protection method according to claim 1, wherein automatically generating machine-executable code for protecting the machine learning model comprises:
    fusing at least two of multiple functions that respectively correspond to multiple protection strategies to generate a fused function; and
    automatically generating the corresponding machine-executable code based on the fused function and the corresponding input parameter values.
  4. The machine learning model protection method according to any one of claims 1-3, further comprising:
    randomly generating the input parameter values for each protection strategy;
    wherein receiving the input parameter values of the function comprises:
    receiving the input parameter values randomly generated for the protection strategy.
  5. The machine learning model protection method according to any one of claims 1-3, wherein
    the one or more protection strategies are selected by the user.
  6. The machine learning model protection method according to claim 5, wherein the one or more protection strategies are selected from the group comprising encryption, computational graph obfuscation, or weight data obfuscation.
  7. A machine learning model protection device based on a domain-specific language compiler, comprising:
    a receiving unit configured to, for each of one or more protection strategies of the machine learning model, receive a user instruction to call a corresponding function and receive input parameter values of the function; and
    a code generation unit configured to automatically generate machine-executable code for protecting the machine learning model based on the one or more functions for the one or more protection strategies and the corresponding input parameter values.
  8. The machine learning model protection device according to claim 7, wherein the code generation unit is further configured to:
    for each of the one or more protection strategies, automatically generate machine-executable code for that protection strategy based on the corresponding function and input parameter values.
  9. The machine learning model protection device according to claim 7, wherein the code generation unit is further configured to:
    fuse at least two of multiple functions that respectively correspond to multiple protection strategies to generate a fused function; and
    automatically generate the corresponding machine-executable code based on the fused function and the corresponding input parameter values.
  10. The machine learning model protection device according to any one of claims 7-9, further comprising:
    a random number generation unit configured to randomly generate the input parameter values for each protection strategy;
    wherein the receiving unit receives, from the random number generation unit, the input parameter values randomly generated for the protection strategy.
  11. The machine learning model protection device according to any one of claims 7-9, wherein
    the one or more protection strategies are selected by the user.
  12. The machine learning model protection device according to claim 11, wherein the one or more protection strategies are selected from the group comprising encryption, computational graph obfuscation, or weight data obfuscation.
  13. A system for generating a machine learning model, comprising:
    a machine learning model generation device configured to generate a machine learning model; and
    the machine learning model protection device based on a domain-specific language compiler according to any one of claims 7-12, configured to generate machine-executable code for protecting the machine learning model.
PCT/CN2020/132839 2020-02-13 2020-11-30 Machine learning model protection method and device WO2021159819A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010090978.7A CN113254885A (en) 2020-02-13 2020-02-13 Machine learning model protection method and device
CN202010090978.7 2020-02-13

Publications (1)

Publication Number Publication Date
WO2021159819A1 true WO2021159819A1 (en) 2021-08-19

Family

ID=77220048

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/132839 WO2021159819A1 (en) 2020-02-13 2020-11-30 Machine learning model protection method and device

Country Status (2)

Country Link
CN (1) CN113254885A (en)
WO (1) WO2021159819A1 (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102428474A (en) * 2009-11-19 2012-04-25 株式会社日立制作所 Computer system, management system and recording medium
CN102750469A (en) * 2012-05-18 2012-10-24 北京邮电大学 Security detection system based on open platform and detection method thereof
CN105516154A (en) * 2015-12-15 2016-04-20 Tcl集团股份有限公司 Security policy configuration method and device applied to SEAndroid (Security-Enhanced Android) system
US20190340524A1 (en) * 2018-05-07 2019-11-07 XNOR.ai, Inc. Model selection interface
CN110457023A (en) * 2019-07-23 2019-11-15 东软集团股份有限公司 Task creation method, apparatus, storage medium and electronic equipment
CN110580527A (en) * 2018-06-08 2019-12-17 上海寒武纪信息科技有限公司 method and device for generating universal machine learning model and storage medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9563771B2 (en) * 2014-01-22 2017-02-07 Object Security LTD Automated and adaptive model-driven security system and method for operating the same
US10623443B2 (en) * 2016-07-08 2020-04-14 Ulrich Lang Method and system for policy management, testing, simulation, decentralization and analysis
US10382489B2 (en) * 2016-12-29 2019-08-13 Mcafee, Llc Technologies for privacy-preserving security policy evaluation
US20190258953A1 (en) * 2018-01-23 2019-08-22 Ulrich Lang Method and system for determining policies, rules, and agent characteristics, for automating agents, and protection
WO2019215713A1 (en) * 2018-05-07 2019-11-14 Shoodoo Analytics Ltd. Multiple-part machine learning solutions generated by data scientists

Also Published As

Publication number Publication date
CN113254885A (en) 2021-08-13

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application. Ref document number: 20918769; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
122 Ep: pct application non-entry in european phase. Ref document number: 20918769; Country of ref document: EP; Kind code of ref document: A1