WO2020034800A1 - Machine learning model processing method and apparatus, medium, and electronic device - Google Patents

Machine learning model processing method and apparatus, medium, and electronic device

Info

Publication number
WO2020034800A1
WO2020034800A1 (PCT/CN2019/096183)
Authority
WO
WIPO (PCT)
Prior art keywords
operation unit
model file
machine learning
model
file
Prior art date
Application number
PCT/CN2019/096183
Other languages
English (en)
French (fr)
Inventor
张博
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司 filed Critical 腾讯科技(深圳)有限公司
Publication of WO2020034800A1 publication Critical patent/WO2020034800A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/52Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity ; Preventing unwanted data erasure; Buffer overflow
    • G06F21/53Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity ; Preventing unwanted data erasure; Buffer overflow by executing in a restricted environment, e.g. sandbox or secure virtual machine
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55Detecting local intrusion or implementing counter-measures
    • G06F21/56Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/566Dynamic detection, i.e. detection performed at run-time, e.g. emulation, suspicious activities
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/57Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
    • G06F21/577Assessing vulnerabilities and evaluating computer system security

Definitions

  • the present application relates to the field of computer and communication technologies, and in particular, to a method, an apparatus, a computer-readable medium, and an electronic device for processing a machine learning model.
  • The machine learning framework encapsulates implementations of common machine learning and deep learning algorithms and provides easy-to-use interfaces that can be used to quickly train machine learning models or verify new machine learning algorithms. After a machine learning model is trained with a machine learning framework, the model can be saved to a file to obtain a model file for subsequent cross-environment deployment or sharing with others.
  • the embodiments of the present application provide a method, a device, a computer-readable medium, and an electronic device for processing a machine learning model, which can solve the problem that the model file of the machine learning model cannot be modified conveniently.
  • An embodiment of the present application provides a method for processing a machine learning model, which is executed by an electronic device.
  • The method includes: obtaining a model file of the machine learning model and a target operation unit (an operation unit is an Operation) that needs to be added to the model file; calling an application programming interface in a machine learning framework corresponding to the machine learning model to add the target operation unit to the model file to obtain a processed model file, wherein the target operation unit does not affect an output result of the model file; and running the processed model file to execute the target operation unit during the running of the processed model file.
  • An embodiment of the present application provides a method for processing a machine learning model, which is executed by an electronic device.
  • The method includes: obtaining a model file of the machine learning model; parsing the model file to obtain each operation unit contained in the model file; and performing security detection on each operation unit to determine whether a suspicious operation unit exists in the model file.
  • An embodiment of the present application provides a device for processing a machine learning model, including: an obtaining unit for obtaining a model file of the machine learning model and a target operation unit to be added to the model file; an adding unit for calling an application programming interface in the machine learning framework corresponding to the machine learning model to add the target operation unit to the model file to obtain a processed model file, wherein the target operation unit does not affect an output result of the model file; and a processing unit for running the processed model file to execute the target operation unit during the running of the processed model file.
  • An embodiment of the present application provides a device for processing a machine learning model, including: an obtaining unit for obtaining a model file of the machine learning model; a parsing unit for parsing the model file to obtain each operation unit contained in the model file; and a detection unit for performing security detection on each operation unit to determine whether a suspicious operation unit exists in the model file.
  • An embodiment of the present application provides a computer-readable medium on which a computer program is stored; when the computer program is executed by a processor, the method for processing a machine learning model described in any embodiment of the present application is implemented.
  • An embodiment of the present application provides an electronic device including: one or more processors; and a storage device configured to store one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method for processing a machine learning model described in any embodiment of the present application.
  • FIG. 1 is a schematic diagram of an exemplary system architecture to which the machine learning model processing method or processing device of an embodiment of the present application can be applied;
  • FIG. 2 is a schematic structural diagram of a computer system suitable for implementing an electronic device according to an embodiment of the present application
  • FIG. 3 schematically illustrates a flowchart of a method for processing a machine learning model according to an embodiment of the present application
  • FIG. 4 schematically illustrates a flowchart of a method for processing a machine learning model according to another embodiment of the present application
  • FIG. 5 shows a schematic architecture diagram of a machine learning system according to an embodiment of the present application
  • FIG. 6 shows a schematic diagram of a generation process of a TensorFlow model according to an embodiment of the present application
  • FIG. 7 shows a flowchart of constructing an AI system using a TensorFlow framework according to an embodiment of the present application
  • FIG. 8 schematically illustrates a flowchart of an attack test method according to an embodiment of the present application
  • FIG. 9A schematically illustrates a block diagram of a processing device for a machine learning model according to an embodiment of the present application.
  • FIG. 9B schematically illustrates a block diagram of a processing apparatus for a machine learning model according to another embodiment of the present application.
  • FIG. 10A schematically illustrates a block diagram of a processing apparatus for a machine learning model according to another embodiment of the present application.
  • FIG. 10B schematically illustrates a block diagram of a processing apparatus for a machine learning model according to another embodiment of the present application.
  • the embodiments of the present application provide a method, a device, a computer-readable medium, and an electronic device for processing a machine learning model, which can solve the problem that the model file of the machine learning model cannot be modified conveniently.
  • FIG. 1 shows a schematic diagram of an exemplary system architecture 100 to which a processing method of a machine learning model or a processing device of a machine learning model can be applied.
  • the system architecture 100 may include a terminal device (such as one or more of a terminal device 101, a terminal device 102, and a terminal device 103), a network 104, and a server 105.
  • the terminal device may be various electronic devices with a display screen, including but not limited to smart phones, tablet computers, portable computers, desktop computers, and so on.
  • the network 104 is a medium for providing a communication link between the terminal device and the server 105.
  • the network 104 may include various connection types, such as a wired communication link, a wireless communication link, and the like.
  • the numbers of terminal devices, networks, and servers in FIG. 1 are merely exemplary. According to implementation needs, there can be any number of terminal devices, networks, and servers.
  • the server 105 may be a server cluster composed of multiple servers.
  • the user can use the terminal device to interact with the server 105 through the network 104 to receive or send messages and the like.
  • the server 105 may be a server that provides various services. For example, a user may use a terminal device to upload a model file of a machine learning model and a target operation unit to be added to the model file to the server 105. The server 105 may add the target operation unit to the model file to obtain a processed model file, and then the server 105 may run the processed model file to execute the target operation unit during the running of the processed model file. Specifically, for example, a user may upload an operation unit for attack testing to the server 105, and the server 105 may add the operation unit for attack testing to a model file, and then execute the operation unit for attack testing to Perform an attack test.
  • the processing method of the machine learning model provided by the embodiment of the present application is generally executed by the server 105, and accordingly, the processing device of the machine learning model is generally disposed in the server 105.
  • the terminal may also have similar functions with the server, so as to execute the processing scheme of the machine learning model provided by the embodiment of the present application.
  • FIG. 2 is a schematic structural diagram of a computer system suitable for implementing an electronic device according to an embodiment of the present application.
  • The computer system 200 includes a central processing unit (CPU) 201, which can execute various appropriate actions and processes according to a program stored in a read-only memory (ROM) 202 or a program loaded from a storage section 208 into a random access memory (RAM) 203.
  • The RAM 203 also stores various programs and data required for system operation.
  • the CPU 201, the ROM 202, and the RAM 203 are connected to each other through a bus 204.
  • An input / output (I / O) interface 205 is also connected to the bus 204.
  • The following components are connected to the I/O interface 205: an input portion 206 including a keyboard, a mouse, and the like; an output portion 207 including a cathode ray tube (CRT), a liquid crystal display (LCD), a speaker, and the like; a storage section 208 including a hard disk and the like; and a communication section 209 including a network interface card such as a LAN (Local Area Network) card or a modem. The communication section 209 performs communication processing via a network such as the Internet.
  • the driver 210 is also connected to the I / O interface 205 as needed.
  • a removable medium 211 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, etc., is installed on the drive 210 as needed, so that a computer program read therefrom is installed into the storage section 208 as needed.
  • a process described below with reference to a flowchart may be implemented as a computer software program.
  • embodiments of the present application include a computer program product including a computer program borne on a computer-readable medium, the computer program containing program code for performing a method shown in a flowchart.
  • the computer program may be downloaded and installed from a network through the communication section 209, and / or installed from a removable medium 211.
  • this computer program is executed by the central processing unit (CPU) 201, various functions defined in the system of the present application are executed.
  • the computer-readable medium shown in the embodiments of the present application may be a computer-readable signal medium or a computer-readable storage medium or any combination of the foregoing.
  • the computer-readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof.
  • Computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a flash memory, an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • the computer-readable storage medium may be any tangible medium containing or storing a program, and the program may be used by or in combination with an instruction execution system, apparatus, or device.
  • The computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried.
  • Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the foregoing.
  • The computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium; the computer-readable medium may send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
  • Program code embodied on a computer-readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, etc., or any suitable combination of the foregoing.
  • Each block in the flowcharts or block diagrams may represent a module, a program segment, or a part of code, which contains one or more executable instructions for implementing the specified logical functions.
  • It should also be noted that, in some alternative implementations, the functions marked in the blocks may occur in an order different from that marked in the drawings. For example, two blocks shown in succession may actually be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved.
  • Each block in the block diagrams or flowcharts, and combinations of blocks in the block diagrams or flowcharts, can be implemented with a dedicated hardware-based system that performs the specified function or operation, or with a combination of dedicated hardware and computer instructions.
  • the units involved in the embodiments of the present application may be implemented by software or hardware.
  • The described units may also be provided in a processor, and the names of these units do not, in some cases, constitute a limitation on the units themselves.
  • An embodiment of the present application further provides a computer-readable medium, which may be included in the electronic device described in the foregoing embodiments; or may exist separately without being assembled into the electronic device.
  • The computer-readable medium carries one or more programs; when the one or more programs are executed by the electronic device, the electronic device is caused to implement the methods described in the following embodiments. For example, the electronic device can implement the steps shown in FIG. 3 and FIG. 4.
  • FIG. 3 schematically illustrates a flowchart of a method for processing a machine learning model according to an embodiment of the present application.
  • the method for processing a machine learning model may be executed by the electronic device described in the foregoing embodiment.
  • the processing method of the machine learning model includes at least steps S310 to S340, which are described in detail as follows:
  • In step S310, a model file of the machine learning model and a target operation unit that needs to be added to the model file are obtained.
  • In an embodiment, the model file of the machine learning model may be stored in a serialized manner, and the machine learning model may be a stream computing model based on a graph model, such as a TensorFlow (an artificial intelligence learning system developed by Google) model.
  • In step S320, the target operation unit is added to the model file to obtain a processed model file.
  • In an embodiment, an application programming interface (API) in the machine learning framework corresponding to the machine learning model may be called to add the target operation unit to the model file.
  • Specifically, the target operation unit may be inserted at a set position in the model file, or the target operation unit may be added to the model file by replacing a specified operation unit in the model file with the target operation unit.
  • In the embodiments of the present application, the target operation unit does not affect the output result of the model file.
  • In step S330, the processed model file is run to execute the target operation unit during the running of the processed model file.
  • In an embodiment, the running environment of the processed model file may be loaded, and the target operation unit is then parsed and executed in that running environment.
  • The technical solution of the embodiment shown in FIG. 3 enables a model user to add a corresponding operation unit to a model file to implement a corresponding function according to actual requirements. This not only makes it convenient to modify the model file of the machine learning model, but also improves the flexibility of modifying the model file.
  • the processed model file may be loaded into a security sandbox to run the processed model file in the security sandbox.
  • The security sandbox is an independent virtual environment that can restrict the running of programs according to security policies.
  • Loading the processed model file into the security sandbox to run means that, even if the model file contains a malicious operation unit, harm to the device running the model file can be avoided, improving the security of the device.
  • a security test may be performed on each operation unit included in the processed model file to determine whether a suspicious operation unit exists in the model file. If there is no suspicious operation unit in the model file, the processed model file is run.
  • the technical solution of this embodiment can effectively detect suspicious operation units contained in the model file, and ensure the running security of the model file.
  • the above-mentioned target operation unit to be added to the model file may be an operation unit for performing an attack test, and further, an attack test may be performed on the machine learning model during the execution of the target operation unit.
  • FIG. 4 schematically illustrates a flowchart of a method for processing a machine learning model according to another embodiment of the present application.
  • the processing method of the machine learning model may be executed by the electronic device described in the foregoing embodiment.
  • a method for processing a machine learning model includes the following steps:
  • In step S410, a model file of the machine learning model is obtained.
  • In an embodiment, the model file of the machine learning model may be stored in a serialized manner, and the machine learning model may be a stream computing model based on a graph model, such as a TensorFlow model.
  • In step S420, the model file is parsed to obtain each operation unit contained in the model file.
  • In step S430, security detection is performed on each operation unit to determine whether a suspicious operation unit exists in the model file.
  • In an embodiment, it can be determined, according to the application programming interface called by each operation unit, whether the called application programming interface is abnormal; an operation unit whose called application programming interface is abnormal is then determined to be a suspicious operation unit.
  • For example, if calling the API for a write operation is not allowed, but an operation unit is detected calling that API, then it can be determined that the operation unit is a suspicious operation unit.
  • If it is determined that a suspicious operation unit exists in the model file, an early warning prompt is issued; and/or if it is determined that no suspicious operation unit exists in the model file, the model file is run.
  • the technical solution of the embodiment shown in FIG. 4 can confirm the security of the operation unit included in the model file before running the model file, avoiding the problem of malicious attacks caused by the illegal operation unit in the model file, and improving the model. File security.
  • In a specific application scenario, an operation unit for attack testing may be added to a model file to discover vulnerabilities of the machine learning model by performing an attack test on it, so that the machine learning model can be improved and its security enhanced.
  • The architecture of a machine learning system mainly includes a machine learning framework 501, third-party software libraries 502 on which the machine learning framework 501 depends, and an application program 503 running on top of the machine learning framework.
  • The machine learning framework 501 may be, for example, TensorFlow, Caffe (Convolutional Architecture for Fast Feature Embedding, a convolutional neural network framework), Torch (a machine learning framework based on the Lua scripting language), and the like.
  • The third-party software libraries 502 may include Protobuf (short for Protocol Buffers, a data interchange format proposed by Google), Libpng (a low-level cross-platform library written in C for reading and writing PNG files), Libgif (a low-level cross-platform library written in C for reading and writing GIF files), OpenCV (Open Source Computer Vision Library), Libjpeg (a low-level cross-platform library written in C for reading and writing JPEG files), Ffmpeg (an open-source computer program that can record and convert digital audio and video and turn them into streams), and so on.
  • the application program 503 may include programs, data, models, and the like.
  • Software fuzzing (an automated software testing technique based on defect injection) may be performed on the third-party software libraries 502 that the machine learning framework depends on, to discover existing security vulnerabilities.
  • However, since a machine learning framework is a complex software system, the framework itself also has security issues in addition to the third-party software libraries 502. Therefore, the embodiments of the present application analyze the security of the machine learning framework and propose a corresponding attack test scheme for it.
  • The computing models of popular machine learning frameworks can be divided into graph-based stream computing models and procedural computing models similar to common computer programming languages. Most machine learning frameworks use graph-based stream computing models; a typical example is the TensorFlow model. The following uses the TensorFlow model as an example to elaborate the implementation details of the technical solutions of the embodiments of the present application.
  • the process of constructing an AI (Artificial Intelligence) system using the TensorFlow framework includes training and deployment of a machine learning model, and specifically includes the following steps:
  • In step S701, the sample data and the algorithm are determined.
  • In step S702, a machine learning model is obtained and trained, for example, by retraining a publicly available pre-trained model as the machine learning model to be used.
  • In step S703, it is determined whether the trained machine learning model meets the requirements. If yes, step S704 is performed; otherwise, the process returns to step S701.
  • In step S704, when it is determined that the trained machine learning model meets the requirements, the machine learning model is deployed.
  • Data attack: during the model training phase and after the model is deployed to the production environment, the model receives external data (such as sample data). If the machine learning framework has defects in parsing and processing such data, this type of attack results; typical input data are pictures, audio, and the like.
  • Model attack: a large number of models trained by other researchers exist for the various machine learning frameworks. Framework users can retrain these models to meet their own needs or use them directly in production deployment, but these external models are untrusted data; if the framework has defects in model processing, this type of attack results.
  • For data attacks, the data can be detected and processed to ensure the security of the data input into the machine learning model; for algorithm attacks, in which malicious samples exploit defects of the machine learning algorithm so that the model's judgment does not match expectations, the defects of the algorithm can be remedied to ensure the accuracy of the model's judgment results.
  • A trained model in TensorFlow can be saved as a serialized file in which all data structures of the Graph, including Operations and Tensors, are stored.
  • When the model file is used again, the Operations in it are interpreted and executed in TensorFlow's Runtime environment. The TensorFlow model file can therefore be considered to store executable code, which is a considerable risk point. Based on this, if a malicious Operation is inserted into a model file, the malicious Operation will be executed when others use the model file, causing unpredictable consequences.
  • the attack method is described by using the attack test process shown in FIG. 8, which mainly includes the following steps:
  • In step S801, a normal model file is generated.
  • In step S802, a malicious Operation is inserted into the normal model file to generate a malicious model file.
  • Specifically, this process can be completed using a legal API provided by TensorFlow; that is, the malicious Operation is inserted into the normal model file through the API.
  • In step S803, the malicious model file is publicly placed on the Internet, for example on GitHub (a hosting platform for open-source and private software projects), for downloading and use by others.
  • In step S804, the model downloader runs the model after downloading the malicious model file. Sharing model files and using pre-trained models is a very common scenario, and users' perception of model files is still that a machine learning model file is a data file and basically harmless, so model downloaders usually do not pay much attention to the security of model files. Meanwhile, when the model file runs in TensorFlow, no anomaly is visible to the user, and the model may still output the expected results.
  • In step S805, the malicious Operation is executed, which may cause consequences such as the computer being controlled and data being stolen. Specifically, the actions performed by a malicious Operation depend on the code in the malicious Operation.
  • The attack test method shown in FIG. 8 discovers and exploits the lack of inherent security mechanisms in the TensorFlow framework, and has three characteristics: a wide scope of influence, high concealment, and a relatively high cost of repair.
  • the embodiments of the present application also propose corresponding countermeasures to improve the execution security of the model file as much as possible, as follows:
  • Coping strategy 1: use the sandbox mechanism to run untrusted model files (such as TensorFlow model files) in a sandbox, so that even if a malicious Operation exists in the model file, the harm can be confined to the sandbox and will not affect the security of the user's computer and private data.
  • Coping strategy 2: use a model file security scanning tool that scans all Operations in the model file and issues warnings or reminders for suspicious Operations. Before using an untrusted machine learning model, users can use this scanning tool to check the model file.
  • Coping strategy 3: users generally regard model files as harmless data files, but the above analysis shows that a model file can be executed as a graph-based program; users' security awareness can therefore be raised through security education.
  • The above application scenario describes the technical solutions of the embodiments of the present application using the example of adding a malicious Operation to a model file to perform an attack test.
  • In other application scenarios, other customized Operations may be added to a model file to implement different functions; that is, users can customize the code in an Operation according to actual needs and then add it to the model file for execution.
  • FIG. 9A schematically illustrates a block diagram of a processing device for a machine learning model according to an embodiment of the present application.
  • a processing device 900 for a machine learning model includes a first obtaining unit 902, an adding unit 904, and a processing unit 906.
  • The first obtaining unit 902 is configured to obtain a model file of a machine learning model and a target operation unit to be added to the model file;
  • the adding unit 904 is configured to add the target operation unit to the model file to obtain a processed model file;
  • the processing unit 906 is configured to run the processed model file to execute the target operation unit during the running of the processed model file.
  • In an embodiment, the adding unit 904 is configured to: insert the target operation unit at a set position in the model file; or replace a specified operation unit in the model file with the target operation unit.
  • the adding unit 904 is configured to: call an application programming interface in a machine learning framework corresponding to the machine learning model to add the target operation unit to the model file.
  • the processing unit 906 is configured to: load a running environment of the processed model file, and parse and execute the target operation unit in the running environment.
  • In an embodiment, the target operation unit includes an operation unit for performing an attack test, and the processing unit 906 is further configured to perform an attack test on the machine learning model during the execution of the target operation unit.
  • the machine learning model includes: a stream computing model based on a graph model.
  • the processing unit 906 is configured to load the processed model file into a security sandbox to run the processed model file in the security sandbox.
  • In an embodiment, the machine learning model processing device 900 further includes a first detection unit 908 configured to perform security detection on each operation unit contained in the processed model file to determine whether a suspicious operation unit exists in the model file; the processing unit 906 is configured to run the processed model file when the first detection unit determines that no suspicious operation unit exists in the model file.
  • In an embodiment, the first detection unit 908 is configured to: determine, according to the application programming interface called by each operation unit, whether the called application programming interface is abnormal; and determine an operation unit whose called application programming interface is abnormal as a suspicious operation unit.
  • FIG. 10A schematically illustrates a block diagram of a processing apparatus for a machine learning model according to another embodiment of the present application.
  • a processing device 1000 for a machine learning model includes: a second acquisition unit 1002, a parsing unit 1004, and a second detection unit 1006.
  • The second obtaining unit 1002 is configured to obtain a model file of a machine learning model; the parsing unit 1004 is configured to parse the model file to obtain each operation unit contained in the model file; the second detection unit 1006 is configured to perform security detection on each operation unit to determine whether a suspicious operation unit exists in the model file.
  • In an embodiment, the machine learning model processing device 1000 further includes a processing unit 1008 configured to issue an early warning prompt when the second detection unit 1006 determines that a suspicious operation unit exists in the model file, and/or to run the model file when the second detection unit 1006 determines that no suspicious operation unit exists in the model file.
  • The technical solutions of the embodiments of the present application may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (such as a CD-ROM, USB flash drive, or removable hard disk) or on a network, and includes several instructions to enable a computing device (such as a personal computer, a server, a touch terminal, or a network device) to execute the methods of the embodiments of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Virology (AREA)
  • Debugging And Monitoring (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A method, apparatus, computer-readable medium, and electronic device for processing a machine learning model. The processing method includes: obtaining a model file of a machine learning model and a target operation unit to be added to the model file (S310); adding the target operation unit to the model file to obtain a processed model file (S320); and running the processed model file to execute the target operation unit during the running of the processed model file (S330).

Description

Machine learning model processing method and apparatus, medium, and electronic device
This application claims priority to Chinese Patent Application No. 201810930411.9, entitled "Machine learning model processing method and apparatus, medium, and electronic device", filed with the Chinese Patent Office on August 15, 2018, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of computer and communication technologies, and in particular to a machine learning model processing method and apparatus, a computer-readable medium, and an electronic device.
Background
Machine learning frameworks encapsulate implementations of common machine learning and deep learning algorithms and provide easy-to-use interfaces that can be used to quickly train machine learning models or to verify new machine learning algorithms. After a machine learning model is trained with a machine learning framework, the model can be saved to a file to obtain a model file for subsequent cross-environment deployment or for sharing with others.
It should be noted that the information disclosed in the Background section above is only intended to enhance understanding of the background of this application, and may therefore include information that does not constitute prior art known to a person of ordinary skill in the art.
Summary
Embodiments of this application provide a machine learning model processing method and apparatus, a computer-readable medium, and an electronic device, which can solve the problem that the model file of a machine learning model cannot be modified conveniently.
Other features and advantages of this application will become apparent from the following detailed description, or will be learned in part through practice of this application.
An embodiment of this application provides a machine learning model processing method, executed by an electronic device. The method includes: obtaining a model file of a machine learning model and a target operation unit to be added to the model file (an operation unit is an Operation); calling an application programming interface in the machine learning framework corresponding to the machine learning model to add the target operation unit to the model file to obtain a processed model file, wherein the target operation unit does not affect the output result of the model file; and running the processed model file to execute the target operation unit during the running of the processed model file.
An embodiment of this application provides a machine learning model processing method, executed by an electronic device. The method includes: obtaining a model file of a machine learning model; parsing the model file to obtain each operation unit contained in the model file; and performing security detection on each operation unit to determine whether a suspicious operation unit exists in the model file.
An embodiment of this application provides a machine learning model processing apparatus, including: an obtaining unit configured to obtain a model file of a machine learning model and a target operation unit to be added to the model file; an adding unit configured to call an application programming interface in the machine learning framework corresponding to the machine learning model to add the target operation unit to the model file to obtain a processed model file, wherein the target operation unit does not affect the output result of the model file; and a processing unit configured to run the processed model file to execute the target operation unit during the running of the processed model file.
An embodiment of this application provides a machine learning model processing apparatus, including: an obtaining unit configured to obtain a model file of a machine learning model; a parsing unit configured to parse the model file to obtain each operation unit contained in the model file; and a detection unit configured to perform security detection on each operation unit to determine whether a suspicious operation unit exists in the model file.
An embodiment of this application provides a computer-readable medium on which a computer program is stored; when the computer program is executed by a processor, the machine learning model processing method described in any embodiment of this application is implemented.
An embodiment of this application provides an electronic device, including: one or more processors; and a storage apparatus configured to store one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the machine learning model processing method described in any embodiment of this application.
Brief Description of the Drawings
The accompanying drawings here are incorporated into and constitute a part of this specification, illustrate embodiments consistent with this application, and together with the specification serve to explain the principles of this application. Obviously, the drawings in the following description are only some embodiments of this application, and a person of ordinary skill in the art can obtain other drawings from them without creative effort. In the drawings:
FIG. 1 is a schematic diagram of an exemplary system architecture to which the machine learning model processing method or the machine learning model processing apparatus of embodiments of this application can be applied;
FIG. 2 is a schematic structural diagram of a computer system suitable for implementing the electronic device of an embodiment of this application;
FIG. 3 schematically shows a flowchart of a machine learning model processing method according to an embodiment of this application;
FIG. 4 schematically shows a flowchart of a machine learning model processing method according to another embodiment of this application;
FIG. 5 shows a schematic architecture diagram of a machine learning system according to an embodiment of this application;
FIG. 6 shows a schematic diagram of the generation process of a TensorFlow model according to an embodiment of this application;
FIG. 7 shows a flowchart of constructing an AI system using the TensorFlow framework according to an embodiment of this application;
FIG. 8 schematically shows a flowchart of an attack test method according to an embodiment of this application;
FIG. 9A schematically shows a block diagram of a machine learning model processing apparatus according to an embodiment of this application;
FIG. 9B schematically shows a block diagram of a machine learning model processing apparatus according to another embodiment of this application;
FIG. 10A schematically shows a block diagram of a machine learning model processing apparatus according to another embodiment of this application;
FIG. 10B schematically shows a block diagram of a machine learning model processing apparatus according to another embodiment of this application.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. However, the example embodiments can be implemented in many forms and should not be construed as limited to the examples set forth here; rather, these embodiments are provided so that this application will be more thorough and complete and will fully convey the concepts of the example embodiments to those skilled in the art.
Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a full understanding of the embodiments of this application. However, those skilled in the art will appreciate that the technical solutions of this application may be practiced without one or more of the specific details, or other methods, components, apparatuses, steps, and so on may be employed. In other instances, well-known methods, apparatuses, implementations, or operations are not shown or described in detail to avoid obscuring aspects of this application.
The block diagrams shown in the drawings are merely functional entities and do not necessarily correspond to physically separate entities. That is, these functional entities may be implemented in software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor apparatuses and/or microcontroller apparatuses.
The flowcharts shown in the drawings are merely illustrative; they need not include all of the content and operations/steps, nor need they be executed in the order described. For example, some operations/steps may be decomposed, while others may be combined or partially combined, so the order actually executed may change according to the actual situation.
Generally, if the model file of a machine learning model cannot meet a user's needs, the machine learning model needs to be retrained; this is not only time-consuming and labor-intensive but also inflexible.
In view of this, embodiments of this application provide a machine learning model processing method and apparatus, a computer-readable medium, and an electronic device, which can solve the problem that the model file of a machine learning model cannot be modified conveniently.
FIG. 1 shows a schematic diagram of an exemplary system architecture 100 to which the machine learning model processing method or the machine learning model processing apparatus of embodiments of this application can be applied.
As shown in FIG. 1, the system architecture 100 may include terminal devices (such as one or more of terminal device 101, terminal device 102, and terminal device 103), a network 104, and a server 105. The terminal devices may be various electronic devices with a display screen, including but not limited to smartphones, tablet computers, portable computers, desktop computers, and so on. The network 104 is the medium used to provide a communication link between the terminal devices and the server 105; the network 104 may include various connection types, such as wired communication links, wireless communication links, and so on.
It should be understood that the numbers of terminal devices, networks, and servers in FIG. 1 are merely illustrative. There may be any number of terminal devices, networks, and servers according to implementation needs. For example, the server 105 may be a server cluster composed of multiple servers.
A user may use a terminal device to interact with the server 105 through the network 104 to receive or send messages and the like. The server 105 may be a server providing various services. For example, a user may use a terminal device to upload a model file of a machine learning model and a target operation unit to be added to the model file to the server 105. The server 105 may add the target operation unit to the model file to obtain a processed model file, and the server 105 may then run the processed model file to execute the target operation unit during the running of the processed model file. Specifically, for example, a user may upload an operation unit for attack testing to the server 105; the server 105 may add this operation unit to the model file and then carry out an attack test by executing it.
It should be noted that the machine learning model processing method provided by the embodiments of this application is generally executed by the server 105, and accordingly the machine learning model processing apparatus is generally disposed in the server 105. However, in other embodiments of this application, a terminal may have functions similar to those of the server, and may thus execute the machine learning model processing scheme provided by the embodiments of this application.
FIG. 2 shows a schematic structural diagram of a computer system suitable for implementing the electronic device of an embodiment of this application.
It should be noted that the computer system 200 of the electronic device shown in FIG. 2 is merely an example and should not impose any limitation on the functions or scope of use of the embodiments of this application.
As shown in FIG. 2, the computer system 200 includes a central processing unit (CPU) 201, which can execute various appropriate actions and processes according to a program stored in a read-only memory (ROM) 202 or a program loaded from a storage portion 208 into a random access memory (RAM) 203. The RAM 203 also stores various programs and data required for system operation. The CPU 201, the ROM 202, and the RAM 203 are connected to one another through a bus 204. An input/output (I/O) interface 205 is also connected to the bus 204.
The following components are connected to the I/O interface 205: an input portion 206 including a keyboard, a mouse, and the like; an output portion 207 including a cathode ray tube (CRT), a liquid crystal display (LCD), a speaker, and the like; a storage portion 208 including a hard disk and the like; and a communication portion 209 including a network interface card such as a LAN (Local Area Network) card or a modem. The communication portion 209 performs communication processing via a network such as the Internet. A driver 210 is also connected to the I/O interface 205 as needed. A removable medium 211, such as a magnetic disk, an optical disc, a magneto-optical disc, or a semiconductor memory, is mounted on the driver 210 as needed so that a computer program read from it can be installed into the storage portion 208 as needed.
In particular, according to embodiments of this application, the processes described below with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of this application include a computer program product that includes a computer program carried on a computer-readable medium, the computer program containing program code for executing the methods shown in the flowcharts. In such embodiments, the computer program may be downloaded and installed from a network through the communication portion 209 and/or installed from the removable medium 211. When the computer program is executed by the central processing unit (CPU) 201, the various functions defined in the system of this application are executed.
It should be noted that the computer-readable medium shown in the embodiments of this application may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. The computer-readable storage medium may be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a flash memory, an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the embodiments of this application, the computer-readable storage medium may be any tangible medium containing or storing a program, and the program may be used by or in combination with an instruction execution system, apparatus, or device. In the embodiments of this application, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the foregoing. The computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium; the computer-readable medium may send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium may be transmitted by any appropriate medium, including but not limited to wireless, wired, and the like, or any suitable combination of the foregoing.
The flowcharts and block diagrams in the drawings illustrate the possible architectures, functions, and operations of systems, methods, and computer program products according to various embodiments of this application. In this regard, each block in the flowcharts or block diagrams may represent a module, a program segment, or a part of code, which contains one or more executable instructions for implementing the specified logical functions. It should also be noted that, in some alternative implementations, the functions marked in the blocks may occur in an order different from that marked in the drawings. For example, two blocks shown in succession may actually be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams or flowcharts, and combinations of blocks in the block diagrams or flowcharts, can be implemented with a dedicated hardware-based system that performs the specified functions or operations, or with a combination of dedicated hardware and computer instructions.
The units involved in the embodiments of this application may be implemented by software or by hardware, and the described units may also be provided in a processor; the names of these units do not, in some cases, constitute a limitation on the units themselves.
The embodiments of this application also provide a computer-readable medium, which may be included in the electronic device described in the above embodiments, or may exist separately without being assembled into the electronic device. The computer-readable medium carries one or more programs; when the one or more programs are executed by the electronic device, the electronic device is caused to implement the methods described in the following embodiments. For example, the electronic device can implement the steps shown in FIG. 3 and FIG. 4.
The implementation details of the technical solutions of the embodiments of this application are elaborated below.
FIG. 3 schematically shows a flowchart of a machine learning model processing method according to an embodiment of this application; the method may be executed by the electronic device described in the above embodiments. As shown in FIG. 3, the machine learning model processing method includes at least steps S310 to S340, described in detail as follows:
In step S310, a model file of the machine learning model and a target operation unit that needs to be added to the model file are obtained.
In an embodiment of this application, the model file of the machine learning model may be stored in a serialized manner, and the machine learning model may be a stream computing model based on a graph model, such as a TensorFlow (an artificial intelligence learning system developed by Google) model.
In step S320, the target operation unit is added to the model file to obtain a processed model file.
In an embodiment of this application, an application programming interface (API) in the machine learning framework corresponding to the machine learning model may be called to add the target operation unit to the model file. Specifically, the target operation unit may be inserted at a set position in the model file, or the target operation unit may be added to the model file by replacing a specified operation unit in the model file with the target operation unit. In the embodiments of this application, the target operation unit does not affect the output result of the model file.
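As an illustration of step S320, the following is a minimal sketch, assuming a TensorFlow 1.x-style model file that stores a serialized GraphDef protobuf; the file path and node names (model.pb, target_op, target_op_input) are hypothetical, and the Const/ReadFile pair stands in for whatever target operation unit is actually needed:

    import tensorflow as tf

    # Parse the serialized model file into a GraphDef protobuf.
    graph_def = tf.compat.v1.GraphDef()
    with open("model.pb", "rb") as f:          # hypothetical model file path
        graph_def.ParseFromString(f.read())

    # Add a constant holding a file name, then a ReadFile node that consumes
    # it. Neither node is wired to the model's outputs, so the output result
    # of the model file is not affected.
    const = graph_def.node.add()
    const.name = "target_op_input"             # hypothetical node name
    const.op = "Const"
    const.attr["dtype"].type = tf.string.as_datatype_enum
    const.attr["value"].tensor.dtype = tf.string.as_datatype_enum
    const.attr["value"].tensor.string_val.append(b"/etc/hostname")

    target = graph_def.node.add()
    target.name = "target_op"                  # the target operation unit
    target.op = "ReadFile"
    target.input.append("target_op_input")

    # Serialize the processed model file.
    with open("model_processed.pb", "wb") as f:
        f.write(graph_def.SerializeToString())

Because only the regular protobuf and TensorFlow interfaces are used, the processed file remains a well-formed model file.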
In step S330, the processed model file is run to execute the target operation unit during the running of the processed model file.
In an embodiment of this application, the running environment of the processed model file may be loaded, and the target operation unit is then parsed and executed in that running environment.
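Continuing the sketch above, loading the running environment and executing the target operation unit could look as follows (the node name target_op is the hypothetical one introduced earlier):

    import tensorflow as tf

    graph_def = tf.compat.v1.GraphDef()
    with open("model_processed.pb", "rb") as f:
        graph_def.ParseFromString(f.read())

    # Import the graph into a TensorFlow runtime; this parses every
    # operation unit contained in the processed model file.
    graph = tf.Graph()
    with graph.as_default():
        tf.compat.v1.import_graph_def(graph_def, name="")

    # Fetching the target operation's output executes it during the run.
    with tf.compat.v1.Session(graph=graph) as sess:
        contents = sess.run("target_op:0")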
The technical solution of the embodiment shown in FIG. 3 enables a model user to add a corresponding operation unit to a model file to implement a corresponding function according to actual requirements. This not only makes it convenient to modify the model file of the machine learning model, but also improves the flexibility of modifying the model file.
In an embodiment of this application, the processed model file may be loaded into a security sandbox so that it runs in the security sandbox. The security sandbox is an independent virtual environment that can restrict the running of programs according to security policies. In this embodiment, running the processed model file in the security sandbox means that, even if the model file contains a malicious operation unit, harm to the device running the model file can be avoided, improving the security of the device.
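A very simple approximation of the sandbox idea, assuming a Linux host, is to run the untrusted model file in a child process whose resource limits are tightened first; a production security sandbox (for example seccomp filters or a container) would additionally restrict file-system and network access. The script name run_model.py is hypothetical:

    import resource
    import subprocess

    def set_limits():
        # Applied in the child before exec: cap CPU time and address space,
        # and forbid file writes (RLIMIT_FSIZE of 0).
        resource.setrlimit(resource.RLIMIT_CPU, (10, 10))
        resource.setrlimit(resource.RLIMIT_AS, (2 ** 31, 2 ** 31))
        resource.setrlimit(resource.RLIMIT_FSIZE, (0, 0))

    proc = subprocess.run(
        ["python", "run_model.py", "model_processed.pb"],
        preexec_fn=set_limits,
        capture_output=True,
        timeout=60,
    )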
In an embodiment of this application, before the processed model file is run, security detection may be performed on each operation unit contained in the processed model file to determine whether a suspicious operation unit exists in the model file; if it is determined that no suspicious operation unit exists in the model file, the processed model file is run. The technical solution of this embodiment can effectively detect suspicious operation units contained in a model file and ensure the running security of the model file.
In an embodiment of this application, the above target operation unit to be added to the model file may be an operation unit for performing an attack test, and an attack test may then be performed on the machine learning model during the execution of the target operation unit.
FIG. 4 schematically shows a flowchart of a machine learning model processing method according to another embodiment of this application. The method may be executed by the electronic device described in the above embodiments.
As shown in FIG. 4, the machine learning model processing method according to another embodiment of this application includes the following steps:
In step S410, a model file of the machine learning model is obtained.
In an embodiment of this application, the model file of the machine learning model may be stored in a serialized manner, and the machine learning model may be a stream computing model based on a graph model, such as a TensorFlow model.
In step S420, the model file is parsed to obtain each operation unit contained in the model file.
In step S430, security detection is performed on each operation unit to determine whether a suspicious operation unit exists in the model file.
In an embodiment of this application, it can be determined, according to the application programming interface called by each operation unit, whether the called application programming interface is abnormal; an operation unit whose called application programming interface is abnormal is then determined to be a suspicious operation unit. For example, if calling the API for a write operation is not allowed, but an operation unit is detected calling that API, then it can be determined that the operation unit is a suspicious operation unit.
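A minimal sketch of such a detector, again assuming a GraphDef-based TensorFlow model file: it treats each NodeDef as one operation unit and flags those whose op type is on a blocklist of interfaces the policy does not allow a model to call. The blocklist below is an illustrative assumption, not an exhaustive policy:

    import tensorflow as tf

    # Illustrative blocklist: ops that touch the file system or run code.
    SUSPICIOUS_OPS = {"WriteFile", "ReadFile", "PyFunc", "Save", "SaveV2"}

    def scan_model_file(path):
        graph_def = tf.compat.v1.GraphDef()
        with open(path, "rb") as f:
            graph_def.ParseFromString(f.read())
        # Each NodeDef is one operation unit of the model file.
        suspicious = [n for n in graph_def.node if n.op in SUSPICIOUS_OPS]
        for n in suspicious:
            print(f"warning: suspicious operation unit {n.name!r} calls {n.op}")
        return not suspicious   # True: no suspicious operation unit found

    if scan_model_file("model_processed.pb"):
        print("no suspicious operation unit; the model file may be run")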
In an embodiment of this application, if it is determined that a suspicious operation unit exists in the model file, an early warning prompt is issued; and/or if it is determined that no suspicious operation unit exists in the model file, the model file is run.
The technical solution of the embodiment shown in FIG. 4 can confirm the security of the operation units contained in a model file before the model file is run, avoiding the problem of malicious attacks caused by illegal operation units in the model file and improving the security of the model file.
In a specific application scenario of this application, an operation unit for attack testing may be added to a model file to discover vulnerabilities of the machine learning model by performing an attack test on it, so that the machine learning model can be improved and its security enhanced. The details of this specific application scenario are elaborated below.
As shown in FIG. 5, the architecture of a machine learning system according to an embodiment of this application mainly includes a machine learning framework 501, third-party software libraries 502 on which the machine learning framework 501 depends, and an application program 503 running on top of the machine learning framework. The machine learning framework 501 may be, for example, TensorFlow, Caffe (Convolutional Architecture for Fast Feature Embedding, a convolutional neural network framework), Torch (a machine learning framework based on the Lua scripting language), and the like. The third-party software libraries 502 may include Protobuf (short for Protocol Buffers, a data interchange format proposed by Google), Libpng (a low-level cross-platform library written in C for reading and writing PNG files), Libgif (a low-level cross-platform library written in C for reading and writing GIF files), OpenCV (Open Source Computer Vision Library), Libjpeg (a low-level cross-platform library written in C for reading and writing JPEG files), Ffmpeg (an open-source computer program that can record and convert digital audio and video and turn them into streams), and so on. The application program 503 may include programs, data, models, and the like.
In an embodiment of this application, software fuzzing (an automated software testing technique based on defect injection) may be performed on the third-party software libraries 502 that the machine learning framework depends on, to discover existing security vulnerabilities. However, since a machine learning framework is a complex software system, the framework itself also has security issues in addition to the third-party software libraries 502. Therefore, the embodiments of this application analyze the security of the machine learning framework and propose a corresponding attack test scheme for it. The computing models of popular machine learning frameworks can be divided into graph-based stream computing models and procedural computing models similar to common computer programming languages. Most machine learning frameworks use graph-based stream computing models; a typical example is the TensorFlow model. The following uses the TensorFlow model as an example to elaborate the implementation details of the technical solutions of the embodiments of this application.
As shown in FIG. 6, in an embodiment of this application, all computations of a machine learning algorithm are represented in TensorFlow as a Graph. An Operation in the Graph represents a specific computing operation (for example, the Add operation adds two values), a Tensor represents data and can serve as the input or output of an Operation, and the specific flow of data is represented by the edges of the Graph. The flow of training a machine learning model in TensorFlow is as follows (see the sketch after this list):
1) Preparation: design the machine learning algorithm and prepare the sample data used for training, according to the actual problem to be solved;
2) Build the Graph: use the API provided by TensorFlow to build the Graph according to the algorithm;
3) Execute the Graph: call the TensorFlow API to execute the Graph; this process is the training process of the model, and the Graph runs on top of TensorFlow's Runtime (the state of a program while it is running) environment;
4) Generate the TensorFlow model: after training is completed, the parameters of the algorithm are generated, and the model is then saved in a file for subsequent deployment or optimization.
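The following is a minimal sketch of this flow with the TF 1.x-style API, building a trivial Graph, executing it, and saving it as a model file; the node and file names are illustrative:

    import tensorflow as tf

    graph = tf.Graph()
    with graph.as_default():                    # step 2: build the Graph
        a = tf.constant(1.0, name="a")          # Tensors carry the data
        b = tf.constant(2.0, name="b")
        out = tf.add(a, b, name="out")          # an Operation node of the Graph

    with tf.compat.v1.Session(graph=graph) as sess:
        print(sess.run(out))                    # step 3: executing prints 3.0

    # Step 4: save the Graph (all Operation/Tensor structures) as a model file.
    tf.io.write_graph(graph.as_graph_def(), ".", "model.pb", as_text=False)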
In an embodiment of this application, as shown in FIG. 7, the process of constructing an AI (Artificial Intelligence) system using the TensorFlow framework includes the training and deployment of a machine learning model, and specifically includes the following steps:
In step S701, the sample data and the algorithm are determined.
In step S702, a machine learning model is obtained and trained, for example, by retraining a publicly available pre-trained model as the machine learning model to be used.
In step S703, it is determined whether the trained machine learning model meets the requirements. If yes, step S704 is performed; otherwise, the process returns to step S701.
In step S704, when it is determined that the trained machine learning model meets the requirements, the machine learning model is deployed.
Based on the above flow, a systematic analysis of the training and deployment process of TensorFlow models shows that TensorFlow models are subject to the following three types of attack:
1) Data attack: during the model training phase and after the model is deployed to the production environment, the model receives external data (such as sample data). If the machine learning framework has defects in parsing and processing such data, this type of attack results; typical input data are pictures, audio, and the like.
2) Model attack: a large number of models trained by other researchers exist for the various machine learning frameworks. Framework users can retrain these models to meet their own needs or use them directly in production deployment, but these external models are untrusted data; if the framework has defects in model processing, this type of attack results.
3) Algorithm attack: extensive prior research has shown that, targeting defects in machine learning algorithms, malicious samples can be generated so that the model's judgment results do not match expectations.
In an embodiment of this application, for data attacks, the data can be detected and processed to ensure the security of the data input into the machine learning model; for algorithm attacks, the defects of the machine learning algorithm can be remedied to ensure the accuracy of the model's judgment results. The following embodiments elaborate the handling of model attacks.
In an embodiment of this application, a trained model in TensorFlow can be saved as a serialized file in which all data structures of the Graph, including Operations and Tensors, are stored. When the model file is used again, the Operations in it are interpreted and executed in TensorFlow's Runtime environment; the TensorFlow model file can therefore be considered to store executable code, which is a considerable risk point. Based on this, if a malicious Operation is inserted into a model file, the malicious Operation will be executed when others use the model file, causing unpredictable consequences.
The embodiments of this application describe this attack method using the attack test flow shown in FIG. 8, which mainly includes the following steps:
In step S801, a normal model file is generated.
In step S802, a malicious Operation is inserted into the normal model file to generate a malicious model file. Specifically, this process can be completed using a legal API provided by TensorFlow; that is, the malicious Operation is inserted into the normal model file through the API.
In step S803, the malicious model file is publicly placed on the Internet, for example on GitHub (a hosting platform for open-source and private software projects), for downloading and use by others.
In step S804, the model downloader runs the model after downloading the malicious model file. Sharing model files and using pre-trained models is a very common scenario, and users' current perception of model files is still that a machine learning model file is a data file and basically harmless, so model downloaders usually do not pay much attention to the security of model files. Meanwhile, when the model file runs in TensorFlow, no anomaly is visible to the user, and the model may still output the expected results.
In step S805, the malicious Operation is executed, which may cause consequences such as the computer being controlled and data being stolen. Specifically, the actions performed by a malicious Operation depend on the code in the malicious Operation.
The attack test method shown in FIG. 8 discovers and exploits the lack of inherent security mechanisms in the TensorFlow framework, and has the following three characteristics:
1) Wide scope of influence: this risk exists in all versions of TensorFlow, and the attack process exploits the propagation characteristics of the Internet.
2) High concealment: several key points of the attack process, such as downloading the model and running the model file, do not show any anomaly, making it difficult for users to notice; that is, the attack process is highly concealed.
3) High cost of repair: the attack exploits a basic feature of TensorFlow, namely the graph-based computing model. This feature cannot be fundamentally repaired, and other security mechanisms need to be added to defend against this type of attack.
The above attack test method discovered security vulnerabilities in the machine learning framework, which can draw the industry's attention to the security of machine learning frameworks, improve that security, and raise the security awareness of machine learning framework users. In addition, the embodiments of this application also propose corresponding coping strategies to improve the execution security of model files as much as possible, as follows:
Coping strategy 1: use the sandbox mechanism to run untrusted model files (such as TensorFlow model files) in a sandbox, so that even if a malicious Operation exists in the model file, the harm can be confined to the sandbox and will not affect the security of the user's computer and private data.
Coping strategy 2: use a model file security scanning tool that scans all Operations in the model file and issues warnings or reminders for suspicious Operations. Before using an untrusted machine learning model, users can use this scanning tool to check the model file.
Coping strategy 3: users generally regard model files as harmless data files, but the above analysis shows that a model file can be executed as a graph-based program; users' security awareness can therefore be raised through security education.
The above application scenario describes the technical solutions of the embodiments of this application using the example of adding a malicious Operation to a model file to perform an attack test. In other application scenarios of this application, other customized Operations may also be added to a model file to implement different functions; that is, users can customize the code in an Operation according to actual needs and then add it to the model file for execution.
The apparatus embodiments of this application are introduced below; they can be used to execute the machine learning model processing methods in the above embodiments of this application. For details not disclosed in the apparatus embodiments, please refer to the embodiments of the machine learning model processing methods described above.
FIG. 9A schematically shows a block diagram of a machine learning model processing apparatus according to an embodiment of this application.
As shown in FIG. 9A, a machine learning model processing apparatus 900 according to an embodiment of this application includes: a first obtaining unit 902, an adding unit 904, and a processing unit 906.
The first obtaining unit 902 is configured to obtain a model file of a machine learning model and a target operation unit to be added to the model file; the adding unit 904 is configured to add the target operation unit to the model file to obtain a processed model file; the processing unit 906 is configured to run the processed model file to execute the target operation unit during the running of the processed model file.
In an embodiment of this application, the adding unit 904 is configured to: insert the target operation unit at a set position in the model file; or replace a specified operation unit in the model file with the target operation unit.
In an embodiment of this application, the adding unit 904 is configured to: call an application programming interface in the machine learning framework corresponding to the machine learning model to add the target operation unit to the model file.
In an embodiment of this application, the processing unit 906 is configured to: load the running environment of the processed model file, and parse and execute the target operation unit in that running environment.
In an embodiment of this application, the target operation unit includes an operation unit for performing an attack test, and the processing unit 906 is further configured to: perform an attack test on the machine learning model during the execution of the target operation unit.
In an embodiment of this application, the machine learning model includes: a stream computing model based on a graph model.
In an embodiment of this application, the processing unit 906 is configured to: load the processed model file into a security sandbox to run the processed model file in the security sandbox.
In an embodiment of this application, as shown in FIG. 9B, the machine learning model processing apparatus 900 further includes: a first detection unit 908 configured to perform security detection on each operation unit contained in the processed model file to determine whether a suspicious operation unit exists in the model file; the processing unit 906 is configured to run the processed model file when the first detection unit determines that no suspicious operation unit exists in the model file.
In an embodiment of this application, the first detection unit 908 is configured to: determine, according to the application programming interface called by each operation unit, whether the called application programming interface is abnormal; and determine an operation unit whose called application programming interface is abnormal as a suspicious operation unit.
FIG. 10A schematically shows a block diagram of a machine learning model processing apparatus according to another embodiment of this application.
As shown in FIG. 10A, a machine learning model processing apparatus 1000 according to another embodiment of this application includes: a second obtaining unit 1002, a parsing unit 1004, and a second detection unit 1006.
The second obtaining unit 1002 is configured to obtain a model file of a machine learning model; the parsing unit 1004 is configured to parse the model file to obtain each operation unit contained in the model file; the second detection unit 1006 is configured to perform security detection on each operation unit to determine whether a suspicious operation unit exists in the model file.
In an embodiment of this application, as shown in FIG. 10B, the machine learning model processing apparatus 1000 further includes: a processing unit 1008 configured to issue an early warning prompt when the second detection unit 1006 determines that a suspicious operation unit exists in the model file, and/or to run the model file when the second detection unit 1006 determines that no suspicious operation unit exists in the model file.
It should be noted that although several modules or units of the device for action execution are mentioned in the above detailed description, this division is not mandatory. In fact, according to the embodiments of this application, the features and functions of two or more modules or units described above may be embodied in one module or unit; conversely, the features and functions of one module or unit described above may be further divided and embodied by multiple modules or units.
Through the description of the above embodiments, those skilled in the art will easily understand that the example embodiments described here may be implemented by software, or by software combined with the necessary hardware. Therefore, the technical solutions of the embodiments of this application may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (such as a CD-ROM, USB flash drive, or removable hard disk) or on a network, and includes several instructions to enable a computing device (such as a personal computer, a server, a touch terminal, or a network device) to execute the methods of the embodiments of this application.
Those skilled in the art will easily conceive of other embodiments of this application after considering the specification and practicing the application disclosed here. This application is intended to cover any variations, uses, or adaptations of this application that follow its general principles and include common knowledge or customary technical means in the technical field not disclosed in this application. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of this application indicated by the following claims.
It should be understood that this application is not limited to the precise structures described above and shown in the drawings, and various modifications and changes may be made without departing from its scope. The scope of this application is limited only by the appended claims.

Claims (14)

  1. A machine learning model processing method, executed by an electronic device, comprising:
    obtaining a model file of a machine learning model and a target operation unit to be added to the model file;
    calling an application programming interface in a machine learning framework corresponding to the machine learning model to add the target operation unit to the model file to obtain a processed model file, wherein the target operation unit does not affect an output result of the model file;
    running the processed model file to execute the target operation unit during the running of the processed model file.
  2. The machine learning model processing method according to claim 1, wherein the adding the target operation unit to the model file comprises:
    inserting the target operation unit at a set position in the model file; or
    replacing a specified operation unit in the model file with the target operation unit.
  3. The machine learning model processing method according to claim 1, wherein the running the processed model file to execute the target operation unit during the running of the processed model file comprises:
    loading a running environment of the processed model file, and parsing and executing the target operation unit in the running environment.
  4. The machine learning model processing method according to claim 1, wherein the target operation unit comprises an operation unit for performing an attack test;
    the processing method further comprises: performing an attack test on the machine learning model during the execution of the target operation unit.
  5. The machine learning model processing method according to claim 1, wherein the machine learning model comprises a stream computing model based on a graph model.
  6. The machine learning model processing method according to any one of claims 1 to 5, wherein the running the processed model file comprises:
    loading the processed model file into a security sandbox to run the processed model file in the security sandbox.
  7. The machine learning model processing method according to any one of claims 1 to 5, further comprising, before running the processed model file:
    performing security detection on each operation unit contained in the processed model file to determine whether a suspicious operation unit exists in the model file;
    running the processed model file if it is determined that no suspicious operation unit exists in the model file.
  8. The machine learning model processing method according to claim 7, wherein the performing security detection on each operation unit contained in the processed model file to determine whether a suspicious operation unit exists in the model file comprises:
    determining, according to the application programming interface called by each operation unit, whether the application programming interface called by each operation unit is abnormal;
    determining an operation unit whose called application programming interface is abnormal as a suspicious operation unit.
  9. A machine learning model processing method, executed by an electronic device, comprising:
    obtaining a model file of a machine learning model;
    parsing the model file to obtain each operation unit contained in the model file;
    performing security detection on each operation unit to determine whether a suspicious operation unit exists in the model file.
  10. The machine learning model processing method according to claim 9, further comprising:
    issuing an early warning prompt if it is determined that a suspicious operation unit exists in the model file; and/or
    running the model file if it is determined that no suspicious operation unit exists in the model file.
  11. A machine learning model processing apparatus, comprising:
    an obtaining unit configured to obtain a model file of a machine learning model and a target operation unit to be added to the model file;
    an adding unit configured to call an application programming interface in a machine learning framework corresponding to the machine learning model to add the target operation unit to the model file to obtain a processed model file, wherein the target operation unit does not affect an output result of the model file;
    a processing unit configured to run the processed model file to execute the target operation unit during the running of the processed model file.
  12. A machine learning model processing apparatus, comprising:
    an obtaining unit configured to obtain a model file of a machine learning model;
    a parsing unit configured to parse the model file to obtain each operation unit contained in the model file;
    a detection unit configured to perform security detection on each operation unit to determine whether a suspicious operation unit exists in the model file.
  13. A computer-readable medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the method according to any one of claims 1 to 10.
  14. An electronic device, comprising:
    one or more processors;
    a storage apparatus configured to store one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1 to 10.
PCT/CN2019/096183 2018-08-15 2019-07-16 Machine learning model processing method and apparatus, medium, and electronic device WO2020034800A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810930411.9A CN109255234B (zh) 2018-08-15 2018-08-15 Machine learning model processing method and apparatus, medium, and electronic device
CN201810930411.9 2018-08-15

Publications (1)

Publication Number Publication Date
WO2020034800A1 true WO2020034800A1 (zh) 2020-02-20

Family

ID=65050080

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/096183 WO2020034800A1 (zh) 2018-08-15 2019-07-16 Machine learning model processing method and apparatus, medium, and electronic device

Country Status (2)

Country Link
CN (1) CN109255234B (zh)
WO (1) WO2020034800A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112540835A (zh) * 2020-12-10 2021-03-23 北京奇艺世纪科技有限公司 Method and apparatus for running a hybrid machine learning model, and related device

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109255234B (zh) * 2018-08-15 2023-03-24 腾讯科技(深圳)有限公司 Machine learning model processing method and apparatus, medium, and electronic device
US11983535B2 (en) 2019-03-22 2024-05-14 Cambricon Technologies Corporation Limited Artificial intelligence computing device and related product
CN110070176A (zh) * 2019-04-18 2019-07-30 北京中科寒武纪科技有限公司 Offline model processing method and apparatus, and related products
CN111930368A (zh) * 2019-05-13 2020-11-13 阿里巴巴集团控股有限公司 Information visualization method and apparatus, storage medium, and processor
CN110308910B (zh) * 2019-05-30 2023-10-31 苏宁金融服务(上海)有限公司 Method, apparatus, and computer device for algorithm model deployment and risk monitoring
CN112183735A (zh) * 2019-07-03 2021-01-05 安徽寒武纪信息科技有限公司 Operation data generation method and apparatus, and related products
CN110968866B (zh) * 2019-11-27 2021-12-07 浙江工业大学 Defense method against adversarial attacks on deep reinforcement learning models
CN110889117B (zh) * 2019-11-28 2022-04-19 支付宝(杭州)信息技术有限公司 Model attack defense method and apparatus
CN111047049B (zh) * 2019-12-05 2023-08-11 北京小米移动软件有限公司 Method, apparatus, and medium for processing multimedia data based on a machine learning model
CN111414646B (zh) * 2020-03-20 2024-03-29 矩阵元技术(深圳)有限公司 Data processing method and apparatus for privacy protection
CN111415013B (zh) * 2020-03-20 2024-03-22 矩阵元技术(深圳)有限公司 Privacy-preserving machine learning model generation and training method, apparatus, and electronic device
WO2021184345A1 (zh) * 2020-03-20 2021-09-23 云图技术有限公司 Privacy-preserving machine learning implementation method, apparatus, device, and storage medium
CN111428880A (zh) * 2020-03-20 2020-07-17 矩阵元技术(深圳)有限公司 Privacy-preserving machine learning implementation method, apparatus, device, and storage medium
CN113570063B (zh) * 2020-04-28 2024-04-30 大唐移动通信设备有限公司 Machine learning model parameter transfer method and apparatus
CN112069508B (zh) * 2020-09-21 2023-03-21 西安交通大学 Method, system, device, and medium for locating vulnerability API parameters of machine learning frameworks

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103631708A (zh) * 2012-08-28 2014-03-12 深圳市世纪光速信息技术有限公司 Program testing method and program testing apparatus
US20160148115A1 (en) * 2014-11-26 2016-05-26 Microsoft Technology Licensing Easy deployment of machine learning models
CN106845232A (zh) * 2016-12-30 2017-06-13 北京瑞星信息技术股份有限公司 Malicious code library construction method and system
CN108347430A (zh) * 2018-01-05 2018-07-31 国网山东省电力公司济宁供电公司 Deep-learning-based network intrusion detection and vulnerability scanning method and apparatus
CN109255234A (zh) * 2018-08-15 2019-01-22 腾讯科技(深圳)有限公司 Machine learning model processing method and apparatus, medium, and electronic device

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11126720B2 (en) * 2012-09-26 2021-09-21 Bluvector, Inc. System and method for automated machine-learning, zero-day malware detection
US10176438B2 (en) * 2015-06-19 2019-01-08 Arizona Board Of Regents On Behalf Of Arizona State University Systems and methods for data driven malware task identification
US9690938B1 (en) * 2015-08-05 2017-06-27 Invincea, Inc. Methods and apparatus for machine learning based malware detection
CN106909529B (zh) * 2015-12-22 2020-12-01 阿里巴巴集团控股有限公司 Machine learning tool middleware and machine learning training method
US9928363B2 (en) * 2016-02-26 2018-03-27 Cylance Inc. Isolating data for analysis to avoid malicious attacks
CN105912500B (zh) * 2016-03-30 2017-11-14 百度在线网络技术(北京)有限公司 Machine learning model generation method and apparatus
US10339320B2 (en) * 2016-11-18 2019-07-02 International Business Machines Corporation Applying machine learning techniques to discover security impacts of application programming interfaces
US10733530B2 (en) * 2016-12-08 2020-08-04 Resurgo, Llc Machine learning model evaluation in cyber defense
CN108229686B (zh) * 2016-12-14 2022-07-05 阿里巴巴集团控股有限公司 Model training and prediction method and apparatus, electronic device, and machine learning platform
CN107491691A (zh) * 2017-08-08 2017-12-19 东北大学 Machine-learning-based security analysis system for remote forensics tools
CN108268934A (zh) * 2018-01-10 2018-07-10 北京市商汤科技开发有限公司 Deep-learning-based recommendation method and apparatus, electronic device, medium, and program
CN108255719B (zh) * 2018-01-11 2021-04-23 武汉斗鱼网络科技有限公司 Method, apparatus, and electronic device for obtaining an application dump file
CN108304720B (zh) * 2018-02-06 2020-12-11 恒安嘉新(北京)科技股份公司 Machine-learning-based Android malware detection method

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103631708A (zh) * 2012-08-28 2014-03-12 深圳市世纪光速信息技术有限公司 Program testing method and program testing apparatus
US20160148115A1 (en) * 2014-11-26 2016-05-26 Microsoft Technology Licensing Easy deployment of machine learning models
CN106845232A (zh) * 2016-12-30 2017-06-13 北京瑞星信息技术股份有限公司 Malicious code library construction method and system
CN108347430A (zh) * 2018-01-05 2018-07-31 国网山东省电力公司济宁供电公司 Deep-learning-based network intrusion detection and vulnerability scanning method and apparatus
CN109255234A (zh) * 2018-08-15 2019-01-22 腾讯科技(深圳)有限公司 Machine learning model processing method and apparatus, medium, and electronic device

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112540835A (zh) * 2020-12-10 2021-03-23 北京奇艺世纪科技有限公司 Method and apparatus for running a hybrid machine learning model, and related device
CN112540835B (zh) * 2020-12-10 2023-09-08 北京奇艺世纪科技有限公司 Method and apparatus for running a hybrid machine learning model, and related device

Also Published As

Publication number Publication date
CN109255234A (zh) 2019-01-22
CN109255234B (zh) 2023-03-24

Similar Documents

Publication Publication Date Title
WO2020034800A1 (zh) Machine learning model processing method and apparatus, medium, and electronic device
US11086619B2 (en) Code analytics and publication platform
US20230085001A1 (en) Testing and remediating compliance controls
KR20190109427A (ko) 침입 탐지를 위한 지속적인 학습
US11835987B2 (en) Methods and apparatus for finding long methods in code
US20160070911A1 (en) Rapid malware inspection of mobile applications
US9038185B2 (en) Execution of multiple execution paths
US10831892B2 (en) Web browser script monitoring
US9589134B2 (en) Remediation of security vulnerabilities in computer software
US10310956B2 (en) Techniques for web service black box testing
US8650546B2 (en) Static analysis based on observed string values during execution of a computer-based software application
US10387288B2 (en) Interactive analysis of a security specification
US20190303613A1 (en) Cognitive api policy manager
US10896252B2 (en) Composite challenge task generation and deployment
JP5700675B2 (ja) コンピュータ・プログラムのメソッドがバリデータであるかどうかを判断する方法、システム、及びコンピュータ・プログラム
US11163548B2 (en) Code registration to detect breaking API changes
JP7353346B2 (ja) ソフトウェアへの悪意あるプロセスの注入を防止するためのシステムおよび方法
US11934533B2 (en) Detection of supply chain-related security threats to software applications
US20220400121A1 (en) Performance monitoring in the anomaly detection domain for the it environment
US10761837B2 (en) Annotations in software development
CN109977669B (zh) 病毒识别方法、装置和计算机设备
JP7404223B2 (ja) 不正なメモリダンプ改変を防ぐシステムおよび方法
US11321225B2 (en) Reducing the memory load time for logic simulator by leveraging architecture simulator
US20230118939A1 (en) Risk Assessment of a Container Build
CN113034337B (zh) Image detection method and related apparatus

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19849570

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19849570

Country of ref document: EP

Kind code of ref document: A1