EP3171275A1 - Transparent process interception - Google Patents
Transparent process interception
- Publication number
- EP3171275A1 (EP16197499.3A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- function
- custom
- create
- default
- build
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45533—Hypervisors; Virtual machine monitors
- G06F9/45558—Hypervisor-specific management and integration aspects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/3604—Software analysis for verifying properties of programs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformation of program code
- G06F8/41—Compilation
- G06F8/43—Checking; Contextual analysis
- G06F8/433—Dependency analysis; Data or control flow analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/70—Software maintenance or management
- G06F8/71—Version control; Configuration management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/70—Software maintenance or management
- G06F8/75—Structural analysis for program understanding
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45533—Hypervisors; Virtual machine monitors
- G06F9/45558—Hypervisor-specific management and integration aspects
- G06F2009/45562—Creating, deleting, cloning virtual machine instances
Definitions
- This specification relates to transparent build system instrumentation, which can be used for static analysis of computer software source code.
- Static analysis refers to techniques for analyzing computer software source code without executing the source code as a computer software program.
- Source code in a code base is typically compiled in a build environment that includes a build system.
- the build environment includes an operating system; a file system; executable files, e.g., compilers; environment variables, e.g., variables that indicate a path to file system directories that contain executable files; and other configuration files for building source code in the code base.
- build systems can make arbitrary textual substitutions in existing source code files before a compiler is called to compile the modified source code.
- Build systems can also generate temporary source code files that are compiled but then deleted by the build system when compilation is complete.
- build utilities e.g., the "ant" utility on the Linux and Unix operating systems
- a build utility can copy a file from one location to another for compilation because another source code file may include or depend on the copied file. The copied file may then be deleted by the build system after compilation is complete.
- a static analysis system can instrument a build system using a custom agent, which, in some build systems, is a class that is loaded at run-time that is used to redefine the contents of another class.
- a custom agent is invoked instead, which the static analysis system can use to extract source code compiled by the build system.
- This specification describes how a static analysis system can transparently instrument a build system that uses custom agents, e.g., by intercepting calls to default symbol lookup and create VM functions of the build system. These calls can be replaced by custom functions that can modify the build environment to avoid triggering errors or diagnostic messages that would otherwise cause the build to fail.
- a static analysis system can execute custom functions to transparently instrument a build system.
- the static analysis system can provide a custom create VM function that cleanses a build environment in order to avoid triggering build system errors that would otherwise cause the build to fail.
- the static analysis system can also provide a custom symbol lookup function that ensures proper operation of a default symbol lookup function by manipulating the function call stack.
- the custom create VM function can also restore the build environment to its original state after instrumentation has occurred. Restoring the build environment provides robustness in the instrumentation by allowing the custom agent to still be loaded in situations where the interception mechanisms for custom symbol lookup or create VM functions fail.
- This specification describes a static analysis system that can transparently instrument a build system in order to extract source code compiled by the build system without the build failing.
- a "build system" is a computer system that builds source code in a code base.
- a build system can use one or more compilers and one or more other software utilities, e.g., a "make" utility, an "ant" utility, or a batch script, that coordinate compiling of source code in the code base.
- the execution environment of a build system includes the set of all functions and variables that are available for access by a particular executing software application.
- an "environment variable" is a variable defined in a particular execution environment. Each environment variable of the execution environment has a value that can be accessed by software applications executing in that execution environment.
- instrumenting a build system function means that when the function is called by the build system or a software application running on the build system, the build system invokes a different function than the function that is ordinarily invoked.
- instrumenting a build system means configuring the build system to intercept calls to one or more functions of the build system. There are a variety of techniques for instrumenting a build system, one of which includes using a custom agent.
- a "custom agent" is a class that is loaded at run-time and which redefines the contents of another software element, e.g., a class.
- a custom agent can be defined that causes calls to a compile function to be handled by a custom function instead.
- a "create VM function" is a function that a build system calls to invoke a virtual machine.
- a "virtual machine" is a software program that runs on the hardware of a particular computer system and that provides virtual resources in order to execute the software code of another software program.
- any error message, diagnostic message, or compiler warning generated by a build process is considered to be a failure condition, and the build aborts.
- a compiler writing to stderr is interpreted as a build error on the part of the build system. If the process interception mechanism is not transparent, for example if the instrumented build writes to stderr and the non-instrumented build does not, performing source code extraction can cause the build process to fail, and the static analysis system will be unable to extract all source code compiled by the build system.
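- The failure mode described above can be modeled in a few lines of Python. This is a minimal sketch, assuming a strict build policy in which a step fails if it exits non-zero or writes anything to stderr; the function name `run_strict_step` and the policy itself are illustrative assumptions, not the behavior of any particular build tool.

```python
import subprocess
import sys


def run_strict_step(command, env=None):
    """Run one build step and fail it if anything is written to stderr.

    This models a strict build system; the policy is an assumption made
    for illustration, not the behavior of any specific tool.
    """
    result = subprocess.run(command, env=env, capture_output=True, text=True)
    succeeded = result.returncode == 0 and result.stderr == ""
    return succeeded, result.stderr


# Two stand-in build steps. Both exit with status 0, but the first one
# also prints a diagnostic, which the strict policy treats as a failure.
noisy = [sys.executable, "-c", "import sys; sys.stderr.write('note: agent loaded\\n')"]
quiet = [sys.executable, "-c", "pass"]
```

- Under this policy, an instrumented step that emits even a harmless note fails while the same step without instrumentation succeeds, which is why the interception has to be transparent.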
- FIG. 1 illustrates an example system.
- the system 100 includes a user device 160 in communication with a static analysis system 102 over a network 170, which can be any appropriate communications network.
- the components of the static analysis system 102 can be implemented as computer programs installed on one or more computers in one or more locations that are coupled to each other through a network.
- the static analysis system 102 can be installed in whole or in part on a single computing device, e.g., the user device 160.
- the static analysis system 102 is in communication with a build system 130.
- the static analysis system 102 and the build system 130 can be installed on different computing devices that are in communication with one another, e.g., using the network 170, or the static analysis system 102 and the build system 130 can be installed on a same computing device.
- the build system 130 generally builds source code in a code base.
- the build system 130 includes a build utility 131 and a compiler 132.
- the build utility 131 can be the "ant" utility, for Linux and Unix systems, or the build utility 131 can be a batch script that coordinates compiling of source code in the code base 140.
- the compiler 132 can be the conventional javac compiler for Java.
- the build system 130 also includes a number of default components 133a, 134a, 135a, and 136a, which can be conventional components for launching new processes and compiling source code in the build system 130.
- the static analysis system 102 provides an interception library 116 to the build system 130.
- the interception library 116 includes custom components that respectively preempt the default components of the build system 130.
- preempting means that whenever the build system 130 makes a request to invoke one of the default components, the request is intercepted by a corresponding custom component provided by the static analysis system 102.
- the custom components need not all be stored in the same library, but rather can be supplied by the static analysis system 102 as any appropriate module.
- the static analysis system can use a variety of techniques to get processes of the build system 130 to load the interception library 116. For example, on Linux systems, the extraction utility 110 can set the LD_PRELOAD environment variable, which will cause new processes to load the interception library 116.
- a first example custom component is a custom create-process function 133b that preempts a default create-process function 133a.
- the custom create-process function 133b ensures that each new process called by the build system 130 will also load the interception library 116.
- the custom create-process function 133b can ensure that LD_PRELOAD is set appropriately.
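- As an illustration of "setting LD_PRELOAD appropriately", the following Python sketch computes the environment for a child process. It is a model of the idea only: a real custom create-process function would be native code inside the interception library, and the name `env_with_preload` is a hypothetical helper.

```python
def env_with_preload(env, library_path):
    """Return a copy of env in which LD_PRELOAD names library_path.

    Existing LD_PRELOAD entries are kept and the library is not added
    twice. Entries are split on spaces and colons, both of which the
    Linux dynamic linker accepts as separators.
    """
    child_env = dict(env)
    existing = child_env.get("LD_PRELOAD", "").replace(":", " ").split()
    if library_path not in existing:
        existing.insert(0, library_path)
    child_env["LD_PRELOAD"] = ":".join(existing)
    return child_env
```

- The result could then be passed as the `env` argument when launching the child, e.g., with `subprocess.Popen(command, env=env_with_preload(os.environ, path_to_library))`, where `path_to_library` is whatever location the interception library was installed at.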
- a second example custom component is a custom symbol lookup function 134b that preempts a default symbol lookup function 134a.
- a symbol lookup function takes the name of a function and returns the address in memory of the named function, or an error if the function is not found.
- a symbol lookup function may also take as an argument a particular library in which the symbol is defined. For example, if the build system 130 runs Linux, the custom symbol lookup function 134b can preempt the function dlsym. If the build system runs Windows, the custom symbol lookup function 134b can preempt the function GetProcAddress. The custom symbol lookup function 134b returns the custom create virtual machine (VM) function 135b when it is requested. If any other symbol is requested, the custom symbol lookup function 134b uses the default symbol lookup function 134a to locate it.
- a third example custom component is a custom create virtual machine function 135b that preempts a default create VM function 135a.
- the default create VM function 135a is used to initialize a virtual machine.
- VMs are used to host compiler processes for building source code.
- the default create VM function 135a generates error messages if particular environment variables are set, which, as mentioned above, can cause a build to fail.
- the custom create VM function 135b can preempt calls to the default create VM function 135a in order to cleanse the build environment before calling the default create VM function 135a.
- the default create VM function 135a can be JNI_CreateJavaVM, which is called to initialize a virtual machine to host the javac compiler.
- the javac compiler makes use of default compile functions 136a.
- the default compile functions 136a are preempted by a custom agent 136b.
- a custom agent is a class loaded at run-time that is used to redefine the contents of another class.
- the custom agent can redefine the contents of a class to intercept calls to the default compile functions 136a.
- the javac compiler calls a default compile function 136a to compile source code
- the call is intercepted by the custom agent 136b, which provides source code information 155 back to the static analysis system 102.
- the static analysis system 102 can use the source code information 155 to access the source code that is compiled by the build system and to store the source code in a collection of source code 150.
- a user of the user device 160 can provide an extraction command 105 to the extraction utility 110 of the static analysis system 102.
- the extraction command 105 is a request to extract precisely the source code that the build system 130 compiles.
- the extraction utility 110 provides a build command 115 to the build system 130.
- the build command 115 causes the build system 130 to execute the build utility 131.
- the request to execute the build utility causes the build system 130 to invoke a new process using the default create-process function 133a to run the build utility 131.
- the request by the build system 130 to the default create-process function 133a is intercepted by the custom create-process function 133b.
- the custom create-process function 133b ensures that the interception library will be loaded by the new process, e.g., by setting LD_PRELOAD, and then calls the default create-process function 133a to run the build utility 131. Additional techniques for preempting create-process functions are described in commonly-owned U.S. Patent Application No. 14/292,691 , for "Extracting Source Code," which is incorporated here by reference.
- the build utility 131 will build source code in the build system 130 including repeatedly invoking the compiler 132 in a new process. On each new process that is invoked, the custom create-process function 133b will intercept the request and ensure that the interception library 116 is loaded by each new process.
- Each compilation is hosted in a VM.
- the new process will call the default symbol lookup function 134a to locate the default create VM function 135a.
- the call to the default symbol lookup function 134a will be preempted by the custom symbol lookup function 134b.
- the custom symbol lookup function 134b returns an address of or a definition of the custom create VM function 135b whenever the default create VM function 135a is requested. If any other symbol was requested, the custom symbol lookup function 134b jumps to the default symbol lookup function 134a to find it, rather than directly calling the default symbol lookup function 134a. This is because the default symbol lookup function 134a often searches for the requested symbol in the next shared object in the library search order after the current library. In other words, the search begins with an object after the object that contains the currently executing call to the default symbol lookup function 134a. The object that contains the currently executing call to the default symbol lookup function is determined in the default symbol lookup function by examining the call stack to obtain the address of the calling function. By jumping to the default symbol lookup function 134a, the custom symbol lookup function 134b ensures that the address of the custom symbol lookup function 134b is not pushed onto the function call stack, which could make the default symbol lookup function return the wrong value.
- the custom create VM function 135b uses the default create VM function to create a VM with a custom agent 136b.
- the custom agent 136b then intercepts calls to the default compile functions 136a to obtain the source code information 155, which the custom agent 136b passes back to the static analysis system 102.
- the static analysis system 102 can then access the source code and store the source code in a collection of source code 150 or pass the source code itself 157 back to the user device 160.
- the custom create VM function 135b first cleanses the environment of environment variables that can cause a build to fail.
- One example environment variable that can cause a build to fail in some Java build environments is JAVA_TOOL_OPTIONS, which can be used to specify the custom agent 136b.
- some default create VM functions emit a warning when the JAVA_TOOL_OPTIONS environment variable is set, a warning that the build system 130 may interpret as a failure condition.
- the custom create VM function 135b can remove JAVA_TOOL_OPTIONS from the environment and can specify the custom agent 136b in other ways.
- the custom create VM function 135b can specify the custom agent 136b as an argument to the default create VM function 135a, which typically does not cause the build to fail.
- the custom create VM function 135b or some other custom component can attach the custom agent 136b after the VM is started.
- the custom agent 136b can call the default compile functions 136a so that the build proceeds normally and so that the instrumentation of the build process is transparent from the perspective of the build system 130.
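- The record-then-forward behavior of the custom agent can be sketched as a wrapper around a compile function. This is a Python model under stated assumptions: an actual custom agent is a class loaded into the VM, and `instrument_compile`, `fake_compile`, and the `report` callback are hypothetical names introduced only for illustration.

```python
def instrument_compile(default_compile, report):
    """Wrap default_compile so each call is reported, then forwarded."""

    def custom_compile(args):
        report(list(args))            # hand the source code information out
        return default_compile(args)  # then let the build proceed unchanged

    return custom_compile


# Hypothetical stand-in for a compiler entry point: 0 means success.
def fake_compile(args):
    return 0


seen = []
compile_fn = instrument_compile(fake_compile, seen.append)
status = compile_fn(["-d", "out", "src/Main.java"])
```

- Because the wrapper returns exactly what the default function returns, the caller cannot tell from the result that the call was intercepted.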
- FIG. 2 is a flow diagram of an example method for cleansing a build environment using a custom create VM function. The process will be described as being performed by an appropriately programmed system of one or more computers.
- the system sets an environment variable specifying a custom agent (202).
- an environment variable can specify a custom agent that should be loaded.
- the environment variable can specify the location of a Java Archive (JAR) file containing a custom java agent.
- the custom agent replaces or extends the functionality of a given set of methods whenever those methods are invoked.
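- Step 202 can be sketched as follows, assuming the agent is requested through JAVA_TOOL_OPTIONS with the JVM's `-javaagent:<jar path>` option. The helper name `set_agent_option` is hypothetical, and the whitespace-based splitting is a simplification that does not handle JAR paths containing spaces.

```python
def set_agent_option(env, agent_jar, variable="JAVA_TOOL_OPTIONS"):
    """Return a copy of env whose tool-options variable requests agent_jar.

    Any options already present in the variable are preserved, and the
    agent option is not added a second time.
    """
    updated = dict(env)
    option = "-javaagent:" + agent_jar
    existing = updated.get(variable, "").split()
    if option not in existing:
        existing.append(option)
    updated[variable] = " ".join(existing)
    return updated
```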
- the system intercepts calls to the default create VM function (204).
- the system can use any appropriate technique for preempting calls to a default create VM function, e.g., overloading or overriding the default create VM function.
- the system uses a custom symbol lookup function to preempt the default create VM function. This is described in more detail below with reference to FIG. 3 .
- the custom create VM function removes the environment variable from the execution environment before calling the default create VM function (206).
- the custom create VM function removes environment variables that may cause a build system to emit a warning or an error message that can cause the build to fail, e.g., JAVA_TOOL_OPTIONS.
- the custom create VM function removes only environment variables that have been set by the static analysis system.
- the system typically does not modify environment variables that are relied upon or set by the build system itself.
- the custom create VM function invokes a VM having a custom agent using the default create VM function (208).
- the custom create VM function specifies the custom agent by an argument rather than by the removed environment variable.
- the custom create VM function can append an argument specifying the custom agent to an argument list for the default create VM function.
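- Steps 206 and 208 can be modeled together. The sketch below is a Python stand-in, not the native implementation: `env` is a plain dictionary representing the process environment, `fake_default_create_vm` is a hypothetical substitute for the default create VM function, and the variable is assumed to have been set by the static analysis system rather than by the build itself.

```python
AGENT_VARIABLE = "JAVA_TOOL_OPTIONS"


def custom_create_vm(default_create_vm, env, vm_args, agent_jar):
    """Model of a custom create VM function.

    env is a mutable mapping standing in for the process environment and
    default_create_vm is a callable standing in for the real function.
    """
    env.pop(AGENT_VARIABLE, None)                       # step 206: cleanse
    args = list(vm_args) + ["-javaagent:" + agent_jar]  # agent by argument
    return default_create_vm(args)                      # step 208: create VM


# Hypothetical default function that records what it was called with.
env = {"JAVA_TOOL_OPTIONS": "-javaagent:/tmp/agent.jar", "PATH": "/bin"}
calls = []


def fake_default_create_vm(args):
    calls.append((list(args), dict(env)))
    return 0


status = custom_create_vm(fake_default_create_vm, env, ["-Xmx1g"], "/tmp/agent.jar")
```

- In this model the default function sees the agent in its argument list and never sees the environment variable, so it has no reason to emit a warning about it.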
- the system can transparently instrument the build process and thereby gain access to source code that is built by the build system without the build system halting the build process by raising an error or warning about unexpected environment variables.
- This allows a static analysis system to transparently perform static analysis on a build system that raises errors or warnings about such unexpected environment variables.
- the custom agent can, for example, receive all the command line arguments that would have been passed to the default compile functions of the build system, e.g., command line arguments that specify locations of source code files. The custom agent can then access the source code files and extract the source code from the source code files.
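- The extraction step can be sketched as a filter over the compiler's command line arguments. This is a simplified Python model with hypothetical helper names; it assumes source files are identified by a `.java` suffix and does not expand javac `@argfiles` or handle option values that happen to end in `.java`.

```python
from pathlib import Path


def source_paths(compiler_args, suffix=".java"):
    """Return the arguments that name source files, in order."""
    return [Path(arg) for arg in compiler_args if arg.endswith(suffix)]


def extract_sources(compiler_args):
    """Map each existing source file named in the arguments to its text."""
    return {
        str(path): path.read_text(encoding="utf-8")
        for path in source_paths(compiler_args)
        if path.is_file()
    }
```

- Reading the files at interception time matters because, as noted earlier, a build system may delete temporary or copied source files once compilation is complete.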
- the system can restore the execution environment by resetting the environment variable that specifies the custom agent, e.g., JAVA_TOOL_OPTIONS.
- Restoring the JAVA_TOOL_OPTIONS environment variable provides a level of robustness in the build system instrumentation. For example, if the interception of the default create VM function should fail for future process invocations, e.g., if the build system is a distributed system having machines on which the interception library cannot be installed, the build instrument simply falls back to using JAVA_TOOL_OPTIONS to specify a custom agent with which to extract source code on those machines.
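- The remove-then-restore pattern can be expressed as a context manager. This is a minimal Python sketch with a hypothetical name, `variable_removed`; it operates on `os.environ` by default but accepts any mutable mapping so that it can be exercised without touching the real environment.

```python
import os
from contextlib import contextmanager


@contextmanager
def variable_removed(name, env=os.environ):
    """Remove env[name] for the duration of the block, then restore it."""
    missing = object()
    saved = env.pop(name, missing)
    try:
        yield
    finally:
        if saved is not missing:
            env[name] = saved


env = {"JAVA_TOOL_OPTIONS": "-javaagent:/tmp/agent.jar"}
with variable_removed("JAVA_TOOL_OPTIONS", env):
    during = dict(env)  # what the default create VM function would see
```

- Restoring in a `finally` clause means the variable comes back even if VM creation raises, so later processes that are not intercepted can still fall back to it.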
- FIG. 3 is a flow diagram of an example method for cleansing a build environment using a custom symbol lookup function. The process will be described as being performed by an appropriately programmed system of one or more computers.
- the system sets an environment variable that, when encountered by a default create VM function, causes the default create VM function to load a custom agent specified by the environment variable (step 302).
- the environment variable can be JAVA_TOOL_OPTIONS, set in the execution environment with a "-javaagent" argument, as described above with reference to FIG. 1 .
- the system intercepts, by a custom symbol lookup function, a request by the build system for an identifier of a function (step 304).
- the custom symbol lookup function can overload the default symbol lookup function. This causes the custom symbol lookup function to execute instead of the default symbol lookup function when the system calls the default symbol lookup function.
- the system determines whether the requested function is the default create VM function (step 306). That is, the custom symbol lookup function can determine whether the string argument identifies the default create VM function.
- if the requested function is not the default create VM function, the system can simply provide an identifier of the requested function (branch to 308). For example, the system can use the default symbol lookup function to find the requested function. The custom symbol lookup function can then provide the requested function to the requesting process.
- Some symbol lookup functions, including dlsym, determine which library to search by examining the function call stack. For example, if dlsym is called with the handle argument RTLD_NEXT, dlsym searches for the requested function in the shared objects that follow the current shared object in the search order.
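- The effect of the caller's identity on such a search can be shown with a toy model. The Python sketch below is not how dlsym is implemented; it only imitates a search that starts after the calling object. The library names, their contents, and the function `lookup_next` are all hypothetical.

```python
# Shared objects in search order, each mapping symbol names to a label.
# The library names and contents are hypothetical.
SEARCH_ORDER = [
    ("libintercept.so", {"dlsym": "custom dlsym"}),
    ("libA.so", {"foo": "foo from libA"}),
    ("libB.so", {"foo": "foo from libB"}),
]


def lookup_next(symbol, calling_object):
    """Model of an RTLD_NEXT search: start after the calling object."""
    names = [name for name, _ in SEARCH_ORDER]
    start = names.index(calling_object) + 1
    for _, symbols in SEARCH_ORDER[start:]:
        if symbol in symbols:
            return symbols[symbol]
    return None


# Code in libA.so asks for the next definition of foo.
expected = lookup_next("foo", "libA.so")               # "foo from libB"
# If a wrapper in libintercept.so called the default function normally,
# the wrapper's own shared object would appear to be the caller.
via_wrapper_call = lookup_next("foo", "libintercept.so")  # "foo from libA"
```

- The two results differ, which is the hazard a custom symbol lookup function avoids by keeping its own address off the call stack when it defers to the default function.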
- If the requested function is the default create VM function, the custom symbol lookup function generates a reference to a custom create VM function (branch to 310). For example, the custom symbol lookup function can return an address at which the custom create VM function is loaded in memory.
- the custom symbol lookup function can return a function closure that defines the custom create VM function.
- the custom create VM function, when executed, causes the build system to launch a VM having a custom agent without the environment variable being set in the execution environment.
- the system then provides the reference to the custom create VM function (312). Providing the reference to the custom create VM function thus causes the build system to launch a VM having a custom agent without the execution environment having the environment variable being set.
- the custom create VM function can remove the environment variable from the execution environment and specify the custom agent in another way, e.g., by specifying the custom agent in a list of arguments to the default create VM function.
- the following is an example segment of pseudo code for a custom symbol lookup function to return a function closure for a custom create VM function if the requested identifier is for the default create VM function.
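- The pseudo code itself is not reproduced in this text. As a stand-in, the following Python sketch illustrates the same idea under stated assumptions: `make_custom_lookup` and the symbol table are hypothetical, the default create VM function is assumed to be named JNI_CreateJavaVM, and an ordinary Python call takes the place of the jump to the default symbol lookup function described above.

```python
def make_custom_lookup(default_lookup, agent_jar, create_vm_symbol="JNI_CreateJavaVM"):
    """Build a lookup function that substitutes a custom create VM closure."""

    def custom_lookup(symbol):
        if symbol != create_vm_symbol:
            return default_lookup(symbol)  # any other symbol: defer
        default_create_vm = default_lookup(symbol)

        def custom_create_vm(vm_args):
            # The closure captures default_create_vm and agent_jar.
            return default_create_vm(list(vm_args) + ["-javaagent:" + agent_jar])

        return custom_create_vm

    return custom_lookup


# Hypothetical symbol table standing in for the loaded libraries.
table = {
    "JNI_CreateJavaVM": lambda vm_args: ("vm", vm_args),
    "puts": lambda text: len(text),
}
lookup = make_custom_lookup(table.get, "/tmp/agent.jar")
```

- A requester that looks up JNI_CreateJavaVM receives the closure and calls it exactly as it would call the default function, while every other lookup passes through unchanged.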
- the custom agent can gain access to the source code that is compiled by a compiler of the build system without the build system halting the build process by raising an error or warning about unexpected environment variables. This allows a static analysis system to transparently perform static analysis on a build system that raises errors or warnings about such unexpected environment variables.
- the system can then restore the execution environment by adding the environment variable back to the execution environment.
- Embodiments of the subject matter and the functional operations described in this specification can be implemented in digital electronic circuitry, in tangibly-embodied computer software or firmware, in computer hardware, including the structures disclosed in this specification and their structural equivalents, or in combinations of one or more of them.
- Embodiments of the subject matter described in this specification can be implemented as one or more computer programs, i.e., one or more modules of computer program instructions encoded on a tangible non-transitory program carrier for execution by, or to control the operation of, data processing apparatus.
- the program instructions can be encoded on an artificially generated propagated signal, e.g., a machine generated electrical, optical, or electromagnetic signal, that is generated to encode information for transmission to suitable receiver apparatus for execution by a data processing apparatus.
- the computer storage medium can be a machine-readable storage device, a machine-readable storage substrate, a random or serial access memory device, or a combination of one or more of them.
- data processing apparatus encompasses all kinds of apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or computers.
- the apparatus can include special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application specific integrated circuit).
- the apparatus can also include, in addition to hardware, code that creates an execution environment for the computer program in question, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of them.
- a computer program (which may also be referred to or described as a program, software, a software application, a module, a software module, a script, or code) can be written in any form of programming language, including compiled or interpreted languages, or declarative or procedural languages, and it can be deployed in any form, including as a standalone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
- a computer program may, but need not, correspond to a file in a file system.
- a program can be stored in a portion of a file that holds other programs or data, e.g., one or more scripts stored in a markup language document, in a single file dedicated to the program in question, or in multiple coordinated files, e.g., files that store one or more modules, sub programs, or portions of code.
- a computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.
- the processes and logic flows described in this specification can be performed by one or more programmable computers executing one or more computer programs to perform functions by operating on input data and generating output.
- the processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application specific integrated circuit).
- Computers suitable for the execution of a computer program can be based, by way of example, on general or special purpose microprocessors or both, or on any other kind of central processing unit.
- a central processing unit will receive instructions and data from a read only memory or a random access memory or both.
- the essential elements of a computer are a central processing unit for performing or executing instructions and one or more memory devices for storing instructions and data.
- a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto optical disks, or optical disks.
- a computer need not have such devices.
- a computer can be embedded in another device, e.g., a mobile telephone, a personal digital assistant (PDA), a mobile audio or video player, a game console, a Global Positioning System (GPS) receiver, or a portable storage device, e.g., a universal serial bus (USB) flash drive, to name just a few.
- Computer readable media suitable for storing computer program instructions and data include all forms of nonvolatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto optical disks; and CD ROM and DVD-ROM disks.
- the processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.
- a computer having a display device, e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor, for displaying information to the user and a keyboard and a pointing device, e.g., a mouse or a trackball, by which the user can send input to the computer.
- Other kinds of devices can be used to provide for interaction with a user as well; for example, feedback provided to the user can be any form of sensory feedback, e.g., visual feedback, auditory feedback, or tactile feedback; and input from the user can be received in any form, including acoustic, speech, or tactile input.
- a computer can interact with a user by sending documents to and receiving documents from a device that is used by the user; for example, by sending web pages to a
- Embodiments of the subject matter described in this specification can be implemented in a computing system that includes a back end component, e.g., as a data server, or that includes a middleware component, e.g., an application server, or that includes a front end component, e.g., a client computer having a graphical user interface or a Web browser through which a user can interact with an implementation of the subject matter described in this specification, or any combination of one or more such back end, middleware, or front end components.
- the components of the system can be interconnected by any form or medium of digital data communication, e.g., a communication network. Examples of communication networks include a local area network (“LAN”) and a wide area network (“WAN”), e.g., the Internet.
- the computing system can include clients and servers.
- a client and server are generally remote from each other and typically interact through a communication network.
- the relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
Abstract
Description
- This specification relates to transparent build system instrumentation, which can be used for static analysis of computer software source code.
- Static analysis refers to techniques for analyzing computer software source code without executing the source code as a computer software program.
- Source code in a code base is typically compiled in a build environment that includes a build system. The build environment includes an operating system; a file system; executable files, e.g., compilers; environment variables, e.g., variables that indicate a path to file system directories that contain executable files; and other configuration files for building source code in the code base.
- Many build systems can make arbitrary textual substitutions in existing source code files before a compiler is called to compile the modified source code. Build systems can also generate temporary source code files that are compiled but then deleted by the build system when compilation is complete.
- In addition, build utilities, e.g., the "ant" utility on the Linux and Unix operating systems, can be programmed to copy source code files from one place to another during the build process. For example, a build utility can copy a file from one location to another for compilation because another source code file may include or depend on the copied file. The copied file may then be deleted by the build system after compilation is complete.
- In these situations, merely having read access to the source code files in a file system is insufficient for a static analysis system to extract all the source code that is built by a build system.
- To extract the source code that is built by a build system, a static analysis system can instrument a build system using a custom agent, which, in some build systems, is a class that is loaded at run-time and that is used to redefine the contents of another class. Thus, when the build system calls default compile functions, the custom agent is invoked instead, which the static analysis system can use to extract source code compiled by the build system. This specification describes how a static analysis system can transparently instrument a build system that uses custom agents, e.g., by intercepting calls to default symbol lookup and create VM functions of the build system. These calls can be replaced by custom functions that can modify the build environment to avoid triggering errors or diagnostic messages that would otherwise cause the build to fail.
- The foregoing and other embodiments can each optionally include one or more of the following features, alone or in combination. In particular, one embodiment may include all the following features in combination.
- Particular embodiments of the subject matter described in this specification can be implemented so as to realize one or more of the following advantages. When building code, a static analysis system can execute custom functions to transparently instrument a build system. The static analysis system can provide a custom create VM function that cleanses a build environment in order to avoid triggering build system errors that would otherwise cause the build to fail. The static analysis system can also provide a custom symbol lookup function that ensures proper operation of a default symbol lookup function by manipulating the function call stack. The custom create VM function can also restore the build environment to its original state after instrumentation has occurred. Restoring the build environment provides robustness in the instrumentation by allowing the custom agent to still be loaded in situations where the interception mechanisms for custom symbol lookup or create VM functions fail.
- The details of one or more embodiments of the subject matter of this specification are set forth in the accompanying drawings and the description below. Other features, aspects, and advantages of the subject matter will become apparent from the description, the drawings, and the claims.
- FIG. 1 illustrates an example system.
- FIG. 2 is a flow diagram of an example method for cleansing a build environment using a custom create VM function.
- FIG. 3 is a flow diagram of an example method for cleansing a build environment using a custom symbol lookup function.
- Like reference numbers and designations in the various drawings indicate like elements.
- This specification describes a static analysis system that can transparently instrument a build system in order to extract source code compiled by the build system without the build failing.
- In general, all terms used in this description have the meaning a skilled person will commonly understand.
- As used in this description, a "build system" is a computer system that builds source code in a code base. A build system can use one or more compilers and one or more other software utilities, e.g., a "make" utility, an "ant" utility, or a batch script, that coordinates compiling of source code in the code base.
- As used in this description, the "execution environment" of a build system includes the set of all functions and variables that are available for access by a particular executing software application.
- As used in this description, an "environment variable" is a variable defined in a particular execution environment. Each environment variable of the execution environment has a value that can be accessed by software applications executing in that execution environment.
- As used in this description, "intercepting" a build system function means that when the function is called by the build system or a software application running on the build system, the build system invokes a different function than the function that is ordinarily invoked. As used in this description, "instrumenting" a build system means configuring the build system to intercept calls to one or more functions of the build system. There are a variety of techniques for instrumenting a build system, one of which includes using a custom agent.
- As used in this description, a "custom agent" is a class that is loaded at run-time and which redefines the contents of another software element, e.g., a class. For example, a custom agent can be defined that causes calls to a compile function to be handled by a custom function instead.
- As used in this description, a "create VM function" is a function that a build system calls to invoke a virtual machine.
- As used in this description, a "virtual machine" is a software program that runs on the hardware of a particular computer system and that provides virtual resources in order to execute the software code of another software program.
- As used in this description, "transparently instrumenting" a build process means that a static analysis system intercepts calls to particular build system functions in a way that does not disrupt the build process.
- For many build systems, any error message, diagnostic message, or compiler warning generated by a build process is considered to be a failure condition, and the build aborts. For example, in some build systems a compiler writing to stderr is interpreted as a build error on the part of the build system. If the process interception mechanism is not transparent, for example if the instrumented build writes to stderr and the non-instrumented build does not, performing source code extraction can cause the build process to fail, and the static analysis system will be unable to extract all source code compiled by the build system.
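- The failure mode described above can be illustrated with a minimal Python sketch. It assumes a hypothetical build driver, `run_build_step`, that treats any output on stderr as a failure; the two "compiler" commands are stand-ins invented for illustration and are not part of any real build system.

```python
import subprocess
import sys


def run_build_step(cmd):
    """Run one build step; report failure on a nonzero exit or any stderr output."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    failed = proc.returncode != 0 or proc.stderr != ""
    return not failed


# Hypothetical stand-ins for a compiler invocation: one is silent on stderr,
# the other writes a diagnostic the way a non-transparent instrumentation might.
quiet_step = [sys.executable, "-c", "print('compiled')"]
noisy_step = [
    sys.executable,
    "-c",
    "import sys; print('compiled'); sys.stderr.write('note: agent loaded\\n')",
]

if __name__ == "__main__":
    print("quiet step ok:", run_build_step(quiet_step))
    print("noisy step ok:", run_build_step(noisy_step))
```

- Both commands exit with status 0, so only the extra stderr line distinguishes them. Under this assumed policy, that one line is enough to abort the build, which is why the instrumentation must avoid producing it.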
- FIG. 1 illustrates an example system. The system 100 includes a user device 160 in communication with a static analysis system 102 over a network 170, which can be any appropriate communications network. The components of the static analysis system 102 can be implemented as computer programs installed on one or more computers in one or more locations that are coupled to each other through a network. Alternatively, the static analysis system 102 can be installed in whole or in part on a single computing device, e.g., the user device 160.
- The static analysis system 102 is in communication with a build system 130. The static analysis system 102 and the build system 130 can be installed on different computing devices that are in communication with one another, e.g., using the network 170, or the static analysis system 102 and the build system 130 can be installed on a same computing device.
- The build system 130 generally builds source code in a code base. The build system 130 includes a build utility 131 and a compiler 132. For example, the build utility 131 can be the "ant" utility, for Linux and Unix systems, or the build utility 131 can be a batch script that coordinates compiling of source code in the code base 140. The compiler 132 can be the conventional javac compiler for Java.
- The build system 130 also includes a number of default components.
- The static analysis system 102 provides an interception library 116 to the build system 130. The interception library 116 includes custom components that respectively preempt the default components of the build system 130. In this context, "preempting" means that whenever the build system 130 makes a request to invoke one of the default components, the request is intercepted by a corresponding custom component provided by the static analysis system 102. The custom components need not all be stored in the same library, but rather can be supplied by the static analysis system 102 as any appropriate module. The static analysis system 102 can use a variety of techniques to get processes of the build system 130 to load the interception library 116. For example, on Linux systems, the extraction utility 110 can set the LD_PRELOAD environment variable, which will cause new processes to load the interception library 116.
- A first example custom component is a custom create-process function 133b that preempts a default create-process function 133a. Thus, whenever the build system 130 calls the default create-process function 133a to generate a new process, the build system 130 will actually call the custom create-process function 133b. The custom create-process function 133b ensures that each new process called by the build system 130 will also load the interception library 116. For example, on Linux systems, the custom create-process function 133b can ensure that LD_PRELOAD is set appropriately.
- A second example custom component is a custom symbol lookup function 134b that preempts a default symbol lookup function 134a. A symbol lookup function takes the name of a function and returns the address in memory of the named function, or an error if the function is not found. A symbol lookup function may also take as an argument a particular library in which the symbol is defined. For example, if the build system 130 runs Linux, the custom symbol lookup function 134b can preempt the function dlsym. If the build system runs Windows, the custom symbol lookup function 134b can preempt the function GetProcAddress. The custom symbol lookup function 134b returns the custom create virtual machine (VM) function 135b when it is requested. If any other symbol is requested, the custom symbol lookup function 134b uses the default symbol lookup function 134a to locate it.
- A third example custom component is a custom create VM function 135b that preempts a default create VM function 135a. The default create VM function 135a is used to initialize a virtual machine. In some build systems, VMs are used to host compiler processes for building source code. In some situations, the default create VM function 135a generates error messages if particular environment variables are set, which, as mentioned above, can cause a build to fail. Thus, the custom create VM function 135b can preempt calls to the default create VM function 135a in order to cleanse the build environment before calling the default create VM function 135a. For example, the default create VM function 135a can be JNI_CreateJavaVM, which is called to initialize a virtual machine to host the javac compiler.
- To compile source code, the javac compiler makes use of default compile functions 136a. However, the default compile functions 136a are preempted by a custom agent 136b. In some implementations, a custom agent is a class loaded at run-time that is used to redefine the contents of another class. In particular, the custom agent can redefine the contents of a class to intercept calls to the default compile functions 136a. Thus, when the javac compiler calls a default compile function 136a to compile source code, the call is intercepted by the custom agent 136b, which provides source code information 155 back to the static analysis system 102. For example, the static analysis system 102 can use the source code information 155 to access the source code that is compiled by the build system and to store the source code in a collection of source code 150.
- In operation, a user of the user device 160 can provide an extraction command 105 to the extraction utility 110 of the static analysis system 102. The extraction command 105 is a request to extract precisely the source code that the build system 130 compiles.
- The extraction utility 110 provides a build command 115 to the build system 130. The build command 115 causes the build system 130 to execute the build utility 131.
- The request to execute the build utility causes the build system 130 to invoke a new process using the default create-process function 133a to run the build utility 131. The request by the build system 130 to the default create-process function 133a is intercepted by the custom create-process function 133b. The custom create-process function 133b ensures that the interception library will be loaded by the new process, e.g., by setting LD_PRELOAD, and then calls the default create-process function 133a to run the build utility 131. Additional techniques for preempting create-process functions are described in commonly-owned U.S. Patent Application No. 14/292,691, for "Extracting Source Code," which is incorporated here by reference.
- The build utility 131 will build source code in the build system 130, including repeatedly invoking the compiler 132 in a new process. On each new process that is invoked, the custom create-process function 133b will intercept the request and ensure that the interception library 116 is loaded by each new process.
- Each compilation is hosted in a VM. To launch a VM to host the compilations, the new process will call the default symbol lookup function 134a to locate the default create VM function 135a. The call to the default symbol lookup function 134a will be preempted by the custom symbol lookup function 134b.
- The custom symbol lookup function 134b returns an address of or a definition of the custom create VM function 135b whenever the default create VM function 135a is requested. If any other symbol is requested, the custom symbol lookup function 134b jumps to the default symbol lookup function 134a to find it, rather than directly calling the default symbol lookup function 134a. This is because the default symbol lookup function 134a often searches for the requested symbol in the next shared object in the library search order after the current library. In other words, the search begins with an object after the object that contains the currently executing call to the default symbol lookup function 134a. The object that contains the currently executing call to the default symbol lookup function is determined in the default symbol lookup function by examining the call stack to obtain the address of the calling function. By jumping to the default symbol lookup function 134a, the custom symbol lookup function 134b ensures that the address of the custom symbol lookup function 134b is not pushed onto the function call stack, which could make the default symbol lookup function return the wrong value.
- The custom create VM function 135b uses the default create VM function to create a VM with a custom agent 136b. The custom agent 136b then intercepts calls to the default compile functions 136a to obtain the source code information 155, which the custom agent 136b passes back to the static analysis system 102. The static analysis system 102 can then access the source code and store the source code in a collection of source code 150 or pass the source code itself 157 back to the user device 160.
- The custom create VM function 135b first cleanses the environment of environment variables that can cause a build to fail. One example environment variable that can cause a build to fail in some Java build environments is JAVA_TOOL_OPTIONS, which is an environment variable that can be used to specify the custom agent 136b. However, some default create VM functions emit a warning when the JAVA_TOOL_OPTIONS environment variable is set, a warning that the build system 130 may interpret as a failure condition.
- Thus, when the custom agent 136b is specified by JAVA_TOOL_OPTIONS, the custom create VM function 135b can remove JAVA_TOOL_OPTIONS from the environment and can specify the custom agent 136b in other ways. For example, the custom create VM function 135b can specify the custom agent 136b as an argument to the default create VM function 135a, which typically does not cause the build to fail. Alternatively or in addition, the custom create VM function 135b or some other custom component can attach the custom agent 136b after the VM is started.
- When the custom agent 136b has finished extracting the source code, the custom agent 136b can call the default compile functions 136a so that the build proceeds normally and so that the instrumentation of the build process is transparent from the perspective of the build system 130.
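- The behavior of the custom create-process function described for FIG. 1 can be sketched as follows. This is a simplified Python model, not the interception library itself: `env_with_preload`, `custom_create_process`, and the library path used in the example are hypothetical names chosen for illustration, and the "default create-process function" is passed in as an ordinary callable.

```python
def env_with_preload(env, library_path):
    """Return a copy of env whose LD_PRELOAD lists library_path exactly once."""
    new_env = dict(env)
    current = new_env.get("LD_PRELOAD", "")
    # LD_PRELOAD entries may be separated by colons or by spaces.
    entries = current.replace(":", " ").split()
    if library_path not in entries:
        entries.insert(0, library_path)
    new_env["LD_PRELOAD"] = ":".join(entries)
    return new_env


def custom_create_process(default_create_process, cmd, env, library_path):
    """Make sure the child will load the library, then delegate to the default."""
    return default_create_process(cmd, env_with_preload(env, library_path))


if __name__ == "__main__":
    # Stand-in for the default create-process function: it only reports its inputs.
    def fake_default(cmd, env):
        return cmd, env["LD_PRELOAD"]

    print(custom_create_process(
        fake_default, ["javac", "Foo.java"], {"PATH": "/usr/bin"},
        "/opt/analysis/libintercept.so"))
```

- Because the helper is idempotent, every process in a chain of nested invocations can apply it without accumulating duplicate entries.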
- FIG. 2 is a flow diagram of an example method for cleansing a build environment using a custom create VM function. The process will be described as being performed by an appropriately programmed system of one or more computers.
- The system sets an environment variable specifying a custom agent (202). In some build system environments, e.g., build systems for Java, an environment variable can specify a custom agent that should be loaded. For example, the environment variable can specify the location of a Java Archive (JAR) file containing a custom Java agent. The custom agent replaces or extends the functionality of a given set of methods whenever those methods are invoked.
- The system intercepts calls to the default create VM function (204). The system can use any appropriate technique for preempting calls to a default create VM function, e.g., overloading or overriding the default create VM function.
- In some implementations, the system uses a custom symbol lookup function to preempt the default create VM function. This is described in more detail below with reference to FIG. 3.
- The custom create VM function removes the environment variable from the execution environment before calling the default create VM function (206). In other words, the custom create VM function removes environment variables that may cause a build system to emit a warning or an error message that can cause the build to fail, e.g., JAVA_TOOL_OPTIONS. Generally these are environment variables that have been set by the static analysis system. In other words, to maintain transparency when instrumenting the build system, the system typically does not modify environment variables that are relied upon or set by the build system itself.
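- The cleansing step (206) can be sketched in Python as a pure function over an environment mapping. The names `split_tool_options` and `cleanse`, and the `custom-agent.jar` marker used to recognize agent options, are assumptions made for illustration. Dropping the variable entirely when nothing else remains in it is also a design choice of this sketch, made so that options set by the build system itself are preserved.

```python
def split_tool_options(value, agent_marker="custom-agent.jar"):
    """Partition whitespace-separated options into (agent options, other options)."""
    options = value.split()
    agent_args = [opt for opt in options if agent_marker in opt]
    other_args = [opt for opt in options if agent_marker not in opt]
    return agent_args, other_args


def cleanse(env, agent_marker="custom-agent.jar"):
    """Return (cleansed copy of env, agent options removed from JAVA_TOOL_OPTIONS)."""
    cleaned = dict(env)
    agent_args, other_args = split_tool_options(
        cleaned.get("JAVA_TOOL_OPTIONS", ""), agent_marker)
    if other_args:
        # Keep options that were not added by the instrumentation.
        cleaned["JAVA_TOOL_OPTIONS"] = " ".join(other_args)
    else:
        # Nothing else was set, so remove the variable altogether.
        cleaned.pop("JAVA_TOOL_OPTIONS", None)
    return cleaned, agent_args


if __name__ == "__main__":
    env = {"JAVA_TOOL_OPTIONS": "-javaagent:/opt/custom-agent.jar -Xmx1g"}
    print(cleanse(env))
```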
- The custom create VM function invokes a VM having a custom agent using the default create VM function (208). In some implementations, the custom create VM function specifies the custom agent by an argument rather than by the removed environment variable. For example, the custom create VM function can append an argument specifying the custom agent to an argument list for the default create VM function.
- By invoking the VM having the custom agent in the cleansed build environment, the system can transparently instrument the build process and thereby gain access to source code that is built by the build system without the build system halting the build process by raising an error or warning about unexpected environment variables. This allows a static analysis system to transparently perform static analysis on a build system that raises errors or warnings about such unexpected environment variables. Thus, the custom agent can, for example, receive all the command line arguments that would have been passed to the default compile functions of the build system, e.g., command line arguments that specify locations of source code files. The custom agent can then access the source code files and extract the source code from the source code files.
- After the custom create VM function invokes the VM having the custom agent, the system can restore the execution environment by resetting the environment variable that specifies the custom agent, e.g., JAVA_TOOL_OPTIONS.
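- The remove, call, and restore sequence described above can be sketched in Python with a context manager, so that the variable is restored even if the default create VM function raises. This is a simplified model: `variable_removed` and `custom_create_vm` are hypothetical names, the default create VM function is represented by an ordinary callable, and the sketch assumes that JAVA_TOOL_OPTIONS holds only the custom agent options.

```python
import os
from contextlib import contextmanager


@contextmanager
def variable_removed(name, environ=os.environ):
    """Temporarily remove `name` from `environ`; restore it on exit."""
    saved = environ.pop(name, None)
    try:
        yield saved
    finally:
        if saved is not None:
            environ[name] = saved


def custom_create_vm(default_create_vm, args, environ=os.environ):
    """Call the default create VM function in a cleansed environment.

    Assumes JAVA_TOOL_OPTIONS contains only agent options, which are passed
    to the default function as extra arguments instead.
    """
    with variable_removed("JAVA_TOOL_OPTIONS", environ) as saved:
        agent_args = (saved or "").split()
        return default_create_vm(args + agent_args)


if __name__ == "__main__":
    env = {"JAVA_TOOL_OPTIONS": "-javaagent:/opt/custom-agent.jar"}
    custom_create_vm(lambda a: print("VM args:", a, "env:", dict(env)), ["-Xmx1g"], env)
    print("after:", env)
```

- Restoring in a `finally` clause mirrors the robustness argument in the text: later processes that are not intercepted still see the variable and can fall back to it.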
- Restoring the JAVA_TOOL_OPTIONS environment variable provides a level of robustness in the build system instrumentation. For example, if the interception of the default create VM function should fail for future process invocations, e.g., if the build system is a distributed system having machines on which the interception library cannot be installed, the build instrumentation simply falls back to using JAVA_TOOL_OPTIONS to specify a custom agent with which to extract source code on those machines.
- The following example segment of pseudo code illustrates a custom create VM function for Java:
JNI_CreateJavaVM(x, y, argv) {
    // Split all JAVA_TOOL_OPTIONS args by whitespace
    let JTO = split(getenv("JAVA_TOOL_OPTIONS"))
    // Get all custom agent arguments
    let custom_agent_args = filter JTO (\arg -> arg contains custom-agent.jar)
    // Get all other arguments
    let other_args = filter JTO (\arg -> not arg contains custom-agent.jar)
    // Reset environment to omit the custom agent variable
    setenv("JAVA_TOOL_OPTIONS", combine other_args)
    // Call the default create VM function with
    // custom agent arguments appended to argument list
    let result = dlsym(RTLD_NEXT, "JNI_CreateJavaVM")(x, y, argv ++ custom_agent_args)
    // Restore the execution environment
    setenv("JAVA_TOOL_OPTIONS", combine JTO)
    return result
}

dlsym(lib, name) {
    if lib == RTLD_NEXT:
        // *jump* to the default symbol lookup function (dlsym)
        // so the call stack is not changed
        // as it would be with a function call.
        jump get_original_dlsym()
    else
        jump dlsym_replacement
}

get_original_dlsym() {
    return _libc_dlsym(libc, "dlsym")
}

dlsym_replacement(lib, name) {
    // Use _libc_dlsym to get the default symbol lookup function
    // (dlsym) and then call it, passing the lib and name
    // arguments
    let orig_dlsym = get_original_dlsym()
    let f = orig_dlsym(lib, name)
    // Only intercept lookups of the default create VM function
    if f != null && name == "JNI_CreateJavaVM" then
        return function(x, y, argv) {
            let JTO = split(getenv("JAVA_TOOL_OPTIONS"))
            let custom_agent_args = filter JTO (\arg -> arg contains custom-agent.jar)
            let other_args = filter JTO (\arg -> not arg contains custom-agent.jar)
            setenv("JAVA_TOOL_OPTIONS", combine other_args)
            // f is the default create VM function returned by dlsym
            let result = f(x, y, argv ++ custom_agent_args)
            setenv("JAVA_TOOL_OPTIONS", combine JTO)
            return result
        }
    else
        return f
}
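- The dispatch logic of the symbol lookup pseudo code above can also be modeled as a small runnable Python sketch. It only illustrates which function is returned for which name; it does not and cannot model the jump that keeps the caller's address on the call stack, which is specific to native code. The symbol table, `make_custom_lookup`, and the wrapper factory are hypothetical stand-ins for illustration.

```python
def make_custom_lookup(default_lookup, wrap_create_vm, target="JNI_CreateJavaVM"):
    """Build a lookup function that substitutes a wrapper for one symbol only."""
    def custom_lookup(name):
        found = default_lookup(name)
        # Only the create VM symbol is replaced, and only if it actually exists.
        if found is not None and name == target:
            return wrap_create_vm(found)
        # Every other symbol, including a missing one, passes through unchanged.
        return found
    return custom_lookup


if __name__ == "__main__":
    # Stand-in symbol table; dict.get plays the role of the default lookup.
    symbols = {
        "JNI_CreateJavaVM": lambda args: ("vm", args),
        "strlen": len,
    }

    def add_agent(default_create_vm):
        # Wrapper that appends a (hypothetical) agent option before delegating.
        return lambda args: default_create_vm(args + ["-javaagent:custom-agent.jar"])

    lookup = make_custom_lookup(symbols.get, add_agent)
    print(lookup("JNI_CreateJavaVM")(["-Xmx1g"]))
    print(lookup("strlen") is len)
```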
- Embodiment 1 is a method comprising:
- setting, in an execution environment of a computer system, a first environment variable that specifies a custom agent to be executed in the execution environment;
- intercepting, by a custom create VM function, a request by the system to create a virtual machine using a default create VM function;
- removing, by the custom create VM function, the first environment variable from the execution environment; and
- executing, by the custom create VM function, the default create VM function to invoke a VM having the custom agent without the execution environment having the first environment variable being set.
- Embodiment 2 is the method of embodiment 1, wherein executing the default create VM function comprises executing the default create VM function with the custom agent specified as an argument to the default create VM function.
- Embodiment 3 is the method of embodiment 1, further comprising attaching the custom agent to the VM after the VM has been invoked.
- Embodiment 4 is the method of any one of embodiments 1-3, further comprising:
- receiving, by the custom agent, an identification of one or more source code files; and
- extracting, by the custom agent, source code of the one or more source code files.
- Embodiment 5 is the method of any one of embodiments 1-4, wherein the system generates diagnostic messages whenever the default create VM function is called with the first environment variable being set.
- Embodiment 6 is the method of any one of embodiments 1-5, wherein the default create VM function is a Java create VM function.
- Embodiment 7 is the method of any one of embodiments 1-6, further comprising resetting, by the custom create VM function, the first environment variable in the execution environment after the default create VM function has been called.
- Embodiment 8 is a method comprising:
- setting, in an execution environment of a computer system, a first environment variable that specifies a custom agent to be executed in the execution environment;
- intercepting, by a custom symbol lookup function, a request by the system to a default symbol lookup function to return an identifier of a particular function;
- determining, by the custom symbol lookup function, that the particular function is a default create VM function;
- in response to the determining, generating, by the custom symbol lookup function, a reference to a custom create VM function, wherein the custom create VM function removes the first environment variable from the execution environment before executing the default create VM function to invoke a VM having the custom agent; and
- providing, by the custom symbol lookup function, the reference to the custom create VM function in response to the request, thereby causing a build system to launch the VM having the custom agent without the execution environment having the first environment variable being set.
- Embodiment 9 is the method of embodiment 8, wherein executing the default create VM function comprises executing the default create VM function with the custom agent specified as an argument to the default create VM function.
- Embodiment 10 is the method of embodiment 8, further comprising attaching the custom agent to the VM after the VM has been invoked.
- Embodiment 11 is the method of any one of embodiments 8-10, further comprising:
- receiving, by the custom agent, an identification of one or more source code files; and
- extracting, by the custom agent, source code of the one or more source code files.
- Embodiment 12 is the method of any one of embodiments 8-11, wherein the system generates a diagnostic message whenever the default create VM function is called with the first environment variable being set.
- Embodiment 13 is the method of any one of embodiments 8-12, wherein the default create VM function is a Java default create VM function.
- Embodiment 14 is the method of any one of embodiments 8-13, wherein intercepting the request comprises overloading the default symbol lookup function with the custom symbol lookup function.
- Embodiment 15 is the method of any one of embodiments 8-14, further comprising resetting the first environment variable in the execution environment after the default create VM function is called.
- Embodiment 16 is a system comprising: one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform the method of any one of embodiments 1 to 15.
- Embodiment 17 is a computer storage medium encoded with a computer program, the program comprising instructions that are operable, when executed by data processing apparatus, to cause the data processing apparatus to perform the method of any one of embodiments 1 to 15.
- 1. A computer implemented method comprising:
- setting, in an execution environment of a computer system, a first environment variable that specifies a custom agent to be executed in the execution environment;
- intercepting, by a custom create VM function, a request by the system to create a virtual machine using a default create VM function;
- removing, by the custom create VM function, the first environment variable from the execution environment; and
- executing, by the custom create VM function, the default create VM function to invoke a VM having the custom agent without the execution environment having the first environment variable being set.
- 2. The method of embodiment 1, wherein executing the default create VM function comprises executing the default create VM function with the custom agent specified as an argument to the default create VM function.
- 3. The method of embodiment 1, further comprising attaching the custom agent to the VM after the VM has been invoked.
- 4. The method of embodiment 1, further comprising:
- receiving, by the custom agent, an identification of one or more source code files; and
- extracting, by the custom agent, source code of the one or more source code files.
- 5. The method of embodiment 1, wherein the system generates diagnostic messages whenever the default create VM function is called with the first environment variable being set.
- 6. The method of embodiment 1, wherein the default create VM function is a Java create VM function.
- 7. The method of embodiment 1, further comprising resetting, by the custom create VM function, the first environment variable in the execution environment after the default create VM function has been called.
- 8. A computer implemented method comprising:
- setting, in an execution environment of a computer system, a first environment variable that specifies a custom agent to be executed in the execution environment;
- intercepting, by a custom symbol lookup function, a request by the system to a default symbol lookup function to return an identifier of a particular function;
- determining, by the custom symbol lookup function, that the particular function is a default create VM function;
- in response to the determining, generating, by the custom symbol lookup function, a reference to a custom create VM function, wherein the custom create VM function removes the first environment variable from the execution environment before executing the default create VM function to invoke a VM having the custom agent; and
- providing, by the custom symbol lookup function, the reference to the custom create VM function in response to the request, thereby causing a build system to launch the VM having the custom agent without the execution environment having the first environment variable being set.
- 9. The method of embodiment 8, wherein executing the default create VM function comprises executing the default create VM function with the custom agent specified as an argument to the default create VM function.
- 10. The method of embodiment 8, further comprising attaching the custom agent to the VM after the VM has been invoked.
- 11. The method of embodiment 8, further comprising:
- receiving, by the custom agent, an identification of one or more source code files; and
- extracting, by the custom agent, source code of the one or more source code files.
- 12. The method of embodiment 8, wherein the system generates a diagnostic message whenever the default create VM function is called with the first environment variable being set.
- 13. The method of embodiment 8, wherein the default create VM function is a Java default create VM function.
- 14. The method of embodiment 8, wherein intercepting the request comprises overloading the default symbol lookup function with the custom symbol lookup function.
- 15. The method of embodiment 8, further comprising resetting the first environment variable in the execution environment after the default create VM function is called.
- 16. A system comprising:
- one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
- setting, in an execution environment of a computer system, a first environment variable that specifies a custom agent to be executed in the execution environment;
- intercepting, by a custom create VM function, a request by the system to create a virtual machine using a default create VM function;
- removing, by the custom create VM function, the first environment variable from the execution environment; and
- executing, by the custom create VM function, the default create VM function to invoke a VM having the custom agent without the execution environment having the first environment variable being set.
- 17. The system of embodiment 16, wherein executing the default create VM function comprises executing the default create VM function with the custom agent specified as an argument to the default create VM function.
- 18. The system of embodiment 16, wherein the operations further comprise attaching the custom agent to the VM after the VM has been invoked.
- 19. The system of embodiment 16, wherein the operations further comprise:
- receiving, by the custom agent, an identification of one or more source code files; and
- extracting, by the custom agent, source code of the one or more source code files.
- 20. The system of embodiment 16, wherein the system generates diagnostic messages whenever the default create VM function is called with the first environment variable being set.
- 21. The system of embodiment 16, wherein the default create VM function is a Java create VM function.
- 22. The system of embodiment 16, wherein the operations further comprise resetting, by the custom create VM function, the first environment variable in the execution environment after the default create VM function has been called.
- 23. A system comprising:
- one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
- setting, in an execution environment of a computer system, a first environment variable that specifies a custom agent to be executed in the execution environment;
- intercepting, by a custom symbol lookup function, a request by the system to a default symbol lookup function to return an identifier of a particular function;
- determining, by the custom symbol lookup function, that the particular function is a default create VM function;
- in response to the determining, generating, by the custom symbol lookup function, a reference to a custom create VM function, wherein the custom create VM function removes the first environment variable from the execution environment before executing the default create VM function to invoke a VM having the custom agent; and
- providing, by the custom symbol lookup function, the reference to the custom create VM function in response to the request, thereby causing a build system to launch the VM having the custom agent without the execution environment having the first environment variable being set.
- 24. The system of embodiment 23, wherein executing the default create VM function comprises executing the default create VM function with the custom agent specified as an argument to the default create VM function.
- 25. The system of embodiment 23, wherein the operations further comprise attaching the custom agent to the VM after the VM has been invoked.
- 26. The system of embodiment 23, wherein the operations further comprise:
- receiving, by the custom agent, an identification of one or more source code files; and
- extracting, by the custom agent, source code of the one or more source code files.
- 27. The system of embodiment 23, wherein the system generates a diagnostic message whenever the default create VM function is called with the first environment variable being set.
- 28. The system of embodiment 23, wherein the default create VM function is a Java default create VM function.
- 29. The system of embodiment 23, wherein intercepting the request comprises overloading the default symbol lookup function with the custom symbol lookup function.
- 30. The system of embodiment 23, wherein the operations further comprise resetting the first environment variable in the execution environment after the default create VM function is called.
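The behaviour of the custom create VM function in the embodiments above (remove the environment variable, call the default function with the agent as an argument, then reset the variable) can be sketched as follows. This is a simplified Python model under stated assumptions: `JAVA_TOOL_OPTIONS` is assumed to be the first environment variable, its value is assumed to be a single option string such as `-javaagent:/path/agent.jar`, and `make_custom_create_vm` is a hypothetical helper name.

```python
import os
from typing import Any, Callable, List, MutableMapping

# Assumption: the environment variable that specifies the custom agent.
AGENT_ENV_VAR = "JAVA_TOOL_OPTIONS"

CreateVm = Callable[[List[str]], Any]


def make_custom_create_vm(
    default_create_vm: CreateVm,
    env: MutableMapping[str, str] = os.environ,
) -> CreateVm:
    """Wrap the default create VM function with the env-var handling."""

    def custom_create_vm(options: List[str]) -> Any:
        # Remove the variable so the default function never sees it set.
        agent_option = env.pop(AGENT_ENV_VAR, None)
        try:
            vm_options = list(options)
            if agent_option is not None:
                # Pass the agent as an explicit argument instead.
                vm_options.append(agent_option)
            return default_create_vm(vm_options)
        finally:
            if agent_option is not None:
                # Reset the variable so later child processes inherit it.
                env[AGENT_ENV_VAR] = agent_option

    return custom_create_vm
```

Resetting the variable in a `finally` clause is a design choice of this sketch: the variable is restored even if the default function raises.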
Claims (13)
- 1. A computer-implemented method comprising:
- receiving a request to extract source code that is built by a build system;
- setting, in an execution environment of the build system, a first environment variable that specifies a custom agent to be executed in the execution environment of the build system;
- intercepting, by a custom create VM function, a request by the build system to create a virtual machine using a default create VM function;
- removing, by the custom create VM function, the first environment variable from the execution environment of the build system; and
- executing, by the custom create VM function, the default create VM function to invoke a VM having the custom agent without the execution environment having the first environment variable being set;
- intercepting, by the custom agent of the VM, a request by the build system to build one or more source code files; and
- extracting, by the custom agent, source code of the one or more source code files.
- 2. The method of claim 1, wherein executing the default create VM function comprises executing the default create VM function with the custom agent specified as an argument to the default create VM function.
- 3. The method of claim 1, further comprising attaching the custom agent to the VM after the VM has been invoked.
- 4. The method of claim 1, wherein the build system generates diagnostic messages whenever the default create VM function is called with the first environment variable being set.
- 5. The method of claim 1, wherein the default create VM function is a Java create VM function.
- 6. The method of claim 1, further comprising resetting, by the custom create VM function, the first environment variable in the execution environment after the default create VM function has been called.
- 7. A computer-implemented method comprising:
- receiving a request to extract source code that is built by a build system;
- setting, in an execution environment of a build system, a first environment variable that specifies a custom agent to be executed in the execution environment of the build system;
- intercepting, by a custom symbol lookup function, a request by the build system to a default symbol lookup function to return an identifier of a particular function;
- determining, by the custom symbol lookup function, that the particular function is a default create VM function;
- in response to the determining, generating, by the custom symbol lookup function, a reference to a custom create VM function, wherein the custom create VM function removes the first environment variable from the execution environment before executing the default create VM function to invoke a VM having the custom agent;
- providing, by the custom symbol lookup function, the reference to the custom create VM function in response to the request, thereby causing a build system to launch the VM having the custom agent without the execution environment having the first environment variable being set;
- intercepting, by the custom agent of the VM, a request by the build system to build one or more source code files; and
- extracting, by the custom agent, source code of the one or more source code files.
- 8. The method of claim 7, wherein executing the default create VM function comprises executing the default create VM function with the custom agent specified as an argument to the default create VM function.
- 9. The method of claim 7, further comprising attaching the custom agent to the VM after the VM has been invoked.
- 10. The method of claim 7, wherein the build system generates a diagnostic message whenever the default create VM function is called with the first environment variable being set.
- 11. The method of claim 7, wherein the default create VM function is a Java default create VM function.
- 12. The method of claim 7, wherein intercepting the request by the build system to a default symbol lookup function comprises overloading the default symbol lookup function with the custom symbol lookup function.
- 13. The method of claim 7, further comprising resetting the first environment variable in the execution environment after the default create VM function is called.
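The final two steps of the independent claims, in which the custom agent intercepts a build request and extracts the source code, might look roughly like the sketch below. It is a Python model only: the claims do not say how the agent hooks the build, so `make_extracting_build`, `read_file`, and the `extracted` dictionary are hypothetical names, and the compiler is represented by an ordinary callable.

```python
from pathlib import Path
from typing import Any, Callable, Dict, Iterable, List


def read_file(path: str) -> str:
    """Default reader: load a source file from disk as UTF-8 text."""
    return Path(path).read_text(encoding="utf-8")


def make_extracting_build(
    default_build: Callable[[List[str]], Any],
    extracted: Dict[str, str],
    read_source: Callable[[str], str] = read_file,
) -> Callable[[Iterable[str]], Any]:
    """Wrap a build function so each requested source file is recorded."""

    def build(source_paths: Iterable[str]) -> Any:
        paths = list(source_paths)
        for path in paths:
            # Extract the source text before the build consumes the file.
            extracted[path] = read_source(path)
        # Delegate to the original build so its result is unchanged.
        return default_build(paths)

    return build
```

The wrapped function returns whatever the original build returns, so from the build system's point of view the request is handled as before.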
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/947,631 US9489182B1 (en) | 2015-11-20 | 2015-11-20 | Transparent process interception |
Publications (2)
Publication Number | Publication Date |
---|---|
EP3171275A1 (en) | 2017-05-24 |
EP3171275B1 (en) | 2021-10-20 |
Family
ID=57210863
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP16197499.3A (Active) EP3171275B1 (en) | 2015-11-20 | 2016-11-07 | Transparent process interception |
Country Status (2)
Country | Link |
---|---|
US (1) | US9489182B1 (en) |
EP (1) | EP3171275B1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112685203A (en) * | 2021-03-12 | 2021-04-20 | 北京安华金和科技有限公司 | Operation acquisition method and device, storage medium and electronic equipment |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9792114B1 (en) | 2016-10-10 | 2017-10-17 | Semmle Limited | Source code element signatures |
US10810007B2 (en) | 2017-12-29 | 2020-10-20 | Microsoft Technology Licensing, Llc | Classifying system-generated code |
US10929126B2 (en) | 2019-06-06 | 2021-02-23 | International Business Machines Corporation | Intercepting and replaying interactions with transactional and database environments |
US11016762B2 (en) | 2019-06-06 | 2021-05-25 | International Business Machines Corporation | Determining caller of a module in real-time |
US11036619B2 (en) | 2019-06-06 | 2021-06-15 | International Business Machines Corporation | Bypassing execution of a module in real-time |
US10915426B2 (en) | 2019-06-06 | 2021-02-09 | International Business Machines Corporation | Intercepting and recording calls to a module in real-time |
US11074069B2 (en) | 2019-06-06 | 2021-07-27 | International Business Machines Corporation | Replaying interactions with transactional and database environments with re-arrangement |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7340726B1 (en) * | 2003-08-08 | 2008-03-04 | Coverity, Inc. | Systems and methods for performing static analysis on source code |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9286043B2 (en) * | 2013-03-15 | 2016-03-15 | Microsoft Technology Licensing, Llc | Software build optimization |
US9110737B1 (en) * | 2014-05-30 | 2015-08-18 | Semmle Limited | Extracting source code |
- 2015-11-20: US application US14/947,631, patent US9489182B1, status Active
- 2016-11-07: EP application EP16197499.3A, patent EP3171275B1, status Active
Also Published As
Publication number | Publication date |
---|---|
US9489182B1 (en) | 2016-11-08 |
EP3171275B1 (en) | 2021-10-20 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
EP3171275B1 (en) | Transparent process interception | |
US9946525B2 (en) | Extracting source code | |
US9003402B1 (en) | Method and system for injecting function calls into a virtual machine | |
CN109032706B (en) | Intelligent contract execution method, device, equipment and storage medium | |
US9684786B2 (en) | Monitoring an application in a process virtual machine | |
EP3035191A1 (en) | Identifying source code used to build executable files | |
US9378013B2 (en) | Incremental source code analysis | |
US10338952B2 (en) | Program execution without the use of bytecode modification or injection | |
US20210141645A1 (en) | Just-in-Time Containers | |
US11061695B2 (en) | Unikernel provisioning | |
US11403074B1 (en) | Systems and methods for generating interfaces for callback functions in object-oriented classes | |
US9639391B2 (en) | Scaling past the java virtual machine thread limit | |
EP3506136B1 (en) | Detecting stack cookie utilization in a binary software component using binary static analysis | |
US11656888B2 (en) | Performing an application snapshot using process virtual machine resources | |
US9430196B2 (en) | Message inlining | |
CN116685946A (en) | Reloading of updated shared libraries without stopping execution of an application | |
US11989569B2 (en) | Unikernel provisioning | |
CN115543486B (en) | Server-free computing oriented cold start delay optimization method, device and equipment |
Legal Events
Code | Title | Description |
---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE |
17P | Request for examination filed | Effective date: 20161107 |
AK | Designated contracting states | Kind code of ref document: A1 Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
AX | Request for extension of the european patent | Extension state: BA ME |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: EXAMINATION IS IN PROGRESS |
17Q | First examination report despatched | Effective date: 20190912 |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: EXAMINATION IS IN PROGRESS |
RAP1 | Party data changed (applicant data changed or rights of an application transferred) | Owner name: MICROSOFT TECHNOLOGY LICENSING, LLC |
REG | Reference to a national code | Ref country code: DE Ref legal event code: R079 Ref document number: 602016065062 Country of ref document: DE Free format text: PREVIOUS MAIN CLASS: G06F0011360000 Ipc: G06F0008710000 |
GRAP | Despatch of communication of intention to grant a patent | Free format text: ORIGINAL CODE: EPIDOSNIGR1 |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: GRANT OF PATENT IS INTENDED |
RIC1 | Information provided on ipc code assigned before grant | Ipc: G06F 9/455 20180101ALI20210416BHEP Ipc: G06F 8/41 20180101ALI20210416BHEP Ipc: G06F 8/75 20180101ALI20210416BHEP Ipc: G06F 8/71 20180101AFI20210416BHEP |
INTG | Intention to grant announced | Effective date: 20210520 |
GRAS | Grant fee paid | Free format text: ORIGINAL CODE: EPIDOSNIGR3 |
GRAA | (expected) grant | Free format text: ORIGINAL CODE: 0009210 |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: THE PATENT HAS BEEN GRANTED |
AK | Designated contracting states | Kind code of ref document: B1 Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
REG | Reference to a national code | Ref country code: GB Ref legal event code: FG4D |
REG | Reference to a national code | Ref country code: CH Ref legal event code: EP |
RAP4 | Party data changed (patent owner data changed or rights of a patent transferred) | Owner name: MICROSOFT TECHNOLOGY LICENSING, LLC |
REG | Reference to a national code | Ref country code: IE Ref legal event code: FG4D |
REG | Reference to a national code | Ref country code: DE Ref legal event code: R096 Ref document number: 602016065062 Country of ref document: DE |
REG | Reference to a national code | Ref country code: AT Ref legal event code: REF Ref document number: 1440481 Country of ref document: AT Kind code of ref document: T Effective date: 20211115 |
REG | Reference to a national code | Ref country code: LT Ref legal event code: MG9D |
REG | Reference to a national code | Ref country code: NL Ref legal event code: MP Effective date: 20211020 |
REG | Reference to a national code | Ref country code: AT Ref legal event code: MK05 Ref document number: 1440481 Country of ref document: AT Kind code of ref document: T Effective date: 20211020 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: RS Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211020 Ref country code: LT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211020 Ref country code: FI Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211020 Ref country code: BG Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20220120 Ref country code: AT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211020 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: IS Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20220220 Ref country code: SE Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211020 Ref country code: PT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20220221 Ref country code: PL Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211020 Ref country code: NO Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20220120 Ref country code: NL Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211020 Ref country code: LV Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211020 Ref country code: HR Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211020 Ref country code: GR Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20220121 Ref country code: ES Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211020 |
REG | Reference to a national code | Ref country code: CH Ref legal event code: PL |
REG | Reference to a national code | Ref country code: DE Ref legal event code: R097 Ref document number: 602016065062 Country of ref document: DE |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: SM Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211020 Ref country code: SK Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211020 Ref country code: RO Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211020 Ref country code: MC Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211020 Ref country code: LU Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20211107 Ref country code: EE Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211020 Ref country code: DK Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211020 Ref country code: CZ Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211020 Ref country code: BE Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20211130 |
REG | Reference to a national code | Ref country code: BE Ref legal event code: MM Effective date: 20211130 |
PLBE | No opposition filed within time limit | Free format text: ORIGINAL CODE: 0009261 |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: LI Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20211130 Ref country code: CH Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20211130 |
26N | No opposition filed | Effective date: 20220721 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: IE Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20211107 Ref country code: AL Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211020 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: SI Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211020 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: IT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211020 Ref country code: HU Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; INVALID AB INITIO Effective date: 20161107 |
P01 | Opt-out of the competence of the unified patent court (upc) registered | Effective date: 20230428 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: CY Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211020 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: GB Payment date: 20231019 Year of fee payment: 8 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: FR Payment date: 20231019 Year of fee payment: 8 Ref country code: DE Payment date: 20231019 Year of fee payment: 8 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: MK Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20211020 |