RU2628920C2 - Method for detecting harmful assemblies - Google Patents

Method for detecting harmful assemblies


Publication number
RU2628920C2
Authority
RU
Russia
Prior art keywords
image
assembly
machine code
code
parent assembly
Prior art date
Application number
RU2015111420A
Other languages
Russian (ru)
Other versions
RU2015111420A (en
Inventor
Дмитрий Геннадьевич Иванов
Никита Алексеевич Павлов
Дмитрий Владимирович Швецов
Михаил Александрович Горшенин
Original Assignee
Закрытое акционерное общество "Лаборатория Касперского"
Priority date
Filing date
Publication date
Application filed by Закрытое акционерное общество "Лаборатория Касперского" filed Critical Закрытое акционерное общество "Лаборатория Касперского"
Priority to RU2015111420A priority Critical patent/RU2628920C2/en
Publication of RU2015111420A publication Critical patent/RU2015111420A/en
Application granted granted Critical
Publication of RU2628920C2 publication Critical patent/RU2628920C2/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/40Transformation of program code
    • G06F8/41Compilation
    • G06F8/44Encoding
    • G06F8/447Target code generation
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/60Software deployment
    • G06F8/61Installation
    • G06F8/63Image based installation; Cloning; Build to order
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45504Abstract machines for programme code execution, e.g. Java virtual machine [JVM], interpreters, emulators
    • G06F9/45516Runtime code conversion or optimisation
    • G06F9/4552Involving translation to a different instruction set architecture, e.g. just-in-time translation in a JVM
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81Monomedia components thereof
    • H04N21/8146Monomedia components thereof involving graphical data, e.g. 3D object, 2D graphics
    • H04N21/8153Monomedia components thereof involving graphical data, e.g. 3D object, 2D graphics comprising still images, e.g. texture, background image
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/835Generation of protective data, e.g. certificates
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/84Generation or processing of descriptive data, e.g. content descriptors

Abstract

FIELD: information technology.
SUBSTANCE: a method for detecting a harmful machine code image, where a machine code image is considered harmful if the execution logic of its machine code differs from the execution logic of the CIL code of the parent assembly, comprises the following steps: a) obtaining the machine code image; b) determining the parent assembly, i.e. the assembly from which the obtained image was created; c) establishing a discrepancy between the execution logic of the machine code of the obtained image and the execution logic of the CIL code of the determined parent assembly; d) recognizing the machine code image as harmful on the basis of the established discrepancy.
EFFECT: increased device security through detection of harmful machine code images.
6 cl, 9 dwg

Description

Technical field

The invention relates to methods for detecting malicious images of machine codes on a device.

State of the art

The number of applications installed on user devices keeps growing, and the number of files those applications create is growing exponentially. Some files created during installation and operation of an application are unique, i.e. they exist in a single copy or are rarely encountered elsewhere. This makes such files very difficult to categorize without detailed analysis.

Native-code assembly images, which are part of the .NET technology, are files of this kind. .NET applications are built from a number of assemblies, where an assembly is a binary file serviced by the Common Language Runtime (CLR).

The .NET assembly includes the following elements:

- PE (Portable Executable) file header;

- CLR header;

- CIL code (Common Intermediate Language);

- metadata of types used in the assembly (classes, interfaces, structures, enumerations, delegates);

- assembly manifest;

- additional built-in resources.

The PE header indicates that the assembly can be loaded and used on Windows operating systems. The header also identifies the type of application (console application, GUI application, or code library).

The CLR header is the data that all .NET assemblies must support in order to be served in the CLR environment. The CLR header contains data, such as flags, CLR versions, an entry point (in particular, the address of the start of the Main () function) that allows the runtime to determine the layout of a managed file (a file containing managed code).
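As an illustration of how the presence of a CLR header can be detected in a PE file, the following minimal Python sketch reads the PE optional header and inspects data directory entry 14, which the PE format reserves for the CLR runtime header. The function name `has_clr_header` is hypothetical and is not part of the described method; the sketch only checks that the entry is non-empty and does not parse the header itself.

```python
import struct

CLR_DIRECTORY_INDEX = 14  # data directory slot reserved for the CLR runtime header


def has_clr_header(data: bytes) -> bool:
    """Return True if the PE bytes declare a non-empty CLR header directory."""
    try:
        if data[:2] != b"MZ":
            return False
        # e_lfanew at offset 0x3C points to the "PE\0\0" signature.
        (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
        if data[pe_offset:pe_offset + 4] != b"PE\0\0":
            return False
        # The optional header follows the signature and the 20-byte COFF header.
        optional = pe_offset + 4 + 20
        (magic,) = struct.unpack_from("<H", data, optional)
        if magic == 0x10B:        # PE32
            directories = optional + 96
        elif magic == 0x20B:      # PE32+
            directories = optional + 112
        else:
            return False
        # NumberOfRvaAndSizes sits immediately before the directory table.
        (count,) = struct.unpack_from("<I", data, directories - 4)
        if count <= CLR_DIRECTORY_INDEX:
            return False
        rva, size = struct.unpack_from(
            "<II", data, directories + 8 * CLR_DIRECTORY_INDEX
        )
        return rva != 0 and size != 0
    except struct.error:
        # Truncated or malformed input is treated as "no CLR header".
        return False
```

A file for which this check returns False contains no managed code in the sense described above and would not be considered an assembly.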

Each assembly contains a CIL code, which is processor independent intermediate code. At run time, the CIL code is compiled in real time by the JIT (just in time, i.e. dynamic compilation) compiler into instructions that meet the requirements of a particular processor.

Any assembly contains metadata that completely describes the format of types (classes) inside the assembly, as well as external types that the assembly refers to (types described in other assemblies). In the runtime environment, metadata is used to determine where in the binary file the types are located, to place types in memory, and to simplify the process of remotely calling type methods.

The assembly also contains a manifest. The manifest describes each module that is part of the assembly, the version of the assembly, and any external assemblies referenced by the current assembly. It contains all the metadata needed to specify the assembly's version requirements and identity, as well as the metadata needed to define the scope of the assembly and to resolve references to resources and classes. The following table shows the data contained in the assembly manifest. The first four elements (assembly name, version number, culture, and strong name data) make up the assembly identity.

Figure 00000001

Any .NET assembly can contain any number of embedded resources, such as application icons, image files, sound clips, or string tables.

An assembly may consist of several modules. A module is a part of an assembly: a logical collection of code or resources. The hierarchy of entities in an assembly is: assembly -> module -> type (class) -> method. A module can be internal (inside the assembly file) or external (a separate file). A module has no entry point and no version number of its own, and therefore cannot be loaded directly by the CLR. Modules can be loaded only by the main module of the assembly, for example the file that contains the assembly manifest. The module manifest contains only a list of all external assemblies. Each module has an MVID (Module Version Identifier), a unique identifier written into each module of the assembly that changes with every compilation.

In single-file assemblies, all the necessary elements (headers, CIL code, type metadata, manifest, and resources) are placed inside a single *.exe or *.dll file. FIG. 1a shows the structure of a single-file assembly.

A multi-file assembly consists of a set of .NET modules that are deployed as a single logical unit and provided with a single version number. Formally, one of these modules is called the main module and contains the assembly manifest (as well as all the necessary CIL instructions, metadata, headers and additional resources).

The manifest of the main module describes all the other modules on which the main module depends. In a particular case, the secondary modules of a multi-file assembly are given the *.netmodule extension (FIG. 1b), although the CLR does not require this. Secondary *.netmodule modules also contain CIL code and type metadata, as well as a module-level manifest that lists the external assemblies required by the module.

Like any PE file, an assembly can be signed with an X.509 digital signature located in the overlay of the PE file or in a catalog (catalog signature). In addition or instead, a StrongName signature is used: a hash generated from the contents of the assembly and the private part of an RSA key. The hash is stored in the assembly between the PE header and the metadata and makes it possible to verify that the assembly has not changed since compilation. For a single-file assembly, free bytes are reserved after the PE header at compile time. The hash of the file is then computed using the private key, and the resulting hash is written into the reserved bytes.

For multi-file assemblies the procedure differs. In addition to the hash of the main assembly file, hashes of the external modules are computed, after which the data is written to the main assembly. Modules do not have signatures of their own, and their MVIDs differ from the MVID of the main module. The following is written to the assembly manifest:

- PublicKey - the public part of the StrongName signature key;

- PublicKeyToken - hash of the public part of the StrongName signature key.
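The relationship between PublicKey and PublicKeyToken can be sketched as follows. The sketch assumes the commonly documented .NET Framework convention, under which the token is the last 8 bytes of the SHA-1 hash of the public key, written in reverse byte order; the function name `public_key_token` is hypothetical and the input is assumed to be the raw public key blob from the manifest.

```python
import hashlib


def public_key_token(public_key_blob: bytes) -> str:
    """Derive a 16-hex-digit token from a public key blob.

    Assumption: token = last 8 bytes of SHA-1(public key), byte-reversed.
    """
    digest = hashlib.sha1(public_key_blob).digest()
    return digest[-8:][::-1].hex()
```

Because the token is much shorter than the key, it is convenient for use in directory names such as those of the GAC, while the full key remains in the manifest for signature verification.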

Assemblies are divided into private and public (shared) assemblies. Private assemblies must always be located in the same directory as the client application that uses them (i.e., in the application directory) or in one of its subdirectories.

A public assembly can be used simultaneously by several applications on the same device. Public assemblies are not placed in the directory of the applications that use them. Instead, they are installed in the Global Assembly Cache (GAC). The GAC is located in several places at once:

Figure 00000002

The assembly installed in the GAC must have a strong name. The strong name is the modern .NET equivalent of globally unique identifiers (GUIDs) used in COM. Unlike COM GUIDs, which are 128-bit numbers, strong .NET names are based (in part) on two interconnected cryptographic keys called a public key and a private key.

A strong name consists of a set of related data, including:

- the assembly name (the name of the assembly without the file extension);

- the assembly version number;

- the public key value;

- an optional culture value, which may be provided to localize the application;

- a digital signature created from a hash of the assembly contents and the private key value.

To create a strong name for an assembly, a public and a private key are obtained, for example by generating them with the sn.exe utility supplied with the .NET Framework SDK. The utility produces a file that contains two different but mathematically related keys, one public and one private. The location of this file is then passed to the compiler, which writes the full public key value into the assembly manifest.

In a particular case, the compiler generates a hash based on the entire contents of the assembly (CIL code, metadata, etc.). A hash is a numeric value that is statistically unique for a given input. Therefore, if any data in the .NET assembly changes (even a single character in a string literal), the compiler produces a different hash. The resulting hash is then combined with the private key data from the file to obtain a digital signature, which is inserted into the assembly within the CLR header data. FIG. 1c shows the process of creating a strong name.

The private key data is not recorded in the manifest; it is used only to sign the contents of the assembly (together with the generated hash). Once the strong name has been created and assigned, the assembly can be installed in the GAC.

Path to an assembly in the GAC:

C:\Windows\assembly\GAC_32\KasperskyLab\2.0.0.0_b03f5f7f11d50a3a\KasperskyLab.dll, where:

C:\Windows\assembly - path to the GAC;

\GAC_32 - GAC_<processor architecture>;

\KasperskyLab - assembly name;

\2.0.0.0_b03f5f7f11d50a3a - <assembly version>_<public key token>;

\KasperskyLab.dll - <assembly name>.<extension>.
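The path layout described above can be decomposed programmatically. The following is a minimal sketch, assuming the four trailing path components appear in exactly the order listed; the function name `parse_gac_path` and the dictionary keys are hypothetical.

```python
from pathlib import PureWindowsPath


def parse_gac_path(path: str) -> dict:
    """Split a GAC assembly path into the components described above."""
    parts = PureWindowsPath(path).parts
    # Expected tail: GAC_<arch>, <assembly name>, <version>_<token>, <file name>
    arch_dir, name, version_dir, file_name = parts[-4:]
    version, _, rest = version_dir.partition("_")
    # Take the text after the last underscore, so an extra field between
    # version and token (an assumption, not shown in the example) is tolerated.
    token = rest.rpartition("_")[2]
    return {
        "architecture": arch_dir.partition("_")[2],
        "assembly": name,
        "version": version,
        "public_key_token": token,
        "file": file_name,
    }
```

Applied to the example path, the sketch yields the architecture `32`, the assembly name `KasperskyLab`, the version `2.0.0.0`, and the token `b03f5f7f11d50a3a`.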

The execution of the assembly code, in the particular case, is as follows. First of all, the PE header is analyzed to find out which process to start (32 or 64-bit). Then the selected version of the MSCorEE.dll file is loaded (C: \ Windows \ System32 \ MSCorEE.dll for 32-bit systems). An example of the assembly source code is given below.

Figure 00000003

Figure 00000004

To execute a method, for example System.Console.WriteLine("Kaspersky"), the JIT compiler converts CIL code into machine instructions (FIG. 2). For convenience, the code is shown in its source form rather than as compiled CIL.

Before the Main() function is executed, the CLR finds all declared types (classes), for example the Console type. It then identifies their methods and combines them into records within a single internal "structure" (one record for each method defined in the Console type). The records contain the addresses at which the method implementations can be found. On the first call to the WriteLine method, the JIT compiler is invoked. The JIT compiler knows the method being called and the type that defines it. It looks up the implementation of the method (the code of WriteLine(string str)) in the metadata of the corresponding assembly, compiles the IL into machine code, and stores the machine code in dynamic memory. Next, the JIT compiler returns to the internal data "structure" of the type (Console) and replaces the address of the called method with the address of the memory block containing the machine code. The Main() method then calls WriteLine(string str) a second time. Because the code has already been compiled, the call goes straight to the machine code without invoking the JIT compiler. After WriteLine(string str) has executed, control returns to the Main() method.

It follows that a function "works slowly" only on its first call, when the JIT compiler translates the CIL code into processor instructions. In all other cases the code is already in memory, optimized for the given processor. However, if another program is started in another process, the JIT compiler is invoked again for the same method.
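The record-replacement mechanism described above can be modelled in a few lines. This is a simplified sketch rather than the CLR's actual implementation: the class `MethodTable` and its members are hypothetical, and ordinary Python callables stand in for both CIL code and compiled machine code.

```python
class MethodTable:
    """Toy model of per-type method records that are patched after JIT compilation."""

    def __init__(self, il_methods):
        self.compile_count = 0
        self._il = il_methods
        # Initially every record points at a stub that triggers compilation.
        self.slots = {name: self._make_stub(name) for name in il_methods}

    def _make_stub(self, name):
        def stub(*args):
            native = self._jit_compile(name)
            self.slots[name] = native   # replace the record with the compiled code
            return native(*args)
        return stub

    def _jit_compile(self, name):
        # Stand-in for real compilation: count the call and return the callable.
        self.compile_count += 1
        return self._il[name]

    def call(self, name, *args):
        return self.slots[name](*args)
```

Calling the same method twice through `call` compiles it only once; the second call reaches the stored code directly, which mirrors the behaviour on the second WriteLine call. A separate `MethodTable` instance, like a separate process, starts with stubs again.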

The images mentioned above solve the problem of slow execution on the first call. When an assembly is loaded, its image is loaded and the machine code is executed from it. This speeds up loading and launching the application, because the JIT compiler does not need to compile anything or rebuild the data structures (records) each time: all of this is taken from the image. An image can be created for any .NET assembly, regardless of whether it is installed in the GAC. In a particular case, images are compiled with the ngen.exe utility, located at the path

Figure 00000005
When ngen.exe is run, machine code is created for the IL code of the assembly using the JIT compiler, and the result is saved to disk in the Native Image Cache (NIC). Compilation is performed on the local device and takes its hardware and software configuration into account, so an image should be used only in the environment for which it was compiled. The purpose of such images is to make managed applications more efficient: instead of JIT compilation, a ready-made assembly in machine code is loaded.

If the assembly code is used by many applications, then creating an image significantly increases the speed of launching and running the application, since the image can be used simultaneously by many applications, and the code generated on the fly by the JIT compiler is used only by the running instance of the application for which it is compiled.

The path to the compiled image is formed as follows, for example:

C:\Windows\assembly\NativeImages_v4.0.30319_32\Kaspersky\9c87f327866f53aec68d4fee40cde33d\Kaspersky.ni.dll, where

C:\Windows\assembly\NativeImages - path to the image store on the system;

v4.0.30319_32 - <version of the .NET Framework>_<processor architecture (32 or 64)>;

Kaspersky - friendly assembly name;

9c87f327866f53aec68d4fee40cde33d - application hash;

Kaspersky.ni.dll - <assembly friendly name>.ni.<extension>.

When creating an image of the assembly machine code, ngen.exe for 64-bit applications saves data about it in the registry branch

Figure 00000006
, for 32-bit applications in

Figure 00000007

If the image was installed for an assembly located in the GAC, the branch will be named like this:

Figure 00000008
Figure 00000009

If the assembly was not installed in the GAC, then like this:

Figure 00000010

Prior to Windows 8, developers always had to initiate the creation, update, and deletion of assembly images themselves, using ngen.exe (or by configuring the installer). Starting with Windows 8, Windows creates images for some assemblies automatically.

In a particular case, the Native Image Service is used to manage images. It allows developers to defer the installation, update, and removal of machine code images until the device is idle. The Native Image Service is started by the application's installer or updater, using the ngen.exe utility. The service works with a queue of requests stored in the Windows registry; each request has a priority that determines when the task is executed.

In another particular case, machine code images are created not only at the initiative of developers or administrators but also automatically by the .NET Framework, which creates an image based on monitoring of the JIT compiler. Creating an image while the application is running takes too much time, so the CLR queues these tasks and executes them when the device is idle.

The CLR uses the Assembly Binder module to find assemblies to load when the corresponding assembly is started. The CLR uses several types of binder modules; the Native Binder module is used to find images. The search for the required image is carried out in two stages: first the module finds the assembly and the image in the file system, then it checks whether the image matches the assembly. The search algorithm is shown in FIG. 3. At step 310, the assembly binder searches for the assembly in:

- the GAC, which implies that the assembly is signed; the contents of the assembly are not read;

- the application directory, in which case the assembly is opened and its metadata is read.

Next, at step 320, the image binder searches the NIC for an image corresponding to the found assembly. Whether an image was found is checked at step 330. If it was, the image binder reads the necessary data and metadata from the image at step 340 and verifies that the image is suitable by checking, among other things:

- the strong name;

- creation time (the image must be newer than the assembly);

- MVID assembly and image;

- versions of the .NET Framework;

- processor architecture;

- versions of related images (eg. mscorlib.dll image);

- etc.

If the assembly for which the image is being sought does not have a strong name, the MVID is used for verification instead. Whether the image is up to date is checked at step 350: if it is not, control is transferred to the JIT compiler at step 370; otherwise the code from the image is loaded at step 360.
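The decision made in steps 330 to 370 can be sketched as follows. This is an illustrative model only: the record types `AssemblyInfo` and `ImageInfo`, their fields, and the function names are hypothetical, and only a subset of the listed checks (identity, creation time, .NET Framework version, processor architecture) is modelled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AssemblyInfo:
    identity: str     # strong name, or MVID when the assembly has no strong name
    modified: float   # timestamp of the assembly file


@dataclass(frozen=True)
class ImageInfo:
    parent_identity: str
    created: float
    framework: str
    arch: str


def image_is_usable(assembly: AssemblyInfo, image: ImageInfo,
                    framework: str, arch: str) -> bool:
    """Check a subset of the conditions verified after step 340."""
    return (image.parent_identity == assembly.identity
            and image.created > assembly.modified   # image must be newer
            and image.framework == framework
            and image.arch == arch)


def choose_code_source(assembly: AssemblyInfo, image: Optional[ImageInfo],
                       framework: str, arch: str) -> str:
    """Return "image" (step 360) or "jit" (step 370)."""
    if image is not None and image_is_usable(assembly, image, framework, arch):
        return "image"
    return "jit"
```

Any failed check falls back to JIT compilation, so an outdated or mismatched image is never executed in this model.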

It follows that the number of images significantly exceeds the number of assemblies, and that images generated from one parent assembly may differ from device to device and from one image version to another. This greatly complicates the task of categorizing images. Methods for categorizing files are known in the prior art; for example, US 20140208426 describes a method for categorizing files using cloud services. However, no solutions to the problem of categorizing images have been found.

Disclosure of invention

The present invention is intended to detect malicious images of machine codes on a device.

The technical result of the present invention is to increase the security of the device by detecting a malicious image of machine code.

The method for detecting a malicious machine code image comprises: obtaining a machine code image; determining the parent assembly from which the obtained image was created; establishing a mismatch between the data of the obtained machine code image and the data of the determined parent assembly, where a match guarantees that the machine code image has not changed since its creation; and recognizing the machine code image as malicious on the basis of the established mismatch.

In the particular case, the data between which the mismatch is established is the CIL code of the specific parent assembly and the machine code of the machine code image.

In another particular case, the data between which the mismatch is established is at least the CIL code, machine code, type metadata, manifest, information in the PE header, information in the CLR header.

In another particular case, the mismatch is established by directly comparing the corresponding data of the parent assembly and the image of the machine code.

In the particular case, the discrepancy is established by directly comparing the corresponding data of the original image of the machine code and the resulting image of the machine code, where the original image of the machine code is guaranteed to be an unchanged image of the machine code received from the parent assembly.

In another particular case, the parent assembly, on the basis of which the resulting machine code image is created, is determined by the operating system.

Brief Description of the Drawings

The accompanying drawings are included to provide a further understanding of the invention and form part of this description, show embodiments of the invention, and together with the description serve to explain the principles of the invention.

The claimed invention is illustrated by the following drawings, in which:

FIG. 1a shows an example of a single-file assembly;

FIG. 1b shows an example of a multi-file assembly;

FIG. 1c shows a method for forming a strong name;

FIG. 2 shows a method of executing assembly code;

FIG. 3 shows a method for operating a binding module;

FIG. 4 shows a method for determining an image trust category;

FIG. 5 shows a method for creating a template;

FIG. 6 shows a method of installing images on a device;

FIG. 7 shows an example of a general purpose computer system.

Although the invention may have various modifications and alternative forms, the characteristic features shown by way of example in the drawings will be described in detail. It should be understood, however, that the description is not intended to limit the invention to a specific embodiment. On the contrary, it is intended to cover all changes and modifications that fall within the scope of the invention as defined by the appended claims.

Description of the invention

The objects and features of the present invention, and the methods for achieving them, will become apparent by reference to exemplary embodiments. However, the present invention is not limited to the exemplary embodiments disclosed below and can be embodied in various forms. The description is provided to help a person skilled in the art gain a comprehensive understanding of the invention, which is defined only by the scope of the appended claims.

FIG. 4 shows a method for categorizing images. At step 400, an image is obtained. In one particular case the image is obtained from the NIC (for example, if the image is installed on the device and used there for its intended purpose); in another particular case it is obtained from any other image store (for example, when the device serves only as storage and the images are not used on it for their intended purpose). Next, at step 410, the trust category of the image is determined. In a particular case, this is done by querying a database using, for example, the checksum of the image; in another particular case, the MVID of the image is used. Templates are also used to determine the image category; the mechanism for working with templates is described below. If the image is not known to the database, then at step 420 the parent assembly from which the image was created is determined. At least the following data, data structures, and tools are used for this:

- MVID;

- registry;

- binder module;

- strong name.

Determination by MVID is used, for example, if there is a database containing the MVIDs of the assemblies available on the device. The MVID of the image is determined, and the database that stores the assembly MVIDs is queried.

Determination of the parent assembly from registry entries is used when a registry entry is created at the time the image is created. An example of such an entry is given above.

Determination of the parent assembly by strong name is used for images created from strong-named assemblies. The components of the parent assembly's strong name are extracted from the image, the strong name is formed, and from this data the path to the parent assembly is determined, either in the GAC on the device or in a database in which assemblies are stored ordered by strong name.

The choice of the method for determining the parent assembly is carried out depending on a number of factors. Such factors, for example, are: the location of the parent assembly and the image (user device or a remote or local database), the possibility of compromising the assembly and image in the storage location, the way the assembly is named (strong name or regular), etc.
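One possible way of combining the determination methods of step 420 is sketched below. The fixed order of attempts (MVID database, registry entry, strong name) is an assumption made for the illustration; as stated above, the actual choice depends on several factors. The function name `find_parent_assembly`, the dictionary keys, and the use of plain dictionaries in place of a real database, registry, and GAC are all hypothetical.

```python
from typing import Optional


def find_parent_assembly(image: dict, mvid_db: dict,
                         registry: dict, gac: dict) -> Optional[str]:
    """Return the path of the parent assembly, or None if it cannot be determined."""
    # 1. MVID: look the image's MVID up in a database of assembly MVIDs.
    path = mvid_db.get(image.get("mvid"))
    if path:
        return path
    # 2. Registry: use the entry written when the image was created.
    path = registry.get(image.get("image_path"))
    if path:
        return path
    # 3. Strong name: map the strong name extracted from the image to a GAC path.
    return gac.get(image.get("strong_name"))
```

When none of the sources knows the image, the function returns None, and the image cannot be tied to a parent assembly by these means.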

In a particular case, after the parent assembly has been determined, the correspondence between the image and the assembly is checked at step 421. This step is carried out if there is a possibility that the image was modified without authorization (compromised) in its storage location after creation. In one particular case, the check uses the algorithm of the image binder known from the prior art. In another particular case, an image is created from the determined parent assembly (the original image) and compared directly with the analyzed image, for example byte by byte.
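A byte-by-byte comparison of the original image with the analyzed image can be sketched as follows. The function name `first_mismatch` is hypothetical; both images are assumed to have been read into memory as byte strings.

```python
from typing import Optional


def first_mismatch(original: bytes, analyzed: bytes) -> Optional[int]:
    """Return the offset of the first differing byte, or None if the images are identical."""
    for offset, (a, b) in enumerate(zip(original, analyzed)):
        if a != b:
            return offset
    if len(original) != len(analyzed):
        # One image is a strict prefix of the other.
        return min(len(original), len(analyzed))
    return None
```

A result of None means the analyzed image corresponds to the original image; any other result gives the position at which the analyzed image diverges.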

In the particular case, in order to avoid unauthorized alteration of images, image modification is allowed only to trusted processes, for example, only ngen.exe, other processes are only allowed to read data from the image.

In some cases, the template mechanism is used to determine the correspondence between the image and the parent assembly. In a particular case, if the image does not correspond to its parent assembly, the image is considered compromised (malicious). A compromised image may differ from the original image in its CIL code, machine code, type metadata, or the information contained in the CLR and PE headers.

The image, like the assembly, has a certain structure, an example of which is depicted in FIG. 5. The assembly KasperskyLab.dll and the image KasperskyLab.ni.dll both contain metadata and code; the assembly contains exclusively CIL code, whereas the image, in a particular case, additionally contains machine code and the NativeImageHeader structure. Based on the structure, metadata and code, the template KasperskyLab.dll.tmpl mentioned above is generated, which uniquely corresponds to the parent assembly and to the image created from it. To combine structure, code and metadata into a template, flexible hash technology (intelligent hash, also known as locality-sensitive hash) is used, for example. In a particular case, the template is formed as shown in FIG. 5. Data (manifest, metadata, CIL code, etc.) is extracted from the assembly. The same data and, additionally, the machine code are extracted from the image. Data that is unchanged across all possible versions of the image created from one parent assembly is processed (for example, a checksum is calculated from it) to form a hash, which is placed in the template. Data that changes from one version of the image to another, such as machine code, is also processed, and a flexible hash is formed from the result. In a particular case, a function call log, a listing of the disassembled machine code, or any other entity that reflects the execution logic of the machine code is generated for the machine code, and a flexible hash is formed from these entities; in another particular case, these entities are placed in the template directly. It should be noted that the template is formed so that it uniquely links (establishes a correspondence between) the parent assembly and the image regardless of the version of the image, which depends on the hardware and software configuration of the device.
If changes are made to the machine code of the image such that the execution logic of the image code no longer matches the execution logic of the assembly code, no correspondence between the parent assembly and the image is established on the basis of the template, and the image is recognized as not corresponding to the assembly.
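The template formation described above can be illustrated with a minimal sketch. It is not the implementation of the described method: the names stable_hash, flexible_hash, build_template and image_matches, the use of SHA-256, the representation of execution logic as a list of function names, and the 0.8 threshold are all assumptions made for illustration. In particular, a set of call-log bigrams compared by Jaccard similarity is used here as a simple stand-in for a flexible (locality-sensitive) hash.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass


def stable_hash(*blocks: bytes) -> str:
    """Checksum over data that is identical for every version of the image."""
    digest = hashlib.sha256()
    for block in blocks:
        # Length-prefix each block so that block boundaries affect the digest.
        digest.update(len(block).to_bytes(8, "big"))
        digest.update(block)
    return digest.hexdigest()


def flexible_hash(call_log: list[str], n: int = 2) -> frozenset[str]:
    """Set of n-gram shingles over a call log (stand-in for a flexible hash)."""
    if len(call_log) < n:
        return frozenset({" ".join(call_log)})
    return frozenset(
        " ".join(call_log[i:i + n]) for i in range(len(call_log) - n + 1)
    )


def similarity(a: frozenset[str], b: frozenset[str]) -> float:
    """Jaccard similarity of two shingle sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


@dataclass(frozen=True)
class Template:
    stable: str                # hash of the version-independent data
    flexible: frozenset[str]   # flexible hash of the execution logic


def build_template(manifest: bytes, metadata: bytes, cil_code: bytes,
                   call_log: list[str]) -> Template:
    """Form a template from the unchanged data and the execution logic."""
    return Template(stable_hash(manifest, metadata, cil_code),
                    flexible_hash(call_log))


def image_matches(template: Template, manifest: bytes, metadata: bytes,
                  cil_code: bytes, call_log: list[str],
                  threshold: float = 0.8) -> bool:
    """Return True if the image data corresponds to the template."""
    if stable_hash(manifest, metadata, cil_code) != template.stable:
        return False
    return similarity(template.flexible, flexible_hash(call_log)) >= threshold


if __name__ == "__main__":
    log = ["Init", "ReadConfig", "Scan", "Report", "Exit"]
    tmpl = build_template(b"manifest", b"metadata", b"cil", log)
    print(image_matches(tmpl, b"manifest", b"metadata", b"cil", log))
```

In this sketch, an image regenerated for a new device configuration still matches as long as its extracted call log is essentially unchanged, while an image whose call sequence has been altered falls below the threshold and is reported as not corresponding to the assembly.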

The following is an example of matching using a template. Suppose there is a parent assembly Kaspersky.dll, for which an image Kaspersky.ni.dll has been created on the device. A template Kaspersky.dll.tmpl is generated, which makes it possible to establish a correspondence between the parent assembly and the image. The software and hardware of the device are then updated (the operating system or the .NET Framework is updated, or the processor is replaced), so the existing version of the Kaspersky.ni.dll image becomes outdated and can no longer be used. The image is therefore updated: a new Kaspersky.ni.dll image is created that differs from the image of the previous version. Using the template, it is nevertheless determined that the updated image corresponds to the parent assembly (the execution logic of the machine code remains the same). In another case, suppose malicious software installed on the device modifies the Kaspersky.ni.dll image. Using the template, it is then determined that the image modified by the malicious software does not correspond to the parent assembly (the execution logic of the machine code differs from the logic embedded in the parent assembly).

After the parent assembly has been determined, the trust category of the assembly is established (step 430). The trust category of an assembly is the degree of trust placed in the assembly (trusted or untrusted) by the protection system of the device, for example, an anti-virus application. In one embodiment, there are two assembly categories: trusted and untrusted. Within this application, the concept of the assembly category should be distinguished from the concept of the hazard status of the assembly. The hazard status of an assembly within this application may be dangerous or safe. There are also unknown assemblies, that is, assemblies whose hazard status has not been determined. The hazard status determines the danger the assembly poses to the device on which it is installed. In a particular case, this danger consists in the possible theft of data from the device, substitution of data, or unauthorized modification of the device software during execution of the assembly code.

Trusted assemblies are assemblies that the protection system of the device considers safe. The protection system assigns a trust category to an assembly locally, in the context of the current state of the device, and on the basis of information about the assembly. In a particular case, this information is the hazard status of the assembly. The hazard status is determined using identification information of the assembly, such as its MVID, its strong name, or its checksum. To this end, a request is made to a reputation database at step 431; in one particular case the database is located on the device on which the assembly is stored, and in another particular case it is located remotely. If the assembly is known (information about it is contained in the reputation database), it already has the hazard status safe or dangerous, depending on which entry in the reputation database matches the identification information of the assembly being checked. If the identification information of the assembly is not contained in the database, the assembly is considered unknown, i.e. it has no status (the status is not defined). If the assembly has the status safe, then in a particular case the assembly is assigned the trusted category. In another particular case, the assembly category is determined on the basis of other factual and statistical information about the assembly, for example, the way the assembly was installed on the device or its membership in installation packages whose hazard status is known.
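The reputation lookup of step 431 can be sketched as follows. This is an illustrative sketch only: the reputation database is modeled as an in-memory dictionary keyed by a SHA-256 checksum, and the names HazardStatus, TrustCategory, assembly_checksum, hazard_status and category_from_status are hypothetical. Mapping the dangerous status to the untrusted category and returning no category for unknown assemblies are assumptions; the description only states that a safe assembly is, in a particular case, trusted.

```python
from __future__ import annotations

import hashlib
from enum import Enum
from typing import Optional


class HazardStatus(Enum):
    SAFE = "safe"
    DANGEROUS = "dangerous"
    UNKNOWN = "unknown"  # identification information not found in the database


class TrustCategory(Enum):
    TRUSTED = "trusted"
    UNTRUSTED = "untrusted"


def assembly_checksum(assembly_bytes: bytes) -> str:
    """Identification information of the assembly (here: a SHA-256 checksum)."""
    return hashlib.sha256(assembly_bytes).hexdigest()


def hazard_status(identifier: str,
                  reputation_db: dict[str, HazardStatus]) -> HazardStatus:
    """Query the reputation database; unknown assemblies have no status."""
    return reputation_db.get(identifier, HazardStatus.UNKNOWN)


def category_from_status(status: HazardStatus) -> Optional[TrustCategory]:
    """Derive a trust category, or None if other information is needed."""
    if status is HazardStatus.SAFE:
        return TrustCategory.TRUSTED
    if status is HazardStatus.DANGEROUS:
        return TrustCategory.UNTRUSTED  # assumption, see the lead-in
    return None
```

A result of None signals that the category has to be determined from other information, such as the digital signature or an anti-virus scan.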

In a particular case, the factual information about the assembly is information about its digital signature (for example, a StrongName signature or an X.509 signature), and the digital signature must be valid. To this end, at step 432, identification information about the digital signature of the assembly is obtained, which contains, for example, information about the manufacturer or a hash of the file or of a part of it. The signature can be located either in the assembly itself or in a catalog (catalog signature). The hazard status of the digital signature of the assembly is determined using the identification information of the signature; for this, a request is made to a reputation database, which in one particular case is located on the device on which the assembly is stored and in another particular case is located remotely. If the signature is known (information about it is contained in the reputation database), the signature already has the status safe or dangerous, depending on which entry in the reputation database matches the identification information of the signature being checked. If the identification information of the signature is not contained in the database, the signature is considered unknown, i.e. it has no status (the status is not defined). In a particular case, if the signature has the status safe, the assembly is assigned the trusted category, and if the signature has the status dangerous, the assembly is assigned the untrusted category.

Statuses are assigned to signatures in different ways: in one particular case depending on the manufacturer, in another particular case by inheritance from an installer whose signature status is known. In some cases, the status of a signature is assigned depending on the popularity of the signature; for example, the more popular the signature, the more it is trusted.

In a particular case, at step 433, the trust category is determined by an anti-virus scan of the assembly, for which various methods of detecting malicious software are used: signature-based, heuristic, statistical, etc. If the anti-virus scan finds the assembly to be safe, the assembly is assigned the trusted category. Otherwise, the assembly is recognized as untrusted.
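Steps 431 to 433 can be read as alternative sources of a verdict. The sketch below combines them into an ordered chain in which the first check that produces a verdict decides the category. The ordering, the "first verdict wins" rule and the default of untrusted when no check produces a verdict are assumptions, not statements of the description. The helpers make_lookup and toy_antivirus_scan are hypothetical stand-ins for a reputation database query and an anti-virus scan.

```python
from __future__ import annotations

from typing import Callable, Optional, Sequence

# A check returns True (safe), False (dangerous) or None (no verdict).
Check = Callable[[bytes], Optional[bool]]


def trust_category(assembly: bytes, checks: Sequence[Check]) -> str:
    """Return "trusted" or "untrusted" using the first check with a verdict."""
    for check in checks:
        verdict = check(assembly)
        if verdict is not None:
            return "trusted" if verdict else "untrusted"
    # Assumption: an assembly with no verdict at all is not trusted.
    return "untrusted"


def make_lookup(known: dict[bytes, bool]) -> Check:
    """Build a check backed by an in-memory table (stand-in for a database)."""
    return lambda assembly: known.get(assembly)


def toy_antivirus_scan(assembly: bytes) -> Optional[bool]:
    """Toy stand-in for a scan: treats a marker byte string as malicious."""
    return b"EVIL" not in assembly


if __name__ == "__main__":
    reputation = make_lookup({b"known-good": True, b"known-bad": False})
    print(trust_category(b"known-good", [reputation, toy_antivirus_scan]))
```

Keeping each check behind the same callable signature makes it straightforward to add a signature-status check between the reputation lookup and the scan.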

After the trust category of the assembly has been determined, the trust category of the image is determined at step 440. In one particular case, the image is assigned the trust category determined for the parent assembly; in another particular case, the category of the image is determined by the method described above for step 410.

When the protection system is installed on the device, it is necessary to ensure that the image store has not been modified without authorization and will not be so modified; a number of measures are applied for this purpose. FIG. 6 shows the assignment of categories to images. At step 600, access to the image store, or to at least one image, is restricted. In one particular case, the restriction is that only trusted processes, or a limited number of specific trusted processes, are allowed to modify the image, for example, only the ngen.exe process; all other processes are allowed read access only. In another particular case, the restriction consists in completely blocking write access to the store as a whole or to at least one image. At step 610, the parent assembly on the basis of which the access-restricted image was created is determined. At step 620, an update (replacement) of at least one image is started. In one particular case, the update consists in deleting the previously created image and creating a new one by means of the operating system (by running ngen.exe on the parent assembly or through the automatic image creation service); in another particular case, only part of the image data, for example the machine code, is changed, and the update is carried out by trusted processes. In the first case, the image is created anew after deletion: in one particular case immediately, and in another case after some delay, for example, when the parent assembly determined at step 610, whose image is to be updated, is launched. At step 630, the category of the parent assembly is assigned to the image.
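The sequence of steps 600 to 630 can be sketched with an in-memory model of the image store. Everything below is illustrative: ImageStore, parent_of and update_image are hypothetical names, the derivation of the parent assembly from the ".ni.dll" file name is an assumption made for brevity, and compile_fn is a stand-in for image creation by ngen.exe rather than a call to any real tool.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ImageStore:
    # Step 600: only these processes may modify images; all others may read.
    trusted_writers: frozenset[str] = frozenset({"ngen.exe"})
    images: dict[str, bytes] = field(default_factory=dict)
    categories: dict[str, str] = field(default_factory=dict)

    def read(self, process: str, image_name: str) -> bytes:
        """Read access is allowed to every process."""
        return self.images[image_name]

    def write(self, process: str, image_name: str, data: bytes) -> None:
        """Write access is allowed to trusted processes only."""
        if process not in self.trusted_writers:
            raise PermissionError(f"{process} may not modify {image_name}")
        self.images[image_name] = data


def parent_of(image_name: str) -> str:
    """Step 610 (assumed naming rule): Name.ni.dll is created from Name.dll."""
    suffix = ".ni.dll"
    if not image_name.endswith(suffix):
        raise ValueError(f"not a native image name: {image_name}")
    return image_name[: -len(suffix)] + ".dll"


def update_image(store: ImageStore, image_name: str,
                 compile_fn: Callable[[str], bytes],
                 assembly_categories: dict[str, str],
                 process: str = "ngen.exe") -> None:
    """Recreate an image and give it the category of its parent assembly."""
    parent = parent_of(image_name)                               # step 610
    store.write(process, image_name, compile_fn(parent))         # step 620
    store.categories[image_name] = assembly_categories[parent]   # step 630
```

Because the image is recreated by a trusted process while write access for all other processes stays blocked, the category inherited at step 630 remains meaningful for as long as the restriction from step 600 is in force.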

The anti-virus tool uses trust categories in its operation; for example, it deletes images that have the untrusted category, or significantly restricts their use, for example, by restricting their access to resources provided by the operating system.
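How an anti-virus tool might act on the categories can be sketched as a simple decision table. The function name apply_policy, the action labels "allow", "delete" and "restrict", and the delete_untrusted switch are hypothetical; the sketch only returns the intended action and does not touch any files.

```python
from __future__ import annotations


def apply_policy(image_categories: dict[str, str],
                 delete_untrusted: bool = True) -> dict[str, str]:
    """Map each image to an action based on its trust category."""
    actions: dict[str, str] = {}
    for name, category in image_categories.items():
        if category == "trusted":
            actions[name] = "allow"
        elif delete_untrusted:
            actions[name] = "delete"
        else:
            # Keep the image but limit its access to operating system resources.
            actions[name] = "restrict"
    return actions
```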

FIG. 7 shows an example of a general-purpose computer system, a personal computer or server 20, comprising a central processor 21, a system memory 22, and a system bus 23 that connects various system components, including the memory associated with the central processor 21. The system bus 23 is implemented as any bus structure known in the prior art, comprising in turn a memory bus or memory bus controller, a peripheral bus, and a local bus, and is capable of interacting with any other bus architecture. The system memory comprises read-only memory (ROM) 24 and random access memory (RAM) 25. The basic input/output system (BIOS) 26 contains the basic procedures that transfer information between the elements of the personal computer 20, for example, when the operating system is loaded using ROM 24.

The personal computer 20, in turn, contains a hard disk 27 for reading and writing data, a magnetic disk drive 28 for reading from and writing to removable magnetic disks 29, and an optical drive 30 for reading from and writing to removable optical disks 31, such as CD-ROM, DVD-ROM and other optical storage media. The hard disk 27, the magnetic disk drive 28 and the optical drive 30 are connected to the system bus 23 through the hard disk interface 32, the magnetic disk interface 33 and the optical drive interface 34, respectively. The drives and the associated computer storage media are non-volatile means of storing computer instructions, data structures, program modules and other data of the personal computer 20.

The present description discloses an implementation of a system that uses a hard disk 27, a removable magnetic disk 29 and a removable optical disk 31, but it should be understood that other types of computer storage media 56 capable of storing data in a computer-readable form (solid state drives, flash memory cards, digital disks, random access memory (RAM), etc.) may also be used, connected to the system bus 23 through the controller 55.

The computer 20 has a file system 36, which stores the recorded operating system 35 as well as additional software applications 37, other program modules 38 and program data 39. The user can enter commands and information into the personal computer 20 via input devices (keyboard 40, mouse 42). Other input devices (not shown) can be used: a microphone, joystick, game console, scanner, etc. Such input devices are customarily connected to the computer system 20 via a serial port 46, which in turn is connected to the system bus, but they can be connected in another way, for example, using a parallel port, a game port or a universal serial bus (USB). A monitor 47 or other type of display device is also connected to the system bus 23 via an interface such as a video adapter 48. In addition to the monitor 47, the personal computer may be equipped with other peripheral output devices (not shown), such as speakers, a printer, and the like.

The personal computer 20 is capable of operating in a networked environment, using a network connection to one or more remote computers 49. The remote computer (or computers) 49 are similar personal computers or servers that have most or all of the elements mentioned earlier in the description of the personal computer 20 shown in FIG. 7. Other devices, such as routers, network stations, peer-to-peer devices or other network nodes, may also be present in the computer network.

Network connections can form a local area network (LAN) 50 and a wide area network (WAN). Such networks are used in corporate computer networks and internal company networks and, as a rule, have access to the Internet. In LAN or WAN networks, the personal computer 20 is connected to the local area network 50 via a network adapter or network interface 51. When networks are used, the personal computer 20 may use a modem 54 or other means of communicating with a global computer network such as the Internet. The modem 54, which is an internal or external device, is connected to the system bus 23 via the serial port 46. It should be clarified that the network connections are only exemplary and do not necessarily reflect the exact network configuration, i.e. in reality there are other ways of establishing a connection between one computer and another by technical means.

In conclusion, it should be noted that the information provided in the description consists of examples that do not limit the scope of the present invention as defined by the claims. One skilled in the art will recognize that other embodiments of the present invention may exist that are consistent with its spirit and scope.

Claims (10)

1. A method for detecting a malicious image of machine code, where an image of machine code is considered malicious if the execution logic of its machine code differs from the execution logic of the CIL code of the parent assembly, in which:
a) an image of machine code is received;
b) the parent assembly is determined, where the parent assembly is the assembly on the basis of which the received image was created;
c) a mismatch is established between the execution logic of the machine code of the received image of machine code and the execution logic of the CIL code of the determined parent assembly;
d) the image of machine code is recognized as malicious on the basis of the established mismatch between the execution logic of the machine code of the received image of machine code and the execution logic of the CIL code of the determined parent assembly.
2. The method according to claim 1, in which the mismatch in the CIL code or machine code is established by directly comparing the corresponding data of the original image of machine code and the received image of machine code.
3. The method according to claim 1, in which the mismatch between the CIL code or machine code of the original image of machine code and the corresponding data of the parent assembly is established on the basis of a template.
4. The method according to claim 3, in which hashes are generated for each of the blocks of manifest data, type metadata, resources, digital signature, CIL code and machine code and are placed in the template, the template being formed so that it uniquely links the parent assembly and the image of machine code regardless of the version of the image, which depends on the hardware and software configuration of the device.
5. The method according to claim 1, in which the parent assembly on the basis of which the received image of machine code was created is determined by means of the operating system.
6. The method according to claim 1, in which, to establish the mismatch, an original image is created from said parent assembly and a direct comparison of the original image and the received image of machine code is performed, the original image of machine code being an image of machine code obtained from the parent assembly that is guaranteed to be unmodified.
RU2015111420A 2015-03-31 2015-03-31 Method for detecting harmful assemblies RU2628920C2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
RU2015111420A RU2628920C2 (en) 2015-03-31 2015-03-31 Method for detecting harmful assemblies

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
RU2015111420A RU2628920C2 (en) 2015-03-31 2015-03-31 Method for detecting harmful assemblies

Publications (2)

Publication Number Publication Date
RU2015111420A RU2015111420A (en) 2016-10-20
RU2628920C2 true RU2628920C2 (en) 2017-08-22

Family

ID=57138315

Family Applications (1)

Application Number Title Priority Date Filing Date
RU2015111420A RU2628920C2 (en) 2015-03-31 2015-03-31 Method for detecting harmful assemblies

Country Status (1)

Country Link
RU (1) RU2628920C2 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
RU2660643C1 (en) * 2017-09-29 2018-07-06 Акционерное общество "Лаборатория Касперского" System and method of detecting the harmful cil-file

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100095284A1 (en) * 2008-10-15 2010-04-15 Microsoft Corporation Caching runtime generated code
RU2439669C2 (en) * 2005-08-06 2012-01-10 Майкрософт Корпорейшн Method to prevent reverse engineering of software, unauthorised modification and data capture during performance
US20120323858A1 (en) * 2011-06-16 2012-12-20 Microsoft Corporation Light-weight validation of native images
US20140068583A1 (en) * 2012-09-05 2014-03-06 Microsoft Corporation Generating native code from intermediate language code for an application
US20140137078A1 (en) * 2012-11-14 2014-05-15 Microsoft Corporation Revertable managed execution image instrumentation
RU2013119285A (en) * 2013-04-26 2014-11-10 Закрытое акционерное общество "Лаборатория Касперского" System and method for evaluating the code harmful performance executed in the address space of a trusted process

Also Published As

Publication number Publication date
RU2015111420A (en) 2016-10-20

Legal Events

Date Code Title Description
FZ9A Application not withdrawn (correction of the notice of withdrawal)

Effective date: 20170627