RU2628920C2 - Method for detecting harmful assemblies

Info

Publication number: RU2628920C2
Application number: RU2015111420A
Authority: RU
Inventors: Дмитрий Геннадьевич Иванов; Никита Алексеевич Павлов; Дмитрий Владимирович Швецов; Михаил Александрович Горшенин
Original assignee: Закрытое акционерное общество "Лаборатория Касперского"
Priority date: 2015-03-31
Filing date: 2015-03-31
Publication date: 2017-08-22
Also published as: RU2015111420A

Abstract

FIELD: information technology.

SUBSTANCE: a method for detecting a malicious machine code image, where a machine code image is considered malicious if the execution logic of its machine code differs from the execution logic of the CIL code of its parent assembly, comprises the following steps: a) obtaining the machine code image; b) determining the parent assembly, where the parent assembly is the assembly from which the obtained image was created; c) establishing a discrepancy between the execution logic of the machine code of the obtained image and the execution logic of the CIL code of the determined parent assembly; d) recognizing the machine code image as malicious on the basis of the established discrepancy.

EFFECT: increased device security through detection of malicious machine code images.

6 cl, 9 dwg

Description

Technical field

The invention relates to methods for detecting malicious machine code images on a device.

State of the art

The number of applications installed on user devices keeps growing, and the number of files created by those applications grows exponentially. Some files created by applications during installation and operation are unique, i.e. they exist in a single copy, or are rare. This makes it substantially harder to categorize such files without detailed analysis.

Such files include machine code images of assemblies (native images), which are part of the .NET technology. .NET applications are built by combining a number of assemblies, where an assembly is a binary file serviced by the Common Language Runtime (CLR).

A .NET assembly includes the following elements:

- PE (Portable Executable) file header;

- CLR header;

- CIL (Common Intermediate Language) code;

- metadata of the types used in the assembly (classes, interfaces, structures, enumerations, delegates);

- assembly manifest;

- additional embedded resources.

The PE header indicates that the assembly can be loaded and used on operating systems of the Windows family. The header also identifies the type of application (console application, application with a graphical user interface, or code library).
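As an illustration of how the application type can be read from the PE header, here is a minimal Python sketch. The field offsets (e_lfanew at 0x3C, the Characteristics flag IMAGE_FILE_DLL, the Subsystem field of the optional header) follow the published PE/COFF layout rather than this text; `describe_pe` and `minimal_pe` are hypothetical helper names, and `minimal_pe` builds a synthetic header for demonstration only, not a loadable file.

```python
import struct

# Subsystem values defined by the PE/COFF format.
SUBSYSTEMS = {2: "GUI application", 3: "console application"}
IMAGE_FILE_DLL = 0x2000  # COFF Characteristics flag marking a library


def describe_pe(data: bytes) -> dict:
    """Read a few PE header fields from the raw bytes of a file."""
    if data[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ signature")
    # The 4-byte value at 0x3C (e_lfanew) points to the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")
    coff = pe_offset + 4
    (characteristics,) = struct.unpack_from("<H", data, coff + 18)
    optional = coff + 20  # the COFF header is 20 bytes long
    (magic,) = struct.unpack_from("<H", data, optional)
    (subsystem,) = struct.unpack_from("<H", data, optional + 68)
    if characteristics & IMAGE_FILE_DLL:
        kind = "code library"
    else:
        kind = SUBSYSTEMS.get(subsystem, f"subsystem {subsystem}")
    # Magic 0x20B marks a PE32+ (64-bit) optional header, 0x10B a PE32 one.
    return {"kind": kind, "pe32_plus": magic == 0x20B}


def minimal_pe(subsystem: int, dll: bool = False) -> bytes:
    """Build a tiny synthetic header for demonstration (hypothetical helper)."""
    data = bytearray(256)
    data[0:2] = b"MZ"
    struct.pack_into("<I", data, 0x3C, 0x40)          # e_lfanew
    data[0x40:0x44] = b"PE\0\0"
    struct.pack_into("<H", data, 0x44 + 18, IMAGE_FILE_DLL if dll else 0)
    struct.pack_into("<H", data, 0x58, 0x10B)         # PE32 magic
    struct.pack_into("<H", data, 0x58 + 68, subsystem)
    return bytes(data)


if __name__ == "__main__":
    print(describe_pe(minimal_pe(3)))
    print(describe_pe(minimal_pe(2, dll=True)))
```

On a real assembly the same function could be applied to the bytes read from the *.exe or *.dll file; the sketch does not parse the CLR header.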

The CLR header is data that every .NET assembly must carry in order to be serviced by the CLR. The CLR header contains data such as flags, the CLR version, and the entry point (in a particular case, the address of the start of the Main() function), which allow the runtime to determine the layout of the managed file (a file containing managed code).

Every assembly contains CIL code, which is processor-independent intermediate code. At run time, the JIT (just-in-time, i.e. dynamic compilation) compiler compiles the CIL code on the fly into instructions that meet the requirements of the specific processor.

Every assembly contains metadata that fully describes the format of the types (classes) inside the assembly, as well as of the external types the assembly refers to (types described in other assemblies). The runtime uses metadata to determine where types are located inside the binary file, to lay out types in memory, and to simplify remote invocation of type methods.

An assembly also contains a manifest. The manifest describes each module that is part of the assembly, the version of the assembly, and any external assemblies referenced by the current assembly. The assembly manifest contains all the metadata needed to specify the assembly's version requirements and its identity, as well as all the metadata needed to define the assembly's scope and to resolve references to resources and classes. The following table shows the data contained in the assembly manifest. The first four elements (assembly name, version number, language and regional settings, and strong name data) make up the assembly identity.

Any .NET assembly can contain any number of embedded resources, such as application icons, image files, sound clips, or string tables.

An assembly may consist of several modules. A module is a part of an assembly, a logical collection of code or resources. The hierarchy of entities used in an assembly is: assembly -> module -> type (class) -> method. A module can be internal (inside the assembly file) or external (a separate file). A module has no entry point and no individual version number, and therefore cannot be loaded directly by the CLR. Modules can be loaded only by the main module of the assembly, for example the file that contains the assembly manifest. A module manifest contains only a list of all external assemblies. Each module has an MVID (Module Version Identifier), a unique identifier written into each module of the assembly that changes with every compilation.

In single-file assemblies, all the necessary elements (headers, CIL code, type metadata, manifest, and resources) are placed inside a single *.exe or *.dll file. Fig. 1a shows the structure of a single-file assembly.

A multi-file assembly consists of a set of .NET modules that are deployed as a single logical unit and share a single version number. Formally, one of these modules is called the main module and contains the assembly manifest (as well as all the necessary CIL instructions, metadata, headers, and additional resources).

The manifest of the main module describes all the other related modules on which the main module depends. In a particular case, secondary modules in a multi-file assembly are given the *.netmodule extension (Fig. 1b); the CLR does not require this. Secondary *.netmodule modules also contain CIL code and type metadata, as well as a module-level manifest that lists the external assemblies the module requires.

Like any PE file, an assembly can be signed with an X.509 digital signature located in the overlay of the PE file or in a catalog (catalog signature). In addition, or instead, a StrongName signature is used: a hash generated from the contents of the assembly and the private part of an RSA key. The hash is located in the assembly between the PE header and the metadata and makes it possible to verify that the assembly has not changed since compilation. For a single-file assembly, free bytes are reserved after the PE header when the file is compiled. The hash of the file is then computed using the private key, and the resulting hash is written into those bytes.

For multi-file assemblies the technique differs. In addition to the hash of the main assembly file, hashes of the external modules are computed, after which the data is written into the main assembly. Modules do not have signatures of their own, and their MVIDs differ from that of the main module. The following is written to the assembly manifest:

- PublicKey - the public part of the StrongName signature;

- PublicKeyToken - a hash of the public part of the StrongName signature key.
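The text says only that PublicKeyToken is a hash of the public key. As a hedged illustration, the Python sketch below uses the derivation commonly documented for the .NET Framework (the last eight bytes of the SHA-1 hash of the public key, in reverse order); that derivation is an assumption added here and is not stated in this text. The function name `public_key_token` and the sample key bytes are hypothetical.

```python
import hashlib


def public_key_token(public_key_blob: bytes) -> str:
    """Return a 16-hex-digit token derived from a public key blob.

    Assumed derivation: SHA-1 of the blob, last 8 bytes, byte order reversed.
    """
    digest = hashlib.sha1(public_key_blob).digest()
    return digest[-8:][::-1].hex()


if __name__ == "__main__":
    # Placeholder bytes standing in for a real public key blob.
    sample_key = bytes(range(32))
    print(public_key_token(sample_key))
```

The short token is convenient for paths and references, while the full PublicKey value remains available in the manifest for signature verification.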

Assemblies are divided into private and public (shared) assemblies. Private assemblies must always be located in the same directory as the client application that uses them (i.e. in the application directory) or in one of its subdirectories.

A public assembly can be used by several applications on the same device at the same time. Public assemblies are not placed in the same directory as the applications that use them. Instead, they are installed in the Global Assembly Cache (GAC). The GAC is located in several places at once:

An assembly installed in the GAC must have a strong name. A strong name is the modern .NET equivalent of the globally unique identifiers (GUIDs) used in COM. Unlike COM GUID values, which are 128-bit numbers, .NET strong names are based (in part) on two related cryptographic keys, called the public key and the private key.

A strong name consists of a set of related data, including:

- the assembly name (the name of the assembly without the file extension);

- the assembly version number;

- the public key value;

- a value identifying the region, which is optional and may be provided to localize the application;

- a digital signature created using a hash computed from the contents of the assembly and the private key value.

To create a strong name for an assembly, a public and a private key are obtained, for example by generating the key data with the sn.exe utility supplied with the .NET Framework SDK. This utility generates a file containing data for two different but mathematically related keys, the public key and the private key. The location of this file is then given to the compiler, which writes the full value of the public key into the assembly manifest.

In a particular case, the compiler generates a hash from the entire contents of the assembly (CIL code, metadata, and so on). A hash is a numeric value that is statistically unique for a given input. Consequently, if any data of the .NET assembly changes (even a single character in a string literal), the compiler produces a different hash. The resulting hash is then combined with the private key data contained in the file to obtain a digital signature, which is inserted into the assembly inside the CLR header data. Fig. 1c shows the process of creating a strong name.
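The sensitivity of the hash to a single changed character can be shown with a short Python sketch. SHA-1 is used here as an assumption about the hash function, the byte strings are made-up stand-ins for assembly contents, and the RSA signing step is left out because the Python standard library has no RSA implementation.

```python
import hashlib


def content_hash(assembly_bytes: bytes) -> str:
    """Hash the whole content; SHA-1 is an assumption, not taken from the text."""
    return hashlib.sha1(assembly_bytes).hexdigest()


# Made-up stand-ins for the bytes of an assembly before and after a change
# of one character in a string literal.
original = b'ldstr "Kaspersky"\ncall System.Console::WriteLine'
modified = b'ldstr "Kaspersky!"\ncall System.Console::WriteLine'

if __name__ == "__main__":
    print(content_hash(original))
    print(content_hash(modified))
    print("hashes differ:", content_hash(original) != content_hash(modified))
```

A verifier that recomputes the hash and compares it with the signed value would therefore notice any modification made after signing.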

The private key data is not recorded in the manifest; it serves only to certify the contents of the assembly with a digital signature (together with the generated hash). Once the strong name has been created and assigned, the assembly can be installed in the GAC.

Path to an assembly in the GAC:

C:\Windows\assembly\GAC_32\KasperskyLab\2.0.0.0_b03f5f7f11d50a3a\KasperskyLab.dll, where:

C:\Windows\assembly - path to the GAC;

\GAC_32 - GAC_<processor architecture>;

\KasperskyLab - assembly name;

\2.0.0.0_b03f5f7f11d50a3a - <assembly version>_<public key token>;

\KasperskyLab.dll - <assembly name>.<extension>.
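The path layout described above can be taken apart mechanically. The following Python sketch assumes exactly the layout of the example path; `GacLocation` and `parse_gac_path` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from pathlib import PureWindowsPath


@dataclass(frozen=True)
class GacLocation:
    gac_root: str
    architecture: str
    assembly_name: str
    version: str
    public_key_token: str
    file_name: str


def parse_gac_path(path: str) -> GacLocation:
    """Split a GAC path of the form shown in the text into its components."""
    parts = PureWindowsPath(path).parts
    if len(parts) < 5:
        raise ValueError("path is too short to be a GAC assembly path")
    gac_dir, name, version_dir, file_name = parts[-4:]
    if not gac_dir.startswith("GAC_"):
        raise ValueError("expected a GAC_<architecture> directory")
    # The directory name is <assembly version>_<public key token>.
    version, _, token = version_dir.partition("_")
    return GacLocation(
        gac_root=str(PureWindowsPath(*parts[:-4])),
        architecture=gac_dir[len("GAC_"):],
        assembly_name=name,
        version=version,
        public_key_token=token,
        file_name=file_name,
    )


if __name__ == "__main__":
    example = (r"C:\Windows\assembly\GAC_32\KasperskyLab"
               r"\2.0.0.0_b03f5f7f11d50a3a\KasperskyLab.dll")
    print(parse_gac_path(example))
```

`PureWindowsPath` is used so that the sketch handles backslash-separated paths on any operating system.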

In a particular case, assembly code is executed as follows. First, the PE header is analyzed to determine which kind of process to start (32-bit or 64-bit). Then the selected version of MSCorEE.dll is loaded (C:\Windows\System32\MSCorEE.dll for 32-bit systems). An example of assembly source code is given below.

To execute a method, for example System.Console.WriteLine("Kaspersky"), the JIT compiler converts the CIL code into machine instructions (Fig. 2); for convenience, the code is shown in source form rather than compiled to CIL.

First, before executing Main(), the CLR finds all declared types (classes), for example the Console type. It then determines their methods and combines them into entries within a single "structure" (one entry per method defined in the Console type). The entries contain the addresses at which the method implementations can be found. On the first call to the WriteLine method, the JIT compiler is invoked. The JIT compiler knows the method being called and the type that defines it. It looks up the implementation of the method (the code implementing WriteLine(string str)) in the metadata of the corresponding assembly, compiles the IL into machine code, and stores the result in dynamic memory. Next, the JIT compiler returns to the internal data "structure" of the type (Console) and replaces the address of the called method with the address of the memory block containing the machine code. After that, Main() calls WriteLine(string str) again. Because the code has already been compiled, the call is made without invoking the JIT compiler. When WriteLine(string str) completes, control returns to Main().
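The compile-on-first-call mechanism described above can be modelled in a few lines of Python. This is only an analogy: the "IL" is a Python expression string, "compilation" is Python's built-in `compile`, and `MethodTable` is a hypothetical class standing in for the per-type structure of entries.

```python
from typing import Callable, Dict


class MethodTable:
    """Per-type table whose entries point at the code for each method."""

    def __init__(self, il_bodies: Dict[str, str]):
        self._il = il_bodies
        self.compile_count = 0
        # Initially every entry points at a stub that triggers compilation.
        self.entries: Dict[str, Callable[..., object]] = {
            name: self._make_stub(name) for name in il_bodies
        }

    def _make_stub(self, name: str) -> Callable[..., object]:
        def stub(*args):
            compiled = self._jit_compile(name)
            self.entries[name] = compiled  # replace the entry's address
            return compiled(*args)
        return stub

    def _jit_compile(self, name: str) -> Callable[..., object]:
        self.compile_count += 1
        # Stand-in for real code generation.
        code = compile(self._il[name], f"<il:{name}>", "eval")
        return lambda *args: eval(code, {"args": args})

    def call(self, name: str, *args):
        return self.entries[name](*args)


if __name__ == "__main__":
    console = MethodTable({"WriteLine": "'out: ' + args[0]"})
    console.call("WriteLine", "Kaspersky")   # first call: goes through the stub
    console.call("WriteLine", "Kaspersky")   # second call: already compiled
    print("compilations:", console.compile_count)
```

Because the table lives inside one process, a second process would start again from stubs, which is the per-process cost that precompiled images are meant to avoid.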

It follows from this description that a function runs "slowly" only on its first call, when the JIT compiler translates CIL code into processor instructions. In all other cases the code is already in memory, optimized for the given processor. However, if another program is started in a different process, the JIT compiler is invoked again for the same method.

The images mentioned above solve the problem of a function running slowly on its first call. When an assembly is loaded, the image from which machine code will be executed is loaded with it. This technology speeds up loading and launching an application, because the JIT compiler does not need to compile anything or re-create the data structures (entries) each time; all of this is taken from the image. An image can be created for any .NET assembly, regardless of whether it is installed in the GAC. In a particular case, compilation is performed with the ngen.exe utility, located at the path [...]

When ngen.exe runs, machine code is generated for the IL code of the assembly by means of the JIT compiler, and the result is saved to disk in the Native Image Cache (NIC). Compilation is performed on the local device, taking its software and hardware configuration into account, so an image must be used only in the environment for which it was compiled. The purpose of creating such images is to improve the performance of managed applications: instead of JIT compilation, a ready-made assembly in machine code is loaded.

If assembly code is used by many applications, creating an image significantly speeds up application launch and operation, because the image can be used by many applications at once, whereas code generated on the fly by the JIT compiler is used only by the running application instance for which it was compiled.

The path to a compiled image is formed as follows, for example:

C:\Windows\assembly\NativeImages_v4.0.30319_32\Kaspersky\9c87f327866f53aec68d4fee40cde33d\Kaspersky.ni.dll, where

C:\Windows\assembly\NativeImages - path to the image store in the system;

v4.0.30319_32 - <.NET Framework version>_<processor architecture (32 or 64)>;

Kaspersky - friendly name of the assembly;

9c87f327866f53aec68d4fee40cde33d - application hash;

Kaspersky.ni.dll - <friendly name of the assembly>.ni.<extension>.
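A native image path of this form can be split into the components listed above. The Python sketch below assumes exactly the layout of the example; `parse_native_image_path` is a hypothetical helper name.

```python
from pathlib import PureWindowsPath


def parse_native_image_path(path: str) -> dict:
    """Split a native image path of the form shown in the text."""
    parts = PureWindowsPath(path).parts
    if len(parts) < 5:
        raise ValueError("path is too short to be a native image path")
    store_dir, friendly_name, app_hash, file_name = parts[-4:]
    prefix = "NativeImages_"
    if not store_dir.startswith(prefix):
        raise ValueError("expected a NativeImages_<version>_<arch> directory")
    # The suffix is <.NET Framework version>_<processor architecture>.
    framework_version, _, architecture = store_dir[len(prefix):].rpartition("_")
    # The file name is <friendly name>.ni.<extension>.
    stem, marker, extension = file_name.rpartition(".ni.")
    if not marker:
        raise ValueError("expected a <name>.ni.<extension> file name")
    return {
        "framework_version": framework_version,
        "architecture": architecture,
        "friendly_name": friendly_name,
        "application_hash": app_hash,
        "image_name": stem,
        "extension": extension,
    }


if __name__ == "__main__":
    example = (r"C:\Windows\assembly\NativeImages_v4.0.30319_32\Kaspersky"
               r"\9c87f327866f53aec68d4fee40cde33d\Kaspersky.ni.dll")
    print(parse_native_image_path(example))
```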

When creating a machine code image of an assembly, ngen.exe saves data about it in the registry branch [...] for 64-bit applications and in [...] for 32-bit applications.

If the image was installed for an assembly located in the GAC, the branch is named as follows: [...]

If the assembly was not installed in the GAC, the branch is named as follows: [...]

Before Windows 8, the developer always had to initiate the creation, update, and removal of assembly images, using ngen.exe (or by configuring the installer). Starting with Windows 8, Windows creates images for some assemblies automatically.

In a particular case, images are managed by the Native Image Service. It lets developers defer the installation, update, and removal of machine code images so that these procedures run later, when the device is idle. The Native Image Service is started by an application's installer or updater, by means of the ngen.exe utility. The service works with a queue of requests stored in the Windows registry; each request has its own priority, which determines when the task is executed.
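A prioritized queue of deferred requests that is drained only while the device is idle might be sketched as follows. This is an in-memory Python model, not the registry-backed queue of the actual service; the class names are hypothetical, and the convention that a lower number means higher priority is an assumption.

```python
import heapq
import itertools
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass(order=True)
class ImageRequest:
    priority: int                       # assumed: lower number runs first
    sequence: int                       # keeps submission order within a priority
    action: str = field(compare=False)  # "install", "update" or "uninstall"
    assembly: str = field(compare=False)


class DeferredImageQueue:
    def __init__(self) -> None:
        self._heap: List[ImageRequest] = []
        self._counter = itertools.count()

    def submit(self, action: str, assembly: str, priority: int) -> None:
        request = ImageRequest(priority, next(self._counter), action, assembly)
        heapq.heappush(self._heap, request)

    def run_when_idle(self, is_idle: Callable[[], bool]) -> List[Tuple[str, str]]:
        """Process queued requests for as long as the device reports idle."""
        done = []
        while self._heap and is_idle():
            request = heapq.heappop(self._heap)
            done.append((request.action, request.assembly))
        return done


if __name__ == "__main__":
    queue = DeferredImageQueue()
    queue.submit("install", "A.dll", priority=3)
    queue.submit("update", "B.dll", priority=1)
    print(queue.run_when_idle(lambda: True))
```

Requests that are not processed because the device stops being idle stay in the queue for the next idle period.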

In another particular case, machine code images are created not only at the initiative of developers or administrators but also automatically by the .NET Framework, which creates an image by tracking the work of the JIT compiler. Creating an image while the application is running takes too long, so the operation is deferred: the CLR queues the tasks and executes them while the device is idle.

To find assemblies to load when the corresponding assembly is started, the CLR uses the Assembly Binder. The CLR uses several kinds of binders. To find images, the Native Binder is used. The search for the required image takes place in two stages: first the binder locates the assembly and the image on the file system, then it checks that the image matches the assembly. The search algorithm is shown in Fig. 3. At step 310 the Assembly Binder searches for the assembly in:

- the GAC, which implies that the assembly sought is signed; the contents of the assembly are not read;

- the application directory; the assembly is opened and its metadata is read.

Next, at step 320, the native image binder searches the NIC for an image corresponding to the assembly that was found. Whether an image was found is checked at step 330. If it was, the native image binder reads the necessary data and metadata from the image at step 340 and makes sure that the image is suitable, for which it performs a thorough check that includes verification of:

- the strong name;

- the creation time (the image must be newer than the assembly);

- the MVID of the assembly and of the image;

- the .NET Framework version;

- the processor architecture;

- the versions of related images (for example, the mscorlib.dll image);

- and so on.

If the assembly for which an image is sought does not have a strong name, the MVID is used for the check instead. Whether the image is up to date is checked at step 350: if it is not, control is passed to the JIT compiler at step 370; otherwise the code is loaded from the image at step 360.
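The lookup described above for FIG. 3 can be sketched roughly as follows. This is an illustration only, not the CLR's actual binder: the `Assembly` and `NativeImage` records, their field names, and the function names are hypothetical, only a subset of the listed checks is modeled, and falling back to the JIT compiler when no image is found at step 330 is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Assembly:
    """Hypothetical record describing an assembly found at step 310."""
    name: str
    mvid: str
    strong_name: Optional[str]  # None if the assembly is not strong-named
    created: float              # creation time, seconds since the epoch


@dataclass(frozen=True)
class NativeImage:
    """Hypothetical record describing an image found in the NIC at step 320."""
    parent_mvid: str
    parent_strong_name: Optional[str]
    created: float
    framework_version: str
    cpu_arch: str


def image_is_current(asm: Assembly, img: NativeImage,
                     framework_version: str, cpu_arch: str) -> bool:
    """Step 350: check a subset of the properties listed in the text."""
    if asm.strong_name is not None:
        identity_ok = img.parent_strong_name == asm.strong_name
    else:
        # No strong name: the MVID is used for the check instead.
        identity_ok = img.parent_mvid == asm.mvid
    return (identity_ok
            and img.created > asm.created  # image must be newer than assembly
            and img.framework_version == framework_version
            and img.cpu_arch == cpu_arch)


def choose_code_source(asm: Assembly, img: Optional[NativeImage],
                       framework_version: str, cpu_arch: str) -> str:
    """Steps 330-370: return 'native-image' or 'jit'."""
    if img is None:
        return "jit"           # assumption: no image means JIT compilation
    if image_is_current(asm, img, framework_version, cpu_arch):
        return "native-image"  # step 360: load code from the image
    return "jit"               # step 370: hand over to the JIT compiler
```

For example, an image whose recorded framework version differs from the one currently installed would be treated as out of date, and the sketch would return `"jit"`.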

It follows from the above that the number of images significantly exceeds the number of assemblies, and that images generated from the same parent assembly may differ from device to device and from one image version to another, which considerably complicates the task of categorizing images. Methods of categorizing files are known from the prior art; for example, publication US 20140208426 describes a method of categorizing files using cloud services. However, no solutions were found that solve the problem of categorizing images.

Disclosure of the invention

The present invention is intended for detecting malicious machine code images on a device.

The technical result of the present invention is increased security of the device through the detection of a malicious machine code image.

A method for detecting a malicious machine code image, in which: a machine code image is obtained; the parent assembly on the basis of which the obtained image was created is determined; a mismatch is established between the data of the obtained machine code image and the data of the determined parent assembly, where a match guarantees that the machine code image has not been modified since its creation; and the machine code image is recognized as malicious on the basis of the established mismatch.

In a particular case, the data between which the mismatch is established are the CIL code of the determined parent assembly and the machine code of the machine code image.

In another particular case, the data between which the mismatch is established are at least the CIL code, the machine code, the type metadata, the manifest, the information in the PE header, and the information in the CLR header.

In yet another particular case, the mismatch is established by directly comparing the corresponding data of the parent assembly and of the machine code image.

In a particular case, the mismatch is established by directly comparing the corresponding data of the original machine code image and of the obtained machine code image, where the original machine code image is a machine code image that is guaranteed to be unmodified and was generated from the parent assembly.

In another particular case, the parent assembly on the basis of which the obtained machine code image was created is determined by means of the operating system.

Brief description of the drawings

The accompanying drawings are included to provide a further understanding of the invention and constitute a part of this description. They show embodiments of the invention and, together with the description, serve to explain the principles of the invention.

The claimed invention is illustrated by the following drawings, in which:

FIG. 1a shows an example of a single-file assembly;

FIG. 1b shows an example of a multi-file assembly;

FIG. 1c shows a method for forming a strong name;

FIG. 2 shows a method for executing assembly code;

FIG. 3 shows the method of operation of the binder;

FIG. 4 shows a method for determining the trust category of an image;

FIG. 5 shows a method for creating a template;

FIG. 6 shows a method for installing images on a device;

FIG. 7 shows an example of a general-purpose computer system.

Although the invention may have various modifications and alternative forms, the characteristic features shown by way of example in the drawings will be described in detail. It should be understood, however, that the purpose of the description is not to limit the invention to a specific embodiment. On the contrary, the purpose of the description is to cover all changes and modifications that fall within the scope of this invention as defined by the appended claims.

Description of the invention

The objects and features of the present invention, and the methods for achieving them, will become apparent by reference to exemplary embodiments. However, the present invention is not limited to the exemplary embodiments disclosed below and may be embodied in various forms. The description provided is intended to help a person skilled in the art gain a thorough understanding of the invention, which is defined only by the scope of the appended claims.

FIG. 4 shows a method for categorizing images. At step 400 an image is obtained. In one particular case the image is obtained from the NIC (for example, if the image is installed on the device and is used there for its intended purpose); in another particular case it is obtained from any other image store (for example, when the device serves as a storage and the images are not used on that device for their intended purpose). Next, at step 410, the trust category of the image is determined. In a particular case, the trust category of the image is determined by querying a database, using, for example, the checksum of the image; in another particular case the MVID of the image is used. Templates are also used to determine the category of the image; the mechanism for working with templates is described below. If the image is not known to the database, then at step 420 the parent assembly on the basis of which the image was created is determined. At least the following data, data structures, and tools are used for this:

- the MVID;

- the registry;

- the binder;

- the strong name.

Determination by MVID is used, for example, when there is a database containing the MVIDs of the assemblies present on the device. To do this, the MVID of the image is determined and the database storing the MVIDs of the assemblies is queried.

Determination of the parent assembly from registry entries is used when a registry entry is created at the time the image is created. An example of such an entry is given above.

Determination of the parent assembly by strong name is used for images created from strong-named assemblies. The components of the parent assembly's strong name are extracted from the image, the strong name is formed, and on the basis of these data the path to the parent assembly is determined, either in the GAC on the device or in a database that stores assemblies ordered by strong name.

The method of determining the parent assembly is chosen depending on a number of factors. Such factors include, for example, the location of the parent assembly and of the image (the user's device or a remote or local database), the possibility that the assembly and the image have been compromised where they are stored, the way the assembly is named (strong name or ordinary name), and so on.
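The three lookup strategies described above can be sketched as a simple chain of fallbacks. This is a minimal illustration under stated assumptions: the dictionaries stand in for the MVID database, the registry entries, and a strong-name index of the GAC; the `image` dictionary and its keys are hypothetical; and the fixed order MVID, registry, strong name is chosen only for the example, since the text leaves the choice of method to the factors listed above.

```python
from typing import Dict, Optional


def format_strong_name(parts: Dict[str, str]) -> str:
    """Join strong-name components into an assembly display name."""
    return ("{name}, Version={version}, Culture={culture}, "
            "PublicKeyToken={public_key_token}").format(**parts)


def find_parent_assembly(image: dict,
                         mvid_db: Dict[str, str],
                         registry: Dict[str, str],
                         strong_name_index: Dict[str, str]) -> Optional[str]:
    """Return the path of the parent assembly, or None if it is not found.

    image             -- hypothetical dict with optional keys 'mvid', 'path',
                         and 'strong_name_parts'
    mvid_db           -- stand-in for a database: MVID -> assembly path
    registry          -- stand-in for registry entries: image path -> assembly path
    strong_name_index -- stand-in for the GAC: strong name -> assembly path
    """
    # 1. Determination by MVID.
    path = mvid_db.get(image.get("mvid"))
    if path:
        return path
    # 2. Determination from a registry entry written when the image was created.
    path = registry.get(image.get("path"))
    if path:
        return path
    # 3. Determination by strong name (strong-named assemblies only).
    parts = image.get("strong_name_parts")
    if parts:
        return strong_name_index.get(format_strong_name(parts))
    return None
```

A real implementation would presumably pick the strategy according to where the assembly and image are stored and how far each data source can be trusted, rather than always trying them in the same order.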

In a particular case, after the parent assembly has been determined, the correspondence between the image and the assembly is additionally determined at step 421. This step is performed when there is a possibility that the image was modified without authorization (compromised) at its storage location after it was created. In one case, the correspondence is determined using the algorithm employed by the native image binder known from the prior art. In another particular case, after the parent assembly has been determined, an image is created from that assembly (the original image) and the original image is compared directly, for example byte by byte, with the image under analysis in order to determine the correspondence.
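The byte-by-byte comparison mentioned above can be illustrated with a short sketch. It assumes that the original image (freshly generated from the parent assembly) and the image under analysis are both available as binary streams; the function name and chunk size are illustrative choices.

```python
import io
from typing import BinaryIO


def streams_equal(a: BinaryIO, b: BinaryIO, chunk_size: int = 65536) -> bool:
    """Compare two binary streams chunk by chunk.

    Intended for buffered streams (regular files opened in 'rb' mode,
    io.BytesIO), where read(n) returns fewer than n bytes only at end of data.
    """
    while True:
        chunk_a = a.read(chunk_size)
        chunk_b = b.read(chunk_size)
        if chunk_a != chunk_b:
            return False  # differing content or differing length
        if not chunk_a:
            return True   # both streams ended at the same point


# Usage with in-memory data standing in for the two image files.
original = io.BytesIO(b"\x4d\x5a native code")
analyzed = io.BytesIO(b"\x4d\x5a native c0de")
print(streams_equal(original, analyzed))
```

With files on disk, the same function could be called on two handles opened with `open(path, "rb")`.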

In a particular case, to prevent unauthorized modification of images, only trusted processes, for example only ngen.exe, are allowed to modify images; other processes are allowed only to read data from an image.

In some cases a template mechanism is used to determine the correspondence between an image and its parent assembly. In a particular case, if there is no correspondence between the parent assembly and the corresponding image, the image is considered compromised (malicious). A compromised image may differ from the original image in its CIL code, machine code, type metadata, or the information contained in the CLR and PE headers.

An image, like an assembly, has a certain structure, for example the one shown in FIG. 5. The assembly KasperskyLab.dll and the image KasperskyLab.ni.dll contain metadata and code: the assembly contains only CIL code, while the image, in a particular case, additionally contains machine code and the NativeImageHeader structure. On the basis of the structure, metadata, and code, the template KasperskyLab.dll.tmpl mentioned above is formed, which uniquely corresponds to the parent assembly and to the image created from it. To bind the structure, code, and metadata into a template, flexible hash technology is used, for example (intelligent hash, also known as locality-sensitive hash). In a particular case the template is formed as shown in FIG. 5. Data (the manifest, metadata, CIL code, and so on) are extracted from the assembly. The same data, plus the machine code, are extracted from the image. Data that are invariant across all possible versions of the image created from one parent assembly are processed (for example, a checksum is computed from them), and a hash is formed and placed in the template. Data that vary from one version of the image to another, such as the machine code, are also processed, and a flexible hash is formed from the result. In a particular case, for the machine code, a function call log, a listing of the disassembled machine code, or any other entity that reflects the execution logic of that machine code is formed, and the flexible hash is formed from these entities; in another particular case these entities are used in the template directly. Note that the template is formed in such a way that it unambiguously links (establishes a correspondence between) the parent assembly and the image regardless of the image versions, which depend on the hardware and software configuration of the device. If changes are made to the machine code of the image and the execution logic of the image code no longer matches the execution logic of the assembly code, no correspondence between the parent assembly and the image is established on the basis of the template, and the image is deemed not to correspond to the assembly.
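The template idea described above can be approximated by the following sketch. It is not the patented template format: a SHA-256 digest stands in for the checksum over the invariant data, and a set of call-sequence shingles compared by Jaccard similarity is substituted for the flexible hash. The function names, the dictionary layout, and the 0.8 threshold are all assumptions made for illustration.

```python
import hashlib
from typing import Iterable, List, Set, Tuple


def stable_hash(parts: Iterable[bytes]) -> str:
    """Hash the data that do not change between image versions."""
    h = hashlib.sha256()
    for part in parts:
        h.update(len(part).to_bytes(8, "big"))  # length prefix keeps parts distinct
        h.update(part)
    return h.hexdigest()


def call_shingles(call_log: List[str], n: int = 2) -> Set[Tuple[str, ...]]:
    """Represent execution logic as the set of n consecutive calls."""
    return {tuple(call_log[i:i + n]) for i in range(len(call_log) - n + 1)}


def jaccard(a: set, b: set) -> float:
    """Similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def build_template(stable_parts: List[bytes], call_log: List[str]) -> dict:
    """Combine the invariant hash and the logic fingerprint into a template."""
    return {"stable": stable_hash(stable_parts),
            "logic": call_shingles(call_log)}


def matches_template(template: dict, stable_parts: List[bytes],
                     call_log: List[str], threshold: float = 0.8) -> bool:
    """True if an image's data correspond to the template."""
    if stable_hash(stable_parts) != template["stable"]:
        return False
    return jaccard(template["logic"], call_shingles(call_log)) >= threshold


# Usage: a regenerated image keeps the same call sequence, a patched one does not.
stable = [b"manifest", b"metadata", b"cil"]
log = ["Main", "ReadConfig", "Connect", "Send", "Close"]
template = build_template(stable, log)
tampered = ["Main", "ReadConfig", "StealData", "Connect", "Send", "Close"]
print(matches_template(template, stable, log))       # regenerated image
print(matches_template(template, stable, tampered))  # modified image
```

The design choice here is to fingerprint an abstraction of the machine code (the call sequence) rather than its bytes, so that recompiling the same assembly for a different processor or framework version leaves the fingerprint unchanged.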

An example of determining correspondence using a template follows. Suppose there is a parent assembly Kaspersky.dll, for which the image Kaspersky.ni.dll is created on the device. The template Kaspersky.dll.tmpl is formed, which makes it possible to establish the correspondence between the parent assembly and the image. Later the hardware or software of the device is updated (an operating system update, a .NET Framework update, a processor replacement), the version of the Kaspersky.ni.dll image becomes out of date, and the image can no longer be used. An update of the image is therefore initiated, and a new Kaspersky.ni.dll image is created that differs from the image of the previous version. Using the template, however, it is determined that the updated image corresponds to the parent assembly (the execution logic of the machine code has remained the same). In another case, suppose that malicious software is installed on the device and modifies the Kaspersky.ni.dll image. Here, using the template, it is determined that the image modified by the malicious software does not correspond to the parent assembly (the execution logic of the machine code differs from the logic embodied in the parent assembly).

After the parent assembly has been determined, the trust category of the assembly must be established (step 430). The trust category of an assembly is the degree of trust placed in the assembly (trusted or untrusted) by the device's protection system, for example an antivirus application. In one embodiment there are two assembly categories: trusted and untrusted. Within this application, the notion of an assembly's category should be distinguished from the notion of an assembly's danger status. Within this application, the danger status of an assembly may be dangerous or safe. There are also unknown assemblies, that is, assemblies whose danger status has not been determined. The danger status of an assembly characterizes the danger the assembly poses to the device on which it is installed. In a particular case, that danger consists in the possible theft of data from the device, the substitution of data, or unauthorized modification of the device's software while the assembly's code is executing.

Trusted assemblies are assemblies that the device's protection system considers safe. When assigning a trust category to an assembly, the protection system does so locally, within the current state of the device and on the basis of information about the assembly. In a particular case, this information is the danger status of the assembly. The danger status is determined using identification information of the assembly, for example its MVID, strong name, or checksum. To this end, a query is made to a reputation database at step 431; in one particular case the database is located on the device on which the assembly is stored, and in another particular case it is located remotely. If the assembly is known (information about it is contained in the reputation database), the assembly already has a danger status of safe or dangerous, depending on which identification information in the reputation database matches the identification information of the assembly being checked. If the assembly's identification information is not contained in the database, the assembly is considered unknown, that is, it has no status (the status is not determined). If the assembly has the status safe, then in a particular case it receives the category trusted. In another particular case the category of the assembly is determined from other factual and statistical information about the assembly, for example from the path at which the assembly is installed on the device or from its belonging to installation packages whose danger status is known.

In a particular case, the factual information about the assembly is information about its digital signature (for example, a StrongName or X.509 signature); the digital signature must be valid. To this end, at step 432, identification information about the assembly's digital signature is obtained, which contains, for example, information about the vendor or a hash of the file or of part of it. The signature may be located either in the assembly or in a catalog (a catalog signature). The danger status of the assembly's digital signature is determined using the signature's identification information, by querying a reputation database; in one particular case the database is located on the device on which the assembly is stored, and in another particular case it is located remotely. If the signature is known (information about it is contained in the reputation database), the signature already has a status of safe or dangerous, depending on which identification information in the reputation database matches the identification information of the signature being checked. If the signature's identification information is not contained in the database, the signature is considered unknown, that is, it has no status (the status is unknown). In a particular case, if the signature has the status safe, the assembly receives the category trusted, and if the signature has the status dangerous, the assembly receives the category untrusted.

Statuses are assigned to signatures in different ways: in one particular case depending on the manufacturer, in another particular case by inheritance from an installer whose signature status is known. In some cases the status of a signature is assigned depending on the popularity of the signature; for example, the more popular the signature, the more trust it inspires.

In a particular case, at step 433, the trust category is determined by an anti-virus scan of the assembly, using various malware detection methods: signature-based, heuristic, statistical, etc. If the anti-virus scan finds the assembly safe, the assembly receives the category trusted. Otherwise the assembly is deemed untrusted.

After the trust category of the assembly has been determined, the trust category of the image is determined at step 440. In one particular case the image is assigned the trust category determined for the parent assembly; in another particular case the category of the image is determined by the method described above for step 410.

When a protection system is installed on a device, it is necessary to guarantee that the image store has not been modified without authorization and will not be; a number of measures are applied for this. FIG. 6 shows the assignment of a category to an image. At step 600, access to the image store, or to at least one image, is restricted. In one particular case the restriction is that only trusted processes, or a finite set of trusted processes (for example, only the ngen.exe process), are allowed to modify the image, while all other processes are allowed read access only. In another particular case the restriction is a complete block on write access to the store as a whole or to at least one image. At step 610, the parent assembly on the basis of which the access-restricted image was created is determined. At step 620, an update (replacement) of at least one image is started. In one particular case the update consists of deleting the previously created image and creating a new one by means of the operating system (by running ngen.exe on the parent assembly or by the automatic image creation service); in another particular case only part of the image data, for example the machine code, is changed, and the update is performed by trusted processes. In the first case the image is recreated after deletion, either immediately or after a delay, for example when the parent assembly determined at step 610, whose image is to be updated, is launched. At step 630, the image is assigned the category of the parent assembly.
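Steps 600 to 630 can be sketched as follows. This is an in-memory illustration under stated assumptions, not the implementation from the description: `Image`, `ImageStore` and `refresh_image` are hypothetical names, the write restriction is modeled as an allowlist of process names, and `compile_image` stands in for whatever operating system facility (such as ngen.exe) actually rebuilds the image.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Set


@dataclass
class Image:
    parent_assembly: str            # identifier of the parent assembly
    machine_code: bytes
    category: Optional[str] = None  # trust category, unset until step 630


@dataclass
class ImageStore:
    # Step 600: only these (assumed) trusted processes may modify images.
    writers: Set[str] = field(default_factory=lambda: {"ngen.exe"})
    images: Dict[str, Image] = field(default_factory=dict)

    def write(self, process: str, name: str, image: Image) -> None:
        if process not in self.writers:
            raise PermissionError(f"{process} has read-only access")
        self.images[name] = image


def refresh_image(
    store: ImageStore,
    name: str,
    compile_image: Callable[[str], bytes],
    assembly_categories: Dict[str, str],
    process: str = "ngen.exe",
) -> Image:
    parent = store.images[name].parent_assembly      # step 610
    rebuilt = Image(parent, compile_image(parent))   # step 620: replace image
    rebuilt.category = assembly_categories[parent]   # step 630: inherit category
    store.write(process, name, rebuilt)
    return rebuilt
```

In this sketch the old image is replaced in full, which corresponds to the first particular case of step 620; the partial update of only the machine code is not modeled.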

The anti-virus tool uses trust categories in its operation: for example, it deletes images that have the untrusted category, or substantially restricts their use, for example by restricting their access to resources provided by the operating system.

FIG. 7 shows an example of a general-purpose computer system, a personal computer or server 20, comprising a central processor 21, system memory 22, and a system bus 23, which contains various system components, including the memory associated with the central processor 21. The system bus 23 is implemented as any bus structure known in the art, comprising in turn a bus memory or bus memory controller, a peripheral bus, and a local bus, which is capable of interacting with any other bus architecture. The system memory contains read-only memory (ROM) 24 and random-access memory (RAM) 25. The basic input/output system (BIOS) 26 contains the basic procedures that transfer information between the elements of the personal computer 20, for example when the operating system is loaded using ROM 24.

The personal computer 20, in turn, contains a hard disk 27 for reading and writing data, a magnetic disk drive 28 for reading from and writing to removable magnetic disks 29, and an optical drive 30 for reading from and writing to removable optical disks 31, such as CD-ROM, DVD-ROM and other optical storage media. The hard disk 27, the magnetic disk drive 28, and the optical drive 30 are connected to the system bus 23 through the hard disk interface 32, the magnetic disk interface 33, and the optical drive interface 34, respectively. The drives and the associated computer storage media are non-volatile means of storing computer instructions, data structures, program modules, and other data of the personal computer 20.

The present description discloses an implementation of a system that uses a hard disk 27, a removable magnetic disk 29, and a removable optical disk 31, but it should be understood that other types of computer storage media 56 capable of storing data in a computer-readable form (solid-state drives, flash memory cards, digital disks, random-access memory (RAM), etc.) may be used, connected to the system bus 23 through the controller 55.

The computer 20 has a file system 36, which stores the recorded operating system 35 as well as additional software applications 37, other program modules 38, and program data 39. The user can enter commands and information into the personal computer 20 through input devices (keyboard 40, mouse 42). Other input devices (not shown) may be used: a microphone, joystick, game console, scanner, etc. Such input devices are customarily connected to the computer system 20 through a serial port 46, which in turn is connected to the system bus, but they may be connected in another way, for example through a parallel port, a game port, or a universal serial bus (USB). A monitor 47 or other type of display device is also connected to the system bus 23 through an interface such as a video adapter 48. In addition to the monitor 47, the personal computer may be equipped with other peripheral output devices (not shown), such as speakers, a printer, etc.

The personal computer 20 is capable of operating in a networked environment, using a network connection to one or more remote computers 49. The remote computer (or computers) 49 are similar personal computers or servers that have most or all of the elements mentioned above in the description of the personal computer 20 shown in FIG. 7. Other devices may also be present in the computer network, such as routers, network stations, peer-to-peer devices, or other network nodes.

Network connections may form a local area network (LAN) 50 and a wide area network (WAN). Such networks are used in corporate computer networks and company intranets and, as a rule, have access to the Internet. In LAN or WAN networks, the personal computer 20 is connected to the local area network 50 through a network adapter or network interface 51. When networks are used, the personal computer 20 may use a modem 54 or other means of communicating with a wide area network such as the Internet. The modem 54, which is an internal or external device, is connected to the system bus 23 through the serial port 46. It should be noted that the network connections are only exemplary and need not reflect the exact network configuration; in practice there are other ways of establishing a connection between one computer and another by technical means of communication.

In conclusion, it should be noted that the information given in the description consists of examples that do not limit the scope of the present invention as defined by the claims. One skilled in the art will understand that other embodiments of the present invention may exist that are consistent with the spirit and scope of the present invention.

Claims

1. A method for detecting a malicious image of machine code, where an image of machine code is considered malicious if the execution logic of its machine code differs from the execution logic of the CIL code of the parent assembly, in which:

a) receive an image of a machine code;

b) determine the parent assembly, where the parent assembly is the assembly on the basis of which the resulting image is created;

c) establish a mismatch between the logic of the execution of the machine code of the resulting image of the machine code and the logic of the execution of the CIL code of a particular parent assembly;

d) recognize the image of the machine code as malicious on the basis of the established discrepancy between the logic of the execution of the machine code of the resulting image of the machine code and the logic of the execution of the CIL code of a specific parent assembly.

2. The method according to claim 1, in which the mismatch between the CIL code or machine code is established by directly comparing the corresponding data of the original image of the machine code and the resulting image of the machine code.

3. The method according to claim 1, wherein the mismatch between the CIL code or the machine code of the original image of the machine code and the corresponding data of the parent assembly is established based on the template.

4. The method according to claim 3, in which for each block of manifest data, type metadata, resources, digital signature, CIL code, machine code, hashes are generated and placed in the template, while the template is formed in such a way that uniquely connects the parent assembly and the image machine code, regardless of the version of the image, depending on the firmware of the device.

5. The method according to claim 1, in which the parent assembly, based on which the resulting image of the machine code is created, is determined by the operating system.

6. The method according to claim 1, in which, to establish a discrepancy, an original image is created from the aforementioned parent assembly and a direct comparison of the original image and the resulting image of the machine code is performed, the original image of the machine code being the guaranteed unchanged image of the machine code received from the parent assembly.