KR102271269B1

KR102271269B1 - Apparatus and method for classifying malicious code that is modified according to the API level of an operating system, and recording medium recording a program for performing the same

Info

Publication number: KR102271269B1
Application number: KR1020190123874A
Authority: KR
Inventors: 정수환; 응웬부렁; 심현석
Original assignee: 숭실대학교산학협력단
Priority date: 2019-10-07
Filing date: 2019-10-07
Publication date: 2021-06-29
Also published as: KR20210041317A; KR102271269B9; WO2021071027A1

Abstract

A method for classifying malicious code is provided, the method comprising: collecting system function information; generating API classification information by classifying API levels; extracting reference operation information according to the target API level of an application; extracting, for the application, modified operation information describing its operation at a different API level; and classifying malicious code according to the operation information.

Description

A device and method for classifying malicious codes that are transformed according to the API level of the operating system, and a recording medium for recording a program for performing the same

The present invention relates to an apparatus and method for classifying malicious code that is modified according to the API level of an operating system, and to a recording medium recording a program for performing the same. More particularly, it relates to an apparatus, a method, and a recording medium recording a program for classifying malicious code whose operation is modified according to different API levels.

Diversity in hardware, operating system versions, application programming interface (API) levels, and original equipment manufacturers (OEMs) gives users a broad product line, rich features, and custom designs, but the same diversity also exposes devices to a variety of risks.

For example, delayed updates to the software installed on a device can pose a significant security risk, leaving the device vulnerable to a variety of Common Vulnerabilities and Exposures (CVEs) even if its hardware is capable of running the latest release.

In addition, Android suffers fragmentation in its framework and elsewhere because of customization by device manufacturers. The latest Android version is currently Pie; although considerable time has passed since its release, its share is only about 10.4 percent, and the share is spread fairly evenly across the Lollipop, Marshmallow, Nougat, and Oreo versions. Each time a new version of the Android API is released, new functions are introduced to applications and devices. These changes affect the new APIs, manifests, drivers, and hardware that internally define Android's behavior.

In this regard, malicious code can evade the runtime permission policy of Android 6.0 (Marshmallow) by declaring a target SDK (Software Development Kit) version lower than 23, and malicious code such as Android.Bakosy may instead try to adapt to the new runtime version. Accordingly, there is a need for a way to classify malicious code that operates differently on different operating system versions.
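The target-SDK evasion described above can be illustrated with a minimal Python sketch. It assumes the application manifest has already been decoded to plain-text XML (manifests inside an APK are stored in a binary form), and the function names and the sample manifest are hypothetical, not part of the disclosed method.

```python
import xml.etree.ElementTree as ET

# Namespace used for android:* attributes in AndroidManifest.xml.
ANDROID_NS = "http://schemas.android.com/apk/res/android"
# API level of Android 6.0 (Marshmallow), where runtime permissions begin.
RUNTIME_PERMISSION_API = 23


def declared_target_sdk(manifest_xml):
    """Return the declared targetSdkVersion, or None if it is not declared."""
    root = ET.fromstring(manifest_xml)
    uses_sdk = root.find("uses-sdk")
    if uses_sdk is None:
        return None
    value = uses_sdk.get(f"{{{ANDROID_NS}}}targetSdkVersion")
    return int(value) if value is not None else None


def declares_pre_runtime_permission_target(manifest_xml):
    """True when the manifest declares a target SDK version below 23."""
    target = declared_target_sdk(manifest_xml)
    return target is not None and target < RUNTIME_PERMISSION_API


# Hypothetical decoded manifest used only for illustration.
SAMPLE_MANIFEST = """<manifest xmlns:android="http://schemas.android.com/apk/res/android" package="com.example.sample">
  <uses-sdk android:minSdkVersion="16" android:targetSdkVersion="22"/>
</manifest>"""

print(declared_target_sdk(SAMPLE_MANIFEST))
print(declares_pre_runtime_permission_target(SAMPLE_MANIFEST))
```

A low declared target is only a hint here; many legitimate older applications declare one as well.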

The technical problem to be solved by the present invention is to provide an apparatus and method for classifying malicious code whose operation is modified according to different API levels, and a recording medium recording a program for performing the same.

One aspect of the present invention may include: collecting system function information according to the API levels of a mobile operating system; generating API classification information by classifying the API levels according to preset level classification information; extracting reference operation information according to the target API level of an application executed in the mobile operating system; extracting, for the application, modified operation information describing its operation at an API level different from the target API level; and classifying malicious code according to at least one of the reference operation information and the modified operation information.

In addition, the step of extracting the reference operation information may extract reference operation information according to the target API level set in each of one or more applications executed in the mobile operating system.

In addition, the step of extracting the modified operation information may extract, for one or more applications executed in the mobile operating system, one or more pieces of modified operation information according to API classification information that contains an API level different from the target API level at which each piece of reference operation information is generated.

In addition, the step of classifying the malicious code may further include: collecting the one or more pieces of reference operation information and the one or more pieces of modified operation information according to the API classification information; receiving at least one of reference operation information and modified operation information of another application from at least one of the step of extracting the reference operation information and the step of extracting the modified operation information; and classifying the other application according to a similarity calculated by comparing the received information of the other application with the collected reference operation information and modified operation information.

In addition, the step of collecting the system function information may collect the system function information provided in one or more API levels that change as the mobile operating system is updated.

In addition, the level classification information may set, as a classification point, the API level that changes at the point where the version of the mobile operating system changes.

Another aspect of the present invention may be a computer-readable recording medium on which a computer program for performing the malicious code classification method is recorded.

Yet another aspect of the present invention may include: an information collection unit that collects system function information according to the API levels of a mobile operating system and generates API classification information by classifying the API levels according to preset level classification information; an information extraction unit that extracts reference operation information according to the target API level of an application executed in the mobile operating system and extracts, for the application, modified operation information describing its operation at an API level different from the target API level; and a malicious code classification unit that classifies malicious code according to at least one of the reference operation information and the modified operation information.

In addition, the information extraction unit may extract reference operation information according to the target API level set in each of one or more applications executed in the mobile operating system.

In addition, the information extraction unit may extract, for one or more applications executed in the mobile operating system, one or more pieces of modified operation information according to API classification information that contains an API level different from the target API level at which each piece of reference operation information is generated.

In addition, the malicious code classification unit may collect the one or more pieces of reference operation information and the one or more pieces of modified operation information according to the API classification information, receive at least one of reference operation information and modified operation information of another application from the information extraction unit, and classify the other application according to a similarity calculated by comparing the received information of the other application with the collected reference operation information and modified operation information.

According to the aspects of the present invention described above, providing an apparatus and method for classifying malicious code that is modified according to the API level of an operating system, and a recording medium recording a program for performing the same, makes it possible to classify malicious code whose operation is modified according to different API levels.

FIG. 1 is a schematic diagram of a malicious code classification system for classifying malicious code according to an embodiment of the present invention.
FIG. 2 is a control block diagram of a malicious code classification apparatus according to an embodiment of the present invention.
FIG. 3 is a schematic diagram illustrating a method of collecting operation information of an application according to an embodiment of the present invention.
FIG. 4 is a schematic diagram illustrating a method of classifying malicious code according to an embodiment of the present invention.
FIG. 5 is a flowchart of a malicious code classification method according to an embodiment of the present invention.
FIG. 6 is a detailed flowchart of the step of classifying malicious code in FIG. 5.

The detailed description of the present invention that follows refers to the accompanying drawings, which show by way of illustration specific embodiments in which the invention may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the invention. It should be understood that the various embodiments of the present invention differ from one another but need not be mutually exclusive. For example, specific shapes, structures, and characteristics described herein in connection with one embodiment may be implemented in other embodiments without departing from the spirit and scope of the invention. It should also be understood that the location or arrangement of individual components within each disclosed embodiment may be changed without departing from the spirit and scope of the invention. Accordingly, the following detailed description is not to be taken in a limiting sense, and the scope of the invention, if properly described, is limited only by the appended claims together with the full range of equivalents to which those claims are entitled. Like reference numerals in the drawings refer to the same or similar functions throughout the various aspects.

Hereinafter, preferred embodiments of the present invention will be described in more detail with reference to the drawings.

FIG. 1 is a schematic diagram of a malicious code classification system for classifying malicious code according to an embodiment of the present invention.

The malicious code classification system 100 may collect system function information according to the API levels 110 of a mobile operating system.

Here, an API may include system function information that is called by an application 180 and performs an operation. When the mobile operating system is updated, part of this system function information may change, and the API level 110 can be understood as a designation that distinguishes the APIs as they change with each update of the mobile operating system.

Meanwhile, the system function information of an API may represent the programming-language elements provided to perform operations called from the application 180, and may include classes, methods, variables, and the like.

In addition, depending on the API level 110, system function information may be added to the API, or its support by the operating system vendor may be discontinued. Table 1 shows the classes and methods that were added or discontinued in each Android version.

Referring to Table 1, the names listed in the first column are Android version names, and in the first row AC denotes added classes, DC discontinued classes, AM added methods, and DM discontinued methods.

Accordingly, the number of classes and methods added or discontinued with each Android version update can be identified.
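The AC, DC, AM, and DM counts described for Table 1 can be sketched as set differences between the system function information of two API levels. This is only an illustrative sketch: the data layout and the class and method names below are hypothetical placeholders, not real Android APIs or figures from Table 1.

```python
def api_surface_changes(previous, current):
    """Count added/discontinued classes and methods between two API levels.

    Each argument is a dict with "classes" and "methods" sets (an assumed layout).
    """
    return {
        "AC": len(current["classes"] - previous["classes"]),  # added classes
        "DC": len(previous["classes"] - current["classes"]),  # discontinued classes
        "AM": len(current["methods"] - previous["methods"]),  # added methods
        "DM": len(previous["methods"] - current["methods"]),  # discontinued methods
    }


# Hypothetical system function information for two consecutive API levels.
older_level = {
    "classes": {"example.Alpha", "example.Beta"},
    "methods": {"Alpha.open", "Alpha.close", "Beta.read"},
}
newer_level = {
    "classes": {"example.Alpha", "example.Gamma"},
    "methods": {"Alpha.open", "Gamma.write", "Gamma.flush"},
}

changes = api_surface_changes(older_level, newer_level)
print(changes)
```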

Meanwhile, a version update of a mobile operating system such as Android may keep the version name while changing the API level 110, or may change both the version name and the API level 110.

In this regard, the malicious code classification system 100 may classify the API levels 110 according to the level classification information 120 to generate the API classification information 130.

Here, the level classification information 120 may be information set so as to classify the plurality of API levels 110 into one or more groups based on major updates of the mobile operating system.

For example, Android has API levels 110 from 1 to 26, and the API classification information 130 may be generated by dividing the API levels 110 at levels 16 and 21, which are among the API levels 110 at which the Android version name changes. In this case, the API classification information 130 may be divided into a first level classification 131a containing API levels 1 through 15, a second level classification 131b containing API levels 16 through 20, and a third level classification 131c containing API levels 21 through 26.
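The grouping in this example can be sketched in Python as follows. The classification points 16 and 21 and the level range 1 to 26 come from the example above; the function names and the use of integer group indexes (0, 1, 2 standing for 131a, 131b, 131c) are assumptions made for illustration.

```python
from bisect import bisect_right

# Classification points taken from the example: groups start at levels 16 and 21.
CLASSIFICATION_POINTS = [16, 21]


def level_classification(api_level, points=CLASSIFICATION_POINTS):
    """Return the group index (0 for 131a, 1 for 131b, 2 for 131c) of an API level."""
    return bisect_right(points, api_level)


def build_api_classification(levels, points=CLASSIFICATION_POINTS):
    """Group API levels by the level classification they fall into."""
    groups = {}
    for level in levels:
        groups.setdefault(level_classification(level, points), []).append(level)
    return groups


groups = build_api_classification(range(1, 27))
for index, members in groups.items():
    print(index, members[0], "to", members[-1])
```

Using a sorted list of classification points keeps the sketch valid if the number of groups changes, which the description explicitly allows.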

Meanwhile, it should be understood that the number of API levels 110 described here, the number of level classifications 131 included in the API classification information 130 according to the level classification information 120, and the like may be changed without departing from the spirit and scope of the present invention.

The malicious code classification system 100 may extract reference operation information 140 according to the target API level 110 of an application 180 executed in the mobile operating system.

Here, the target API level 110 may mean the API level 110 at which execution of the application 180 is recommended, and it may exist as a setting value within the application 180.

In addition, the reference operation information 140 may represent the operations the application 180 performs at the target API level 110, and may be information listing the classes, methods, variables, and the like used by the application 180 at that API level 110.

The malicious code classification system 100 may extract modified operation information 150 describing how the application 180 operates at an API level 110 different from the target API level.

Here, an API level 110 different from the target API level 110 may mean an API level 110 belonging to a level classification 131 other than the one that contains the target API level 110.

For example, when the target API level 110 is level 5, the target API level 110 belongs to the first level classification 131a, and the other API levels 110 are those in the second level classification 131b and the third level classification 131c. In this case, the first level classification 131a, which contains the target API level 110, takes the target API level 110 as its representative API, so that the reference operation information 140 of the application 180 is extracted as the operation performed at that representative API. For each of the second level classification 131b and the third level classification 131c, the operation is extracted as the one performed at the API level 110 whose system function information differs most from that of the other API classification information 130.
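One possible reading of this representative-level selection is sketched below. The text does not define how the "difference" in system function information is measured, so this sketch assumes it is the size of the symmetric difference between the set of system functions at a candidate level and the set at the target API level. The function name and the toy data are hypothetical.

```python
def pick_representative(group_levels, system_functions, target_level):
    """Pick the level in a group whose system functions differ most from the target's.

    Assumption: "difference" is measured as the size of the symmetric difference
    between two sets of system function names.
    """
    target_set = system_functions[target_level]
    return max(group_levels, key=lambda level: len(system_functions[level] ^ target_set))


# Toy system function information keyed by API level (placeholder names).
system_functions = {
    5: {"f_a", "f_b"},
    16: {"f_a", "f_b", "f_c"},
    17: {"f_a", "f_c", "f_d"},
    21: {"f_c", "f_d", "f_e"},
    22: {"f_a", "f_d", "f_e"},
}

rep_second = pick_representative([16, 17], system_functions, target_level=5)
rep_third = pick_representative([21, 22], system_functions, target_level=5)
print(rep_second, rep_third)
```

Other measures, such as comparing against the union of all levels in the other groups, would fit the text equally well; the choice here is only for illustration.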

The malicious code classification system 100 may extract reference operation information 140 and modified operation information 150 from different applications 180 and build a database 160 of the different operation information organized by API classification information 130.

Here, the different operation information may include the reference operation information 140 and the modified operation information 150, and this operation information may be divided according to the API classification information 130.

In addition, the database 160 may collect the different operation information sorted by API classification information 130, calculate the similarity between different pieces of operation information within the same API classification information 130, and, when the similarity satisfies a preset value, classify the corresponding one or more pieces of operation information as the same malicious code.
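The within-group comparison can be sketched as follows. The text does not name a similarity measure, so this sketch assumes Jaccard similarity over sets of used API calls and treats "satisfies a preset value" as "greater than or equal to a threshold". The database layout, application names, and call names are illustrative assumptions.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets (assumed measure; not specified by the text)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def similar_pairs(database, threshold):
    """Find application pairs whose operation information is similar within a group.

    database: {group_label: {app_name: set_of_used_api_calls}} (assumed layout).
    """
    pairs = []
    for group, records in database.items():
        for first, second in combinations(sorted(records), 2):
            score = jaccard(records[first], records[second])
            if score >= threshold:
                pairs.append((group, first, second, score))
    return pairs


# Toy database for a single level classification.
DATABASE = {
    "131a": {
        "app_a": {"sendTextMessage", "getDeviceId", "openConnection"},
        "app_b": {"sendTextMessage", "getDeviceId", "exec"},
        "app_c": {"getLastKnownLocation"},
    }
}

pairs = similar_pairs(DATABASE, threshold=0.5)
print(pairs)
```

Pairs are only formed inside one group, reflecting that operation information is compared within the same API classification information.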

Accordingly, the malicious code classification system 100 may classify malicious code using the database built from at least one of the reference operation information 140 and the modified operation information 150.

To this end, the malicious code classification system 100 may extract at least one of the reference operation information 140 and the modified operation information 150 of a new application 180, compare it with at least one of the reference operation information 140 and the modified operation information 150 stored in the database 160 to calculate a similarity, and, when the similarity satisfies a preset value, classify the new application 180 as the corresponding malicious code.
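Classifying a new application against the database might look like the sketch below. It again assumes Jaccard similarity and a "greater than or equal to threshold" rule, neither of which is specified in the text, and it assumes the database stores one set of API calls per known malicious code family and level classification. All names and data are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets (assumed measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def classify_new_application(new_info, database, threshold):
    """Return the best-matching known family, or None if nothing is similar enough.

    new_info: {group_label: set_of_api_calls} extracted from the new application.
    database: {group_label: {family_name: set_of_api_calls}} (assumed layout).
    """
    best_family, best_score = None, 0.0
    for group, calls in new_info.items():
        for family, known_calls in database.get(group, {}).items():
            score = jaccard(calls, known_calls)
            if score > best_score:
                best_family, best_score = family, score
    return best_family if best_score >= threshold else None


# Toy database and extracted information (placeholder call names).
KNOWN = {
    "131a": {"family_x": {"c1", "c2", "c3"}, "family_y": {"c4", "c5"}},
    "131c": {"family_x": {"c1", "c3", "c6"}},
}
new_app = {"131a": {"c1", "c2"}, "131c": {"c1", "c3", "c6", "c7"}}
unrelated_app = {"131a": {"c9"}}

print(classify_new_application(new_app, KNOWN, threshold=0.7))
print(classify_new_application(unrelated_app, KNOWN, threshold=0.7))
```

Because both reference and modified operation information are compared, a sample that looks unremarkable at its target API level can still match a known family through its behavior in another level classification.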

FIG. 2 is a control block diagram of a malicious code classification apparatus according to an embodiment of the present invention.

The malicious code classification apparatus 200 may include an information collection unit 210, an information extraction unit 220, and a malicious code classification unit 230.

The information collection unit 210 may collect system function information according to the API levels 110 of the mobile operating system.

In this regard, an API may include system function information that is called by an application 180 and performs an operation. When the mobile operating system is updated, part of this system function information may change, and the API level 110 can be understood as a designation that distinguishes the APIs as they change with each update of the mobile operating system.

Here, the information collection unit 210 may collect one or more API levels 110 that change as the mobile operating system is updated, and can thus be understood to collect the system function information provided in each API level 110.

The information collection unit 210 may classify the API levels 110 according to the level classification information 120 to generate the API classification information 130.

Here, the information collection unit 210 may set, as a classification point, the API level 110 that changes at the point where the version of the mobile operating system changes. A change in the version of the mobile operating system can be understood as a change in the name that denotes the version. In the case of Android, for example, it may mean the point at which the Jelly Bean version changes to the KitKat version, or the Nougat version changes to the Oreo version.
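Candidate classification points can be derived from a table of API levels and version names, as in the following sketch. The table covers only API levels 16 through 26 for brevity, and the function name is an assumption. The earlier example in the description uses only some of the name changes (levels 16 and 21) as classification points, so this sketch lists candidates rather than a final choice.

```python
# Android version names for API levels 16 through 26 (partial table for illustration).
VERSION_NAMES = {
    16: "Jelly Bean", 17: "Jelly Bean", 18: "Jelly Bean",
    19: "KitKat", 20: "KitKat",
    21: "Lollipop", 22: "Lollipop",
    23: "Marshmallow",
    24: "Nougat", 25: "Nougat",
    26: "Oreo",
}


def candidate_classification_points(version_names):
    """Return the API levels at which the version name differs from the previous level."""
    levels = sorted(version_names)
    return [
        current
        for previous, current in zip(levels, levels[1:])
        if version_names[previous] != version_names[current]
    ]


points = candidate_classification_points(VERSION_NAMES)
print(points)
```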

The information extraction unit 220 may extract reference operation information 140 according to the target API level 110 of an application 180 executed in the mobile operating system.

Here, the target API level may mean the API level 110 at which execution of the application 180 is recommended, and it may exist as a setting value within the application 180.

In addition, the reference operation information 140 may represent the operations the application 180 performs at the target API level, and may be information listing source-code elements such as the classes, methods, and variables used by the application 180 at that API level 110.

Meanwhile, the information extraction unit 220 may extract reference operation information 140 from each of one or more different applications 180 according to the target API level 110 set in each application 180.

Here, because the target API levels 110 set in different applications 180 may belong to different level classifications 131, the reference operation information 140 extracted from different applications 180 may represent the operation of those applications 180 in different level classifications 131.

For an application 180 from which reference operation information 140 has been extracted, the information extraction unit 220 may extract modified operation information 150 describing its operation at an API level 110 different from the target API level 110 used to extract the reference operation information 140.

Here, the API level 110 different from the target API level 110 may be an API level 110 belonging to a different level classification 131.

Meanwhile, the information extraction unit 220 may extract modified operation information 150 from each of one or more different applications 180 according to an API level 110 different from the target API level 110 set in each application 180.

Here, the API level 110 different from the target API level 110 may be an API level 110 belonging to a different level classification 131. Because the other API levels 110 set for different applications 180 may belong to different level classifications 131, the modified operation information 150 extracted from different applications 180 may represent the operation of those applications 180 in different level classifications 131.

The malicious code classification unit 230 may classify malicious code according to at least one of the reference operation information 140 and the modified operation information 150.

이를 위해, 악성코드 분류부(230)는 서로 다른 어플리케이션(180)으로부터 추출되는 기준 동작 정보(140) 및 변형 동작 정보(150)를 수집하여, API 분류 정보(130)에 따라 서로 다른 동작 정보의 데이터베이스(160)를 생성할 수 있다.To this end, the malicious code classification unit 230 collects the reference operation information 140 and the modified operation information 150 extracted from different applications 180 , and divides the different operation information according to the API classification information 130 . A database 160 may be created.

Here, the malicious code classification unit 230 may classify the different operation information according to the API classification information 130 and collect it in the database 160.

Accordingly, the malicious code classification unit 230 may calculate the similarity between different pieces of operation information that fall under the same API classification information 130 in the database 160 and, when the similarity satisfies a preset value, classify the one or more applications 180 from which that operation information was extracted as the same malicious code.
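One way to picture this grouping is the sketch below. It makes several assumptions: operation information is reduced to a set of strings, "satisfies a preset value" is read as "greater than or equal to a threshold", and applications are merged transitively with a small union-find. None of these choices is stated in the specification, and all names and sample strings are hypothetical.

```python
from itertools import combinations
from typing import Callable, Dict, FrozenSet, List, Set

Features = FrozenSet[str]


def group_same_malware(
    entries: Dict[str, Features],
    similarity: Callable[[Features, Features], float],
    threshold: float,
) -> List[Set[str]]:
    """Group application ids from ONE API classification by pairwise similarity."""
    parent = {app: app for app in entries}

    def find(app: str) -> str:
        # Follow parent links to the representative, halving the path as we go.
        while parent[app] != app:
            parent[app] = parent[parent[app]]
            app = parent[app]
        return app

    for a, b in combinations(entries, 2):
        if similarity(entries[a], entries[b]) >= threshold:
            parent[find(a)] = find(b)

    groups: Dict[str, Set[str]] = {}
    for app in entries:
        groups.setdefault(find(app), set()).add(app)
    return list(groups.values())


def overlap(a: Features, b: Features) -> float:
    """Shared fraction of two feature sets (an illustrative measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Illustrative operation information collected for a single classification.
bucket = {
    "app_a": frozenset({"sendTextMessage", "getDeviceId"}),
    "app_b": frozenset({"sendTextMessage", "getDeviceId", "openConnection"}),
    "app_c": frozenset({"requestLocationUpdates"}),
}
families = group_same_malware(bucket, overlap, threshold=0.6)
```

In this example `app_a` and `app_b` share two of three features and fall into one group, while `app_c` stays alone. Because only entries from the same classification are passed in, behavior observed at unrelated API levels is never compared.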

In this way, the malicious code classification unit 230 may generate the database 160 by collecting one or more pieces of reference operation information 140 and one or more pieces of modified operation information 150 according to the API classification information 130.

Here, the applications 180 from which the operation information collected to generate the database 160 is extracted may include a plurality of applications 180 already known to be malicious code.

In addition, the malicious code classification unit 230 may receive, from the information extraction unit 220, at least one of the reference operation information 140 and the modified operation information 150 extracted from another application 180. Here, the other application 180 may be understood as an application 180 different from the applications 180 from which the reference operation information 140 or the modified operation information 150 collected to generate the database 160 was extracted.

The malicious code classification unit 230 may classify the other application 180 according to a similarity calculated by comparing at least one of the reference operation information 140 and the modified operation information 150 of the other application 180 with the one or more pieces of reference operation information 140 and the one or more pieces of modified operation information 150 collected in the database 160.

Here, another application 180 classified as malicious code may be an application 180 for which the similarity, generated by comparing the operation information extracted by the information extraction unit 220 with at least one piece of operation information collected in the database 160, satisfies a preset value. The comparison may be made between pieces of operation information extracted for the same API classification information 130.
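The comparison of a new application against the database can be sketched as follows. The database layout (a list of labelled feature sets per classification), the label strings, the threshold semantics (score greater than or equal to the preset value), and the choice to return the best-scoring label are all assumptions made for illustration.

```python
from typing import Callable, Dict, FrozenSet, List, Optional, Tuple

Features = FrozenSet[str]
# database[classification] -> list of (known malware label, operation features)
Database = Dict[str, List[Tuple[str, Features]]]


def classify_application(
    app_info: Dict[str, Features],
    database: Database,
    similarity: Callable[[Features, Features], float],
    threshold: float,
) -> Optional[str]:
    """Return the best-matching known malware label, or None if nothing matches.

    app_info maps each classification to the operation information the new
    application showed there; only same-classification entries are compared.
    """
    best_label: Optional[str] = None
    best_score = threshold
    for classification, features in app_info.items():
        for label, known in database.get(classification, []):
            score = similarity(features, known)
            if score >= best_score:
                best_label, best_score = label, score
    return best_label


def shared_ratio(a: Features, b: Features) -> float:
    """Illustrative similarity: shared elements over all elements."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


db: Database = {
    "131a": [("family_x", frozenset({"a", "b", "c"}))],
    "131b": [("family_y", frozenset({"d", "e"}))],
}
suspect = {"131a": frozenset({"a", "b"}), "131b": frozenset({"z"})}
verdict = classify_application(suspect, db, shared_ratio, threshold=0.5)
```

Here the suspect matches `family_x` through its behavior in classification 131a even though its behavior in 131b matches nothing, which is the situation the per-classification comparison is meant to cover.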

FIG. 3 is a schematic diagram illustrating a method of collecting operation information of applications according to an embodiment of the present invention.

Referring to FIG. 3, reference operation information 140a, 140b, 140c and modified operation information 150a, 150b, 150c are extracted from different applications 180a, 180b, 180c, respectively, and the database 160 collects the different reference operation information 140a, 140b, 140c and modified operation information 150a, 150b, 150c according to the respective level classifications 131a, 131b, 131c.

Here, the database 160 may classify and collect each piece of operation information according to the API classification information 130 on which that operation information is based.

In addition, the reference operation information 140a, 140b, 140c of the different applications 180a, 180b, 180c may represent operations performed against the APIs of the level classification 131a, 131b, 131c that contains each application's target API level 110. The reference operation information 140a, 140b, 140c of the different applications 180a, 180b, 180c may therefore be extracted in different level classifications 131a, 131b, 131c.

Accordingly, any given level classification 131 may store the reference operation information 140 of a particular application 180 and may also store the modified operation information 150 of another application 180.
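A possible in-memory shape for such a database is sketched below. The record fields, the class name `OperationDatabase`, and the level-to-classification mapping are hypothetical; the specification does not prescribe a storage layout.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import DefaultDict, Dict, FrozenSet, List


@dataclass(frozen=True)
class OperationRecord:
    app_id: str
    kind: str                 # "reference" (140) or "modified" (150)
    api_level: int            # API level at which the behavior was observed
    features: FrozenSet[str]  # simplified stand-in for operation information


class OperationDatabase:
    """Stores records in one bucket per level classification."""

    def __init__(self, level_to_classification: Dict[int, str]) -> None:
        self._level_to_classification = dict(level_to_classification)
        self._buckets: DefaultDict[str, List[OperationRecord]] = defaultdict(list)

    def add(self, record: OperationRecord) -> None:
        # The bucket is chosen by the API level the record was observed at.
        classification = self._level_to_classification[record.api_level]
        self._buckets[classification].append(record)

    def bucket(self, classification: str) -> List[OperationRecord]:
        return list(self._buckets.get(classification, []))


db = OperationDatabase({21: "131a", 22: "131a", 23: "131b"})
db.add(OperationRecord("180a", "reference", 21, frozenset({"x"})))
db.add(OperationRecord("180b", "modified", 22, frozenset({"y"})))
db.add(OperationRecord("180b", "reference", 23, frozenset({"z"})))
```

After these calls, bucket 131a holds the reference record of application 180a next to the modified record of application 180b, mirroring the mixed contents described for a single level classification.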

FIG. 4 is a schematic diagram illustrating a method of classifying malicious code according to an embodiment of the present invention.

Referring to FIG. 4, the reference operation information 140d and the modified operation information 150d extracted from another application 180d are compared with the reference operation information 140 and the modified operation information 150 collected in the database 160 for each level classification 131 to generate a similarity, and the application 180d is classified as malicious code when the similarity satisfies a preset value.

Here, the operation information 140, 150 collected in the database 160 may be compared with the operation information 140d, 150d extracted from the other application 180d by comparing source code elements, such as the classes, methods, and variables that appear in each piece of operation information, to generate the similarity.
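The specification does not name a similarity measure. As one possible reading, the sketch below represents each piece of operation information as three sets of identifiers (classes, methods, variables) and averages the Jaccard index of each pair of sets. The measure, the equal weighting, and all identifiers are assumptions.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class OperationInfo:
    classes: FrozenSet[str] = frozenset()
    methods: FrozenSet[str] = frozenset()
    variables: FrozenSet[str] = frozenset()


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Jaccard index; two empty sets are treated as identical (1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def operation_similarity(x: OperationInfo, y: OperationInfo) -> float:
    """Unweighted mean of the class, method, and variable Jaccard indices."""
    parts = (
        jaccard(x.classes, y.classes),
        jaccard(x.methods, y.methods),
        jaccard(x.variables, y.variables),
    )
    return sum(parts) / len(parts)


known = OperationInfo(
    classes=frozenset({"SmsSender", "Loader"}),
    methods=frozenset({"send", "load", "hide"}),
    variables=frozenset({"c2Url"}),
)
sample = OperationInfo(
    classes=frozenset({"SmsSender", "Loader"}),
    methods=frozenset({"send", "load"}),
    variables=frozenset({"c2Url", "key"}),
)
score = operation_similarity(known, sample)  # (1 + 2/3 + 1/2) / 3 = 13/18
```

Treating two empty sets as identical keeps the score defined when a category is absent on both sides, but it also raises the average, so a real implementation might skip empty categories instead.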

In addition, the similarity may indicate the degree to which the operation information 140d, 150d extracted from the other application 180d resembles the operation information 140, 150 collected in the database 160.

FIG. 5 is a flowchart of a malicious code classification method according to an embodiment of the present invention.

Because the malicious code classification method according to an embodiment of the present invention runs on substantially the same configuration as the malicious code classification apparatus 200 shown in FIG. 2, the same reference numerals are used for the same components as in the apparatus 200 of FIG. 2, and repeated descriptions are omitted.

The malicious code classification method may include collecting system function information (500), generating API classification information (510), extracting reference operation information (520), extracting modified operation information (530), and classifying malicious code (540).

In the step 500 of collecting system function information, system function information according to the API level 110 of the mobile operating system may be collected.

Here, step 500 may collect one or more API levels 110 that change as the mobile operating system is updated, and may thus be understood as collecting the system function information provided at each API level 110.

In the step 510 of generating API classification information, the API levels 110 may be classified according to the level classification information 120 to generate the API classification information 130.

Here, step 510 may set, as a classification point, the API level 110 at which the version of the mobile operating system changes. A change in the version of the mobile operating system may be understood as a change in the name that denotes the version.
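As a sketch of this rule, the code below groups consecutive API levels that share a version name and reports the first level of each group as the classification point. The function names are hypothetical. The sample table uses Android's public API-level history purely as an illustration; the text above only speaks of a mobile operating system.

```python
from itertools import groupby
from typing import Dict, List, Tuple


def classify_api_levels(level_names: List[Tuple[int, str]]) -> Dict[str, List[int]]:
    """Group consecutive API levels that share one OS version name.

    Assumes each version name covers one contiguous run of API levels.
    """
    ordered = sorted(level_names)
    return {
        name: [level for level, _ in run]
        for name, run in groupby(ordered, key=lambda pair: pair[1])
    }


def classification_points(level_names: List[Tuple[int, str]]) -> List[int]:
    """Return the first API level of each version name, in ascending order."""
    return [levels[0] for levels in classify_api_levels(level_names).values()]


# Illustrative input: (API level, version name) pairs.
android_levels = [
    (21, "Lollipop"), (22, "Lollipop"),
    (23, "Marshmallow"),
    (24, "Nougat"), (25, "Nougat"),
    (26, "Oreo"), (27, "Oreo"),
    (28, "Pie"),
]
api_classification = classify_api_levels(android_levels)
points = classification_points(android_levels)
```

Each key of `api_classification` plays the role of one level classification 131, and `points` lists the API levels at which the version name changes.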

In the step 520 of extracting reference operation information, the reference operation information 140 may be extracted according to the target API level 110 of an application 180 executed on the mobile operating system.

Meanwhile, step 520 may extract reference operation information 140 from each of one or more different applications 180 according to the target API level 110 set for each application 180.

In the step 530 of extracting modified operation information, for the application 180 from which the reference operation information 140 has been extracted, modified operation information 150 may be extracted that describes how the application operates at an API level 110 different from the target API level 110 used to extract the reference operation information 140.

Meanwhile, step 530 may extract modified operation information 150 from each of one or more different applications 180, in each case at an API level 110 different from the target API level 110 set for that application 180.

In the step 540 of classifying malicious code, malicious code may be classified according to at least one of the reference operation information 140 and the modified operation information 150.

To this end, step 540 may collect the reference operation information 140 and the modified operation information 150 extracted from different applications 180 and generate a database 160 of the different operation information, organized according to the API classification information 130.

Here, step 540 may classify the different operation information according to the API classification information 130 and collect it in the database 160.

Accordingly, step 540 may calculate the similarity between different pieces of operation information that fall under the same API classification information 130 in the database 160 and, when the similarity satisfies a preset value, classify the one or more applications 180 from which that operation information was extracted as the same malicious code.

In this way, step 540 may generate the database 160 by collecting one or more pieces of reference operation information 140 and one or more pieces of modified operation information 150 according to the API classification information 130.

In addition, step 540 may receive at least one of the reference operation information 140 and the modified operation information 150 extracted from another application 180 from at least one of the step 520 of extracting reference operation information and the step 530 of extracting modified operation information.

Accordingly, step 540 may classify the other application 180 according to a similarity calculated by comparing at least one of the reference operation information 140 and the modified operation information 150 of the other application 180 with the one or more pieces of reference operation information 140 and the one or more pieces of modified operation information 150 collected in the database 160.

FIG. 6 is a detailed flowchart of the step of classifying malicious code in FIG. 5.

The step 540 of classifying malicious code may further include collecting operation information (541), receiving operation information (542), and classifying an application (543).

In the step 541 of collecting operation information, the reference operation information 140 and the modified operation information 150 extracted from different applications 180 may be collected to generate a database 160 of the different operation information, organized according to the API classification information 130.

In the step 542 of receiving operation information, at least one of the reference operation information 140 and the modified operation information 150 extracted from another application 180 may be received from at least one of the step 520 of extracting reference operation information and the step 530 of extracting modified operation information.

In the step 543 of classifying an application, the other application 180 may be classified according to a similarity calculated by comparing at least one of the reference operation information 140 and the modified operation information 150 of the other application 180 with the one or more pieces of reference operation information 140 and the one or more pieces of modified operation information 150 collected in the database 160.

The malicious code classification method described above may be implemented as an application, or in the form of program instructions executable by various computer components, and recorded on a computer-readable recording medium. The computer-readable recording medium may include program instructions, data files, data structures, and the like, alone or in combination.

The program instructions recorded on the computer-readable recording medium may be specially designed and configured for the present invention, or may be known to and usable by those skilled in the computer software field.

Examples of the computer-readable recording medium include magnetic media such as hard disks, floppy disks, and magnetic tape; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory.

Examples of program instructions include not only machine code, such as that produced by a compiler, but also high-level language code that a computer can execute using an interpreter or the like. The hardware device may be configured to operate as one or more software modules to perform the processing according to the present invention, and vice versa.

Although the invention has been described above with reference to embodiments, those skilled in the art will understand that the present invention may be modified and changed in various ways without departing from the spirit and scope of the invention as set forth in the claims below.

100: Malicious code classification system

Claims

A method for classifying malicious code whose operation is modified according to different API levels of a mobile operating system, the method comprising:
collecting system function information according to the API level of the mobile operating system;
generating API classification information by classifying the API level according to preset level classification information;
extracting reference operation information according to a target API level of an application executed in the mobile operating system;
extracting, for the application, modified operation information operating at an API level different from the target API level; and
classifying malicious code according to at least one of the reference operation information and the modified operation information,
wherein the extracting of the reference operation information comprises extracting reference operation information according to each target API level set in one or more applications executed in the mobile operating system,
wherein the extracting of the modified operation information comprises extracting, for the one or more applications executed in the mobile operating system, one or more pieces of modified operation information according to API classification information that includes an API level different from the target API level at which each piece of reference operation information is generated, and
wherein the classifying of the malicious code comprises:
collecting the one or more pieces of reference operation information and the one or more pieces of modified operation information according to the API classification information;
receiving, from at least one of the extracting of the reference operation information and the extracting of the modified operation information, at least one of reference operation information and modified operation information of another application; and
classifying the other application according to a similarity calculated by comparing the at least one of the reference operation information and the modified operation information of the other application with the collected one or more pieces of reference operation information and one or more pieces of modified operation information.

delete

The method of claim 1, wherein the collecting of the system function information comprises collecting system function information provided at one or more API levels that change as the mobile operating system is updated.

The method of claim 1, wherein the level classification information sets, as a classification point, the API level that changes when the version of the mobile operating system changes.

A computer-readable recording medium on which a computer program for performing the malicious code classification method according to any one of claims 1 to 6 is recorded.

A malicious code classification apparatus comprising:
an information collection unit that collects system function information according to an API level of a mobile operating system and classifies the API level according to preset level classification information to generate API classification information;
an information extraction unit that extracts reference operation information according to a target API level of an application executed in the mobile operating system and extracts, for the application, modified operation information operating at an API level different from the target API level; and
a malicious code classification unit that classifies malicious code according to at least one of the reference operation information and the modified operation information,
wherein the information extraction unit extracts reference operation information according to each target API level set in one or more applications executed in the mobile operating system and extracts, for the one or more applications executed in the mobile operating system, one or more pieces of modified operation information according to API classification information that includes an API level different from the target API level at which each piece of reference operation information is generated, and
wherein the malicious code classification unit:
collects the one or more pieces of reference operation information and the one or more pieces of modified operation information according to the API classification information;
receives at least one of reference operation information and modified operation information of another application from the information extraction unit; and
classifies the other application according to a similarity calculated by comparing the at least one of the reference operation information and the modified operation information of the other application with the collected one or more pieces of reference operation information and one or more pieces of modified operation information.

delete