KR20160031590A

KR20160031590A - Malicious app categorization apparatus and malicious app categorization method

Info

Publication number: KR20160031590A
Application number: KR1020140120951A
Authority: KR
Inventors: 김휘강; 윤재성; 장재욱; 우지영
Original assignee: 고려대학교 산학협력단
Priority date: 2014-09-12
Filing date: 2014-09-12
Publication date: 2016-03-23
Also published as: KR101657667B1

Abstract

The present invention relates to an apparatus and a method for categorizing malicious applications. The apparatus for categorizing malicious applications includes: a profiling module for generating an operation profile capable of identifying at least one malicious operation of an application from data based on analysis of the application; and an operation categorizing module for categorizing the application according to a malicious operation pattern of the application which is analyzed by the operation profile. When the apparatus and the method of the present invention are used, a mobile malicious application can be detected and classified with high accuracy by applying a profiling method based on the operation.

Description

Malicious app classification apparatus and malicious app classification method

The present invention relates to a malicious app classification apparatus and a malicious app classification method, and more specifically to an apparatus and method that detect and classify malicious apps by constructing a behavior-based profile from an app's log data.

The popularization of smartphones has created an environment in which information can be accessed and exchanged anytime, anywhere. While smartphone adoption is growing exponentially, hacking threats such as information leakage by mobile malicious apps are growing as well.

Smartphones based on the Android platform are particularly exposed to hacking risks. Under the Android platform's open policy, any developer can easily upload a malicious app to the Android market with only simple authentication, and apps themselves are easy to crack and repackage.

To curb the spread of exponentially growing malicious apps, antivirus vendors are extending signature-based detection, the malware detection method used on PCs, to the mobile environment. Signature-based detection requires continuous database updates, has difficulty detecting unknown (zero-day) attacks, and shows a markedly lower detection rate for malicious apps whose code has been partly or wholly changed while the malicious behavior itself is preserved.

Many prior studies have sought to address these weaknesses of signature-based detection. They mainly detect malicious apps based on permissions, APIs (Application Programming Interfaces), or system calls.

Permission-based detection produces many false positives, in which benign apps are classified as malicious. API-based detection requires decompilation and disassembly to be completed first and is vulnerable to tampering and obfuscation techniques. System-call-based detection is less accurate because it relies only on system call frequency and does not consider system call arguments.

A new malicious app classification apparatus and method are therefore needed that overcome the limitations found in prior research on malicious app detection and classify malicious apps accurately.

The present invention was devised to solve the problems described above. Its object is to provide a malicious app classification apparatus and method that detect and classify malicious apps by generating a behavior-based profile from the user-level and kernel-level log data output by dynamic analysis of an app.

To achieve this object, the malicious app classification apparatus includes a behavior profiling module that generates, from data based on analysis of an app, a behavior profile capable of identifying one or more malicious behaviors (operations) of the app, and a behavior categorization module that classifies the app according to the malicious behavior pattern analyzed from the behavior profile.

Likewise, the malicious app classification method includes (b) generating, from data based on analysis of an app, a behavior profile capable of identifying one or more malicious behaviors (operations) of the app, and (c) classifying the app according to the malicious behavior pattern analyzed from the behavior profile.

The malicious app classification apparatus and method according to the present invention have the effect of detecting and classifying malicious apps by generating a behavior-based profile from the user-level and kernel-level log data output by dynamic analysis of an app.

FIG. 1 shows an exemplary system block diagram and schematic internal structure for malicious app detection and classification according to the present invention.
FIG. 2 shows an exemplary functional block diagram of the app analysis unit.
FIG. 3 shows an exemplary flowchart for malicious app detection and classification.
FIG. 4 shows an example of a behavior profile generated for a specific malicious app.

The above objects, features, and advantages will become clearer from the detailed description given below with reference to the accompanying drawings, so that a person of ordinary skill in the art to which the present invention pertains can readily practice its technical idea. In describing the present invention, detailed descriptions of related known technology are omitted where they would unnecessarily obscure the gist of the invention. Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings.

FIG. 1 shows an exemplary system block diagram and schematic internal structure for malicious app detection and classification according to the present invention.

Referring to FIG. 1, the system for malicious app detection and classification includes one or more clients 100, a malicious app classification apparatus 200, and a communication network 300.

Looking at each block of the system, the client 100 is a terminal used by an ordinary user, for example a smartphone or tablet PC. The client 100 has a built-in processor and can run various application programs, referred to below as apps. The client 100 is preferably a portable terminal capable of telephone calls, SMS (Short Message Service), or MMS (Multimedia Message Service) over a mobile communication network even while on the move. The client 100 can download various apps over a mobile communication network or the Internet and run the downloaded apps.

Before running a downloaded app, the client 100 may be configured to send app information through a web browser to the malicious app classification apparatus 200, in order to learn whether the downloaded app is malicious and, if so, what kind of malicious app it is, and to receive the analysis result for that app. The app information includes, for example, the name of the downloaded app and the hash value of the app. Alternatively, the app information may include the package of the downloaded app.

The malicious app classification apparatus 200 is connected to one or more clients 100 through the communication network 300. It receives app detection and classification requests from a client 100, uses the received app information to detect whether the app is malicious and/or, if it is malicious, to classify what kind of malicious app it is, and transmits the analysis result to the client 100. The malicious app classification apparatus 200 constitutes a so-called server and can operate, for example, as a web server that serves web pages over HTTP (Hypertext Transfer Protocol).

The hardware of the malicious app classification apparatus 200 comprises at least a communication interface for interfacing with the communication network 300, one or more processors for running various programs, and one or more hard disks for storing various data.

As functional blocks, the malicious app classification apparatus 200 includes a transceiver 210, a database 220, an app checking unit 230, an app analysis unit 240, and a crawler 250. Each functional block is preferably implemented in combination with the hardware of the malicious app classification apparatus 200. For example, the transceiver 210 uses the communication interface and processes communication packets on it to transmit and receive data through the communication network 300. The database 220 provides access to and management of various data using a hard disk or the like. The crawler 250, the app checking unit 230, and the app analysis unit 240 are implemented as programs executable on the processor.

Looking more closely at the function of each block and their relationships, the crawler 250 continuously collects app files (for example, APK (Android PacKage) files) from official and unofficial markets over the Internet and stores the collected app files in the database 220, on a hard disk, or the like. The crawler 250 collects app files before any analysis request from a client 100 so that the app analysis unit 240 can determine in advance whether each app is malicious and, if so, classify it.

The transceiver 210 receives from the client 100 app information indicating the app to be analyzed and stores it in the database 220 or on the hard disk that backs the database 220. The app information includes, for example, the name (identifier) of the app file and the hash value of the app file. Alternatively, the app information may be the app file (APK file) itself. The app information may also be passed directly to the app checking unit 230. The transceiver 210 also transmits the analysis result generated by the app checking unit 230 or the app analysis unit 240 to the requesting client 100.

Using the app information that accompanies an app analysis request from the client 100, the app checking unit 230 and the app analysis unit 240 can determine whether the app is malicious and, if it is, classify the app and determine its category according to similarity.

The app checking unit 230 compares the hash value in the app information from the client 100 with the hash values of apps in the database 220 that have already been analyzed and classified by category. If the same hash value exists in the database 220, the app checking unit 230 generates an analysis result determined from the information extracted from the database 220, and then transmits the analysis result to the client 100 through the transceiver 210.

If the app information from the client 100 is not identified in the database 220, the app checking unit 230 uses the crawler 250 to automatically retrieve the app file from official and unofficial markets on the Internet, stores the retrieved app file in the database 220, and passes the app file to the app analysis unit 240. If the app file cannot be retrieved from the official or unofficial markets, the app checking unit 230 may request the app file from the client 100 as the app information.
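The lookup flow of the app checking unit 230 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the dictionary `analyzed_apps` stands in for the database 220, SHA-256 is an assumed hash algorithm (the text does not name one), and `fetch_from_market` and `analyze` are hypothetical callables standing in for the crawler 250 and the app analysis unit 240.

```python
import hashlib

# Hypothetical in-memory stand-in for database 220: hash value -> analysis result.
analyzed_apps = {}


def app_hash(apk_bytes: bytes) -> str:
    """Return a hex digest of an app file (SHA-256 is an assumption)."""
    return hashlib.sha256(apk_bytes).hexdigest()


def check_app(app_name, hash_value, fetch_from_market, analyze):
    """Return a stored result on a hash hit; otherwise fetch and analyze the app."""
    if hash_value in analyzed_apps:
        # Same hash already analyzed: reuse the stored result.
        return analyzed_apps[hash_value]
    # Crawler stand-in; returns None when no market has the file.
    apk_bytes = fetch_from_market(app_name)
    if apk_bytes is None:
        # Fall back to asking the client for the app file itself.
        return {"status": "need_app_file"}
    result = analyze(apk_bytes)
    # Store the result under the hash of the file that was actually analyzed.
    analyzed_apps[app_hash(apk_bytes)] = result
    return result
```

A second request carrying the same hash is answered from the stored results without invoking `analyze` again, which is the point of comparing hash values first.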

The app analysis unit 240 classifies app files, whether submitted by a client 100 for detection and classification or collected by the crawler 250, into categories through behavior-based profiling and can determine a sub-category within each category. The analysis result from the app analysis unit 240 is stored in the database 220 together with the hash value and can also be transmitted to the client 100 through the transceiver 210.

The client 100 displays the received analysis result on its screen, and the user can then decide, based on the analysis result, whether to install the downloaded app.

The app analysis unit 240 is described in detail with reference to FIGS. 2 and 3.

The communication network 300 is a wired, wireless, or combined wired and wireless network over which data can be transmitted and received between the malicious app classification apparatus 200 and the client 100. The communication network 300 includes a mobile communication network or the Internet. The mobile communication network may, of course, itself constitute an Internet network.

FIG. 2 shows an exemplary functional block diagram of the app analysis unit 240.

Referring to FIG. 2, the app analysis unit 240 includes a behavior identification module 241, a behavior profiling module 243, a behavior categorization module 245, and a similarity matching module 247.

Looking at each module, the behavior identification module 241 generates the log data used to create the behavior profile. This log data later makes it possible to identify the behaviors of the app under analysis.

More specifically, the behavior identification module 241 runs the app under analysis in an emulator environment and combines user log data from the user level with system log data from the kernel level to generate integrated log data. The integrated log data is then passed to the behavior profiling module 243 for behavior profiling.

For example, the behavior identification module 241 uses an emulator running at the user level and a loadable kernel module (LKM) that is dynamically linked into the kernel, runs the app for at least a predetermined time, and collects user log data from the emulator and system log data from the kernel module. The emulator is, for example, DroidBox, which can emulate the execution of Android apps.

From the user log data, the behavior identification module 241 collects data leak records and SMS/call records; from the dynamic kernel module, it collects system calls and their parameters. It combines these into the integrated log data, which includes at least the system calls and the parameters used in them.
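One way to picture the integration step is a timestamp-ordered merge of the two log streams. The sketch below is only an illustration: the record layout (the `ts`, `level`, `event`, `syscall`, and `args` keys) and the placeholder values are assumptions, since the text does not specify a log format or how the two logs are combined.

```python
import heapq


def integrate_logs(user_log, system_log):
    """Merge two timestamp-sorted record lists into one integrated log."""
    return list(heapq.merge(user_log, system_log, key=lambda rec: rec["ts"]))


# Hypothetical user-level records (data leaks, SMS/call records).
user_log = [
    {"ts": 1.0, "level": "user", "event": "sms", "number": "<premium-number>"},
    {"ts": 3.5, "level": "user", "event": "leak", "data": "IMEI"},
]

# Hypothetical kernel-level records (system calls with their parameters).
system_log = [
    {"ts": 0.8, "level": "kernel", "syscall": "open", "args": ["/sdcard/a.txt"]},
    {"ts": 3.4, "level": "kernel", "syscall": "sendto", "args": ["<host>", 80]},
]

integrated = integrate_logs(user_log, system_log)
```

Keeping the system call arguments in each kernel record matters here, because the description later relies on parameters, not just call frequency, to identify behavior.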

Malicious app authors can generally hide from the user the API calls associated with malicious behavior, which makes an app's malicious behavior hard to identify. By hooking system calls and logging which system functions are called and with which parameters, the exact behavior can be identified even when API calls are hidden.

The behavior profiling module 243 generates, from the integrated log data of the behavior identification module 241, a behavior profile capable of identifying one or more behaviors (operations) of the app under analysis. That is, the behavior profiling module 243 parses the integrated log data and builds the behavior profile from the parsed data.

The behavior profile generated by the behavior profiling module 243 is described in detail here.

The behavior profile (P) is composed as a set of four elements; the symbols of its definition are not reproduced in this text. One element is the set of all objects identified in the app under analysis. Another is the set of all operations, which has a nested dictionary structure made up of keys and values. A third is the mapping relation between objects and operations, and a fourth is the entire set of object-operation relations irrespective of sequence. One term of the definition, whose symbol is also missing here, is represented by the integrated log data.

An object represents the abstract functionality that a malicious app needs in order to carry out each behavior. An object is an abstraction of a particular behavior and can be described symbolically (the symbolic expression is not reproduced in this text). As described below, the behavior pattern of a malicious app is determined by one or more combinations of four behaviors: calling a premium-rate phone number (hereinafter 'premium-rate call'), sending an SMS to a premium-rate phone number (hereinafter 'premium-rate SMS'), collecting sensitive personal information and transmitting it elsewhere (hereinafter 'personal information transmission'), and converting data for transmission (hereinafter 'data conversion').

An operation represents a concrete malicious behavior; below, an operation may also be called a 'malicious behavior'. An operation consists of a name, a target, and attributes. The name is the identifier of the malicious behavior, and the target is what the malicious app attacks. For example, the contents of the external memory of the client 100 or its system information can be attack targets. The attributes are the data extracted for the attack target, that is, the sensitive personal information itself that the malicious app is after, such as premium-rate call or SMS phone numbers, country codes, or USIM information. An operation can likewise be described symbolically (the symbolic expression is not reproduced in this text).

Table 1 (not reproduced in this text) gives an example of the relation between a network object and its corresponding operations.

As Table 1 shows, the network object is mapped to two operations, and each operation in turn transmits or converts one or more attributes.

The behavior profiling module 243 identifies objects and operations (malicious behaviors) from the integrated log data and organizes the mapping between objects and malicious behaviors, together with each malicious behavior's name, target, and attributes, in a nested format. For example, the behavior profiling module 243 can express the behavior profile (P) as a nested dictionary in the Python programming language.
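A nested dictionary of this kind might look like the sketch below. Every key and value is a made-up illustration (the object names, operation names, and attribute lists are assumptions, not the contents of Table 1 or FIG. 4); only the overall shape, object to operation to target and attributes, follows the description.

```python
# Hypothetical behavior profile; all names and values are illustrative only.
behavior_profile = {
    "network": {                       # object
        "send_personal_info": {        # operation name
            "target": "system_info",   # what the operation attacks
            "attributes": ["IMEI", "device_model", "carrier"],
        },
        "convert_data": {
            "target": "system_info",
            "attributes": ["IMEI"],
        },
    },
    "telephony": {
        "send_premium_sms": {
            "target": "sms",
            "attributes": ["<premium-number>"],
        },
    },
}


def object_operation_pairs(profile):
    """Return the order-independent set of (object, operation) pairs."""
    return {(obj, op) for obj, ops in profile.items() for op in ops}


pairs = object_operation_pairs(behavior_profile)
```

Flattening the profile into a set of pairs, as `object_operation_pairs` does, gives one possible reading of an object-operation relation that ignores sequence.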

Each malicious behavior can be identified during emulation (and through kernel hooking). A malicious behavior is, for example, performing a premium-rate SMS, a premium-rate call, personal information transmission, or data conversion without notifying the user of the client 100. Such behaviors are categorized into the four malicious behaviors described above, and an app is represented through its behavior profile as a combination of these four.

FIG. 4 shows an example of a behavior profile generated for a specific malicious app: the behavior profile that the behavior profiling module 243 produced from the integrated log data collected for FakeBattScar, a data-collecting type of malicious app. FakeBattScar displays advertisements on the device screen and sends the device model, manufacturer, carrier information, location information, IMEI, OS/SDK version, and so on to the malicious app author's server. As FIG. 4 shows, the behavior profile lets an analyst see intuitively whether the app performs malicious behavior.

The behavior profiling module 243 builds the behavior profile for the app under analysis, and this behavior profile (P) is passed to the behavior categorization module 245.

The behavior categorization module 245 classifies the app under analysis into a specific category according to the behavior pattern analyzed from the behavior profile. The top-level categories are formed, for example, from combinations of the four malicious behaviors and correspond to 16 behavior patterns: one pattern represents a benign app, and the remaining 15 represent combinations of one or more of the four malicious behaviors.

In this way, the behavior categorization module 245 classifies the app under analysis by the behavior pattern recognized as a combination of the malicious behaviors performed by the app: premium-rate call, premium-rate SMS, personal information transmission, or data conversion.
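Because each of the four malicious behaviors is either present or absent, the 16 patterns correspond to the 2^4 subsets of the behaviors. A minimal sketch of such a mapping follows; the bit order and the numbering of categories (0 for benign, 1 to 15 for malicious patterns) are assumptions made for illustration, as are the behavior names.

```python
# Hypothetical names for the four malicious behaviors; the bit order is an assumption.
BEHAVIORS = ("premium_call", "premium_sms", "send_personal_info", "convert_data")


def category_index(observed):
    """Map a set of observed behaviors to a category number in 0..15 (0 = benign)."""
    return sum(1 << i for i, name in enumerate(BEHAVIORS) if name in observed)


def is_benign(observed):
    """An app with none of the four behaviors falls into the benign category."""
    return category_index(observed) == 0
```

Under this numbering, an app that sends premium-rate SMS and converts data lands in category 2 + 8 = 10, and an app showing none of the behaviors lands in category 0.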

If none of the four predefined malicious behaviors is present in the behavior profile, the behavior categorization module 245 treats the app as benign and stores its app information (for example, the app name and hash value) in the category for benign apps. The behavior categorization module 245 can then send an analysis result indicating that the app is benign to the client 100 through the transceiver 210, without running the similarity matching module 247.

The similarity matching module 247 determines the sub-category of the app classified by the behavior categorization module 245 by comparing its similarity with the representative (behavior) profile of each of the sub-categories within the app's category.

Looking more closely at the database 220 in the malicious app classification apparatus 200, the database 220 sorts the apps collected by the crawler 250 or requested by a client 100 into 16 top-level categories and further groups the apps within each category into sub-categories by cluster. One of the 16 categories is for benign apps, and the remaining 15 represent malicious behavior patterns formed from one or more of the four malicious behaviors.

Each sub-category within a malicious behavior pattern category can represent, for example, the diagnostic name of a malicious app. Apps classified into a sub-category are stored in the database 220 with the hash value of the app file and the behavior profile generated for the app by the behavior profiling module 243. The database 220 also stores a representative profile for each sub-category.

The representative profile is determined by the profiles of the apps in the sub-category and is constructed to capture the common character of that group. For example, the representative profile is updated whenever an app under analysis is assigned to the sub-category, by taking either the union or the intersection of the current representative profile and the app's behavior profile.

The intersection method copes easily with the noise introduced by malicious app variants and updates the representative profile so that it retains the characteristics common to the malicious apps in the group (sub-category). The union method updates the representative profile so that it can accommodate diverse variants.
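The two update rules can be illustrated with plain set operations. The sketch assumes, purely for illustration, that a profile has been flattened into a set of (object, operation, attribute) triples; the text does not say how the union or intersection is taken over the nested structure.

```python
def update_representative(representative, new_profile, mode="intersection"):
    """Return the updated representative profile for a sub-category.

    Both profiles are sets of hypothetical (object, operation, attribute) triples.
    """
    if mode == "intersection":
        # Keep only what every member shares; variant-specific noise drops out.
        return representative & new_profile
    if mode == "union":
        # Accumulate everything seen so far; accommodates diverse variants.
        return representative | new_profile
    raise ValueError("mode must be 'intersection' or 'union'")


current = {
    ("network", "send_personal_info", "IMEI"),
    ("network", "send_personal_info", "location"),
}
variant = {
    ("network", "send_personal_info", "IMEI"),
    ("network", "convert_data", "IMEI"),
}

narrowed = update_representative(current, variant, mode="intersection")
widened = update_representative(current, variant, mode="union")
```

With these inputs the intersection keeps only the shared IMEI transmission, while the union grows to three entries, which mirrors the trade-off between robustness to noise and coverage of variants.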

The similarity matching module 247 determines the sub-category of the app under analysis by similarity comparison. For each malicious behavior (premium-rate call, premium-rate SMS, personal information transmission, and data conversion), the module computes a similarity score against the representative profile of each sub-category and sums the scores to obtain the similarity.

The similarity score quantifies the degree to which the app can access hardware resources, such as the phone or camera, and system or personal information.

The similarity (S) for each sub-category is calculated as

$$S = \sum_{i=1}^{4} w_i \cdot BFS_i, \qquad \sum_{i=1}^{4} w_i = 1,$$

where $BFS_i$ and $w_i$ denote the i-th behavior similarity factor and its weight. BFS (Behavior Factor Similarity) is the behavior similarity factor and takes one of four forms depending on the index i: the similarity for the premium-rate call behavior (CS: Similarity of Calling premium-rate number), the similarity for the premium-rate SMS behavior (SS: Similarity of Sending premium-rate SMS), the similarity for the personal information transmission behavior (SIS: Similarity of collecting Sensitive Information), and the similarity for the data conversion behavior (CDS: Similarity of Converting Data).

A different weight may be applied to each similarity. For example, the weights may be set to 0.33 for SS, 0.33 for CS, 0.21 for SIS, and 0.13 for CDS, so that SS and CS carry larger weights than SIS and CDS. CDS has the lowest weight for two reasons. First, if an error occurs while communicating to transmit personal information, data conversion for the transmission becomes unnecessary, so CDS is weighted lower than the other similarity factors. Second, normal apps also commonly apply encoding or encryption when they need to communicate with a server, so data conversion alone can hardly be regarded as malicious behavior.
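As a worked example of the weighted sum, the sketch below plugs in the example weights given above. The function name and the sample factor scores are assumptions made for illustration.

```python
# Example weights from the description: SS 0.33, CS 0.33, SIS 0.21, CDS 0.13.
WEIGHTS = {"SS": 0.33, "CS": 0.33, "SIS": 0.21, "CDS": 0.13}

def subcategory_similarity(bfs, weights=WEIGHTS):
    """Weighted sum of the four behavior factor similarities (each in [0, 1])."""
    return sum(weights[name] * bfs[name] for name in weights)

# Hypothetical factor scores of one app against one representative profile.
sample_scores = {"SS": 1.0, "CS": 1.0, "SIS": 0.5, "CDS": 0.0}
sample_similarity = subcategory_similarity(sample_scores)
```

With these sample scores the result is 0.33 + 0.33 + 0.105 + 0 = 0.765, which shows that matching only the two heavily weighted factors is not enough to reach a high overall similarity.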

Table 2 shows how the similarity of each factor is measured. For SS (sending an SMS to a premium-rate number) and CS (calling a premium-rate number), the similarity score is a binary comparison of whether the corresponding hardware resource can be accessed. A string such as a phone number included in the attributes of a malicious behavior becomes a different phone number if even one character differs, so partial matching carries no useful information for similarity comparison. Therefore, in the comparison with the representative profile of each sub-category, a score of 1 is assigned if the app performs the same behavior based on the attributes of the malicious behavior, and 0 otherwise.

For SIS (transmitting sensitive data), the similarity score is calculated with the Jaccard index. For CDS, similarities are computed separately for the destination URL address, the encryption mode, and the encoding mode included in the attributes of the malicious behavior, and their average is used as the similarity score. The similarity score of the destination URL address is obtained by applying Longest Prefix Matching and then applying the Levenshtein distance to the part that does not match. The encryption mode and the encoding mode are each compared in binary fashion: 1 is assigned if the behavior is the same, and 0 otherwise.
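The per-factor measures can be sketched as follows. The Jaccard index and the Levenshtein distance are standard; how the prefix match and the edit distance are combined into a single URL score is not specified in the text, so the normalization in `url_similarity` is an assumption, as are the dictionary keys and the handling of two empty sets.

```python
def binary_similarity(app_value, rep_value):
    """CS, SS, encryption mode, encoding mode: 1 if identical, else 0."""
    return 1.0 if app_value == rep_value else 0.0

def jaccard(a, b):
    """Jaccard index |A & B| / |A | B| used for SIS."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # assumption: two empty sets are treated as identical
    return len(a & b) / len(a | b)

def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def url_similarity(a, b):
    """Longest common prefix first, then edit distance on the remainders.

    Assumed normalization: matched characters divided by the longer length.
    """
    p = 0
    while p < min(len(a), len(b)) and a[p] == b[p]:
        p += 1
    rest_a, rest_b = a[p:], b[p:]
    longest_rest = max(len(rest_a), len(rest_b))
    if p + longest_rest == 0:
        return 1.0
    return (p + longest_rest - levenshtein(rest_a, rest_b)) / (p + longest_rest)

def cds_similarity(app, rep):
    """Average of the URL, encryption-mode, and encoding-mode similarities."""
    parts = (
        url_similarity(app["url"], rep["url"]),
        binary_similarity(app["encryption"], rep["encryption"]),
        binary_similarity(app["encoding"], rep["encoding"]),
    )
    return sum(parts) / len(parts)
```

Unlike phone numbers, URLs of related variants often share a host and differ only in the path, which is why a graded score is more informative there than a binary comparison.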

In this way, the similarity matching module 247 calculates the similarity (S) score for each sub-category and compares each score with a threshold. Among the sub-categories whose similarity score exceeds the threshold, the module may select the one with the highest score as the sub-category of the app under analysis. The threshold may be, for example, 0.85. If no sub-category exceeds the threshold, the similarity matching module 247 creates a new sub-category in the database 220 and stores the behavior profile of the app in the new sub-category. Once the sub-category is determined, the similarity matching module 247 updates the representative profile by the union or intersection method.
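The selection rule can be sketched as below, assuming the per-sub-category scores are already available as a dictionary. The function name is hypothetical, and the sub-category names in the usage lines are borrowed only as sample labels.

```python
THRESHOLD = 0.85  # example threshold from the description

def choose_subcategory(scores, threshold=THRESHOLD):
    """Return the best-scoring sub-category that exceeds the threshold.

    `scores` maps sub-category name -> similarity S. Returning None signals
    that a new sub-category should be created for the app.
    """
    if not scores:
        return None
    best = max(scores, key=scores.get)
    return best if scores[best] > threshold else None

matched = choose_subcategory({"Boxer": 0.90, "AdWo": 0.87})
unmatched = choose_subcategory({"Boxer": 0.80, "AdWo": 0.40})
```

The comparison is strict because the text speaks of scores that exceed the threshold, so a score of exactly 0.85 leads to a new sub-category in this sketch.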

The similarity matching module 247 may then transmit the analysis result to the client 100 through the transceiver 210.

Experiments with an implementation of the app analysis unit 240 according to the present invention showed that similarity scores between malicious app sub-categories (for example, malicious app diagnosis names) reached at most 0.8, while similarity scores between malicious app sub-categories and normal apps were about 0.1. In addition, classification of malicious apps by similarity comparison achieved an accuracy of 99% or higher, classification between normal and malicious apps achieved an accuracy of 98% or higher, and a 1 MB app could be classified within 58 seconds. Diagnosis names used in the experiments, which can serve as sub-categories, include, for example, AdWo, AirPush, Boxer, and FakeBattScar.

Applying the behavior profiling technique to malicious app detection and classification in this way makes it possible to respond quickly and efficiently to malware samples, and to detect and classify new malicious apps while resolving various existing problems.

FIG. 3 illustrates an exemplary flowchart for malicious app detection and classification.

Because the modules and control for detection and classification have already been described in detail with reference to FIG. 2, the parts that overlap with the description of FIG. 2 are omitted here and the flow is reviewed briefly. The flowchart of FIG. 3 is carried out on the hardware of the malicious app classification apparatus 200, preferably by executing on a processor a program stored on a hard disk or the like.

First, the malicious app classification apparatus 200 receives app information from the client 100 (S101).

After receiving the app information, the apparatus determines whether the app corresponding to the app information exists in the database 220 (S103). Existence is determined by comparing the hash value included in the app information with the hash values stored in the database 220.

If the app already exists and has been classified into a sub-category in the database 220, the malicious app classification apparatus 200 generates the analysis result from the database 220 (S117).

If the app does not exist in the database 220, the malicious app classification apparatus 200 generates integrated log data, which is data that allows the malicious behaviors of the app corresponding to the app information to be identified (S105). The apparatus receives the app through the crawler 250 or the like, or from the client 100, and runs the app in an emulator. As the app runs in the emulator, the apparatus generates the integrated log data by integrating the user log data collected at the user level with the system log data collected at the kernel level by an LKM. The integrated log data includes at least system calls and the parameters used in those system calls.
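The text does not say how the two logs are integrated. One plausible reading, sketched below, is a timestamp-ordered merge of the two streams. The entry layout (`ts`, `event`, `args`), the sample event names, and the function name are all assumptions made for illustration, and both inputs are assumed to be already sorted by time.

```python
import heapq

def integrate_logs(user_log, system_log):
    """Merge user-level and kernel-level entries into one time-ordered list.

    Each entry is a dict with a numeric 'ts' key; a 'level' tag is added so
    the origin of every entry stays visible in the integrated log.
    """
    user = ({**entry, "level": "user"} for entry in user_log)
    kernel = ({**entry, "level": "kernel"} for entry in system_log)
    return list(heapq.merge(user, kernel, key=lambda entry: entry["ts"]))

# Hypothetical sample entries, not taken from any real log.
user_log = [
    {"ts": 1, "event": "sendTextMessage", "args": ["1234", "hello"]},
    {"ts": 4, "event": "getDeviceId", "args": []},
]
system_log = [
    {"ts": 2, "event": "sys_open", "args": ["/data/example"]},
    {"ts": 3, "event": "sys_write", "args": [5, 128]},
]
integrated = integrate_logs(user_log, system_log)
```

Keeping the call arguments alongside each event matters because the later profiling step relies on the parameters of the system calls, not only on their names.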

The malicious app classification apparatus 200 then generates, from the integrated log data (the data based on the analysis of the app), a behavior profile from which the malicious behaviors of the app can be identified (S107), and classifies the app into a category according to the malicious behavior pattern analyzed from the behavior profile (S109). This step detects whether the app under analysis is a normal app or a malicious app and identifies its top-level category.

Within the assigned category, the malicious app classification apparatus 200 determines the sub-category of the app by comparing its similarity with the representative profile of each sub-category (S111). Here the apparatus may use the threshold either to select a specific sub-category or to create a new sub-category.

Once the sub-category is determined, the malicious app classification apparatus 200 updates the representative profile of that sub-category (S113). The update may use, for example, the intersection method or the union method.

The malicious app classification apparatus 200 then updates the database 220 to reflect the sub-category determination and the updated representative profile, and generates the analysis result for the app under analysis (S115).

After generating the analysis result, the malicious app classification apparatus 200 transmits it to the client 100 via the transceiver 210 (S119). The client 100 that receives the analysis result may display it on a screen or the like to notify the user.
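The control flow of steps S101 to S119 can be summarized in a short sketch. The database is modeled as a dictionary keyed by hash value, and the whole analysis pipeline (S105 to S113) is passed in as a callable; both choices, along with all names, are assumptions made to keep the example self-contained.

```python
def handle_request(app_info, db, analyze):
    """Sketch of the FIG. 3 flow for one client request.

    app_info: dict containing at least a 'hash' key (S101).
    db:       dict mapping hash value -> stored analysis result.
    analyze:  callable standing in for S105-S113 (logging, profiling,
              categorization, sub-category matching, profile update).
    """
    key = app_info["hash"]
    if key in db:                      # S103: already classified?
        result = db[key]               # S117: result comes from the database
    else:
        result = analyze(app_info)     # S105-S113: full analysis
        db[key] = result               # S115: store and generate the result
    return result                      # S119: returned to the client

analyzed = []

def fake_analyze(info):
    # Placeholder analysis that records which hashes were actually analyzed.
    analyzed.append(info["hash"])
    return {"category": "normal", "sub_category": None}

store = {}
first = handle_request({"hash": "abc"}, store, fake_analyze)
second = handle_request({"hash": "abc"}, store, fake_analyze)
```

The second request for the same hash is answered from the store without re-running the analysis, which is the purpose of the hash check in S103.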

The present invention described above is not limited by the foregoing embodiments and the accompanying drawings, because a person of ordinary skill in the art to which the invention pertains can make various substitutions, modifications, and changes without departing from the technical spirit of the invention.

100: Client 200: Malicious app classification apparatus
210: Transceiver 220: Database
230: App verification unit 240: App analysis unit
241: Behavior identification module 243: Behavior profiling module
245: Behavior categorization module 247: Similarity matching module
250: Crawler
300: Communication network

Claims

1. A malicious app classification apparatus comprising:
a behavior profiling module that generates, from data based on an analysis of an app, a behavior profile capable of identifying one or more malicious behaviors of the app; and
a behavior categorization module that classifies the app according to a malicious behavior pattern of the app analyzed from the behavior profile.

2. The malicious app classification apparatus of claim 1,
further comprising a behavior identification module that generates the data for identifying the malicious behaviors of the app,
wherein the behavior identification module generates integrated log data by integrating user log data collected at the user level and system log data collected at the kernel level while the app is executed through an emulator, and passes the integrated log data to the behavior profiling module, and
wherein the integrated log data includes system calls and the parameters used in the system calls.

3. The malicious app classification apparatus of claim 1,
wherein the malicious behavior represents a premium-rate call, a premium-rate SMS, personal information transmission, or data conversion, and
wherein the behavior categorization module classifies the app into a malicious behavior pattern recognized as a combination of one or more of the premium-rate call, premium-rate SMS, personal information transmission, and data conversion performed by the app.

4. The malicious app classification apparatus of claim 1, further comprising:
a database that classifies malicious apps by category, classifies the malicious apps into a plurality of sub-categories within each category, and manages a representative profile for each sub-category; and
a similarity matching module that determines the sub-category of the app classified by the behavior categorization module by comparing its similarity with the representative profile of each sub-category within the corresponding category.

5. The malicious app classification apparatus of claim 4,
wherein the similarity matching module calculates the similarity by summing the similarities between the classified app and the representative profile for each of the premium-rate call, premium-rate SMS, personal information transmission, and data conversion malicious behaviors, and
wherein the weights applied to the similarities for the premium-rate call and premium-rate SMS malicious behaviors are larger than the weights applied to the similarities for the personal information transmission and data conversion malicious behaviors.

6. The malicious app classification apparatus of claim 4,
wherein the representative profile is updated when the sub-category of the app is determined, and
wherein the representative profile is updated to the intersection or the union of the profiles of the apps in the determined sub-category.

7. The malicious app classification apparatus of claim 5,
wherein the malicious behavior consists of a name, a goal, and an attribute, and
wherein the similarity calculated for each malicious behavior is calculated based on at least the attribute included in the malicious behavior.

8. The malicious app classification apparatus of claim 4, further comprising:
a transceiver that receives app information from a client and transmits an analysis result corresponding to the app information; and
a crawler that collects apps over the Internet,
wherein the malicious app classification apparatus classifies the apps collected through the crawler into the database and, when the app identified by the hash value of the received app information is already classified in the database, transmits the analysis result determined from the database to the client through the transceiver.

9. A malicious app classification method comprising:
(b) generating, from data based on an analysis of an app, a behavior profile capable of identifying one or more malicious behaviors of the app; and
(c) classifying the app according to a malicious behavior pattern of the app analyzed from the behavior profile.

10. The malicious app classification method of claim 9,
further comprising, before step (b), (a) generating the data for identifying the malicious behaviors of the app,
wherein step (a) generates integrated log data, which is the data based on the analysis of the app, by integrating user log data collected at the user level and system log data collected at the kernel level while the app is executed through an emulator, and
wherein the integrated log data includes system calls and the parameters used in the system calls.

11. The malicious app classification method of claim 9,
further comprising determining the sub-category of the classified app by comparing its similarity with the representative profile of each sub-category within the corresponding category of the classified app.