WO2023177020A1 - Deobfuscation apparatus for data flow analysis of obfuscated application, and method therefor - Google Patents

Deobfuscation apparatus for data flow analysis of obfuscated application, and method therefor

Info

Publication number
WO2023177020A1
Authority
WO
WIPO (PCT)
Prior art keywords
code
obfuscated
call flow
file
deobfuscated
Prior art date
Application number
PCT/KR2022/008971
Other languages
French (fr)
Korean (ko)
Inventor
이정현
조해현
이동호
Original Assignee
숭실대학교 산학협력단
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 숭실대학교 산학협력단
Publication of WO2023177020A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50: Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55: Detecting local intrusion or implementing counter-measures
    • G06F21/56: Computer malware detection or handling, e.g. anti-virus arrangements

Definitions

  • The present invention relates to a deobfuscation device and method for data flow analysis of an obfuscated application. More specifically, it relates to a device and method that enable analysis of more sensitive data flows by using the call instructions executed when an obfuscated APK (Android application package) runs.
  • Deobfuscation techniques are generally used to neutralize obfuscation techniques applied to protect the intellectual property rights of software, and to detect more quickly malicious behavior that attacks a system or steals important information from its users.
  • The technical problem to be solved by the present invention is to provide a deobfuscation device and method for data flow analysis of an obfuscated application that deobfuscate obfuscation information and hidden code using the call instructions executed when an obfuscated Android application package (APK) runs, thereby enabling analysis of more sensitive data flows.
  • A deobfuscation device for data flow analysis of an obfuscated application according to an embodiment of the present invention includes: a command extraction unit that executes an obfuscated APK (Android application package) file and extracts the executed commands;
  • a deobfuscation unit that deobfuscates the obfuscated APK file using the extracted commands;
  • a code insertion unit that compares the code of the obfuscated APK file with that of the deobfuscated APK file and, according to the comparison result, inserts the code of the deobfuscated APK file in the form of dummy code;
  • a call flow generation unit that generates a new call flow by recombining the dummy-code-form code of the deobfuscated APK file with the existing obfuscated call flow; and
  • a DEX file generation unit that uses the generated new call flow to generate a DEX (Dalvik Executable) file to which a call flow newly connecting the call flow broken by obfuscation has been added.
  • The device may further include a static taint analysis unit that performs static taint analysis using the generated DEX file.
  • The code insertion unit may compare the code of the obfuscated APK file with that of the deobfuscated APK file to determine whether deobfuscated information exists and, if it does, insert the code of the deobfuscated APK file after the existing code in the form of dummy code.
  • The call flow generation unit may compare the dummy-code-form code of the deobfuscated APK file with the code of the obfuscated APK file and recombine the obfuscated call flow so that it matches the existing call flow, thereby generating a new call flow.
  • The DEX file generation unit may generate a Classes.dex file by adding, to the code of the original APK file, a call flow that newly connects the call flow broken by obfuscation.
  • A deobfuscation method using the deobfuscation device for data flow analysis of an obfuscated application according to another embodiment of the present invention includes: executing an obfuscated APK (Android application package) file and extracting the executed commands; deobfuscating the obfuscated APK file using the extracted commands; comparing the code of the obfuscated APK file with that of the deobfuscated APK file and, according to the comparison result, inserting the code of the deobfuscated APK file in the form of dummy code; generating a new call flow by recombining the dummy-code-form code of the deobfuscated APK file with the existing obfuscated call flow; and using the generated new call flow to generate a DEX (Dalvik Executable) file to which a call flow newly connecting the call flow broken by obfuscation has been added.
  • The method may further include performing static taint analysis using the generated DEX file.
  • The step of inserting in the form of dummy code may compare the code of the obfuscated APK file with that of the deobfuscated APK file to determine whether deobfuscated information exists and, if it does, insert the code of the deobfuscated APK file after the existing code in the form of dummy code.
  • The step of generating the new call flow may compare the dummy-code-form code of the deobfuscated APK file with the code of the obfuscated APK file and recombine the obfuscated call flow so that it matches the existing call flow, thereby generating a new call flow.
  • The step of generating the DEX file may generate a Classes.dex file by adding, to the code of the original APK file, a call flow that newly connects the call flow broken by obfuscation.
  • According to the present invention, obfuscation information and hidden code are deobfuscated using the call instructions executed when the obfuscated APK runs, enabling analysis of more sensitive data flows. The result is not merely deobfuscated code but a deobfuscated call flow that can reproduce the original call flow.
  • In addition, by rewriting call flows that conventional deobfuscation technology cannot handle, the invention can recover, for an obfuscated APK, a call flow that follows the original flow, providing an analysis technique that can also be used in static taint analysis.
  • Figure 1 is a block diagram showing a deobfuscation device for data flow analysis of an obfuscated application according to an embodiment of the present invention.
  • Figure 2 is an overall configuration diagram showing the deobfuscation process of a deobfuscation device for data flow analysis of an obfuscated application according to an embodiment of the present invention.
  • Figure 3 is a flowchart showing the operational flow of a deobfuscation method for data flow analysis of an obfuscated application according to an embodiment of the present invention.
  • Figure 4 is a diagram illustrating a class deobfuscation rewriting process in the deobfuscation method according to an embodiment of the present invention.
  • Figure 5 is a diagram illustrating an API hiding deobfuscation rewriting process in the deobfuscation method according to an embodiment of the present invention.
  • First, a deobfuscation device for data flow analysis of an obfuscated application according to an embodiment of the present invention will be described with reference to FIGS. 1 and 2.
  • Figure 1 is a block diagram showing a deobfuscation device for data flow analysis of an obfuscated application according to an embodiment of the present invention, and Figure 2 is an overall configuration diagram showing the deobfuscation process of that device.
  • As shown in FIGS. 1 and 2, the deobfuscation device 100 for data flow analysis of an obfuscated application includes a command extraction unit 110, a deobfuscation unit 120, a code insertion unit 130, a call flow generation unit 140, a DEX file generation unit 150, and a static taint analysis unit 160.
  • The command extraction unit 110 executes the obfuscated APK (Android application package) file and extracts the executed commands.
  • Specifically, the obfuscated APK file is executed dynamically through the Android Process stage of the deobfuscation process shown in FIG. 2 in order to extract the commands.
  • The deobfuscation unit 120 deobfuscates the obfuscated APK file using the commands extracted by the command extraction unit 110.
  • That is, the commands extracted by the command extraction unit 110 are monitored and used as deobfuscated information.
  • The code insertion unit 130 compares the code of the obfuscated APK file with that of the deobfuscated APK file and, according to the comparison result, inserts the code of the deobfuscated APK file in the form of dummy code.
  • Specifically, the code insertion unit 130 compares the two to determine whether deobfuscated information exists and, if it does, inserts the code of the deobfuscated APK file after the existing code in the form of dummy code.
  • The commands extracted by the command extraction unit 110 are those dynamically loaded while the application runs. Among them, code that is deobfuscated relative to the existing code is appended after the existing code as dummy code, which makes it possible to tell the obfuscated information and the deobfuscated information apart.
  • The call flow generation unit 140 generates a new call flow by recombining the dummy-code-form code of the deobfuscated APK file with the existing obfuscated call flow.
  • Specifically, the call flow generation unit 140 compares the dummy-code-form code of the deobfuscated APK file with the code of the obfuscated APK file and recombines the obfuscated call flow so that it matches the existing call flow, thereby generating a new call flow.
  • That is, the deobfuscated code present as dummy code is compared with the obfuscated code, and a call is added so that the result matches the existing call flow.
  • The DEX file generation unit 150 uses the new call flow generated by the call flow generation unit 140 to generate a DEX (Dalvik Executable) file to which a call flow newly connecting the call flow broken by obfuscation has been added.
  • Specifically, the DEX file generation unit 150 creates a Classes.dex file by adding, to the code of the original APK file, a call flow that newly connects the call flow broken by obfuscation.
  • The original APK file cannot be analyzed properly because its call flow is obfuscated and cannot be identified accurately. In an embodiment of the present invention, therefore, the deobfuscated information extracted through a module called Deobfuscated Data Flow is not left merely as dummy code: a call flow that newly connects the call flow broken by obfuscation is added to the code of the original APK file so that the flow is connected.
  • The newly created deobfuscated APK file contains both the Classes.dex file of the existing APK file and the Classes.dex file of the deobfuscated APK file.
  • That is, in the APK file created in the embodiment, both the obfuscated information and the deobfuscated information are added to and stored in the Classes.dex file.
  • This solves the problem that static taint analysis is impossible because the call flow is hidden by obfuscation.
  • The static taint analysis unit 160 performs static taint analysis using the DEX file generated by the DEX file generation unit 150.
  • The code or data to be protected is deobfuscated, and the original code is inserted into the Classes.dex file.
  • The Static Taint Analysis portion shown in FIG. 2 corresponds to the static taint analysis unit 160. As shown in FIG. 2, static analysis uses the internal call flow of the Classes.dex file, so the static analysis tool works with the newly specified deobfuscated call flow rather than the existing obfuscated call flow.
  • The deobfuscated information can therefore follow the same flow as the original.
  • Figure 3 is a flowchart showing the operation flow of a deobfuscation method for data flow analysis of an obfuscated application according to an embodiment of the present invention, with reference to which the specific operation of the present invention will be described.
  • First, the command extraction unit 110 executes the obfuscated APK file and extracts the executed commands (S10).
  • In step S10, the obfuscated APK file is executed dynamically to extract the commands.
  • Next, the deobfuscation unit 120 deobfuscates the obfuscated APK file using the commands extracted in step S10 (S20).
  • That is, the code extracted in step S10 is monitored and used as deobfuscated information.
  • Next, the code insertion unit 130 compares the code of the obfuscated APK file with that of the deobfuscated APK file (S30).
  • The code insertion unit 130 then determines from the comparison whether deobfuscated information exists (S40).
  • If the determination in step S40 finds deobfuscated information, the code of the deobfuscated APK file is inserted after the existing code in the form of dummy code (S50).
  • Next, the call flow generation unit 140 generates a new call flow by recombining the dummy-code-form code of the deobfuscated APK file with the existing obfuscated call flow (S60).
  • In step S60, the dummy-code-form code of the deobfuscated APK file is compared with the code of the obfuscated APK file, and the obfuscated call flow is recombined to match the existing call flow, producing the new call flow.
  • Next, the DEX file generation unit 150 uses the new call flow generated in step S60 to generate a DEX (Dalvik Executable) file to which a call flow newly connecting the call flow broken by obfuscation has been added (S70).
  • That is, a Classes.dex file is created by adding, to the code of the original APK file, a call flow that newly connects the call flow broken by obfuscation.
  • Specifically, using the deobfuscation results for each option, the deobfuscated information is rewritten into the Smali code of the .dex file, producing a .dex file that has the original data flow.
  • Finally, the static taint analysis unit 160 may perform static taint analysis using the DEX file generated in step S70.
  • The static analysis process performs static taint analysis on the .dex file produced by the deobfuscation process, which has a data flow similar to the original. It detects flows from source to sink together with their entry points, extracting information from the AndroidManifest.xml file, the Classes.dex file, and the layout.xml file for the analysis. Whereas static taint analysis cannot find obfuscated data flows, the deobfuscated .dex file yields results similar to the original data flows.
  • Figure 4 is a diagram illustrating a class deobfuscation rewriting process in the deobfuscation method according to an embodiment of the present invention, and Figure 5 is a diagram illustrating an API hiding deobfuscation rewriting process in the same method.
  • In Figures 4 and 5, part a is the Smali code of the original application, part b is the Smali code of the obfuscated application, and part c is the deobfuscated Smali code result.
  • Figure 4 shows an application to which class obfuscation was applied, so that information leakage could not be found. The hidden classes were found through the deobfuscation process and rewritten into the .dex file, after which the information leakage was detected.
  • Figure 5 shows a case where the Hide Access option was applied and the API was hidden, making it impossible to find information leaks.
  • The flow of the hidden information, which is managed in the v0 table, was identified and rewritten in Smali code.
  • Information leakage was then detected by creating a deobfuscated .dex file.
  • As described above, the deobfuscation device and method for data flow analysis of an obfuscated application deobfuscate obfuscation information and hidden code using the call instructions executed when the obfuscated APK runs.
  • By rewriting call flows that conventional deobfuscation technology cannot handle, they can recover, for an obfuscated APK, a call flow that follows the original flow, providing an analysis technique that can also be used in static taint analysis.
  • The deobfuscation device and method for data flow analysis of an obfuscated application may be implemented in the form of program instructions that can be executed through various computer components and recorded on a computer-readable recording medium.
  • the computer-readable recording medium may include program instructions, data files, data structures, etc., singly or in combination.
  • the program instructions recorded on the computer-readable recording medium may be those specifically designed and configured for the present invention, or may be known and usable by those skilled in the computer software field.
  • Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tapes; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specifically configured to store and execute program instructions, such as ROM, RAM, and flash memory.
  • Examples of program instructions include not only machine language code such as that created by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like.
  • the hardware device may be configured to operate as one or more software modules to perform processing according to the invention.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Virology (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Storage Device Security (AREA)

Abstract

The present invention relates to a deobfuscation apparatus for data flow analysis of an obfuscated application, and a method therefor. The deobfuscation apparatus for data flow analysis of an obfuscated application, according to the present invention, comprises: an instruction extraction unit for extracting instructions executed by executing an obfuscated APK file; a deobfuscation unit for deobfuscating the obfuscated APK file by using the extracted instructions; a code insertion unit for comparing the code of the obfuscated APK file to the code of the deobfuscated APK file and inserting the code of the deobfuscated APK file in a dummy code form according to the comparison result; a call flow generation unit for generating a new call flow by recombining the code of the deobfuscated APK file in the dummy code form with a conventional obfuscation call flow; and a DEX file generation unit, which uses the generated new call flow so as to generate a DEX file to which a call flow, that newly connects a call flow disconnected because of obfuscation, is added.

Description

Deobfuscation device and method for data flow analysis of obfuscated applications
The present invention relates to a deobfuscation device and method for data flow analysis of an obfuscated application. More specifically, it relates to a device and method that enable analysis of more sensitive data flows by using the call instructions executed when an obfuscated APK (Android application package) runs.
To protect users from malware that leaks their sensitive information, it is important to find and detect malicious behavior in the malware as quickly as possible. However, attackers use various obfuscation techniques to evade detection, which can prolong the lifespan of malware and makes timely malware analysis increasingly difficult.
Deobfuscation techniques are generally used to neutralize obfuscation techniques applied to protect the intellectual property rights of software, and to detect more quickly malicious behavior that attacks a system or steals important information from its users.
However, most conventional deobfuscation solutions, even when they deobfuscate an obfuscated Android application and can identify the obfuscated information, cannot accurately locate sensitive data when the result is analyzed with a static analysis tool. In an obfuscated application, the data call flow is transformed into a form that static analysis tools cannot identify, so an analyst still has to search out the relevant parts one by one. In other words, static analysis can be run, but because its flow cannot be connected accurately, many issues go unidentified by static analysis tools.
Therefore, a technique is needed that, unlike conventional techniques, deobfuscates the application and connects the deobfuscated call flow to the existing call flow so that static taint analysis can cover more of the application.
The background art of the present invention is disclosed in Korean Registered Patent Publication No. 10-1861341 (published May 28, 2018).
The technical problem to be solved by the present invention is to provide a deobfuscation device and method for data flow analysis of an obfuscated application that deobfuscate obfuscation information and hidden code using the call instructions executed when an obfuscated Android application package (APK) runs, thereby enabling analysis of more sensitive data flows.
To solve this technical problem, a deobfuscation device for data flow analysis of an obfuscated application according to an embodiment of the present invention includes: a command extraction unit that executes an obfuscated APK (Android application package) file and extracts the executed commands; a deobfuscation unit that deobfuscates the obfuscated APK file using the extracted commands; a code insertion unit that compares the code of the obfuscated APK file with that of the deobfuscated APK file and, according to the comparison result, inserts the code of the deobfuscated APK file in the form of dummy code; a call flow generation unit that generates a new call flow by recombining the dummy-code-form code of the deobfuscated APK file with the existing obfuscated call flow; and a DEX file generation unit that uses the generated new call flow to generate a DEX (Dalvik Executable) file to which a call flow newly connecting the call flow broken by obfuscation has been added.
The device may further include a static taint analysis unit that performs static taint analysis using the generated DEX file.
The code insertion unit may compare the code of the obfuscated APK file with that of the deobfuscated APK file to determine whether deobfuscated information exists and, if it does, insert the code of the deobfuscated APK file after the existing code in the form of dummy code.
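The compare-then-append behavior of the code insertion unit can be sketched on a toy model. This is a minimal sketch, not the implementation described in the patent: it assumes each method body is simply a list of Smali-like instruction strings, and the function name `insert_dummy_code`, the method key, and the sample instructions are all hypothetical.

```python
from typing import Dict, List

# Toy model (assumption): method name -> list of Smali-like instruction strings.
Code = Dict[str, List[str]]


def insert_dummy_code(obfuscated: Code, deobfuscated: Code) -> Code:
    """Append instructions recovered by deobfuscation after the existing code.

    Hypothetical helper: an instruction that appears in the deobfuscated code
    but not in the obfuscated code is treated as "deobfuscated information"
    and is appended after the existing method body.
    """
    merged: Code = {}
    for method, body in obfuscated.items():
        recovered = [ins for ins in deobfuscated.get(method, []) if ins not in body]
        merged[method] = body + recovered  # existing code first, dummy code after
    return merged


# Illustrative data only; the class and method names are made up.
obfuscated = {
    "La;->a()V": [
        'const-string v0, "x1"',
        "invoke-static {v0}, La;->b(Ljava/lang/String;)V",
    ],
}
deobfuscated = {
    "La;->a()V": [
        'const-string v0, "x1"',
        "invoke-virtual {v1}, Landroid/telephony/TelephonyManager;"
        "->getDeviceId()Ljava/lang/String;",
    ],
}
merged = insert_dummy_code(obfuscated, deobfuscated)
for line in merged["La;->a()V"]:
    print(line)
```

The existing instructions keep their position and the recovered ones follow them, which mirrors the "after the existing code" placement described above. A real tool would compare parsed Smali rather than raw strings.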
The call flow generation unit may compare the dummy-code-form code of the deobfuscated APK file with the code of the obfuscated APK file and recombine the obfuscated call flow so that it matches the existing call flow, thereby generating a new call flow.
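Recombining the call flow can be pictured as adding the edges that obfuscation hid. The following is a minimal sketch under stated assumptions: the call flow is modeled as a plain adjacency map, the hidden call is assumed (for illustration only) to go through reflection, and `recombine_call_flow` and all node names are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

# Toy model (assumption): caller -> set of callees.
CallGraph = Dict[str, Set[str]]


def recombine_call_flow(
    obfuscated: CallGraph, resolved_calls: Iterable[Tuple[str, str]]
) -> CallGraph:
    """Return a new call flow with edges recovered at run time added.

    `resolved_calls` holds (caller, real_target) pairs, standing in for the
    deobfuscated information obtained by executing the application.
    """
    # Copy so that the obfuscated flow itself is left untouched.
    new_flow = {caller: set(callees) for caller, callees in obfuscated.items()}
    for caller, target in resolved_calls:
        new_flow.setdefault(caller, set()).add(target)
        new_flow.setdefault(target, set())  # make sure the target is a node
    return new_flow


# Illustrative data only: the real callee is hidden behind a reflective call.
obfuscated_flow = {
    "MainActivity.onCreate": {"a.a"},
    "a.a": {"java.lang.reflect.Method.invoke"},
    "java.lang.reflect.Method.invoke": set(),
}
resolved = [("a.a", "TelephonyManager.getDeviceId")]
new_flow = recombine_call_flow(obfuscated_flow, resolved)
print(sorted(new_flow["a.a"]))
```

Keeping the original edges and only adding the recovered ones reflects the idea that the new flow is built on top of the existing obfuscated flow instead of replacing it.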
The DEX file generation unit may generate a Classes.dex file by adding, to the code of the original APK file, a call flow that newly connects the call flow broken by obfuscation.
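Adding a recovered call to the original code amounts, at the text level, to emitting a Smali invoke instruction and placing it in the method body before the file is reassembled. The sketch below only produces Smali text and assumes a very simplified method layout; `render_invoke` and `append_before_end` are hypothetical helpers, and turning the result into a DEX file would need a separate Smali assembler, which is not shown.

```python
from typing import List, Sequence


def render_invoke(kind: str, registers: Sequence[str], cls: str,
                  name: str, params: Sequence[str], ret: str) -> str:
    """Format one Smali invoke instruction from its parts."""
    regs = ", ".join(registers)
    return f"invoke-{kind} {{{regs}}}, {cls}->{name}({''.join(params)}){ret}"


def append_before_end(method_lines: List[str], instruction: str) -> List[str]:
    """Return a copy of a Smali method with the instruction placed after the
    existing code, just before the '.end method' directive."""
    end = method_lines.index(".end method")
    return method_lines[:end] + ["    " + instruction] + method_lines[end:]


# Illustrative data only; the method and registers are made up.
method = [
    ".method public a()V",
    "    return-void",
    ".end method",
]
call = render_invoke(
    "virtual", ["v1"], "Landroid/telephony/TelephonyManager;",
    "getDeviceId", [], "Ljava/lang/String;",
)
patched = append_before_end(method, call)
print("\n".join(patched))
```

Placing the instruction after the existing body follows the dummy-code placement described earlier. Register allocation and bytecode verification are outside the scope of this sketch.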
A deobfuscation method using the deobfuscation device for data flow analysis of an obfuscated application according to another embodiment of the present invention includes: executing an obfuscated APK (Android application package) file and extracting the executed commands; deobfuscating the obfuscated APK file using the extracted commands; comparing the code of the obfuscated APK file with that of the deobfuscated APK file and, according to the comparison result, inserting the code of the deobfuscated APK file in the form of dummy code; generating a new call flow by recombining the dummy-code-form code of the deobfuscated APK file with the existing obfuscated call flow; and using the generated new call flow to generate a DEX (Dalvik Executable) file to which a call flow newly connecting the call flow broken by obfuscation has been added.
The method may further include performing static taint analysis using the generated DEX file.
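Why a reconnected flow matters for static taint analysis can be shown with a deliberately simplified model. The sketch below treats the analysis as reachability from a source to a sink over a flow graph, which is an assumption: real taint analyzers track data through variables, fields, and callbacks. The function `find_flow` and every node name are hypothetical.

```python
from collections import deque
from typing import Dict, List, Optional, Set

# Toy model (assumption): node -> nodes that data from it may reach next.
FlowGraph = Dict[str, Set[str]]


def find_flow(graph: FlowGraph, source: str, sink: str) -> Optional[List[str]]:
    """Breadth-first search for a path from source to sink, or None."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == sink:
            return path
        for nxt in sorted(graph.get(node, ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


SOURCE = "TelephonyManager.getDeviceId"
SINK = "SmsManager.sendTextMessage"

# Obfuscated flow: the step from a.a to a.b is hidden, so the path is broken.
obfuscated_flow = {SOURCE: {"a.a"}, "a.a": set(), "a.b": {SINK}}
# Rewritten flow: the recovered edge a.a -> a.b has been added back.
rewritten_flow = {SOURCE: {"a.a"}, "a.a": {"a.b"}, "a.b": {SINK}}

print(find_flow(obfuscated_flow, SOURCE, SINK))
print(find_flow(rewritten_flow, SOURCE, SINK))
```

With the hidden edge missing the search finds no path, while the rewritten graph yields a complete source-to-sink path, which is the effect the method aims for on the regenerated DEX file.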
The step of inserting in the form of dummy code may compare the code of the obfuscated APK file with that of the deobfuscated APK file to determine whether deobfuscated information exists and, if it does, insert the code of the deobfuscated APK file after the existing code in the form of dummy code.
The step of generating the new call flow may compare the dummy-code-form code of the deobfuscated APK file with the code of the obfuscated APK file and recombine the obfuscated call flow so that it matches the existing call flow, thereby generating a new call flow.
The step of generating the DEX file may generate a Classes.dex file by adding, to the code of the original APK file, a call flow that newly connects the call flow broken by obfuscation.
According to the present invention, obfuscation information and hidden code are deobfuscated using the call instructions executed when the obfuscated APK runs, enabling analysis of more sensitive data flows. The result is not merely deobfuscated code but a deobfuscated call flow that can reproduce the original call flow.
In addition, by rewriting call flows that conventional deobfuscation technology cannot handle, the invention can recover, for an obfuscated APK, a call flow that follows the original flow, providing an analysis technique that can also be used in static taint analysis.
Furthermore, whatever obfuscation tool was used, the deobfuscation process is expected to help uncover more information leaks than can be found in the obfuscated state, and it can reduce the time and effort malware analysts spend analyzing the logic of malicious mobile applications, contributing to effective response and analysis.
Figure 1 is a block diagram showing a deobfuscation device for data flow analysis of an obfuscated application according to an embodiment of the present invention.
Figure 2 is an overall configuration diagram showing the deobfuscation process of a deobfuscation device for data flow analysis of an obfuscated application according to an embodiment of the present invention.
Figure 3 is a flowchart showing the operational flow of a deobfuscation method for data flow analysis of an obfuscated application according to an embodiment of the present invention.
Figure 4 is a diagram illustrating a class deobfuscation rewriting process in the deobfuscation method according to an embodiment of the present invention.
Figure 5 is a diagram illustrating an API hiding deobfuscation rewriting process in the deobfuscation method according to an embodiment of the present invention.
Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings. In the drawings, the thickness of lines and the size of components may be exaggerated for clarity and convenience of description.
In addition, the terms used below are defined in consideration of their functions in the present invention and may vary depending on the intention or custom of a user or operator. The definitions of these terms should therefore be based on the content of this specification as a whole.
Hereinafter, preferred embodiments of the present invention are described in more detail with reference to the drawings.
First, a deobfuscation apparatus for data flow analysis of an obfuscated application according to an embodiment of the present invention is described with reference to FIGS. 1 and 2.
FIG. 1 is a block diagram of the deobfuscation apparatus for data flow analysis of an obfuscated application according to an embodiment of the present invention, and FIG. 2 is an overall configuration diagram showing the deobfuscation process of that apparatus.
As shown in FIGS. 1 and 2, the deobfuscation apparatus 100 for data flow analysis of an obfuscated application according to an embodiment of the present invention includes an instruction extraction unit 110, a deobfuscation unit 120, a code insertion unit 130, a call flow generation unit 140, a DEX file generation unit 150, and a static taint analysis unit 160.
First, the instruction extraction unit 110 executes an obfuscated APK (Android application package) file and extracts the instructions that are executed.
Specifically, the obfuscated APK file is executed dynamically through the Android Process of the deobfuscation process shown in FIG. 2, and the instructions are extracted.
The deobfuscation unit 120 then deobfuscates the obfuscated APK file using the instructions extracted by the instruction extraction unit 110.
That is, the instructions extracted by the instruction extraction unit 110 are monitored and used as deobfuscation information.
The code insertion unit 130 compares the code of the obfuscated APK file with the code of the deobfuscated APK file and, according to the comparison result, inserts the code of the deobfuscated APK file in the form of dummy code.
Specifically, the code insertion unit 130 compares the code of the obfuscated APK file with the code of the deobfuscated APK file to determine whether deobfuscated information exists. If it does, the code of the deobfuscated APK file is inserted after the existing code in the form of dummy code.
That is, the instructions extracted by the instruction extraction unit 110 are loaded dynamically and extracted as the application runs. Among them, code found to be deobfuscated when compared with the existing code is appended after the existing code as dummy code. This makes it possible to distinguish the obfuscated information from the deobfuscated information.
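As a rough illustration of the comparison and insertion just described, the sketch below treats a method body as a list of instruction strings. The function name `insert_dummy_code`, the list representation, and the sample smali-like instructions are assumptions made for illustration; the specification does not define them.

```python
from typing import List


def insert_dummy_code(existing: List[str], extracted: List[str]) -> List[str]:
    """Append instructions observed at run time that are absent from the
    existing (obfuscated) code, placing them after the existing code."""
    seen = set(existing)
    recovered: List[str] = []
    for instruction in extracted:
        if instruction not in seen:
            seen.add(instruction)
            recovered.append(instruction)
    # If nothing new was observed, the existing code is returned unchanged.
    return list(existing) + recovered


# Hypothetical obfuscated method body: an encrypted string and a decryptor call.
obfuscated = [
    'const-string v0, "kq3Xz=="',
    "invoke-static {v0}, La/b;->a(Ljava/lang/String;)Ljava/lang/String;",
]
# Hypothetical run-time trace: the decrypted string becomes visible.
trace = [
    "invoke-static {v0}, La/b;->a(Ljava/lang/String;)Ljava/lang/String;",
    'const-string v0, "getDeviceId"',
]
patched = insert_dummy_code(obfuscated, trace)
```

Here the existing code is left untouched and only the newly observed instruction is appended, which mirrors the "insert after the existing code" behavior in the text.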
The call flow generation unit 140 generates a new call flow by recombining the code of the deobfuscated APK file, held in the form of dummy code, with the existing obfuscated call flow.
Here, the call flow generation unit 140 compares the dummy-code form of the deobfuscated APK file's code with the code of the obfuscated APK file, and recombines the obfuscated call flow so that it matches the existing call flow, thereby generating the new call flow.
That is, the deobfuscated code that exists as dummy code is compared with the obfuscated code, and calls are added so that the result matches the existing call flow.
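The call-adding step can be pictured as adding edges to a call graph. The following is a minimal sketch under assumptions: the call graph is modeled as a dictionary from caller to a set of callees, and `relink_call_flow`, the method names, and the `resolved` mapping (targets observed during dynamic execution) are hypothetical.

```python
from typing import Dict, Set

CallGraph = Dict[str, Set[str]]


def relink_call_flow(obfuscated: CallGraph, resolved: Dict[str, Set[str]]) -> CallGraph:
    """Keep every obfuscated edge and add a direct edge for each call target
    that was recovered from the deobfuscated (dummy) code."""
    new_flow = {caller: set(callees) for caller, callees in obfuscated.items()}
    for caller, targets in resolved.items():
        new_flow.setdefault(caller, set()).update(targets)
    return new_flow


# Hypothetical obfuscated flow: the real target is hidden behind reflection.
obfuscated_flow = {
    "MainActivity.onCreate": {"o.a"},
    "o.a": {"Method.invoke"},
}
# Hypothetical target recovered at run time for caller "o.a".
resolved_targets = {"o.a": {"TelephonyManager.getDeviceId"}}

new_flow = relink_call_flow(obfuscated_flow, resolved_targets)
```

The sketch copies the input graph rather than mutating it, so the obfuscated flow remains available for comparison.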
The DEX file generation unit 150 uses the new call flow generated by the call flow generation unit 140 to generate a DEX (Dalvik Executable) file to which a call flow that reconnects the call flow broken by obfuscation has been added.
Here, the DEX file generation unit 150 generates a Classes.dex file by adding, to the code of the original APK file, the call flow that reconnects the call flow broken by obfuscation.
Specifically, the call flow of the original APK file is obfuscated, so the call flow cannot be identified accurately and the file cannot be analyzed. In the embodiment of the present invention, therefore, a module called Deobfuscated Data Flow does not leave the extracted deobfuscation information only in dummy-code form: it adds calls that reconnect the broken, obfuscated call flow to the code of the original APK file, thereby linking the flow.
The newly created deobfuscated APK file then has both the Classes.dex file of the existing APK file and the Classes.dex file of the deobfuscated APK file.
In the newly added Classes.dex file, information that was obfuscated and unreadable in the existing dex file has been deobfuscated and restored in the form of dummy code, and the call flow broken by obfuscation is reconnected.
That is, the APK file produced in the embodiment of the present invention stores the obfuscated information together with the deobfuscated information additionally written into the Classes.dex file. This resolves the problem that static taint analysis was impossible because obfuscation hid the call flow.
The static taint analysis unit 160 performs static taint analysis using the DEX file generated by the DEX file generation unit 150.
In the embodiment of the present invention, the code or data to be protected is deobfuscated and the original code is inserted into the Classes.dex file. The Static Taint Analysis block shown in FIG. 2 corresponds to the static taint analysis unit 160. As shown in FIG. 2, static analysis uses the call flow inside the Classes.dex file, so the static analysis tool sees the newly specified deobfuscated call flow rather than the existing obfuscated call flow.
That is, according to the embodiment of the present invention, where a solution deobfuscates an obfuscated Android application but still cannot support static taint analysis, static taint analysis becomes possible by locating the deobfuscated information and re-specifying the existing call flow. The obfuscated information contained in the Android application is deobfuscated to identify the obfuscated call flow, which is then re-specified as a deobfuscated call flow, so that the deobfuscated information can have the same flow as the original.
Hereinafter, a deobfuscation method for data flow analysis of an obfuscated application according to an embodiment of the present invention is described with reference to FIGS. 3 to 5.
FIG. 3 is a flowchart showing the operational flow of the deobfuscation method for data flow analysis of an obfuscated application according to an embodiment of the present invention; the specific operation of the present invention is described with reference to it.
According to the embodiment of the present invention, the instruction extraction unit 110 first executes the obfuscated APK file and extracts the instructions that are executed (S10).
In step S10, the obfuscated APK file is executed dynamically to extract the instructions.
The deobfuscation unit 120 then deobfuscates the obfuscated APK file using the instructions extracted in step S10 (S20).
That is, the code extracted in step S10 is monitored and used as deobfuscation information.
The code insertion unit 130 then compares the code of the obfuscated APK file with the code of the deobfuscated APK file (S30).
Here, the code insertion unit 130 determines from this comparison whether deobfuscated information exists (S40).
If the determination in step S40 finds that deobfuscated information exists, the code of the deobfuscated APK file is inserted after the existing code in the form of dummy code (S50).
The call flow generation unit 140 then generates a new call flow by recombining the code of the deobfuscated APK file, held in the form of dummy code, with the existing obfuscated call flow (S60).
In step S60, the dummy-code form of the deobfuscated APK file's code is compared with the code of the obfuscated APK file, and the obfuscated call flow is recombined to match the existing call flow, generating the new call flow.
The DEX file generation unit 150 then uses the new call flow generated in step S60 to generate a DEX (Dalvik Executable) file to which a call flow that reconnects the call flow broken by obfuscation has been added (S70).
Specifically, a Classes.dex file is generated by adding, to the code of the original APK file, the call flow that reconnects the call flow broken by obfuscation.
That is, using the deobfuscation result for each option, the deobfuscated information is rewritten into the smali code of the .dex file, producing a .dex file that has the original data flow.
The static taint analysis unit 160 may then perform static taint analysis using the DEX file generated in step S70.
Specifically, the static analysis stage performs static taint analysis on the .dex file produced by the deobfuscation process, which has a data flow similar to the original. It detects entry points and the flows leading from a source to a sink, and extracts information from the AndroidManifest.xml file, the Classes.dex file, and the layout.xml file for analysis. An obfuscated data flow cannot be found by static taint analysis, whereas a deobfuscated .dex file can yield results similar to the original data flow.
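To make the source-to-sink idea concrete, the sketch below searches a toy data flow graph for a path from a source API to a sink API. It is only an illustration under assumptions: the graph encoding, the function `find_leaks`, and the API and register names are hypothetical, and a real taint analyzer models far more (contexts, fields, lifecycle callbacks) than a plain graph search.

```python
from collections import deque
from typing import Dict, List, Set

FlowGraph = Dict[str, Set[str]]  # edge a -> b means data can flow from a to b


def find_leaks(flow: FlowGraph, sources: Set[str], sinks: Set[str]) -> List[List[str]]:
    """Return one shortest source-to-sink path per source, if any exists."""
    leaks: List[List[str]] = []
    for src in sorted(sources):
        queue = deque([[src]])
        visited = {src}
        while queue:
            path = queue.popleft()
            node = path[-1]
            if node in sinks:
                leaks.append(path)
                break  # one reported path per source is enough here
            for nxt in sorted(flow.get(node, ())):
                if nxt not in visited:
                    visited.add(nxt)
                    queue.append(path + [nxt])
    return leaks


sources = {"TelephonyManager.getDeviceId()"}
sinks = {"SmsManager.sendTextMessage()"}

# Obfuscated: the source is hidden behind a reflective call, so no flow starts at it.
obfuscated_flow = {"Method.invoke()": {"v1"}, "v1": {"SmsManager.sendTextMessage()"}}
# Deobfuscated: the rewritten code exposes the source feeding the same register.
deobfuscated_flow = dict(obfuscated_flow)
deobfuscated_flow["TelephonyManager.getDeviceId()"] = {"v1"}

leaks_before = find_leaks(obfuscated_flow, sources, sinks)
leaks_after = find_leaks(deobfuscated_flow, sources, sinks)
```

In this toy setting the obfuscated graph yields no leak, while the deobfuscated graph yields the path from the device ID source through `v1` to the SMS sink.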
According to the embodiment of the present invention, even source-to-sink data leaks that could not be detected because of obfuscation can be found.
FIG. 4 illustrates, by way of example, the class deobfuscation rewriting process in the deobfuscation method according to an embodiment of the present invention, and FIG. 5 illustrates, by way of example, the API hiding deobfuscation rewriting process in that method.
To evaluate the present invention, an application was implemented and an experiment was conducted in which three obfuscation techniques were applied using DexGuard and DexProtector.
In FIGS. 4 and 5, the smali code of the original application is labeled a, the smali code of the obfuscated application is labeled b, and the deobfuscated smali code is labeled c. FIG. 4 shows an application to which class obfuscation had been applied, making the information leak impossible to find; the deobfuscation process located the hidden class and rewrote it into the .dex file, and the information leak was detected.
FIG. 5 shows an application to which the Hide Access option had been applied, hiding the API and making the information leak impossible to find. The deobfuscation process identified that the hidden information is managed in a v0 table and rewrote it into the smali code, generating a deobfuscated .dex file in which the information leak was detected.
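The rewriting of a hidden API call can be sketched as a text transformation on smali lines. This is a hedged illustration only: it assumes the hidden call appears as a reflective `Method.invoke`, it does not model the v0 table mentioned above, and the function `rewrite_hidden_api`, the register numbers, and the `resolved` mapping (the n-th reflective call mapped to the target observed at run time) are all hypothetical.

```python
import re
from typing import Dict, List

# Matches a reflective java.lang.reflect.Method.invoke call in smali text.
REFLECTIVE_INVOKE = re.compile(r"Ljava/lang/reflect/Method;->invoke\(")


def rewrite_hidden_api(smali: List[str], resolved: Dict[int, str]) -> List[str]:
    """After the n-th reflective invoke, add the direct invoke that was
    observed for it at run time (if any), preceded by a smali comment."""
    out: List[str] = []
    reflective_seen = 0
    for line in smali:
        out.append(line)
        if REFLECTIVE_INVOKE.search(line):
            target = resolved.get(reflective_seen)
            if target is not None:
                out.append("    # deobfuscated: reflective call resolved at run time")
                out.append("    " + target)
            reflective_seen += 1
    return out


# Hypothetical obfuscated fragment: the API name is hidden behind reflection.
smali_lines = [
    '    const-string v1, "kq3Xz=="',
    "    invoke-virtual {v2, v3, v4}, Ljava/lang/reflect/Method;->invoke("
    "Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;",
]
# Hypothetical run-time observation for the first (index 0) reflective call.
resolved_calls = {
    0: "invoke-virtual {v3}, Landroid/telephony/TelephonyManager;->getDeviceId()Ljava/lang/String;",
}
rewritten = rewrite_hidden_api(smali_lines, resolved_calls)
```

A real rewrite would also have to keep register types and any following `move-result` instruction consistent so that the resulting DEX still verifies; the sketch ignores this.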
As described above, the deobfuscation method for data flow analysis of an obfuscated application according to an embodiment of the present invention uses the call instructions executed when the obfuscated APK runs to deobfuscate the obfuscation information and hidden code, enabling analysis of more sensitive data flows. It therefore does not stop at deobfuscation alone but also yields a deobfuscated call flow that can match the original call flow.
In addition, according to the embodiment of the present invention, call flows in an obfuscated APK that conventional deobfuscation techniques cannot handle are rewritten, so that a call flow equivalent to the original flow can be recovered. This provides an analysis technique that can also be used in static taint analysis.
In addition, according to the embodiment of the present invention, regardless of which obfuscation tool was used, the deobfuscation process is expected to help detect more information leaks than can be detected in the obfuscated state. It can also reduce the time and effort malware analysts spend analyzing the logic of malicious mobile applications, contributing to effective response and analysis.
The deobfuscation apparatus and method for data flow analysis of an obfuscated application according to an embodiment of the present invention may be implemented in the form of program instructions executable by various computer components and recorded on a computer-readable recording medium. The computer-readable recording medium may include program instructions, data files, data structures, and the like, alone or in combination.
The program instructions recorded on the computer-readable recording medium may be specially designed and configured for the present invention, or may be known to and usable by those skilled in the field of computer software.
Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory.
Examples of program instructions include not only machine code such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like. The hardware device may be configured to operate as one or more software modules in order to perform the processing according to the present invention.
The present invention has been described with reference to the embodiments shown in the drawings, but these are merely illustrative, and those of ordinary skill in the art will understand that various modifications and other equivalent embodiments are possible. The true technical scope of protection of the present invention should therefore be determined by the technical spirit of the claims below.
[Explanation of Reference Numerals]
100: deobfuscation apparatus
110: instruction extraction unit
120: deobfuscation unit
130: code insertion unit
140: call flow generation unit
150: DEX file generation unit
160: static taint analysis unit

Claims (10)

  1. A deobfuscation apparatus for data flow analysis of an obfuscated application, comprising:
    an instruction extraction unit that executes an obfuscated APK (Android application package) file and extracts the executed instructions;
    a deobfuscation unit that deobfuscates the obfuscated APK file using the extracted instructions;
    a code insertion unit that compares the code of the obfuscated APK file with the code of the deobfuscated APK file and, according to the comparison result, inserts the code of the deobfuscated APK file in the form of dummy code;
    a call flow generation unit that generates a new call flow by recombining the code of the deobfuscated APK file in the form of dummy code with the existing obfuscated call flow; and
    a DEX file generation unit that uses the generated new call flow to generate a DEX (Dalvik Executable) file to which a call flow that reconnects the call flow broken by obfuscation has been added.
  2. The deobfuscation apparatus of claim 1,
    further comprising a static taint analysis unit that performs static taint analysis using the generated DEX file.
  3. The deobfuscation apparatus of claim 1,
    wherein the code insertion unit
    compares the code of the obfuscated APK file with the code of the deobfuscated APK file to determine whether deobfuscated information exists, and
    if the determination finds that deobfuscated information exists, inserts the code of the deobfuscated APK file after the existing code in the form of dummy code.
  4. The deobfuscation apparatus of claim 1,
    wherein the call flow generation unit
    compares the code of the deobfuscated APK file in the form of dummy code with the code of the obfuscated APK file, and recombines the obfuscated call flow to match the existing call flow, thereby generating the new call flow.
  5. The deobfuscation apparatus of claim 1,
    wherein the DEX file generation unit
    generates a Classes.dex file by adding, to the code of the original APK file, the call flow that reconnects the call flow broken by obfuscation.
  6. A deobfuscation method using a deobfuscation apparatus for data flow analysis of an obfuscated application, the method comprising:
    executing an obfuscated APK (Android application package) file and extracting the executed instructions;
    deobfuscating the obfuscated APK file using the extracted instructions;
    comparing the code of the obfuscated APK file with the code of the deobfuscated APK file and, according to the comparison result, inserting the code of the deobfuscated APK file in the form of dummy code;
    generating a new call flow by recombining the code of the deobfuscated APK file in the form of dummy code with the existing obfuscated call flow; and
    using the generated new call flow to generate a DEX (Dalvik Executable) file to which a call flow that reconnects the call flow broken by obfuscation has been added.
  7. The deobfuscation method of claim 6,
    further comprising performing static taint analysis using the generated DEX file.
  8. The deobfuscation method of claim 6,
    wherein the inserting in the form of dummy code comprises
    comparing the code of the obfuscated APK file with the code of the deobfuscated APK file to determine whether deobfuscated information exists, and
    if the determination finds that deobfuscated information exists, inserting the code of the deobfuscated APK file after the existing code in the form of dummy code.
  9. The deobfuscation method of claim 6,
    wherein the generating of the new call flow comprises
    comparing the code of the deobfuscated APK file in the form of dummy code with the code of the obfuscated APK file, and recombining the obfuscated call flow to match the existing call flow, thereby generating the new call flow.
  10. The deobfuscation method of claim 6,
    wherein the generating of the DEX file comprises
    generating a Classes.dex file by adding, to the code of the original APK file, the call flow that reconnects the call flow broken by obfuscation.
PCT/KR2022/008971 2022-03-18 2022-06-23 Deobfuscation apparatus for data flow analysis of obfuscated application, and method therefor WO2023177020A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR10-2022-0034088 2022-03-18
KR1020220034088A KR102514888B1 (en) 2022-03-18 2022-03-18 Deobfuscation apparatus and method for data flow analysis of obfuscated applications

Publications (1)

Publication Number Publication Date
WO2023177020A1 true WO2023177020A1 (en) 2023-09-21

Family

ID=85799811

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2022/008971 WO2023177020A1 (en) 2022-03-18 2022-06-23 Deobfuscation apparatus for data flow analysis of obfuscated application, and method therefor

Country Status (2)

Country Link
KR (1) KR102514888B1 (en)
WO (1) WO2023177020A1 (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101861341B1 (en) * 2017-05-30 2018-05-28 올댓소프트 코. Deobfuscation apparatus of application code and method of deobfuscating application code using the same
KR101833220B1 (en) * 2017-07-25 2018-02-28 올댓소프트 코. Deobfuscation assessing apparatus of application code and method of assessing deobfuscation of application code using the same
KR20200131383A (en) * 2019-05-13 2020-11-24 고려대학교 산학협력단 Apparatus for deobfuscation and method for the same

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
LEE, DONG HO ET AL.: "Deobfuscation for Sensitive Data Flow Analysis of Obfuscated Applications", CISC-W`21, 27 November 2021 (2021-11-27) *
MOSES YONI, MORDEKHAY YANIV: "ANDROID APP DEOBFUSCATION USING STATIC-DYNAMIC COOPERATION", VIRUSBULLETIN 2018, 1 January 2018 (2018-01-01), XP093091434 *
STEVEN ARZT, SIEGFRIED RASTHOFER, CHRISTIAN FRITZ, ERIC BODDEN, ALEXANDRE BARTEL, JACQUES KLEIN, YVES LE TRAON, DAMIEN OCTEAU, PAT: "FlowDroid : precise context, flow, field, object-sensitive and lifecycle-aware taint analysis for Android apps", PROCEEDINGS OF THE 35TH ACM SIGPLAN CONFERENCE ON PROGRAMMING LANGUAGE DESIGN AND IMPLEMENTATION, PLDI '14, JUNE 09 - 11 2014, EDINBURGH, UNITED KINGDOM, ACM PRESS, NEW YORK, NEW YORK, USA, 1 January 2013 (2013-01-01) - 11 June 2014 (2014-06-11), New York, New York, USA , pages 259 - 269, XP055534670, ISBN: 978-1-4503-2784-8, DOI: 10.1145/2594291.2594299 *

Also Published As

Publication number Publication date
KR102514888B1 (en) 2023-03-27

Similar Documents

Publication Publication Date Title
Zhang et al. Metaaware: Identifying metamorphic malware
US7334263B2 (en) Detecting viruses using register state
KR100942795B1 (en) A method and a device for malware detection
WO2015023024A1 (en) Device for obfuscating application code and method for same
WO2018056601A1 (en) Device and method for blocking ransomware using contents file access control
WO2019160195A1 (en) Apparatus and method for detecting malicious threats contained in file, and recording medium therefor
WO2013168951A1 (en) Apparatus and method for checking malicious file
WO2018016671A2 (en) Dangerous code detection system for checking security vulnerability and method thereof
WO2019039730A1 (en) Device and method for preventing ransomware
WO2016024838A1 (en) Method and system for providing cloud-based application security service
WO2019066222A1 (en) Method and system for identifying open source software package on basis of binary file
CN109684829B (en) Service call monitoring method and system in virtualization environment
WO2022114392A1 (en) Feature selection-based mobile malicious code classification method, and recording medium and device for performing same
WO2023177020A1 (en) Deobfuscation apparatus for data flow analysis of obfuscated application, and method therefor
WO2014185627A1 (en) Data processing system security device and security method
WO2014077615A1 (en) Anti-malware system, method of processing packet in the same, and computing device
WO2022163908A1 (en) Method for assessing data leakage risk within application, and recording medium and device for performing same
JP7235126B2 (en) BACKDOOR INSPECTION DEVICE, BACKDOOR INSPECTION METHOD, AND PROGRAM
WO2019066099A1 (en) System for detecting abnormal behavior on basis of integrated analysis model, and method therefor
WO2018043885A1 (en) System for detecting malicious code and method for detecting malicious code
WO2016137035A1 (en) Test case generation device and method, and computer-readable recording medium for recording program for executing same
WO2019225849A1 (en) Security device and method for providing security service through control of file input/output and integrity of guest operating system
WO2022149729A1 (en) Executable file unpacking system and method for static analysis of malicious code
WO2023282442A1 (en) Design method for sharing profile in container environment, and recording medium and apparatus for performing same
WO2013159491A1 (en) Method for implementing software tool for use in usb flash disk privacy protection

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22932387

Country of ref document: EP

Kind code of ref document: A1