KR102105020B1

KR102105020B1 - Dynamic self mutation system using virtual machine based code transformation technology

Info

Publication number: KR102105020B1
Application number: KR1020190107035A
Authority: KR
Inventors: 김연우; 조성원; 이원일
Original assignee: (유)아홉
Priority date: 2019-08-30
Filing date: 2019-08-30
Publication date: 2020-04-27

Abstract

Provided is a dynamic self-mutation system using a VM-based code mutation technique. The dynamic self-mutation system comprises: an original code management unit configured to store the source code and executable file of an original program; and a mutation engine server including an obfuscation engine configured to recombine an executable file from the source code of the original program through obfuscation and compilation, and to recombine an executable file from the executable code of the original program through obfuscation with virtual code and encoding. The system can thereby selectively perform binary-level obfuscation and obfuscation of the compiled executable file.

Description

Dynamic self mutation system using virtual machine based code transformation technology

The present invention relates to a method of obfuscating a program. More specifically, it relates to a method of constructing a preventive code protection system in which multiple obfuscated replica executables are generated in advance and the running program is mutated dynamically according to a scheduler.

Unauthorized code analysis and modification based on reverse engineering is a major concern in the software industry. Such attacks can lead to many undesirable outcomes, including cheating in online games, unauthorized use of software, and illegal pay TV. The industry is looking for ways to prevent reverse engineering of software systems, and code obfuscation, which makes sensitive code hard to trace or analyze, is one potential answer.

Obfuscation transforms program code while preserving its execution behavior, and it can be applied at compile time, load time, or run time.

Such techniques include address re-ordering and stack padding, system-call re-ordering, instruction set randomization, and heap randomization.

Virtual machine (VM) based code virtualization is emerging as a promising way to implement code obfuscation (Fang et al., 2011, Oreans-Technology, 2015, 2016, VMProtect-Software, 2015, Wang et al., 2013, 2014, Yang and Huang, 2011).

The basic principle of VM-based protection is to replace program instructions with virtual instructions that are unfamiliar to the attacker. The substituted virtual instructions are translated into native machine code at run time so that they execute on the underlying hardware platform.

With this VM-based approach, the execution path of the obfuscated code can be controlled by a virtual instruction scheduler.

A typical scheduler has two parts: a dispatcher that determines which instruction is ready to execute, and a set of bytecode handlers that decode the bytecode and translate it into native machine code.
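
The dispatcher and handler split described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the three-opcode instruction set (`PUSH`, `ADD`, `HALT`), the `TinyVM` class, and the stack-based design are hypothetical and are not taken from the patent. For simplicity the handlers interpret each virtual instruction directly instead of emitting native machine code.

```python
from typing import Callable, Dict, List

# Hypothetical three-instruction virtual ISA, used only for illustration.
PUSH, ADD, HALT = 0x01, 0x02, 0xFF


class TinyVM:
    def __init__(self) -> None:
        self.stack: List[int] = []
        self.pc = 0
        self.running = True
        # Handler set: one handler per virtual opcode.
        self.handlers: Dict[int, Callable[[bytes], None]] = {
            PUSH: self._push,
            ADD: self._add,
            HALT: self._halt,
        }

    def _push(self, code: bytes) -> None:
        # The operand byte follows the opcode.
        self.stack.append(code[self.pc])
        self.pc += 1

    def _add(self, code: bytes) -> None:
        b, a = self.stack.pop(), self.stack.pop()
        self.stack.append(a + b)

    def _halt(self, code: bytes) -> None:
        self.running = False

    def run(self, code: bytes) -> List[int]:
        # Dispatcher: fetch the next opcode and route it to its handler.
        while self.running and self.pc < len(code):
            opcode = code[self.pc]
            self.pc += 1
            self.handlers[opcode](code)
        return self.stack


program = bytes([PUSH, 2, PUSH, 3, ADD, HALT])
result = TinyVM().run(program)  # leaves the sum 2 + 3 on the stack
```

An attacker who sees only `program` must first recover the meaning of each opcode from the handler table, which is the property VM-based protection relies on.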

Through this process, the original program instructions are replaced with custom bytecode, which conceals the purpose or logic of sensitive or critical code regions.

Conventional VM-based code protection techniques use a single virtual instruction scheduler and focus mainly on making a single bytecode set more complex. This rests on the assumption that the scheduler and the bytecode instruction set are hard to analyze in most real runtime environments. However, it has been shown that this assumption can be broken by what is commonly called a cumulative attack. To protect software against cumulative attacks, it is important to have a certain degree of non-determinism and diversity during program execution.

Accordingly, to solve the problem described above, an object of the present invention is to provide a dynamic self-mutation method that adds temporal diversity using a virtual machine based code mutation technique, and a system based on that method.

A further object of the present invention is to achieve code independence among replica programs so that, even if an attacker succeeds in analyzing some replicas, the information gained cannot be reused, which makes it difficult to attack the other replicas in service.

To solve the above problems, a dynamic self-mutation system using a VM-based code mutation technique according to the present invention is provided. The dynamic self-mutation system includes: an original code management unit configured to store the source code and executable file of an original program; and a mutation engine server including an obfuscation engine configured to recombine an executable file from the source code of the original program through obfuscation and compilation, and to recombine an executable file from the executable code of the original program through obfuscation with virtual code and encoding. The system can thereby selectively perform binary-level obfuscation and obfuscation of the compiled executable file. The dynamic self-mutation system may further include an agent configured to receive the recombined executable file and apply it.

In one embodiment, the obfuscation engine may be configured to download the source code of the original program, obfuscate the source code according to obfuscation options, compile the obfuscated code, and recombine an executable file using the compiled code.

In one embodiment, the obfuscation engine may be configured to download the executable file of the original program, disassemble the original program to extract machine code, generate virtual code corresponding to the machine code, encode the virtual code into bytecode, and combine the bytecode with the original program to recombine an executable file.

In one embodiment, the obfuscation engine may be configured to generate an obfuscation scheme based on a random seed and a timestamp, transform predefined handlers with that scheme to create a new virtual machine (VM), encode the code into the bytecode of the new VM, and combine it with the original program to recombine an executable file.
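
One way to picture a seed-and-timestamp-based scheme is as a per-variant opcode numbering. The sketch below is an assumption, not the patented method: `BASE_OPCODES`, `derive_opcode_map`, and `encode` are hypothetical names, and the scheme only permutes opcode values, whereas the embodiment also transforms the handlers themselves. Python's `random` module is used for readability and is not a cryptographically secure generator.

```python
import hashlib
import random
from typing import Dict, List

# Hypothetical mnemonics of the predefined handler set.
BASE_OPCODES: List[str] = ["PUSH", "ADD", "SUB", "JMP", "HALT"]


def derive_opcode_map(seed: int, timestamp: int) -> Dict[str, int]:
    """Derive a variant-specific opcode numbering from a seed and a timestamp."""
    digest = hashlib.sha256(f"{seed}:{timestamp}".encode()).digest()
    rng = random.Random(int.from_bytes(digest, "big"))
    # Pick distinct byte values, one per mnemonic.
    codes = rng.sample(range(256), len(BASE_OPCODES))
    return dict(zip(BASE_OPCODES, codes))


def encode(program: List[str], opcode_map: Dict[str, int]) -> bytes:
    """Encode a list of mnemonics into the bytecode of one VM variant."""
    return bytes(opcode_map[mnemonic] for mnemonic in program)


variant_map = derive_opcode_map(seed=1234, timestamp=1567123200)
bytecode = encode(["PUSH", "PUSH", "ADD", "HALT"], variant_map)
```

Because the mapping is a deterministic function of the seed and timestamp, the engine can regenerate the matching handler table for a variant, while an opcode table recovered from one variant is not expected to carry over to a variant built from different inputs.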

In one embodiment, the mutation engine server further includes a mutation code management unit configured to store and manage, organized by time, the replica programs that the obfuscation produces with identical execution logic and functionality. The mutation code management unit may communicate with the agent to transmit the replica programs.

In one embodiment, the system may further include a target server configured to include the agent, and the target server may be configured to deliver a plurality of mutated programs corresponding to the replica programs to a client through a proxy.

In one embodiment, the target server may be configured to perform a state management mechanism that manages service-failure, tampered, and attacked states; a replacement mechanism for swapping replica programs; and a health check mechanism by which the server program and the client verify whether security and service remain normal with respect to the attack target. Alternatively, the client may be configured to perform these same three mechanisms.

In one embodiment, under the state management mechanism the target server may: determine a service-failure state when it judges that a service interruption has occurred at the application and network layers; download a new replica program from the obfuscation engine and restart the server program when the server program has been forged or tampered with; and, on recognizing that the server program is at risk of attack, transition to an attack-risk state and launch a prevention program for the server program. Alternatively, the client may perform these same determinations and actions.
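
The state management mechanism can be read as a small decision function. The following is a minimal sketch under stated assumptions: the `State` names, the action strings, the three boolean inputs, and above all the order in which the conditions are checked are hypothetical; the patent does not specify a priority among the states.

```python
from enum import Enum, auto
from typing import Tuple


class State(Enum):
    NORMAL = auto()
    SERVICE_FAILURE = auto()
    TAMPERED = auto()
    ATTACK_RISK = auto()


def evaluate(service_up: bool, integrity_ok: bool,
             attack_suspected: bool) -> Tuple[State, str]:
    """Map health observations to a state and a follow-up action.

    The check order below is an assumption made for this sketch.
    """
    if not service_up:
        # Service interruption at the application or network layer.
        return State.SERVICE_FAILURE, "report_failure"
    if not integrity_ok:
        # Forged or tampered server program: fetch a fresh replica.
        return State.TAMPERED, "download_new_replica_and_restart"
    if attack_suspected:
        return State.ATTACK_RISK, "run_prevention_program"
    return State.NORMAL, "none"


state, action = evaluate(service_up=True, integrity_ok=False,
                         attack_suspected=False)
```

The same function could run on either the target server or the client, matching the alternative described in the embodiment.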

According to at least one embodiment of the present invention, binary-level obfuscation and obfuscation of the compiled executable file can be performed selectively.

According to at least one embodiment of the present invention, malicious program analysis attacks can be countered by selectively performing binary-level obfuscation and executable-file obfuscation and by adding a dynamic mutation element over time.

According to at least one embodiment of the present invention, the difficulty of a malicious program analysis attack can be multiplied, which greatly increases the analysis time or induces the attacker to abandon the attack.

FIG. 1 is a conceptual diagram of source code obfuscation and binary-level obfuscation according to the present invention.
FIG. 2 shows the detailed configuration of a dynamic self-mutation system that performs a dynamic self-mutation method using a virtual machine based code mutation technique according to the present invention.
FIG. 3 shows the structure of the PE format according to the present invention.
FIG. 4 shows the obfuscation and executable file recombination method performed by the obfuscation engine at the source code level according to the present invention.
FIG. 5 shows the obfuscation, encoding, and executable file recombination method performed by the obfuscation engine at the binary level according to the present invention.
FIG. 6 is a flowchart of a VM-based obfuscation method in connection with the present invention.
FIG. 7 shows a configuration in which the VM-based code obfuscation method according to the present invention is performed through the obfuscation engine.
FIG. 8 shows the configuration and a conceptual diagram of an agent and a client that perform various mechanisms according to the present invention.

The features and effects of the present invention described above will become clearer through the following detailed description taken with the accompanying drawings, so that a person of ordinary skill in the art to which the invention pertains can readily practice its technical idea.

The present invention can be modified in various ways and can have various embodiments; specific embodiments are illustrated in the drawings and described in detail below. This is not intended to limit the invention to particular embodiments, and the invention should be understood to include all modifications, equivalents, and substitutes that fall within its spirit and technical scope.

In describing the drawings, similar reference numerals are used for similar components.

Terms such as first and second may be used to describe various components, but the components should not be limited by these terms. The terms are used only to distinguish one component from another.

For example, without departing from the scope of the present invention, a first component may be named a second component, and similarly a second component may be named a first component. The term "and/or" includes any combination of a plurality of related listed items, or any one of them.

Unless defined otherwise, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention pertains.

Terms such as those defined in commonly used dictionaries should be interpreted as having meanings consistent with their meanings in the context of the related art, and should not be interpreted in an idealized or excessively formal sense unless expressly so defined in this application.

The suffixes "module", "block", and "unit" attached to components in the following description are assigned or used interchangeably only for ease of drafting the specification, and do not in themselves carry distinct meanings or roles.

Preferred embodiments of the present invention are described below with reference to the accompanying drawings so that a person of ordinary skill in the art can readily practice them. In describing the embodiments, detailed descriptions of related well-known functions or configurations are omitted where they could unnecessarily obscure the gist of the invention.

The following describes a dynamic self-mutation method using a virtual machine based code mutation technique according to the present invention, and a system that performs it.

The present invention is a program obfuscation method, and concerns a method of building a scheduler-driven preventive code protection system by generating multiple obfuscated replica executables.

According to the present invention, code independence is achieved among replica programs so that, even if an attacker succeeds in analyzing some replicas, the information gained cannot be reused, which makes it difficult to attack the other replicas in service.

Specifically, the present invention relates to a system for protecting a target program (a server application) from static analysis.

Static analysis of an exposed program gives an attacker the knowledge and the practical means to attack it, for example by forging or tampering with the program or by exploiting and modifying vulnerable logic. For this reason, static analysis is the first-stage goal in producing much of today's malware and many virus programs.

The object of static analysis of a target program is the source code recovered through reverse engineering; depending on the quality of the reverse engineering, the recovered code can more closely resemble the original source. The closer the resemblance, the easier the analysis, and so the easier the attack.

Obfuscation is the technique for preventing this or making it harder, and it is divided into binary-level and source-code-level obfuscation according to its target. At the source code level, obfuscation is applied to the source code written by the developer, which is then compiled into an executable file and distributed. At the binary level, by contrast, obfuscation is applied at the stage where the executable file has already been generated.

Source-code-level obfuscation has the drawbacks that it requires developer involvement and that applying it is cumbersome, but whether it was applied can be confirmed reliably during compilation, and its success or failure can be verified.

Binary-level obfuscation has the advantage that it needs no developer involvement and is very simple to apply because it operates on the compiled executable file. However, because it must be applied at a low level, the risk of errors can be higher, and it depends on the compiler and the development environment. In this regard, FIG. 1 is a conceptual diagram of source code obfuscation and binary-level obfuscation according to the present invention.

Referring to FIG. 1, source-code-level obfuscation may be performed on a plurality of source codes (SRC_1 to SRC_N). Executable code may then be generated by compiling the obfuscated source code. In addition, mutated executable code may be generated by applying binary-level obfuscation to the executable code.

The present invention selectively applies methods applicable to both levels and adds a dynamic mutation element over time, thereby multiplying the difficulty of malicious program analysis attacks so that the analysis time increases greatly or the attacker is induced to give up.

FIG. 2 shows the detailed configuration of a dynamic self-mutation system that performs the dynamic self-mutation method using a virtual machine based code mutation technique according to the present invention.

In this regard, the system can be divided into the obfuscation engine 120, which applies the obfuscation technique, and the agent 1000a, which recombines the obfuscated code into an executable file and applies it.

Referring to FIG. 2, the dynamic self-mutation system may include a mutation engine server 100 and a target server 1000. The mutation engine server 100 may be configured to include an original code management unit 110, an obfuscation engine 120, a mutation code management unit 130, and a control and communication unit 140. The agent 1000a in the target server 1000 may be configured to include a downloader 1010, a program control unit 1100, and one or more schedulers 1200. The client 200 can interwork with the mutation engine server 100 through the target server 1000. Specifically, the client 200 may receive a plurality of mutated programs from the program control unit 1100 through the proxy 300. These components are described in detail below.

In the structural diagram of FIG. 2, "original code" and "mutated code" each refer collectively to source code and executable code.

Source code is the language-level code that a developer (programmer) uses during development, and executable code is the result of compiling the source code, which can be interpreted at the machine level.

Executable code (binary code) can be divided into assembly code interpreted at the machine level (e.g., an x86 CPU) and bytecode interpreted at the virtual machine level (e.g., the JVM). In the case of bytecode, the virtual machine converts the bytecode into assembly code (machine language) when the program runs, and the machine then executes it.

Mutated code is the result of transforming the source code or executable code through obfuscation while preserving its semantics at execution. For example, in a Windows environment on an x86 machine, it takes the PE (Portable Executable) format.

FIG. 3 shows the structure of the PE format according to the present invention. PE is the file format executable on Windows; representative file types include exe, dll, obj, and sys.
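
As a small illustration of the PE layout, the sketch below checks whether a byte string looks like a PE image. It relies on two well-known facts about the format: the file starts with the DOS signature `MZ`, and the 4-byte little-endian field at offset 0x3C (`e_lfanew`) holds the offset of the `PE\0\0` signature. The function name `is_pe` and the synthetic `stub` buffer are hypothetical, and the check stops at the signature; it does not parse the COFF header or the section table.

```python
import struct


def is_pe(data: bytes) -> bool:
    """Return True if data begins with a DOS header that points at a PE signature."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    # e_lfanew: little-endian 32-bit offset of the PE signature.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    return data[e_lfanew:e_lfanew + 4] == b"PE\x00\x00"


# Synthetic minimal buffer for demonstration; not a runnable executable.
stub = bytearray(0x44)
stub[0:2] = b"MZ"
struct.pack_into("<I", stub, 0x3C, 0x40)
stub[0x40:0x44] = b"PE\x00\x00"

looks_like_pe = is_pe(bytes(stub))
```

A recombination step that appends bytecode to an executable would typically run a check of this kind before and after modifying the file.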

Referring to FIGS. 1 to 3, the obfuscation engine 120 and/or the obfuscation agent 1000a allow obfuscation to be applied selectively at the source level (source code) or the binary level (executable code).

FIG. 4 shows the obfuscation and executable file recombination method performed by the obfuscation engine at the source code level according to the present invention. Referring to FIGS. 1 to 4, the obfuscation engine 120 recombines an executable file from the source code of the original program through obfuscation and compilation. The obfuscation engine 120 may also be configured to recombine an executable file from the executable code of the original program through obfuscation with virtual code and encoding. The agent 1000a is configured to receive the recombined executable file and apply it. Specifically, the agent 1000a may apply the recombined executable file to the client 200 in the form of a plurality of mutated programs.

Because obfuscation at the source level may not be fully automatable, administrator intervention is needed at regular intervals. Referring to FIGS. 1 to 4, the source-code-level obfuscation and executable file recombination method according to the present invention has the following characteristics. Each of the following steps may be performed by the obfuscation engine 120.

1) The program's source code is kept in a repository 10, and the latest version is downloaded. The repository 10 is external storage for the source code, but the original code management unit 110 may also serve as this storage. Alternatively, the original code management unit 110 may be configured to download the original program from the repository 10 and store its source code and executable file. In this step, an SCM (source code version control system) tool such as SVN is used to download the remote source code of the latest version or of a specific version.

2) After the source code is downloaded, options are set through the obfuscation option manager and obfuscation is performed. At this point, the list of source code files to which obfuscation applies can be selected. That is, obfuscation is applied to selected critical parts of the code rather than to the entire program source. Applying it to everything can degrade program execution performance (memory use may grow or execution may slow down).

3) A detailed description of how the obfuscation techniques are applied is omitted.

4) The obfuscated source code is built into a program by the compiler. If an error occurs at this stage, administrator intervention is required: the compiler options must be adjusted, or a developer must diagnose and resolve the build error.

5) After the build completes, the time, version, obfuscation options, external interfaces, and so on are recorded and managed by the mutation code management unit 130.
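
The bookkeeping in step 5 can be sketched as a time-ordered store of build records. This is a minimal sketch, not the patent's data model: `VariantRecord`, `VariantStore`, the field names, and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class VariantRecord:
    built_at: int                        # build time, e.g. Unix seconds
    source_version: str                  # e.g. an SCM revision identifier
    obfuscation_options: Dict[str, str]  # options used for this build
    external_interface: str              # e.g. exposed entry points or ports
    artifact: str                        # path of the exe or dll produced


class VariantStore:
    """Keeps mutated builds ordered by build time."""

    def __init__(self) -> None:
        self._records: List[VariantRecord] = []

    def add(self, record: VariantRecord) -> None:
        self._records.append(record)
        self._records.sort(key=lambda r: r.built_at)

    def latest_at(self, when: int) -> Optional[VariantRecord]:
        """Return the newest build made at or before the given time."""
        candidates = [r for r in self._records if r.built_at <= when]
        return candidates[-1] if candidates else None


store = VariantStore()
store.add(VariantRecord(200, "r42", {"level": "high"}, "tcp/8080", "svc_200.exe"))
store.add(VariantRecord(100, "r41", {"level": "low"}, "tcp/8080", "svc_100.exe"))
```

Keeping the options and interface alongside each artifact lets an administrator reproduce a build or confirm that two variants expose the same external interface.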

Here, "mutated code" refers collectively to the output produced after obfuscation is applied; in this case it is an executable (exe) or library (dll) file. It will usually be an executable file, in PE format and ready to run in an x86 environment.

However, when excessive traffic would result or the executable file is very large, mutation is limited to certain modules and the mutated libraries are managed instead. The library is transmitted at the request of the agent 1000a of the target server 1000, and mutation takes place by replacing the library on the program's execution path.
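
Replacing a library on the execution path can be sketched as a write-then-rename step. The sketch is an assumption about how an agent might do this: `swap_library` is a hypothetical helper, the new library bytes are assumed to have been received already, and the target program is assumed to be stopped or not holding the file open (on Windows a loaded DLL generally cannot be replaced in place).

```python
import os
import tempfile


def swap_library(target_path: str, new_bytes: bytes) -> None:
    """Replace the file at target_path with new_bytes via a temporary file."""
    directory = os.path.dirname(os.path.abspath(target_path))
    # Write next to the target so the final rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(new_bytes)
        os.replace(tmp_path, target_path)
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Writing to a temporary file first means the program never sees a half-written library: it finds either the old file or the complete new one.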

Accordingly, the obfuscation engine 120 downloads the source code of the original program and obfuscates it according to the obfuscation options. The obfuscation engine 120 may further be configured to compile the obfuscated code and recombine the executable file from the compiled code.

Meanwhile, obfuscation at the binary level can be automated. FIG. 5 shows the obfuscation, encoding, and executable file recombination method performed by the obfuscation engine at the binary level according to the present invention. Referring to FIGS. 1 to 3 and FIG. 5, the obfuscation engine 120 downloads the executable file of the original program and disassembles it to extract the machine code. The obfuscation engine 120 may further be configured to generate virtual code corresponding to the machine code, encode the virtual code as bytecode, and combine it with the original program to recombine the executable file.

Binary-level obfuscation may be performed as follows. Languages such as C/C++ compile to something close to machine code, whereas languages such as Java compile to bytecode, which the virtual machine (JVM) translates into final machine code at execution. Java therefore allows obfuscation to be applied more directly and easily than C/C++: obfuscation techniques (symbol substitution, dummy code insertion, and so on) are applied to the bytecode itself.

In C/C++, however, directly modifying the compiled machine code (assembly code) breaks equivalence with the original and leaves the program unable to run, so obfuscation is usually applied at the source code level. An alternative is to build a virtual machine (VM) directly, similar to the Java case. That is, virtual instructions corresponding to the assembly instructions are generated and mapped to virtual machine handlers that guarantee identical execution, so that the assembly code is transformed into virtual code while the same behavior is preserved.

Specifically, the code region to be protected is replaced with virtual code, and that region is inserted into the original program. At run time, when execution enters the region, the original program's registers, flags, and other state are saved in the virtual machine's context, and the bytecode in the virtual code region is converted to machine code and executed on the machine. When execution completes, control returns to the original program region with the saved context, and the subsequent code continues to run.

For a program written in C/C++, binary-level obfuscation is achieved by applying VM-based obfuscation to the compiled original program without modifying its source code. In this case there is no need to download the original source code separately or to recompile it.

This process proceeds in the following order.

1) Download the original program.

2) Disassemble the original program to extract the machine code.

3) Generate virtual code corresponding to the machine code.

4) Encode the virtual code as bytecode and combine it with the original program.

The core of VM-based obfuscation is to hide, or make hard to determine, the relationship between the native instruction set and the virtual instruction set. That is, the goal is to make it difficult to infer the original assembly code (opcodes) from the virtual instructions.

One assumption here is that the mapping between native instructions and virtual instructions is difficult to infer in real time during execution. This question has been studied extensively, however, and published research shows that the mapping can be inferred to some extent by obtaining sample executables protected with the same obfuscation technique, reverse engineering them repeatedly, and examining pattern similarity.

Accordingly, to increase the complexity of the VM implementation and the obfuscation technique, this embodiment introduces randomness in the bytecode, the handlers, and the virtual machine, as well as randomness over time.

In connection with the present invention, VM-based obfuscation is structured as follows. FIG. 6 shows a flowchart of the VM-based obfuscation method.

Referring to FIG. 6, the VM-based obfuscation method according to the present invention may perform 1) a critical code extraction step, 2) a code virtualization step, 3) a bytecode generation step, and 4) a file reconstruction step. The steps operate as follows.

1) The critical code segments to be protected are extracted from the compiled binary and disassembled into assembly code.

2) The native assembly code is converted into virtual instructions, that is, a machine-independent intermediate representation used by the VM. The converted virtual instructions must be functionally identical to the original native code.

3) The generated virtual instructions are encoded in bytecode format.

4) A new VM section is linked (or inserted) into the target program. At run time, the entry point of the protected code region calls the VM, which converts the bytecode instructions into native machine code to execute the program.

VM-based code obfuscation forces an attacker to interpret an unfamiliar virtual instruction set instead of a familiar instruction set. This makes reverse engineering harder and sharply increases the time and resources an attack requires, thereby neutralizing it.

The VM section consists of the following components: VMContext, VMInit, Dispatcher, Handlers, Bytecode, and VMExit.

The context of the target program, including its local variables, function arguments, return address, and similar information, is stored in a register-based VM memory space called VMContext.

On entry to the VM, the VMInit component saves the target program's context and initializes VMContext.

After the protected code segment containing the critical logic has executed, VMExit restores the target program's context and returns control to the original program so that native machine code execution continues.
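The following is a minimal sketch of how these components could fit together, not the patented implementation. It assumes a toy stack-based instruction set in place of real x86 code: the names `NATIVE`, `OPCODES`, `encode`, `vm_init`, `dispatch`, `vm_exit`, and `run_protected` are all hypothetical, and the handlers interpret the bytecode directly rather than emitting native machine code.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for disassembled machine code: (mnemonic, operand).
NATIVE = [("push", 2), ("push", 3), ("add", None), ("push", 4), ("mul", None)]

# Hypothetical virtual instruction set: mnemonic -> one-byte virtual opcode.
OPCODES = {"push": 0x01, "add": 0x02, "mul": 0x03}


def encode(native, opcodes):
    """Map each native instruction to a virtual opcode and emit bytecode."""
    out = bytearray()
    for mnemonic, operand in native:
        out.append(opcodes[mnemonic])
        if operand is not None:
            out.append(operand)  # one-byte immediate, enough for this toy
    return bytes(out)


@dataclass
class VMContext:
    saved: dict                                 # caller state kept for VMExit
    stack: list = field(default_factory=list)   # VM working storage
    pc: int = 0                                 # index into the bytecode


def vm_init(caller_state):
    """VMInit: save the caller's context and start from a fresh VM state."""
    return VMContext(saved=dict(caller_state))


def h_push(ctx, code):
    ctx.stack.append(code[ctx.pc])  # operand follows the opcode
    ctx.pc += 1


def h_add(ctx, code):
    b, a = ctx.stack.pop(), ctx.stack.pop()
    ctx.stack.append(a + b)


def h_mul(ctx, code):
    b, a = ctx.stack.pop(), ctx.stack.pop()
    ctx.stack.append(a * b)


def make_handlers(opcodes):
    """Handlers: table from virtual opcode to the routine that executes it."""
    by_name = {"push": h_push, "add": h_add, "mul": h_mul}
    return {opcodes[name]: fn for name, fn in by_name.items()}


def dispatch(ctx, code, handlers):
    """Dispatcher: fetch each opcode and call the matching handler."""
    while ctx.pc < len(code):
        op = code[ctx.pc]
        ctx.pc += 1
        handlers[op](ctx, code)


def vm_exit(ctx):
    """VMExit: hand back the saved caller context and the computed result."""
    return ctx.saved, ctx.stack[-1]


def run_protected(code, caller_state, opcodes=OPCODES):
    ctx = vm_init(caller_state)
    dispatch(ctx, code, make_handlers(opcodes))
    return vm_exit(ctx)
```

Calling `run_protected(encode(NATIVE, OPCODES), {"eax": 7})` evaluates (2 + 3) * 4 inside the toy VM and returns the caller state unchanged alongside the result.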

<실시예><Example>

The existing VM-based code obfuscation technique described above assumes a single VM and a single handler set, so in structures where similar patterns repeat, the relationship to the original assembly instructions is easy to infer. The embodiment according to the present invention is therefore based on multiple virtual machines and randomly generates multiple handlers to make analysis difficult. FIG. 7 shows a configuration in which the VM-based code obfuscation method according to the present invention is performed by the obfuscation engine. The new VM-based bytecode mutation step may be performed by a separate entity or by the obfuscation engine 120.

Referring to FIG. 7, the obfuscation engine 120 may generate an obfuscation technique based on a random seed and a timestamp, and transform predefined handlers with that technique to generate a new virtual machine (VM). The obfuscation engine 120 may further be configured to encode the code into the bytecode of the new virtual machine and combine it with the original program to recombine the executable file.

Meanwhile, a program protected by virtualization obfuscation (that is, VM-based code obfuscation) jumps (JUMP) to the VM section, where the dispatcher reads and interprets the bytecode, looks up the handler corresponding to each symbol, converts the instruction to a machine instruction, and executes it.

One option here is to generate multiple virtual machines (VMs) in advance. That is, obfuscation techniques are used to create multiple copies of the handlers under the same structure. Multiple virtual instructions mapped to these replica handlers are generated at random, so that for a predefined set of virtual instructions a virtual machine is generated for each of several similar replicas.

In other words, obfuscation techniques are applied to generate multiple virtual machines that perform the same function but use different bytecode formats.

The scenario applied in the embodiment is as follows.

1) Extract the critical code.

2) Disassemble it (reverse engineering) into machine language (assembly) code.

3) Transform the predefined handlers (with an obfuscation technique) to generate a new virtual machine.

4) Encode the code into the bytecode of the new virtual machine.

5) Combine the bytecode with the original program.
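The scenario above can be illustrated with a small sketch, under heavy assumptions: a toy stack machine replaces real disassembly, and the only "handler transformation" shown is a reshuffled opcode table derived from a seed and a timestamp. The names `MNEMONICS`, `NATIVE`, `new_vm`, `encode`, and `run` are hypothetical and do not come from the patent.

```python
import random

MNEMONICS = ["push", "add", "mul"]

# Hypothetical stand-in for the disassembled critical code.
NATIVE = [("push", 2), ("push", 3), ("add", None), ("push", 4), ("mul", None)]


def new_vm(seed, timestamp):
    """Derive a per-variant opcode table from a seed and a timestamp."""
    rng = random.Random(f"{seed}:{timestamp}")
    codes = rng.sample(range(256), len(MNEMONICS))  # distinct opcode bytes
    return dict(zip(MNEMONICS, codes))


def encode(native, opcodes):
    """Encode the program in the bytecode format of one VM variant."""
    out = bytearray()
    for mnemonic, operand in native:
        out.append(opcodes[mnemonic])
        if operand is not None:
            out.append(operand)
    return bytes(out)


def run(code, opcodes):
    """Interpret bytecode with the variant's own opcode table."""
    names = {value: name for name, value in opcodes.items()}
    stack, pc = [], 0
    while pc < len(code):
        name = names[code[pc]]
        pc += 1
        if name == "push":
            stack.append(code[pc])
            pc += 1
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if name == "add" else a * b)
    return stack[-1]


# Three variants built at different (made-up) timestamps.
variants = [new_vm(seed=1234, timestamp=t) for t in (1000, 2000, 3000)]
images = [encode(NATIVE, vm) for vm in variants]
results = [run(image, vm) for image, vm in zip(images, variants)]
```

Every variant computes the same value, while each bytecode image can only be decoded with the opcode table of the VM that produced it. A production system would presumably also vary the handler bodies themselves, which this sketch does not attempt.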

The mutation code management unit (130) stores and manages, organized by time, the replica programs that obfuscation produces with identical execution logic and functionality.

1) It communicates with the remote agent (1000a) to transmit replica programs.

2) The key criterion when selecting a replica program is identity. Whenever the original program changes, its version must be updated; programs replicated from it inherit that version, and identity is determined from it. That is, multiple replica programs are treated as the same program at execution time even if their forms differ.

3) If the versions differ, that is, if the developer has changed the source code of the original program, programs replicated from it are treated as distinct.

The agent (1000a) downloads the required number of replica programs by communicating with the mutation code management unit (130). That is, the mutation code management unit 130 may communicate with the agent 1000a to transmit at least one replica program. Accordingly, the target server 1000 may be configured to deliver a plurality of mutated programs corresponding to the replica programs to the client 200 through the proxy 300.

1) The agent (1000a) performs network communication with the mutation code management unit (130), such as commands, message transmission, and file downloads, over a communication protocol.

2) It manages the service of the target program; that is, it controls starting and stopping the program.

3) It guarantees uninterrupted execution of the target program.

The programs of primary interest here are server-side services. Numerous commercial products already exist for client programs, and because the security of server programs is becoming increasingly important, the scope may be limited to server-side service programs.

There are several ways to guarantee uninterrupted execution. This embodiment borrows the approach of building a fault-tolerant system on a distributed cluster architecture. That is, multiple replica programs run simultaneously, and a proxy receives each client request on their behalf, matches the responses from the set of candidate responses, and delivers the result.
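As an illustration of the proxy step, the sketch below fans a request out to several replicas and returns the answer that a majority agrees on. Majority voting is an assumption made here; the text only says that responses are matched. The function `proxy_request` and the replica callables are hypothetical.

```python
from collections import Counter


def proxy_request(request, replicas, quorum=None):
    """Send one request to every replica and return the majority answer.

    `replicas` is a list of callables, each standing in for one running
    replica program. Responses must be hashable (for example, strings).
    """
    responses = []
    for replica in replicas:
        try:
            responses.append(replica(request))
        except Exception:
            continue  # a crashed replica contributes no candidate response
    if not responses:
        raise RuntimeError("no replica answered")
    if quorum is None:
        quorum = len(replicas) // 2 + 1  # strict majority of all replicas
    answer, votes = Counter(responses).most_common(1)[0]
    if votes < quorum:
        raise RuntimeError("no majority among replica responses")
    return answer


# Hypothetical replicas: two healthy, one tampered, one crashed.
def healthy(request):
    return request.upper()


def tampered(request):
    return "pwned"


def crashed(request):
    raise ConnectionError("replica is down")
```

With these stand-ins, `proxy_request("ping", [healthy, healthy, tampered])` still yields the healthy answer, and a crashed replica is tolerated as long as a majority of the configured replicas agree.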

Accordingly, the target server 1000 may be configured to perform a state management mechanism that manages service-failure, tampered, or attacked states; a replacement mechanism for replacing replica programs; and a normality check mechanism that verifies whether the server program and the client, as targets of attack, are normal in terms of security and service.

FIG. 8 shows the configuration and a conceptual diagram of the agent and client that perform the various mechanisms according to the present invention. Referring to FIG. 8, the client 200 may perform various mechanisms through the agent 1000a of the target server. The agent 1000a may perform the state management mechanism, the replacement mechanism, and the normality check mechanism. Alternatively, the client 200 may perform these mechanisms in cooperation with the agent 1000a.

The mechanisms are described in more detail below.

1) State management mechanism

The key is to manage states in which the service has failed (crashed) or has been tampered with or attacked. That is, the state of the program is continuously managed so that it stays clean and intact.

a. Normal state

- Both the server and the client are operating normally.

b. Service-failure state

- A service interruption has occurred at the application layer, network layer, or a similar layer.

That is, if the network is temporarily disconnected or interrupted, the state is judged to be a service failure, and the response is limited to raising a warning alarm to the administrator.

If the client app has been tampered with, abnormal service requests may arrive. In this case, communication received from the specific IP address or identifier is blocked.

This state also covers the case where the server app (program) has gone down, whether because of an attack or because of a hardware or software fault. In this case, the administrator identifies the cause and restarts the program to restore the normal state.

c. Attacked state (tampered)

If the server app (program) has been forged or tampered with, the agent downloads a new replica program from the executable file manager of the obfuscation engine and restarts the server app. Because two or more servers are kept running and only some of them are restarted (rebooted), the service is not interrupted.

d. Attack-risk state (preventive)

A further state can be added by integrating with a solution that detects attacks on the server app. Detection solutions for various server attacks, such as zero-day attacks and APT attacks, are increasingly being deployed. By linking to the detection events of such a solution, the system recognizes the risk of attack, transitions to the attack-risk state according to defined criteria, and takes preventive measures. The preventive measures follow a procedure similar to that of the attacked state.

Accordingly, in the state management mechanism, the target server 1000 may determine a service-failure state when it judges that a service interruption has occurred at the application and network layers. If the server program has been forged or tampered with, the target server 1000 may download a new replica program from the obfuscation engine and restart the server program. The target server 1000 may also recognize an attack risk to the server program, transition to the attack-risk state, and run a preventive program for the server program.
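The four states and their responses can be summarized as a small state classifier. This is a sketch under stated assumptions: the three boolean inputs, the priority order among them, and the action labels are all hypothetical simplifications of the description above.

```python
from enum import Enum, auto


class State(Enum):
    NORMAL = auto()
    SERVICE_FAILURE = auto()
    ATTACKED = auto()
    ATTACK_RISK = auto()


def classify(service_up, binary_intact, detector_alert):
    """Pick a state from three observations (priority order is assumed)."""
    if not binary_intact:
        return State.ATTACKED          # server program forged or tampered
    if not service_up:
        return State.SERVICE_FAILURE   # application or network interruption
    if detector_alert:
        return State.ATTACK_RISK       # external detection solution fired
    return State.NORMAL


# Response per state, following the description: failures alert the
# administrator, while attacked and at-risk replicas are replaced.
ACTIONS = {
    State.NORMAL: "none",
    State.SERVICE_FAILURE: "alert_admin",
    State.ATTACKED: "replace_replica",
    State.ATTACK_RISK: "replace_replica",
}


def next_action(service_up, binary_intact, detector_alert):
    return ACTIONS[classify(service_up, binary_intact, detector_alert)]
```

For example, `next_action(True, True, True)` selects replacement as a preventive measure even though the replica is still intact and serving.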

2) Replacement mechanism

a. Replacement may be scheduled or unscheduled.

Scheduled replacement swaps the replica program for a new one at periodic intervals. The new replica may also be downloaded in advance through the agent. The period can be determined by the number of attacks received or by the scale of the program (complexity, size, language, and so on).

That is, as the number of attacks increases, the period is shortened so as to lower the success rate of attacks against the exposed replica program.
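One way to express a period that shrinks as attacks accumulate is shown below. The formula, the floor value, and the names `rotation_interval` and `is_due` are assumptions for illustration; the text does not specify how the period is computed.

```python
def rotation_interval(base_seconds, attacks_observed, min_seconds=60.0):
    """Shorten the replacement period as the observed attack count grows.

    Assumed rule: divide the base period by (1 + attacks), but never go
    below `min_seconds` so replicas are not swapped faster than they start.
    """
    return max(min_seconds, base_seconds / (1 + attacks_observed))


def is_due(last_rotation, now, interval):
    """True when the scheduled replacement time has been reached."""
    return now - last_rotation >= interval
```

With a base period of one hour, three observed attacks bring the period down to fifteen minutes, and a very large attack count bottoms out at the one-minute floor.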

Unscheduled replacement applies when a replica program has been exposed to an attack that succeeded, as when the state changes to the attacked state: that replica program is immediately replaced with a newly mutated replica program. Because the program in service now takes a different form from the previous one, the attacker has to start over from the beginning.

3) Normality check mechanism

Not only the server program but also the client (app) that communicates with the server can be a target of attack. Checks must be performed periodically both on whether the service is operating normally apart from security (service failure) and on whether it is normal with respect to security (forgery or tampering).

The client can use a solution that checks for forgery or tampering. A simple method is to compute the application's hash value in advance and check that it matches at every execution.
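The hash comparison can be sketched as follows. SHA-256 is an assumed choice of hash, the functions `digest` and `is_untampered` are hypothetical, and the application image is passed in as bytes to keep the example self-contained; a real check would read the installed binary from disk.

```python
import hashlib
import hmac


def digest(image: bytes) -> str:
    """Hash of the application image, computed once at release time."""
    return hashlib.sha256(image).hexdigest()


def is_untampered(image: bytes, expected_hex: str) -> bool:
    """Recompute the hash at startup and compare it with the stored value."""
    return hmac.compare_digest(digest(image), expected_hex)


# Hypothetical release image and the reference value stored for it.
RELEASE_IMAGE = b"app-v1 binary contents"
EXPECTED = digest(RELEASE_IMAGE)
```

`is_untampered(RELEASE_IMAGE, EXPECTED)` holds for the unmodified image and fails as soon as a single byte of the image changes.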

a. A heartbeat scheme can serve as the normality check algorithm between servers.

That is, much like taking a pulse, the servers exchange messages at regular intervals; if there is no response, a few more attempts are made, after which the connection is considered lost or the server is considered down.

Here, each of the N replica programs exchanges messages with the agent. The agent holds all the information about each replica program and can also issue commands to the replica programs. A message may be a simple hello (to check whether the program is running), may carry a computed hash value for tamper checking, or may be a command to reboot.
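A minimal sketch of the agent-side heartbeat with retries follows. Network transport is replaced by plain callables, and the names `check_replica`, `sweep`, and `make_flaky` as well as the retry count of three are assumptions.

```python
def check_replica(ping, retries=3):
    """Return True if the replica answers any of `retries` hello probes."""
    for _ in range(retries):
        try:
            if ping() == "hello":
                return True
        except (ConnectionError, TimeoutError):
            pass  # no reply this round; try again
    return False


def sweep(replicas, retries=3):
    """One agent pass over all replicas; returns the names considered down."""
    return [name for name, ping in replicas.items()
            if not check_replica(ping, retries)]


def make_flaky(failures):
    """Hypothetical replica stub that times out `failures` times first."""
    state = {"left": failures}

    def ping():
        if state["left"] > 0:
            state["left"] -= 1
            raise TimeoutError("no reply")
        return "hello"

    return ping
```

A replica that misses two probes but answers the third is still treated as alive, while one that misses all three is reported as down and becomes a candidate for replacement.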

b. Mutual authentication can serve as another algorithm.

The protocol is configured to perform mutual authentication over an encrypted channel.

An HTTPS (TLS 1.2) encrypted channel is established, secure authentication is performed with mutual certificates, and compromised nodes (servers that have been successfully attacked) are filtered out by posing challenges based on pre-shared information and checking the answers.

- The shared information is changed continuously and shared over the encrypted channel.

- The challenge can take the form of a math problem built on the shared information: a computation algorithm is transmitted and the answers are compared for verification. For example, given a challenge such as (hash, a, b, c, d), the algorithm HAS-256 and the values (a, b, c) are transmitted while d is the shared information; unless d is known in advance, the result cannot be forged.
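The challenge described above can be sketched as follows, assuming that the algorithm named in the text corresponds to SHA-256 and that the values are joined with a separator before hashing. The functions `make_challenge`, `answer`, and `verify` are hypothetical, and the TLS channel and certificate exchange are outside the scope of the example.

```python
import hashlib
import hmac
import secrets


def make_challenge():
    """Verifier side: fresh random values (a, b, c) sent over the channel."""
    return tuple(secrets.token_hex(8) for _ in range(3))


def answer(challenge, shared_d):
    """Prover side: hash the transmitted values together with the secret d."""
    a, b, c = challenge
    material = "|".join((a, b, c, shared_d)).encode()
    return hashlib.sha256(material).hexdigest()


def verify(challenge, response, shared_d):
    """Verifier side: recompute the expected answer and compare."""
    return hmac.compare_digest(answer(challenge, shared_d), response)
```

A node that holds the current value of d passes `verify`, while a node that has lost or never received it cannot produce a matching response. In practice a keyed construction such as HMAC would be the more conventional way to combine a secret with public values; plain concatenation is used here only because it mirrors the description.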

The foregoing has described the dynamic self-mutation method using virtual machine based code mutation technology according to the present invention and the system that performs it. The technical effects of the present invention are as follows.

In a software implementation, the procedures and functions described in this specification, as well as the design and parameter optimization of each component, may be implemented as separate software modules. The software code may be implemented as a software application written in a suitable programming language. The software code may be stored in a memory and executed by a controller or processor.

Claims

A dynamic self-mutation system using VM-based code mutation technology, comprising:
an original code management unit configured to store the source code and executable file of an original program;
a mutation engine server comprising an obfuscation engine configured to perform executable file recombination on the source code of the original program through obfuscation and compilation, and to perform executable file recombination on the executable code of the original program through obfuscation using virtual code and encoding;
an agent configured to receive the recombined executable file and apply the executable file; and
a target server configured to include the agent,
wherein the mutation engine server
further comprises a mutation code management unit configured to store and manage, by time, replica programs that have the same execution logic and function as a result of the obfuscation,
the mutation code management unit communicates with the agent to transmit the replica programs, and
the target server
is configured to deliver a plurality of mutated programs corresponding to the replica programs to a client through a proxy, and to perform a state management mechanism that manages service-failure, tampered, or attacked states, a replacement mechanism related to replacing the replica programs, and a normality check mechanism that verifies whether the server program and the client, as targets of attack, are normal in terms of security and service.

The dynamic self-mutation system according to claim 1,
wherein the obfuscation engine
is configured to download the source code of the original program, obfuscate the source code according to obfuscation options, compile the obfuscated code, and recombine the executable file using the compiled code.

The dynamic self-mutation system according to claim 1,
wherein the obfuscation engine
is configured to download the executable file of the original program, disassemble the original program to extract machine code, generate virtual code corresponding to the machine code, encode the virtual code as bytecode, and combine it with the original program to recombine the executable file.

The dynamic self-mutation system according to claim 3,
wherein the obfuscation engine
generates an obfuscation technique based on a random seed and a timestamp,
transforms predefined handlers with the obfuscation technique to generate a new virtual machine (VM), and
is configured to encode into the bytecode of the new virtual machine and combine it with the original program to recombine the executable file.

delete

The dynamic self-mutation system according to claim 1,
wherein, in the state management mechanism,
the target server
determines a service-failure state when it judges that a service interruption has occurred at the application and network layers,
downloads a new replica program from the obfuscation engine and restarts the server program when the server program has been forged or tampered with, and
recognizes an attack risk to the server program, transitions to an attack-risk state, and runs a preventive program for the server program.