WO2017177003A1 - Extraction and comparison of hybrid program binary features - Google Patents
- Publication number
- WO2017177003A1 (PCT/US2017/026359)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- hybrid
- program
- binaries
- feature
- features
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/55—Detecting local intrusion or implementing counter-measures
- G06F21/56—Computer malware detection or handling, e.g. anti-virus arrangements
- G06F21/562—Static detection
- G06F21/563—Static detection by source code analysis
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/55—Detecting local intrusion or implementing counter-measures
- G06F21/56—Computer malware detection or handling, e.g. anti-virus arrangements
- G06F21/562—Static detection
- G06F21/564—Static detection by virus signature recognition
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Prevention of errors by analysis, debugging or testing of software
- G06F11/3668—Testing of software
- G06F11/3672—Test management
- G06F11/3688—Test management for test execution, e.g. scheduling of test suites
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Prevention of errors by analysis, debugging or testing of software
- G06F11/3668—Testing of software
- G06F11/3672—Test management
- G06F11/3692—Test management for test results analysis
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2221/00—Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F2221/03—Indexing scheme relating to G06F21/50, monitoring users, programs or devices to maintain the integrity of platforms
- G06F2221/033—Test or assess software
Definitions
- the present invention relates to extraction and comparison of program features, and more particularly to detection and prevention of malicious software attacks by extraction and comparison of hybrid program binary features.
- Program binaries are a critical aspect of cyber security to understand the characteristics of programs. Benign software and malware are distributed as program binaries. Inspecting their distribution and runtime behavior is an important task done by many cyber security solutions such as anti-virus software.
- Conventional approaches that characterize programs using control flow information (e.g., CPU instructions, system calls) are not effective in similarity comparison, at least partly because this information is sensitive to small changes in benign software as well as malware. Therefore, such methods are not effective in accurately determining the similarity of programs.
- Benign software has many versions for different platforms and patches. Furthermore, even though their source code may be very similar, once it is compiled into the binary format, its instruction structure becomes significantly different due to algorithm and optimization of compilers. Moreover, with respect to malicious software (malware), malware writers use variations of code (e.g., polymorphic malware code), which effectively confuses and renders conventional approaches, such as those discussed above, inaccurate and ineffective. Thus, reliable and effective characterization and similarity comparison of program binaries is an unsolved problem, as conventional approaches are not reliable or effective enough to determine, for example, similar benign programs and malware families accurately.
- a method for identifying similarities in program binaries, including extracting program binary features from one or more input program binaries to generate corresponding hybrid features.
- the hybrid features include a reference feature, a resource feature, an abstract control flow feature, and a structural feature.
- Combinations of a plurality of pairs of binaries are generated from the extracted hybrid features, and a similarity score is determined for each of the pairs of binaries.
- a hybrid difference score is generated based on the similarity score for each of the binaries combined with input hybrid feature parameters.
- a likelihood of malware in the input program is identified based on the hybrid difference score.
- a system for identifying similarities in program binaries.
- the system includes a processor coupled to a memory in which the processor is configured to extract program binary features from one or more input program binaries to generate corresponding hybrid features.
- the hybrid features include a reference feature, a resource feature, an abstract control flow feature, and a structural feature.
- Combinations of a plurality of pairs of binaries are generated from the extracted hybrid features, and a similarity score is determined for each of the pairs of binaries.
- a hybrid difference score is generated based on the similarity score for each of the binaries combined with input hybrid feature parameters.
- a likelihood of malware in the input program is identified based on the hybrid difference score.
- a non-transitory computer readable medium for identifying similarities in program binaries, including extracting program binary features from one or more input program binaries to generate corresponding hybrid features.
- the hybrid features include a reference feature, a resource feature, an abstract control flow feature, and a structural feature.
- Combinations of a plurality of pairs of binaries are generated from the extracted hybrid features, and a similarity score is determined for each of the pairs of binaries.
- a hybrid difference score is generated based on the similarity score for each of the binaries combined with input hybrid feature parameters.
- a likelihood of malware in the input program is identified based on the hybrid difference score
- FIG. 1 is a block/flow diagram illustrating an exemplary processing system to which the present principles may be applied, in accordance with the present principles.
- FIG. 2A is a block/flow diagram illustrating a high-level system/method for program binary feature extraction, in accordance with the present principles
- FIG. 2B is a block/flow diagram illustrating a high-level system/method for hybrid feature similarity analysis, in accordance with the present principles
- FIG. 3 is a block/flow diagram illustrating a method for program binary feature extraction, in accordance with the present principles
- FIG. 4 is a block/flow diagram illustrating exemplary reference features for program binary feature extraction, in accordance with the present principles
- FIG. 5 is a block/flow diagram illustrating a method for generation of abstract control features, in accordance with the present principles
- FIG. 6 is a block/flow diagram illustrating a method for hybrid feature similarity comparison, in accordance with the present principles
- FIG. 7 is a block/flow diagram illustrating a method for similarity comparison of two binaries, in accordance with the present principles.
- FIG. 8 is a block/flow diagram illustrating a system for extraction and comparison of hybrid program binary features, in accordance with the present principles.
- a system and method for detecting and/or preventing malicious software (malware) attacks on one or more computer systems by extraction and comparison of hybrid binary program features is provided in accordance with the present principles.
- the present principles may be employed as a practical solution for protecting one or more computing systems from malware attacks, for example as an integrated virus definition updater for antivirus protection systems.
- Program binaries are a critical aspect of cyber security to understand the characteristics of programs. Benign software and malware are distributed as program binaries. Inspecting their distribution and runtime behavior is an important task, as performed by many cyber security solutions (e.g., anti-virus software).
- malware writers use variations of code (e.g., polymorphic malware code), which effectively confuses and renders conventional approaches, such as those discussed above, inaccurate and ineffective.
- reliable and effective characterization and similarity comparison of program binaries is an unsolved problem, as conventional approaches, such as those discussed above, are not reliable or effective enough in determining, for example, similar benign programs and malware families accurately and reliably.
- the present principles may be applied to extract multiple features from program binaries to quantify the characteristics of programs and compare their similarity in a blackbox way (e.g., without using any source code or debug information).
- the extracted multiple features may include (1) the reference feature, (2) the resource feature, (3) the abstract control flow feature, and/or (4) the structural feature.
- These features represent multiple aspects of binaries in terms of referenced binaries, resource, control flow, and binary structure.
- these features are richer in the coverage of relevant characteristics than other features of the binaries, and thus are more effective to quantify the similarity of programs in a complementary way to each other.
- the present principles may be applied to enable an effective, accurate comparison of program binaries, which is an important feature for program whitelisting and malware clustering in cyber security systems.
- With respect to malware clustering, a large volume of new malware is released and discovered on a daily basis. Manually examining all of this malware is very challenging because of the high cost of human effort, its inaccuracy, and the inability of human effort and/or conventional antivirus systems to process such vast amounts of potential malware in a timely manner (e.g., timely enough to address these ever-changing malware variations and prevent attacks from them in real time).
- the present invention advantageously improves the quality, effectiveness, and accuracy of malware comparison and clustering in accordance with various embodiments, which will be described in further detail herein below.
- Embodiments described herein may be entirely hardware, entirely software or including both hardware and software elements.
- the present invention is implemented in software, which includes but is not limited to firmware, resident software, microcode, etc.
- Embodiments may include a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by or in connection with a computer or any instruction execution system.
- a computer-usable or computer readable medium may include any apparatus that stores, communicates, propagates, or transports the program for use by or in connection with the instruction execution system, apparatus, or device.
- the medium can be magnetic, optical, electronic, electromagnetic, infrared, or semiconductor system (or apparatus or device) or a propagation medium.
- the medium may include a computer-readable storage medium such as a semiconductor or solid state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk and an optical disk, etc.
- Each computer program may be tangibly stored in a machine-readable storage media or device (e.g., program memory or magnetic disk) readable by a general or special purpose programmable computer, for configuring and controlling operation of a computer when the storage media or device is read by the computer to perform the procedures described herein.
- the inventive system may also be considered to be embodied in a computer-readable storage medium, configured with a computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner to perform the functions described herein.
- a data processing system suitable for storing and/or executing program code may include at least one processor coupled directly or indirectly to memory elements through a system bus.
- the memory elements can include local memory employed during actual execution of the program code, bulk storage, and cache memories which provide temporary storage of at least some program code to reduce the number of times code is retrieved from bulk storage during execution.
- I/O devices including but not limited to keyboards, displays, pointing devices, etc. may be coupled to the system either directly or through intervening I/O controllers.
- Network adapters may also be coupled to the system to enable the data processing system to become coupled to other data processing systems or remote printers or storage devices through intervening private or public networks.
- Modems, cable modem and Ethernet cards are just a few of the currently available types of network adapters.
- Referring to FIG. 1, an exemplary processing system 100, to which the present principles may be applied, is illustratively depicted in accordance with one embodiment of the present principles.
- the processing system 100 includes at least one processor (CPU) 104 operatively coupled to other components via a system bus 102.
- a cache 106, a Read Only Memory (ROM), a Random Access Memory (RAM), an input/output (I/O) adapter 120, a sound adapter 130, a network adapter 140, a user interface adapter 150, and a display adapter 160 are operatively coupled to the system bus 102.
- a first storage device 122 and a second storage device 124 are operatively coupled to system bus 102 by the I/O adapter 120.
- the storage devices 122 and 124 can be any of a disk storage device (e.g., a magnetic or optical disk storage device), a solid state magnetic device, and so forth.
- the storage devices 122 and 124 can be the same type of storage device or different types of storage devices.
- a speaker 132 is operatively coupled to system bus 102 by the sound adapter 130.
- a transceiver 142 is operatively coupled to system bus 102 by network adapter 140.
- a display device 162 is operatively coupled to system bus 102 by display adapter 160.
- a first user input device 152, a second user input device 154, and a third user input device 156 are operatively coupled to system bus 102 by user interface adapter 150.
- the user input devices 152, 154, and 156 can be any of a keyboard, a mouse, a keypad, an image capture device, a motion sensing device, a microphone, a device incorporating the functionality of at least two of the preceding devices, and so forth. Of course, other types of input devices can also be used, while maintaining the spirit of the present principles.
- the user input devices 152, 154, and 156 can be the same type of user input device or different types of user input devices.
- the user input devices 152, 154, and 156 are used to input and output information to and from system 100.
- processing system 100 may also include other elements (not shown), as readily contemplated by one of skill in the art, as well as omit certain elements.
- various other input devices and/or output devices can be included in processing system 100, depending upon the particular implementation of the same, as readily understood by one of ordinary skill in the art.
- various types of wireless and/or wired input and/or output devices can be used.
- additional processors, controllers, memories, and so forth, in various configurations can also be utilized as readily appreciated by one of ordinary skill in the art.
- systems 100, 200, 210, and 800 are systems for implementing respective embodiments of the present principles. Part or all of processing system 100 may be implemented in one or more of the elements of systems 200, 210 and 800, according to various embodiments of the present principles.
- processing system 100 may perform at least part of the method described herein including, for example, at least part of methods 200, 210, 300, 400, 500, 600, and 700 of FIGs 2A, 2B, 3, 4, 5, 6, and 7, respectively.
- part or all of system 800 may be used to perform at least part of methods 200, 210, 300, 400, 500, 600, and 700 of FIGs 2A, 2B, 3, 4, 5, 6, and 7, respectively, according to various embodiments of the present principles.
- Referring to FIG. 2A, a high-level method 200 for program binary feature extraction is illustratively depicted in accordance with an embodiment of the present principles.
- one or more program binaries may be input in block 202.
- program binary feature extraction may be performed to generate hybrid binary features, and the hybrid binary features may be output in block 206, in accordance with the present principles.
- Referring to FIG. 2B, a high-level method 210 for hybrid feature similarity analysis is illustratively depicted in accordance with an embodiment of the present principles.
- one or more generated hybrid binary features may be input in blocks 212, 214, and/or 216, and one or more hybrid feature parameters may be input in block 218.
- the features 212, 214, 216 and the feature parameters 218 may be employed to determine the similarity of two or more program binaries in accordance with the present principles.
- a similarity vector based on the hybrid feature similarity analysis 220 may be output in block 222.
- the similarity vector output in block 222 may be employed to, for example, provide real-time malicious software (malware) definition comparisons and updates for detection and prevention of malware attacks in accordance with various embodiments.
- Referring to FIG. 3, a method 300 for program binary feature extraction is illustratively depicted in accordance with an embodiment of the present principles.
- one or more program binaries may be input in block 302, and program binary features may be extracted in block 304 in accordance with the present principles.
- the hybrid binary features extracted from program binaries 314 may include one or more of reference features 316, resource features 318, abstract control flow features 320, and structural features 322.
- the hybrid features may be extracted using a corresponding extraction function, including reference feature extraction 306, resource feature extraction 308, abstract control flow feature extraction 310, and structural feature extraction 312 in accordance with the present principles.
- the output may be stored as individual features in the hybrid features in accordance with various embodiments.
- F_R(P): resource feature 318 for a program binary P
- F_C(P): abstract control feature 320 for a program binary P
- F_S(P): structural feature 322 for a program binary P
- hybrid binary features 314 for a program P are a four-tuple of a reference feature, a resource feature, an abstract control feature, and a structural feature for Program P, as shown below:
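The four-tuple described here can be written out as follows. The symbols F_R, F_C, and F_S come from the notation in the text; the name F_F for the reference feature is an assumption, chosen by analogy with the parameter C_F used for the reference feature in the comparison step.

```latex
% Assumption: F_F(P) denotes the reference feature 316; the other symbols are from the text.
HF(P) = \bigl( F_F(P),\; F_R(P),\; F_C(P),\; F_S(P) \bigr)
```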
- This value is between 0 (0%) and 1 (100%).
- This value is between 0 (0%) and 1 (100%).
- This value is between 0 (0%) and 1 (100%).
- program binary feature extraction 304 may be performed as follows:
- reference feature extraction 306 may be performed as follows:
- If section is a reference table (e.g., import table, GOT in ELF)
- resource feature extraction 308 may be performed as follows:
- Extract_F_R(P) // Extract Resource Feature from a Program binary P
- Sections ← Get the list of binary sections of Program P
- If section has a resource (e.g., string, symbol, global data, icon, etc.)
- abstract control feature extraction 310 may be performed as follows:
- Extract_F_C(P) // Extract Abstract Control Flow Feature from a Program binary P
- Sections ← Get the list of binary sections of Program P
- If section is a code section:
- DisassembledInstructions ← Disassemble(section)
- For each instruction in DisassembledInstructions:
- ACF ← getOpCode(instruction)
- ListOfACF.add(ACF)
- structural feature extraction 312 may be performed as follows:
- Sections ← Get the list of binary sections of Program P
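The four extraction steps above can be sketched together in Python. This is a minimal illustration under stated assumptions, not the patent's implementation: the `Section` class, its `kind` labels, and the example section contents are hypothetical, and the binary is assumed to have been parsed and disassembled already by some other tool.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    """Hypothetical, already-parsed view of one binary section."""
    name: str
    size: int
    kind: str  # "reference", "resource", "code", or "other" (assumed labels)
    entries: list = field(default_factory=list)


def extract_hybrid_features(sections):
    """Return (reference, resource, abstract control flow, structural) features."""
    reference, resource, control = set(), set(), []
    for s in sections:
        if s.kind == "reference":    # e.g., import table, GOT in ELF
            reference.update(s.entries)   # (program, function) pairs
        elif s.kind == "resource":   # e.g., strings, symbols, global data, icons
            resource.update(s.entries)
        elif s.kind == "code":       # entries are (opcode, operands) pairs
            control.extend(op for op, _ in s.entries)
    # Structural feature: section names and sizes (the count is len(structural)).
    structural = {s.name: s.size for s in sections}
    return reference, resource, control, structural


# Illustrative input only; names and sizes are made up.
sections = [
    Section(".got", 64, "reference", [("ProgramB", "B1"), ("ProgramC", "C1")]),
    Section(".rodata", 32, "resource", ["hello", "usage: a [file]"]),
    Section(".text", 128, "code", [("call", "B1"), ("add", "eax, 1"), ("ret", "")]),
]
hf = extract_hybrid_features(sections)
```

Like the pseudocode for the code section, this sketch records every op-code; restricting the list to control-dependent instructions is a separate filtering step.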
- Referring to FIG. 4, a diagram 400 of exemplary reference features for program binary feature extraction is illustratively depicted in accordance with the present principles.
- ELF: Executable and Linkable Format
- GOT: Global Offset Table
- Other binary formats have similar binary sections or tables.
- Program A 402 uses the function B1 405 of Program B 404, and the function C1 407 of Program C 406.
- a reference feature from this exemplary program is shown in more detail in block 410.
- exemplary resource features 318 are shown in Table 1, below:
- Programs employ various resources which are embedded in the program binary. Some of these resources are data which are used in programs. For example, global data, program metadata, program icon, strings, debug symbols, etc. belong to this category of resource features 318. Such information is typically stored in separated binary sections. For example, in ELF binary format, read-only data section and symbol table sections are used for such information.
- this table has a column of the Kind, where the data is from actual measured and/or received Values.
- This example shows several strings, a program function symbol, global data from read only section, and an icon data which belong to the metadata.
- Referring to FIG. 5, a method 500 for generation and extraction of abstract control flow features is illustratively depicted in accordance with the present principles.
- a given program is disassembled, and an algorithm iterates over each instruction in block 502. If the instruction is not determined to be a control dependent instruction (e.g., it is an arithmetic instruction) in block 504, it is discarded in block 508. If the instruction is determined to be a control dependent instruction in block 504, only its op-code is taken and included in the abstract control flow feature in accordance with various embodiments of the present principles, as shown in the exemplary Pseudocode 1, below.
- Control flow information (e.g., function calls, returns, jumps, and system calls) is an important description of program behavior. However, using the full information can be too noisy, because certain details change even when the code changes only slightly.
- For example, program instructions use jumps to other subroutines, and the locations of those subroutines in the binary are subject to change with a small code patch. Therefore, in accordance with various embodiments of the present principles, a subset of control flow information is employed, which is more resilient to such changes than the full information.
- a subset of instruction information including op-codes but without instruction parameters for control-dependent instructions (e.g., jump, call, and return instructions) may be employed in accordance with various embodiments. Non-control-dependent instructions are not used in the above-discussed embodiment.
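The filtering step can be sketched as follows. This is an illustration only: the set of mnemonics in `CONTROL_OPCODES` is an assumed, incomplete sample of x86 control-transfer instructions, and the instruction lists are hypothetical disassembler output given as (op-code, operands) pairs.

```python
# Assumption: a small, illustrative set of x86 control-transfer mnemonics.
CONTROL_OPCODES = {"jmp", "je", "jne", "call", "ret", "syscall"}


def abstract_control_flow(instructions):
    """Keep only the op-code of control-dependent instructions.

    `instructions` is an iterable of (opcode, operands) pairs produced by
    some disassembler; operands are dropped, other instructions discarded.
    """
    return [op for op, _operands in instructions if op in CONTROL_OPCODES]


# The same code before and after a small patch that moves jump targets.
before = [("mov", "eax, 1"), ("call", "0x401000"), ("add", "eax, 2"),
          ("jne", "0x401020"), ("ret", "")]
after_patch = [("mov", "eax, 1"), ("call", "0x401040"), ("add", "eax, 2"),
               ("jne", "0x401060"), ("ret", "")]

acf_before = abstract_control_flow(before)
acf_after = abstract_control_flow(after_patch)
```

Because the operands are dropped, the two versions yield the same feature even though every call and jump target changed, which is the resilience the text describes.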
- Another feature of binaries is the structure information of the binary, that is, the characteristics of its binary sections (e.g., the name, size, and number of binary sections).
- In Table 2, the exemplary table of structural features 322, one column shows the names of binary sections, and another column shows the sizes of the binary sections.
- Referring to FIG. 6, a system and method 600 for hybrid feature similarity comparison is illustratively depicted in accordance with the present principles.
- hybrid features of program binaries may be employed for comparing and determining a similarity of a plurality of characteristics of binaries (e.g., to detect and/or prevent malware attacks) in accordance with the present principles.
- a set of N hybrid features, generated from a plurality (e.g., N) of program binaries, and one or more hybrid feature parameters 608 (e.g., a set of rates determining the contribution of each feature in the comparison) may be input for hybrid feature similarity comparison in block 610.
- hybrid feature parameters 608 may be represented as follows:
- C_F: a parameter for the reference feature
- C_R: a parameter for the resource feature
- C_C: a parameter for the abstract control flow feature
- C_S: a parameter for the structural feature
- C_F, C_R, C_C, and C_S are each a ratio between 0 and 1.
- a combination generator 612 is configured to generate combinations for every possible pair of binaries. For each pair of binaries, a similarity comparison is performed in block 616 to generate a hybrid difference score in block 618.
- the similarity comparison (e.g., feature comparison) is iterated in block 614 for one or more of the pairs of binaries, and a similarity vector is generated and output in block 620 for use in, for example, detection and prevention of malware attacks, in accordance with the present principles.
- the hybrid feature similarity comparison in block 610 may be performed as follows:
- Hybrid Feature Similarity Comparison (HFList, HFIndex, C_F, C_R, C_C, C_S, C_F_P, C_F_F)
- HFList ← [HF(P1), HF(P2), ... , HF(PN)]
- CombinationList ← GenerateCombination(HFIndex)
- the hybrid difference score 618 represents a similarity between two binaries, and the scores of all combinations may be stored in a similarity vector 620.
- clusters of binaries are produced by applying, for example, clustering algorithms to the data stored in the similarity vector in accordance with the present principles.
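The pair generation and the similarity vector can be sketched as below. This is a minimal sketch: the function name `similarity_vector` and the toy comparison function are hypothetical stand-ins, and the dictionary keyed by index pairs is only one possible layout for the vector.

```python
from itertools import combinations


def similarity_vector(hf_list, compare):
    """Score every unordered pair of binaries.

    `hf_list` holds one hybrid feature per binary; `compare` is any function
    returning a hybrid difference score for two hybrid features.
    Returns {(i, j): score} for all index pairs with i < j.
    """
    index = range(len(hf_list))
    return {(i, j): compare(hf_list[i], hf_list[j])
            for i, j in combinations(index, 2)}


def toy_compare(a, b):
    """Toy stand-in: treat each hybrid feature as a set and use set overlap."""
    return len(a & b) / len(a | b) if (a | b) else 1.0


hf_list = [{"a", "b"}, {"a", "b"}, {"c"}]
vector = similarity_vector(hf_list, toy_compare)
```

For N binaries this produces N(N-1)/2 scores, which a clustering algorithm can then consume as pairwise similarities.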
- Referring to FIG. 7, a method 700 for similarity comparison of two binaries is illustratively depicted in accordance with the present principles.
- a similarity comparison of two hybrid binary features 702, 712 is performed in block 722 in accordance with the present principles.
- a comparison is performed between two features of the same kind (e.g., reference feature 704, 714; resource feature 706, 716; abstract control flow feature 708, 718; and structural feature 710, 720) to determine a difference score value between the features of the same kind in blocks 724, 726, 728, and 730, respectively.
- difference score values 724, 726, 728, and 730 are employed for determining a hybrid difference score in block 734 by, for example, multiplying values of the hybrid feature parameters 732 with the difference score values 724, 726, 728, and 730 in accordance with various embodiments of the present principles.
- the hybrid difference score 734 between program binaries P1 and P2 may be determined as follows:
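One plausible reading of this step is a weighted sum of the four per-feature scores. The sketch below assumes that reading; the text states only that the parameters are multiplied with the score values, so the summation, the function name, and the dictionary keys are all assumptions.

```python
def hybrid_difference_score(scores, params):
    """Combine four per-feature scores with the hybrid feature parameters.

    Both arguments are dicts keyed by "F" (reference), "R" (resource),
    "C" (abstract control flow), and "S" (structural). Summing the
    products is an assumption; the text only says they are multiplied.
    """
    for key, value in params.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"parameter C_{key} must be between 0 and 1")
    return sum(params[k] * scores[k] for k in ("F", "R", "C", "S"))


# Equal weights, chosen here only for illustration.
params = {"F": 0.25, "R": 0.25, "C": 0.25, "S": 0.25}
scores = {"F": 1.0, "R": 0.5, "C": 0.5, "S": 0.0}
score = hybrid_difference_score(scores, params)
```

With parameters that sum to 1 and per-feature scores in the range 0 to 1, the combined score also stays in that range.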
- the similarity comparison 724 of reference features 704, 714 of Programs P1 and P2 may be performed as follows:
- the similarity comparison 726 of resource features 706, 716 of Programs P1 and P2 may be performed as follows:
- the similarity comparison 728 of abstract control flow features 708, 718 of Programs P1 and P2 may be performed as follows:
- the similarity comparison 730 of structural features 710, 720 of Programs P1 and P2 may be performed as follows:
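The per-feature comparison procedures are not reproduced in this text. As a stand-in, the sketch below uses common measures chosen for illustration, not taken from the source: Jaccard set overlap for the reference and resource features, Python's `difflib.SequenceMatcher` ratio for the op-code sequence, and a size ratio averaged over section names for the structural feature. All function names are hypothetical.

```python
from difflib import SequenceMatcher


def jaccard(a, b):
    """Set overlap in [0, 1]; two empty sets count as identical."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a or b) else 1.0


def sequence_similarity(a, b):
    """Ratio in [0, 1] of matching elements between two op-code sequences."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def structural_similarity(a, b):
    """Average, over all section names, of min(size) / max(size).

    `a` and `b` map section names to sizes; a section missing from one
    binary contributes 0 to the average.
    """
    names = set(a) | set(b)
    if not names:
        return 1.0
    total = 0.0
    for n in names:
        x, y = a.get(n, 0), b.get(n, 0)
        total += min(x, y) / max(x, y) if max(x, y) else 1.0
    return total / len(names)


resource_sim = jaccard({"a", "b"}, {"b", "c"})
control_sim = sequence_similarity(["call", "jne", "ret"], ["call", "ret"])
struct_sim = structural_similarity({".text": 100}, {".text": 50, ".data": 10})
```

Each function returns a value between 0 and 1, so the results can be fed directly into a parameter-weighted combination.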
- thresholds may be employed to, for example, match similar program and function names in the comparison of the reference feature 724, such as different versions of the same library (e.g., LibX_V1 and LibX_V2).
- C_F_P: a threshold for determining whether program and function names are similar in the comparison of the reference feature 724.
- C_F_F: a threshold for determining whether program and function names are similar in the comparison of the reference feature 724.
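Threshold-based name matching can be sketched as follows. The string measure (`difflib.SequenceMatcher` ratio), the function name `names_match`, and the threshold value 0.8 are assumptions for illustration; the source does not specify how name similarity is computed.

```python
from difflib import SequenceMatcher


def names_match(a, b, threshold):
    """True when two names are at least `threshold` similar (0 to 1)."""
    return SequenceMatcher(None, a, b).ratio() >= threshold


# Two versions of the same library versus an unrelated, made-up name.
same_library = names_match("LibX_V1", "LibX_V2", 0.8)
different_library = names_match("LibX_V1", "OtherLib", 0.8)
```

A lower threshold tolerates more variation between version names, at the cost of matching unrelated names more often.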
- Referring to FIG. 8, an exemplary system 800 for extraction and comparison of hybrid program binary features is illustratively depicted in accordance with the present principles.
- While many aspects of system 800 are described in singular form for the sake of illustration and clarity, the same can be applied to multiple ones of the items mentioned with respect to the description of system 800.
- While a single controller 816 is described, more than one controller 816 can be used in accordance with the teachings of the present principles, while maintaining the spirit of the present principles.
- storage device 818 is but one aspect involved with system 800 that can be extended to plural form while maintaining the spirit of the present principles.
- For program whitelisting, this may be accomplished by determining the variations or different versions of software.
- Software companies and developers produce diverse versions of software for bug fixes, security updates, and new features. For example, if a company updates a binary once every several weeks or months, then taking all binary information inside an enterprise, there could be dozens to hundreds of different versions of a program.
- This invention can determine the similarities of such programs. Knowing the different versions of benign software helps to exclude them from comparison with malicious software, thus reducing the complexity of malware detection.
- an unknown binary may be compared with a list of malicious software binaries to determine the possibility that the binary has malicious functionality. For example:
- Similarity between an unknown program and a file search utility: 20%
- Similarity between an unknown program and a network utility: 30%
- Thus, it may be determined that there is a chance (e.g., likelihood percentage) that this program may have malicious functions seen in malware X.
- For malware clustering, when an unknown binary shows similarity with multiple kinds of malicious software, this invention helps to identify the category of the malware for use in antivirus applications. For example:
- Similarity between an unknown program and malware family 1: 70%
- Similarity between an unknown program and malware family 2: 27%
- Similarity between an unknown program and malware family 3: 3%
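Given scores like those in the example, picking the most likely family is a simple maximum. The sketch below uses the three percentages from the example; the function name `closest_family` and the dictionary layout are hypothetical.

```python
def closest_family(similarities):
    """Return (family, score) for the family with the highest similarity."""
    family = max(similarities, key=similarities.get)
    return family, similarities[family]


# Scores from the example above, expressed as fractions.
similarities = {
    "malware family 1": 0.70,
    "malware family 2": 0.27,
    "malware family 3": 0.03,
}
best = closest_family(similarities)
```

In practice a minimum score might also be required before assigning a family at all; that cut-off would be a design choice not stated in the source.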
- antivirus/anti-malware applications may be updated/applied accordingly to detect and prevent malicious software attacks in accordance with the present principles.
- the system 800 can include a bus 801, which may be connected to one or more computing networks and/or storage devices 818.
- a program binary feature extractor 802 may be employed for extraction of binary features, and hybrid features may be generated using a hybrid binary feature generator 804.
- the hybrid binary features may be analyzed using a hybrid binary feature similarity analyzer 806, which may further take as input hybrid feature parameters provided by, for example, a hybrid feature parameter determination device 808.
- a similarity determination device 812 may be employed to determine a difference between pairs of binaries, and a similarity vector generator 810 generates and outputs similarity score vectors based on the similarity comparison.
- the similarity vectors generated may be employed (e.g., in real time or in the future) for malware attack protection in a malware attack analyzer, detector, and preventer 814, which may be controlled by a controller 816, for instructing (e.g., manually or automatically) antivirus software to, for example, update malware definitions based on the similarity vectors, quarantine malware detected by the updated malware definitions, etc.
- a storage device 818 may be employed to store updated malware definitions, results of similarity comparison, etc. for use in, for example, detecting and preventing malware attacks in accordance with various embodiments of the present principles.
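As an illustration of how the similarity scores in the examples above might be turned into a category decision, the following is a minimal Python sketch. It is not the claimed method: the function name `classify_by_similarity`, the dictionary representation of a similarity vector, and the 0.5 confidence threshold are all assumptions introduced here for illustration only.

```python
from typing import Dict, Tuple


def classify_by_similarity(
    scores: Dict[str, float], threshold: float = 0.5
) -> Tuple[str, float, bool]:
    """Pick the reference category with the highest similarity score.

    `scores` maps a reference label (e.g., a malware family) to a
    similarity in [0, 1]. `threshold` is a hypothetical cut-off, not a
    value taken from the source text.
    Returns (best_label, best_score, is_confident).
    """
    if not scores:
        raise ValueError("scores must not be empty")
    # The label whose similarity score is largest is the candidate category.
    best_label = max(scores, key=scores.get)
    best_score = scores[best_label]
    # Flag the match as confident only if it reaches the assumed threshold.
    return best_label, best_score, best_score >= threshold


# Scores from the malware clustering example in the text.
family_scores = {
    "malware family 1": 0.70,
    "malware family 2": 0.27,
    "malware family 3": 0.03,
}
print(classify_by_similarity(family_scores))
# -> ('malware family 1', 0.7, True)
```

Under the assumed threshold, the benign-utility scores from the earlier example (20% and 30%) would select the network utility as the closest reference but would not be flagged as a confident match, which mirrors the text's point that low similarity only suggests a chance of shared functions.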
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- Computer Hardware Design (AREA)
- General Engineering & Computer Science (AREA)
- Software Systems (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Virology (AREA)
- General Health & Medical Sciences (AREA)
- Quality & Reliability (AREA)
- Stored Programmes (AREA)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2018552688A JP6778761B2 (ja) | 2016-04-06 | 2017-04-06 | Extraction and comparison of hybrid program binary features |
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US201662318844P | 2016-04-06 | 2016-04-06 | |
| US62/318,844 | 2016-04-06 | ||
| US15/479,928 | 2017-04-05 | ||
| US15/479,928 US10289843B2 (en) | 2016-04-06 | 2017-04-05 | Extraction and comparison of hybrid program binary features |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2017177003A1 (en) | 2017-10-12 |
Family
ID=59998743
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/US2017/026359 Ceased WO2017177003A1 (en) | 2016-04-06 | 2017-04-06 | Extraction and comparison of hybrid program binary features |
Country Status (3)
| Country | Link |
|---|---|
| US (1) | US10289843B2 |
| JP (1) | JP6778761B2 |
| WO (1) | WO2017177003A1 |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN110852235A (zh) * | 2019-11-05 | 2020-02-28 | 长安大学 | Image feature extraction method |
Families Citing this family (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US11609998B2 (en) * | 2017-06-14 | 2023-03-21 | Nippon Telegraph And Telephone Corporation | Device, method, and computer program for supporting specification |
| US10346293B2 (en) * | 2017-10-04 | 2019-07-09 | International Business Machines Corporation | Testing pre and post system call exits |
| CN109299609A (zh) * | 2018-08-08 | 2019-02-01 | 北京奇虎科技有限公司 | ELF file detection method and device |
| CN111723373A (zh) * | 2019-03-19 | 2020-09-29 | 国家计算机网络与信息安全管理中心 | Method and device for detecting exploit files in compound binary documents |
| CN113378162B (zh) * | 2020-02-25 | 2023-11-07 | 深信服科技股份有限公司 | Method, device, and storage medium for checking executable and linkable format files |
| US11294804B2 (en) * | 2020-03-23 | 2022-04-05 | International Business Machines Corporation | Test case failure with root cause isolation |
| CN113254934B (zh) * | 2021-06-29 | 2021-09-24 | 湖南大学 | Binary code similarity detection method and system based on graph matching network |
| CN115658646B (zh) * | 2022-09-28 | 2025-11-14 | 中国信息通信研究院 | Binary feature database construction method and device |
| CN117910043B (zh) * | 2024-01-18 | 2024-12-10 | 北京信息科技大学 | Method, system, and device for deep mining of information hiding in electronic documents |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20130097195A1 (en) * | 2010-07-20 | 2013-04-18 | Barracuda Networks Inc. | Method For Measuring Similarity Of Diverse Binary Objects Comprising Bit Patterns |
| US20130326625A1 (en) * | 2012-06-05 | 2013-12-05 | Los Alamos National Security, Llc | Integrating multiple data sources for malware classification |
| US9038186B1 (en) * | 2010-01-13 | 2015-05-19 | Symantec Corporation | Malware detection using file names |
| US9197665B1 (en) * | 2014-10-31 | 2015-11-24 | Cyberpoint International Llc | Similarity search and malware prioritization |
| US9223554B1 (en) * | 2012-04-12 | 2015-12-29 | SourceDNA, Inc. | Recovering source code structure from program binaries |
Family Cites Families (11)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6775780B1 (en) | 2000-03-16 | 2004-08-10 | Networks Associates Technology, Inc. | Detecting malicious software by analyzing patterns of system calls generated during emulation |
| US7752667B2 (en) | 2004-12-28 | 2010-07-06 | Lenovo (Singapore) Pte Ltd. | Rapid virus scan using file signature created during file write |
| US20070239993A1 (en) | 2006-03-17 | 2007-10-11 | The Trustees Of The University Of Pennsylvania | System and method for comparing similarity of computer programs |
| JP2010198565A (ja) * | 2009-02-27 | 2010-09-09 | Hitachi Ltd | Malicious program detection method, malicious program detection program, and information processing apparatus |
| US8516446B2 (en) * | 2010-05-21 | 2013-08-20 | Apple Inc. | Automated qualification of a binary application program |
| JP5569935B2 (ja) * | 2010-07-23 | 2014-08-13 | 日本電信電話株式会社 | Software detection method, apparatus, and program |
| KR101162051B1 (ko) * | 2010-12-21 | 2012-07-03 | 한국인터넷진흥원 | Malicious code detection and classification system using a string comparison technique, and method thereof |
| JP5667957B2 (ja) * | 2011-09-30 | 2015-02-12 | Kddi株式会社 | Malware detection device and program |
| US8584235B2 (en) * | 2011-11-02 | 2013-11-12 | Bitdefender IPR Management Ltd. | Fuzzy whitelisting anti-malware systems and methods |
| US9215245B1 (en) * | 2011-11-10 | 2015-12-15 | Google Inc. | Exploration system and method for analyzing behavior of binary executable programs |
| CN105793864A (zh) * | 2013-12-27 | 2016-07-20 | 迈克菲股份有限公司 | System and method for detecting malicious multimedia files |
2017
- 2017-04-05: US application US15/479,928, patent US10289843B2 (Active)
- 2017-04-06: JP application JP2018552688A, patent JP6778761B2 (Expired - Fee Related)
- 2017-04-06: WO application PCT/US2017/026359, publication WO2017177003A1 (Ceased)
Also Published As
| Publication number | Publication date |
|---|---|
| JP2019514119A (ja) | 2019-05-30 |
| JP6778761B2 (ja) | 2020-11-04 |
| US10289843B2 (en) | 2019-05-14 |
| US20170293761A1 (en) | 2017-10-12 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US10289843B2 (en) | | Extraction and comparison of hybrid program binary features |
| US11188650B2 (en) | | Detection of malware using feature hashing |
| Rathnayaka et al. | | An efficient approach for advanced malware analysis using memory forensic technique |
| CN108763928B (zh) | | Open-source software vulnerability analysis method, device, and storage medium |
| Salehi et al. | | MAAR: Robust features to detect malicious activity based on API calls, their arguments and return values |
| US10055585B2 (en) | | Hardware and software execution profiling |
| US8370934B2 (en) | | Methods for detecting malicious programs using a multilayered heuristics approach |
| US20090313700A1 (en) | | Method and system for generating malware definitions using a comparison of normalized assembly code |
| US20120072988A1 (en) | | Detection of global metamorphic malware variants using control and data flow analysis |
| CN115146282A (zh) | | AST-based source code anomaly detection method and device |
| WO2016049373A1 (en) | | Method for detecting libraries in program binaries |
| Kostakis et al. | | Improved call graph comparison using simulated annealing |
| O'Kane et al. | | N-gram density based malware detection |
| CN114969759B (zh) | | Asset security assessment method, device, terminal, and medium for an industrial robot system |
| Mimura | | Impact of benign sample size on binary classification accuracy |
| US20180285565A1 (en) | | Malware detection in applications based on presence of computer generated strings |
| US20230022279A1 (en) | | Automatic intrusion detection based on malicious code reuse analysis |
| US10083298B1 (en) | | Static approach to identify junk APIs in a malware |
| Wichmann et al. | | Using infection markers as a vaccine against malware attacks |
| US20170171224A1 (en) | | Method and System for Determining Initial Execution of an Attack |
| KR20150111610A (ko) | | Stack-based software similarity evaluation method and apparatus |
| CN104008336B (zh) | | ShellCode detection method and device |
| CN114925363B (zh) | | Cloud online malware detection method based on recurrent neural network |
| Jurn et al. | | A survey of automated root cause analysis of software vulnerability |
| Gond et al. | | System Calls for Malware Detection and Classification: Methodologies and Applications |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | ENP | Entry into the national phase | Ref document number: 2018552688; Country of ref document: JP; Kind code of ref document: A |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 17779825; Country of ref document: EP; Kind code of ref document: A1 |
| | 122 | EP: PCT application non-entry in European phase | Ref document number: 17779825; Country of ref document: EP; Kind code of ref document: A1 |