CN103279711A - PE file shell adding detecting method with stable static characteristic values - Google Patents

Publication number
CN103279711A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2013101610274A
Other languages
Chinese (zh)
Inventor
李琪林
刘达富
肖杰
苗长胜
覃剑
王俊峰
余明书
冯军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
Electric Power Research Institute of State Grid Sichuan Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
Electric Power Research Institute of State Grid Sichuan Electric Power Co Ltd
Application filed by State Grid Corp of China SGCC, Electric Power Research Institute of State Grid Sichuan Electric Power Co Ltd filed Critical State Grid Corp of China SGCC
Priority to CN2013101610274A priority Critical patent/CN103279711A/en
Publication of CN103279711A publication Critical patent/CN103279711A/en
Pending legal-status Critical Current

Abstract

Provided is a method for detecting whether a PE file has been packed ("shell adding"), based on stable static feature values. Packing is detected by statically analyzing four feature values of the PE file: the percentage of data contained in the code section, the percentage of zeros in the data section, the number of non-standard section names, and the number of sections whose length is 0. The method spares PE files that are not actually packed from being processed by a universal unpacker. It is fast and has a low false-positive rate and a low miss (false-negative) rate, which improves the virus detection process and saves processing time. The structural attributes of the PE file are analyzed, corresponding features are constructed, and the packing detection rules are obtained by machine learning. Test data show that the method overcomes the high miss rate of traditional signature-based packing detection and achieves a higher detection rate than comparable heuristic detection methods.

Description

A PE file packing detection method with stable static feature values
Technical field
The present invention relates to the technical field of file security, and is a novel PE file packing detection method based on stable static feature values.
Background art
Viruses commonly use code obfuscation techniques, such as polymorphism, metamorphism, packing and encryption, to resist traditional signature-based antivirus software. Of these techniques, packing is the most widely used. Packing takes a given executable file E and produces a new executable file E' that contains the encrypted executable E together with a decryption routine. When E' runs, it first executes the decryption routine to decrypt E and then executes the decrypted E. If E contains malicious code, signature-based antivirus software may well detect it; once E has been packed into E', however, signature-based antivirus software has difficulty detecting it.
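The relationship between E and E' can be illustrated with a toy sketch. It is not a real packer: a repeating-key XOR stands in for the encryption or compression step, and the names `SIGNATURE`, `pack`, `run_stub` and `signature_scan` are hypothetical, chosen only for this illustration.

```python
# Toy illustration only: a repeating-key XOR stands in for a real packer's
# compression or encryption. SIGNATURE and the payload bytes are made up.
SIGNATURE = b"EVIL_PAYLOAD"


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the repeating key (the operation is its own inverse)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def signature_scan(blob: bytes) -> bool:
    """Naive signature scanner: report a hit if the byte pattern occurs."""
    return SIGNATURE in blob


def pack(program: bytes, key: bytes) -> bytes:
    """Build E': a stand-in stub marker, the key, then the encrypted E."""
    return b"STUB" + bytes([len(key)]) + key + xor_bytes(program, key)


def run_stub(packed: bytes) -> bytes:
    """What the decryption routine does at run time: recover the original E."""
    key_len = packed[4]
    key = packed[5:5 + key_len]
    return xor_bytes(packed[5 + key_len:], key)


original = b"\x90\x90" + SIGNATURE + b"\xc3"   # E: contains the signature
packed = pack(original, key=b"\x5a\xa7")       # E': signature no longer visible

print(signature_scan(original))        # scanner sees the pattern in E
print(signature_scan(packed))          # scanner misses it in E'
print(run_stub(packed) == original)    # the stub restores E before running it
```

The point of the sketch is only that the bytes a scanner matches on are absent from E' on disk, yet fully recoverable at run time.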
To counter the encryption used by viruses, universal unpackers have been developed: they unpack E without knowing the encryption algorithm that E' applies to it. A universal unpacker runs E' in an isolated environment (such as an emulator or a virtual machine) so that E unpacks itself dynamically. After unpacking succeeds, traditional signature-based antivirus software scans the unpacked program E, which greatly improves the accuracy of virus detection.
The main drawback of universal unpackers is that, because it is not known in advance whether a program under test is packed, every program must be run through the unpacker. Universal unpacking is computationally expensive: each program takes from tens of seconds to several minutes, so scanning a large number of programs may take hours, which greatly reduces the efficiency of virus detection.
To detect whether a program is packed, signature-based packing detection tools have been developed. These tools have a low false-positive rate but a high miss rate, mainly because they can only recognize known packers: a virus author can produce a new packer by slightly modifying an existing one and thereby evade signature-based packing detection.
To address the inability of signature-based tools to detect unknown packers, Perdisci et al. adopted a PE file detection method based on static features: 9 feature values are extracted from the PE file, and a machine learning algorithm derives packing detection rules that are used to detect PE files packed with known and unknown packers. The 9 feature values are: the number of standard section names; the number of non-standard section names; the number of sections with the executable attribute; the number of sections that are simultaneously readable, writable and executable; the number of entries in the IAT; the entropy of the PE file header; the entropy of the code sections; the entropy of the data sections; and the entropy of the whole PE file. The main drawback of this method is that its feature values rest mainly on format anomalies and statistical information of the PE file. New viruses can easily evade such features, which lowers detection accuracy on new virus samples.
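Four of the nine features above are entropies. As a minimal sketch, assuming the usual byte-level Shannon entropy is what is meant (the text does not spell out the formula), it could be computed as follows. The function name `shannon_entropy` is an assumption for illustration.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    # Sum p * log2(p) over the byte values that actually occur.
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


print(shannon_entropy(b"\x00" * 64))        # constant data: lowest entropy
print(shannon_entropy(bytes(range(256))))   # every byte value once: highest entropy
```

Encrypted or compressed content tends toward the high end of this range, which is why entropy is a common statistical indicator of packing.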
The present inventors previously filed the patent application "PE file packing detection method based on static features" (CN201010594433.6). Its feature values are not stable enough, so its packing detection accuracy is not high.
Summary of the invention
The object of the invention is to provide a PE (Portable Executable) file packing detection method based on static features for 32-bit and 64-bit Microsoft Windows operating systems, so as to improve the accuracy of packing detection and reduce the false-positive rate and the miss rate.
The object of the invention is achieved as follows: a PE file packing detection method with stable static feature values, carried out in the following steps:
For each PE file to be examined, static file analysis is performed first to extract 4 feature values of the PE file; a PE file classifier is then used to perform packing detection.
The 4 feature values of the PE file are defined as follows:
1) the percentage of data contained in the code section;
2) the zero content of the data section;
3) the number of non-standard section names;
4) the number of sections whose length is zero.
The classifier is one of the following four:
a. a Bayes classifier;
b. Weka's J48 decision tree classifier, based on the C4.5 decision tree classification algorithm;
c. Weka's IBk classifier, based on the K-nearest-neighbors classification algorithm;
d. a Multi Layer Perceptron classifier.
The classifier is preferably the Multi Layer Perceptron classifier.
Compared with existing PE file packing detection methods, the present invention has the following main characteristics:
1. Compared with traditional signature-based PE file packing detection, the invention analyzes features of the PE file and therefore has better packing detection capability, a lower false-positive rate and a lower miss rate.
2. The PE file features of the invention are based on essential characteristics that the packing process itself cannot avoid, giving higher and more stable packing detection accuracy.
3. The packing detection rules derived by machine learning have higher accuracy.
Both the present invention and the earlier invention (CN 201010594433.6) use static feature values of PE files under Windows to perform packing detection. They differ as follows:
1. The static feature values used by the earlier invention are mainly based on format characteristics of the PE file. Such feature values are not necessarily produced by the packing process; they are merely characteristics that packed PE files happen to exhibit, and they are therefore unstable.
2. The static feature values of the present invention mostly capture characteristics that the packing process itself necessarily produces.
3. The static packing detection accuracy of the present invention is higher than that of the earlier invention.
4. The inventive content of the present invention is limited to the 4 static PE file feature values used for static packing detection.
Description of drawings
Fig. 1 is a schematic diagram of the application model of the invention.
Fig. 2 is a simplified structural diagram of the PE file format.
Embodiment
To detect whether a program is packed, signature-based packing detection tools have been developed. These tools have a low false-positive rate but a high miss rate, mainly because they can only recognize known packers: a virus author can produce a new packer by slightly modifying an existing one and thereby evade signature-based packing detection. To address this problem, the invention proposes a packing detection method based on static analysis of PE file features. The invention is not limited to analyzing format anomalies of the PE file; it performs packing detection using intrinsic file characteristics that the packing process itself cannot avoid. The method has a low false-positive rate and a low miss rate.
Fig. 1 shows the application model of the invention. For each PE file to be examined, the invention first performs static file analysis to extract 4 feature values of the file and then uses a PE file classifier to perform packing detection. A PE file detected as packed is unpacked with a universal unpacker and then scanned by signature-based antivirus software. A PE file detected as not packed skips the universal unpacker and goes directly to the signature-based antivirus scan.
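The application model can be sketched as a short routing function. This is an illustration under stated assumptions, not the patented implementation: `scan_file` and its four callable parameters are hypothetical names, and the feature extractor, classifier, unpacker and scanner are passed in because the text does not define their interfaces.

```python
from typing import Callable, Sequence


def scan_file(
    pe_bytes: bytes,
    extract_features: Callable[[bytes], Sequence[float]],
    is_packed: Callable[[Sequence[float]], bool],
    unpack: Callable[[bytes], bytes],
    signature_scan: Callable[[bytes], bool],
) -> bool:
    """Return True if the signature scanner flags the (possibly unpacked) file."""
    features = extract_features(pe_bytes)   # static analysis: the 4 feature values
    if is_packed(features):                 # classifier decision
        pe_bytes = unpack(pe_bytes)         # slow path: universal unpacker
    return signature_scan(pe_bytes)         # signature-based antivirus scan
```

The expensive unpacking step runs only on the branch where the classifier reports a packed file, which is where the time saving described above comes from.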
The invention is described in detail as follows:
(1) Overview of the PE file format
The PE file format is used by 32-bit and 64-bit Microsoft Windows operating systems. A PE file encapsulates the information the operating system loader needs, including the PE file header, code sections, data sections and so on. Fig. 2 shows a simplified structure of the PE file format:
The PE file header tells the operating system how to map the PE file into memory. The AddressOfEntryPoint field of the header is the entry point of program execution; the code section stores the executable code of the PE file; the data section stores the program's global data. By default, sections are aligned to 512 bytes, and each section header includes three fields of interest here: SizeOfRawData, VirtualSize and Characteristics. SizeOfRawData is the size of the section in the PE file; VirtualSize is the size of the section after it is loaded into memory; Characteristics holds the section's attributes. Normally a code section is marked readable, non-writable and executable, and the operating system uses these attributes to decide whether the section holds executable code. A data section is normally marked readable, writable and non-executable. Most PE files contain a code section named .text and a data section named .data.
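A minimal sketch of reading these section fields directly from the file bytes is given below. It follows the standard PE/COFF layout (the `e_lfanew` offset at 0x3C, a 20-byte COFF header, then 40-byte section headers after the optional header); the `Section` type and `parse_sections` function are names assumed for this illustration, and no validation beyond the two magic values is attempted.

```python
import struct
from typing import List, NamedTuple


class Section(NamedTuple):
    name: str
    virtual_size: int          # VirtualSize: size once loaded into memory
    virtual_address: int
    size_of_raw_data: int      # SizeOfRawData: size inside the file
    pointer_to_raw_data: int
    characteristics: int       # Characteristics: attribute flags


def parse_sections(pe: bytes) -> List[Section]:
    """Read the section table of a PE image held in memory."""
    e_lfanew = struct.unpack_from("<I", pe, 0x3C)[0]   # offset of "PE\0\0"
    if pe[:2] != b"MZ" or pe[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("not a PE file")
    # COFF header starts right after the 4-byte signature.
    num_sections = struct.unpack_from("<H", pe, e_lfanew + 6)[0]
    opt_header_size = struct.unpack_from("<H", pe, e_lfanew + 20)[0]
    table = e_lfanew + 24 + opt_header_size            # first section header
    sections = []
    for i in range(num_sections):
        (name, vsize, vaddr, raw_size, raw_ptr,
         _relocs, _lines, _nrelocs, _nlines, chars) = struct.unpack_from(
            "<8sIIIIIIHHI", pe, table + 40 * i)
        sections.append(Section(name.rstrip(b"\0").decode("latin-1"),
                                vsize, vaddr, raw_size, raw_ptr, chars))
    return sections
```

Called as `parse_sections(open(path, "rb").read())`, where `path` is a placeholder, it returns one `Section` per header; the features described below need only the name, SizeOfRawData and Characteristics fields plus the raw section bytes.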
(2) PE file feature values:
Four PE file feature values are used for packing detection:
1) The percentage of data contained in the code section:
A section is considered a code section if it has the executable attribute; otherwise it is a data section (a section without the executable attribute). If no section in the PE file has the executable attribute, the section that the PE file header field AddressOfEntryPoint points to is defined as the code section. In a PE file that is not packed, the code section consists mostly of instructions plus the very small amount of data that the compiler needs to generate. Because the code of program E is necessarily changed by encryption or compression, it is no longer executable instructions but data. Therefore, in the packed program E', if the encrypted code of E is stored in a code section, the percentage of that section occupied by data is markedly higher than in the code section of a PE file that is not packed. The percentage of data in the code section can thus be used to detect whether a PE file is packed.
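The rule for deciding which sections count as code sections can be sketched as below. The text does not say how bytes inside a code section are classified as data or instructions (that would normally require a disassembler), so `data_percentage` simply takes the two byte counts as inputs. `Section`, `code_sections` and `data_percentage` are hypothetical names; `IMAGE_SCN_MEM_EXECUTE` is the executable flag of the PE/COFF Characteristics field.

```python
from typing import List, NamedTuple

IMAGE_SCN_MEM_EXECUTE = 0x20000000  # "executable" bit of Characteristics


class Section(NamedTuple):
    name: str
    virtual_address: int
    virtual_size: int
    characteristics: int


def code_sections(sections: List[Section], entry_point_rva: int) -> List[Section]:
    """Sections with the executable attribute; if there are none, fall back
    to the section that AddressOfEntryPoint points into."""
    executable = [s for s in sections if s.characteristics & IMAGE_SCN_MEM_EXECUTE]
    if executable:
        return executable
    return [s for s in sections
            if s.virtual_address <= entry_point_rva < s.virtual_address + s.virtual_size]


def data_percentage(data_bytes: int, section_bytes: int) -> float:
    """Share of a code section occupied by data, in [0, 1].

    data_bytes must come from an external analysis (not shown here).
    """
    return data_bytes / section_bytes if section_bytes else 0.0
```

Every section not returned by `code_sections` is then treated as a data section for the second feature.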
2) The zero content of the data section:
Feature 1) applies when the encrypted code of program E is stored in a code section; this feature applies when the encrypted code is stored in a data section. Because E has been encrypted, its code looks very "random" and unstructured. Unencrypted code, by contrast, follows regular patterns (for example, an instruction contains an opcode and the memory addresses of its operands), and the data in an unencrypted data section is also structured. In addition, because sections must be aligned, the end of a PE data section is usually padded with zeros. For these reasons the zero content of all data sections of a PE file is a key indicator: if the zero content of a data section is too low, that section very likely contains encrypted code. The zero content of the data sections can thus be used to detect whether a PE file is packed.
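A minimal sketch of this feature, assuming "zero content" means the fraction of bytes equal to 0x00 (the range [0, 1] in Table 1 is consistent with that reading). The function name and the two sample byte strings are made up for illustration.

```python
def zero_fraction(section_data: bytes) -> float:
    """Fraction of bytes equal to 0x00 in a section's raw data, in [0, 1]."""
    if not section_data:
        return 0.0  # assumption: an empty section contributes no zeros
    return section_data.count(0) / len(section_data)


# A short global string followed by alignment padding up to 512 bytes.
plain_like = b"global\x00\x00" + bytes(504)
# Stand-in for encrypted content: 512 bytes, none of them zero.
packed_like = bytes((i * 37 + 11) % 255 + 1 for i in range(512))

print(zero_fraction(plain_like))    # close to 1: mostly padding
print(zero_fraction(packed_like))   # 0.0: no zero bytes at all
```

How an empty data section should be scored is not specified in the text; returning 0.0 here is a design choice of the sketch.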
3) The number of non-standard section names:
A PE file that is not packed usually consists of well-defined standard sections. For example, a PE file built with the Microsoft Visual C++ compiler usually contains at least a code section named .text and data sections named .data and .rsrc. The code and data sections of a packed PE file usually do not follow these naming conventions. For example, a PE file created by the UPX packer usually contains sections named .UPX0 and .UPX1, or a .rsrc section; .UPX0 and .UPX1 are not standard section names and can be used to detect that the file is packed. Besides UPX, many other packers also generate PE files with non-standard section names. The number of non-standard section names in a PE file can therefore be used to detect whether the file is packed.
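Counting non-standard names needs a reference list of standard ones, which the text does not enumerate beyond .text, .data and .rsrc. The sketch below therefore uses an assumed set; the extra entries are common compiler-generated names added only for illustration.

```python
from typing import Iterable

# Assumption: only .text, .data and .rsrc are named in the text. The other
# entries are common compiler-generated names added for illustration.
STANDARD_NAMES = {
    ".text", ".data", ".rsrc",
    ".rdata", ".bss", ".idata", ".edata", ".reloc", ".tls", ".pdata",
}


def count_nonstandard_names(section_names: Iterable[str]) -> int:
    """Number of section names that are not in the standard set."""
    return sum(1 for name in section_names if name not in STANDARD_NAMES)


print(count_nonstandard_names([".text", ".data", ".rsrc"]))    # typical compiler output
print(count_nonstandard_names([".UPX0", ".UPX1", ".rsrc"]))    # UPX example from the text
```

The choice of reference set directly affects this feature, so a real tool would need to fix it once and use the same set for training and detection.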
4) The number of sections whose length is zero:
Analysis of packer output shows that some packed programs contain sections whose SizeOfRawData field is zero. Suppose a packed program E' hides an encrypted program E. When E' runs, it first executes a decryption routine to decrypt E and then executes the decrypted E. To do this, the decrypted code of E must be stored in a section large enough to hold it. The packer therefore creates a section of length zero to receive the decrypted code of E; such a section takes no space in the file and is only given memory when the file is loaded. A PE file that is not packed, by contrast, has no need to create an empty section when it has no global data. The number of zero-length sections in a PE file can therefore be used to detect whether the file is packed.
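A minimal sketch of the fourth feature follows. The section tuples and both sample layouts are invented for illustration; only the test on SizeOfRawData comes from the text.

```python
from typing import Iterable, Tuple


def count_zero_length_sections(sections: Iterable[Tuple[str, int, int]]) -> int:
    """Count sections whose SizeOfRawData is 0.

    Each item is (name, size_of_raw_data, virtual_size).
    """
    return sum(1 for _name, raw_size, _virtual_size in sections if raw_size == 0)


# Made-up section tables for illustration.
plain_layout = [(".text", 0x1200, 0x11F0), (".data", 0x400, 0x3A0)]
packed_layout = [(".UPX0", 0, 0x8000), (".UPX1", 0x3000, 0x3000), (".rsrc", 0x200, 0x1C0)]

print(count_zero_length_sections(plain_layout))    # no empty sections
print(count_zero_length_sections(packed_layout))   # one section empty on disk
```

In the second layout the first section is empty in the file but has a non-zero VirtualSize, which matches the role described above: a placeholder that receives the decrypted code at run time.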
Specific procedure:
748 PE files were collected: 449 packed virus PE files and 299 normal PE files that are not packed. PEiD, currently the most widely used signature-based PE file packing detection tool, was run on the 449 packed PE files. The experiment shows that PEiD identified only 128 of them as packed and failed to flag the other 321, a miss rate of 71.49%.
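The miss rate quoted above follows directly from the file counts; a short check using only the numbers given in the text:

```python
packed_total = 449        # packed virus PE files in the collection
detected_by_peid = 128    # files PEiD identified as packed

missed = packed_total - detected_by_peid
miss_rate = missed / packed_total

print(missed)                 # files PEiD failed to flag
print(f"{miss_rate:.2%}")     # miss rate as a percentage
```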
A PE file analysis tool was developed to extract the 4 feature values of any PE file, as summarized in Table 1.
Table 1. Summary of the 4 PE file feature values
Feature value                                        Range
Percentage of data contained in the code section     [0, 1]
Zero content of the data section                     [0, 1]
Number of non-standard section names                 Integer greater than or equal to 0
Number of sections whose length is zero              Integer greater than or equal to 0
The tool was used to extract the feature values of the 748 test PE files, yielding a data set. The data set was divided into two parts: 1) a training set containing the feature values of the 299 normal PE files that are not packed and of 128 packed PE files (449 − 321 = 128); 2) a test set containing the feature values of the 321 packed PE files that PEiD failed to detect.
Using the free, open-source machine learning tool Weka, 4 different classifiers were selected:
(a) a Bayes classifier;
(b) Weka's J48 decision tree classifier, based on the C4.5 decision tree classification algorithm;
(c) Weka's IBk classifier, based on the K-nearest-neighbors (k-Nearest Neighbor, KNN) classification algorithm;
(d) a Multi Layer Perceptron (MLP) classifier.
Each classifier was first trained on the training set and then tested on the test set, and its accuracy on the test set was computed.
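The train-then-test procedure can be sketched in a few lines. The experiments in the text use Weka; the code below is instead a small from-scratch k-nearest-neighbors classifier in Python, in the spirit of the IBk option, and every feature vector in it is made-up toy data rather than a measurement from the 748 files. `knn_predict` and `accuracy` are names assumed for the sketch.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Majority label among the k training rows closest to query.

    train: list of (features, label); features follow the order of Table 1.
    """
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    votes = Counter(label for _features, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k=3):
    """Share of test rows whose predicted label matches the true label."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)


# Toy data (invented): (data % in code, zero content, non-standard names,
# zero-length sections).
train = [
    ((0.05, 0.40, 0, 0), "not packed"),
    ((0.10, 0.35, 0, 0), "not packed"),
    ((0.08, 0.50, 1, 0), "not packed"),
    ((0.90, 0.02, 2, 1), "packed"),
    ((0.85, 0.05, 3, 1), "packed"),
    ((0.95, 0.01, 2, 0), "packed"),
]
test = [((0.92, 0.03, 2, 1), "packed"), ((0.07, 0.45, 0, 0), "not packed")]

print(accuracy(train, test))
```

The features are used unscaled here, so the two count features weigh more than the two ratios in the distance; whether and how to normalize is a choice the text leaves open.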
To verify the effect of the invention, the program provided by Perdisci was downloaded from http://roberto.perdisci.googlepages.com/code and the above experiment was repeated with it.
Table 2 gives the test results of the 2 methods.
Table 2. Test results of the 4 classifiers
Method               Classifier used           Accuracy (%)
Present invention    Multi Layer Perceptron    96.88
Perdisci             Bayes classifier          73.21
Analysis of results:
The test results show that, of the 321 packed PE files in the test set that PEiD failed to detect, the invention correctly identifies more than 96% as packed when the Multi Layer Perceptron classifier is used, whereas the highest detection rate of the Perdisci method, obtained with the Bayes classifier, is 73.21%.

Claims (4)

1. A PE file packing detection method with stable static feature values, carried out in the following steps:
For each PE (Portable Executable) file to be examined, static file analysis is performed first to extract 4 feature values of the PE file; a PE file classifier is then used to perform packing detection;
The 4 feature values of the PE file are defined as follows:
1) the percentage of data contained in the code section;
2) the zero content of the data section;
3) the number of non-standard section names;
4) the number of sections whose length is zero;
The classifier is one of the following four:
a. a Bayes classifier;
b. Weka's J48 decision tree classifier, based on the C4.5 decision tree classification algorithm;
c. Weka's IBk classifier, based on the K-nearest-neighbors classification algorithm;
d. a Multi Layer Perceptron classifier.
2. The PE file packing detection method with stable static feature values according to claim 1, characterized in that the classifier is preferably the Multi Layer Perceptron classifier.
3. The PE file packing detection method with stable static feature values according to claim 1 or 2, characterized in that the classifier undergoes the following training step:
Based on a sufficient number of PE files, the 4 feature values of each PE file are extracted to obtain a training set that contains both the feature values of normal PE files that are not packed and the feature values of packed PE files; the training set is then used to train the classifier.
4. The PE file packing detection method with stable static feature values according to claim 3, characterized in that: the code section is a section that has the executable attribute; the data section is a section that does not have the executable attribute; if no section in the PE file has the executable attribute, the section that the PE file header field AddressOfEntryPoint points to is defined as the code section; a non-standard section is a section with a non-standard section name.
CN2013101610274A 2013-05-03 2013-05-03 PE file shell adding detecting method with stable static characteristic values Pending CN103279711A (en)


Publications (1)

Publication Number Publication Date
CN103279711A true CN103279711A (en) 2013-09-04


Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102024112A (en) * 2010-12-17 2011-04-20 四川大学 PE (portable executable) file pack detection method based on static characteristics
CN102034043A (en) * 2010-12-13 2011-04-27 四川大学 Novel file-static-structure-attribute-based malware detection method
CN102855440A (en) * 2012-09-13 2013-01-02 北京奇虎科技有限公司 Method, device and system for detecting packed executable files

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
赵跃华等: ""基于数据挖掘技术的加壳PE程序识别方法"", 《计算机应用》 *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104298919A (en) * 2014-09-30 2015-01-21 珠海市君天电子科技有限公司 Method and device for identifying whether PE file is resource file
CN104298919B (en) * 2014-09-30 2017-04-26 珠海市君天电子科技有限公司 Method and device for identifying whether PE file is resource file
CN104700000A (en) * 2015-03-05 2015-06-10 中国科学技术大学苏州研究院 Detecting and limiting method of covert channel based on PE file
CN104700000B (en) * 2015-03-05 2017-12-08 中国科学技术大学苏州研究院 A kind of detection of private communication channel based on PE files and method for limiting
CN105101092A (en) * 2015-09-01 2015-11-25 上海美慧软件有限公司 Mobile phone user travel mode recognition method based on C4.5 decision tree
CN106709336A (en) * 2015-11-18 2017-05-24 腾讯科技(深圳)有限公司 Method and apparatus for identifying malware
US10635812B2 (en) 2015-11-18 2020-04-28 Tencent Technology (Shenzhen) Company Limited Method and apparatus for identifying malicious software
CN106778226A (en) * 2016-11-24 2017-05-31 四川无声信息技术有限公司 Shell document hulling method and device
CN108710800A (en) * 2018-05-22 2018-10-26 国家计算机网络与信息安全管理中心 A kind of shell adding recognition methods of Android application program

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20130904