CN103425798A - Heuristic type behavioral parameter analysis algorithm - Google Patents


Publication number
CN103425798A
Authority
CN
China
Prior art keywords
weights
heuristic
formatted files
behavior
file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2013103918716A
Other languages
Chinese (zh)
Inventor
朱永强
江雪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CHENGDU WANGAN TECHNOLOGY DEVELOPMENT Co Ltd
Original Assignee
CHENGDU WANGAN TECHNOLOGY DEVELOPMENT Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CHENGDU WANGAN TECHNOLOGY DEVELOPMENT Co Ltd
Priority to CN2013103918716A
Publication of CN103425798A

Abstract

The invention discloses a heuristic behavioral parameter analysis algorithm. It mainly addresses the problem that, in the prior art, static heuristic methods produce a relatively high false alarm rate when judging risk and so cannot meet users' needs. The algorithm comprises the following steps: recursively traversing the files on disk and filtering out files that are not in PE format; establishing a rule base matched to the analysis engine, performing static heuristic analysis on every PE file with the analysis engine, and summing the malicious weights of all suspicious behaviors of the PE file; calculating a heuristic alarm threshold based on the Lyapunov central limit theorem; and judging whether the malicious weight of the PE file is higher than the alarm threshold, raising an alarm for the file if so, and otherwise continuing the traversal until all files have been traversed. With this scheme, scan sensitivity can be adjusted flexibly and the false alarm rate is reduced, giving the algorithm high practical and promotional value.

Description

Heuristic behavioral parameter analysis algorithm
Technical field
The present invention relates to a heuristic behavioral parameter analysis algorithm.
Background art
At present, security vendors generally detect and remove viruses using signature-based methods. Although this approach accurately identifies most prevalent viruses, a virus sample must first be collected and analyzed and its signature extracted, and new viruses can be handled only after the signature database has been updated. The approach therefore lags behind new threats. In addition, the virus signatures obtained by antivirus software are easily located and modified using antivirus-evasion techniques, which lets malware bypass signature recognition and evade detection by all kinds of antivirus software.
Heuristic antivirus technology is a detection method aimed at unknown viruses and evasion techniques, and is a form of behavior-based scanning. It focuses on the suspicious behaviors a file exhibits: each characteristic behavior is assigned a weight according to its degree of danger, the behavior weights are combined according to some algorithm, and a warning is raised for the file when the combined weight exceeds a certain threshold. Because this technique considers file behavior and does not need to extract and analyze code signatures, it can effectively resist unknown and new viruses, but it may produce false alarms for some normal files.
Heuristics are divided into dynamic heuristics and static heuristics. Static heuristics determine how dangerous a PE file is by analyzing its static PE structure. Compared with dynamic heuristics, static heuristics are simple to compute, use little memory, and perform well in real-time scanning, so they are widely used to detect unknown and disguised Trojans. However, because the evidence used for static heuristic danger warnings is relatively limited, their false alarm rate is higher than that of dynamic heuristics, and they increasingly fail to meet users' needs.
Summary of the invention
The object of the invention is to provide a heuristic behavioral parameter analysis algorithm that mainly solves the problem in the prior art that static heuristics have a relatively high false alarm rate when judging danger and cannot meet users' needs.
To achieve this object, the invention adopts the following technical solution:
A heuristic behavioral parameter analysis algorithm, comprising the following steps:
(1) Recursively traverse the files on disk and filter out files that are not in PE format;
(2) Establish a rule base matched to the analysis engine, perform static heuristic analysis on every PE file using the analysis engine, and sum the malicious weights of all suspicious behaviors of the PE file;
(3) Calculate the heuristic alarm threshold based on the Lyapunov central limit theorem;
(4) Judge whether the malicious weight of the PE file is higher than the alarm threshold; if it is, raise an alarm for the file and report the details; otherwise return to step (1), until the file traversal is complete.
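The traversal and filtering in steps (1) and (4) can be illustrated with a minimal Python sketch. It is not taken from the patent: the function names are hypothetical, `score_file` stands in for the analysis of step (2), and `threshold` stands in for the result of step (3). The PE check relies only on the standard layout of the format (an `MZ` DOS header whose field at offset 0x3C points to the `PE\0\0` signature).

```python
import os
import struct
from typing import Callable, Iterator, List, Tuple


def is_pe_file(path: str) -> bool:
    """Return True if the file has an MZ header pointing to a PE signature."""
    try:
        with open(path, "rb") as fh:
            header = fh.read(64)
            if len(header) < 64 or header[:2] != b"MZ":
                return False
            # e_lfanew (offset 0x3C in the DOS header) holds the PE signature offset.
            (pe_offset,) = struct.unpack_from("<I", header, 0x3C)
            fh.seek(pe_offset)
            return fh.read(4) == b"PE\x00\x00"
    except OSError:
        # Unreadable files are simply skipped.
        return False


def iter_pe_files(root: str) -> Iterator[str]:
    """Step (1): walk the directory tree recursively and keep only PE files."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if is_pe_file(path):
                yield path


def scan(root: str,
         score_file: Callable[[str], float],
         threshold: float) -> List[Tuple[str, float]]:
    """Step (4): report every PE file whose malicious weight exceeds the threshold.

    score_file is a hypothetical callback standing in for step (2);
    threshold is assumed to come from step (3).
    """
    alerts = []
    for path in iter_pe_files(root):
        weight = score_file(path)
        if weight > threshold:
            alerts.append((path, weight))
    return alerts
```

A caller would pass the directory to scan, a scoring function, and the chosen threshold; lowering the threshold makes the scan more sensitive at the cost of more false alarms.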
In step (2):
The analysis engine comprises a PE structure analysis engine, which performs static heuristic analysis on PE files using PE structure analysis rules, and a PE disassembly analysis engine, which performs static heuristic analysis on PE files using PE disassembly rules;
The rule base comprises two engine rule bases, matched respectively to the PE structure analysis engine and the PE disassembly analysis engine;
The malicious weight of a PE file is the sum of the analysis results of the PE structure analysis engine and the PE disassembly analysis engine.
Specifically, the malicious weight of a PE file is obtained by the following steps:
Let the weight range to be solved be [MIN, MAX], and let the probability difference of a behavior in the weight range be $L_K = P_K - Q_K$, where $P_K$ is the probability that an investigated behavior k occurs in viruses and $Q_K$ is the probability that the same behavior occurs in normal PE files; let the maximum probability difference over the weight range be $C_{max}$ and the minimum be $C_{min}$.
(2a) Determine the size of each weight interval according to the formula [formula image not reproduced in the text];
(2b) Determine the malicious weight corresponding to behavior k according to the formula [formula image not reproduced in the text]; applying this algorithm in turn determines the malicious weight of every behavior in the weight range.
Step (3) specifically comprises the following steps:
(3a) According to the Lyapunov central limit theorem, obtain the distribution of the overall weight of the virus sample space [formula image not reproduced in the text], where [formula image not reproduced in the text];
(3b) From [formula image not reproduced in the text], calculate the distribution function of $T_V$. If the false alarm rate is f, it satisfies [formula image not reproduced in the text]; find the minimum value satisfying this condition [formula image not reproduced in the text], so that [formula image not reproduced in the text], and hence the optimal alarm threshold at false alarm rate f is [formula image not reproduced in the text].
Compared with the prior art, the invention has the following beneficial effects:
(1) Through dual static heuristic analysis of the PE structure and the PE disassembly, together with optimization of the weight-solving algorithm and the threshold-solving algorithm, the invention effectively improves the sensitivity, accuracy and flexibility of virus detection. Scan sensitivity and false alarm rate can be tuned flexibly by adjusting the alarm threshold. The invention is widely applicable, meets the needs of technical development, has prominent substantive features and represents notable progress, and is suitable for large-scale application.
Brief description of the drawings
Fig. 1 is a schematic overall flow chart of the invention.
Detailed description
The invention is further described below with reference to the drawing and an embodiment; embodiments of the invention include, but are not limited to, the following embodiment.
Embodiment
The key factors determining the detection rate and false alarm rate of static heuristics are the behaviors captured, the weight of each behavior, and the overall warning threshold. To meet practical needs, the invention provides a static heuristic behavioral parameter analysis algorithm. It performs dual static heuristic analysis of the PE structure and the PE disassembly and, drawing on mathematical statistics, focuses on optimizing the weight-solving algorithm for each behavior and the threshold-solving algorithm. These are used to determine the weights of malicious behaviors and to solve for the optimal alarm threshold under a given requirement, greatly reducing the false alarm rate.
The principle of static heuristic analysis based on the PE structure is as follows:
PE is the mainstream executable file format on Windows: all 32-bit and 64-bit Windows executables are in PE format, for example DLL, EXE, FON, OCX, LIB and some SYS files, and virus files in a Windows environment are also PE files. PE files have a strict offset structure and hierarchy of attributes. However, some malicious evasion techniques or packing methods alter the normal PE file structure. For example, the common junk-instruction evasion technique inserts a large number of obfuscating junk instructions into the PE code to disturb the file's signature; this raises the information entropy of the .code section of the PE structure, so monitoring the entropy of the .code section makes it possible to judge whether the PE file shows this suspicious junk-instruction behavior. As another example, PE files load system functions through the import table, but some malicious programs empty the import table to hide their intent, and this behavior can also be found by analyzing the PE file. Static heuristic analysis based on the PE structure is thus a means of predicting the danger of a PE file by specifically analyzing the traces left by virus evasion behavior.
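The entropy check mentioned above can be illustrated with a short Python sketch of Shannon entropy over the raw bytes of a section. The function names and the cut-off of 7.0 bits per byte are assumptions made for illustration; the patent does not state how the entropy is evaluated or which limit is used, and extracting the section bytes from the PE file is left out.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def entropy_is_suspicious(section_bytes: bytes, limit: float = 7.0) -> bool:
    """Flag a section whose entropy exceeds the (hypothetical) limit."""
    return shannon_entropy(section_bytes) > limit
```

Ordinary compiled code tends to have noticeably lower entropy than packed, encrypted, or heavily obfuscated data, which is why a simple limit can serve as one heuristic rule among several rather than as a verdict on its own.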
The principle of static heuristic analysis based on PE disassembly is as follows:
Besides leaving traces in the PE structure, virus evasion techniques also affect the disassembled PE program, and some evasion techniques are applied directly at the assembly level. A disassembly engine is therefore used to restore the PE file to assembly code, and heuristic analysis is then performed on the disassembled program to discover malicious behavior and locate suspicious files. For example, the initial instructions of an ordinary application typically check whether the command line has parameters, clear the screen, and save the original screen display, whereas the initial instructions of a virus program are typically sequences such as direct disk writes, decoding instructions, or searches for executable programs under some path. The maliciousness of a PE file can therefore be judged by analyzing its disassembled code, revealing evasive and unknown viruses.
As shown in Fig. 1, based on the above principles, the invention uses dual engines and a matching two-part rule base to perform heuristic analysis of the PE structure and of the PE disassembly respectively, records the suspicious behaviors that the PE file matches, and finally sums the behaviors to calculate the danger weight of the file. Once the danger weight of the file exceeds the alarm threshold, a warning is raised.
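The dual-engine scoring described here can be sketched as two rule lists whose matched weights are added together. Everything concrete in this sketch is hypothetical: the rule names, the weights, and the `features` dictionary (assumed to have been produced already by a PE parser and a disassembler) are illustrations, not the patent's rule base.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Rule:
    name: str                          # label of the suspicious behavior
    weight: int                        # malicious weight (hypothetical value)
    matches: Callable[[Dict], bool]    # predicate over extracted features


# Rule base part 1: rules for the PE structure analysis engine.
STRUCTURE_RULES = [
    Rule("high_code_entropy", 20, lambda f: f.get("code_entropy", 0.0) > 7.0),
    Rule("empty_import_table", 15, lambda f: f.get("import_count", 1) == 0),
]

# Rule base part 2: rules for the PE disassembly analysis engine.
DISASM_RULES = [
    Rule("direct_disk_write_at_entry", 25,
         lambda f: "disk_write" in f.get("entry_ops", [])),
    Rule("decoder_loop_at_entry", 18,
         lambda f: "decode" in f.get("entry_ops", [])),
]


def score(features: Dict) -> Tuple[int, List[str]]:
    """Sum the weights of all matched rules from both engines."""
    matched = [r for r in STRUCTURE_RULES + DISASM_RULES if r.matches(features)]
    return sum(r.weight for r in matched), [r.name for r in matched]


def should_alert(features: Dict, threshold: float) -> bool:
    """Raise a warning once the summed weight exceeds the alarm threshold."""
    total, _names = score(features)
    return total > threshold
```

Returning the matched rule names alongside the total makes it possible to report the details of an alarm, as the method requires.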
Traditional heuristics, when solving for behavior weights, generally consider only the behavioral characteristics of virus samples and allocate weights according to how often an investigated behavior occurs in the virus sample space. In contrast, the invention considers both the detection rate and the false alarm rate of the heuristic: by comprehensively analyzing the statistical characteristics of behaviors in both the normal PE sample space and the virus space, it allocates heuristic behavior weights using the probability difference of the same behavior in the two spaces.
In the invention, the specific algorithm for solving the weight of each PE file behavior is as follows:
The difference between the probability $P_K$ that an investigated behavior k occurs in viruses and the probability $Q_K$ that it occurs in normal PE files is defined as the probability difference $L_K$.
Let the number of investigated behavior points in the weight range be N, the maximum probability difference over the weight range be $C_{max}$ and the minimum be $C_{min}$; let the maximum of the weight range be MAX and the minimum be MIN. For example, the weight range may be defined as [0, 36], so that each weight takes an integer value from 0 to 36, with MIN = 0 and MAX = 36.
Determine the size of each weight interval: [formula not reproduced in the text] (using the Gauss, i.e. integer-part, function);
Behavior weight allocation algorithm: [formula image not reproduced in the text] (using the Gauss, i.e. integer-part, function);
$B_K$ is the malicious weight corresponding to behavior k; applying this algorithm in turn determines the malicious weight of each heuristically detected behavior.
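Because the two formulas are not reproduced in the text, the following Python sketch shows only one plausible reading, stated as an assumption: the probability differences are mapped linearly from $[C_{min}, C_{max}]$ onto the integer weight range [MIN, MAX] and rounded down with the integer-part function. The function name and parameter names are hypothetical.

```python
import math
from typing import Dict


def assign_weights(p_virus: Dict[str, float],
                   q_normal: Dict[str, float],
                   w_min: int = 0,
                   w_max: int = 36) -> Dict[str, int]:
    """Map each behavior's probability difference to an integer weight.

    Assumed reading (the patent's formula images are missing):
        L_K = P_K - Q_K
        B_K = MIN + floor((L_K - C_min) / (C_max - C_min) * (MAX - MIN))
    """
    diffs = {k: p_virus[k] - q_normal[k] for k in p_virus}
    c_min, c_max = min(diffs.values()), max(diffs.values())
    if c_max == c_min:
        # All behaviors discriminate equally; give them the minimum weight.
        return {k: w_min for k in diffs}
    span = w_max - w_min
    return {
        k: w_min + math.floor((d - c_min) / (c_max - c_min) * span)
        for k, d in diffs.items()
    }
```

Under this reading, the behavior that separates viruses from normal files most strongly receives MAX and the weakest receives MIN. The source also mentions a lower probability-difference threshold below which a behavior is treated as non-viral; that filter is not modeled here.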
In the invention, the specific algorithm for solving the threshold is as follows:
To meet practical application needs, the invention calculates the alarm threshold based on the Lyapunov central limit theorem. In practice, the alarm threshold can be determined according to different user requirements, such as a maximum false alarm rate, a minimum detection rate, or a maximum detection rate. This embodiment illustrates the case in which the maximum false alarm rate is fixed at f.
By the Lyapunov central limit theorem, the overall weight of the virus sample space follows [formula image not reproduced in the text], where N denotes the Gaussian (normal) distribution;
If [formula image not reproduced in the text], then [formula image not reproduced in the text].
The derivation of [formula image not reproduced in the text] above is as follows:
Let the weight of a behavior k in the complete behavior set A be $B_K$, let the corresponding statistical variables in the virus space and the normal file space be $A_K$ and $A'_K$ respectively, and let the corresponding behavior occurrence rates be $P_K$ and $Q_K$ respectively. In addition, let the lower threshold of the probability difference be $K_{min}$: a behavior whose probability difference is below this lower threshold is regarded as non-viral behavior. The weight range is [MIN, MAX]. It follows that:
The probability distribution of behavior $A_K$ is given in Table 1.
Table 1
Weight value                                   $B_K$    0
Occurrence rate of $A_K$ in viruses            $P_K$    $1-P_K$
Occurrence rate of $A_K$ in normal PE files    $Q_K$    $1-Q_K$
[formula image not reproduced in the text]
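Table 1 describes two-point random variables that take the value $B_K$ with probability $P_K$ (or $Q_K$) and the value 0 otherwise. As a worked step, and not necessarily the content of the omitted formula images, their expectation and variance follow directly from that distribution:

```latex
\begin{aligned}
E[A_K]  &= B_K P_K, & D[A_K]  &= B_K^2 P_K - (B_K P_K)^2 = B_K^2 P_K (1 - P_K),\\
E[A'_K] &= B_K Q_K, & D[A'_K] &= B_K^2 Q_K - (B_K Q_K)^2 = B_K^2 Q_K (1 - Q_K).
\end{aligned}
```

These per-behavior moments are what a central limit theorem argument needs in order to approximate the total weight of a file by a normal distribution.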
 
Similarly: [formula image not reproduced in the text]
It follows that [formula image not reproduced in the text].
Hence the inequality that satisfies a maximum false alarm rate of f is [formula image not reproduced in the text]; the minimum value satisfying this condition can be obtained, and after transformation we have [formula image not reproduced in the text]; that is, the optimal danger warning threshold at false alarm rate f is [formula image not reproduced in the text].
In the formulas of the invention, "~" means that a variable follows the given distribution.
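Since the formula images are missing, the threshold computation can only be sketched under stated assumptions. The Python sketch below assumes that behaviors occur independently, that the total weight of a normal PE file is approximately normal with mean $\sum_K B_K Q_K$ and variance $\sum_K B_K^2 Q_K (1 - Q_K)$, and that the threshold is the smallest value whose upper-tail probability under that distribution is at most f. The function name is hypothetical, and the patent's own expression may differ (for instance, it refers to the distribution function of $T_V$).

```python
import math
from statistics import NormalDist
from typing import Sequence


def alarm_threshold(weights: Sequence[float],
                    q_normal: Sequence[float],
                    f: float) -> float:
    """Smallest threshold whose approximate false alarm rate is at most f.

    Assumption: the summed weight of a normal file is approximately
    N(mean, var) with mean = sum(B_K * Q_K) and
    var = sum(B_K**2 * Q_K * (1 - Q_K)), behaviors being independent.
    """
    if not 0.0 < f < 1.0:
        raise ValueError("f must lie strictly between 0 and 1")
    mean = sum(b * q for b, q in zip(weights, q_normal))
    var = sum(b * b * q * (1.0 - q) for b, q in zip(weights, q_normal))
    # P(T > theta) <= f  is equivalent to  theta >= mean + z_(1-f) * sigma.
    z = NormalDist().inv_cdf(1.0 - f)
    return mean + z * math.sqrt(var)
```

Lowering f raises the threshold and so trades detection rate for fewer false alarms, which matches the adjustable sensitivity the method describes. With only a handful of behaviors the normal approximation is rough, so a threshold obtained this way would need checking against real samples.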
The invention is implemented as follows:
(1) Perform static heuristic analysis on the PE files in the system based on the PE structure and on the PE disassembled code respectively, record the suspicious behaviors matched by each PE file in each analysis, aggregate the recorded suspicious behaviors, and calculate the malicious weight of the PE file;
(2) Adjust the danger warning threshold according to actual needs, so as to adjust the scan sensitivity;
(3) Judge whether the malicious weight of the PE file exceeds the configured danger warning threshold; if it does, raise an alarm and report the details.
Because the invention mainly targets PE files, non-PE files are excluded when malicious weights are judged.
The invention can be well implemented according to the above embodiment.

Claims (4)

1. A heuristic behavioral parameter analysis algorithm, characterized by comprising the following steps:
(1) Recursively traverse the files on disk and filter out files that are not in PE format;
(2) Establish a rule base matched to the analysis engine, perform static heuristic analysis on every PE file using the analysis engine, and sum the malicious weights of all suspicious behaviors of the PE file;
(3) Calculate the heuristic alarm threshold based on the Lyapunov central limit theorem;
(4) Judge whether the malicious weight of the PE file is higher than the alarm threshold; if it is, raise an alarm for the file and report the details; otherwise return to step (1), until the file traversal is complete.
2. The heuristic behavioral parameter analysis algorithm according to claim 1, characterized in that, in step (2):
The analysis engine comprises a PE structure analysis engine, which performs static heuristic analysis on PE files using PE structure analysis rules, and a PE disassembly analysis engine, which performs static heuristic analysis on PE files using PE disassembly rules;
The rule base comprises two engine rule bases, matched respectively to the PE structure analysis engine and the PE disassembly analysis engine;
The malicious weight of a PE file is the sum of the analysis results of the PE structure analysis engine and the PE disassembly analysis engine.
3. The heuristic behavioral parameter analysis algorithm according to claim 2, characterized in that the malicious weight of the PE file is obtained by the following steps:
Let the weight range to be solved be [MIN, MAX], and let the probability difference of a behavior in the weight range be $L_K = P_K - Q_K$, where $P_K$ is the probability that an investigated behavior k occurs in viruses and $Q_K$ is the probability that the same behavior occurs in normal PE files; let the maximum probability difference over the weight range be $C_{max}$ and the minimum be $C_{min}$.
(2a) Determine the size of each weight interval according to the formula [formula image not reproduced in the text];
(2b) Determine the malicious weight corresponding to behavior k according to the formula [formula image not reproduced in the text]; applying this algorithm in turn determines the malicious weight of every behavior in the weight range.
4. The heuristic behavioral parameter analysis algorithm according to claim 3, characterized in that step (3) specifically comprises the following steps:
(3a) According to the Lyapunov central limit theorem, obtain the distribution of the overall weight of the virus sample space [formula not reproduced in the text], where [formula image not reproduced in the text];
(3b) From [formula not reproduced in the text], calculate the distribution function of $T_V$. If the false alarm rate is f, it satisfies [formula image not reproduced in the text]; find the minimum value satisfying this condition, and hence the optimal alarm threshold at false alarm rate f is [formula image not reproduced in the text].
CN2013103918716A 2013-09-02 2013-09-02 Heuristic type behavioral parameter analysis algorithm Pending CN103425798A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2013103918716A CN103425798A (en) 2013-09-02 2013-09-02 Heuristic type behavioral parameter analysis algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2013103918716A CN103425798A (en) 2013-09-02 2013-09-02 Heuristic type behavioral parameter analysis algorithm

Publications (1)

Publication Number Publication Date
CN103425798A (en) 2013-12-04

Family

ID=49650535

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2013103918716A Pending CN103425798A (en) 2013-09-02 2013-09-02 Heuristic type behavioral parameter analysis algorithm

Country Status (1)

Country Link
CN (1) CN103425798A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030065926A1 (en) * 2001-07-30 2003-04-03 Schultz Matthew G. System and methods for detection of new malicious executables
CN1801030A (en) * 2004-12-31 2006-07-12 福建东方微点信息安全有限责任公司 Method for distinguishing baleful program behavior
CN101859269A (en) * 2009-04-01 2010-10-13 埃森哲环球服务有限公司 The system that is used for monitoring technology assembly efficiency

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
杨建召等: "基于主机的入侵检测系统设计与实现", 《长春工业大学学报》, vol. 26, no. 4, 31 December 2005 (2005-12-31), pages 311 *
谭云松: "一种启发式反病毒技术的研究", 《网络安全技术与应用》, 30 November 2006 (2006-11-30), pages 56 - 57 *

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20131204
