CN101645125B - Method for filtering and monitoring behavior of program - Google Patents

Publication number
CN101645125B
Authority
CN
China
Prior art keywords
behavior
sample
program
samples
weight
Application number
CN 200810030001
Other languages
Chinese (zh)
Other versions
CN101645125A (en
Inventor
黄声声
Original Assignee
珠海金山软件有限公司
Application filed by 珠海金山软件有限公司
Priority to CN 200810030001
Publication of CN101645125A
Application granted
Publication of CN101645125B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems

Abstract

The invention relates to methods for filtering and monitoring the behavior of a program. The filtering method comprises the following steps: constructing a behavior sample library that contains behavior samples collected from a number of program samples, together with a weight for each kind of behavior sample calculated from that sample's frequency of occurrence, where the weight may be an inverse document frequency index, a probability of occurrence, or the like; acquiring a behavior of the program to be processed and judging whether the behavior sample library contains a behavior sample identical to that behavior; if it does not, keeping the program behavior; if it does, judging whether the weight of that behavior sample falls into a preset filtering threshold range, and if so filtering out the program behavior, otherwise keeping it. The method reduces the interference that non-characteristic behaviors cause to monitoring or analysis, reduces the processing load, and improves accuracy.

Description

Method for filtering and monitoring the behavior of a program

Technical Field

[0001] The present invention relates to the field of computer security and, more particularly, to methods for filtering and monitoring the behavior of a program.

Background Art

[0002] Intercepting and monitoring program behavior is a technique that security software commonly uses to defend against viruses. In practice, security products that rely on non-signature detection usually identify suspicious programs (such as viruses and Trojans) by monitoring and analyzing program behavior. For example, a program's behavior, including file read/write operations and registry read/write operations, can be intercepted and monitored at specific interception points (for example, calls to system resources), and the type of the program (virus, Trojan, system program, and so on) can then be judged from that behavior.

[0003] In statistical language processing, some common function words such as adverbs and conjunctions (for example the Chinese words "的", "得", and "中") are used so widely that they appear in the vast majority of documents, so they contribute essentially nothing to text classification. In statistical linguistics such words are called "stop words". Stop words are often removed during text classification so that they do not affect processing.

[0004] Similarly, program behaviors can be divided into two types: behaviors that are significant for classification (also called "characteristic behaviors") and behaviors that are not (also called "non-characteristic behaviors"). For example, some behaviors are used by the vast majority of programs, or are used frequently by most programs; such behaviors carry no meaning for classification or analysis and are non-characteristic behaviors. Identifying these non-characteristic behaviors during processing and removing them before classification or analysis effectively reduces their interference with the classification of program samples (for example, treating such a non-characteristic behavior as a virus characteristic could cause serious false positives).

[0005] One existing method of monitoring program behavior listens to all behaviors of the monitored program and analyzes and monitors all of them. The drawbacks of this approach are a large data-processing volume, high complexity, and a relatively high error rate (for example, treating a non-characteristic behavior as a virus characteristic is likely to cause serious false positives).

[0006] Another existing method of monitoring program behavior first identifies and filters out such non-characteristic behaviors manually and then analyzes the remaining behaviors. This method requires a great deal of labor, is costly, yields monitoring results that are not sufficiently stable or accurate, and is difficult to deploy widely.

[0007] Summary of the Invention

[0008] One object of the present invention is to provide two methods for filtering the behavior of a program. Both methods filter out a program's non-characteristic behaviors before the program's behavior is monitored or analyzed, so as to reduce the interference of non-characteristic behaviors with monitoring or analysis, reduce the computer's processing load, and improve the accuracy of monitoring and analysis.

[0009] To this end, the first method for filtering the behavior of a program provided by the present invention comprises the following steps. Step S1: construct a behavior sample library that contains behavior samples collected from a number of program samples and, for each kind of behavior sample, a weight calculated from the frequency of occurrence of that kind of behavior sample. Step S2: acquire a program behavior to be processed and judge whether the behavior sample library contains a behavior sample identical to the program behavior; if it does not, keep the program behavior; if it does, judge whether the weight of that behavior sample falls within a preset filtering threshold range, filter out the program behavior if it does, and otherwise keep the program behavior.

[0010] Compared with the prior art, the present invention filters out non-characteristic behaviors, using the behavior samples in the behavior sample library and the preset filtering threshold range, before the program's behavior is monitored or analyzed. This reduces the interference of non-characteristic behaviors with monitoring or analysis, lowers the computer's processing load, and improves the accuracy of monitoring and analysis.

[0011] In the behavior sample library, the frequency of occurrence of each kind of behavior sample is either the ratio of the number of program samples in which that kind of behavior sample occurs to the total number of program samples, or the ratio of the number of times that kind of behavior sample occurs across all program samples to the total number of behavior samples contained in all program samples. The weight of a behavior sample is its frequency of occurrence. The step of judging whether the weight of a behavior sample falls within the preset filtering threshold range is specifically: if the frequency of occurrence of the behavior sample is greater than a preset lower filtering threshold, the weight is judged to fall within the preset filtering threshold range. In this preferred scheme, whether a program behavior is a non-characteristic behavior to be filtered out is judged by its frequency of occurrence: behaviors that occur too frequently usually have no significance for classification or analysis, so this scheme filters them out using the preset lower filtering threshold. The scheme is simple, computationally inexpensive, and easy to implement.

[0012] The second method for filtering the behavior of a program provided by the present invention differs from the first in the following respects. In the behavior sample library, the frequency of occurrence of each kind of behavior sample is either the ratio of the number of program samples in which that kind of behavior sample occurs to the total number of program samples, or the ratio of the number of times that kind of behavior sample occurs across all program samples to the total number of behavior samples contained in all program samples. The weight of a behavior sample is its inverse document frequency (IDF) index, which equals the logarithm of the reciprocal of its frequency of occurrence. The step of judging whether the weight of a behavior sample falls within the preset filtering threshold range is specifically: if the IDF index of the behavior sample is less than a preset upper filtering threshold, the weight is judged to fall within the preset filtering threshold range. In this preferred scheme, whether a behavior is a non-characteristic behavior to be filtered out is judged by its IDF index; in statistics, the IDF index is a widely accepted and important measure of relevance and value. Behaviors whose IDF index is too small usually have no significance for classification or analysis, so this scheme filters them out using the preset upper filtering threshold. Using the IDF index to identify and filter out non-characteristic behaviors works better and gives more reliable filtering results.
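The relationship between the frequency test of paragraph [0011] and the IDF test of paragraph [0012] can be illustrated with a minimal Python sketch. The patent does not state the base of the logarithm, so the natural logarithm used here is an assumption, and the names `idf`, `is_filtered_by_idf`, and the 25% threshold are hypothetical values chosen only for illustration.

```python
import math


def idf(frequency: float) -> float:
    """IDF index: logarithm of the reciprocal of the frequency of occurrence.

    The natural logarithm is an assumption; the source does not fix the base.
    """
    if not 0.0 < frequency <= 1.0:
        raise ValueError("frequency must lie in (0, 1]")
    return math.log(1.0 / frequency)


def is_filtered_by_idf(frequency: float, idf_upper_limit: float) -> bool:
    """A behavior falls in the filtering range when its IDF is below the upper limit."""
    return idf(frequency) < idf_upper_limit


# Hypothetical threshold: filter behaviors whose frequency exceeds 25%.
FREQ_LOWER_LIMIT = 0.25
IDF_UPPER_LIMIT = idf(FREQ_LOWER_LIMIT)

for f in (0.30, 0.05):
    print(f, round(idf(f), 3), is_filtered_by_idf(f, IDF_UPPER_LIMIT))
```

Because the logarithm is monotonic, an upper limit on the IDF index corresponds to a lower limit on the frequency of occurrence, which is why the two methods mirror each other.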

[0013] Preferably, in both the first and the second filtering method, the behavior sample library further includes the total number of program samples and the total number of behavior samples, and the method further comprises updating the behavior sample library. The updating comprises: if, in step S2, the behavior sample library contains no behavior sample identical to the program behavior, then after step S2 the program behavior is added to the library as a new behavior sample, the library's total number of program samples and total number of behavior samples are updated, and the weight of every kind of behavior sample is recalculated. In this preferred scheme the behavior samples are updated promptly according to what is currently being processed, so that the library becomes broader, more complete, and more accurate, which further improves filtering accuracy.

[0014] Preferably, the updating further comprises: if, in step S2, the behavior sample library contains a behavior sample identical to the program behavior, then after step S2 the library's total number of program samples and total number of behavior samples are updated, and the weight of every kind of behavior sample is recalculated. Likewise, in this preferred scheme the behavior samples are updated promptly according to what is currently being processed, so that the library becomes broader, more complete, and more accurate, which further improves filtering accuracy.
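The update described in paragraphs [0013] and [0014] might be sketched as follows in Python. This is an illustrative assumption, not the patent's implementation: the class name `BehaviorLibrary`, its fields, and the use of string labels for behaviors are all hypothetical. Weights are derived from the stored counts on demand, which is one simple way to keep every weight consistent with the updated totals.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class BehaviorLibrary:
    """Hypothetical behavior sample library with running totals."""
    counts: Counter = field(default_factory=Counter)  # occurrences per behavior
    total_behaviors: int = 0   # total number of behavior samples
    total_programs: int = 0    # total number of program samples

    def weight(self, behavior: str) -> float:
        """Frequency-of-occurrence weight, computed from the current totals."""
        if self.total_behaviors == 0:
            return 0.0
        return self.counts[behavior] / self.total_behaviors

    def update(self, program_behaviors) -> None:
        """Fold one processed program's behaviors into the library.

        Unseen behaviors become new behavior samples; behaviors already
        present only have their counts and the totals increased.
        """
        self.total_programs += 1
        for behavior in program_behaviors:
            self.counts[behavior] += 1
            self.total_behaviors += 1


library = BehaviorLibrary()
library.update(["read_file", "write_registry", "read_file"])
library.update(["open_socket"])
print(library.weight("read_file"))  # 2 occurrences out of 4 behavior samples
```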

[0015] In another aspect, a further object of the present invention is to provide two methods for monitoring the behavior of a program. Both methods can filter out a program's non-characteristic behaviors so as to reduce the interference of non-characteristic behaviors with monitoring or analysis, reduce the computer's processing load, and improve the accuracy of monitoring and analysis.

[0016] To this end, the first method for monitoring the behavior of a program provided by the present invention comprises step S0, collecting the program behaviors of the monitored program, and step S4, analyzing and monitoring the program behaviors, and further comprises the following steps between step S0 and step S4. Step S1: construct a behavior sample library that contains behavior samples collected from a number of program samples and, for each kind of behavior sample, a weight calculated from the frequency of occurrence of that kind of behavior sample. Step S2: acquire a program behavior of the monitored program and judge whether the behavior sample library contains a behavior sample identical to the program behavior; if it does not, keep the program behavior; if it does, judge whether the weight of that behavior sample falls within a preset filtering threshold range, filter out the program behavior if it does, and otherwise keep the program behavior.
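The ordering of steps S0, S2, and S4 can be sketched in Python as a small pipeline. Everything named here is a hypothetical stand-in: `monitor_program`, the `weights` dictionary (representing a library already built in step S1), the `in_filter_range` predicate, and the `analyze` callback are assumptions for illustration, and the source does not prescribe any of them.

```python
from typing import Callable, Dict, Iterable, List


def monitor_program(
    collected: Iterable[str],
    weights: Dict[str, float],
    in_filter_range: Callable[[float], bool],
    analyze: Callable[[List[str]], None],
) -> List[str]:
    """Filter collected behaviors (S2) before handing them to analysis (S4)."""
    kept: List[str] = []
    for behavior in collected:            # behaviors gathered in step S0
        weight = weights.get(behavior)    # look up the sample built in step S1
        if weight is None or not in_filter_range(weight):
            kept.append(behavior)         # unknown or outside the range: keep
        # otherwise the behavior is treated as non-characteristic and dropped
    analyze(kept)                         # step S4 sees only the kept behaviors
    return kept


# Hypothetical weights (frequencies) and a frequency lower limit of 0.5.
weights = {"read_file": 0.6, "inject_thread": 0.01}
analyzed: List[str] = []
kept = monitor_program(
    ["read_file", "inject_thread", "unknown_behavior"],
    weights,
    lambda w: w > 0.5,
    analyzed.extend,
)
print(kept)
```

Passing the range test as a predicate lets the same pipeline serve both variants: a frequency lower limit for the first monitoring method or an IDF upper limit for the second.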

[0017] Similarly, compared with the prior art, the monitoring method provided by the present invention checks the program's behaviors against the behavior samples in the behavior sample library and the preset filtering threshold range before the behaviors are monitored or analyzed, and filters out the non-characteristic behaviors. This reduces the interference of non-characteristic behaviors with monitoring or analysis, lowers the computer's processing load, and improves the accuracy of monitoring and analysis.

[0018] In the behavior sample library, the frequency of occurrence of each kind of behavior sample is either the ratio of the number of program samples in which that kind of behavior sample occurs to the total number of program samples, or the ratio of the number of times that kind of behavior sample occurs across all program samples to the total number of behavior samples contained in all program samples. The weight of a behavior sample is its frequency of occurrence. The step of judging whether the weight of a behavior sample falls within the preset filtering threshold range is specifically: if the frequency of occurrence of the behavior sample is greater than a preset lower filtering threshold, the weight is judged to fall within the preset filtering threshold range. In this preferred scheme, whether a behavior is a "non-characteristic behavior" to be filtered out is judged by its frequency of occurrence: behaviors that occur too frequently usually have no significance for classification or analysis, so this scheme filters them out using the preset lower filtering threshold. The scheme is simple, computationally inexpensive, and easy to implement.

[0019] The second method for monitoring the behavior of a program provided by the present invention differs from the first in the following respects. In the behavior sample library, the frequency of occurrence of each kind of behavior sample is either the ratio of the number of program samples in which that kind of behavior sample occurs to the total number of program samples, or the ratio of the number of times that kind of behavior sample occurs across all program samples to the total number of behavior samples contained in all program samples. The weight of a behavior sample is its inverse document frequency (IDF) index, which equals the logarithm of the reciprocal of its frequency of occurrence. The step of judging whether the weight of a behavior sample falls within the preset filtering threshold range is specifically: if the IDF index of the behavior sample is less than a preset upper filtering threshold, the weight is judged to fall within the preset filtering threshold range. In this preferred scheme, whether a behavior is a non-characteristic behavior to be filtered out is judged by its IDF index; in statistics, the IDF index is a widely accepted and important measure of relevance and value. Behaviors whose IDF index is too small usually have no significance for classification or analysis, so this scheme filters them out using the preset upper filtering threshold. Using the IDF index to identify and filter out non-characteristic behaviors works better and gives more reliable filtering results.

[0020] Preferably, in both the first and the second monitoring method, the behavior sample library further includes the total number of program samples and the total number of behavior samples, and the method further comprises updating the behavior sample library. The updating comprises: if, in step S2, the behavior sample library contains no behavior sample identical to the program behavior, then after step S2 the program behavior is added to the library as a new behavior sample, the library's total number of program samples and total number of behavior samples are updated, and the weight of every kind of behavior sample is recalculated. In this preferred scheme the behavior samples are updated promptly according to what is currently being processed, so that the library becomes broader, more complete, and more accurate, which further improves filtering accuracy.

[0021] Preferably, the updating further comprises: if, in step S2, the behavior sample library contains a behavior sample identical to the program behavior, then after step S2 the library's total number of program samples and total number of behavior samples are updated, and the weight of every kind of behavior sample is recalculated. Likewise, in this preferred scheme the behavior samples are updated promptly according to what is currently being processed, so that the library becomes broader, more complete, and more accurate, which further improves filtering accuracy.

[0022] Brief Description of the Drawings

[0023] FIG. 1 is a flowchart of constructing the behavior sample library in one embodiment of the present invention;

[0024] FIG. 2 is a flowchart of filtering program behavior using the behavior sample library shown in FIG. 1;

[0025] FIG. 3 is a flowchart of constructing the behavior sample library in another embodiment of the present invention;

[0026] FIG. 4 is a flowchart of filtering program behavior using the behavior sample library shown in FIG. 3.

[0027] Detailed Description

[0028] The present invention relates to methods for monitoring or analyzing program behavior, and in particular to methods for filtering out a program's non-characteristic behaviors before its behavior is monitored or analyzed. Practicing the invention reduces the interference of non-characteristic behaviors with monitoring or analysis, lowers the computer's processing load, and improves the accuracy of monitoring and analysis.

[0029] To this end, a behavior sample library is first constructed. The library contains behavior samples collected from a number of program samples and, for each kind of behavior sample, a weight calculated from the frequency of occurrence of that kind of behavior sample. The weight of a behavior sample expresses the value, relevance, or importance of that behavior. The weight may be, but is not limited to, the frequency of occurrence, a probability of occurrence estimated from the frequency of occurrence, or the inverse document frequency index. Further, the frequency of occurrence of a behavior sample may be the ratio of the number of program samples in which that kind of behavior sample occurs to the total number of program samples. For example, if behavior samples were collected from 100 program samples while constructing the library and behavior sample A occurred in 30 of those program samples, the frequency of occurrence of behavior sample A is 30/100 = 30%. Alternatively, the frequency of occurrence may be the ratio of the number of times that kind of behavior sample occurs across all program samples to the total number of behavior samples contained in all program samples. In the example above, if the 100 program samples contain 9000 behavior samples in total and behavior sample A occurs 2500 times, the frequency of occurrence of behavior sample A is 2500/9000 ≈ 27.8%.
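The two alternative definitions of frequency of occurrence in paragraph [0029] can be compared with a short Python sketch. The function names and the toy data are hypothetical; each program sample is modeled, as an assumption, as a list of behavior labels in which a label may repeat.

```python
from typing import List, Sequence


def program_frequency(programs: Sequence[List[str]], behavior: str) -> float:
    """Share of program samples in which the behavior occurs at least once."""
    return sum(1 for p in programs if behavior in p) / len(programs)


def occurrence_frequency(programs: Sequence[List[str]], behavior: str) -> float:
    """Share of all collected behavior samples that are this behavior."""
    total = sum(len(p) for p in programs)
    hits = sum(p.count(behavior) for p in programs)
    return hits / total


# Toy data: four program samples containing ten behavior samples in total.
programs = [["A", "A", "B"], ["B", "C"], ["A", "C", "C", "D"], ["D"]]
print(program_frequency(programs, "A"))     # A occurs in 2 of 4 programs
print(occurrence_frequency(programs, "A"))  # A accounts for 3 of 10 samples

# The figures quoted in paragraph [0029]:
print(30 / 100, round(2500 / 9000, 3))
```

The two definitions generally give different values for the same behavior, so a filtering threshold tuned for one should not be assumed to suit the other.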

[0030] Once constructed, the behavior sample library can be used to filter program behavior. Specifically, a program behavior to be processed is acquired, and it is judged whether the behavior sample library contains a behavior sample identical to the program behavior. If it does not, the program behavior is kept. If it does, it is judged whether the weight of that behavior sample falls within the preset filtering threshold range; if so, the program behavior is filtered out, and otherwise it is kept.

[0031] The present invention is described in more detail below with reference to the accompanying drawings.

[0032] Embodiment 1

[0033] FIG. 1 is a flowchart of constructing the behavior sample library in one embodiment of the present invention, and FIG. 2 is a flowchart of filtering program behavior using the behavior sample library shown in FIG. 1.

[0034] As shown in FIG. 1, after start step S100, in step S102 the behaviors of a large number of program samples are collected to obtain a large number of behavior samples, and the total number D of collected behavior samples is recorded. According to statistical principles, the larger the sample, the closer the resulting statistics are to the true values. Therefore, when constructing the behavior sample library, it is preferable to collect behavior samples from as many program samples as possible. Those skilled in the art will appreciate that, with existing techniques, the behaviors of a large number of program samples, such as file read/write operations and registry read/write operations, can be collected by setting interception points and similar means.

[0035] Next, in step S104, the number of occurrences Dwi of each behavior sample is calculated, where Dwi denotes the number of times the i-th kind of behavior sample occurs in the behavior sample library. Dwi is simply the number of behavior samples in the library that are identical to the i-th kind of behavior sample.

[0036] Then, in step S106, the frequency of occurrence fi of each behavior sample is calculated, where fi denotes the frequency with which the i-th kind of behavior sample occurs in the behavior sample library. The frequency fi of the i-th kind of behavior sample equals the ratio of its number of occurrences Dwi to the total number D of behavior samples in the library, that is, fi = Dwi/D. As noted above, the frequency of occurrence fi is one way of characterizing a behavior sample and indicates that sample's relevance, importance, and so on. Clearly 0 < fi ≤ 1, and a larger fi indicates a higher frequency or probability of occurrence of that kind of behavior sample. Although this embodiment takes as the frequency of occurrence the ratio of the number of times a behavior sample occurs across all program samples to the total number of behavior samples contained in all program samples, the ratio of the number of program samples in which the behavior sample occurs to the total number of program samples may also be used as its frequency of occurrence.

[0037] After the frequency of occurrence fi has been calculated for every behavior sample, the total number D of behavior samples, the number of occurrences Dwi of each behavior sample, and the frequencies of occurrence are saved, which completes construction of the behavior sample library, as shown in step S108.
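Steps S102 to S108 of FIG. 1 can be sketched in Python as below. The function name `build_behavior_library` and the dictionary layout with keys "D", "Dw", and "f" are hypothetical choices made for illustration; the source only states which quantities are calculated and saved.

```python
from collections import Counter
from typing import Dict, Iterable, List


def build_behavior_library(program_samples: Iterable[List[str]]) -> Dict[str, object]:
    """Build a library holding D, each Dwi, and each fi = Dwi / D."""
    # S102: collect the behavior samples of all program samples and record D.
    behaviors = [b for program in program_samples for b in program]
    total = len(behaviors)
    # S104: count the occurrences Dwi of each kind of behavior sample.
    counts = Counter(behaviors)
    # S106: compute the frequency of occurrence fi = Dwi / D.
    frequencies = {b: n / total for b, n in counts.items()}
    # S108: save D, Dwi, and fi together as the behavior sample library.
    return {"D": total, "Dw": dict(counts), "f": frequencies}


library = build_behavior_library([
    ["read_file", "write_registry"],
    ["read_file", "open_socket"],
])
print(library["D"], library["Dw"]["read_file"], library["f"]["read_file"])
```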

[0038] Next, as shown in FIG. 2, in actual use, after start step S200, the program behavior to be processed is collected or read in step S201. Again, those skilled in the art will appreciate that, with existing techniques, program behaviors such as file read/write operations and registry read/write operations can be collected by setting interception points and similar means.

[0039] Next, in step S202, it is determined whether a behavior sample identical to the program behavior exists in the behavior sample database. If not, the program behavior is either a new program behavior or one that occurs infrequently, and it is not a non-characteristic behavior. The program behavior is therefore retained so that it can be processed (for example, listened to, analyzed, or monitored) in subsequent steps, as shown in step S205.

[0040] Conversely, if step S202 finds that a behavior sample identical to the program behavior exists in the behavior sample database, the occurrence frequency of that identical behavior sample is read, as in step S203.

[0041] After step S203, step S204 determines whether the occurrence frequency falls within the preset filtering threshold range. As noted above, the more frequent a program behavior is, the more likely it is to be a non-characteristic behavior. Therefore, if the occurrence frequency of a program behavior is greater than the preset lower filtering threshold, the program behavior can be treated as a non-characteristic behavior and filtered out, as shown in step S206. The subsequent processing flow then no longer needs to analyze, listen to, or monitor that program behavior, which reduces the later processing load, reduces the interference of such non-characteristic behaviors with monitoring or analysis, and improves the accuracy of monitoring and analysis.

[0042] In contrast, if step S204 finds that the occurrence frequency of the program behavior does not fall within the preset filtering threshold range, that is, if the occurrence frequency is less than the preset lower filtering threshold, the program behavior occurs infrequently and is not a non-characteristic behavior. The flow therefore proceeds to step S205, in which the program behavior is retained so that it can be processed (for example, listened to, analyzed, or monitored) in subsequent steps.

[0043] Steps S205 and S206 both end at step S207, at which point the entire filtering flow is complete.
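The flow of steps S201 to S207 can be sketched roughly as below. The function name, the dictionary representation of the behavior sample database, the behavior labels, and the threshold value 0.5 are all assumptions made for illustration; the patent does not prescribe a data structure or a threshold.

```python
def filter_by_frequency(behaviors, sample_db, lower_threshold):
    """Return the behaviors that are retained for later analysis or monitoring."""
    kept = []
    for behavior in behaviors:
        freq = sample_db.get(behavior)   # S202/S203: look up the identical sample
        if freq is None:
            kept.append(behavior)        # S205: not in the database, retain
        elif freq > lower_threshold:
            continue                     # S204 -> S206: non-characteristic, filter out
        else:
            kept.append(behavior)        # S204 -> S205: infrequent, retain
    return kept


# Hypothetical database mapping each behavior sample to its occurrence frequency.
sample_db = {"read_file": 0.6, "write_registry": 0.2}
observed = ["read_file", "write_registry", "inject_thread"]
kept = filter_by_frequency(observed, sample_db, lower_threshold=0.5)
```

With these values, "read_file" is filtered out because its frequency exceeds the threshold, while "write_registry" (infrequent) and "inject_thread" (not in the database) are retained.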

[0044] In this embodiment, the occurrence frequency is used to decide whether a behavior is a non-characteristic behavior that should be filtered out. If a program behavior is a non-characteristic behavior, it is filtered out, which reduces the subsequent processing load and improves the accuracy of subsequent processing. This scheme is simple, computationally inexpensive, and easy to implement.

[0045] Second Embodiment

[0046] FIG. 3 is a flowchart of constructing a behavior sample database in another embodiment of the present invention; FIG. 4 is a flowchart of filtering the behavior of a program using the behavior sample database of FIG. 3.

[0047] The flow for constructing the behavior sample database shown in FIG. 3 is largely the same as the construction flow shown in FIG. 1. More specifically, steps S300 to S304 in FIG. 3 are the same as steps S100 to S104 in FIG. 1: respectively, the start step, collecting a large number of behavior samples and recording their total number D, and calculating the number of occurrences Dwi of each kind of behavior sample.

[0048] Next, in step S306, the inverse document frequency index (IDF) of each kind of behavior sample is calculated. As noted above, the inverse document frequency index is a widely accepted measure of relevance and value. The inverse document frequency index IDF(i) of the i-th kind of behavior sample equals the logarithm of the reciprocal of the occurrence frequency of that kind of behavior sample in the behavior sample database, that is:

$$\mathrm{IDF}(i) = \log\left(\frac{D}{D_{wi}}\right)$$

where D is the total number of behavior samples in the behavior sample database and Dwi is the number of times the i-th kind of behavior has appeared in the behavior sample database. Clearly, the IDF(i) of a kind of behavior sample varies inversely with its occurrence frequency (Dwi/D): the more frequently the i-th kind of behavior sample appears, the smaller its IDF(i), whose minimum value is 0 (reached when Dwi = D). Conversely, the more rarely the i-th kind of behavior sample appears, the higher its IDF(i). Therefore, when IDF(i) is below a preset filtering threshold, the behavior sample can be regarded as a non-characteristic behavior and can be filtered out.
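As a small numerical illustration of the IDF definition, the sketch below computes IDF(i) from Dwi and D. The patent does not state the base of the logarithm; the natural logarithm is assumed here, and the function name and the sample counts are hypothetical.

```python
import math


def inverse_document_frequency(count, total):
    """IDF(i) = log(D / Dwi), where count is Dwi and total is D.

    The natural logarithm is an assumption; the source leaves the base open.
    """
    if not 0 < count <= total:
        raise ValueError("expected 0 < count <= total")
    return math.log(total / count)


# Hypothetical counts in a database holding D = 1000 behavior samples.
idf_ubiquitous = inverse_document_frequency(1000, 1000)  # behavior is every sample
idf_common = inverse_document_frequency(500, 1000)
idf_rare = inverse_document_frequency(10, 1000)
```

The ordering idf_ubiquitous < idf_common < idf_rare reflects the statement above that frequent behaviors receive small IDF values, and a different logarithm base would only rescale the threshold.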

[0049] Once the behavior sample database has been constructed, it can be used to identify and evaluate the behavior of a program, as shown in FIG. 4.

[0050] Steps S400 to S407 in FIG. 4 are essentially the same as steps S200 to S207 in FIG. 2; the differences lie in steps S403 and S404. Specifically, step S403 reads the IDF value of the behavior sample in the behavior sample database that is identical to the program behavior to be processed. In step S404, if the IDF value is less than the preset upper filtering threshold, the IDF value falls within the preset filtering threshold range; accordingly, the program behavior is a non-characteristic behavior and can be filtered out (step S406). Otherwise, the flow proceeds from step S404 to step S405, that is, the program behavior is retained for subsequent processing (analysis, listening, or monitoring).
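The decision made in steps S402 to S406 can be expressed as a single predicate, sketched below. The function name, the dictionary of IDF values, the behavior labels, and the threshold of 1.0 are illustrative assumptions rather than values taken from the patent.

```python
def should_filter(behavior, idf_table, upper_threshold):
    """True if the behavior is non-characteristic under the IDF criterion."""
    idf = idf_table.get(behavior)    # S402/S403: look up the identical sample's IDF
    if idf is None:
        return False                 # S405: not in the database, retain
    return idf < upper_threshold     # S404: below the upper threshold -> S406


# Hypothetical IDF values stored in the behavior sample database.
idf_table = {"read_file": 0.1, "write_registry": 2.3}
observed = ["read_file", "write_registry", "inject_thread"]
kept = [b for b in observed if not should_filter(b, idf_table, upper_threshold=1.0)]
```

Note that the comparison direction is the reverse of the frequency-based embodiment: a low IDF, not a high one, marks a behavior for filtering.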

[0051] In the scheme of this embodiment, the inverse document frequency index is used to decide whether a behavior is a non-characteristic behavior that should be filtered out. In the field of statistics, the inverse document frequency index is a widely accepted measure of relevance and value. Behaviors whose inverse document frequency index is too small are usually non-characteristic behaviors with no significance for classification or analysis, so this preferred scheme filters them out according to a preset upper filtering threshold. Using the inverse document frequency index to identify and filter out non-characteristic behaviors works better and gives more reliable filtering results.

[0052] The present invention has been described above with reference to the accompanying drawings. It should be appreciated that the present invention can be used not only to filter out non-characteristic behaviors but also to monitor programs, for example in security software. Specifically, after the security software has obtained the behaviors of a monitored program using existing techniques, it can apply the filtering method described above to filter out the non-characteristic behaviors among them, and then monitor the remaining program behaviors using existing monitoring methods. Compared with the prior art, the method for monitoring the behavior of a program provided by the present invention compares the program's behaviors against the behavior samples in the behavior sample database and the preset filtering threshold range before monitoring or analyzing them, and filters out non-characteristic behaviors. This reduces the interference of non-characteristic behaviors with monitoring or analysis, lowers the processing load on the computer, and improves the accuracy of monitoring and analysis.
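The collect, filter, then monitor arrangement described in this paragraph might be wired together as in the sketch below. The three callables are hypothetical stand-ins: the patent relies on existing techniques for collection and monitoring and does not define these interfaces.

```python
def monitor(collect, is_non_characteristic, analyze):
    """Run collect -> filter -> analyze and return the analysis results."""
    results = []
    for behavior in collect():                # obtain the monitored program's behaviors
        if is_non_characteristic(behavior):   # filtering step applied before monitoring
            continue
        results.append(analyze(behavior))     # existing analysis/monitoring step
    return results


# Hypothetical stand-ins for the collection, filtering, and analysis components.
common_behaviors = {"read_file"}
alerts = monitor(
    collect=lambda: ["read_file", "inject_thread"],
    is_non_characteristic=lambda b: b in common_behaviors,
    analyze=lambda b: f"inspect:{b}",
)
```

Because the filter runs before the analysis step, the analyzer only ever sees the behaviors that were retained, which is where the claimed reduction in processing load comes from.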

[0053] As an improvement on the embodiments above, the behavior sample database may also be updated periodically or in real time. To support such updates, the behavior sample database should store information such as the total number of program samples and the total number D of behavior samples. In practice, for example, if step S202 in FIG. 2 finds that no behavior sample identical to the program behavior exists in the behavior sample database, the program behavior may be added to the behavior sample database as a new behavior sample and the occurrence frequency of each kind of behavior sample recalculated. As another example, if step S402 in FIG. 4 finds that no behavior sample identical to the program behavior exists in the behavior sample database, then after the flow ends the program behavior may be added to the behavior sample database as a new behavior sample, the total number D of behavior samples updated, and the inverse document frequency index IDF of each kind of behavior sample recalculated. Updating the behavior samples promptly in this way makes the content of the behavior sample database broader, more comprehensive, and more accurate, which further improves the accuracy of filtering.

[0054] Similarly, if step S202 in FIG. 2 finds that a behavior sample identical to the program behavior exists in the behavior sample database, then after the flow ends, the total number D of behavior samples and the occurrence frequency of the identical behavior sample may be updated, and the occurrence frequency of each kind of behavior sample recalculated. Likewise, if step S402 in FIG. 4 finds that a behavior sample identical to the program behavior exists in the behavior sample database, then after the flow ends, the total number D of behavior samples and the inverse document frequency index IDF of the identical behavior sample may be updated. In this preferred scheme, the behavior samples are updated promptly according to what is currently being processed, which makes the content of the behavior sample database broader, more comprehensive, and more accurate, and further improves the accuracy of filtering.
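One possible way to organize the update described in the two preceding paragraphs is sketched below. The class and method names are hypothetical, and the design choice of storing only Dwi and D and deriving each weight on demand (so that every weight is effectively "recalculated" whenever D changes) is an assumption, not something the patent specifies. The natural logarithm is likewise assumed.

```python
import math
from collections import Counter


class BehaviorSampleDatabase:
    """Stores Dwi per behavior and D; weights are derived when requested."""

    def __init__(self):
        self.counts = Counter()  # Dwi for each kind of behavior sample
        self.total = 0           # D: total number of behavior samples

    def update(self, behavior):
        """Record one processed behavior, whether new or already present."""
        self.counts[behavior] += 1
        self.total += 1

    def frequency(self, behavior):
        if behavior not in self.counts:
            raise KeyError(behavior)
        return self.counts[behavior] / self.total

    def idf(self, behavior):
        if behavior not in self.counts:
            raise KeyError(behavior)
        return math.log(self.total / self.counts[behavior])


db = BehaviorSampleDatabase()
for b in ["read_file", "read_file", "open_socket"]:
    db.update(b)
db.update("inject_thread")  # a behavior that was not yet in the database
```

After these four updates D = 4, so "read_file" has frequency 2/4 and "inject_thread" has IDF log(4); a real implementation would also need to track the total number of program samples if the per-program frequency definition is used.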

[0055] The embodiments of the present invention described above do not limit the scope of protection of the present invention. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the claims of the present invention.

Claims (12)

1. A method for filtering the behavior of a program, characterized by comprising the following steps: step S1, constructing a behavior sample database, the behavior sample database comprising behavior samples collected from a plurality of program samples and, for each kind of behavior sample, a weight calculated based on the occurrence frequency of that kind of behavior sample; step S2, acquiring a program behavior to be processed and determining whether a behavior sample identical to the program behavior exists in the behavior sample database; if no behavior sample identical to the program behavior exists in the behavior sample database, retaining the program behavior; if a behavior sample identical to the program behavior exists in the behavior sample database, determining whether the weight of that behavior sample falls within a preset filtering threshold range, and if it does, filtering out the program behavior, otherwise retaining the program behavior; wherein, in the behavior sample database, the occurrence frequency of each kind of behavior sample is either the ratio of the number of program samples in which that kind of behavior sample appears to the total number of program samples, or the ratio of the number of occurrences of that kind of behavior sample in all program samples to the total number of behavior samples contained in all program samples; the weight of a behavior sample is the occurrence frequency of that kind of behavior sample; and the step of determining whether the weight of the behavior sample falls within the preset filtering threshold range is specifically: if the occurrence frequency of the behavior sample is greater than a preset lower filtering threshold, determining that the weight falls within the preset filtering threshold range.
2. The method for filtering the behavior of a program according to claim 1, characterized in that: the behavior sample database further comprises the total number of all program samples and the total number of all behavior samples; the method further comprises updating the behavior sample database, and updating the behavior sample database comprises: if, in step S2, no behavior sample identical to the program behavior exists in the behavior sample database, then after step S2, adding the program behavior to the behavior sample database as a new behavior sample, updating the total number of program samples and the total number of behavior samples of the behavior sample database, and recalculating the weight of each kind of behavior sample.
3. The method for filtering the behavior of a program according to claim 2, characterized in that updating the behavior sample database further comprises: if, in step S2, a behavior sample identical to the program behavior exists in the behavior sample database, then after step S2, updating the total number of program samples and the total number of behavior samples of the behavior sample database, and recalculating the weight of each kind of behavior sample.
4. A method for filtering the behavior of a program, characterized by comprising the following steps: step S1, constructing a behavior sample database, the behavior sample database comprising behavior samples collected from a plurality of program samples and, for each kind of behavior sample, a weight calculated based on the occurrence frequency of that kind of behavior sample; step S2, acquiring a program behavior to be processed and determining whether a behavior sample identical to the program behavior exists in the behavior sample database; if no behavior sample identical to the program behavior exists in the behavior sample database, retaining the program behavior; if a behavior sample identical to the program behavior exists in the behavior sample database, determining whether the weight of that behavior sample falls within a preset filtering threshold range, and if it does, filtering out the program behavior, otherwise retaining the program behavior; wherein, in the behavior sample database, the occurrence frequency of each kind of behavior sample is either the ratio of the number of program samples in which that kind of behavior sample appears to the total number of program samples, or the ratio of the number of occurrences of that kind of behavior sample in all program samples to the total number of behavior samples contained in all program samples; the weight of a behavior sample is the inverse document frequency index of that kind of behavior sample, the inverse document frequency index of a behavior sample being equal to the logarithm of the reciprocal of the occurrence frequency of that kind of behavior sample; and the step of determining whether the weight of the behavior sample falls within the preset filtering threshold range is specifically: if the inverse document frequency index of the behavior sample is less than a preset upper filtering threshold, determining that the weight falls within the preset filtering threshold range.
5. The method for filtering the behavior of a program according to claim 4, characterized in that: the behavior sample database further comprises the total number of all program samples and the total number of all behavior samples; the method further comprises updating the behavior sample database, and updating the behavior sample database comprises: if, in step S2, no behavior sample identical to the program behavior exists in the behavior sample database, then after step S2, adding the program behavior to the behavior sample database as a new behavior sample, updating the total number of program samples and the total number of behavior samples of the behavior sample database, and recalculating the weight of each kind of behavior sample.
6. The method for filtering the behavior of a program according to claim 5, characterized in that updating the behavior sample database further comprises: if, in step S2, a behavior sample identical to the program behavior exists in the behavior sample database, then after step S2, updating the total number of program samples and the total number of behavior samples of the behavior sample database, and recalculating the weight of each kind of behavior sample.
7. A method for monitoring the behavior of a program, comprising: step S0: collecting the program behaviors of a monitored program; step S4: analyzing and monitoring the program behaviors; characterized in that, between step S0 and step S4, the method further comprises the following steps: step S1, constructing a behavior sample database, the behavior sample database comprising behavior samples collected from a plurality of program samples and, for each kind of behavior sample, a weight calculated based on the occurrence frequency of that kind of behavior sample; step S2, acquiring a program behavior of the monitored program and determining whether a behavior sample identical to the program behavior exists in the behavior sample database; if no behavior sample identical to the program behavior exists in the behavior sample database, retaining the program behavior; if a behavior sample identical to the program behavior exists in the behavior sample database, determining whether the weight of that behavior sample falls within a preset filtering threshold range, and if it does, filtering out the program behavior, otherwise retaining the program behavior; wherein, in the behavior sample database, the occurrence frequency of each kind of behavior sample is either the ratio of the number of program samples in which that kind of behavior sample appears to the total number of program samples, or the ratio of the number of occurrences of that kind of behavior sample in all program samples to the total number of behavior samples contained in all program samples; the weight of a behavior sample is the occurrence frequency of that kind of behavior sample; and the step of determining whether the weight of the behavior sample falls within the preset filtering threshold range is specifically: if the occurrence frequency of the behavior sample is greater than a preset lower filtering threshold, determining that the weight falls within the preset filtering threshold range.
8. The method for monitoring the behavior of a program according to claim 7, characterized in that: the behavior sample database further comprises the total number of all program samples and the total number of all behavior samples; the method further comprises updating the behavior sample database, and the updating comprises: if, in step S2, no behavior sample identical to the program behavior exists in the behavior sample database, then after step S2, adding the program behavior to the behavior sample database as a new behavior sample, updating the total number of program samples and the total number of behavior samples of the behavior sample database, and recalculating the weight of each kind of behavior sample.
9. The method for monitoring the behavior of a program according to claim 8, characterized in that the updating further comprises: if, in step S2, a behavior sample identical to the program behavior exists in the behavior sample database, then after step S2, updating the total number of program samples and the total number of behavior samples of the behavior sample database, and recalculating the weight of each kind of behavior sample.
10. A method for monitoring the behavior of a program, comprising: step S0: collecting the program behaviors of a monitored program; step S4: analyzing and monitoring the program behaviors; characterized in that, between step S0 and step S4, the method further comprises the following steps: step S1, constructing a behavior sample database, the behavior sample database comprising behavior samples collected from a plurality of program samples and, for each kind of behavior sample, a weight calculated based on the occurrence frequency of that kind of behavior sample; step S2, acquiring a program behavior of the monitored program and determining whether a behavior sample identical to the program behavior exists in the behavior sample database; if no behavior sample identical to the program behavior exists in the behavior sample database, retaining the program behavior; if a behavior sample identical to the program behavior exists in the behavior sample database, determining whether the weight of that behavior sample falls within a preset filtering threshold range, and if it does, filtering out the program behavior, otherwise retaining the program behavior; wherein, in the behavior sample database, the occurrence frequency of each kind of behavior sample is either the ratio of the number of program samples in which that kind of behavior sample appears to the total number of program samples, or the ratio of the number of occurrences of that kind of behavior sample in all program samples to the total number of behavior samples contained in all program samples; the weight of a behavior sample is the inverse document frequency index of that kind of behavior sample, the inverse document frequency index of a behavior sample being equal to the logarithm of the reciprocal of the occurrence frequency of that kind of behavior sample; and the step of determining whether the weight of the behavior sample falls within the preset filtering threshold range is specifically: if the inverse document frequency index of the behavior sample is less than a preset upper filtering threshold, determining that the weight falls within the preset filtering threshold range.
11. The method for monitoring the behavior of a program according to claim 10, characterized in that: the behavior sample database further comprises the total number of all program samples and the total number of all behavior samples; the method further comprises updating the behavior sample database, and the updating comprises: if, in step S2, no behavior sample identical to the program behavior exists in the behavior sample database, then after step S2, adding the program behavior to the behavior sample database as a new behavior sample, updating the total number of program samples and the total number of behavior samples of the behavior sample database, and recalculating the weight of each kind of behavior sample.
12. The method for monitoring the behavior of a program according to claim 11, characterized in that the updating further comprises: if, in step S2, a behavior sample identical to the program behavior exists in the behavior sample database, then after step S2, updating the total number of program samples and the total number of behavior samples of the behavior sample database, and recalculating the weight of each kind of behavior sample.
CN 200810030001 2008-08-05 2008-08-05 Method for filtering and monitoring behavior of program CN101645125B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 200810030001 CN101645125B (en) 2008-08-05 2008-08-05 Method for filtering and monitoring behavior of program

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN 200810030001 CN101645125B (en) 2008-08-05 2008-08-05 Method for filtering and monitoring behavior of program
PCT/CN2009/000871 WO2010015145A1 (en) 2008-08-05 2009-08-04 Method and system for filtering and monitoring program behaviors
JP2011521424A JP5370486B2 (en) 2008-08-05 2009-08-04 Method and system for filtering and monitoring program behavior

Publications (2)

Publication Number Publication Date
CN101645125A CN101645125A (en) 2010-02-10
CN101645125B true CN101645125B (en) 2011-07-20

Family

ID=41657008

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 200810030001 CN101645125B (en) 2008-08-05 2008-08-05 Method for filtering and monitoring behavior of program

Country Status (3)

Country Link
JP (1) JP5370486B2 (en)
CN (1) CN101645125B (en)
WO (1) WO2010015145A1 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101923617B (en) * 2010-08-18 2013-03-20 北京奇虎科技有限公司 Cloud-based sample database dynamic maintaining method
CN103106366B (en) * 2010-08-18 2016-05-04 北京奇虎科技有限公司 A kind of sample database dynamic maintaining method based on cloud
CN101984450B (en) * 2010-12-15 2012-10-24 北京安天电子设备有限公司 Malicious code detection method and system
CN102831153B (en) * 2012-06-28 2015-09-30 北京奇虎科技有限公司 A kind of method and apparatus choosing sample
CN103902894B (en) * 2012-12-24 2017-12-22 珠海市君天电子科技有限公司 Virus defense method and system based on user behavior differentiation

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1421771A (en) 2001-11-27 2003-06-04 四川安盟科技有限责任公司 Guard system to defend network invansion of unkown attack trick effectively
CN1859398A (en) 2006-01-05 2006-11-08 珠海金山软件股份有限公司 System and method for reverse network fishing

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH09171460A (en) * 1995-12-20 1997-06-30 Hitachi Ltd Diagnostic system for computer
JP3844193B2 (en) * 2001-01-24 2006-11-08 Kddi株式会社 Information automatic filtering method, information automatic filtering system, and information automatic filtering program
JP3992136B2 (en) * 2001-12-17 2007-10-17 学校法人金沢工業大学 Virus detection method and apparatus
GB2400197B (en) * 2003-04-03 2006-04-12 Messagelabs Ltd System for and method of detecting malware in macros and executable scripts
CA2545916C (en) * 2003-11-12 2015-03-17 The Trustees Of Columbia University In The City Of New York Apparatus method and medium for detecting payload anomaly using n-gram distribution of normal data
US20050262058A1 (en) * 2004-05-24 2005-11-24 Microsoft Corporation Query to task mapping
US8516583B2 (en) * 2005-03-31 2013-08-20 Microsoft Corporation Aggregating the knowledge base of computer systems to proactively protect a computer from malware
US20100169484A1 (en) * 2005-12-15 2010-07-01 Keiichi Okamoto Unauthorized Communication Program Regulation System and Associated Program

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
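JP Unexamined Patent Publication (Kokai) No. 2004-62416A, 2004.02.26
Wang Bin. Research on Bayesian-based anomaly detection of Windows registry access. Modern Electronics Technique (《现代电子技术》), 2007. See page 86, paragraph 2 through page 88, paragraph 3; Tables 1-2; FIG. 1.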

Also Published As

Publication number Publication date
JP2011530121A (en) 2011-12-15
CN101645125A (en) 2010-02-10
WO2010015145A1 (en) 2010-02-11
JP5370486B2 (en) 2013-12-18

Similar Documents

Publication Publication Date Title
US9516046B2 (en) Analyzing a group of values extracted from events of machine data relative to a population statistic for those values
US6742128B1 (en) Threat assessment orchestrator system and method
US9965630B2 (en) Method and apparatus for anti-virus scanning of file system
EP2472425B1 (en) System and method for detecting unknown malware
US8056136B1 (en) System and method for detection of malware and management of malware-related information
US20130167236A1 (en) Method and system for automatically generating virus descriptions
US9996693B2 (en) Automated malware signature generation
US20100192222A1 (en) Malware detection using multiple classifiers
CN101751535B (en) Data loss protection through application data access classification
US8667583B2 (en) Collecting and analyzing malware data
US8214905B1 (en) System and method for dynamically allocating computing resources for processing security information
US20080127346A1 (en) System and method of detecting anomaly malicious code by using process behavior prediction technique
US7809670B2 (en) Classification of malware using clustering that orders events in accordance with the time of occurance
JP2004318552A (en) Device, method and program for supporting ids log analysis
US10019573B2 (en) System and method for detecting executable machine instructions in a data stream
US6952776B1 (en) Method and apparatus for increasing virus detection speed using a database
CN101350054B (en) Method and apparatus for automatically protecting computer noxious program
EP2566130B1 (en) Automatic analysis of security related incidents in computer networks
US8191149B2 (en) System and method for predicting cyber threat
Ye et al. CIMDS: adapting postprocessing techniques of associative classification for malware detection
US8037536B2 (en) Risk scoring system for the prevention of malware
US20060074621A1 (en) Apparatus and method for prioritized grouping of data representing events
US9419996B2 (en) Detection and prevention for malicious threats
US8375450B1 (en) Zero day malware scanner
US9781144B1 (en) Determining duplicate objects for malware analysis using environmental/context information

Legal Events

Date Code Title Description
C06 Publication
C10 Request of examination as to substance
C14 Granted
C41 Transfer of the right of patent application or the patent right
COR Bibliographic change or correction in the description
    Free format text: CORRECT: ADDRESS; FROM: 519015 ZHUHAI, GUANGDONG PROVINCE TO: 100085 HAIDIAN, BEIJING
ASS Succession or assignment of patent right
    Owner name: KINGSOFT CORPORATION LIMITED
    Free format text: FORMER OWNER: ZHUHAI KINGSOFT SOFTWARE CO., LTD.
    Effective date: 20140904

LICC Enforcement, change and cancellation of record of contracts on the license for exploitation of a patent