WO2015190446A1 - Malware determiner, malware determination system, malware determination method, and program - Google Patents
Malware determiner, malware determination system, malware determination method, and program
- Publication number
- WO2015190446A1 (PCT/JP2015/066527)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- attribute
- feature
- malware
- attribute value
- unit
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/55—Detecting local intrusion or implementing counter-measures
- G06F21/56—Computer malware detection or handling, e.g. anti-virus arrangements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/55—Detecting local intrusion or implementing counter-measures
- G06F21/56—Computer malware detection or handling, e.g. anti-virus arrangements
- G06F21/567—Computer malware detection or handling, e.g. anti-virus arrangements using dedicated hardware
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1433—Vulnerability analysis
Definitions
- The present invention relates to a technology for determining whether an executable file used on an OS (Operating System) is malware (malicious software).
- In hash-value matching, hash values such as MD5 (Message Digest Algorithm 5), SHA-1 (Secure Hash Algorithm 1), or SHA-256 of existing malware are registered in a database in advance. If the hash value of the executable file to be determined matches a registered hash value, the executable file is determined to be malware. In pattern-match determination, specific character strings or byte codes contained in existing malware are registered in the database in advance. If the executable file to be determined contains any registered character string or byte code, it is determined to be malware. These methods have the advantage of a low false detection rate (the rate at which executable files that are not malware are mistakenly determined to be malware), but they have difficulty detecting variants of existing malware and new types of malware.
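The two signature-based methods described above can be sketched as follows. This is a minimal illustration, not any product's implementation: the database contents, the names `KNOWN_MALWARE_SHA256` and `KNOWN_MALWARE_PATTERNS`, and the sample byte strings are all hypothetical.

```python
import hashlib

# Hypothetical signature databases; the entries are made up for illustration.
KNOWN_MALWARE_SHA256 = {
    hashlib.sha256(b"example malicious payload").hexdigest(),
}
KNOWN_MALWARE_PATTERNS = [b"evil_marker", b"\xde\xad\xbe\xef"]


def matches_hash(data: bytes) -> bool:
    """Hash-value matching: exact match against the registered digests."""
    return hashlib.sha256(data).hexdigest() in KNOWN_MALWARE_SHA256


def matches_pattern(data: bytes) -> bool:
    """Pattern matching: any registered string or byte code occurs in the file."""
    return any(pattern in data for pattern in KNOWN_MALWARE_PATTERNS)


original = b"example malicious payload"
variant = b"example malicious payload v2"  # a small change yields a different digest
print(matches_hash(original), matches_hash(variant))
```

The variant shows why these methods miss modified malware: any change to the file changes its hash, and a pattern only matches if the registered bytes survive unchanged.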
- Heuristic determination has been proposed as a method for determining whether an executable file is a variant or a new type of malware. It defines malware-likeness based on past experience and makes the determination according to that definition.
- Patent Document 1 learns in advance the readable character strings included in executable files, and determines malware-likeness based on how many of the words frequently used in malware are included in the executable file to be determined.
- In machine learning, an executable file to be learned (teacher data) is converted into a set of several parameters, and learning is then performed using a machine learning algorithm.
- This set of parameters is called a feature vector, or simply a feature, and the number of parameters in the set is called the feature vector dimension.
- Converting an executable file into a feature vector is called feature extraction.
- In the technique described in Patent Document 1, the pairs of a word and its number of appearances form the feature vector, and the number of distinct words is the feature vector dimension.
- A larger feature vector dimension does not necessarily improve determination accuracy and may even worsen it. This phenomenon is known as the "curse of dimensionality" (Non-Patent Documents 1 and 2).
- PE: Portable Executable
- the feature vector dimension can be reduced, and better determination accuracy can be obtained.
- Principal component analysis is often used as the dimensional compression technique. It automatically combines correlated features into one feature. For example, human height and weight are roughly proportional, so these two features are combined into one. In this example the combined feature can be interpreted as, say, body size, but combined features usually have no such meaning.
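The height and weight example can be worked through numerically. The sketch below uses made-up sample values and the closed-form principal axis of a 2x2 covariance matrix; it only illustrates how two correlated features collapse into one, and is not the procedure of any cited document.

```python
import math

# Hypothetical (height in cm, weight in kg) samples that are roughly proportional.
samples = [(150, 45), (160, 55), (170, 64), (180, 76), (190, 85)]

n = len(samples)
mean_h = sum(h for h, _ in samples) / n
mean_w = sum(w for _, w in samples) / n
centered = [(h - mean_h, w - mean_w) for h, w in samples]

# Entries of the 2x2 sample covariance matrix.
c_hh = sum(h * h for h, _ in centered) / (n - 1)
c_ww = sum(w * w for _, w in centered) / (n - 1)
c_hw = sum(h * w for h, w in centered) / (n - 1)

# Direction of the first principal component of a symmetric 2x2 matrix.
theta = 0.5 * math.atan2(2 * c_hw, c_hh - c_ww)
axis = (math.cos(theta), math.sin(theta))

# Each sample is compressed from two features to a single combined feature.
compressed = [h * axis[0] + w * axis[1] for h, w in centered]
var_pc1 = sum(v * v for v in compressed) / (n - 1)
explained = var_pc1 / (c_hh + c_ww)
print(f"axis={axis}, explained variance ratio={explained:.3f}")
```

The combined feature is a weighted mix of height and weight. Here it happens to be interpretable, but for header attributes of executable files such mixtures generally are not.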
- When applying machine learning to malware determination, the feature vector needs to be adjusted, for example by reducing its dimension as described above, in order to obtain better determination accuracy. However, the technique described in Patent Document 1 has no function for explicitly adjusting the feature vector.
- Non-Patent Document 3 adjusts feature vectors by a method called dimensional compression.
- The determination accuracy required of malware determination may differ depending on the purpose. For example, general users require a particularly low false detection rate.
- the detection rate (the rate at which an executable file that is malware is correctly determined to be malware)
- Because the method based on dimensional compression is performed almost automatically, it contributes to improving determination accuracy in general.
- Dimensional compression such as principal component analysis mechanically generates new features by combining one feature with another, so the generated features are difficult for humans to understand. As a result, the feature vector could not be adjusted using human knowledge.
- An object of the present invention is to provide a technique capable of adjusting a feature vector according to the purpose without performing automatic feature conversion such as dimensional compression.
- The malware determiner of the present invention comprises:
- a feature selection database including an attribute table and an attribute value table;
- a feature selection setting unit that, when the attribute name of an attribute of an executable file is input, registers the attribute with that name in the attribute table as an attribute to be extracted, and, when an attribute value of an attribute of an executable file is input, registers that attribute value in the attribute value table as an attribute value to be deleted or not to be deleted;
- a feature extraction unit that, when an executable file is input, extracts from the executable file the attribute values of the attributes registered as extraction targets in the attribute table, and generates a feature vector including the extracted attribute values as features;
- a feature selection unit that reconstructs the feature vector by deleting, from the feature vector generated by the feature extraction unit, the attribute values registered as deletion targets in the attribute value table, or the attribute values other than those registered as non-deletion targets;
- a classifier that, when the feature selection unit has reconstructed the feature vector of an executable file to be learned, performs machine learning of that executable file based on the feature vector and information indicating whether the executable file to be learned is malware, and, when the feature selection unit has reconstructed the feature vector of an executable file to be determined, calculates a score indicating the malware-likeness of that executable file based on the result of the machine learning and the feature vector; and
- a determination unit that determines whether the executable file to be determined is malware based on the score calculated by the classifier.
- The first malware determination system of the present invention comprises the malware determiner and a feature selection trial device. The feature selection trial device includes:
- a feature selection instruction unit that repeats a process of selecting one set from combinations of one or more features, each composed of an attribute name and an attribute value, and inputting the set to the feature selection setting unit of the malware determiner;
- a test unit that, each time the feature selection instruction unit inputs a set to the feature selection setting unit, repeats a process of inputting an executable file to be learned and then an executable file to be determined to the feature extraction unit of the malware determiner; and
- an index calculation unit that, each time the feature selection instruction unit inputs a set to the feature selection setting unit, acquires from the determination unit of the malware determiner, for each executable file to be determined that the test unit repeatedly input to the feature extraction unit, a determination result including the score of that executable file and information indicating whether it was determined to be malware, and calculates, based on the determination results, an index indicating how good the determination accuracy of the determination unit is.
- The feature selection instruction unit includes: among the sets input to the
- The second malware determination system of the present invention comprises the malware determiner and a user interface.
- The user interface includes a feature list acquisition unit that acquires attribute names and attribute values of attributes of executable files, and
- a feature selection input unit that displays a setting screen including a list of the attribute names and attribute values acquired by the feature list acquisition unit.
- The feature selection input unit displays each attribute name on the setting screen together with a first check box, and checks the first check box for the attribute names of attributes registered as extraction targets in the attribute table. It displays each attribute value on the setting screen together with a second check box, and checks the second check box for attribute values other than those registered as deletion targets in the attribute value table, or for attribute values registered as non-deletion targets. When the check state of the first check box or the second check box is changed manually, an attribute name with a check in the first check box after the change is selected as the feature selection of the malware determiner.
- the attribute value that is not checked in the second check box after the change or the attribute value that is checked
- The malware determination method of the present invention is a malware determination method performed by a malware determiner, and includes:
- a feature selection setting step of, when the attribute name of an attribute of an executable file is input, registering the attribute with that name in the attribute table as an attribute to be extracted, and, when an attribute value of an attribute of an executable file is input, registering that attribute value in the attribute value table as an attribute value to be deleted or not to be deleted;
- a feature extraction step of, when an executable file is input, extracting from the executable file the attribute values of the attributes registered as extraction targets in the attribute table, and generating a feature vector including the extracted attribute values as features;
- a feature selection step of reconstructing the feature vector by deleting, from the feature vector generated in the feature extraction step, the attribute values registered as deletion targets in the attribute value table, or the attribute values other than those registered as non-deletion targets;
- a classification step of, when the feature vector of an executable file to be learned has been reconstructed in the feature selection step, performing machine learning of that executable file based on the feature vector and information indicating whether the executable file to be learned is malware, and, when the feature vector of an executable file to be determined has been reconstructed in the feature selection step, calculating a score indicating the malware-likeness of that executable file based on the result of the machine learning and the feature vector; and
- a determination step of determining whether the executable file to be determined is malware based on the score calculated in the classification step.
- The program of the present invention causes a computer to function as:
- storage means including an attribute table and an attribute value table;
- feature selection setting means for, when the attribute name of an attribute of an executable file is input, registering the attribute with that name in the attribute table as an attribute to be extracted, and, when an attribute value of an attribute of an executable file is input, registering that attribute value in the attribute value table as an attribute value to be deleted or not to be deleted;
- feature extraction means for, when an executable file is input, extracting from the executable file the attribute values of the attributes registered as extraction targets in the attribute table, and generating a feature vector including the extracted attribute values as features;
- feature selection means for reconstructing the feature vector by deleting, from the feature vector generated by the feature extraction means, the attribute values registered as deletion targets in the attribute value table, or the attribute values other than those registered as non-deletion targets;
- classification means for, when the feature selection means has reconstructed the feature vector of an executable file to be learned, performing machine learning of that executable file based on the feature vector and information indicating whether the executable file to be learned is malware, and, when the feature selection means has reconstructed the feature vector of an executable file to be determined, calculating a score indicating the malware-likeness of that executable file based on the result of the machine learning and the feature vector; and
- determination means for determining whether the executable file to be determined is malware based on the score calculated by the classification means.
- According to the present invention, the feature selection setting unit registers the attribute with the input attribute name in the attribute table as an attribute to be extracted, and registers the input attribute value in the attribute value table as an attribute value to be deleted or not to be deleted. Then, in malware determination by machine learning, the feature extraction unit generates a feature vector by extracting the attribute values of only the attributes registered as extraction targets in the attribute table, and the feature selection unit further deletes from the feature vector the attribute values registered as deletion targets in the attribute value table, or the attribute values other than those registered as non-deletion targets.
- Features related to unnecessary attributes and attribute values can therefore be deleted from the feature vector, or only features related to useful attributes and attribute values can be included in it. The feature vector can thus be adjusted, for example by reducing its dimension, while improving determination accuracy.
- Feature vectors can be adjusted in consideration of expert knowledge.
- The feature vector can be adjusted according to the purpose.
- FIG. 1 is a configuration diagram showing the configuration of the malware determiner according to the first embodiment of the present invention.
- FIG. 2A is a flowchart illustrating an outline of the feature selection setting process in the malware determiner according to the first embodiment of the present invention.
- FIG. 2B is a flowchart illustrating an outline of the teacher data learning process in the malware determiner according to the first embodiment of the present invention.
- FIG. 2C is a flowchart illustrating an outline of the target file determination process in the malware determiner according to the first embodiment of the present invention.
- FIG. 3 is a diagram explaining the attribute table and attribute value table 1 in the malware determiner according to the first embodiment of the present invention.
- FIG. 4 is a diagram explaining attribute value table 2 in the malware determiner according to the first embodiment of the present invention.
- FIG. 10 is a diagram illustrating a setting screen displayed on a feature selection UI according to the third embodiment of the present invention.
- In addition to dimensional compression, there is a method called feature selection for reducing the feature vector dimension.
- Feature selection reduces the feature vector dimension by choosing individual features to keep or discard, and searches for the combination of features that achieves the target determination accuracy.
- The malware determiner of the present invention uses this feature selection method to reduce the feature vector dimension.
- (1) First Embodiment
- (1-1) Configuration of the First Embodiment
- FIG. 1 shows a configuration diagram of the malware determiner 10 of the first embodiment of the present invention.
- The malware determiner 10 of this embodiment includes a feature selection setting unit 11, a feature selection database (storage means) 12, a feature extraction unit 13, a feature selection unit 14, a classifier 15, and a determination unit 16.
- The malware determiner 10 of the present embodiment performs machine learning of the PE/COFF (Portable Executable / Common Object File Format) header information of executable files to be learned, and determines whether an executable file to be determined is malware.
- A feature selection function capable of selecting the features to be included in the feature vector is provided.
- The malware determiner 10 of the present embodiment first sets the features to be included in the feature vector (feature selection setting process, FIG. 2A). Next, it performs machine learning of teacher data composed of existing malware and non-malware executable files (goodware) (teacher data learning process, FIG. 2B). After learning, it can determine whether an executable file to be determined is malware (target file determination process, FIG. 2C).
- FIG. 2A shows a flowchart for explaining an outline of the feature selection setting process in the malware determiner 10 of this embodiment.
- The feature selection setting unit 11 registers the setting related to feature selection in the feature selection database 12 (step A2).
- The setting related to feature selection consists of the attribute names, such as PE/COFF header information attributes, to be extracted from the executable file, and, among the attribute values of each attribute, the attribute values to be deleted from the features.
- Fig. 3 shows the data structure of the feature selection database 12.
- the feature selection database 12 includes an attribute table and an attribute value table 1.
- the feature selection setting unit 11 reflects the attribute name of the attribute extracted from the execution file in the attribute table. Specifically, the attribute described in the setting related to feature selection is treated as an attribute to be extracted from the execution file, and the value of the ON / OFF field of the attribute name of the attribute is registered as ON in the attribute table. On the other hand, attributes that are not described in the setting related to feature selection are handled as attributes that are not extracted from the execution file, and the value of the ON / OFF field of the attribute name of the attribute is registered as OFF in the attribute table.
- the feature selection setting unit 11 reflects the attribute value to be deleted from the feature among the attribute values of each attribute in the attribute value table 1. Specifically, the attribute value described in the setting related to feature selection is handled as an attribute value to be deleted from the feature vector and registered in the attribute value table 1.
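The registration described above can be sketched with an in-memory stand-in for the feature selection database 12. The class name, field names, and the shape of the `settings` argument are assumptions made for illustration; the source does not specify a storage format.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureSelectionDatabase:
    """Hypothetical in-memory stand-in for the attribute table and attribute value table 1."""
    attribute_table: dict = field(default_factory=dict)        # attribute name -> ON (True) / OFF (False)
    attribute_value_table_1: set = field(default_factory=set)  # (name, value) pairs to delete


def register_settings(db, known_attributes, settings):
    """Reflect a feature selection setting in the database.

    `settings` maps each attribute name to extract to the list of its
    attribute values that should be deleted from the feature vector.
    """
    for name in known_attributes:
        # Attributes described in the setting are ON; all others are OFF.
        db.attribute_table[name] = name in settings
    db.attribute_value_table_1 = {
        (name, value) for name, values in settings.items() for value in values
    }


db = FeatureSelectionDatabase()
register_settings(
    db,
    known_attributes=["Characteristics", "Import DLL", "File size"],
    settings={"Characteristics": ["Executable"], "Import DLL": []},
)
print(db.attribute_table)
```

An attribute may be listed with an empty value list, which turns extraction ON without marking any of its values for deletion.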
- FIG. 2B shows a flowchart for explaining the outline of the teacher data learning process in the malware determiner 10 of this embodiment.
- The teacher data used in the teacher data learning process consists of executable files that are existing malware or goodware (the executable files to be learned) and information indicating whether each executable file is malware.
- Instead of the whole executable file, information obtained by cutting out only the part in which the PE/COFF header information is described may be used.
- Teacher data is processed in the order of the feature extraction unit 13, the feature selection unit 14, and the classifier 15.
- The feature extraction unit 13 extracts the attribute value of each attribute of the PE/COFF header information from the executable file in the teacher data, and generates a feature vector including the extracted attribute values as features (step B1).
- To extract the attribute values, a tool such as dumpbin.exe (Microsoft Visual C++) or objdump (Unix-like systems such as Linux) can be used, or a program library that provides the functionality of these tools.
- Only the attribute values of attributes that are ON in the attribute table of the feature selection database 12 are extracted.
- Some PE / COFF header information attributes have multiple values in one attribute.
- the Characteristics attribute has a plurality of values such as “Executable”, “32 bit word machine”, “Debug information stripped”. Each of these values is regarded as one feature (in this example, three features are extracted with one attribute).
- Attributes with numerical data such as File size are handled by converting the attribute value to a range name at an appropriate interval such as 1 kB or less, 1 kB to 100 kB, 100 kB to 500 kB, 500 kB to 1 MB, 1 MB or more.
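A minimal sketch of this extraction step follows. It assumes the header has already been parsed into a plain dictionary (the `header` layout is hypothetical), and how the boundaries between size ranges are handled is an assumption, since the source only names the ranges.

```python
def file_size_range(size_bytes: int) -> str:
    """Map a numeric file size to a range name (boundary handling is an assumption)."""
    kb = 1024
    if size_bytes <= 1 * kb:
        return "1 kB or less"
    if size_bytes <= 100 * kb:
        return "1 kB to 100 kB"
    if size_bytes <= 500 * kb:
        return "100 kB to 500 kB"
    if size_bytes < 1024 * kb:
        return "500 kB to 1 MB"
    return "1 MB or more"


def extract_features(header: dict, attribute_table: dict) -> set:
    """Build a feature set of (attribute name, attribute value) pairs.

    Only attributes that are ON in the attribute table are extracted.
    """
    features = set()
    for name, raw in header.items():
        if not attribute_table.get(name, False):
            continue
        if name == "File size":
            features.add((name, file_size_range(raw)))
        elif isinstance(raw, (list, tuple, set)):
            # A multi-valued attribute yields one feature per value.
            features.update((name, value) for value in raw)
        else:
            features.add((name, raw))
    return features


header = {
    "Characteristics": ["Executable", "32 bit word machine", "Debug information stripped"],
    "File size": 250 * 1024,
    "Import DLL": ["KERNEL32.DLL"],
}
attribute_table = {"Characteristics": True, "File size": True, "Import DLL": False}
features = extract_features(header, attribute_table)
print(sorted(features))
```

The Characteristics attribute contributes three features and the file size contributes one range-name feature, while Import DLL is skipped because it is OFF in the attribute table.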
- These feature values may be appropriately weighted, for example with tf-idf (term frequency-inverse document frequency).
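As an illustration of such weighting, the sketch below applies one common tf-idf variant to feature sets. Treating each executable file as a "document" and each feature as a "term" is an assumption about how the weighting would be applied; the source does not give a formula.

```python
import math
from collections import Counter


def tfidf_weights(feature_sets):
    """Weight each feature by tf-idf, treating every file as one document.

    tf is 1 / (number of features in the file) because each feature occurs once;
    idf is log(number of files / number of files containing the feature).
    """
    n_files = len(feature_sets)
    document_frequency = Counter(f for features in feature_sets for f in features)
    weighted = []
    for features in feature_sets:
        weighted.append({
            f: (1.0 / len(features)) * math.log(n_files / document_frequency[f])
            for f in features
        })
    return weighted


files = [
    {("Characteristics", "Executable"), ("Import DLL", "WINSOCK32.DLL")},
    {("Characteristics", "Executable"), ("Import DLL", "USER32.DLL")},
    {("Characteristics", "Executable"), ("Import DLL", "USER32.DLL")},
]
weights = tfidf_weights(files)
print(weights[0])
```

With this variant, a feature that appears in every file receives weight zero, while a rare feature receives a larger weight.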
- The feature selection unit 14 performs feature selection by deleting features from the feature vector according to attribute value table 1, and reconstructs the feature vector (step B2). For example, when attribute values are registered in attribute value table 1 as shown in FIG. 3 and the feature vector contains "Characteristics": "Executable", that feature is deleted from the feature vector.
- The classifier 15 receives the feature vector processed by the feature extraction unit 13 and the feature selection unit 14 in the preceding stages, together with classification information indicating whether the executable file to be learned is malware, and performs machine learning of the executable file to be learned (step B3).
- As the classifier 15, an appropriate classifier such as logistic regression, a support vector machine, a perceptron, Passive-Aggressive, naive Bayes, or a decision tree may be used.
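To make the learning step concrete, here is a small perceptron, one of the classifier types listed above, written over feature sets. The teacher data, the feature names, and the function names are hypothetical; this is a sketch of the general technique, not the classifier 15 itself.

```python
def train_perceptron(samples, epochs=10):
    """Learn per-feature weights from (feature_set, is_malware) pairs."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for features, is_malware in samples:
            target = 1 if is_malware else -1
            current = bias + sum(weights.get(f, 0.0) for f in features)
            if target * current <= 0:
                # Misclassified (or undecided): move the weights toward the target.
                for f in features:
                    weights[f] = weights.get(f, 0.0) + target
                bias += target
    return weights, bias


def malware_score(features, weights, bias):
    """Higher scores mean the file looks more like malware."""
    return bias + sum(weights.get(f, 0.0) for f in features)


# Hypothetical teacher data: feature sets with malware / goodware labels.
teacher_data = [
    ({("Import DLL", "WINSOCK32.DLL"), ("Section flags", "packed")}, True),
    ({("Import DLL", "WININET.DLL")}, True),
    ({("Import DLL", "USER32.DLL")}, False),
    ({("Import DLL", "KERNEL32.DLL"), ("Import DLL", "USER32.DLL")}, False),
]
weights, bias = train_perceptron(teacher_data)
print(malware_score({("Import DLL", "WINSOCK32.DLL")}, weights, bias))
```

Features never seen during training contribute nothing to the score, so deleting a feature from the vector before learning removes its influence entirely.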
- FIG. 2C shows a flowchart for explaining the outline of the target file determination process in the malware determiner 10 of this embodiment.
- the execution file to be determined used in the target file determination process may be the entire execution file, or may be information obtained by cutting out only the part in which PE / COFF header information is described from the execution file.
- the execution file to be determined is processed in the order of the feature extraction unit 13, the feature selection unit 14, the classifier 15, and the determination unit 16.
- Based on the result of the machine learning performed in the teacher data learning process (1-2-2) and the feature vector processed by the feature extraction unit 13 and the feature selection unit 14 in the preceding stages, the classifier 15 calculates the malware-likeness of the executable file to be determined as a numerical value called a score (step C3).
- To calculate the malware-likeness score, a known method can be used, for example the technique described in Patent Document 1, which makes the determination based on how many of the features frequently used in malware are included in the feature vector of the executable file to be determined.
- the determination unit 16 determines whether or not the execution file to be determined is malware based on the score calculated by the classifier 15 and outputs a determination result (step C4).
- the determination result may be only information indicating whether the execution file to be determined is classified as malware or goodware, or the score calculated by the classifier 15 may be added thereto.
- One determination method is threshold determination, in which the executable file is judged to be malware when the score is equal to or higher than a certain threshold, and goodware when the score is lower than the threshold.
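Threshold determination is simple enough to state directly in code. The function name, the default threshold, and the result layout below are illustrative assumptions.

```python
def determine(score: float, threshold: float = 0.0, include_score: bool = True) -> dict:
    """Judge malware when the score is at or above the threshold, goodware otherwise."""
    result = {"label": "malware" if score >= threshold else "goodware"}
    if include_score:
        # The score may optionally accompany the classification.
        result["score"] = score
    return result


print(determine(0.7, threshold=0.5))
print(determine(0.3, threshold=0.5, include_score=False))
```

Raising the threshold makes the determiner more conservative, which trades a lower false detection rate for a lower detection rate.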
- In the above, the feature selection unit 14 deletes the attribute values registered in attribute value table 1 from the feature vector. Conversely, it may instead include only the attribute values registered in the attribute value table in the feature vector.
- In that case, the attribute values to be included as features are specified, and attribute value table 2 in FIG. 4 is prepared instead of attribute value table 1 in FIG. 3.
- The specified attribute values are registered in attribute value table 2 as attribute values not to be deleted.
- For example, among the features related to the Import DLL attribute in the feature vector, only "Import DLL": "WINSOCK32.DLL" and "Import DLL": "WININET.DLL" are left, and all other attribute values of the Import DLL attribute are deleted.
- The attribute values of executable files include some that are hardly useful for malware determination and some that are very useful. An example of a useless one is "Characteristics": "Executable", which always appears in any executable file. An example of a useful one is "Import DLL": "WINSOCK32.DLL". The two tables can therefore be used selectively: useless attribute values are registered in attribute value table 1 and deleted from the feature vector, or, if the useful attribute values are clearly known, attribute value table 2 is prepared instead of attribute value table 1, the useful attribute values are registered in it, and the determination is made with those attribute values only.
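The two modes of the feature selection unit 14 can be sketched side by side. The function name and the `mode` parameter are hypothetical, and the keep mode assumes that attributes with no entry in attribute value table 2 are left untouched, which matches the Import DLL example but is not stated in general.

```python
def select_features(features, table, mode="delete"):
    """Reconstruct a feature set using an attribute value table.

    mode="delete": `table` plays the role of attribute value table 1
                   (listed values are removed).
    mode="keep":   `table` plays the role of attribute value table 2
                   (for attributes that appear in the table, only listed values survive).
    """
    if mode == "delete":
        return {f for f in features if f not in table}
    if mode == "keep":
        restricted = {name for name, _ in table}
        return {f for f in features if f[0] not in restricted or f in table}
    raise ValueError(f"unknown mode: {mode}")


features = {
    ("Characteristics", "Executable"),
    ("Import DLL", "WINSOCK32.DLL"),
    ("Import DLL", "USER32.DLL"),
}
table_1 = {("Characteristics", "Executable")}
table_2 = {("Import DLL", "WINSOCK32.DLL"), ("Import DLL", "WININET.DLL")}

print(sorted(select_features(features, table_1, mode="delete")))
print(sorted(select_features(features, table_2, mode="keep")))
```

Delete mode suits cases where a few values are known to be useless, while keep mode suits cases where the useful values are known in advance.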
- As described above, in the present embodiment the feature selection setting unit 11 registers the attribute with the input attribute name in the attribute table as an attribute to be extracted, and registers the input attribute value in the attribute value table as an attribute value to be deleted or not to be deleted. Then, in malware determination by machine learning, the feature extraction unit 13 generates a feature vector by extracting the attribute values of only the attributes registered as extraction targets in the attribute table, and the feature selection unit 14 further deletes from it the attribute values registered as deletion targets in the attribute value table, or the attribute values other than those registered as non-deletion targets.
- Features related to unnecessary attributes and attribute values can therefore be deleted from the feature vector, or only features related to useful attributes and attribute values can be included in it. The feature vector can thus be adjusted, for example by reducing its dimension, while improving determination accuracy. In addition, because the feature vector dimension is reduced, the processing load of the classifier 15 decreases, and machine learning and malware determination become faster.
- Attributes and attribute values can also be specified in a way that takes expert knowledge into account.
- some header information attributes have default values such as Section alignment and File alignment, and it is known that malware may deviate from the default values.
- malware tends to use executable file compression software called a packer more than goodware, and it can be determined by the Section flags attribute. Therefore, it is possible to specify useful attributes and attribute values using such knowledge.
- FIG. 5 shows a configuration diagram of a malware determination system according to the second embodiment of the present invention.
- the malware determination system includes a feature selection trial device 20 that tries feature selection in addition to the malware determination device 10 as another device.
- the feature selection trial device 20 includes a feature selection instruction unit 21, a test unit 22, and an index calculation unit 23.
- Methods for performing feature selection automatically are roughly classified into wrapper methods, filter methods, and embedded methods.
- In the present embodiment, the wrapper method is used.
- the wrapper method is a method in which machine learning / classification is actually performed for various feature selection settings to measure accuracy, and the feature selection setting with the best accuracy is searched.
- As a method for searching for a combination of features there are a brute force method for trying all combinations, a variable increasing method for reducing the number of trials, a variable decreasing method, a stepwise method, and the like. In the present embodiment, the case where the variable increasing method is used will be described, but other search methods may be used.
- FIG. 6 shows a flowchart for explaining the outline of processing in the feature selection trial device 20.
- The feature selection instruction unit 21 first reads a list of features that are candidates for selection (feature candidate list) (step D1). Then, according to the search method, it selects from these candidates one feature selection setting, which is a combination of one or more features composed of attribute names and attribute values, and inputs it to the malware determiner 10 (step D2).
- the test unit 22 divides the test data including the existing malware and the existing goodware, and generates teacher data and an execution file to be determined (step D3). Subsequently, the test unit 22 inputs the teacher data to the malware determiner 10 and causes the malware determiner 10 to learn the teacher data (step D4), and then inputs the determination target executable file to the malware determiner 10, The malware judging device 10 is made to judge whether or not the execution file to be judged is malware (step D5). The determination result (malware / goodware classification information and malware-like score) is acquired by the index calculation unit 23 (cross-validation).
- The test unit 22 repeats steps D4 and D5 for all the teacher data and executable files to be determined that were generated in step D3.
- When the repetition is complete (step D6), the index calculation unit 23 compares each determination result for the executable files to be determined against the test data, and calculates an index indicating how good the determination accuracy of the determination unit 16 of the malware determiner 10 is (step D7).
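The source does not say which index the index calculation unit 23 computes. As one plausible choice, the sketch below computes the detection rate and the false detection rate as defined earlier in this document; the function name and the result layout are assumptions.

```python
def accuracy_indices(results):
    """Compute (detection rate, false detection rate).

    `results` is a list of (is_malware_truth, judged_malware) pairs, where the
    truth comes from the test data and the judgment from the determiner.
    """
    malware_judgments = [judged for truth, judged in results if truth]
    goodware_judgments = [judged for truth, judged in results if not truth]
    # Share of real malware that was correctly judged to be malware.
    detection_rate = (
        sum(malware_judgments) / len(malware_judgments) if malware_judgments else 0.0
    )
    # Share of goodware that was mistakenly judged to be malware.
    false_detection_rate = (
        sum(goodware_judgments) / len(goodware_judgments) if goodware_judgments else 0.0
    )
    return detection_rate, false_detection_rate


results = [
    (True, True), (True, False),
    (False, False), (False, True), (False, False), (False, False),
]
detection_rate, false_detection_rate = accuracy_indices(results)
print(detection_rate, false_detection_rate)
```

Which of the two rates is emphasized depends on the purpose. A single index could, for example, be the detection rate achieved under a cap on the false detection rate.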
- the feature selection instruction unit 21 determines whether to try another feature selection setting or end the trial based on the index calculated by the index calculation unit 23 (step D8).
- If another feature selection setting is to be tried, the process returns to step D2, and another feature selection setting is input to the malware determiner 10 to perform the subsequent processing.
- If the trial is to end, the feature selection setting with the highest index is selected and input to the malware determiner 10 (step D9), and the process ends.
- the feature candidate list is a list in which attribute names of attributes and attribute values of the attributes are described as feature candidates.
- The feature selection trial device 20 combines the various attribute names and attribute values described in the feature candidate list, and searches for the combination with the highest determination accuracy.
- the features A, B, C, and D are described in the feature candidate list as an example.
- the number of features described in the feature candidate list is not limited to four, and attribute names and attribute values may be freely described.
- The variable increasing method, which is the search method used in the present embodiment, increases the features (or attributes) to be used one at a time and ends when the determination accuracy (or index) reaches its maximum.
- First, the determination accuracy is measured with each of the feature selection settings [A], [B], [C], and [D].
- The feature selection setting with the highest determination accuracy is adopted. In this example, [A] is assumed to be the highest.
- Next, one feature is added to [A], the determination accuracy is measured with [A B], [A C], and [A D], and the feature selection setting with the highest determination accuracy is again adopted.
- In this example, [A C] is assumed to be the highest.
- Then one feature is added to [A C], the determination accuracy is measured with [A C B] and [A C D], and the search ends if neither is more accurate than the previous round.
- Otherwise, the feature selection setting with the highest determination accuracy is adopted once more. This is repeated until there is no feature selection setting left to try.
- The feature selection instruction unit 21 finally determines one feature selection setting according to the above-described variable increasing method, and inputs that feature selection setting to the malware determiner 10.
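The variable increasing method described above can be sketched as a greedy loop. This is a minimal illustration under stated assumptions, not the embodiment's implementation: `forward_select` and the `evaluate` callback (standing in for steps D2 to D7 and returning the index for one feature selection setting) are hypothetical names, and the scores in the usage lines are made up to reproduce the [A], [A C] example.

```python
from typing import Callable, FrozenSet, Iterable, Tuple


def forward_select(
    candidates: Iterable[str],
    evaluate: Callable[[FrozenSet[str]], float],
) -> Tuple[FrozenSet[str], float]:
    """Add one feature per round; stop when the index no longer improves."""
    remaining = set(candidates)
    best_set: FrozenSet[str] = frozenset()
    best_index = float("-inf")
    while remaining:
        # Try every setting obtained by adding one remaining feature.
        trials = [(evaluate(best_set | {f}), f) for f in sorted(remaining)]
        round_index, round_feature = max(trials)
        # Stop if the best index for this number of features does not
        # exceed the best index for the previous number of features.
        if round_index <= best_index:
            break
        best_set = best_set | {round_feature}
        best_index = round_index
        remaining.remove(round_feature)
    return best_set, best_index


# Hypothetical indices for the settings tried in the example above.
scores = {
    frozenset("A"): 0.80, frozenset("B"): 0.70,
    frozenset("C"): 0.75, frozenset("D"): 0.60,
    frozenset("AB"): 0.85, frozenset("AC"): 0.90, frozenset("AD"): 0.82,
    frozenset("ABC"): 0.88, frozenset("ACD"): 0.89,
}
best, index = forward_select("ABCD", lambda s: scores.get(s, 0.0))
```

With these made-up scores the search adopts [A], then [A C], and stops because neither [A C B] nor [A C D] improves on [A C].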
- Like the teacher data of the first embodiment, the test data used in the test unit 22 consists of executable files of existing malware and goodware, and information indicating whether each executable file is malware.
- Typical methods for dividing the test data include a holdout test, which divides the test data into two parts and performs the test only once, and N-fold cross-validation, which divides the test data into N equal parts and performs the test N times. Any method can be used.
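As a rough sketch of the N-fold division, the helper below splits labeled samples into N parts and yields N pairs of teacher data and determination targets, so that every sample is judged exactly once. The function name `n_fold_splits` and the strided assignment of samples to folds are assumptions made for illustration; the source does not specify how the parts are formed.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def n_fold_splits(samples: Sequence[T], n: int) -> List[Tuple[List[T], List[T]]]:
    """Return n (teacher_data, targets) pairs for N-fold cross-validation."""
    if not 2 <= n <= len(samples):
        raise ValueError("n must be between 2 and the number of samples")
    # Strided slicing gives n folds whose sizes differ by at most one.
    folds = [list(samples[i::n]) for i in range(n)]
    splits = []
    for i, targets in enumerate(folds):
        # All folds except the i-th one are used for learning.
        teacher = [s for j, fold in enumerate(folds) if j != i for s in fold]
        splits.append((teacher, targets))
    return splits


splits = n_fold_splits(list(range(6)), 3)
```

A holdout test corresponds to dividing the data into two parts and using only one of the resulting pairs.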
- AUC (Area Under the Curve) may be used as the index.
- Another index can also be used.
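For illustration, and assuming that AUC here means the area under the ROC curve, the index can be computed directly from the malware-likeness scores: it equals the fraction of (malware, goodware) pairs in which the malware sample receives the higher score, with ties counted as one half. The function name `auc` and the sample scores are hypothetical.

```python
from typing import Sequence


def auc(malware_scores: Sequence[float], goodware_scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of scores."""
    if not malware_scores or not goodware_scores:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for m in malware_scores:
        for g in goodware_scores:
            if m > g:
                wins += 1.0   # malware ranked above goodware
            elif m == g:
                wins += 0.5   # ties count as half
    return wins / (len(malware_scores) * len(goodware_scores))


example_auc = auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.1])
```

This pairwise form is quadratic in the number of samples; it is meant to show what the index measures, not to be efficient.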
- The detection rate and the false detection rate are in a trade-off relationship, but there are cases where it is desired to raise the detection rate while keeping the false detection rate very low.
- In such cases, the detection rate of the determination unit 16 when the determination unit 16 of the malware determiner 10 is adjusted so that its false detection rate is very low (for example, 0.1% or less) can be used as the index.
- Alternatively, an index based on the false detection rate of the determination unit 16 when the determination unit 16 is adjusted so that its detection rate is very high (for example, 99.9% or more), such as (1 - false detection rate), can be used.
- the adjustment of the false detection rate or the detection rate in the determination unit 16 can be realized by adjusting the threshold when performing threshold determination, for example.
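The threshold adjustment can be sketched as follows. The sketch assumes that a file is judged to be malware when its score is greater than or equal to the threshold, which the source does not state; `detection_rate_at_fpr` and the sample scores are hypothetical. It picks the lowest threshold whose false detection rate stays within the limit, which is also the threshold that gives the highest detection rate under that limit.

```python
from typing import Sequence, Tuple


def detection_rate_at_fpr(
    malware_scores: Sequence[float],
    goodware_scores: Sequence[float],
    max_false_detection_rate: float = 0.001,
) -> Tuple[float, float]:
    """Return (threshold, detection rate) with the false detection rate capped."""
    # Rates only change at observed score values; infinity rejects everything.
    candidates = sorted(set(malware_scores) | set(goodware_scores)) + [float("inf")]
    for threshold in candidates:  # ascending order
        false_rate = sum(s >= threshold for s in goodware_scores) / len(goodware_scores)
        if false_rate <= max_false_detection_rate:
            detection = sum(s >= threshold for s in malware_scores) / len(malware_scores)
            return threshold, detection
    raise ValueError("max_false_detection_rate must be non-negative")


malware = [0.9, 0.8, 0.6, 0.2]
goodware = [0.7, 0.3, 0.1, 0.05]
loose = detection_rate_at_fpr(malware, goodware, max_false_detection_rate=0.25)
strict = detection_rate_at_fpr(malware, goodware, max_false_detection_rate=0.0)
```

The mirror-image index, (1 - false detection rate) at a fixed high detection rate, can be obtained the same way by scanning thresholds in descending order.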
- The feature selection instruction unit 21 determines whether to try another feature selection setting or to end the trial. With the variable increasing method, the judgment is made in the following order.
- The “number” below refers to the number of features (attributes) included in the feature selection setting.
- As described above, the feature selection can be set automatically, and the index calculation unit 23 calculates an index suited to the purpose as the index indicating how good the determination accuracy is. This makes it possible to configure a malware determiner 10 whose determination accuracy matches the purpose of use, such as reducing false detections or improving the detection rate.
- (3) Third Embodiment
- The malware determination system according to the third embodiment of the present invention inputs settings relating to feature selection to the malware determiner 10 shown in FIG. 1 by manual input.
- (3-1) Configuration of the Third Embodiment
- FIG. 7 shows a configuration diagram of the malware determination system of the third embodiment of the present invention. As shown in FIG. 7, in addition to the malware determiner 10, the malware determination system includes a feature selection UI (User Interface) 30 that collects features from existing malware/goodware and is used to select features manually from the list of those features.
- The feature selection UI 30 includes a feature list acquisition unit 31 and a feature selection input unit 32.
- Feature list acquisition unit 31 acquires features from existing malware / goodware and generates a list of the features.
- The teacher data of the first embodiment or the test data of the second embodiment may be used as the existing malware/goodware.
- the feature selection input unit 32 displays a setting screen based on the feature list generated by the feature list acquisition unit 31 and the information in the feature selection database 12 of the malware determiner 10.
- (3-2) Setting Screen of the Third Embodiment
- The setting screen displayed by the feature selection input unit 32 is described below.
- FIG. 8 shows an example of the setting screen.
- the attribute name / attribute value and the number of appearances are displayed based on the feature list generated by the feature list acquisition unit 31.
- the feature list generated by the feature list acquisition unit 31 includes information on the attribute name and attribute value of the attribute and the number of executable files having the attribute value.
- Pressing the +/- button switches the attribute name / attribute value display between showing only the attribute name of an attribute and showing the attribute name together with all attribute values of that attribute.
- The number of appearances of an attribute value is shown as the number of executable files having the attribute value / the total number of executable files.
- Displaying the number of appearances helps the user decide, for example, to exclude from the feature vector an attribute value that appears too frequently.
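The number of appearances shown on the setting screen can be computed along the following lines. This is a sketch only: the representation of each executable file as a mapping from attribute name to its attribute values, the function name `appearance_counts`, and the attribute names in the sample data (`imported_dll`, `section_name`) are all assumptions for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

Feature = Tuple[str, str]  # (attribute name, attribute value)


def appearance_counts(
    files: List[Dict[str, Iterable[str]]],
) -> Tuple[Counter, int]:
    """Count, per attribute value, the number of files that contain it."""
    counts: Counter = Counter()
    for attributes in files:
        for name, values in attributes.items():
            # set() ensures a file is counted once per attribute value.
            for value in set(values):
                counts[(name, value)] += 1
    return counts, len(files)


sample_files = [
    {"imported_dll": ["kernel32.dll", "user32.dll"]},
    {"imported_dll": ["kernel32.dll", "kernel32.dll"]},
    {"imported_dll": ["kernel32.dll", "ws2_32.dll"], "section_name": [".text"]},
]
counts, total = appearance_counts(sample_files)
```

In this sample, `kernel32.dll` appears in every file, which is the kind of attribute value a user might choose to exclude.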
- the check boxes 1 and 2 are displayed based on the information in the feature selection database 12.
- the check box 1 reflects the ON / OFF field of the attribute table, and a check is added to the attribute name whose ON / OFF field is ON, that is, the attribute name of the attribute registered as an extraction target in the attribute table.
- The check box 2 reflects the contents of the attribute value table 1. That is, an attribute value registered as a deletion target in the attribute value table 1 is not checked, and conversely, attribute values other than those registered as deletion targets (attribute values not registered) are checked.
- The feature selection input unit 32 inputs the attribute names whose check box 1 is checked and the attribute values whose check box 2 is not checked to the feature selection setting unit 11 of the malware determiner 10 as a feature selection setting. Therefore, on this setting screen, feature selection can easily be set manually by checking the attributes and attribute values to be included in the feature vector and unchecking the attributes and attribute values to be excluded.
- When the attribute value table 2 is used instead of the attribute value table 1, the check box 2 reflects the contents of the attribute value table 2. In other words, an attribute value registered as not to be deleted in the attribute value table 2 is checked, and attribute values other than those registered as not to be deleted (attribute values not registered) are not checked.
- In this case, the attribute values input to the malware determiner 10 are the attribute values whose check box 2 is checked.
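The mapping from check box states to a feature selection setting can be sketched as below. The dictionary layout, the function name `setting_from_checkboxes`, the keys `extract`, `deleted`, and `not_deleted`, and the sample attribute names are hypothetical; only the rule itself (checked attribute names, plus unchecked attribute values for table 1 or checked attribute values for table 2) comes from the description above.

```python
from typing import Dict, Tuple

Feature = Tuple[str, str]  # (attribute name, attribute value)


def setting_from_checkboxes(
    check_box_1: Dict[str, bool],
    check_box_2: Dict[Feature, bool],
    use_table_2: bool = False,
) -> Dict[str, list]:
    """Build the setting passed to the feature selection setting unit."""
    # Attribute names with check box 1 checked are extraction targets.
    attribute_names = sorted(name for name, checked in check_box_1.items() if checked)
    # Table 1 registers deletion targets (unchecked values);
    # table 2 registers values not to be deleted (checked values).
    wanted_state = use_table_2
    attribute_values = sorted(
        feature for feature, checked in check_box_2.items() if checked == wanted_state
    )
    key = "not_deleted" if use_table_2 else "deleted"
    return {"extract": attribute_names, key: attribute_values}


box_1 = {"imported_dll": True, "section_name": False}
box_2 = {("imported_dll", "kernel32.dll"): False, ("imported_dll", "ws2_32.dll"): True}
table_1_setting = setting_from_checkboxes(box_1, box_2)
table_2_setting = setting_from_checkboxes(box_1, box_2, use_table_2=True)
```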
- As described above, feature selection can be set manually, and because check boxes 1 and 2 are used to specify attribute names and attribute values, a convenient way of setting feature selection can be provided.
- The feature selection setting may also be viewed and corrected according to the third embodiment.
- the malware judgment device 10 of the present invention can be realized by a computer and a program, and the program can be recorded on a recording medium or provided through a network.
Abstract
Description
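a feature selection database including an attribute table and an attribute value table;
a feature selection setting unit that, when an attribute name of an attribute of an executable file is input, registers the attribute having the input attribute name in the attribute table as an attribute to be extracted, and, when an attribute value of an attribute of an executable file is input, registers the input attribute value in the attribute value table as an attribute value to be deleted or an attribute value not to be deleted;
a feature extraction unit that, when an executable file is input, extracts from the executable file the attribute values of the attributes registered in the attribute table as extraction targets, and generates a feature vector including the extracted attribute values as features;
a feature selection unit that reconstructs the feature vector generated by the feature extraction unit by deleting from it the attribute values registered in the attribute value table as deletion targets, or by deleting attribute values other than those registered as not to be deleted;
a classifier that, when the feature selection unit reconstructs the feature vector of an executable file to be learned, performs machine learning on the executable file to be learned based on the feature vector and information on whether the feature vector of the executable file to be learned is malware, and, when the feature selection unit reconstructs the feature vector of an executable file to be determined, calculates a malware-likeness score for the executable file to be determined based on the result of the machine learning and the feature vector; and
a determination unit that determines, based on the score of the executable file to be determined calculated by the classifier, whether the executable file to be determined is malware.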
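The reconstruction performed by the feature selection unit can be illustrated with a short sketch. It assumes a feature vector represented as a list of (attribute name, attribute value) pairs and an attribute value table represented as a set of such pairs; the function name `reconstruct_feature_vector`, its parameters, and the sample values are hypothetical.

```python
from typing import List, Optional, Set, Tuple

Feature = Tuple[str, str]  # (attribute name, attribute value)


def reconstruct_feature_vector(
    feature_vector: List[Feature],
    deletion_targets: Optional[Set[Feature]] = None,
    non_deletion_targets: Optional[Set[Feature]] = None,
) -> List[Feature]:
    """Drop registered deletion targets, or keep only registered non-deletion targets."""
    if (deletion_targets is None) == (non_deletion_targets is None):
        raise ValueError("specify exactly one of the two attribute value tables")
    if deletion_targets is not None:
        # Delete the attribute values registered as deletion targets.
        return [f for f in feature_vector if f not in deletion_targets]
    # Delete every attribute value not registered as a non-deletion target.
    return [f for f in feature_vector if f in non_deletion_targets]


vector = [("imported_dll", "kernel32.dll"), ("imported_dll", "ws2_32.dll")]
by_deletion = reconstruct_feature_vector(
    vector, deletion_targets={("imported_dll", "kernel32.dll")}
)
by_keep_list = reconstruct_feature_vector(
    vector, non_deletion_targets={("imported_dll", "kernel32.dll")}
)
```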
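the malware determiner; and
a feature selection trial device, wherein
the feature selection trial device includes:
a feature selection instruction unit that repeats a process of selecting one set from among sets each combining one or more features consisting of an attribute name and an attribute value, and inputting the selected set to the feature selection setting unit of the malware determiner;
a test unit that, each time the feature selection instruction unit inputs a set to the feature selection setting unit, repeats a process of inputting an executable file to be learned and then an executable file to be determined to the feature extraction unit of the malware determiner; and
an index calculation unit that, each time the feature selection instruction unit inputs a set to the feature selection setting unit, acquires from the determination unit of the malware determiner, for each executable file to be determined that the test unit repeatedly input to the feature extraction unit, a determination result including the score of that executable file and information indicating whether that executable file was determined to be malware, and calculates, based on the determination results, an index indicating how good the determination accuracy of the determination unit is,
and the feature selection instruction unit
selects, from among the sets input to the feature selection setting unit, the set with the highest index calculated by the index calculation unit, and inputs it to the feature selection setting unit.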
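the malware determiner; and
a user interface, wherein
the user interface includes:
a feature list acquisition unit that acquires attribute names and attribute values of attributes of executable files; and
a feature selection input unit that displays a setting screen including a list of the attribute names and attribute values acquired by the feature list acquisition unit,
and the feature selection input unit
displays each attribute name on the setting screen together with a first check box, and checks the first check box for the attribute names of attributes registered in the attribute table as extraction targets,
displays each attribute value on the setting screen together with a second check box, and checks the second check box for attribute values other than those registered in the attribute value table as deletion targets, or for attribute values registered as not to be deleted, and,
when the check state of the first check box or the second check box is changed manually, inputs the attribute names whose first check box is checked after the change to the feature selection setting unit of the malware determiner, and inputs the attribute values whose second check box is unchecked, or the attribute values whose second check box is checked, after the change to the feature selection setting unit of the malware determiner.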
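A malware determination method performed by a malware determiner, including:
a feature selection setting step of, when an attribute name of an attribute of an executable file is input, registering the attribute having the input attribute name in an attribute table as an attribute to be extracted, and, when an attribute value of an attribute of an executable file is input, registering the input attribute value in an attribute value table as an attribute value to be deleted or an attribute value not to be deleted;
a feature extraction step of, when an executable file is input, extracting from the executable file the attribute values of the attributes registered in the attribute table as extraction targets, and generating a feature vector including the extracted attribute values as features;
a feature selection step of reconstructing the feature vector generated in the feature extraction step by deleting from it the attribute values registered in the attribute value table as deletion targets, or by deleting attribute values other than those registered as not to be deleted;
a classification step of, when the feature vector of an executable file to be learned is reconstructed in the feature selection step, performing machine learning on the executable file to be learned based on the feature vector and information on whether the feature vector of the executable file to be learned is malware, and, when the feature vector of an executable file to be determined is reconstructed in the feature selection step, calculating a malware-likeness score for the executable file to be determined based on the result of the machine learning and the feature vector; and
a determination step of determining, based on the score of the executable file to be determined calculated in the classification step, whether the executable file to be determined is malware.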
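causing a computer to function as:
storage means including an attribute table and an attribute value table;
feature selection setting means that, when an attribute name of an attribute of an executable file is input, registers the attribute having the input attribute name in the attribute table as an attribute to be extracted, and, when an attribute value of an attribute of an executable file is input, registers the input attribute value in the attribute value table as an attribute value to be deleted or an attribute value not to be deleted;
feature extraction means that, when an executable file is input, extracts from the executable file the attribute values of the attributes registered in the attribute table as extraction targets, and generates a feature vector including the extracted attribute values as features;
feature selection means that reconstructs the feature vector generated by the feature extraction means by deleting from it the attribute values registered in the attribute value table as deletion targets, or by deleting attribute values other than those registered as not to be deleted;
classification means that, when the feature selection means reconstructs the feature vector of an executable file to be learned, performs machine learning on the executable file to be learned based on the feature vector and information on whether the feature vector of the executable file to be learned is malware, and, when the feature selection means reconstructs the feature vector of an executable file to be determined, calculates a malware-likeness score for the executable file to be determined based on the result of the machine learning and the feature vector; and
determination means that determines, based on the score of the executable file to be determined calculated by the classification means, whether the executable file to be determined is malware.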
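(1) First Embodiment
(1-1) Configuration of the First Embodiment
FIG. 1 shows a configuration diagram of the malware determiner 10 according to the first embodiment of the present invention.
(1-2) Operation of the First Embodiment
The malware determiner 10 of the present embodiment first sets the features to be included in the feature vector (feature selection setting process; FIG. 2A). Next, it performs machine learning on teacher data composed of existing malware and executable files that are not malware (goodware) (teacher data learning process; FIG. 2B). After learning, it can determine whether an executable file to be determined is malware (target file determination process; FIG. 2C).
(1-2-1) Feature Selection Setting Process
FIG. 2A shows a flowchart outlining the feature selection setting process in the malware determiner 10 of the present embodiment.
(1-2-2) Teacher Data Learning Process
FIG. 2B shows a flowchart outlining the teacher data learning process in the malware determiner 10 of the present embodiment.
(1-2-3) Target File Determination Process
FIG. 2C shows a flowchart outlining the target file determination process in the malware determiner 10 of the present embodiment.
(2) Second Embodiment
The malware determination system according to the second embodiment of the present invention automatically inputs settings relating to feature selection from another device to the malware determiner 10 shown in FIG. 1.
(2-1) Configuration of the Second Embodiment
FIG. 5 shows a configuration diagram of the malware determination system according to the second embodiment of the present invention.
(2-2) Operation of the Second Embodiment
The processing in the feature selection trial device 20 is described below.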
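(a) When the current number ≠ 1 and the best index for the current number ≤ the best index for (current number - 1), the feature selection setting with the best index for (current number - 1) is adopted and the trial ends ([A] becomes the best feature selection setting).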
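(3) Third Embodiment
The malware determination system according to the third embodiment of the present invention inputs settings relating to feature selection to the malware determiner 10 shown in FIG. 1 by manual input.
(3-1) Configuration of the Third Embodiment
FIG. 7 shows a configuration diagram of the malware determination system according to the third embodiment of the present invention.
As shown in FIG. 7, in addition to the malware determiner 10, the malware determination system of the present embodiment includes a feature selection UI (User Interface) 30 that collects features from existing malware/goodware and is used to select features manually from the list of those features. The feature selection UI 30 has a feature list acquisition unit 31 and a feature selection input unit 32.
(3-2) Setting Screen of the Third Embodiment
The setting screen displayed by the feature selection input unit 32 is described below.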
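11 Feature selection setting unit
12 Feature selection database
13 Feature extraction unit
14 Feature selection unit
15 Classifier
16 Determination unit
20 Feature selection trial device
21 Feature selection instruction unit
22 Test unit
23 Index calculation unit
30 Feature selection UI
31 Feature list acquisition unit
32 Feature selection input unit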
Claims (7)
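- A malware determiner comprising:
a feature selection database including an attribute table and an attribute value table;
a feature selection setting unit that, when an attribute name of an attribute of an executable file is input, registers the attribute having the input attribute name in the attribute table as an attribute to be extracted, and, when an attribute value of an attribute of an executable file is input, registers the input attribute value in the attribute value table as an attribute value to be deleted or an attribute value not to be deleted;
a feature extraction unit that, when an executable file is input, extracts from the executable file the attribute values of the attributes registered in the attribute table as extraction targets, and generates a feature vector including the extracted attribute values as features;
a feature selection unit that reconstructs the feature vector generated by the feature extraction unit by deleting from it the attribute values registered in the attribute value table as deletion targets, or by deleting attribute values other than those registered as not to be deleted;
a classifier that, when the feature selection unit reconstructs the feature vector of an executable file to be learned, performs machine learning on the executable file to be learned based on the feature vector and information on whether the feature vector of the executable file to be learned is malware, and, when the feature selection unit reconstructs the feature vector of an executable file to be determined, calculates a malware-likeness score for the executable file to be determined based on the result of the machine learning and the feature vector; and
a determination unit that determines, based on the score of the executable file to be determined calculated by the classifier, whether the executable file to be determined is malware.
- A malware determination system having the malware determiner according to claim 1,
further comprising a feature selection trial device, wherein
the feature selection trial device includes:
a feature selection instruction unit that repeats a process of selecting one set from among sets each combining one or more features consisting of an attribute name and an attribute value of an attribute of an executable file, and inputting the selected set to the feature selection setting unit of the malware determiner;
a test unit that, each time the feature selection instruction unit inputs a set to the feature selection setting unit, repeats a process of inputting an executable file to be learned and then an executable file to be determined to the feature extraction unit of the malware determiner; and
an index calculation unit that, each time the feature selection instruction unit inputs a set to the feature selection setting unit, acquires from the determination unit of the malware determiner, for each executable file to be determined that the test unit repeatedly input to the feature extraction unit, a determination result including the score of that executable file and information indicating whether that executable file was determined to be malware, and calculates, based on the determination results, an index indicating how good the determination accuracy of the determination unit is,
and the feature selection instruction unit
selects, from among the sets input to the feature selection setting unit, the set with the highest index calculated by the index calculation unit, and inputs it to the feature selection setting unit.
- The malware determination system according to claim 2, wherein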
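the index calculation unit
calculates the index using a detection rate, which is the rate at which the determination unit of the malware determiner correctly determined executable files to be determined that are malware to be malware, or a false detection rate, which is the rate at which it erroneously determined executable files to be determined that are not malware to be malware.
- A malware determination system having the malware determiner according to claim 1,
further comprising a user interface, wherein
the user interface includes:
a feature list acquisition unit that acquires attribute names and attribute values of attributes of executable files; and
a feature selection input unit that displays a setting screen including a list of the attribute names and attribute values acquired by the feature list acquisition unit,
and the feature selection input unit
displays each attribute name on the setting screen together with a first check box, and checks the first check box for the attribute names of attributes registered in the attribute table as extraction targets,
displays each attribute value on the setting screen together with a second check box, and checks the second check box for attribute values other than those registered in the attribute value table as deletion targets, or for attribute values registered as not to be deleted, and,
when the check state of the first check box or the second check box is changed manually, inputs the attribute names whose first check box is checked after the change to the feature selection setting unit of the malware determiner, and inputs the attribute values whose second check box is unchecked, or the attribute values whose second check box is checked, after the change to the feature selection setting unit of the malware determiner.
- The malware determination system according to claim 4, wherein
the feature selection input unit
displays each attribute value on the setting screen together with its number of appearances, which is the number of executable files in which the attribute value appears, and the total number of executable files.
- A malware determination method performed by a malware determiner, comprising: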
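a feature selection setting step of, when an attribute name of an attribute of an executable file is input, registering the attribute having the input attribute name in an attribute table as an attribute to be extracted, and, when an attribute value of an attribute of an executable file is input, registering the input attribute value in an attribute value table as an attribute value to be deleted or an attribute value not to be deleted;
a feature extraction step of, when an executable file is input, extracting from the executable file the attribute values of the attributes registered in the attribute table as extraction targets, and generating a feature vector including the extracted attribute values as features;
a feature selection step of reconstructing the feature vector generated in the feature extraction step by deleting from it the attribute values registered in the attribute value table as deletion targets, or by deleting attribute values other than those registered as not to be deleted;
a classification step of, when the feature vector of an executable file to be learned is reconstructed in the feature selection step, performing machine learning on the executable file to be learned based on the feature vector and information on whether the feature vector of the executable file to be learned is malware, and, when the feature vector of an executable file to be determined is reconstructed in the feature selection step, calculating a malware-likeness score for the executable file to be determined based on the result of the machine learning and the feature vector; and
a determination step of determining, based on the score of the executable file to be determined calculated in the classification step, whether the executable file to be determined is malware.
- A program for causing a computer to function as:
storage means including an attribute table and an attribute value table;
feature selection setting means that, when an attribute name of an attribute of an executable file is input, registers the attribute having the input attribute name in the attribute table as an attribute to be extracted, and, when an attribute value of an attribute of an executable file is input, registers the input attribute value in the attribute value table as an attribute value to be deleted or an attribute value not to be deleted;
feature extraction means that, when an executable file is input, extracts from the executable file the attribute values of the attributes registered in the attribute table as extraction targets, and generates a feature vector including the extracted attribute values as features;
feature selection means that reconstructs the feature vector generated by the feature extraction means by deleting from it the attribute values registered in the attribute value table as deletion targets, or by deleting attribute values other than those registered as not to be deleted;
classification means that, when the feature selection means reconstructs the feature vector of an executable file to be learned, performs machine learning on the executable file to be learned based on the feature vector and information on whether the feature vector of the executable file to be learned is malware, and, when the feature selection means reconstructs the feature vector of an executable file to be determined, calculates a malware-likeness score for the executable file to be determined based on the result of the machine learning and the feature vector; and
determination means that determines, based on the score of the executable file to be determined calculated by the classification means, whether the executable file to be determined is malware.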
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201580030423.0A CN106471516B (zh) | 2014-06-11 | 2015-06-08 | Malware determination device, malware determination system, malware determination method, and program |
US15/315,903 US10268820B2 (en) | 2014-06-11 | 2015-06-08 | Malware determination device, malware determination system, malware determination method, and program |
EP15806424.6A EP3139297B1 (en) | 2014-06-11 | 2015-06-08 | Malware determination device, malware determination system, malware determination method, and program |
JP2016527799A JP6018345B2 (ja) | 2014-06-11 | 2015-06-08 | Malware determination device, malware determination system, malware determination method, and program |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2014120428 | 2014-06-11 | ||
JP2014-120428 | 2014-06-11 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2015190446A1 (ja) | 2015-12-17 |
Family
ID=54833537
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2015/066527 WO2015190446A1 (ja) | Malware determination device, malware determination system, malware determination method, and program | 2014-06-11 | 2015-06-08 |
Country Status (5)
Country | Link |
---|---|
US (1) | US10268820B2 (ja) |
EP (1) | EP3139297B1 (ja) |
JP (1) | JP6018345B2 (ja) |
CN (1) | CN106471516B (ja) |
WO (1) | WO2015190446A1 (ja) |
-
2015
- 2015-06-08 US US15/315,903 patent/US10268820B2/en active Active
- 2015-06-08 CN CN201580030423.0A patent/CN106471516B/zh active Active
- 2015-06-08 EP EP15806424.6A patent/EP3139297B1/en active Active
- 2015-06-08 JP JP2016527799A patent/JP6018345B2/ja active Active
- 2015-06-08 WO PCT/JP2015/066527 patent/WO2015190446A1/ja active Application Filing
Also Published As
Publication number | Publication date |
---|---|
EP3139297A1 (en) | 2017-03-08 |
JPWO2015190446A1 (ja) | 2017-04-20 |
US10268820B2 (en) | 2019-04-23 |
JP6018345B2 (ja) | 2016-11-02 |
CN106471516A (zh) | 2017-03-01 |
EP3139297A4 (en) | 2017-12-13 |
CN106471516B (zh) | 2019-06-11 |
EP3139297B1 (en) | 2019-04-03 |
US20170098074A1 (en) | 2017-04-06 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 15806424 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2016527799 Country of ref document: JP Kind code of ref document: A |
|
REEP | Request for entry into the european phase |
Ref document number: 2015806424 Country of ref document: EP |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2015806424 Country of ref document: EP |
|
WWE | Wipo information: entry into national phase |
Ref document number: 15315903 Country of ref document: US |
|
NENP | Non-entry into the national phase |
Ref country code: DE |