CN113761912A - Interpretable judging method and device for malicious software attribution attack organization - Google Patents

Interpretable judging method and device for malicious software attribution attack organization

Info

Publication number
CN113761912A
Authority
CN
China
Prior art keywords
features
malicious software
character string
code
attribution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110909793.9A
Other languages
Chinese (zh)
Other versions
CN113761912B (en)
Inventor
严寒冰
王琴琴
周彧
梅瑞
张永铮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Information Engineering of CAS
National Computer Network and Information Security Management Center
Original Assignee
Institute of Information Engineering of CAS
National Computer Network and Information Security Management Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Information Engineering of CAS, National Computer Network and Information Security Management Center
Priority to CN202110909793.9A
Publication of CN113761912A
Application granted
Publication of CN113761912B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Biology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses an interpretable method and device for determining the attack organization to which malicious software is attributed. Attack organization attribution is analyzed by extracting code features and character string features of the malicious software; together, these two kinds of features combine the static and dynamic characteristics of the malicious software, so the feature set is more comprehensive. The features are vectorized using natural language processing techniques, and a model interpretation technique is used to interpret the classifier results, making the classification results more convincing. The invention thereby effectively solves the problem in the prior art that the attack organization attribution of malicious software cannot be analyzed comprehensively.

Description

Interpretable judging method and device for malicious software attribution attack organization
Technical Field
The invention relates to the field of computer technology, and in particular to an interpretable determination method and device for attributing malicious software to an attack organization.
Background
Attack organizations often build malware to carry out cyber attacks. The advanced persistent threat (APT) attack, also known as a targeted threat attack, is one type of cyber attack. An APT attack is a covert and persistent computer intrusion process, usually carefully planned by its operators against a specific target. It is usually motivated by commercial or political interests, is directed at a specific organization or country, and requires a high degree of concealment to be maintained over a long period. Cyber attacks pose a serious threat to cyberspace security. It is therefore necessary to analyze network attacks and to study attack organizations, and this work relies on the analysis of malware. However, existing approaches to attributing malware to attack organizations select only a single kind of feature, so the attribution features are not comprehensive enough.
Disclosure of Invention
The invention provides an interpretable determination method and device for attributing malware to an attack organization, aiming to solve the problem in the prior art that the attack organization attribution of malware cannot be analyzed comprehensively.
In a first aspect, the invention provides an interpretable determination method for attributing malware to an attack organization, the method comprising: extracting code features of the malware, preprocessing the code features, and vectorizing the preprocessed code features, wherein the code features are malware features with the function as the unit; extracting character string features of the malware, preprocessing the character string features, and vectorizing the preprocessed character string features; and attributing the malware to an attack organization based on the vectorized code features and character string features, and separately interpreting the classification results of the code features and of the character string features.
Optionally, extracting the code features of the malware comprises: extracting metadata of the malware and converting the metadata into an intermediate representation (IR); the metadata comprises the hash of the malware, the compiler, and, for each function, the function name, control flow graph (CFG), basic blocks, and bytecode.
Optionally, preprocessing the code features comprises: generalizing the low-frequency words produced by the IR conversion, and converting all functions of the malware into sequential text according to the program call graph.
Optionally, vectorizing the preprocessed code features comprises: generating a function vector for the text of each function using the PV-DM algorithm.
Optionally, extracting the character string features of the malware comprises: retrieving a behavior report of the malware using the hash value of the malware.
Optionally, preprocessing the character string features comprises: segmenting the text in the behavior report into words.
Optionally, attributing the malware to an attack organization based on the vectorized code features and character string features comprises: obtaining the classification probability of each vectorized code feature through a random forest classifier; obtaining the classification probability of the vectorized character string features through a DNN classifier; and integrating the classification probabilities of the multiple code features with the classification probability of the character string features to obtain the final classification result for the malware.
Optionally, separately interpreting the code feature and character string feature classification results comprises: interpreting the code feature classification result through the random forest, and interpreting the character string feature classification result through LIME.
In a second aspect, the invention provides an interpretable determination device for attributing malware to an attack organization, the device comprising: a first processing unit for extracting code features of the malware, preprocessing the code features, and vectorizing the preprocessed code features, wherein the code features are malware features with the function as the unit; a second processing unit for extracting character string features of the malware, preprocessing the character string features, and vectorizing the preprocessed character string features; and a third processing unit for attributing the malware to an attack organization based on the vectorized code features and character string features, and separately interpreting the classification results of the code features and of the character string features.
In a third aspect, the invention provides a computer-readable storage medium storing a signal-mapped computer program which, when executed by at least one processor, implements any one of the above interpretable determination methods for attributing malware to an attack organization.
The invention has the following beneficial effects:
The attack organization attribution of the malware is analyzed by extracting the code features and character string features of the malware. Because these two kinds of features combine the static and dynamic characteristics of the malware, the feature set is more comprehensive. The features are vectorized using natural language processing techniques, and the classifier results are interpreted using a model interpretation technique, making the classification results more convincing. This effectively solves the problem in the prior art that the attack organization attribution of malware cannot be analyzed comprehensively.
The foregoing is only an overview of the technical solution of the invention. Embodiments of the invention are described below so that its technical means can be understood more clearly and its above and other objects, features, and advantages become more apparent.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
Fig. 1 is a flowchart illustrating an interpretable determination method for attributing malware to an attack organization according to a first embodiment of the present invention;
Fig. 2 is a schematic structural diagram of an interpretable determination device for attributing malware to an attack organization according to a first embodiment of the present invention.
Detailed Description
To address the problem in the prior art that the attack organization attribution of malware cannot be analyzed comprehensively, the invention analyzes the attribution by extracting code features and character string features of the malware. Because these two kinds of features combine the static and dynamic characteristics of the malware, the feature set is more comprehensive; the features are vectorized using natural language processing techniques. The invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here merely illustrate the invention and do not limit it.
A first embodiment of the invention provides an interpretable determination method for attributing malware to an attack organization. Referring to Fig. 1, the method includes:
S101, extracting code features of the malware, preprocessing the code features, and vectorizing the preprocessed code features, wherein the code features are malware features with the function as the unit;
In a specific implementation, metadata is extracted from the malware and converted into an intermediate representation (IR); the low-frequency words in the converted IR are then generalized, all functions of the malware are converted into sequential text according to the program call graph, and finally a function vector is generated for the text of each function using the PV-DM algorithm.
The metadata in this embodiment includes the hash of the malware, the compiler, and, for each function, the function name, control flow graph (CFG), basic blocks, and bytecode.
Specifically, code feature extraction and preprocessing in this embodiment proceed as follows. Code features are malware features with the function as the unit. To obtain function information for the malware, IDAPython with IDA Pro, or another code analysis tool, is used to extract metadata, which includes the hash and compiler of the malware and the function name, control flow graph (CFG), basic blocks, and bytecode of each function. Library functions are excluded. To reduce the differences introduced by different platforms and compiler options, the invention uses VEX (or another intermediate representation conversion tool) to convert the code into an intermediate representation (IR). The bytecode is thus converted into VEX IR, with the compiler information used as conversion parameters.
Vectorization of the code features in this embodiment proceeds as follows. The invention uses the PV-DM paragraph vector algorithm to vectorize functions, generating one vector per function. The input of the paragraph vector algorithm is a document, whereas a function is structured as a CFG, so the invention generates a document for each function; the specific algorithm is given in the figure below. The paragraph vector algorithm then takes this document as input when generating the function vector, with each VEX IR statement treated as one word. To reduce the influence of low-frequency words in the documents on the results, these words are generalized: constants are replaced by <num>, temporary variables by <tmp>, character strings by <str>, function names by <func>, registers by <reg>, and other low-frequency words by <other>.
[Figure BDA0003203098430000051: algorithm for generating a document for each function; image not reproduced]
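The generalization step described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the register list, the `min_count` threshold, and the function names `generalize_statement` and `replace_rare_words` are hypothetical and do not come from the patent, and the statement syntax is a simplified stand-in for real VEX IR output.

```python
import re
from collections import Counter

# Hypothetical subset of register names; a real list depends on the architecture.
REGISTERS = {"rax", "rbx", "rcx", "rdx", "rsp", "rbp", "rip"}

# One pass over string literals, hex/decimal constants, and identifiers.
TOKEN = re.compile(r'"[^"]*"|\b0x[0-9a-fA-F]+\b|\b\d+\b|\b[A-Za-z_]\w*\b')
TEMPORARY = re.compile(r"t\d+")


def generalize_statement(stmt, function_names=frozenset()):
    """Replace the operands of one IR statement with placeholder tokens."""
    def swap(match):
        tok = match.group(0)
        if tok.startswith('"'):
            return "<str>"
        if tok[0].isdigit():
            return "<num>"
        if TEMPORARY.fullmatch(tok):
            return "<tmp>"
        if tok in REGISTERS:
            return "<reg>"
        if tok in function_names:
            return "<func>"
        return tok  # opcodes and type names are left unchanged
    return TOKEN.sub(swap, stmt)


def replace_rare_words(documents, min_count=2):
    """Replace words (whole statements) seen fewer than min_count times by <other>.

    The threshold is an assumed parameter; the patent does not state one.
    """
    counts = Counter(word for doc in documents for word in doc)
    return [[w if counts[w] >= min_count else "<other>" for w in doc]
            for doc in documents]
```

Each function document would then be a list of generalized statements, which is the form a paragraph vector implementation expects as a tokenized document.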
S102, extracting character string features of malicious software, preprocessing the character string features, and vectorizing the preprocessed character string features;
Specifically, in this embodiment the behavior report of the malware is retrieved using the hash value of the malware, and the text in the behavior report is then segmented into words.
S103, attack organization attribution of malicious software is carried out on the basis of the vectorized code features and the character string features, and classification results of the code features and the character string features are respectively explained.
That is, in this embodiment a random forest classifier is used to obtain the classification probability of each vectorized code feature, a DNN classifier is used to obtain the classification probability of the vectorized character string features, and the classification probabilities of the multiple code features and of the character string features are finally integrated to obtain the final classification result for the malware.
Separately interpreting the code feature and character string feature classification results in this embodiment means interpreting the code feature classification result through the random forest and interpreting the character string feature classification result through LIME.
In a specific implementation, character string feature extraction and preprocessing in this embodiment proceed as follows. String features are the dynamic behavior reports of the malware. Based on the hash value, the dynamic behavior report of the malware can be downloaded from VirusTotal or obtained from a Cuckoo sandbox. The dynamic behavior report is a JSON file.
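As a rough illustration of turning such a JSON report into text for later word segmentation, the sketch below flattens every key and scalar value into one string. The function name `report_to_text` and the idea of including keys alongside values are assumptions made for the example; the patent does not specify which parts of the report are used.

```python
import json


def report_to_text(report_json):
    """Collect every key and scalar value of a JSON report into one text string."""
    parts = []

    def walk(node):
        if isinstance(node, dict):
            for key, value in node.items():
                parts.append(str(key))
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)
        elif node is not None:
            parts.append(str(node))

    walk(json.loads(report_json))
    return " ".join(parts)
```

Reading the report from disk would simply pass the file contents to `report_to_text`; the schema of a real VirusTotal or Cuckoo report is not assumed here.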
Character string feature vectorization in this embodiment proceeds as follows. The string features are vectorized with a method common in natural language processing, namely one-hot vectors. The file content is first segmented into words using NLTK; because the content contains a large number of special characters, the text is also split on those special characters. One-hot encoding is then used to generate a vector for the report features.
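A minimal sketch of this segmentation and encoding is given below. It substitutes a plain regular-expression split for NLTK so that it runs with the standard library alone, and it interprets the report vector as a binary presence vector over the vocabulary; both choices, along with the function names, are assumptions rather than details confirmed by the patent.

```python
import re


def tokenize(text):
    """Split text on every run of non-alphanumeric (special) characters."""
    return [tok for tok in re.split(r"[^0-9A-Za-z]+", text) if tok]


def build_vocabulary(token_lists):
    """Assign each distinct token an index in order of first appearance."""
    vocab = {}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab))
    return vocab


def report_vector(tokens, vocab):
    """Binary vector with a 1 at the index of every vocabulary token present."""
    vec = [0] * len(vocab)
    for tok in tokens:
        idx = vocab.get(tok)
        if idx is not None:  # tokens unseen during vocabulary building are skipped
            vec[idx] = 1
    return vec
```

The vocabulary would be built from the training reports only, so that reports seen at prediction time map onto the same vector positions.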
Attack organization attribution in this embodiment proceeds as follows. Attribution is implemented with classifiers. The code features use a random forest classifier with function vectors as input; because a malware sample has multiple functions, the classifier yields a classification probability for each function. The character string features use a DNN classifier with the report vector as input, yielding the report classification probability. The classification probabilities of the functions and the report classification probability are then integrated to obtain the final classification result for the malware. The specific steps are as follows:
Assume that a binary file has n functions $f_1, f_2, \ldots, f_n$. The function vectors are, respectively,
[Equation image BDA0003203098430000061 not reproduced: the n function vectors]
The report vector is $v_r$. The output of a classifier is $P = \langle p_1, p_2, \ldots, p_m \rangle$, where m is the number of attack organizations and $p_1$ is the probability that the classifier predicts the input to belong to the first attack organization. According to the method of the invention, the function vectors and the report vector are used as input, and the prediction results are obtained from the corresponding classifiers:
[Equation image BDA0003203098430000062 not reproduced: the prediction results for the functions and the report]
Combining the prediction results, the prediction probability of the malware is $P = \langle P_1, P_2, \ldots, P_m \rangle$, where
[Equation image BDA0003203098430000063 not reproduced: the formula for the combined probability]
and [symbol image BDA0003203098430000064 not reproduced] denotes the $p_1$ in [symbol image BDA0003203098430000065 not reproduced]. The attack organization corresponding to the maximum probability value in $P$ is the final attack organization attribution result for the malware.
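The integration step can be illustrated with the sketch below. The patent's combining formula is not legible in this text, so the code assumes a plain arithmetic mean over the n function predictions and the one report prediction; that choice, the function names, and the organization labels are all assumptions made for the example.

```python
def integrate_predictions(function_probs, report_probs):
    """Average the per-function and report probability vectors class by class.

    ASSUMPTION: an unweighted mean is used here; the patent's actual
    integration formula is not reproduced in the source text.
    """
    all_probs = list(function_probs) + [report_probs]
    m = len(report_probs)
    if any(len(p) != m for p in all_probs):
        raise ValueError("all probability vectors must cover the same m classes")
    return [sum(p[j] for p in all_probs) / len(all_probs) for j in range(m)]


def attribute(function_probs, report_probs, organizations):
    """Return the organization with the maximum combined probability."""
    combined = integrate_predictions(function_probs, report_probs)
    best = max(range(len(combined)), key=combined.__getitem__)
    return organizations[best], combined
```

For example, with two functions predicted as `[0.5, 0.5]` and `[0.25, 0.75]` and a report predicted as `[0.0, 1.0]`, the combined vector under this assumed mean is `[0.25, 0.75]`, so the second organization is selected.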
It should be noted that model interpretation in this embodiment means interpreting the classification result, that is, identifying which features led the classification model to its decision. The random forest classifier is naturally interpretable. For the DNN classifier, LIME is used for model interpretation.
Given the attack organization attribution result for the malware, the functions are sorted in descending order of the predicted probability of that attack organization in their function prediction results. The top-ranked functions are the interpretation result of the code feature attribution model and are the key functions that deserve close attention. In addition, to find which functions of the attack organization are similar to these key functions, the similarity between functions is computed as the cosine distance between function vectors: the smaller the distance, the more similar the functions.
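The ranking and similarity search described above might look like the following sketch. The names `key_functions`, `cosine_distance`, and `most_similar`, the `top_k` parameter, and the dictionary-based data layout are illustrative assumptions, not details from the patent.

```python
import math


def key_functions(function_probs, org_index, top_k=3):
    """Rank functions by their predicted probability for the attributed organization."""
    ranked = sorted(function_probs.items(),
                    key=lambda item: item[1][org_index],
                    reverse=True)
    return [name for name, _ in ranked[:top_k]]


def cosine_distance(u, v):
    """Cosine distance, 1 - cos(u, v); smaller means more similar."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def most_similar(query_vec, known_vectors):
    """Name of the known function whose vector is closest to query_vec."""
    return min(known_vectors,
               key=lambda name: cosine_distance(query_vec, known_vectors[name]))
```

Here `function_probs` maps each function name to its per-organization probability vector, and `known_vectors` maps the names of functions from previously attributed samples to their function vectors.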
The character string feature attribution model is interpreted using LIME. The output of LIME is a ranking of features with their corresponding contribution values, where each contribution value represents the contribution of that feature to the classification.
In general, the interpretable attack organization attribution method provided by the invention can attribute suspicious malware to an attack organization and give network security analysts the features that were important to the attribution. This provides an important basis for analyzing the activity of attack organizations and for threat intelligence.
The method according to an embodiment of the invention will be explained and illustrated in detail below by means of a specific example:
This embodiment of the invention provides an interpretable method for attributing malware to an attack organization, which comprises the following steps:
Extracting code features of the malware: extracting metadata of the malware and then converting the metadata into an intermediate representation (IR);
Preprocessing the code features: generalizing the low-frequency words in the IR and converting all functions of the malware into sequential text according to the program call graph;
Vectorizing the code features: generating a function vector for the text of each function using the PV-DM (distributed memory model of paragraph vectors) algorithm;
Extracting character string features of the malware: obtaining a behavior report from VirusTotal using the hash value of the malware;
Preprocessing the character string features: segmenting the text in the behavior report into words, using NLTK together with splitting on special characters;
Vectorizing the character string features: generating a report vector for the behavior report using one-hot encoding;
Attributing the malware to an attack organization: for the code features, the function vectors are used as input and a random forest classifier performs attack organization classification; for the character string features, the report vector is used as input and a DNN (deep neural network) classifier performs attack organization classification; the classification results of the code features and the character string features are then integrated to obtain the final attack organization attribution result for the malware;
Model interpretation: interpreting the code feature and character string feature classification results using the random forest and LIME (Local Interpretable Model-agnostic Explanations), respectively.
In general, this embodiment extracts both the code features and the character string features of the malware and thereby combines its static and dynamic characteristics, so the feature set is more comprehensive. The features are vectorized using natural language processing techniques: PV-DM vectorizes the functions and one-hot encoding vectorizes the behavior reports, which allows the semantics of the malware to be expressed well. The classifier results are interpreted using model interpretation techniques, making the classification results more convincing.
A second embodiment of the invention provides an interpretable determination device for attributing malware to an attack organization. Referring to Fig. 2, the device includes: a first processing unit for extracting code features of the malware, preprocessing the code features, and vectorizing the preprocessed code features, wherein the code features are malware features with the function as the unit; a second processing unit for extracting character string features of the malware, preprocessing the character string features, and vectorizing the preprocessed character string features; and a third processing unit for attributing the malware to an attack organization based on the vectorized code features and character string features, and separately interpreting the classification results of the code features and of the character string features.
The device provided by this embodiment extracts both the code features and the character string features of the malware, so that attack organization attribution is analyzed using the combined static and dynamic characteristics of the malware, which ultimately enables accurate attribution analysis.
For the relevant details of this embodiment, refer to the first embodiment of the invention; they are not repeated here.
A third embodiment of the invention provides a computer-readable storage medium storing a signal-mapped computer program which, when executed by at least one processor, implements the interpretable determination method for attributing malware to an attack organization according to any variant of the first embodiment of the invention.
For the relevant details of this embodiment, refer to the first embodiment of the invention; they are not repeated here.
Although preferred embodiments of the invention have been disclosed for illustrative purposes, those skilled in the art will appreciate that various modifications, additions, and substitutions are possible, and the scope of the invention should not be limited to the embodiments described above.

Claims (10)

1. An interpretable decision method for a malware home attack organization, comprising:
extracting code features of the malicious software, preprocessing the code features, and vectorizing the preprocessed code features, wherein the code features are the malicious software features taking a function as a unit;
extracting character string features of malicious software, preprocessing the character string features, and vectorizing the preprocessed character string features;
and performing attack organization attribution of the malicious software based on the vectorized code features and the character string features, and respectively interpreting the classification results of the code features and the character string features.
2. The method of claim 1, wherein extracting code features of malware comprises:
extracting metadata of the malicious software, and converting the metadata into IR intermediate representation;
the metadata comprises a hash of the malware, a compiler, a function name of each function, a control flow graph CFG, basic blocks and byte codes.
3. The method of claim 2, wherein preprocessing the code features comprises:
and generalizing the low-frequency words after the IR intermediate representation conversion, and converting all functions of the malicious software into sequential texts according to the program call graph.
4. The method of claim 1, wherein vectorizing the pre-processed code features comprises:
a function vector is generated for the text of each function using the PV-DM algorithm.
5. The method according to any one of claims 1-4, wherein the extracting character string features of the malware comprises:
and extracting a behavior report of the malicious software through a hash value of the malicious software.
6. The method of claim 5, wherein preprocessing the string features comprises:
and segmenting the text in the behavior report.
7. The method according to any one of claims 1-4, wherein the vectorized code feature and character string feature based attack organization attribution of malware comprises:
obtaining the classification probability of each vectorized code feature through a random forest classifier;
obtaining the classification probability of the character string characteristics after vectorization through a DNN classifier;
and integrating the classification probabilities of the multiple code characteristics and the classification probabilities of the character string characteristics to obtain a final classification result of the malicious software.
8. The method of claim 7, wherein interpreting the code feature and the string feature classification results separately comprises:
and interpreting the code feature classification result through a random forest, and interpreting the character string feature classification result through LIME.
9. An interpretable decision apparatus for a malware home attack organization, comprising:
the first processing unit is used for extracting code features of malicious software, preprocessing the code features and vectorizing the preprocessed code features, wherein the code features are the malicious software features taking a function as a unit;
the second processing unit is used for extracting character string features of the malicious software, preprocessing the character string features and vectorizing the preprocessed character string features;
and the third processing unit is used for performing attack organization attribution of the malicious software based on the vectorized code features and the character string features and respectively interpreting the code features and the character string feature classification results.
10. A computer-readable storage medium, characterized in that it stores a signal-mapped computer program which, when executed by at least one processor, implements the interpretable judging method for malicious software attribution attack organization of any one of claims 1-8.
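For illustration only, the integration step recited in claim 7 can be sketched as follows. The claims do not state how the classification probabilities are integrated, so this sketch assumes an unweighted average (soft voting) over the per-feature outputs. The function names, the labels `GroupA` and `GroupB`, and all probability values are hypothetical, and the dictionaries stand in for the outputs of the random forest classifiers and the DNN classifier.

```python
from typing import Dict, List

# Maps an attack-organization label to a classification probability.
Probs = Dict[str, float]


def fuse_probabilities(code_probs: List[Probs], string_probs: Probs) -> Probs:
    """Average the per-classifier probabilities (assumed soft-voting rule)."""
    votes = code_probs + [string_probs]
    labels = sorted({label for probs in votes for label in probs})
    return {
        label: sum(probs.get(label, 0.0) for probs in votes) / len(votes)
        for label in labels
    }


def attribute(code_probs: List[Probs], string_probs: Probs) -> str:
    """Return the label with the highest fused probability."""
    fused = fuse_probabilities(code_probs, string_probs)
    return max(fused, key=fused.get)


# Hypothetical outputs: one entry per code feature (random forest classifiers)
# and one entry for the character string features (DNN classifier).
code_probs = [
    {"GroupA": 0.7, "GroupB": 0.3},
    {"GroupA": 0.4, "GroupB": 0.6},
]
string_probs = {"GroupA": 0.8, "GroupB": 0.2}

fused = fuse_probabilities(code_probs, string_probs)
result = attribute(code_probs, string_probs)
```

Under these assumptions every classifier carries equal weight; a weighted average or a learned combiner would be an equally plausible reading of "integrating".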
CN202110909793.9A 2021-08-09 2021-08-09 Interpretable judging method and device for malicious software attribution attack organization Active CN113761912B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110909793.9A CN113761912B (en) 2021-08-09 2021-08-09 Interpretable judging method and device for malicious software attribution attack organization

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110909793.9A CN113761912B (en) 2021-08-09 2021-08-09 Interpretable judging method and device for malicious software attribution attack organization

Publications (2)

Publication Number Publication Date
CN113761912A true CN113761912A (en) 2021-12-07
CN113761912B CN113761912B (en) 2024-04-16

Family

ID=78788789

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110909793.9A (Active, CN113761912B (en)) Interpretable judging method and device for malicious software attribution attack organization 2021-08-09 2021-08-09

Country Status (1)

Country Link
CN (1) CN113761912B (en)

Citations (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2009037545A (en) * 2007-08-03 2009-02-19 National Institute Of Information & Communication Technology Malware resemblance inspection method and device
CN105205397A (en) * 2015-10-13 2015-12-30 北京奇虎科技有限公司 Rogue program sample classification method and device
CN105653956A (en) * 2016-03-02 2016-06-08 中国科学院信息工程研究所 Android malicious software sorting method based on dynamic behavior dependency graph
US20160357965A1 (en) * 2015-06-04 2016-12-08 Ut Battelle, Llc Automatic clustering of malware variants based on structured control flow
CN107153789A (en) * 2017-04-24 2017-09-12 西安电子科技大学 The method for detecting Android Malware in real time using random forest grader
CN107169355A (en) * 2017-04-28 2017-09-15 北京理工大学 A kind of worm homology analysis method and apparatus
CN107247902A (en) * 2017-05-10 2017-10-13 深信服科技股份有限公司 Malware categorizing system and method
KR101880686B1 (en) * 2018-02-28 2018-07-20 에스지에이솔루션즈 주식회사 A malware code detecting system based on AI(Artificial Intelligence) deep learning
US20190068620A1 (en) * 2017-08-30 2019-02-28 International Business Machines Corporation Detecting malware attacks using extracted behavioral features
CN109784059A (en) * 2019-01-11 2019-05-21 北京中睿天下信息技术有限公司 A kind of wooden horse file source tracing method, system and equipment
CN110135157A (en) * 2019-04-04 2019-08-16 国家计算机网络与信息安全管理中心 Malware homology analysis method, system, electronic equipment and storage medium
CN110222715A (en) * 2019-05-07 2019-09-10 国家计算机网络与信息安全管理中心 A kind of sample homogeneous assays method based on dynamic behaviour chain and behavioral characteristics
CN110704841A (en) * 2019-09-24 2020-01-17 北京电子科技学院 Convolutional neural network-based large-scale android malicious application detection system and method
CN110795732A (en) * 2019-10-10 2020-02-14 南京航空航天大学 SVM-based dynamic and static combination detection method for malicious codes of Android mobile network terminal
CN111552966A (en) * 2020-04-07 2020-08-18 哈尔滨工程大学 Malicious software homology detection method based on information fusion
CN111611583A (en) * 2020-04-08 2020-09-01 国家计算机网络与信息安全管理中心 Malicious code homology analysis method and malicious code homology analysis device
CN111639337A (en) * 2020-04-17 2020-09-08 中国科学院信息工程研究所 Unknown malicious code detection method and system for massive Windows software
CN112000952A (en) * 2020-07-29 2020-11-27 暨南大学 Author organization characteristic engineering method of Windows platform malicious software
RU2738344C1 (en) * 2020-03-10 2020-12-11 Общество с ограниченной ответственностью «Группа АйБи ТДС» Method and system for searching for similar malware based on results of their dynamic analysis
KR20210059991A (en) * 2019-11-18 2021-05-26 쿤텍 주식회사 METHOD FOR IoT ANALYZING MALICIOUS BEHAVIOR AND COMPUTING DEVICE FOR EXECUTING THE METHOD

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
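Xiu Yang (修扬): "Malware detection based on opcode sequence frequency vectors and behavior feature vectors" [基于操作码序列频率向量和行为特征向量的恶意软件检测], Information Security and Communications Privacy (信息安全与通信保密), no. 09, 10 September 2016 (2016-09-10), pages 97 - 101 *
Wu Xinsong (吴新松): "Artificial intelligence interpretability and evaluation methods" [人工智能可解释性与评估方法], Information Technology & Standardization (《信息技术与标准化》), no. 07, 10 July 2021 (2021-07-10), pages 21 - 26 *
Zhang Tao (张涛): "Malware family classification based on text embedding feature representation" [基于文本嵌入特征表示的恶意软件家族分类], Journal of Sichuan University (Natural Science Edition) (四川大学学报(自然科学版)), vol. 56, no. 03, 13 May 2019 (2019-05-13), pages 441 - 449 *
Xiong Zutao (熊祖涛): "Android malware detection method based on Adaboost" [基于Adaboost的Android恶意软件检测方法], Journal of Guizhou Education University (贵州师范学院学报), vol. 32, no. 03, 28 March 2016 (2016-03-28), pages 23 - 27 *
Miao Hong (苗红): "Research on industry convergence prediction based on an industry feature semantic matching model" [基于产业特征语义匹配模型的产业融合预测研究], Soft Science (《软科学》), vol. 35, no. 07, 15 July 2021 (2021-07-15), pages 16 - 24 *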

Also Published As

Publication number Publication date
CN113761912B (en) 2024-04-16

Similar Documents

Publication Publication Date Title
Sun et al. Deep learning and visualization for identifying malware families
US10609050B2 (en) Methods and systems for malware detection
Liu et al. Automatic malware classification and new malware detection using machine learning
CN111832019B (en) Malicious code detection method based on generation countermeasure network
Khammas et al. Feature selection and machine learning classification for malware detection
Ndichu et al. A machine learning approach to malicious JavaScript detection using fixed length vector representation
CN112241530B (en) Malicious PDF document detection method and electronic equipment
CN112329012B (en) Detection method for malicious PDF document containing JavaScript and electronic device
CN112651025A (en) Webshell detection method based on character-level embedded code
CN114297079A (en) XSS fuzzy test case generation method based on time convolution network
Wang et al. Malicious code classification based on opcode sequences and textCNN network
Tang et al. Bhmdc: A byte and hex n-gram based malware detection and classification method
Wang et al. File fragment type identification with convolutional neural networks
Tsai et al. PowerDP: de-obfuscating and profiling malicious PowerShell commands with multi-label classifiers
Khorsand et al. A novel compression-based approach for malware detection using PE header
CN110704611B (en) Illegal text recognition method and device based on feature de-interleaving
CN113918936A (en) SQL injection attack detection method and device
Anandhi et al. Performance evaluation of deep neural network on malware detection: visual feature approach
Bakhshinejad et al. A new compression based method for android malware detection using opcodes
Wang et al. Malware detection using cnn via word embedding in cloud computing infrastructure
CN113761912A (en) Interpretable judging method and device for malicious software attribution attack organization
Cybersecurity Machine learning for malware detection
Cho et al. Mal2d: 2d based deep learning model for malware detection using black and white binary image
CN114169540A (en) Webpage user behavior detection method and system based on improved machine learning
CN112733144A (en) Malicious program intelligent detection method based on deep learning technology

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant