CN108985055B - Malicious software detection method and system - Google Patents

Publication number
CN108985055B
Authority
CN
China
Prior art keywords
neural network
convolutional neural
training
weight
adopting
Prior art date
Legal status
Active
Application number
CN201810670997.XA
Other languages
Chinese (zh)
Other versions
CN108985055A (en)
Inventor
李丹
史闻博
赵立超
赵海杉
郑光聪
Current Assignee
Northeastern University Qinhuangdao Branch
Original Assignee
Northeastern University Qinhuangdao Branch
Priority date
2018-06-26
Filing date
2018-06-26
Publication date
2020-08-28
Application filed by Northeastern University Qinhuangdao Branch
Priority to CN201810670997.XA
Publication of CN108985055A
Application granted
Publication of CN108985055B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/562 Static detection
    • G06F21/563 Static detection by source code analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Virology (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a malware detection method and system. The method comprises: acquiring training samples, each being an executable program of software of a known type; decompiling and numbering the training samples to obtain processed source code; assigning a weight to each code in the processed source code using a particle swarm algorithm; training a convolutional neural network with the weighted source code as the input matrix and, during training, adjusting the weight of each code using the particle swarm algorithm to obtain a convolutional neural network with accurate output; and judging whether the accuracy of the convolutional neural network reaches a set value. If so, training stops and the trained convolutional neural network is used to detect the software under test; if not, the particle swarm algorithm continues to adjust the weight of each code and the convolutional neural network continues to be trained. The disclosed method and system are characterized by high detection accuracy.

Description

Malicious software detection method and system
Technical Field
The invention relates to the field of malicious software detection, in particular to a malicious software detection method and system.
Background
With the development of science and technology, malware is growing in both variety and complexity, and identifying it is becoming increasingly difficult.
Disclosure of Invention
The invention aims to provide a malware detection method and system characterized by high detection accuracy.
To achieve this purpose, the invention provides the following scheme:
A malware detection method, the method comprising:
obtaining training samples, wherein each training sample is an executable program of software of a known type, and the types include benign and malicious;
decompiling and numbering the training samples to obtain processed source code;
assigning a weight to each code in the processed source code using a particle swarm algorithm to obtain weighted source code;
training a convolutional neural network with the weighted source code as the input matrix and, during training, adjusting the weight of each code using the particle swarm algorithm to obtain a convolutional neural network with accurate output;
judging whether the accuracy of the convolutional neural network reaches a set value;
if so, stopping training the convolutional neural network and detecting the software under test with the trained convolutional neural network;
if not, continuing to adjust the weight of each code using the particle swarm algorithm and to train the convolutional neural network.
Optionally, assigning a weight to each code in the source code using the particle swarm algorithm to obtain the weighted source code specifically includes:
arranging the source code into an n × 1 matrix, in which each row represents one feature, and recording it as the initial feature matrix;
calculating a weight for each feature using the particle swarm algorithm;
multiplying each feature by its corresponding weight.
Optionally, training the convolutional neural network with the weighted source code as the input matrix specifically includes:
setting the inner-layer parameter top K of the convolutional neural network, where K = 3;
training the convolutional neural network with the weighted source code as the input matrix.
Optionally, adjusting the weight of each code in the source code using the particle swarm algorithm specifically includes:
adjusting the weight of each code in the source code using the particle swarm algorithm according to the accuracy;
multiplying the adjusted weight by the corresponding processed source code in the initial feature matrix.
Optionally, decompiling and numbering the training samples specifically includes:
decompiling the training samples using decompiling software;
numbering the source code obtained by decompilation according to the Dalvik code table.
The invention also provides a malware detection system, the system comprising:
a training sample acquisition module, configured to obtain training samples, wherein each training sample is an executable program of software of a known type, and the types include benign and malicious;
a training sample processing module, configured to decompile and number the training samples to obtain processed source code;
a weight calculation module, configured to assign a weight to each code in the processed source code using a particle swarm algorithm to obtain weighted source code;
a convolutional neural network training module, configured to train a convolutional neural network with the weighted source code as the input matrix and, during training, to adjust the weight of each code using the particle swarm algorithm to obtain a convolutional neural network with accurate output;
an accuracy judging module, configured to judge whether the accuracy of the convolutional neural network reaches a set value;
a detection module, configured to stop training the convolutional neural network when its accuracy reaches the set value and to detect the software under test with the trained convolutional neural network;
a weight adjusting module, configured to continue adjusting the weight of each code in the source code using the particle swarm algorithm and training the convolutional neural network when the accuracy does not reach the set value.
Optionally, the weight calculation module specifically includes:
an initial feature matrix determining unit, configured to arrange the source code into an n × 1 matrix, in which each row represents one feature, and to record it as the initial feature matrix;
a weight calculation unit, configured to calculate a weight for each feature using the particle swarm algorithm;
a weighting unit, configured to multiply each feature by its corresponding weight.
Optionally, the convolutional neural network training module specifically includes:
a parameter setting unit, configured to set the inner-layer parameter top K of the convolutional neural network, where K = 3;
a training unit, configured to train the convolutional neural network with the weighted source code as the input matrix.
Optionally, the weight adjusting module specifically includes:
a weight adjusting unit, configured to adjust the weight of each code in the source code using the particle swarm algorithm according to the accuracy;
an adjusted feature matrix determining unit, configured to multiply the adjusted weight by the corresponding processed source code in the initial feature matrix.
Optionally, the training sample processing module specifically includes:
a decompiling unit, configured to decompile the training samples using decompiling software;
a numbering unit, configured to number the source code obtained by decompilation according to the Dalvik code table.
According to the specific embodiments provided by the invention, the invention achieves the following technical effects: the malware detection method and system combine a convolutional neural network with an inertia-weight particle swarm algorithm. The inertia-weight particle swarm algorithm assigns weights to the numbered source code of the training samples used by the convolutional neural network, and then adjusts those weights according to the accuracy of the convolutional neural network until the accuracy of the trained network meets the requirement. As a result, the malware detection method and system provided by the invention are characterized by high detection accuracy.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the embodiments are briefly described below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a malware detection method according to an embodiment of the present invention;
Fig. 2 is a structural diagram of a malware detection system according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments that a person skilled in the art can derive from them without creative effort fall within the protection scope of the present invention.
The invention aims to provide a malware detection method and system characterized by high detection accuracy.
To make the above objects, features and advantages of the present invention easier to understand, the invention is described in further detail below with reference to the drawings and specific embodiments.
Fig. 1 is a flowchart of a malware detection method according to an embodiment of the present invention. As shown in Fig. 1, the malware detection method provided by the invention includes the following steps:
Step 101: obtaining training samples, wherein each training sample is an executable program of software of a known type, and the types include benign and malicious;
Step 102: decompiling and numbering the training samples to obtain processed source code;
Step 103: assigning a weight to each code in the processed source code using a particle swarm algorithm to obtain weighted source code;
Step 104: training a convolutional neural network with the weighted source code as the input matrix and, during training, adjusting the weight of each code using the particle swarm algorithm to obtain a convolutional neural network with accurate output;
Step 105: judging whether the accuracy of the convolutional neural network reaches a set value; the set value may be 90% or higher;
Step 106: when the accuracy of the convolutional neural network reaches the set value, stopping training the convolutional neural network and detecting the software under test with the trained convolutional neural network;
Step 107: when the accuracy of the convolutional neural network does not reach the set value, continuing to adjust the weight of each code using the particle swarm algorithm and to train the convolutional neural network.
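The control flow of steps 103 to 107 can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: `train_and_score` and `adjust_weights` are hypothetical callables standing in for convolutional neural network training (step 104) and the particle swarm adjustment (step 107), and the `max_rounds` cap is an added assumption so that the loop always terminates. The method as described simply repeats until the set accuracy is reached.

```python
from typing import Callable, List, Tuple

Weights = List[float]


def train_until_accurate(
    initial_weights: Weights,
    train_and_score: Callable[[Weights], float],
    adjust_weights: Callable[[Weights, float], Weights],
    target_accuracy: float = 0.90,   # "set value"; the text suggests 90% or higher
    max_rounds: int = 100,           # assumption: safety cap, not in the patent
) -> Tuple[Weights, float, int]:
    """Return (weights, accuracy, rounds_used) for the last evaluated weights."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    weights = list(initial_weights)
    for round_index in range(1, max_rounds + 1):
        # Step 104: train the network on the weighted input and measure accuracy.
        accuracy = train_and_score(weights)
        # Steps 105 and 106: stop once the set value is reached (or the cap is hit).
        if accuracy >= target_accuracy or round_index == max_rounds:
            return weights, accuracy, round_index
        # Step 107: otherwise adjust the weights and train again.
        weights = adjust_weights(weights, accuracy)
    raise AssertionError("unreachable")
```

Passing the two stages in as callables keeps the stopping rule separate from how the network is trained and how the swarm proposes new weights.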
Step 102 specifically includes:
decompiling the training samples using decompiling software;
numbering the source code obtained by decompilation according to the Dalvik code table.
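As an illustration of the numbering step, the sketch below maps instruction mnemonics in smali-style decompiled text to integers. It assumes that numbering according to the Dalvik code table means replacing each instruction with its Dalvik opcode value; the patent does not spell out the mapping or name the decompiling software. The table is only a short excerpt of the Dalvik opcode set, and the function name `number_instructions` and the sample input are hypothetical.

```python
from typing import Dict, List

# Excerpt of Dalvik opcode values (mnemonic -> opcode byte).
DALVIK_OPCODES: Dict[str, int] = {
    "nop": 0x00,
    "move": 0x01,
    "return-void": 0x0E,
    "const/4": 0x12,
    "const-string": 0x1A,
    "invoke-virtual": 0x6E,
    "invoke-direct": 0x70,
    "invoke-static": 0x71,
}


def number_instructions(smali_text: str) -> List[int]:
    """Turn smali-style text into a list of opcode numbers.

    Directives (.method, .locals), comments (#) and labels (:) are skipped.
    Mnemonics missing from the excerpt table are also skipped (assumption).
    """
    codes: List[int] = []
    for raw_line in smali_text.splitlines():
        line = raw_line.strip()
        if not line or line[0] in ".#:":
            continue
        mnemonic = line.split()[0]
        if mnemonic in DALVIK_OPCODES:
            codes.append(DALVIK_OPCODES[mnemonic])
    return codes


SAMPLE = """
.method public constructor <init>()V
    .locals 1
    invoke-direct {p0}, Ljava/lang/Object;-><init>()V
    const-string v0, "hi"
    return-void
.end method
"""

print(number_instructions(SAMPLE))  # [112, 26, 14]
```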
Step 103 specifically includes:
arranging the source code into an n × 1 matrix, in which each row represents one feature, and recording it as the initial feature matrix;
calculating a weight for each feature using the particle swarm algorithm;
multiplying each feature by its corresponding weight.
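The weighting in step 103 amounts to an element-wise product between the n × 1 feature matrix and a weight vector of length n. A minimal sketch, using plain Python lists and hypothetical helper names, with weight values made up for illustration:

```python
from typing import List, Sequence


def to_feature_column(codes: Sequence[int]) -> List[List[float]]:
    """Arrange n numbered codes as an n x 1 matrix (one feature per row)."""
    return [[float(code)] for code in codes]


def apply_weights(column: List[List[float]], weights: Sequence[float]) -> List[List[float]]:
    """Multiply each feature row by its corresponding weight."""
    if len(column) != len(weights):
        raise ValueError("need exactly one weight per feature row")
    return [[row[0] * w] for row, w in zip(column, weights)]


initial_matrix = to_feature_column([112, 26, 14])      # initial feature matrix
weighted_matrix = apply_weights(initial_matrix, [0.5, 1.0, 2.0])
print(weighted_matrix)  # [[56.0], [26.0], [28.0]]
```

Keeping `initial_matrix` unchanged matters because later adjustments multiply new weights by the initial feature matrix rather than by an already weighted one.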
Step 104 specifically includes:
setting the inner-layer parameter top K of the convolutional neural network, where K = 3;
training the convolutional neural network with the weighted source code as the input matrix.
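The patent does not define the inner-layer parameter top K. One possible reading, assumed here purely for illustration, is k-max pooling, which keeps the K largest activations of a feature map in their original order. The sketch below applies a one-dimensional convolution to the weighted n × 1 input, then a ReLU, then k-max pooling with K = 3. The kernel values and function names are hypothetical, and a real network would learn its kernels during training.

```python
from typing import List, Sequence


def conv1d_valid(signal: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Slide the kernel over the signal without padding (cross-correlation)."""
    width = len(kernel)
    return [
        sum(signal[start + j] * kernel[j] for j in range(width))
        for start in range(len(signal) - width + 1)
    ]


def relu(values: Sequence[float]) -> List[float]:
    """Replace negative activations with zero."""
    return [max(0.0, v) for v in values]


def k_max_pooling(values: Sequence[float], k: int = 3) -> List[float]:
    """Keep the k largest values, preserving their original order."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_indices = sorted(range(len(values)), key=lambda i: values[i], reverse=True)[:k]
    return [values[i] for i in sorted(top_indices)]


weighted_column = [3.0, 1.0, 5.0, 2.0, 4.0]   # flattened n x 1 weighted input
feature_map = relu(conv1d_valid(weighted_column, [1.0, 1.0]))
print(feature_map)                    # [4.0, 6.0, 7.0, 6.0]
print(k_max_pooling(feature_map, 3))  # [6.0, 7.0, 6.0]
```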
Adjusting the weight of each code in the source code using the particle swarm algorithm specifically includes:
adjusting the weight of each code in the source code using the particle swarm algorithm according to the accuracy;
multiplying the adjusted weight by the corresponding processed source code in the initial feature matrix.
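The weight adjustment can be illustrated with a standard inertia-weight particle swarm optimizer. In the sketch below each particle is one candidate weight vector, and the fitness function stands in for the accuracy reported by the convolutional neural network. The inertia weight 0.7, the acceleration coefficients 1.5, the swarm size, the clamping of weights to [0, 1], and the toy fitness function in the usage lines are all assumptions chosen for illustration; the patent does not give these values.

```python
import random
from typing import Callable, List, Sequence, Tuple


def pso_maximize(
    fitness: Callable[[Sequence[float]], float],
    dim: int,
    n_particles: int = 10,
    iterations: int = 30,
    inertia: float = 0.7,   # inertia weight (assumed value)
    c1: float = 1.5,        # pull toward each particle's own best (assumed)
    c2: float = 1.5,        # pull toward the swarm's best (assumed)
    seed: int = 0,
) -> Tuple[List[float], float]:
    """Return the best weight vector found and its fitness."""
    rng = random.Random(seed)
    positions = [[rng.random() for _ in range(dim)] for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in positions]
    best_fit = [fitness(p) for p in positions]
    leader = max(range(n_particles), key=lambda i: best_fit[i])
    global_pos, global_fit = best_pos[leader][:], best_fit[leader]

    for _ in range(iterations):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia-weight update: keep part of the old velocity, then
                # steer toward the personal best and the global best.
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + c1 * r1 * (best_pos[i][d] - positions[i][d])
                    + c2 * r2 * (global_pos[d] - positions[i][d])
                )
                moved = positions[i][d] + velocities[i][d]
                positions[i][d] = min(1.0, max(0.0, moved))  # clamp (assumption)
            fit = fitness(positions[i])
            if fit > best_fit[i]:
                best_fit[i], best_pos[i] = fit, positions[i][:]
                if fit > global_fit:
                    global_fit, global_pos = fit, positions[i][:]
    return global_pos, global_fit


def toy_fitness(weights: Sequence[float]) -> float:
    """Stand-in for CNN accuracy: highest when every weight equals 0.5."""
    return -sum((w - 0.5) ** 2 for w in weights)


best_weights, best_score = pso_maximize(toy_fitness, dim=3)
```

In the setting described above, the fitness call would multiply the initial feature matrix by the candidate weights, train the network, and return its accuracy, so each evaluation is expensive and the swarm size and iteration count would need to be chosen with that cost in mind.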
The malware detection method combines a convolutional neural network with an inertia-weight particle swarm algorithm. The inertia-weight particle swarm algorithm assigns weights to the numbered source code of the training samples used by the convolutional neural network, and then adjusts those weights according to the accuracy of the convolutional neural network until the accuracy of the trained network meets the requirement. As a result, the malware detection method and system are characterized by high detection accuracy.
Fig. 2 is a structural diagram of a malware detection system according to an embodiment of the present invention. As shown in Fig. 2, the invention further provides a malware detection system, which includes:
a training sample acquisition module 201, configured to obtain training samples, wherein each training sample is an executable program of software of a known type, and the types include benign and malicious;
a training sample processing module 202, configured to decompile and number the training samples to obtain processed source code;
a weight calculation module 203, configured to assign a weight to each code in the processed source code using a particle swarm algorithm to obtain weighted source code;
a convolutional neural network training module 204, configured to train a convolutional neural network with the weighted source code as the input matrix and, during training, to adjust the weight of each code using the particle swarm algorithm to obtain a convolutional neural network with accurate output;
an accuracy judging module 205, configured to judge whether the accuracy of the convolutional neural network reaches a set value;
a detection module 206, configured to stop training the convolutional neural network when its accuracy reaches the set value and to detect the software under test with the trained convolutional neural network;
a weight adjusting module 207, configured to continue adjusting the weight of each code in the source code using the particle swarm algorithm and training the convolutional neural network when the accuracy does not reach the set value.
The training sample processing module 202 specifically includes:
a decompiling unit, configured to decompile the training samples using decompiling software;
a numbering unit, configured to number the source code obtained by decompilation according to the Dalvik code table.
The weight calculation module 203 specifically includes:
an initial feature matrix determining unit, configured to arrange the source code into an n × 1 matrix, in which each row represents one feature, and to record it as the initial feature matrix;
a weight calculation unit, configured to calculate a weight for each feature using the particle swarm algorithm;
a weighting unit, configured to multiply each feature by its corresponding weight.
The convolutional neural network training module 204 specifically includes:
a parameter setting unit, configured to set the inner-layer parameter top K of the convolutional neural network, where K = 3;
a training unit, configured to train the convolutional neural network with the weighted source code as the input matrix.
The weight adjusting module 207 specifically includes:
a weight adjusting unit, configured to adjust the weight of each code in the source code using the particle swarm algorithm according to the accuracy;
an adjusted feature matrix determining unit, configured to multiply the adjusted weight by the corresponding processed source code in the initial feature matrix.
The malware detection system combines a convolutional neural network with an inertia-weight particle swarm algorithm. The inertia-weight particle swarm algorithm assigns weights to the numbered source code of the training samples used by the convolutional neural network, and then adjusts those weights according to the accuracy of the convolutional neural network until the accuracy of the trained network meets the requirement. As a result, the malware detection method and system are characterized by high detection accuracy.
The embodiments in this description are presented progressively: each embodiment focuses on its differences from the others, and the parts that are the same or similar can be cross-referenced between embodiments. Because the disclosed system corresponds to the disclosed method, its description is relatively brief, and the relevant details can be found in the description of the method.
The principles and embodiments of the present invention have been explained here using specific examples, which are provided only to help in understanding the method and its core concept. A person skilled in the art may, following the idea of the present invention, vary the specific embodiments and the scope of application. In summary, the content of this description should not be construed as limiting the invention.

Claims (6)

1. A malware detection method, the method comprising:
obtaining training samples, wherein each training sample is an executable program of software of a known type, and the types include benign and malicious;
decompiling and numbering the training samples to obtain processed source code;
assigning a weight to each code in the processed source code using a particle swarm algorithm to obtain weighted source code;
training a convolutional neural network with the weighted source code as the input matrix and, during training, adjusting the weight of each code using the particle swarm algorithm to obtain a convolutional neural network with accurate output;
judging whether the accuracy of the convolutional neural network reaches a set value;
if so, stopping training the convolutional neural network and detecting the software under test with the trained convolutional neural network;
if not, continuing to adjust the weight of each code using the particle swarm algorithm and to train the convolutional neural network;
wherein assigning a weight to each code in the source code using the particle swarm algorithm to obtain the weighted source code specifically comprises:
arranging the source code into an n × 1 matrix, in which each row represents one feature, and recording it as the initial feature matrix;
calculating a weight for each feature using the particle swarm algorithm;
multiplying each feature by its corresponding weight;
and wherein adjusting the weight of each code in the source code using the particle swarm algorithm specifically comprises:
adjusting the weight of each code in the source code using the particle swarm algorithm according to the accuracy;
multiplying the adjusted weight by the corresponding processed source code in the initial feature matrix.
2. The malware detection method according to claim 1, wherein training the convolutional neural network with the weighted source code as the input matrix specifically comprises:
setting the inner-layer parameter top K of the convolutional neural network, where K = 3;
training the convolutional neural network with the weighted source code as the input matrix.
3. The malware detection method according to claim 1, wherein decompiling and numbering the training samples specifically comprises:
decompiling the training samples using decompiling software;
numbering the source code obtained by decompilation according to the Dalvik code table.
4. A malware detection system, the system comprising:
a training sample acquisition module, configured to obtain training samples, wherein each training sample is an executable program of software of a known type, and the types include benign and malicious;
a training sample processing module, configured to decompile and number the training samples to obtain processed source code;
a weight calculation module, configured to assign a weight to each code in the processed source code using a particle swarm algorithm to obtain weighted source code;
a convolutional neural network training module, configured to train a convolutional neural network with the weighted source code as the input matrix and, during training, to adjust the weight of each code using the particle swarm algorithm to obtain a convolutional neural network with accurate output;
an accuracy judging module, configured to judge whether the accuracy of the convolutional neural network reaches a set value;
a detection module, configured to stop training the convolutional neural network when its accuracy reaches the set value and to detect the software under test with the trained convolutional neural network;
a weight adjusting module, configured to continue adjusting the weight of each code in the source code using the particle swarm algorithm and training the convolutional neural network when the accuracy does not reach the set value;
wherein the weight calculation module specifically includes:
an initial feature matrix determining unit, configured to arrange the source code into an n × 1 matrix, in which each row represents one feature, and to record it as the initial feature matrix;
a weight calculation unit, configured to calculate a weight for each feature using the particle swarm algorithm;
a weighting unit, configured to multiply each feature by its corresponding weight;
and wherein the weight adjusting module specifically includes:
a weight adjusting unit, configured to adjust the weight of each code in the source code using the particle swarm algorithm according to the accuracy;
an adjusted feature matrix determining unit, configured to multiply the adjusted weight by the corresponding processed source code in the initial feature matrix.
5. The malware detection system according to claim 4, wherein the convolutional neural network training module specifically comprises:
a parameter setting unit, configured to set the inner-layer parameter top K of the convolutional neural network, where K = 3;
a training unit, configured to train the convolutional neural network with the weighted source code as the input matrix.
6. The malware detection system according to claim 4, wherein the decompilation module specifically comprises:
a decompiling unit, configured to decompile the training samples using decompiling software;
a numbering unit, configured to number the source code obtained by decompilation according to the Dalvik code table.
CN201810670997.XA 2018-06-26 2018-06-26 Malicious software detection method and system Active CN108985055B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810670997.XA CN108985055B (en) 2018-06-26 2018-06-26 Malicious software detection method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810670997.XA CN108985055B (en) 2018-06-26 2018-06-26 Malicious software detection method and system

Publications (2)

Publication Number Publication Date
CN108985055A CN108985055A (en) 2018-12-11
CN108985055B true CN108985055B (en) 2020-08-28

Family

ID=64538798

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810670997.XA Active CN108985055B (en) 2018-06-26 2018-06-26 Malicious software detection method and system

Country Status (1)

Country Link
CN (1) CN108985055B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110458239A (en) * 2019-08-15 2019-11-15 东北大学秦皇岛分校 Malware classification method and system based on a dual-channel convolutional neural network
CN110489968B (en) * 2019-08-15 2021-02-05 东北大学秦皇岛分校 RNN- and CNN-based Android malware detection method and system
CN110472417B (en) * 2019-08-22 2021-03-30 东北大学秦皇岛分校 Convolutional neural network-based malware opcode analysis method
CN110837638B (en) * 2019-11-08 2020-09-01 鹏城实验室 Method, device and equipment for detecting ransomware, and storage medium
CN117077141A (en) * 2023-10-13 2023-11-17 国网山东省电力公司鱼台县供电公司 Smart power grid malicious software detection method and system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103927531A (en) * 2014-05-13 2014-07-16 江苏科技大学 Face recognition method based on local binary values and a PSO-BP neural network
CN107392021A (en) * 2017-07-20 2017-11-24 中南大学 Android malicious application detection method based on multi-class features
CN107798243A (en) * 2017-11-25 2018-03-13 国网河南省电力公司电力科学研究院 Detection method and device for terminal applications
CN108021810A (en) * 2017-12-06 2018-05-11 北京理工大学 Efficient detection method for massive malicious code

Also Published As

Publication number Publication date
CN108985055A (en) 2018-12-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant