CN108345794A - Malware detection method and device - Google Patents
Malware detection method and device
- Publication number
- CN108345794A CN201711477108.XA
- Authority
- CN
- China
- Prior art keywords
- hybrid feature
- feature
- data set
- software
- dynamic features
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/55—Detecting local intrusion or implementing counter-measures
- G06F21/56—Computer malware detection or handling, e.g. anti-virus arrangements
- G06F21/562—Static detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
- G06F18/2135—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on approximation criteria, e.g. principal component analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Computer Security & Cryptography (AREA)
- Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Biology (AREA)
- Artificial Intelligence (AREA)
- Computer Hardware Design (AREA)
- Life Sciences & Earth Sciences (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Virology (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Embodiments of the present invention provide a malware detection method and device, belonging to the field of information security. The method comprises: extracting the static features and dynamic features of each software sample from a set of software samples of known software type, and combining the extracted static and dynamic features of each sample to form a hybrid feature data set; reducing the feature dimension and removing redundant features by means of principal component analysis and a feature weight selection method, to obtain an optimized hybrid feature data set; training a support vector machine model on the features in the optimized hybrid feature data set to form a classification detection model; and detecting the software to be detected according to the classification detection model. Forming the classification detection model with a support vector machine model improves both the efficiency of classification and the accuracy of software detection.
Description
Technical field
The present invention relates to a malware detection method in the field of information security, and more particularly to a method and device for detecting malware on the Android system.
Background
With the widespread adoption and rapid development of intelligent terminals, their security problems have become increasingly urgent, and a large amount of malware targeting intelligent terminal systems has followed. In the six years before 2010, system vulnerabilities and malicious code on the Symbian operating system were the main security threats to smartphone terminals. After 2010 the smartphone market gradually matured and spread: Apple first released iOS 4 (Apple's mobile operating system), and Google then released Android 2.1, 2.2 and 2.3 in succession. As early as the end of 2009, the first worm on the iOS platform, Ikee, was discovered; it mainly attacked jailbroken iOS devices. On the Android platform, extortion software and advertising plug-ins also appeared in early 2010. The FakePlayer malicious code is generally recognized as the first malware on the Android platform, and its main malicious behavior is sending premium-rate SMS messages. The second Trojan family to appear on Android, Geinimi ("给你米"), mutated into more than ten variants in less than two months and infected nearly one million phones. As the first Android Trojan produced in China, it used mainstream remote-control attack techniques together with a variety of countermeasure techniques; besides intercepting and deleting the user's SMS messages and sending information back, it could also install other applications or make phone calls without the user's permission. Subsequently, malware able to exploit root vulnerabilities in the Android system, as well as the cross-platform mobile banking Trojan Zitmo, was also gradually disclosed.
In the course of realizing the present invention, the inventors found at least the following problems in the prior art. First, in feature selection, only static features or only dynamic features are chosen, so software behavior cannot be evaluated comprehensively. Second, when the number of features is large, the feature optimization applied to the original features relies on a single method, and the optimized feature set contributes little to distinguishing malware from normal software. Third, in malware detection and classification, traditional classification models achieve low detection accuracy and leave efficiency to be improved.
Summary of the invention
Embodiments of the present invention provide a malware detection method and device that improve both the accuracy and the efficiency of malware detection.
In one aspect, an embodiment of the present invention provides a malware detection method, the method comprising:
extracting the static features and dynamic features of each software sample from a set of software samples of known software type, and combining the extracted static and dynamic features of each software sample to form a hybrid feature data set;
evaluating the features in the hybrid feature data set according to an optimization method, reducing the feature dimension and removing redundant features, to obtain an optimized hybrid feature data set;
training a support vector machine model on the features in the optimized hybrid feature data set to form a classification detection model;
detecting the software to be detected according to the classification detection model.
In another aspect, an embodiment of the present invention provides a malware detection device, the device comprising:
an extraction unit, configured to extract the static features and dynamic features of each software sample from a set of software samples of known software type, and to combine the extracted static and dynamic features of each software sample to form a hybrid feature data set;
an optimization unit, configured to evaluate the features in the hybrid feature data set according to an optimization method, reduce the feature dimension and remove redundant features, to obtain an optimized hybrid feature data set;
a training unit, configured to train a support vector machine model on the features in the optimized hybrid feature data set to form a classification detection model;
a detection unit, configured to detect the software to be detected according to the classification detection model.
The above technical solution has the following advantages. Because the static features and dynamic features of each software sample in the set of software samples of known software type are combined to generate a hybrid feature data set, the features of the software samples are collected comprehensively. Because an optimization method is used to evaluate the features in the hybrid feature data set and to select the features that contribute most to software classification, yielding the optimized hybrid feature data set, the redundant features in the initial feature data set are removed. Because a support vector machine model is trained on the optimized hybrid feature data set to generate the classification detection model, which is then used to detect the software to be detected, the model training time is shortened and the accuracy of the classification model is improved.
Description of the drawings
In order to explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow chart of the malware detection method of an embodiment of the present invention;
Fig. 2 is a structural schematic diagram of the malware detection device of an embodiment of the present invention;
Fig. 3 is a sub-flow chart of optimizing the hybrid feature data set in an embodiment of the present invention;
Fig. 4 is a sub-flow chart of forming the classification detection model in an embodiment of the present invention;
Fig. 5 is a sub-flow chart of detecting the software to be detected according to the classification detection model in an embodiment of the present invention;
Fig. 6 is a structural schematic diagram of the optimization unit of an embodiment of the present invention;
Fig. 7 is a structural schematic diagram of the training unit of an embodiment of the present invention;
Fig. 8 is a schematic diagram of static feature vectorization in an embodiment of the present invention;
Fig. 9 is a classification accuracy trend chart;
Fig. 10 compares the detection results of the technical solution of the present invention and of the prior art on the same software to be detected.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.
Referring to Fig. 1, which is a flow chart of the malware detection method of an embodiment of the present invention, the method comprises:
101. Extracting the static features and dynamic features of each software sample from a set of software samples of known software type, and combining the extracted static and dynamic features of each software sample to form a hybrid feature data set;
102. Evaluating the features in the hybrid feature data set according to an optimization method, reducing the feature dimension and removing redundant features, to obtain an optimized hybrid feature data set;
103. Training a support vector machine model on the features in the optimized hybrid feature data set to form a classification detection model;
104. Detecting the software to be detected according to the classification detection model.
Preferably, referring to Fig. 3, which is a sub-flow chart of optimizing the hybrid feature data set in an embodiment of the present invention, evaluating the features in the hybrid feature data set according to an optimization method, reducing the feature dimension and removing redundant features to obtain the optimized hybrid feature data set specifically comprises:
102.1. Reducing the dimension of the features in the hybrid feature data set by principal component analysis, to obtain a dimension-reduced feature data set;
102.2. Applying a feature weight selection algorithm to the features in the dimension-reduced hybrid feature data set and deleting the features whose feature weight is below a set threshold, to obtain the optimized hybrid feature data set.
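Step 102.1 can be illustrated with a small dependency-free sketch: center the feature matrix, form its covariance matrix, and project the samples onto the t leading eigenvectors. The function names, the power-iteration eigen-solver and the toy data are assumptions made for illustration only; in practice a numerical library would normally perform the eigendecomposition.

```python
import math

def center(rows):
    """Subtract each feature's mean so every column has zero mean."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]

def covariance(centered):
    """Sample covariance matrix (m x m) of already-centered rows."""
    n, m = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(m)]
            for i in range(m)]

def mat_vec(matrix, v):
    """Multiply a square matrix by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in matrix]

def top_eigenvectors(cov, t, iters=500):
    """Leading t eigenvectors via power iteration with deflation (illustrative)."""
    m = len(cov)
    cov = [row[:] for row in cov]                   # work on a copy
    vectors = []
    for k in range(t):
        v = [1.0 / (i + k + 1) for i in range(m)]   # fixed, non-random start vector
        for _ in range(iters):
            w = mat_vec(cov, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:                        # no variance left to extract
                break
            v = [x / norm for x in w]
        eigenvalue = sum(a * b for a, b in zip(v, mat_vec(cov, v)))
        vectors.append(v)
        # Deflate: remove the component just found before searching for the next.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(m)]
               for i in range(m)]
    return vectors

def pca_reduce(rows, t):
    """Project centered samples onto the t leading principal directions."""
    centered = center(rows)
    components = top_eigenvectors(covariance(centered), t)
    return [[sum(a * b for a, b in zip(row, c)) for c in components]
            for row in centered]

# Toy data: the third feature duplicates the first, so two dimensions suffice.
samples = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [2.0, 1.0, 2.0], [3.0, 0.0, 3.0]]
reduced = pca_reduce(samples, 2)
```

Because the third toy feature is redundant, the two retained components carry all of the variance of the original three features.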
Preferably, referring to Fig. 4, which is a sub-flow chart of forming the classification detection model in an embodiment of the present invention, training a support vector machine model on the features in the optimized hybrid feature data set to form the classification detection model specifically comprises:
103.1. Vectorizing the static features in the optimized hybrid feature data set;
103.2. Standardizing the dynamic features in the optimized hybrid feature data set, mapping the values of the dynamic features into the interval [0, 1];
103.3. Forming a hybrid feature vector file from the vectorized static features and the standardized dynamic features, and saving it;
103.4. Training a support vector machine model on the hybrid feature vector file to generate a classification model file.
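Step 103.4 can be sketched as follows. This is not libsvm: it is a small linear SVM trained by sub-gradient descent on the regularized hinge loss, swapped in so that the example runs without dependencies. The function names, hyperparameters and toy hybrid vectors are assumptions; the labels follow the convention used later in this description, 1 for normal software and -1 for malware.

```python
def decision(w, b, x):
    """Signed score of sample x relative to the separating hyperplane."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=500):
    """Sub-gradient descent on the L2-regularized hinge loss (illustrative)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * decision(w, b, xi)
            # L2 regularization shrinks the weights on every step.
            w = [wj * (1.0 - lr * lam) for wj in w]
            if margin < 1.0:
                # Sample violates the margin: update to increase its margin.
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b

def predict(w, b, x):
    """Return 1 (normal software) or -1 (malware)."""
    return 1 if decision(w, b, x) >= 0 else -1

# Toy hybrid vectors: two 0/1 static features followed by one scaled dynamic feature.
X = [[0, 0, 0.1], [1, 1, 0.9], [0, 1, 0.2], [1, 0, 0.8]]
y = [1, -1, 1, -1]
w, b = train_linear_svm(X, y)
predictions = [predict(w, b, x) for x in X]
```

A kernel SVM, as offered by libsvm, can separate classes that a linear model cannot; the linear form is used here only to keep the sketch short.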
Further preferably, vectorizing the static features in the optimized hybrid feature data set specifically comprises:
extracting the static features in the optimized hybrid feature set and establishing a feature set T;
comparing the static features of each known software sample with the feature set T one by one;
if an identical static feature is matched in T, marking that static feature of the known software sample as 1;
otherwise, marking that static feature of the known software sample as 0.
Standardizing the dynamic features in the optimized hybrid feature data set, mapping the values of the dynamic features into the interval [0, 1], specifically comprises:
extracting a dynamic feature in the optimized hybrid feature set, denoted $m_i$;
mapping the dynamic feature into the interval [0, 1] according to the formula $m_i' = \frac{m_i - \min(m_i)}{\max(m_i) - \min(m_i)}$, where $\min(m_i)$ denotes the minimum value of dynamic feature $m_i$ and $\max(m_i)$ denotes its maximum value.
Preferably, referring to Fig. 5, which is a sub-flow chart of detecting the software to be detected according to the classification detection model in an embodiment of the present invention, detecting the software to be detected according to the classification detection model specifically comprises:
104.1. Extracting the static features and dynamic features of the software to be detected and combining them to generate a hybrid feature data set;
104.2. Converting the hybrid feature data set into the format specified by the classification detection model, storing it, and inputting it into the classification detection model;
104.3. Outputting, by the classification detection model, the type of the software to be detected.
Referring to Fig. 2, which is a structural schematic diagram of the malware detection device of an embodiment of the present invention, the device comprises:
an extraction unit 21, configured to extract the static features and dynamic features of each software sample from a set of software samples of known software type, and to combine the extracted static and dynamic features of each software sample to form a hybrid feature data set;
an optimization unit 22, configured to evaluate the features in the hybrid feature data set according to an optimization method, reduce the feature dimension and remove redundant features, to obtain an optimized hybrid feature data set;
a training unit 23, configured to train a support vector machine model on the features in the optimized hybrid feature data set to form a classification detection model;
a detection unit 24, configured to detect the software to be detected according to the classification detection model.
Preferably, referring to Fig. 6, which is a structural schematic diagram of the optimization unit of an embodiment of the present invention, the optimization unit 22 specifically comprises:
a principal component analysis module 221, configured to reduce the dimension of the features in the hybrid feature data set by principal component analysis, to obtain a dimension-reduced feature data set;
a feature selection module 222, configured to apply a feature weight selection algorithm to the features in the dimension-reduced hybrid feature data set and delete the features whose feature weight is below a set threshold, to obtain the optimized hybrid feature data set.
Preferably, referring to Fig. 8, which is a schematic diagram of static feature vectorization in an embodiment of the present invention, the training unit 23 specifically comprises:
a static feature preprocessing module 231, configured to vectorize the static features in the optimized hybrid feature data set;
a dynamic feature preprocessing module 232, configured to standardize the dynamic features in the optimized hybrid feature data set, mapping the values of the dynamic features into the interval [0, 1];
a saving module 233, configured to form a hybrid feature vector file from the vectorized static features and the standardized dynamic features, and to save it;
a classification training module 234, configured to train a support vector machine model on the hybrid feature vector file to generate a classification model file.
Further preferably, the static feature preprocessing module specifically comprises:
a feature set establishing submodule, configured to extract the static features in the optimized hybrid feature set and establish a feature set T;
a comparison submodule, configured to compare the static features of each software sample with the feature set T one by one;
a marking submodule, configured to mark a static feature of a software sample as 1 if an identical static feature is matched in T;
the marking submodule being further configured to mark a static feature of a software sample as 0 if no identical static feature is matched in T.
Further preferably, the dynamic feature preprocessing module specifically comprises:
a value submodule, configured to extract a dynamic feature in the optimized hybrid feature set, denoted $m_i$;
an evaluation submodule, configured to map the dynamic feature into the interval [0, 1] according to the formula $m_i' = \frac{m_i - \min(m_i)}{\max(m_i) - \min(m_i)}$, where $\min(m_i)$ denotes the minimum value of dynamic feature $m_i$ and $\max(m_i)$ denotes its maximum value.
Preferably, the detection unit specifically comprises:
a feature extraction module, configured to extract the static features and dynamic features of the software to be detected and combine them to generate a hybrid feature data set;
an input module, configured to convert the hybrid feature data set into the format specified by the classification detection model, store it, and input it into the classification detection model;
an output module, configured to output the type of the software to be detected.
The above technical solution has the following technical effects. The static features and dynamic features of each software sample extracted from the set of software samples of known software type are combined to form a hybrid feature data set. Principal component analysis and the feature weight selection algorithm select the features that contribute most to software classification, yielding an optimized hybrid feature data set. The static features of the optimized hybrid feature set are vectorized and its dynamic features are standardized, and the hybrid feature vector file is input to the support vector machine model to form the classification detection model. The features of the software samples are thus collected comprehensively, the features that contribute most are selected to train the classification detection model, and the software to be detected is examined with that model, which improves both the efficiency and the accuracy of detection.
Intelligent terminals are easy to carry, and as their computing power keeps growing, people have come to depend on them more than on traditional feature phones and PCs. The greatest advantage of an intelligent terminal is that, while providing basic communication functions, it also lets users go online anytime and anywhere and enables more intelligent applications. In February 2014 the global technology research and consulting firm Gartner published worldwide smartphone operating system sales for 2013. Global mobile phone terminal sales in 2013 reached 968 million units, a marked rise of 42.3% over 2012; annual smartphone sales exceeded annual feature phone sales for the first time, accounting for 53.6% of total mobile phone sales; and the share of the Android system in the smartphone operating system market rose by 12 percentage points over 2012 to 78.4%. According to the US market research firm IDC, global smartphone shipments in 2014 were expected to grow substantially: compared with 1.01 billion units the previous year, shipments in 2014 would reach 1.25 billion, a growth rate of 23.8%, and by 2018 the figure was expected to reach 1.8 billion.
The Android system was first released in 2007 by the Open Handset Alliance (OHA), led by Google. Android has developed very rapidly and has continually eroded the intelligent terminal market once dominated by Nokia's Symbian system. Within a very short time it became one of the two dominant systems alongside Apple's iOS, and its market share has consistently ranked first. The global distribution of smartphone operating systems in the second and third quarters of 2014, as published by the market analysis firm Strategy Analytics, is shown in Table 1: the global market share of the Android operating system reached 84.6% and 83.6% in the second and third quarters of 2014 respectively, while the shares of systems such as iOS and Windows Phone declined.
Table 1: Global distribution of smartphone operating systems in the second and third quarters of 2014
Security research on intelligent terminals mainly targets the Android operating system and follows three main directions. The first is to detect potentially malicious behavior in the code before an application is installed on an Android device. This kind of detection is divided into static analysis and dynamic analysis, and mainly uses known malicious behaviors or code features of malware to analyze the harm that malware may cause. The second is to monitor applications while they run on an Android device: the Android platform source code is modified to insert monitoring code at key application interfaces, so that the various behaviors of malicious programs can be monitored. The third, common in enterprise security applications, is security isolation, which mainly uses virtualization to divide applications into domain levels and thereby enforces strict access control.
Step 102 (evaluating the features in the hybrid feature data set according to an optimization method, reducing the feature dimension and removing redundant features, to obtain the optimized hybrid feature data set) is described in detail below.
The embodiment of the present invention proposes a feature selection algorithm, Relief-PCA, that fuses principal component analysis (PCA) with the feature weight selection algorithm Relief. Relief-PCA combines the advantages of both algorithms: in theory it is not only efficient but can also reduce the feature dimension and eliminate redundant features, thereby improving classification accuracy. Android malware detection is a two-class classification problem. Assume that the software sample set of known software type is $S = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$, consisting of n samples, where $x_i \in \mathbb{R}^m$, that is, each sample has m features, $x_i = (x_{i1}, x_{i2}, \ldots, x_{im})$, and $y_i \in \{-1, 1\}$ marks the class of $x_i$, with 1 denoting normal software and -1 denoting malware. The steps of the Relief-PCA algorithm are as follows:
Input: sample set S, number of sampling iterations e, screening threshold σ, number of features finally retained r, number of features after PCA screening t, with t > r.
Step 1: First center the sample set S according to formula (1):
Step 2: Compute the covariance matrix $AA^T$ of the samples and perform an eigenvalue decomposition of $AA^T$; take the eigenvectors $F' = (f_1, f_2, \ldots, f_t)$ corresponding to the t largest eigenvalues, and use the eigenvector set $F'$ to obtain the dimension-reduced feature set S.
Step 3: Set the weight of each feature in the dimension-reduced feature set S obtained in Step 2 to 0, that is, W(i) = 0, i = 1, 2, ..., t.
Step 4: Randomly select a sample R from the sample set S; from the samples of the same class as R, select the nearest neighbor of R, denoted H; from the samples of the other class, select the nearest neighbor of R, denoted M.
Step 5: Update each feature weight $W_d$ using weight formula (2).
Here diff(d, R, H) denotes the distance between sample R and sample H with respect to feature d, calculated using formula (3):
value(d, R) denotes the value of the d-th feature of sample R. The distance between sample R and sample H with respect to feature d is the distance between the two samples computed on feature d alone, and is used to judge whether the feature is important. If diff(d, R, H) < diff(d, R, M), the feature helps to distinguish the nearest neighbor of the same class from the nearest neighbor of the other class, so its weight is increased. Conversely, if diff(d, R, H) > diff(d, R, M), the feature does not contribute usefully to classification, so its weight is reduced. The features that contribute most to classification can then be screened out by the size of their weights: the weight found in each iteration is compared with a preset threshold, features whose weight is below the threshold are deleted and features whose weight exceeds it are kept, which eliminates the features that contribute little to classification.
Step 6: Repeat Steps 4 to 5 e times, then output the relevance weight $W_d$ of each feature; if a feature's weight satisfies $W_d < σ$, delete that feature.
Step 7: Retain the top r features obtained in Step 6, sorted in descending order of weight.
Output: the r features with the highest weights, which form the optimized hybrid feature set.
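Steps 3 to 7 can be sketched as below, assuming the PCA stage has already produced the dimension-reduced samples. Formulas (2) and (3) are not available in this text, so the sketch assumes the standard Relief forms: the weight update W_d = W_d - diff(d, R, H)/e + diff(d, R, M)/e, and diff as the absolute difference divided by the feature's value range. The function names, the seed parameter and the toy data are illustrative assumptions.

```python
import random

def diff(d, a, b, spans):
    """Range-normalized distance between samples a and b on feature d (assumed form)."""
    return abs(a[d] - b[d]) / spans[d] if spans[d] else 0.0

def relief_weights(X, y, e, seed=0):
    """Steps 3-6: accumulate a relevance weight for every feature over e draws."""
    rng = random.Random(seed)
    n_features = len(X[0])
    spans = [max(col) - min(col) for col in zip(*X)]
    weights = [0.0] * n_features                      # Step 3: all weights start at 0

    def distance(a, b):
        return sum(diff(d, a, b, spans) for d in range(n_features))

    for _ in range(e):
        i = rng.randrange(len(X))                     # Step 4: random sample R
        R = X[i]
        same = [X[j] for j in range(len(X)) if j != i and y[j] == y[i]]
        other = [X[j] for j in range(len(X)) if y[j] != y[i]]
        H = min(same, key=lambda s: distance(R, s))   # nearest same-class neighbor
        M = min(other, key=lambda s: distance(R, s))  # nearest other-class neighbor
        for d in range(n_features):                   # Step 5: assumed update rule
            weights[d] += (diff(d, R, M, spans) - diff(d, R, H, spans)) / e
    return weights

def select_features(weights, sigma, r):
    """Steps 6-7: drop weights below sigma, keep the top r in descending order."""
    kept = [d for d, w in enumerate(weights) if w >= sigma]
    kept.sort(key=lambda d: weights[d], reverse=True)
    return kept[:r]

# Toy data: feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 0.3], [0.1, 0.9], [0.9, 0.2], [1.0, 0.8]]
y = [1, 1, -1, -1]
weights = relief_weights(X, y, e=20)
selected = select_features(weights, sigma=0.0, r=1)
```

On this toy set the discriminative feature receives a positive weight and the noisy feature a negative one, so only feature 0 survives the threshold.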
Step 103 (training a support vector machine model on the features in the optimized hybrid feature data set to form the classification detection model) is described in detail below.
The features in the optimized hybrid feature data set are first preprocessed. Feature preprocessing is the process of mapping the features in the hybrid feature data set to feature vectors before they are input to the classification algorithm. Its main purpose is to standardize the types of the feature attributes so that they are all represented in the same type. The feature attributes extracted in the embodiment of the present invention include static features and dynamic features: static features are mainly represented as character strings, whereas dynamic features are represented numerically and are continuous variables. Different preprocessing schemes are therefore used for these two kinds of feature data.
A. Static feature vectorization
Because the extracted static features are represented as character strings, they cannot be passed directly to the classification model. The static features are therefore preprocessed first: feature mapping is used to map this feature information to input data for the model. The mapping process is shown in Figure 8.
First, the static features in the optimized mixed feature set, which are represented as character strings, are extracted. A feature set T is then established to serve as a "feature dictionary"; T contains the static features selected by the Relief-PCA algorithm. The static feature vector set is built with T as the standard: the static features of a sample are compared with the features in T, and a feature in T that is matched is marked 1, otherwise 0. In this way, the static features of each software sample of known software type are converted from character strings into a vector of 0s and 1s, which completes the feature mapping process.
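The dictionary-based mapping described above can be sketched as follows. This is an illustrative sketch, not the patent's implementation: the function name is hypothetical, and the contents of T are sample feature names borrowed from the experiment section.

```python
def vectorize_static(sample_features, feature_dictionary):
    """Map a sample's string features to a 0/1 vector ordered like the dictionary T."""
    present = set(sample_features)
    return [1 if feature in present else 0 for feature in feature_dictionary]


# T: the "feature dictionary" of selected static features (illustrative contents).
T = ["SEND_SMS", "location", "SmsSendService", "SIG_STR"]
vec = vectorize_static(["SEND_SMS", "INTERNET", "SIG_STR"], T)
print(vec)
```

Features of the sample that are not in T (here INTERNET) are simply ignored, so every sample yields a vector of the same length as T.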
B. Dynamic feature standardization
Because the dynamic features differ in unit and in value range, they must be numerically normalized so that the feature values can be converted into the feature vector set used by the support vector machine model. The embodiment of the present invention applies a linear transformation to the raw data using the min-max standardization method, mapping the feature values into the interval [0, 1]. For a feature $m_i$, the formula for the min-max standardization mapping is as follows:
$$m_i' = \frac{m_i - \min(m_i)}{\max(m_i) - \min(m_i)} \tag{4}$$
In formula (4), $\min(m_i)$ denotes the minimum value of feature attribute $m_i$, and $\max(m_i)$ denotes the maximum value of feature attribute $m_i$.
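A short sketch of min-max standardization for one dynamic feature column is given below. The function name and sample values are hypothetical, and mapping a constant column to 0.0 is an assumption added to avoid division by zero; the patent does not say how that case is handled.

```python
def min_max_normalize(values):
    """Linearly map a list of raw values for one feature into [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant feature carries no information, so map it to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Made-up raw values of one dynamic feature across three samples.
scaled = min_max_normalize([20.0, 35.0, 50.0])
print(scaled)
```

The minimum and maximum are taken per feature across all samples, so each dynamic feature is scaled independently of the others.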
Using the two feature preprocessing methods above, the embodiment of the present invention processes and saves the mixed feature vector of each sample in the form shown in Table 2.
Table 2: Mixed feature vector set of the samples
The mixed feature vectors are then trained to generate the classification model file. The embodiment of the present invention uses the libsvm toolbox (a software package for support vector machine pattern recognition and regression) to carry out the training of the support vector machine model. The toolbox encapsulates the complex implementation, and the support vector machine type and kernel function type can be adjusted through simple parameter configuration. The developer only needs to provide the attribute matrix and the labels; calling the svmtrain (support vector machine model training) method completes the training and building of the classification model. The main process of integrating the libsvm tool in Python and performing support vector machine training is as follows:
A. Convert the training sample feature set or the feature set of the software under test to the format specified by libsvm and store it.
B. Download the libsvm toolkit and import the svmutil (support vector machine model tool) package in Python.
C. Use y, x = svm_read_problem (file reading) to read the mixed static and dynamic feature vector file that has been converted to the libsvm input format. Here y stores the label values in the first column of the file, with 1 and -1 denoting normal software and malware respectively, and x stores the feature values in the file.
D. Train and generate the model with model = svm_train(y, x, '-s 0 -t 2'). Here -s specifies the support vector machine type, where 0 selects C-SVC, and -t specifies the kernel type, where 2 selects the RBF (radial basis function) kernel.
E. Finally, the classification model file classifyModel.model is generated and saved to file by the svm_save_model (save support vector machine model) method, for use in class prediction of unknown software.
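Step A above can be sketched with the standard library alone. The function below writes one sample in libsvm's sparse text format (`<label> <index>:<value> ...`, with 1-based indices and zero entries omitted). The function name and the sample vector are hypothetical.

```python
def to_libsvm_line(label, vector):
    """Format one labelled feature vector as a line of libsvm's sparse text format."""
    # Indices are 1-based; zero-valued features may be left out of the line.
    pairs = [f"{index}:{value:g}" for index, value in enumerate(vector, start=1) if value != 0]
    return " ".join([str(label)] + pairs)


# A malware sample (label -1) with three binary static features and one scaled dynamic feature.
line = to_libsvm_line(-1, [1, 0, 0.5, 1])
print(line)
```

Assuming libsvm's Python interface is installed, a file made of such lines can then be loaded with svm_read_problem, trained with svm_train(y, x, '-s 0 -t 2') and saved with svm_save_model, as in steps C to E.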
To better illustrate the advantages of the technical solution of the present invention, the technical solution in the embodiment of the present invention is described in detail below with reference to an application example:
The actual effectiveness of the malware detection method for the Android system was verified experimentally. The experimental environment was a Win7 (64-bit) host with 8 GB of memory and a 1 TB hard disk. Model training and verification used 1000 malicious samples provided by the Android Malware Genome Project and 1000 normal samples from Google Play.
(1) Experiment on the optimization quality of the Relief-PCA algorithm on the mixed feature data set
The static and dynamic features of these 2000 known Android-based software samples were first extracted: 42321 static features in total, together with the dynamic features from 30 minutes of software running. Before the experiment, the data set needs a simple screening: the total number of occurrences of each feature is counted, and features that occur only once are deleted, because they are neither widespread nor representative. After this simple screening, the number of features becomes 30219. The static and dynamic features are then effectively combined to form the mixed feature data set. The Relief-PCA algorithm is applied to the mixed feature data set to evaluate the weight of each feature. To illustrate the advantage of the Relief-PCA algorithm in feature selection, the top five static features in each of the five static feature classes were selected, as shown in Table 3:
Table 3: Top five static features in each static feature class
As Table 3 shows, the features that contribute most to classification mostly involve access to user privacy data. For example, SEND_SMS in the permission class denotes the permission to send SMS messages; location in the Hardware class denotes a request for the positioning hardware capability; SmsSendService in the Application class denotes a service for sending SMS messages; SIG_STR in the Intent class denotes obtaining the cellular signal strength; and SmsManager.getDefault in the API class obtains the default SMS manager. These behaviors, all of which access private data, can serve as the main features for distinguishing malware from normal software. By contrast, common permission features such as INTERNET (internet access) and ACCESS_NETWORK_STATE (allows the program to access the network state) were not selected, mainly because features of this kind contribute little to distinguishing malware from normal software. It can be seen that the Relief-PCA algorithm has extracted the static features that genuinely distinguish malware from normal software.
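The simple screening step used before this experiment (counting how often each feature occurs and deleting features whose total count is 1) might be sketched as follows. The function name and the toy samples are hypothetical; the sketch assumes each sample is given as a list of feature strings.

```python
from collections import Counter


def screen_rare_features(samples):
    """Drop every feature whose total occurrence count over all samples is 1."""
    counts = Counter(feature for features in samples for feature in features)
    kept = {feature for feature, count in counts.items() if count > 1}
    return [[feature for feature in features if feature in kept] for features in samples]


# Toy data: "rare_perm" occurs once in total and is therefore removed.
samples = [["SEND_SMS", "INTERNET"], ["INTERNET", "rare_perm"], ["SEND_SMS"]]
screened = screen_rare_features(samples)
print(screened)
```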
(2) Performance test of the Relief-PCA algorithm
The experiment used the Weka (Waikato Environment for Knowledge Analysis) platform. The Relief-PCA method was compared with the traditional Relief method and with the information gain (Information Gain, IG) method commonly used in machine learning, to verify the advantage of the Relief-PCA method in optimizing the mixed feature data set. The data set used in the experiment was the mixed feature data set after simple screening, and the support vector machine model was chosen as the classification model. For each method, the features ranked in the top 50, 150, 250, 350 and 450 by that method's computation were selected as the input to the classification model. The classification accuracy of the three feature selection algorithms on the Weka platform was then compared, as shown in Table 4.
Table 4: Effect of different algorithms on accuracy
As Table 4 shows, the Relief-PCA algorithm improves classification accuracy considerably over the traditional Relief algorithm. The main reason is that the traditional Relief algorithm cannot remove redundant features, which distorts the feature ranking and lowers classification accuracy. Compared with the information gain algorithm, information gain outperforms Relief-PCA on small data sets, whereas Relief-PCA has the advantage on large data sets. Since malware detection for the Android system faces data sets of very large volume, the Relief-PCA algorithm is the better choice.
As shown in Figure 9, the classification accuracy of all three algorithms rises gradually as the feature dimension increases, reaches its maximum when the top 350 features are selected as the data set, and declines thereafter. The main cause of this trend is that the addition of irrelevant features interferes with the classifier's judgment. The classification accuracy obtained with the traditional Relief method is comparatively low. When fewer than the top 250 features are selected, information gain yields higher classification accuracy than the other two algorithms; beyond that point, Relief-PCA yields higher classification accuracy than the other two methods.
(3) Performance test of the support vector machine detection model
The technical solution provided by the present invention was used to test the accuracy and false positive rate of the classification detection model. Static and dynamic features were extracted from 1000 normal software samples and 1000 malware samples; the dynamic features were collected from half an hour of data during software running. The Relief-PCA method was used to optimize the mixed feature data set, and the feature values ranked in the top 350 were kept as the experimental subject. The experiment used ten-fold cross-validation, and the mean of the 10 experimental results was taken as the final result. The experimental results are shown in Table 5.
Table 5: Effect of different feature selection algorithms on accuracy
As Table 5 shows, the support vector machine detection model formed by the embodiment of the present invention has an accuracy of 90.22%, a false positive rate of 9.58% and a detection rate of 88.39%. Representative research results in the field of malware detection were chosen for comparison, as shown in Figure 10. The present method achieves higher accuracy and detection rate than the other three research results.
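The three reported metrics can be computed from true and predicted labels as sketched below. The patent does not spell out its definitions, so the usual ones are assumed here: accuracy is the share of correct predictions, false positive rate is the share of normal software flagged as malware, and detection rate is the share of malware correctly flagged. The function name and the toy labels are hypothetical; labels follow the 1 (normal) and -1 (malware) convention used in training, and both classes are assumed to be present.

```python
def detection_metrics(y_true, y_pred, malware=-1):
    """Compute accuracy, false positive rate and detection rate for malware detection."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == malware and p == malware for t, p in pairs)  # malware flagged
    tn = sum(t != malware and p != malware for t, p in pairs)  # normal passed
    fp = sum(t != malware and p == malware for t, p in pairs)  # normal flagged
    fn = sum(t == malware and p != malware for t, p in pairs)  # malware missed
    return {
        "accuracy": (tp + tn) / len(pairs),
        "false_positive_rate": fp / (fp + tn),
        "detection_rate": tp / (tp + fn),
    }


# Toy labels: four malware samples followed by four normal samples.
y_true = [-1, -1, -1, -1, 1, 1, 1, 1]
y_pred = [-1, -1, -1, 1, 1, 1, 1, -1]
metrics = detection_metrics(y_true, y_pred)
print(metrics)
```

Under ten-fold cross-validation, these values would be computed once per fold and then averaged over the 10 folds.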
In summary, the experiments show that detecting unknown software with the classification detection model generated from the support vector machine model has two benefits. On the one hand, diverse static and dynamic features are extracted from Android malware so that software behavior is evaluated comprehensively, while the Relief-PCA algorithm reduces the feature dimension and removes redundant features. On the other hand, training the support vector machine model on the optimized mixed feature set not only saves training cost but also improves classification precision.
It should be understood that the specific order or hierarchy of steps in the disclosed processes is an example of exemplary approaches. Based on design preferences, it should be understood that the specific order or hierarchy of steps in the processes may be rearranged without departing from the scope of protection of the present disclosure. The appended method claims present the elements of the various steps in an exemplary order and are not meant to be limited to the specific order or hierarchy presented.
Those skilled in the art will also appreciate that the various illustrative logical blocks, units and steps listed in the embodiments of the present invention may be implemented as electronic hardware, computer software, or a combination of both. To clearly illustrate the interchangeability of hardware and software, the various illustrative components, units and steps above have been described generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends on the particular application and the design requirements of the overall system. Those skilled in the art may implement the described functionality in various ways for each particular application, but such implementations should not be interpreted as going beyond the scope of protection of the embodiments of the present invention.
The various illustrative logical blocks or units described in the embodiments of the present invention may be implemented or operated by a general-purpose processor, a digital signal processor, an application-specific integrated circuit (ASIC), a field programmable gate array or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination of the above designed to perform the described functions. A general-purpose processor may be a microprocessor; alternatively, it may be any conventional processor, controller, microcontroller or state machine. A processor may also be implemented as a combination of computing devices, for example a digital signal processor and a microprocessor, multiple microprocessors, one or more microprocessors in conjunction with a digital signal processor core, or any other similar configuration.
The steps of the method or algorithm described in the embodiments of the present invention may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium in the art. Illustratively, the storage medium may be coupled to the processor so that the processor can read information from, and write information to, the storage medium. Alternatively, the storage medium may be integrated into the processor. The processor and the storage medium may reside in an ASIC, and the ASIC may reside in a user terminal. Alternatively, the processor and the storage medium may reside in different components of the user terminal.
In one or more exemplary designs, the functions described in the embodiments of the present invention may be implemented in hardware, software, firmware, or any combination of the three. If implemented in software, the functions may be stored on a computer-readable medium, or transmitted over a computer-readable medium in the form of one or more instructions or code. Computer-readable media include computer storage media and communication media that facilitate the transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer. For example, such computer-readable media may include, but are not limited to, RAM, ROM, EEPROM, CD-ROM or other optical disc storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store program code in the form of instructions or data structures readable by a general-purpose or special-purpose computer or a general-purpose or special-purpose processor. In addition, any connection may properly be termed a computer-readable medium. For example, if software is transmitted from a website, server or other remote source over a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless means such as infrared, radio and microwave, these are also included in the definition of computer-readable medium. Disk and disc, as used here, include compact disc, laser disc, optical disc, DVD, floppy disk and Blu-ray disc, where disks usually reproduce data magnetically and discs usually reproduce data optically with lasers. Combinations of the above may also be included within computer-readable media.
The specific embodiments described above further explain the purpose, technical solution and beneficial effects of the present invention in detail. It should be understood that the foregoing is merely specific embodiments of the present invention and is not intended to limit the scope of protection of the present invention. Any modification, equivalent substitution, improvement and the like made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.
Claims (10)
1. A malware detection method, characterized in that the method comprises:
extracting the static features and dynamic features of each software sample from a set of software samples of known software type, and effectively combining the extracted static features and dynamic features of each software sample to form a mixed feature data set;
evaluating the features in the mixed feature data set according to an optimization method, reducing the feature dimension and removing redundant features, to obtain an optimized mixed feature data set;
training the features in the optimized mixed feature set with a support vector machine model to form a classification detection model;
detecting software to be detected according to the classification detection model.
2. The malware detection method according to claim 1, characterized in that evaluating the features in the mixed feature data set according to the optimization method, reducing the feature dimension and removing redundant features, to obtain the optimized mixed feature data set, specifically comprises:
reducing the dimension of the features in the mixed feature data set by principal component analysis to obtain a dimension-reduced feature data set;
applying a feature weight selection algorithm to the features in the dimension-reduced mixed feature data set, and deleting the features whose feature weight value is below a set threshold, to obtain the optimized mixed feature data set.
3. The malware detection method according to claim 1, characterized in that training the features in the optimized mixed feature set with the support vector machine model to form the classification detection model specifically comprises:
vectorizing the static features in the optimized mixed feature data set;
standardizing the dynamic features in the optimized mixed feature data set, mapping the values of the dynamic features into the interval [0, 1];
forming the vectorized static features and the standardized dynamic features into a mixed feature vector file and saving it;
training on the mixed feature vector file with the support vector machine model to generate a classification model file.
4. The malware detection method according to claim 3, characterized in that vectorizing the static features in the optimized mixed feature data set specifically comprises:
extracting the static features in the optimized mixed feature set and establishing a feature set T;
comparing the static features of each software sample with the feature set T one by one;
if an identical static feature is matched in T, marking that static feature of the software sample as 1;
otherwise, marking that static feature of the software sample as 0;
and standardizing the dynamic features in the optimized mixed feature data set, mapping the values of the dynamic features into the interval [0, 1], specifically comprises:
extracting a dynamic feature in the optimized mixed feature set, denoted $m_i$;
mapping the dynamic feature into the interval [0, 1] according to the formula $m_i' = \frac{m_i - \min(m_i)}{\max(m_i) - \min(m_i)}$, where $\min(m_i)$ denotes the minimum value of dynamic feature $m_i$ and $\max(m_i)$ denotes the maximum value of dynamic feature $m_i$.
5. The malware detection method according to claim 1, characterized in that detecting the software to be detected according to the classification detection model specifically comprises:
extracting the static features and dynamic features of the software to be detected and effectively combining them to generate a mixed feature data set;
converting the mixed feature data set to the format specified by the classification detection model, storing it, and inputting it to the classification detection model;
outputting, by the classification detection model, the type of the software to be detected.
6. A malware detection device, characterized in that the device comprises:
an extraction unit, configured to extract the static features and dynamic features of each software sample from a set of software samples of known software type, and to effectively combine the extracted static features and dynamic features of each software sample to form a mixed feature data set;
an optimization unit, configured to evaluate the features in the mixed feature data set according to an optimization method, reducing the feature dimension and removing redundant features, to obtain an optimized mixed feature data set;
a training unit, configured to train the features in the optimized mixed feature set with a support vector machine model to form a classification detection model;
a detection unit, configured to detect software to be detected according to the classification detection model.
7. The malware detection device according to claim 6, characterized in that the optimization unit specifically comprises:
a principal component analysis module, configured to reduce the dimension of the features in the mixed feature data set by principal component analysis to obtain a dimension-reduced feature data set;
a feature selection module, configured to apply a feature weight selection algorithm to the features in the dimension-reduced mixed feature data set and delete the features whose feature weight value is below a set threshold, to obtain the optimized mixed feature data set.
8. The malware detection device according to claim 6, characterized in that the training unit specifically comprises:
a static feature preprocessing module, configured to vectorize the static features in the optimized mixed feature data set;
a dynamic feature preprocessing module, configured to standardize the dynamic features in the optimized mixed feature data set, mapping the values of the dynamic features into the interval [0, 1];
a saving module, configured to form the vectorized static features and the standardized dynamic features into a mixed feature vector file and save it;
a classification training module, configured to train on the mixed feature vector file with the support vector machine model to generate a classification model file.
9. The malware detection device according to claim 8, characterized in that the static feature preprocessing module specifically comprises:
a feature set establishing submodule, configured to extract the static features in the optimized mixed feature set and establish a feature set T;
a comparison submodule, configured to compare the static features of each software sample with the feature set T one by one;
a marking submodule, configured to mark a static feature of a software sample as 1 if an identical static feature is matched in T;
the marking submodule being further configured to mark the static feature of the software sample as 0 if no identical static feature is matched in T;
and the dynamic feature preprocessing module specifically comprises:
a value submodule, configured to extract a dynamic feature in the optimized mixed feature set, denoted $m_i$;
an evaluation submodule, configured to map the dynamic feature into the interval [0, 1] according to the formula $m_i' = \frac{m_i - \min(m_i)}{\max(m_i) - \min(m_i)}$, where $\min(m_i)$ denotes the minimum value of dynamic feature $m_i$ and $\max(m_i)$ denotes the maximum value of dynamic feature $m_i$.
10. The malware detection device according to claim 6, characterized in that the detection unit specifically comprises:
a feature extraction module, configured to extract the static features and dynamic features of the software to be detected and effectively combine them to generate a mixed feature data set;
an input module, configured to convert the mixed feature data set to the format specified by the classification detection model, store it, and input it to the classification detection model;
an output module, configured to output the type of the software to be detected.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711477108.XA CN108345794A (en) | 2017-12-29 | 2017-12-29 | The detection method and device of Malware |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108345794A true CN108345794A (en) | 2018-07-31 |
Family
ID=62963453
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711477108.XA Pending CN108345794A (en) | 2017-12-29 | 2017-12-29 | The detection method and device of Malware |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108345794A (en) |
Cited By (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109255241A (en) * | 2018-08-31 | 2019-01-22 | 国鼎网络空间安全技术有限公司 | Android privilege-escalation leak detection method and system based on machine learning |
CN109933991A (en) * | 2019-03-20 | 2019-06-25 | 杭州拜思科技有限公司 | A kind of method, apparatus of intelligence contract Hole Detection |
CN109933984A (en) * | 2019-02-15 | 2019-06-25 | 中时瑞安(北京)网络科技有限责任公司 | A kind of best cluster result screening technique, device and electronic equipment |
CN110929258A (en) * | 2019-11-07 | 2020-03-27 | 中国电子科技集团公司电子科学研究院 | Automatic detection method and device for malicious mobile application program |
CN111079142A (en) * | 2019-10-31 | 2020-04-28 | 湖北工业大学 | Malicious software detection method based on firework algorithm and support vector machine |
CN111144504A (en) * | 2019-12-30 | 2020-05-12 | 成都科来软件有限公司 | Software image flow identification and classification method based on PCA algorithm |
WO2020108357A1 (en) * | 2018-11-26 | 2020-06-04 | 华为技术有限公司 | Program classification model training method, program classification method, and device |
CN111723371A (en) * | 2020-06-22 | 2020-09-29 | 上海斗象信息科技有限公司 | Method for constructing detection model of malicious file and method for detecting malicious file |
CN112287952A (en) * | 2019-07-22 | 2021-01-29 | 腾讯科技(深圳)有限公司 | Virus clustering method, virus clustering device, storage medium and electronic device |
WO2021053505A1 (en) * | 2019-09-20 | 2021-03-25 | International Business Machines Corporation | Maintaining data privacy in a shared detection model system |
CN112699379A (en) * | 2020-12-31 | 2021-04-23 | 上海戎磐网络科技有限公司 | Firmware vulnerability scanning system and method based on software genes |
CN112818344A (en) * | 2020-08-17 | 2021-05-18 | 北京辰信领创信息技术有限公司 | Method for improving virus killing rate by applying artificial intelligence algorithm |
US11157776B2 (en) | 2019-09-20 | 2021-10-26 | International Business Machines Corporation | Systems and methods for maintaining data privacy in a shared detection model system |
CN113656308A (en) * | 2021-08-18 | 2021-11-16 | 福建卫联科技有限公司 | Computer software analysis system |
US11188320B2 (en) | 2019-09-20 | 2021-11-30 | International Business Machines Corporation | Systems and methods for updating detection models and maintaining data privacy |
CN113760764A (en) * | 2021-09-09 | 2021-12-07 | Oppo广东移动通信有限公司 | Application program detection method and device, electronic equipment and storage medium |
US11216268B2 (en) | 2019-09-20 | 2022-01-04 | International Business Machines Corporation | Systems and methods for updating detection models and maintaining data privacy |
CN113919841A (en) * | 2021-12-13 | 2022-01-11 | 北京雁翎网卫智能科技有限公司 | Block chain transaction monitoring method and system based on static characteristics and dynamic instrumentation |
WO2022037677A1 (en) * | 2020-08-21 | 2022-02-24 | 北京紫光展锐通信技术有限公司 | Method for determining log feature sequence, and vulnerability analysis method and system, and device |
EP3918500B1 (en) * | 2019-03-05 | 2024-04-24 | Siemens Industry Software Inc. | Machine learning-based anomaly detections for embedded software applications |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101964061A (en) * | 2010-09-02 | 2011-02-02 | 北京航空航天大学 | Binary kernel function support vector machine-based vehicle type recognition method |
WO2014152469A1 (en) * | 2013-03-18 | 2014-09-25 | The Trustees Of Columbia University In The City Of New York | Unsupervised anomaly-based malware detection using hardware features |
CN105205396A (en) * | 2015-10-15 | 2015-12-30 | 上海交通大学 | Detecting system for Android malicious code based on deep learning and method thereof |
CN107169351A (en) * | 2017-05-11 | 2017-09-15 | 北京理工大学 | With reference to the Android unknown malware detection methods of dynamic behaviour feature |
CN113760764A (en) * | 2021-09-09 | 2021-12-07 | Oppo广东移动通信有限公司 | Application program detection method and device, electronic equipment and storage medium |
CN113919841A (en) * | 2021-12-13 | 2022-01-11 | 北京雁翎网卫智能科技有限公司 | Block chain transaction monitoring method and system based on static characteristics and dynamic instrumentation |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108345794A (en) | The detection method and device of Malware | |
US11580222B2 (en) | Automated malware analysis that automatically clusters sandbox reports of similar malware samples | |
JP5990284B2 (en) | Spam detection system and method using character histogram | |
CN107659570A (en) | Webshell detection methods and system based on machine learning and static and dynamic analysis | |
CN106682495A (en) | Safety protection method and safety protection device | |
CN111931048B (en) | Artificial intelligence-based black product account detection method and related device | |
EP3028203A1 (en) | Signal tokens indicative of malware | |
US20160094574A1 (en) | Determining malware based on signal tokens | |
Darshan et al. | Performance evaluation of filter-based feature selection techniques in classifying portable executable files | |
CN110941956A (en) | Data classification method, device and related equipment | |
CN108734012A (en) | Malware recognition methods, device and electronic equipment | |
US20190370384A1 (en) | Ensemble-based data curation pipeline for efficient label propagation | |
CN111753290B (en) | Software type detection method and related equipment | |
CN109886016B (en) | Method, apparatus, and computer-readable storage medium for detecting abnormal data | |
CN109598124A (en) | Webshell detection method and device | |
US11580220B2 (en) | Methods and apparatus for unknown sample classification using agglomerative clustering | |
KR20200039912A (en) | System and method for automatically analysing android malware by artificial intelligence | |
Song et al. | A method of intrusion detection based on WOA‐XGBoost algorithm | |
CN111931047B (en) | Artificial intelligence-based black product account detection method and related device | |
CN113052577B (en) | Class speculation method and system for block chain digital currency virtual address | |
CN106874760A (en) | Android malicious code classification method based on hierarchical SimHash | |
KR102302484B1 (en) | Method for mobile malware classification based feature selection, recording medium and device for performing the method | |
CN111428236A (en) | Malicious software detection method, device, equipment and readable medium | |
CN104504334A (en) | System and method used for evaluating selectivity of classification rules | |
US20240241954A1 (en) | Method of detecting android malware based on heterogeneous graph and apparatus thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20180731 |