CN103778372B: A spectral method for identifying computer software behavior
Publication number: CN103778372B
Application number: CN201410012074.7A
Authority: CN (China)
Classifications

 G—PHYSICS
 G06—COMPUTING; CALCULATING; COUNTING
 G06F—ELECTRIC DIGITAL DATA PROCESSING
 G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
 G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
 G06F21/55—Detecting local intrusion or implementing countermeasures
 G06F21/56—Computer malware detection or handling, e.g. antivirus arrangements
 G06F21/562—Static detection
Description
Technical field:
The present invention relates to a spectral method for identifying computer software behavior.
Background:
Computer software behavior recognition techniques help determine whether a computer program is malware. Current methods use low-level features that represent software behavior (including signatures, API sequences, etc.) and predict software behavior either by signature matching or by machine-learning-based sequence pattern matching. The former works only for known malware: once the malware mutates, the signature database must be updated promptly. The latter suffers from high false-positive and false-negative rates.
Summary of the invention:
The object of the invention is to overcome the deficiencies of the prior art by providing a spectral method for identifying computer software behavior.
To solve the above technical problem, the invention provides a spectral method for identifying computer software behavior, comprising the following steps:
(1) Construct a software behavior representation model: the software behavior of a single computer program S or a computer program group G is represented by the model-parameter 2-tuple $(A^{*}, B^{*})$ of a DHMM (discrete hidden Markov model);
(2) Extract software behavior features: perform spectral decomposition on the matrix $A^{*}$ and extract the software behavior feature D;
(3) Measure software behavior similarity: from $B^{*}$ and D, compute the software behavior similarity between two computer programs, between two computer program groups, or between a computer program and a program group.
Further, in step (1), the software behavior representation model is constructed in one of two cases:
Case 1: software behavior representation model of a single computer program
S has M kinds of software behavior, each corresponding to one hidden state of DHMM(S). The software behavior of S is represented by the model $(A^{*}, B^{*})$. The specific steps are as follows:
Set the number M of hidden states and give the initial state probability distribution C;
Input the computer program S;
Call a DHMM training algorithm to find the model parameters A and B that maximize $P[S \mid A, B, C]$, denoted $A^{*}$ and $B^{*}$ respectively;
Case 2: software behavior representation model of a computer program group
For a computer program group $G = \{S_{1}, S_{2}, \ldots, S_{N}\}$, G has M kinds of software behavior, each corresponding to one hidden state shared by $DHMM(S_{1}), DHMM(S_{2}), \ldots, DHMM(S_{N})$. The software behavior of G is represented by the model $(A^{*}, B^{*})$. The specific steps are as follows:
Set the number M of hidden states and give the initial state probability distribution C;
Input all computer programs $S_{1}, S_{2}, \ldots, S_{N}$ in G;
Call a DHMM training algorithm to find the model parameters A and B that maximize $P[S_{1} \mid A, B, C] \times P[S_{2} \mid A, B, C] \times \cdots \times P[S_{N} \mid A, B, C]$, denoted $A^{*}$ and $B^{*}$ respectively.
Further, in step (2), the software behavior feature is represented by an M × M matrix $D = \{d_{ij}\}_{M \times M}$, whose i-th row (i = 1, 2, ..., M) forms a row vector $D_{i} = \langle d_{i1}, d_{i2}, \ldots, d_{iM} \rangle$. The specific steps are as follows:
Input the software behavior model $(A^{*}, B^{*})$ of computer program S or computer program group G;
Perform spectral decomposition on the matrix $A^{*}$, decomposing it as $A^{*} = X \Sigma X^{-1}$, where $\Sigma$ is a diagonal matrix whose diagonal elements are the eigenvalues of $A^{*}$, and each column vector of the matrix X is the eigenvector corresponding to one eigenvalue;
Sort the M eigenvalues in $\Sigma$ by numerical value;
Denote the eigenvector in X corresponding to the 1st sorted eigenvalue as $D_{1}$, the eigenvector corresponding to the 2nd eigenvalue as $D_{2}$, and so on, so that the eigenvector in X corresponding to the i-th sorted eigenvalue is denoted $D_{i}$, i = 1, 2, ..., M.
Further, step (3) uses the software behavior model $(A^{*}, B^{*})$ output by step (1) and the software behavior feature D output by step (2) to compute the software behavior similarity or dissimilarity between two programs (denoted $T_{1}$ and $T_{2}$), between a single program ($T_{1}$) and a program group ($T_{2}$), or between two program groups ($T_{1}$ and $T_{2}$). The higher the dissimilarity, the lower the similarity, and vice versa; a higher similarity means that $T_{1}$ and $T_{2}$ have more similar software behavior. The similarity measure uses two kinds of matrices: $B^{*}$ from the software behavior representation model and the software behavior feature D. To distinguish them, the two matrices of $T_{1}$ are written ${}_{1}B^{*}$ and ${}_{1}D$, and those of $T_{2}$ are written ${}_{2}B^{*}$ and ${}_{2}D$. The specific steps are as follows:
Define a dissimilarity dist(Y, Z) between matrices, where Y and Z are any two matrices of the same order;
Input ${}_{1}B^{*}$, ${}_{1}D$, ${}_{2}B^{*}$ and ${}_{2}D$;
Use the formula $[\mathrm{dist}({}_{1}B^{*}, {}_{2}B^{*})]^{\alpha} \times [\mathrm{dist}({}_{1}D, {}_{2}D)]^{\beta}$ to measure the software behavior dissimilarity between $T_{1}$ and $T_{2}$, where $\alpha$ and $\beta$ are two real numbers greater than or equal to 0.
Compared with the prior art, the beneficial effects of the invention are as follows. High-level software behavior features are abstracted from the low-level features that represent software behavior, so that the behavior of software is described at the semantic level. Through DHMM (discrete hidden Markov model) modeling of computer programs and the spectral decomposition method, the software behavior features of a program are expressed quantitatively, and malware is identified according to the similarity of the representation models and behavior features.
Brief description of the drawings:
Fig. 1 is a flow chart of the invention.
Fig. 2 is a schematic diagram of the principle of the invention.
Detailed description:
The invention is further described below with reference to the drawings and specific embodiments.
The invention relates to a method for computer software behavior recognition. It uses the state transition probability matrix and the emission probability matrix of a discrete hidden Markov model (DHMM) to describe the behavior of software, represents the behavior features of the software by the result of the spectral decomposition of the state transition probability matrix, and finally identifies the similarity of software behavior according to the behavior features and the emission probability matrix. The method flow is shown in Fig. 1 and comprises the following steps:
(1) Construct a software behavior representation model: the software behavior of a single computer program S or a computer program group G is represented by the model-parameter 2-tuple $(A^{*}, B^{*})$ of a DHMM;
(2) Extract software behavior features: perform spectral decomposition on the matrix $A^{*}$ and extract the software behavior feature D;
(3) Measure software behavior similarity: from $B^{*}$ and D, compute the software behavior similarity between two computer programs, between two computer program groups, or between a computer program and a program group.
The invention processes computer programs represented as event sequences. An event sequence is a string of events ordered in time or space. When it is used to represent a computer program, an event may be a computer instruction or instruction sequence that the program contains or actually executes on the CPU; it may be an API function, provided to applications by the computer operating system or computer equipment, that the program contains or calls during execution; or it may be any other discrete symbol describing software features. The notation used is as follows:
1. Event set: $V = \{V_{1}, \ldots, V_{k}, \ldots, V_{K}\}$. Each element $V_{k}$ (k = 1, 2, ..., K) represents an event (a discrete symbol), and K is the number of events;
2. Single computer program: $S = (s_{1}, \ldots, s_{t}, \ldots, s_{n})$, meaning that the program consists of n events in order, where each event $s_{t}$ (t = 1, 2, ..., n) is an element of V, i.e., $s_{t} \in V$;
3. Computer program group: $G = \{S_{1}, S_{2}, \ldots, S_{N}\}$, meaning that the group consists of N programs, each of which is an event sequence as defined in item 2;
4. A set of software behaviors: $U = \{\theta_{1}, \theta_{2}, \ldots, \theta_{M}\}$. Each element represents one abstract software behavior, and M is the number of behaviors;
5. Discrete hidden Markov model DHMM(S) of a computer program S: $DHMM(S) = (S, Q, A, B, C)$, where:
The observation sequence of the model is $S = (s_{1}, \ldots, s_{t}, \ldots, s_{n})$, $s_{t} \in V = \{V_{1}, \ldots, V_{k}, \ldots, V_{K}\}$;
The state sequence of the model is $Q = (q_{1}, \ldots, q_{t}, \ldots, q_{n})$, where each element $q_{t}$ (t = 1, 2, ..., n) is a hidden state of the model corresponding to one software behavior; the number of hidden states is M, i.e., $q_{t} \in U$;
The state transition probability matrix is $A = (a_{ij})_{M \times M}$, $a_{ij} = P[q_{t+1} = \theta_{j} \mid q_{t} = \theta_{i}]$, $1 \le i \le M$, $1 \le j \le M$;
The state emission probability matrix is $B = (b_{ik})_{M \times K}$, $b_{ik} = P[V_{k} \text{ at } t \mid q_{t} = \theta_{i}]$, $1 \le i \le M$, $1 \le k \le K$;
The initial state probability distribution is $C = \{c_{1}, \ldots, c_{j}, \ldots, c_{M}\}$, $c_{j} = P[q_{1} = \theta_{j}]$, $1 \le j \le M$.
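As an illustration of the notation above, the following minimal Python sketch encodes programs as event sequences over an event set V. The API names and the helper names `build_event_set` and `encode` are hypothetical and do not come from the patent, and events are mapped to 0-based integer indices rather than the 1-based $V_{1}, \ldots, V_{K}$.

```python
def build_event_set(programs):
    """Collect the distinct events of all programs, in first-seen order.

    Returns a dict mapping each event (e.g. an API name) to its index in V.
    """
    index = {}
    for program in programs:
        for event in program:
            index.setdefault(event, len(index))
    return index


def encode(program, index):
    """Turn a program (a list of events) into a sequence of indices into V."""
    return [index[event] for event in program]


# Hypothetical program group G = {S_1, S_2}, each program an ordered event list.
G = [
    ["CreateFileW", "WriteFile", "CloseHandle"],
    ["CreateFileW", "CloseHandle"],
]
V = build_event_set(G)                    # event set, K = len(V)
encoded = [encode(S, V) for S in G]       # integer event sequences
```

Integer-coded sequences of this kind are a convenient input for the matrix-based steps that follow, since each index selects a column of the emission matrix B.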
Fig. 2 is a schematic diagram of the principle of the invention. The detailed procedure of each step is explained below.
Further, in step (1), the software behavior representation model is constructed in one of two cases:
Case 1: software behavior representation model of a single computer program
S has M kinds of software behavior, each corresponding to one hidden state of DHMM(S). The software behavior of S is represented by the model $(A^{*}, B^{*})$. The specific steps are as follows:
Set the number M of hidden states and give the initial state probability distribution C;
Input the computer program S;
Call a DHMM training algorithm to find the model parameters A and B that maximize $P[S \mid A, B, C]$, denoted $A^{*}$ and $B^{*}$ respectively;
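The text does not say which DHMM training algorithm is called. One common choice is the Baum-Welch (EM) algorithm, sketched below with NumPy for a single integer-coded event sequence. This is an assumption for illustration, not the patent's prescribed procedure: the function names, the random initialization, and the fixed iteration count are all hypothetical, C is held fixed as in the steps above, and Baum-Welch only reaches a local maximum of $P[S \mid A, B, C]$.

```python
import numpy as np


def forward_backward(obs, A, B, C):
    """Scaled forward-backward pass; returns alpha, beta and scale factors."""
    n, M = len(obs), A.shape[0]
    alpha, beta, scale = np.zeros((n, M)), np.zeros((n, M)), np.zeros(n)
    alpha[0] = C * B[:, obs[0]]
    scale[0] = alpha[0].sum()
    alpha[0] /= scale[0]
    for t in range(1, n):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        scale[t] = alpha[t].sum()
        alpha[t] /= scale[t]
    beta[-1] = 1.0
    for t in range(n - 2, -1, -1):
        beta[t] = (A @ (B[:, obs[t + 1]] * beta[t + 1])) / scale[t + 1]
    return alpha, beta, scale


def baum_welch(obs, M, K, C, n_iter=20, seed=0):
    """Estimate (A, B) for one event sequence with C fixed (assumed trainer)."""
    rng = np.random.default_rng(seed)
    obs = np.asarray(obs)
    A = rng.random((M, M))
    A /= A.sum(axis=1, keepdims=True)
    B = rng.random((M, K))
    B /= B.sum(axis=1, keepdims=True)
    log_liks = []
    for _ in range(n_iter):
        alpha, beta, scale = forward_backward(obs, A, B, C)
        log_liks.append(np.log(scale).sum())      # log P[S | A, B, C]
        gamma = alpha * beta                      # state posteriors per step
        # xi[t, i, j]: posterior of the transition i -> j between t and t+1
        emit_next = (B[:, obs[1:]].T * beta[1:])[:, None, :]
        xi = alpha[:-1, :, None] * A[None] * emit_next / scale[1:, None, None]
        A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
        B = np.stack([gamma[obs == k].sum(axis=0) for k in range(K)], axis=1)
        B /= gamma.sum(axis=0)[:, None]
    return A, B, log_liks


S = [0, 1, 0, 1, 0, 1, 2, 2, 2, 0, 1, 0, 1]       # toy event sequence
A_star, B_star, log_liks = baum_welch(S, M=2, K=3, C=np.array([0.5, 0.5]))
```

Each entry of `log_liks` is the log-likelihood of the parameters before the corresponding update, so the sequence should be non-decreasing; the returned matrices play the role of $A^{*}$ and $B^{*}$.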
Case 2: software behavior representation model of a computer program group
For a computer program group $G = \{S_{1}, S_{2}, \ldots, S_{N}\}$, G has M kinds of software behavior, each corresponding to one hidden state shared by $DHMM(S_{1}), DHMM(S_{2}), \ldots, DHMM(S_{N})$. The software behavior of G is represented by the model $(A^{*}, B^{*})$. The specific steps are as follows:
Set the number M of hidden states and give the initial state probability distribution C;
Input all computer programs $S_{1}, S_{2}, \ldots, S_{N}$ in G;
Call a DHMM training algorithm to find the model parameters A and B that maximize $P[S_{1} \mid A, B, C] \times P[S_{2} \mid A, B, C] \times \cdots \times P[S_{N} \mid A, B, C]$, denoted $A^{*}$ and $B^{*}$ respectively.
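The group objective is a product of per-program likelihoods, which is usually evaluated as a sum of log-likelihoods to avoid numerical underflow. The sketch below computes that quantity with the scaled forward algorithm; it only evaluates the objective for given parameters and does not perform the maximization. The function names are hypothetical, and programs are assumed to be integer-coded event sequences.

```python
import numpy as np


def log_likelihood(obs, A, B, C):
    """log P[S | A, B, C] for one event sequence, via the scaled forward pass."""
    alpha = C * B[:, obs[0]]
    log_p = 0.0
    for symbol in obs[1:]:
        norm = alpha.sum()
        log_p += np.log(norm)
        alpha = ((alpha / norm) @ A) * B[:, symbol]
    return log_p + np.log(alpha.sum())


def group_log_likelihood(programs, A, B, C):
    """log of P[S_1 | A, B, C] x ... x P[S_N | A, B, C] for a program group."""
    return sum(log_likelihood(obs, A, B, C) for obs in programs)


# Toy check: one hidden state that emits two events with probability 0.5 each,
# so a sequence of length n has likelihood 0.5 ** n.
A = np.array([[1.0]])
B = np.array([[0.5, 0.5]])
C = np.array([1.0])
total = group_log_likelihood([[0, 1, 0], [1, 1]], A, B, C)
```

A group trainer would search for the A and B that maximize this sum, for example by accumulating the Baum-Welch statistics over all programs in G before each update.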
Further, in step (2), the software behavior feature is represented by an M × M matrix $D = \{d_{ij}\}_{M \times M}$, whose i-th row (i = 1, 2, ..., M) forms a row vector $D_{i} = \langle d_{i1}, d_{i2}, \ldots, d_{iM} \rangle$. The specific steps are as follows:
Input the software behavior model $(A^{*}, B^{*})$ of computer program S or computer program group G;
Perform spectral decomposition on the matrix $A^{*}$, decomposing it as $A^{*} = X \Sigma X^{-1}$, where $\Sigma$ is a diagonal matrix whose diagonal elements are the eigenvalues of $A^{*}$, and each column vector of the matrix X is the eigenvector corresponding to one eigenvalue;
Sort the M eigenvalues in $\Sigma$ by numerical value;
Denote the eigenvector in X corresponding to the 1st sorted eigenvalue as $D_{1}$, the eigenvector corresponding to the 2nd eigenvalue as $D_{2}$, and so on, so that the eigenvector in X corresponding to the i-th sorted eigenvalue is denoted $D_{i}$, i = 1, 2, ..., M.
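The feature extraction step can be sketched with `numpy.linalg.eig`, which returns the eigenvalues and a matrix whose columns are the corresponding eigenvectors. The text does not state the sort direction, and a non-symmetric transition matrix may have complex eigenvalues, so sorting by descending magnitude is an assumption made here; the function name `behavior_features` is also hypothetical.

```python
import numpy as np


def behavior_features(A_star):
    """Return the sorted eigenvalues of A* and the feature matrix D.

    Row i of D is the eigenvector belonging to the i-th sorted eigenvalue.
    Sorting by descending magnitude is an assumed convention.
    """
    eigvals, X = np.linalg.eig(A_star)    # columns of X are eigenvectors
    order = np.argsort(-np.abs(eigvals))
    D = X[:, order].T                     # eigenvectors become rows of D
    return eigvals[order], D


# Toy 2-state transition matrix; its eigenvalues are 1.0 and 0.7.
A_star = np.array([[0.9, 0.1],
                   [0.2, 0.8]])
eigvals, D = behavior_features(A_star)
```

NumPy normalizes each eigenvector to unit length but leaves its sign (or complex phase) arbitrary, so a practical implementation would probably fix a sign convention before comparing the D matrices of two programs.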
Further, step (3) uses the software behavior model $(A^{*}, B^{*})$ output by step (1) and the software behavior feature D output by step (2) to compute the software behavior similarity or dissimilarity between two programs (denoted $T_{1}$ and $T_{2}$), between a single program ($T_{1}$) and a program group ($T_{2}$), or between two program groups ($T_{1}$ and $T_{2}$). The higher the dissimilarity, the lower the similarity, and vice versa; a higher similarity means that $T_{1}$ and $T_{2}$ have more similar software behavior. The similarity measure uses two kinds of matrices: $B^{*}$ from the software behavior representation model and the software behavior feature D. To distinguish them, the two matrices of $T_{1}$ are written ${}_{1}B^{*}$ and ${}_{1}D$, and those of $T_{2}$ are written ${}_{2}B^{*}$ and ${}_{2}D$. The specific steps are as follows:
Define a dissimilarity dist(Y, Z) between matrices, where Y and Z are any two matrices of the same order;
Input ${}_{1}B^{*}$, ${}_{1}D$, ${}_{2}B^{*}$ and ${}_{2}D$;
Use the formula $[\mathrm{dist}({}_{1}B^{*}, {}_{2}B^{*})]^{\alpha} \times [\mathrm{dist}({}_{1}D, {}_{2}D)]^{\beta}$ to measure the software behavior dissimilarity between $T_{1}$ and $T_{2}$, where $\alpha$ and $\beta$ are two real numbers greater than or equal to 0.
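The dissimilarity formula can be sketched as follows. The text leaves dist(Y, Z) open, so the Frobenius norm of the difference is used here purely as an assumed example; the function names and the default values of `alpha` and `beta` are likewise hypothetical.

```python
import numpy as np


def frobenius_dist(Y, Z):
    """Assumed choice of dist(Y, Z): Frobenius norm of Y - Z (same-order matrices)."""
    return float(np.linalg.norm(np.asarray(Y) - np.asarray(Z)))


def behavior_dissimilarity(B1, D1, B2, D2, alpha=1.0, beta=1.0, dist=frobenius_dist):
    """[dist(B1, B2)]**alpha * [dist(D1, D2)]**beta, with alpha, beta >= 0."""
    if alpha < 0 or beta < 0:
        raise ValueError("alpha and beta must be greater than or equal to 0")
    return dist(B1, B2) ** alpha * dist(D1, D2) ** beta


# Toy matrices: dist(B1, B2) = 2 and dist(D1, D2) = sqrt(2).
B1, B2 = np.eye(2), np.array([[0.0, 1.0], [1.0, 0.0]])
D1, D2 = np.eye(2), np.zeros((2, 2))
score = behavior_dissimilarity(B1, D1, B2, D2, alpha=2, beta=2)
```

Because the two factors are multiplied, the score is 0 whenever either pair of matrices coincides and the corresponding exponent is positive; setting an exponent to 0 removes that factor from the comparison.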
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solution of the invention and not to limit its scope of protection. Although the invention has been described in detail with reference to specific embodiments, those of ordinary skill in the art should understand that the technical solution of the invention may be modified or equivalently substituted without departing from the spirit and scope of the technical solution of the invention.
Priority Applications (1)
Application Number  Priority Date  Filing Date  Title

CN201410012074.7A (CN103778372B)  2014-01-13  2014-01-13  A spectral method for identifying computer software behavior
Publications (2)
Publication Number  Publication Date

CN103778372A  2014-05-07
CN103778372B  2016-10-19
Family ID: 50570596
Country Status (1)
Country  Link

CN (1)  CN103778372B (en)
Families Citing this family (1)
Publication number  Priority date  Publication date  Assignee  Title 

EP3258409B1 (en) *  2015-03-18  2019-07-17  Nippon Telegraph and Telephone Corporation  Device for detecting terminal infected by malware, system for detecting terminal infected by malware, method for detecting terminal infected by malware, and program for detecting terminal infected by malware
Citations (2)
Publication number  Priority date  Publication date  Assignee  Title 

CN103294948A (en) *  2012-02-27  2013-09-11  百度在线网络技术（北京）有限公司  Software malicious behavior modeling and judging method and device, and mobile terminal
CN103500307A (en) *  2013-09-26  2014-01-08  北京邮电大学  Mobile internet malignant application software detection method based on behavior model
Family Cites Families (1)
Publication number  Priority date  Publication date  Assignee  Title 

US6907396B1 (en) *  2000-06-01  2005-06-14  Networks Associates Technology, Inc.  Detecting computer viruses or malicious software by patching instructions into an emulator

Also Published As
Publication number  Publication date 

CN103778372A (en)  2014-05-07
Legal Events
Date  Code  Title  Description 

C06  Publication  
PB01  Publication  
C10  Entry into substantive examination  
SE01  Entry into force of request for substantive examination  
C41  Transfer of patent application or patent right or utility model  
TA01  Transfer of patent application right 
Effective date of registration: 2016-05-18. Address after: 350007 Fuzhou Road, Cangshan District, Fujian, No. three on the road 8. Applicant after: Fujian Normal University. Address before: 350117 Fujian city of Fuzhou province science and Technology University City Road No. 1 Qishan campus of Fujian Normal University. Applicant before: Chen Lifei.

DD01  Delivery of document by public notice 
Addressee: Chen Lifei. Document name: Notification of Passing Examination on Formalities.

GR01  Patent grant  
C14  Grant of patent or utility model  
DD01  Delivery of document by public notice 
Addressee: Fujian Normal University Document name: Notification of Termination of Patent Right 