CN103778372B - A spectral method for identifying computer software behavior - Google Patents

A spectral method for identifying computer software behavior

Info

Publication number
CN103778372B
Authority
CN
China
Prior art keywords
software behavior
software
model
computer program
matrix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201410012074.7A
Other languages
Chinese (zh)
Other versions
CN103778372A (en)
Inventor
陈黎飞
陈可意
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujian Normal University
Original Assignee
Fujian Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujian Normal University
Priority to CN201410012074.7A
Publication of CN103778372A
Application granted
Publication of CN103778372B
Status: Expired - Fee Related
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/562 Static detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Virology (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Complex Calculations (AREA)

Abstract

The invention discloses a spectral method for identifying computer software behavior, comprising: (1) constructing a software behavior representation model; (2) extracting software behavior features; and (3) measuring software behavior similarity. The beneficial effects of the invention are as follows: high-level software behavior features are abstracted from the low-level features that represent software behavior, so that the behavior of software is described at the semantic level; and, through DHMM (discrete hidden Markov model) modeling of computer programs and a spectral decomposition method, the software behavior features of a program are expressed quantitatively, so that malware is identified according to the similarity of the representation models and behavior features.

Description

A spectral method for identifying computer software behavior
Technical field:
The present invention relates to a spectral method for identifying computer software behavior.
Background:
Computer software behavior identification technology is used to help judge whether a computer program is malware. Current methods use low-level features that represent software behavior (including signature codes, API sequences, etc.) and predict software behavior either by signature matching or by machine-learning-based sequence pattern matching. The former works only for known malware: once the malware mutates, the signature database must be updated promptly. The latter suffers from high false-positive and false-negative rates.
Summary of the invention:
The object of the invention is to overcome the deficiencies of the prior art and to provide a spectral method for identifying computer software behavior.
To solve the above technical problem, the present invention provides a spectral method for identifying computer software behavior, comprising the following steps:
(1) Construct the software behavior representation model: represent the software behavior of a single computer program S or a computer program group G by the two-tuple of DHMM model parameters (A*, B*);
(2) Extract software behavior features: perform spectral decomposition on the matrix A* and extract the software behavior feature D;
(3) Measure software behavior similarity: according to B* and D, calculate the software behavior similarity between two computer programs, between two computer program groups, or between a computer program and a program group.
Further, in step (1), the software behavior representation model is constructed in one of two cases:
Case 1: software behavior representation model of a single computer program
S has M kinds of software behavior, and each behavior corresponds to one hidden state of DHMM(S). The software behavior of S is represented by the model (A*, B*). The specific steps are as follows:
Set the number M of hidden states, and specify the initial state probability distribution C;
Input the computer program S;
Call a DHMM training algorithm to find the model parameters A and B that maximize P[S | A, B, C], denoted A* and B* respectively;
Case 2: software behavior representation model of a computer program group
For a computer program group G = {S_1, S_2, ..., S_N}, G has M kinds of software behavior, and each behavior corresponds to one hidden state shared by DHMM(S_1), DHMM(S_2), ..., DHMM(S_N). The software behavior of G is represented by the model (A*, B*). The specific steps are as follows:
Set the number M of hidden states, and specify the initial state probability distribution C;
Input all computer programs S_1, S_2, ..., S_N in G;
Call a DHMM training algorithm to find the model parameters A and B that maximize P[S_1 | A, B, C] × P[S_2 | A, B, C] × ... × P[S_N | A, B, C], denoted A* and B* respectively.
Further, in step (2), the software behavior features are represented by an M × M matrix D = {d_{ij}}_{M×M}, where the elements of the i-th row of D (i = 1, 2, ..., M) form a row vector D_i = <d_{i1}, d_{i2}, ..., d_{iM}>. The specific steps are as follows:
Input the software behavior model (A*, B*) of the computer program S or the computer program group G;
Perform spectral decomposition on the matrix A*, decomposing it as A* = X Σ X^{-1}, where Σ is a diagonal matrix whose diagonal elements are the eigenvalues of A*, and each row vector of the matrix X is the eigenvector corresponding to an eigenvalue;
Sort the M eigenvalues in Σ by numerical value;
Denote the eigenvector of X corresponding to the 1st eigenvalue after sorting as D_1, the eigenvector of X corresponding to the 2nd eigenvalue as D_2, and so on; the eigenvector of X corresponding to the i-th eigenvalue after sorting is denoted D_i, i = 1, 2, ..., M.
Further, step (3) uses the software behavior model (A*, B*) output by step (1) and the software behavior feature D output by step (2) to calculate the software behavior similarity or dissimilarity between two programs (denoted T_1 and T_2), between a single program (denoted T_1) and a program group (denoted T_2), or between two program groups (denoted T_1 and T_2). The higher the dissimilarity, the lower the similarity, and vice versa; a higher similarity means that T_1 and T_2 have more similar software behavior. The similarity measurement uses two kinds of matrices: B* from the software behavior representation model, and the software behavior feature D. To distinguish them, the two matrices of T_1 are denoted ^1B* and ^1D, and those of T_2 are denoted ^2B* and ^2D. The specific steps are as follows:
Define a dissimilarity dist(Y, Z) between matrices, where Y and Z denote any two matrices of the same order;
Input ^1B*, ^1D, ^2B* and ^2D;
Use the formula [dist(^1B*, ^2B*)]^α × [dist(^1D, ^2D)]^β to measure the software behavior dissimilarity between T_1 and T_2, where α and β are two real numbers greater than or equal to 0.
Compared with the prior art, the beneficial effects of the invention are as follows: high-level software behavior features are abstracted from the low-level features that represent software behavior, so that software behavior is described at the semantic level; through DHMM (discrete hidden Markov model) modeling of computer programs and spectral decomposition, the software behavior features of a program are expressed quantitatively, and malware is identified according to the similarity of the representation models and behavior features.
Brief description of the drawings:
Fig. 1 is a flow chart of the present invention.
Fig. 2 is a schematic diagram of the principle of the present invention.
Detailed description of the embodiments:
The invention is further described below with reference to the drawings and specific embodiments:
The present invention relates to a method for identifying computer software behavior. It describes the behavior of software using the state transition probability matrix and the emission probability matrix of a discrete hidden Markov model (DHMM), represents the behavior features of software by the result of a spectral decomposition of the state transition probability matrix, and finally identifies the similarity of software behavior according to the behavior features and the emission probability matrix. The method flow is shown in Fig. 1 and comprises the following steps:
(1) Construct the software behavior representation model: represent the software behavior of a single computer program S or a computer program group G by the two-tuple of DHMM model parameters (A*, B*);
(2) Extract software behavior features: perform spectral decomposition on the matrix A* and extract the software behavior feature D;
(3) Measure software behavior similarity: according to B* and D, calculate the software behavior similarity between two computer programs, between two computer program groups, or between a computer program and a program group.
The present invention processes computer programs represented as event sequences. An event sequence is a string of events ordered in time or space. When it is used to represent a computer program, an event may be a computer instruction or instruction sequence that the program contains or that is actually executed on the CPU; it may be an API function, provided to applications by the computer operating system or computer equipment, that the program contains or calls during execution; or it may be some other discrete symbol describing software features. The notation used is as follows:
1. Event set: V = {V_1, ..., V_k, ..., V_K}, where each element V_k (k = 1, 2, ..., K) denotes an event (a discrete symbol) and K is the number of events;
2. Single computer program: S = (s_1, ..., s_t, ..., s_n), meaning that the program consists of n events in order, each of which (s_t, t = 1, 2, ..., n) is an element of V, i.e. s_t ∈ V;
3. Computer program group: G = {S_1, S_2, ..., S_N}, meaning that the group consists of N programs, each of which is an event sequence as defined in item 2;
4. A set of software behaviors: U = {θ_1, θ_2, ..., θ_M}, where each element denotes an abstract software behavior and M is the number of behaviors;
5. Discrete hidden Markov model DHMM(S) of a computer program S: DHMM(S) = (S, Q, A, B, C), where:
- the observation sequence of the model is S = (s_1, ..., s_t, ..., s_n), s_t ∈ V = {V_1, ..., V_k, ..., V_K};
- the state sequence of the model is Q = (q_1, ..., q_t, ..., q_n), where each element q_t (t = 1, 2, ..., n) is a hidden state of the model corresponding to a software behavior; the number of hidden states is M, i.e. q_t ∈ U;
- the state transition probability matrix is A = (a_{ij})_{M×M}, a_{ij} = P[q_{t+1} = θ_j | q_t = θ_i], 1 ≤ i ≤ M, 1 ≤ j ≤ M;
- the state emission probability matrix is B = (b_{ik})_{M×K}, b_{ik} = P[V_k at t | q_t = θ_i], 1 ≤ i ≤ M, 1 ≤ k ≤ K;
- the initial state probability distribution is C = {c_1, ..., c_j, ..., c_M}, c_j = P[q_1 = θ_j], 1 ≤ j ≤ M.
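To make the notation above concrete, the following is a minimal Python sketch and is not part of the patent text. It encodes event sequences as integer indices into V and evaluates log P[S | A, B, C] with the standard scaled forward algorithm. The API names, the helper names `build_event_set`, `encode` and `log_likelihood`, and the toy parameter values are all illustrative assumptions.

```python
import numpy as np

def build_event_set(programs):
    """Assign each distinct event an index 0..K-1 (the event set V)."""
    vocab = {}
    for program in programs:
        for event in program:
            vocab.setdefault(event, len(vocab))
    return vocab

def encode(program, vocab):
    """Turn a program (sequence of event names) into S = (s_1, ..., s_n)."""
    return [vocab[event] for event in program]

def log_likelihood(obs, A, B, C):
    """Scaled forward algorithm: returns log P[S | A, B, C].

    A is M x M (transitions), B is M x K (emissions), C has length M.
    """
    A, B, C = (np.asarray(x, dtype=float) for x in (A, B, C))
    alpha = C * B[:, obs[0]]          # joint probability of q_1 and s_1
    log_p = 0.0
    for t in range(1, len(obs) + 1):
        scale = alpha.sum()           # P[s_t | s_1 .. s_{t-1}]
        log_p += np.log(scale)
        alpha = alpha / scale
        if t < len(obs):
            alpha = (alpha @ A) * B[:, obs[t]]
    return log_p

# Hypothetical program group G: two programs given as API-call sequences.
G = [
    ["CreateFile", "WriteFile", "CloseHandle"],
    ["CreateFile", "ReadFile", "ReadFile", "CloseHandle"],
]
V = build_event_set(G)                # K = 4 events
S = encode(G[1], V)

# Toy DHMM with M = 2 hidden states (values chosen only for illustration).
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.4, 0.3, 0.2, 0.1], [0.1, 0.2, 0.3, 0.4]])
C = np.array([0.5, 0.5])
print(log_likelihood(S, A, B, C))
```

Working in the log domain with per-step scaling avoids numerical underflow on long event sequences, which is why the sketch does not multiply raw probabilities.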
Fig. 2 is a schematic diagram of the principle of the present invention; the detailed process of each step is explained below.
Further, in step (1), the software behavior representation model is constructed in one of two cases:
Case 1: software behavior representation model of a single computer program
S has M kinds of software behavior, and each behavior corresponds to one hidden state of DHMM(S). The software behavior of S is represented by the model (A*, B*). The specific steps are as follows:
Set the number M of hidden states, and specify the initial state probability distribution C;
Input the computer program S;
Call a DHMM training algorithm to find the model parameters A and B that maximize P[S | A, B, C], denoted A* and B* respectively;
Case 2: software behavior representation model of a computer program group
For a computer program group G = {S_1, S_2, ..., S_N}, G has M kinds of software behavior, and each behavior corresponds to one hidden state shared by DHMM(S_1), DHMM(S_2), ..., DHMM(S_N). The software behavior of G is represented by the model (A*, B*). The specific steps are as follows:
Set the number M of hidden states, and specify the initial state probability distribution C;
Input all computer programs S_1, S_2, ..., S_N in G;
Call a DHMM training algorithm to find the model parameters A and B that maximize P[S_1 | A, B, C] × P[S_2 | A, B, C] × ... × P[S_N | A, B, C], denoted A* and B* respectively.
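The text leaves the choice of "a DHMM training algorithm" open. As one possible instance, the sketch below uses Baum-Welch (expectation-maximization) over several sequences with C held fixed; this choice, the function names `forward_backward` and `train_dhmm`, the random initialization and the iteration count are assumptions, not the patent's prescription. Baum-Welch only reaches a local maximum of the product of likelihoods, so the result depends on the starting point. Passing a single sequence covers Case 1.

```python
import numpy as np

def forward_backward(obs, A, B, C):
    """Scaled forward and backward variables for one integer-coded sequence."""
    n, M = len(obs), A.shape[0]
    alpha, beta, scale = np.zeros((n, M)), np.ones((n, M)), np.zeros(n)
    alpha[0] = C * B[:, obs[0]]
    scale[0] = alpha[0].sum()
    alpha[0] /= scale[0]
    for t in range(1, n):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        scale[t] = alpha[t].sum()
        alpha[t] /= scale[t]
    for t in range(n - 2, -1, -1):
        beta[t] = (A @ (B[:, obs[t + 1]] * beta[t + 1])) / scale[t + 1]
    return alpha, beta, scale

def train_dhmm(sequences, M, K, C, n_iter=50, seed=0):
    """Baum-Welch over a program group; returns (A*, B*, log-likelihood history).

    The summed log-likelihood equals the log of the product
    P[S_1|A,B,C] x ... x P[S_N|A,B,C]. C is kept fixed, as in the text.
    """
    rng = np.random.default_rng(seed)
    A = rng.random((M, M)); A /= A.sum(axis=1, keepdims=True)
    B = rng.random((M, K)); B /= B.sum(axis=1, keepdims=True)
    C = np.asarray(C, dtype=float)
    history = []
    for _ in range(n_iter):
        A_num, B_num, total = np.zeros((M, M)), np.zeros((M, K)), 0.0
        for obs in sequences:
            alpha, beta, scale = forward_backward(obs, A, B, C)
            total += np.log(scale).sum()
            gamma = alpha * beta      # P[q_t = state | sequence]
            for t in range(len(obs) - 1):
                # Expected transition counts between consecutive time steps.
                xi = np.outer(alpha[t], B[:, obs[t + 1]] * beta[t + 1]) * A
                A_num += xi / scale[t + 1]
            for t, k in enumerate(obs):
                B_num[:, k] += gamma[t]
        history.append(total)
        A = A_num / A_num.sum(axis=1, keepdims=True)
        B = B_num / B_num.sum(axis=1, keepdims=True)
    return A, B, history

# Hypothetical group of three integer-coded programs over K = 3 events.
group = [[0, 1, 0, 1, 0, 1, 2], [0, 1, 0, 1, 2, 2], [1, 0, 1, 0, 2]]
A_star, B_star, history = train_dhmm(group, M=2, K=3, C=[0.5, 0.5], n_iter=20)
```

The sketch assumes every sequence has at least two events and that every hidden state keeps a nonzero occupancy; a production implementation would add guards or smoothing for those cases.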
Further, in step (2), the software behavior features are represented by an M × M matrix D = {d_{ij}}_{M×M}, where the elements of the i-th row of D (i = 1, 2, ..., M) form a row vector D_i = <d_{i1}, d_{i2}, ..., d_{iM}>. The specific steps are as follows:
Input the software behavior model (A*, B*) of the computer program S or the computer program group G;
Perform spectral decomposition on the matrix A*, decomposing it as A* = X Σ X^{-1}, where Σ is a diagonal matrix whose diagonal elements are the eigenvalues of A*, and each row vector of the matrix X is the eigenvector corresponding to an eigenvalue;
Sort the M eigenvalues in Σ by numerical value;
Denote the eigenvector of X corresponding to the 1st eigenvalue after sorting as D_1, the eigenvector of X corresponding to the 2nd eigenvalue as D_2, and so on; the eigenvector of X corresponding to the i-th eigenvalue after sorting is denoted D_i, i = 1, 2, ..., M.
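The feature extraction steps above can be sketched with NumPy as follows. Several details are not fixed by the text and are assumptions here: eigenvalues are sorted by descending magnitude (a transition matrix is generally not symmetric, so eigenvalues may be complex), eigenvectors are the unit-norm right eigenvectors returned by `numpy.linalg.eig`, and a sign convention is applied so that features from different models are comparable. `numpy.linalg.eig` returns eigenvectors as columns, so the sketch transposes them to obtain the rows D_i. The function name `behavior_features` is hypothetical.

```python
import numpy as np

def behavior_features(A_star):
    """Return (sorted eigenvalues, D), where row D[i] is the eigenvector
    paired with the i-th eigenvalue after sorting."""
    A_star = np.asarray(A_star, dtype=float)
    eigvals, eigvecs = np.linalg.eig(A_star)      # eigenvectors are columns
    # Assumed ordering: largest magnitude first, ties kept in original order.
    order = np.argsort(-np.abs(eigvals), kind="stable")
    eigvals = eigvals[order]
    D = eigvecs[:, order].T.copy()                # one eigenvector per row
    # Assumed sign convention: the largest-magnitude component of each
    # eigenvector gets a non-negative real part.
    for i in range(D.shape[0]):
        pivot = D[i, np.argmax(np.abs(D[i]))]
        if pivot.real < 0:
            D[i] = -D[i]
    return eigvals, D

# Toy 2-state transition matrix (illustrative values only).
A_star = np.array([[0.9, 0.1], [0.2, 0.8]])
eigvals, D = behavior_features(A_star)
```

For any row-stochastic A*, the eigenvalue of largest magnitude is 1 and its right eigenvector is constant across states, so under these assumptions D_1 carries little discriminating information and the remaining rows do most of the work.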
Further, step (3) uses the software behavior model (A*, B*) output by step (1) and the software behavior feature D output by step (2) to calculate the software behavior similarity or dissimilarity between two programs (denoted T_1 and T_2), between a single program (denoted T_1) and a program group (denoted T_2), or between two program groups (denoted T_1 and T_2). The higher the dissimilarity, the lower the similarity, and vice versa; a higher similarity means that T_1 and T_2 have more similar software behavior. The similarity measurement uses two kinds of matrices: B* from the software behavior representation model, and the software behavior feature D. To distinguish them, the two matrices of T_1 are denoted ^1B* and ^1D, and those of T_2 are denoted ^2B* and ^2D. The specific steps are as follows:
Define a dissimilarity dist(Y, Z) between matrices, where Y and Z denote any two matrices of the same order;
Input ^1B*, ^1D, ^2B* and ^2D;
Use the formula [dist(^1B*, ^2B*)]^α × [dist(^1D, ^2D)]^β to measure the software behavior dissimilarity between T_1 and T_2, where α and β are two real numbers greater than or equal to 0.
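The dissimilarity formula can be sketched as below. The text does not specify dist(Y, Z); the Frobenius norm of the difference is used here purely as an assumption, and the function names `dist` and `behavior_dissimilarity` are hypothetical. Because dist requires matrices of the same order, both models are assumed to share the same number of hidden states M and the same event set of size K.

```python
import numpy as np

def dist(Y, Z):
    """Assumed matrix dissimilarity: Frobenius norm of Y - Z."""
    Y, Z = np.asarray(Y), np.asarray(Z)
    if Y.shape != Z.shape:
        raise ValueError("dist is only defined for matrices of the same order")
    return float(np.linalg.norm(Y - Z))

def behavior_dissimilarity(B1, D1, B2, D2, alpha=1.0, beta=1.0):
    """[dist(B1, B2)]^alpha * [dist(D1, D2)]^beta with alpha, beta >= 0."""
    if alpha < 0 or beta < 0:
        raise ValueError("alpha and beta must be greater than or equal to 0")
    return dist(B1, B2) ** alpha * dist(D1, D2) ** beta

# Toy matrices standing in for (^1B*, ^1D) and (^2B*, ^2D).
B1 = np.array([[1.0, 0.0], [0.0, 1.0]])
B2 = np.array([[0.0, 1.0], [1.0, 0.0]])
D1 = np.array([[1.0, 0.0], [0.0, 1.0]])
D2 = np.array([[1.0, 0.0], [0.0, 0.0]])
score = behavior_dissimilarity(B1, D1, B2, D2, alpha=2.0, beta=1.0)
```

Setting α = 0 or β = 0 turns the corresponding factor into 1, so only the other matrix contributes. Note also that, since the formula is a product, the overall dissimilarity is 0 whenever either factor is 0 and its exponent is positive.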
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solution of the invention and not to limit its scope of protection. Although the invention has been described in detail with reference to specific embodiments, those of ordinary skill in the art should understand that the technical solution of the invention may be modified or equivalently substituted without departing from its spirit and scope.

Claims (1)

1. A spectral method for identifying computer software behavior, characterized in that it is implemented by the following steps:
(1) Construct the software behavior representation model: represent the software behavior of a single computer program S or a computer program group G by the two-tuple of DHMM model parameters (A*, B*);
(2) Extract software behavior features: perform spectral decomposition on the matrix A* and extract the software behavior feature D;
(3) Measure software behavior similarity: according to B* and D, calculate the software behavior similarity between two computer programs, between two computer program groups, or between a computer program and a program group;
In step (1), the software behavior representation model is constructed in one of two cases:
Case 1: software behavior representation model of a single computer program
S has M kinds of software behavior, and each behavior corresponds to one hidden state of DHMM(S); the software behavior of S is represented by the model (A*, B*), and the specific steps are as follows:
Set the number M of hidden states, and specify the initial state probability distribution C;
Input the computer program S;
Call a DHMM training algorithm to find the model parameters A and B that maximize P[S | A, B, C], denoted A* and B* respectively;
Case 2: software behavior representation model of a computer program group
For a computer program group G = {S_1, S_2, ..., S_N}, G has M kinds of software behavior, and each behavior corresponds to one hidden state shared by DHMM(S_1), DHMM(S_2), ..., DHMM(S_N); the software behavior of G is represented by the model (A*, B*), and the specific steps are as follows:
Set the number M of hidden states, and specify the initial state probability distribution C;
Input all computer programs S_1, S_2, ..., S_N in G;
Call a DHMM training algorithm to find the model parameters A and B that maximize P[S_1 | A, B, C] × P[S_2 | A, B, C] × ... × P[S_N | A, B, C], denoted A* and B* respectively;
In step (2), the software behavior features are represented by an M × M matrix D = {d_{ij}}_{M×M}, where the elements of the i-th row of D (i = 1, 2, ..., M) form a row vector D_i = <d_{i1}, d_{i2}, ..., d_{iM}>, and the specific steps are as follows:
Input the software behavior model (A*, B*) of the computer program S or the computer program group G;
Perform spectral decomposition on the matrix A*, decomposing it as A* = X Σ X^{-1}, where Σ is a diagonal matrix whose diagonal elements are the eigenvalues of A*, and each row vector of the matrix X is the eigenvector corresponding to an eigenvalue;
Sort the M eigenvalues in Σ by numerical value;
Denote the eigenvector of X corresponding to the 1st eigenvalue after sorting as D_1, the eigenvector of X corresponding to the 2nd eigenvalue as D_2, and so on; the eigenvector of X corresponding to the i-th eigenvalue after sorting is denoted D_i, i = 1, 2, ..., M;
Step (3) uses the software behavior model (A*, B*) output by step (1) and the software behavior feature D output by step (2) to calculate the software behavior similarity or dissimilarity between two programs (denoted T_1 and T_2), between a single program (denoted T_1) and a program group (denoted T_2), or between two program groups (denoted T_1 and T_2); the higher the dissimilarity, the lower the similarity, and vice versa; a higher similarity means that T_1 and T_2 have more similar software behavior; the similarity measurement uses two kinds of matrices: B* from the software behavior representation model, and the software behavior feature D; to distinguish them, the two matrices of T_1 are denoted ^1B* and ^1D, and those of T_2 are denoted ^2B* and ^2D; the specific steps are as follows:
Define a dissimilarity dist(Y, Z) between matrices, where Y and Z denote any two matrices of the same order;
Input ^1B*, ^1D, ^2B* and ^2D;
Use the formula [dist(^1B*, ^2B*)]^α × [dist(^1D, ^2D)]^β to measure the software behavior dissimilarity between T_1 and T_2, where α and β are two real numbers greater than or equal to 0.
CN201410012074.7A 2014-01-13 2014-01-13 A spectral method for identifying computer software behavior Expired - Fee Related CN103778372B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410012074.7A CN103778372B (en) 2014-01-13 2014-01-13 A spectral method for identifying computer software behavior

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410012074.7A CN103778372B (en) 2014-01-13 2014-01-13 A spectral method for identifying computer software behavior

Publications (2)

Publication Number Publication Date
CN103778372A CN103778372A (en) 2014-05-07
CN103778372B true CN103778372B (en) 2016-10-19

Family

ID=50570596

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410012074.7A Expired - Fee Related CN103778372B (en) 2014-01-13 2014-01-13 A spectral method for identifying computer software behavior

Country Status (1)

Country Link
CN (1) CN103778372B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10303873B2 (en) * 2015-03-18 2019-05-28 Nippon Telegraph And Telephone Corporation Device for detecting malware infected terminal, system for detecting malware infected terminal, method for detecting malware infected terminal, and program for detecting malware infected terminal

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103294948A (en) * 2012-02-27 2013-09-11 百度在线网络技术(北京)有限公司 Software malicious behavior modeling and judging method and device, and mobile terminal
CN103500307A (en) * 2013-09-26 2014-01-08 北京邮电大学 Mobile internet malignant application software detection method based on behavior model

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6907396B1 (en) * 2000-06-01 2005-06-14 Networks Associates Technology, Inc. Detecting computer viruses or malicious software by patching instructions into an emulator

Also Published As

Publication number Publication date
CN103778372A (en) 2014-05-07

Similar Documents

Publication Publication Date Title
CN106570513B (en) The method for diagnosing faults and device of big data network system
Dong et al. Automatic age estimation based on deep learning algorithm
CN106951499B (en) A kind of knowledge mapping representation method based on translation model
CN104504400B A kind of driver's anomaly detection method based on online behavior modeling
CN110378480B (en) Model training method and device and computer readable storage medium
CN110263979B (en) Method and device for predicting sample label based on reinforcement learning model
CN106897265B (en) Word vector training method and device
CN109670302B (en) SVM-based classification method for false data injection attacks
CN110909125B (en) Detection method of media rumor of news-level society
CN111292195A (en) Risk account identification method and device
Pratama et al. A novel meta-cognitive-based scaffolding classifier to sequential non-stationary classification problems
CN111612125A (en) Novel HTM time pool method and system for online learning
CN109522432B (en) Image retrieval method integrating adaptive similarity and Bayes framework
CN104077524B (en) Training method and viruses indentification method and device for viruses indentification
CN104049612A (en) Processing workshop scheduling method based on distribution estimation
CN103778372B (en) A kind of spectral method identifying computer software behavior
Halkias et al. Sparse penalty in deep belief networks: using the mixed norm constraint
CN115776401B (en) Method and device for tracing network attack event based on less sample learning
Mostofi et al. Explainable safety risk management in construction with unsupervised learning
CN113516199B (en) Image data generation method based on differential privacy
Liu et al. Contrastive divergence learning for the restricted Boltzmann machine
CN110942089B (en) Multi-level decision-based keystroke recognition method
WO2021059527A1 (en) Learning device, learning method, and recording medium
CN112132269A (en) Model processing method, device, equipment and storage medium
CN111832815A (en) Scientific research hotspot prediction method and system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C41 Transfer of patent application or patent right or utility model
TA01 Transfer of patent application right

Effective date of registration: 20160518

Address after: No. 8 Shangsan Road, Cangshan District, Fuzhou, Fujian 350007

Applicant after: Fujian Normal University

Address before: Qishan Campus of Fujian Normal University, No. 1 Keji Road, University Town, Fuzhou, Fujian 350117

Applicant before: Chen Lifei

DD01 Delivery of document by public notice

Addressee: Chen Lifei

Document name: Notification of Passing Examination on Formalities

C14 Grant of patent or utility model
GR01 Patent grant
DD01 Delivery of document by public notice

Addressee: Fujian Normal University

Document name: Notification of Termination of Patent Right

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20161019

Termination date: 20190113