CN105488084A - Tree isomorphism based software installation package classification method and system - Google Patents

Info

Publication number
CN105488084A
Authority
CN
China
Prior art keywords
installation package
tree
tree structure
software installation
software
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410813440.9A
Other languages
Chinese (zh)
Inventor
刘爽
童志明
张栗伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Antiy Technology Co Ltd
Original Assignee
Harbin Antiy Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2014-12-24
Filing date
2014-12-24
Publication date
2016-04-13
Application filed by Harbin Antiy Technology Co Ltd
Priority to CN201410813440.9A
Publication of CN105488084A

Landscapes

  • Stored Programmes (AREA)

Abstract

The invention proposes a tree isomorphism based software installation package classification method and system. The method obtains the resource section information of a PE-format installation package and builds a tree structure from it, then compares that tree structure in turn with the tree structures in a tree structure library. If the similarity exceeds a preset value, the installation package is assigned to the same category as the software corresponding to the matching tree structure in the library. The method can classify installation packages that carry no feature information, which makes the classification more reasonable.

Description

Tree isomorphism based software installation package classification method and system
Technical field
The present invention relates to the field of network security, and in particular to a software installation package classification method and system based on tree isomorphism.
Background art
At present, most methods for classifying Windows software installation packages on the IA32 architecture rely on feature information that the installer-building tool leaves in the installation package it generates. When a building tool leaves no such feature information, common classifiers have difficulty classifying its packages. In addition, common classifiers may assign installation packages in different languages produced by the same building tool to several different categories, so the classification results are neither accurate nor reasonable.
Summary of the invention
To address the above classification defects, the present invention proposes a software installation package classification method based on tree isomorphism. It addresses the classification discrepancies that arise when an installation package carries no feature information or when packages differ only in language.
A software installation package classification method based on tree isomorphism comprises:
Obtaining a software installation package;
Analyzing whether the software installation package is in a valid PE format; if so, continuing the detection, otherwise discarding the installation package;
Obtaining the resource section data in the PE format;
Forming a tree structure from the resource section data;
Comparing the tree structure with the tree structures in a tree structure library; if the similarity exceeds a preset value, the software installation package belongs to the software category corresponding to that tree structure in the tree structure library, otherwise the software installation package is a new category.
In the method, the resource section data comprise program appearance information, character string information and cursor information.
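The PE validity check in the steps above can be illustrated with a minimal Python sketch. The patent does not say which checks make a file a "valid PE format"; the sketch assumes only the two basic signatures of the PE layout (the `MZ` DOS header and the `PE\0\0` signature located through the `e_lfanew` field at offset 0x3C). The function name `looks_like_pe` is hypothetical, and a real classifier would likely validate more fields.

```python
import struct


def looks_like_pe(data: bytes) -> bool:
    """Return True if the buffer has an MZ header that points at a PE signature.

    Assumption: this is a minimal signature check, not the patent's own test.
    """
    # The DOS header is 0x40 bytes long and starts with the magic bytes "MZ".
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    # e_lfanew: 32-bit little-endian file offset of the PE signature, at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    # An out-of-range offset yields an empty slice, so the comparison fails.
    return data[pe_offset:pe_offset + 4] == b"PE\x00\x00"


# Usage: a synthetic 68-byte buffer with the PE signature right after the DOS header.
sample = b"MZ" + b"\x00" * 58 + struct.pack("<I", 0x40) + b"PE\x00\x00"
is_pe = looks_like_pe(sample)
```

Packages that fail this check would be discarded before any resource data is read.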
A software installation package classification system based on tree isomorphism comprises:
An acquisition module, for obtaining a software installation package;
An analysis module, for analyzing whether the software installation package is in a valid PE format and, if so, continuing the detection, otherwise discarding the installation package; for obtaining the resource section data in the PE format; and for forming a tree structure from the resource section data;
A matching module, for comparing the tree structure with the tree structures in a tree structure library; if the similarity exceeds a preset value, the software installation package belongs to the software category corresponding to that tree structure in the tree structure library, otherwise the software installation package is a new category.
In the system, the resource section data comprise program appearance information, character string information and cursor information.
An advantage of the present invention is as follows. The installation package program files of the Windows platform on the IA32 architecture are in PE format, and the program's appearance information, character string information, cursor information and similar data are all stored in the PE file as resources, forming a resource tree. The present invention classifies installation package programs according to the structure of this resource tree: installation package programs whose resource trees have the same or a similar structure are placed in the same category, which yields a more reasonable classification of software installation packages.
The present invention proposes a software installation package classification method and system based on tree isomorphism. The resource section information of a PE-format installation package is obtained and formed into a tree structure, and this tree structure is compared in turn with the tree structures in a tree structure library. If the similarity exceeds a preset value, the installation package belongs to the same category as the software corresponding to that tree structure in the library. The method of the present invention can classify installation packages that have no feature information, making the classification more reasonable.
Description of the drawings
To explain the technical solutions of the present invention or of the prior art more clearly, the drawings needed in the description of the embodiments or of the prior art are briefly introduced below. The drawings described below are obviously only some of the embodiments recorded in the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of an embodiment of a tree isomorphism based software installation package classification method of the present invention;
Fig. 2 is a schematic structural diagram of an embodiment of a tree isomorphism based software installation package classification system of the present invention.
Detailed description
To help those skilled in the art better understand the technical solutions in the embodiments of the present invention, and to make the above objects, features and advantages of the present invention more apparent, the technical solutions of the present invention are described in further detail below with reference to the drawings.
To address the above classification defects, the present invention proposes a software installation package classification method based on tree isomorphism. It addresses the classification discrepancies that arise when an installation package carries no feature information or when packages differ only in language.
A software installation package classification method based on tree isomorphism, as shown in Fig. 1, comprises:
S101: obtaining a software installation package;
S102: analyzing whether the software installation package is in a valid PE format; if so, continuing the detection by performing S103, otherwise discarding the installation package;
S103: obtaining the resource section data in the PE format;
S104: forming a tree structure from the resource section data;
S105: comparing the tree structure with the tree structures in the tree structure library and judging whether the similarity exceeds the preset value; if so, the software installation package belongs to the software category corresponding to that tree structure in the tree structure library, otherwise the software installation package is a new category.
In the method, the resource section data comprise program appearance information, character string information and cursor information.
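Steps S103 and S104 can be sketched as follows. The patent does not describe how the tree is built; this sketch assumes the standard PE resource directory layout, in which a 16-byte directory header (with the named-entry and ID-entry counts at offsets 12 and 14) is followed by 8-byte entries whose second field has its high bit set when it points to a subdirectory. The function name `parse_resource_dir` and the `(entry_id, children)` node shape are illustrative choices, and the input is assumed to be the raw bytes of the resource section.

```python
import struct


def parse_resource_dir(rsrc: bytes, offset: int = 0, depth: int = 0) -> list:
    """Parse a resource directory at `offset` into a list of (entry_id, children).

    Assumption: `rsrc` holds the raw resource section, so directory offsets
    are relative to the start of the buffer.
    """
    # Guard against malformed or self-referencing data (normal trees have 3 levels).
    if depth > 3 or offset + 16 > len(rsrc):
        return []
    # NumberOfNamedEntries and NumberOfIdEntries sit at header offsets 12 and 14.
    named, ids = struct.unpack_from("<HH", rsrc, offset + 12)
    nodes = []
    for i in range(named + ids):
        entry_off = offset + 16 + 8 * i
        if entry_off + 8 > len(rsrc):
            break
        name_or_id, data_off = struct.unpack_from("<II", rsrc, entry_off)
        if data_off & 0x80000000:
            # High bit set: the low 31 bits are the offset of a subdirectory.
            children = parse_resource_dir(rsrc, data_off & 0x7FFFFFFF, depth + 1)
        else:
            # Otherwise the entry points to a data entry, treated here as a leaf.
            children = []
        nodes.append((name_or_id, children))
    return nodes


# Usage: a synthetic section with one type entry (ID 3) holding two leaf entries.
root = struct.pack("<IIHHHH", 0, 0, 0, 0, 0, 1) + struct.pack("<II", 3, 0x80000000 | 24)
sub = struct.pack("<IIHHHH", 0, 0, 0, 0, 0, 2) + struct.pack("<II", 1, 100) + struct.pack("<II", 2, 200)
tree = parse_resource_dir(root + sub)
```

Only the shape of the tree matters for the comparison step, so the sketch ignores the resource payloads themselves.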
A specific way to compare the tree structure with the tree structures in the tree structure library and judge whether the similarity exceeds the preset value is as follows. Using the minimal representation of a tree, a measure is assigned to the nodes of each level of the tree, and the nodes of each level are then sorted in ascending order of that measure. Once every level of both trees has been sorted in this way, each tree yields a serialized sequence of measures, and the longest common subsequence of the two sequences is computed. The proportion of the sequences covered by the longest common subsequence is taken as the similarity of the two tree structures: the higher the proportion, the more similar the two trees. Other tree comparison methods can of course also be used, but the method adopted in this embodiment is comparatively accurate.
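The comparison just described can be sketched in Python under two labeled assumptions, since the patent names neither the measure nor the normalization: the measure of a node is taken to be the size of its subtree, and the similarity is taken to be twice the longest common subsequence length divided by the combined length of the two sequences. A tree is represented as a nested list in which each node is the list of its children; all function names are hypothetical.

```python
def subtree_size(node: list) -> int:
    """Assumed measure: number of nodes in the subtree rooted at `node`."""
    return 1 + sum(subtree_size(child) for child in node)


def serialize(tree: list) -> list:
    """Walk the tree level by level, emitting each level's measures in ascending order."""
    seq, level = [], [tree]
    while level:
        seq.extend(sorted(subtree_size(n) for n in level))
        level = [child for n in level for child in n]
    return seq


def lcs_length(a: list, b: list) -> int:
    """Length of the longest common subsequence, by row-wise dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def similarity(t1: list, t2: list) -> float:
    """Assumed normalization: 2 * LCS / (len(s1) + len(s2)), a value in [0, 1]."""
    s1, s2 = serialize(t1), serialize(t2)
    return 2 * lcs_length(s1, s2) / (len(s1) + len(s2))


# Usage: the same shape with its children in a different order, and a different shape.
a = [[[], []], [[]]]
b = [[[]], [[], []]]
c = [[[]], [[]]]
same_shape = similarity(a, b)
other_shape = similarity(a, c)
```

Sorting each level before serializing is what makes the result independent of the order in which sibling resources appear, which is the property the method needs for packages that differ only in language. Flattening whole levels discards which parent each node belongs to, so this sketch is a simplification rather than a strict isomorphism test.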
A software installation package classification system based on tree isomorphism, as shown in Fig. 2, comprises:
An acquisition module 201, for obtaining a software installation package;
An analysis module 202, for analyzing whether the software installation package is in a valid PE format and, if so, continuing the detection, otherwise discarding the installation package; for obtaining the resource section data in the PE format; and for forming a tree structure from the resource section data;
A matching module 203, for comparing the tree structure with the tree structures in the tree structure library; if the similarity exceeds a preset value, the software installation package belongs to the software category corresponding to that tree structure in the tree structure library, otherwise the software installation package is a new category.
In the system, the resource section data comprise program appearance information, character string information and cursor information.
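The behaviour of the matching module can be sketched as a loop over a library of reference signatures. The sketch assumes each tree has already been reduced to a sequence of integers, that similarity is the longest common subsequence length normalized by the combined sequence length, and that the preset value is 0.8; the patent fixes none of these, and the category names `family_a` and `family_b` are placeholders.

```python
def lcs_ratio(a: list, b: list) -> float:
    """Assumed similarity: 2 * LCS(a, b) / (len(a) + len(b)); 0.0 if both are empty."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    total = len(a) + len(b)
    return 2 * prev[-1] / total if total else 0.0


def classify(signature: list, library: dict, threshold: float = 0.8):
    """Compare against library entries in order; return the first category whose
    similarity exceeds the threshold, or None to signal a new category."""
    for category, reference in library.items():
        if lcs_ratio(signature, reference) > threshold:
            return category
    return None


# Usage with a hypothetical two-entry library.
library = {"family_a": [6, 2, 3, 1, 1, 1], "family_b": [4, 1, 1, 1]}
exact = classify([6, 2, 3, 1, 1, 1], library)
near = classify([4, 1, 1, 1, 1], library)
unknown = classify([9, 9, 9], library)
```

Returning the first match mirrors the sequential comparison in the description. Picking the best match over the whole library would be an alternative design, and a caller that receives None could add the new signature to the library as a new category.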
An advantage of the present invention is as follows. The installation package program files of the Windows platform on the IA32 architecture are in PE format, and the program's appearance information, character string information, cursor information and similar data are all stored in the PE file as resources, forming a resource tree. The present invention classifies installation package programs according to the structure of this resource tree: installation package programs whose resource trees have the same or a similar structure are placed in the same category, which yields a more reasonable classification of software installation packages.
The present invention proposes a software installation package classification method and system based on tree isomorphism. The resource section information of a PE-format installation package is obtained and formed into a tree structure, and this tree structure is compared in turn with the tree structures in a tree structure library. If the similarity exceeds a preset value, the installation package belongs to the same category as the software corresponding to that tree structure in the library. The method of the present invention can classify installation packages that have no feature information, making the classification more reasonable.
From the above description of the embodiments, those skilled in the art can clearly understand that the present invention can be implemented by software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. This computer software product can be stored in a storage medium such as ROM/RAM, a magnetic disk or an optical disc, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to perform the methods described in the embodiments of the present invention or in certain parts of the embodiments.
The embodiments in this specification are described progressively. For parts that are the same or similar across embodiments, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, because the system embodiment is substantially similar to the method embodiment, its description is relatively brief; for the relevant parts, refer to the description of the method embodiment.
The present invention can be described in the general context of computer-executable instructions, such as program modules. Generally, program modules include routines, programs, objects, components, data structures and the like that perform particular tasks or implement particular abstract data types. The present invention can also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules can be located in both local and remote computer storage media, including storage devices.
Although the present invention has been described by way of embodiments, those of ordinary skill in the art know that the present invention has many variations and changes that do not depart from its spirit, and it is intended that the appended claims cover these variations and changes without departing from the spirit of the present invention.

Claims (4)

1. A software installation package classification method based on tree isomorphism, characterized by comprising:
Obtaining a software installation package;
Analyzing whether the software installation package is in a valid PE format; if so, continuing the detection, otherwise discarding the installation package;
Obtaining the resource section data in the PE format;
Forming a tree structure from the resource section data;
Comparing the tree structure with the tree structures in a tree structure library; if the similarity exceeds a preset value, the software installation package belongs to the software category corresponding to that tree structure in the tree structure library, otherwise the software installation package is a new category.
2. The method of claim 1, characterized in that the resource section data comprise program appearance information, character string information and cursor information.
3. A software installation package classification system based on tree isomorphism, characterized by comprising:
An acquisition module, for obtaining a software installation package;
An analysis module, for analyzing whether the software installation package is in a valid PE format and, if so, continuing the detection, otherwise discarding the installation package; for obtaining the resource section data in the PE format; and for forming a tree structure from the resource section data;
A matching module, for comparing the tree structure with the tree structures in a tree structure library; if the similarity exceeds a preset value, the software installation package belongs to the software category corresponding to that tree structure in the tree structure library, otherwise the software installation package is a new category.
4. The system of claim 3, characterized in that the resource section data comprise program appearance information, character string information and cursor information.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410813440.9A CN105488084A (en) 2014-12-24 2014-12-24 Tree isomorphism based software installation package classification method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410813440.9A CN105488084A (en) 2014-12-24 2014-12-24 Tree isomorphism based software installation package classification method and system

Publications (1)

Publication Number Publication Date
CN105488084A (en) 2016-04-13

Family

ID=55675062

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410813440.9A Pending CN105488084A (en) 2014-12-24 2014-12-24 Tree isomorphism based software installation package classification method and system

Country Status (1)

Country Link
CN (1) CN105488084A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108255522A (en) * 2016-12-27 2018-07-06 北京金山云网络技术有限公司 Application program classification method and device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020105541A1 (en) * 1999-10-27 2002-08-08 Fujitsu Limited Multimedia information arranging apparatus and arranging method
CN102722556A (en) * 2012-05-29 2012-10-10 清华大学 Model comparison method based on similarity measurement
CN103067364A (en) * 2012-12-21 2013-04-24 华为技术有限公司 Virus detection method and equipment
CN103761483A (en) * 2014-01-27 2014-04-30 百度在线网络技术(北京)有限公司 Method and device for detecting malicious codes
CN104008333A (en) * 2013-02-21 2014-08-27 腾讯科技(深圳)有限公司 Installation package detecting method and device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20160413
