CN104731705B - Method for discovering dirty data propagation paths based on complex networks - Google Patents
- Publication number
- CN104731705B (application number CN201310750367.0A)
- Authority
- CN
- China
- Prior art keywords
- complex network
- node
- dirty data
- network
- function
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Stored Programmes (AREA)
Abstract
The present invention provides a method for discovering dirty data propagation paths based on complex networks. The method translates a binary program for which no source code is provided, processes the result, and mines useful information from it. Step 1: decompile the binary file to obtain C-language intermediate code; the approach was tested on a simple C program, from which intermediate-language code was obtained. Step 2: capture the function call paths, resolve function addresses to function names, process and simplify the data, generate it in matrix form, and finally produce a function call graph. Step 3: analyze the function call graph to obtain node, edge, and weight information, compute node degrees, and build a complex network graph with key nodes. Step 4: based on the heterogeneity of the power-law distribution of the complex network graph, find the nodes related to the constructed dirty data and the nodes with high call frequency.
Description
Technical field
The present invention relates to a method for discovering dirty data propagation paths based on complex networks, and belongs to the technical field of software security.
Background art
In research methods based on complex networks, many concepts and methods can be used to reflect the statistical characteristics of a network, the most important of which is the degree distribution of its nodes. In a software network, the degree of a node can be interpreted as the number of times that class is called by other classes in the software. Intuitively, therefore, the more often a class is called, the more important it is. However, software networks are generally weighted directed networks, so measuring the importance of a class by degree alone is inaccurate. For example, a class with a specific function may have a small degree, yet removing it from the software could directly prevent the software from running.
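The distinction between plain degree and weighted degree can be illustrated with a minimal sketch. The call records and function names below are hypothetical and are not taken from the patent; the sketch only shows that counting distinct callers and counting total calls can rank the same node differently.

```python
from collections import Counter

# Hypothetical call records: one (caller, callee) pair per observed call.
calls = [
    ("main", "parse"), ("main", "log"), ("parse", "log"),
    ("parse", "log"), ("parse", "validate"), ("validate", "log"),
]

def in_degree(call_records):
    """Number of distinct callers per callee (unweighted in-degree)."""
    distinct_edges = set(call_records)
    return Counter(callee for _, callee in distinct_edges)

def weighted_in_degree(call_records):
    """Total number of calls received per callee (weighted in-degree)."""
    return Counter(callee for _, callee in call_records)

deg = in_degree(calls)
wdeg = weighted_in_degree(calls)
print(deg["log"], wdeg["log"])
```

Here `log` has three distinct callers but receives four calls, so the two measures already differ on a six-record trace.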
Research on scale-free networks shows that node defects generally occur in two ways. The first is a design defect in random nodes: some random nodes in the software network have design defects, but the loss of their functionality generally does not affect the overall normal operation of the software. The second is a design defect in important nodes, that is, in nodes of higher importance in the software network. Because the degree distribution of a software network follows a heterogeneous power-law distribution, such nodes are called very frequently; they account for less than 5% of all nodes in the software network, yet they implement the most important functions of the software. Once the functions of these classes are lost, the system may crash directly. In view of this, giving priority to these few nodes during early software development and during software testing can not only shorten the development and test cycle but also improve software quality. How to find these important nodes more effectively and accurately is therefore the task at hand.
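As a rough illustration of the "fewer than 5% of nodes" observation, the sketch below ranks nodes by degree and keeps the top fraction, and also tabulates how many nodes share each degree. The 5% cut-off comes from the text; treating it as a selection rule, and the function names `key_nodes` and `degree_histogram`, are assumptions made for this example.

```python
import math

def key_nodes(degree_by_node, fraction=0.05):
    """Return the top `fraction` of nodes ranked by degree (at least one node)."""
    ranked = sorted(degree_by_node.items(), key=lambda kv: (-kv[1], kv[0]))
    count = max(1, math.ceil(len(ranked) * fraction))
    return [name for name, _ in ranked[:count]]

def degree_histogram(degree_by_node):
    """Map each degree value to the number of nodes that have it."""
    hist = {}
    for degree in degree_by_node.values():
        hist[degree] = hist.get(degree, 0) + 1
    return dict(sorted(hist.items()))

# Hypothetical network: 28 rarely called functions and 2 heavily called ones.
degrees = {f"f{i}": 1 for i in range(30)}
degrees["f0"] = 30
degrees["f1"] = 12

print(key_nodes(degrees))
print(degree_histogram(degrees))
```

A heavy-tailed histogram like this one, with many low-degree nodes and a few very high-degree ones, is the shape the text describes as heterogeneous.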
Previous researchers have studied decompilation on the basis of IDA disassembly. IDA itself provides rich disassembly result information and data representation definitions, so the implementation of the intermediate language need not simulate the execution of assembly semantics as symbolic execution does, nor does it need a complete and complex description of the instruction system as SSL does. The intermediate language here is implemented mainly from the semantics of assembly instructions and the IDA disassembly result information: a relatively simple dictionary of instruction semantics is built, and assembly language is converted to the intermediate language by looking up the matching entries. This mainly involves the definition of the intermediate language, the description of its semantic dictionary, and the concrete translation implementation.
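The dictionary-lookup style of translation described above might look like the following sketch. The mnemonics, the template syntax, and the `translate` function are illustrative assumptions; the patent does not give the actual dictionary or the intermediate-language definition.

```python
# Hypothetical instruction-semantics dictionary:
# assembly mnemonic -> intermediate-language template.
SEMANTICS = {
    "mov": "{0} = {1}",
    "add": "{0} = {0} + {1}",
    "sub": "{0} = {0} - {1}",
    "call": "call {0}",
    "ret": "return",
}

def translate(instruction):
    """Translate one assembly instruction by dictionary lookup."""
    text = instruction.strip()
    mnemonic, _, operand_text = text.partition(" ")
    operands = [op.strip() for op in operand_text.split(",") if op.strip()]
    template = SEMANTICS.get(mnemonic.lower())
    if template is None:
        return f"unknown({text})"
    try:
        return template.format(*operands)
    except IndexError:
        # Fewer operands than the template expects.
        return f"unknown({text})"

for line in ["mov eax, ebx", "add eax, 4", "call sub_401000", "ret"]:
    print(translate(line))
```

Unmatched instructions are passed through as `unknown(...)` rather than guessed, which keeps the lookup table the single source of semantics.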
Existing dirty data discovery techniques are not comprehensive enough, and no single method covers the whole software run cycle. Moreover, existing methods for discovering dirty data propagation are all based on source code and do not support discovery in binary programs.
Summary of the invention
The present invention provides a method for discovering dirty data propagation paths based on complex networks. It translates a binary program for which no source code is provided, processes the result, and mines useful information from it, so that defects can be found while the software is running and software reliability is improved.
The technical solution of the present invention is as follows:
A method for discovering dirty data propagation paths based on complex networks comprises the following steps:
Step 1: decompile the binary file using the IDA plug-in Hex-Rays to obtain C-like intermediate code;
Step 2: use the tools provided by the GNU compiler to collect data from the intermediate code generated in step 1 and capture the function call paths; after the paths are obtained, resolve function addresses to function names using Addr2line, then process and simplify the trace data and generate it in matrix form; finally, generate the function call graph using Graphviz;
Step 3: parse the function call graph obtained in the previous step to obtain node, edge, and weight information, compute node degrees, and build a complex network graph with key nodes;
Step 4: based on the heterogeneity of the power-law distribution of the complex network graph, and using the complex network graph generated in the previous step, find the nodes related to the constructed dirty data and the nodes with very high call frequency; mark the key nodes in different colors and the dirty data path in a special color, thereby revealing potential software defects.
Step 3 builds the complex network graph by the following method, with the specific steps:
(1) take all functions of the binary file obtained in step 2 as the nodes of the network;
(2) build an unweighted network according to whether a relation exists between nodes;
(3) calculate the degree of correlation between nodes;
(4) calculate the weights from the degree of correlation;
(5) build the weighted network graph.
Beneficial effects of the present invention:
The present invention is based on the disassembler IDA. It translates a binary program for which no source code is provided, processes the result, and mines useful information from it, so that defects can be found while the software is running and software reliability is improved.
The object of analysis in this technique is the binary file rather than the source code, so as a prerequisite the binary code must first be converted into a form similar to source code. In addition, the innovation of this work is to apply complex network theory to analyze relations in software such as function calls, which greatly helps in discovering software defects.
Brief description of the drawings
Fig. 1 is a flow chart of the method for discovering dirty data propagation paths based on complex networks according to the present invention;
Fig. 2 is a diagram of the process of collecting, simplifying, and visualizing the trace paths in the present invention;
Fig. 3 is a schematic diagram of the function call results of an application program in an embodiment of the present invention.
Detailed description of the embodiments
The present invention is described in further detail below with reference to the accompanying drawings.
An embodiment of the present invention is roughly divided into three parts. The first decompiles the binary file to obtain workable source code or an analyzable intermediate language; this part is done with the IDA plug-in Hex-Rays. The second analyzes the decompilation result and produces the function call graph; this part is done with the GNU compiler toolchain, the Addr2line tool, custom intermediate code, and the Graphviz tool. The third generates a complex network graph with key nodes from the call graph and uses graph theory and complex network theory to find the propagation path of the dirty data.
The workflow of the present invention is described in detail with reference to Fig. 1:
1. Decompilation
After comparing several decompilation tools, the present invention uses the IDA plug-in Hex-Rays to decompile the binary file and obtain C-like intermediate code.
2. Drawing the function call graph
Capturing and displaying the call graph of the functions requires four essential elements: the GNU compiler toolchain, the Addr2line tool, the intermediate code obtained in the previous step, and the Graphviz tool. Given an address and an executable image, the Addr2line tool identifies the function and the source line number. The custom intermediate code is a very simple tool that reduces the address trace to a graph specification; here, simple processing can be applied to the decompiled code. The Graphviz tool generates the graph image. The whole procedure is shown in Fig. 2.
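The address-to-name step can be sketched as a thin wrapper around the `addr2line` command from GNU Binutils. The options used here (`-e` for the executable, `-f` to print function names, `-s` to print only file base names) are standard `addr2line` options; the wrapper functions, the file name `a.out`, and the sample address are assumptions for illustration.

```python
import subprocess

def addr2line_command(executable, addresses):
    """Build the addr2line command line for a list of hexadecimal addresses."""
    return ["addr2line", "-e", executable, "-f", "-s", *addresses]

def parse_addr2line_output(text):
    """With -f, addr2line prints two lines per address: function, then file:line."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    results = []
    for i in range(0, len(lines) - 1, 2):
        file_name, _, line_no = lines[i + 1].rpartition(":")
        results.append((lines[i], file_name, line_no))
    return results

def resolve(executable, addresses):
    """Run addr2line and return (function, file, line) tuples."""
    completed = subprocess.run(
        addr2line_command(executable, addresses),
        capture_output=True, text=True, check=True,
    )
    return parse_addr2line_output(completed.stdout)

# Parsing a captured output sample; "??" is what addr2line prints for unknowns.
print(parse_addr2line_output("main\ntest.c:12\nhelper\n??:0\n"))
```

Calling `resolve("a.out", ["0x401136"])` would require a binary built with debug information; the parser can be exercised on captured text without one.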
First, the tools provided by the GNU compiler are used to collect data from the intermediate code generated in step 1 and to capture the function call paths. After the paths are obtained, Addr2line resolves the function addresses to function names. The trace data is then processed and simplified, and generated in matrix form. Finally, Graphviz generates the function call graph.
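The collect, simplify, and visualize sequence can be sketched as follows. The patent does not name the tracing mechanism; one plausible choice is GCC's `-finstrument-functions` option, which calls hook functions on every function entry and exit. The trace format assumed here (`E name` on entry, `X name` on exit, with names already resolved) and all function names in the sketch are hypothetical.

```python
def call_counts(trace_lines):
    """Turn entry/exit events into {(caller, callee): count} using a call stack."""
    stack, counts = [], {}
    for line in trace_lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip malformed lines
        kind, name = parts
        if kind == "E":
            if stack:
                edge = (stack[-1], name)
                counts[edge] = counts.get(edge, 0) + 1
            stack.append(name)
        elif kind == "X" and stack:
            stack.pop()
    return counts

def to_matrix(counts):
    """Build a call matrix: rows are callers, columns are callees."""
    names = sorted({name for edge in counts for name in edge})
    index = {name: i for i, name in enumerate(names)}
    matrix = [[0] * len(names) for _ in names]
    for (caller, callee), count in counts.items():
        matrix[index[caller]][index[callee]] = count
    return names, matrix

def to_dot(names, matrix):
    """Emit Graphviz DOT text with the call count as the edge label."""
    lines = ["digraph calls {"]
    for i, caller in enumerate(names):
        for j, callee in enumerate(names):
            if matrix[i][j]:
                lines.append(f'  "{caller}" -> "{callee}" [label="{matrix[i][j]}"];')
    lines.append("}")
    return "\n".join(lines)

trace = ["E main", "E parse", "E log", "X log", "E log", "X log",
         "X parse", "E log", "X log", "X main"]
names, matrix = to_matrix(call_counts(trace))
print(to_dot(names, matrix))
```

The resulting text can be saved to a file and rendered with the Graphviz `dot` command, for example `dot -Tpng calls.dot -o calls.png`.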
3. Finding the dirty data propagation path
Dirty data is data that has been modified but not yet saved or further processed, or data whose value has been lost. Its propagation path can be obtained by constructing dirty data and recording the information of the nodes it passes through; clustering techniques from complex network theory are then used to find similar nodes.
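The text refers to clustering in complex networks without naming a specific measure. One standard candidate is the local clustering coefficient, sketched below on an undirected view of the call graph; choosing this measure, and the small example graph, are assumptions rather than the patent's stated method.

```python
from itertools import combinations

def local_clustering(adjacency):
    """Local clustering coefficient per node.

    adjacency maps each node to the set of its neighbors (undirected graph).
    The coefficient is the share of neighbor pairs that are themselves linked.
    """
    result = {}
    for node, neighbors in adjacency.items():
        k = len(neighbors)
        if k < 2:
            result[node] = 0.0
            continue
        links = sum(
            1 for a, b in combinations(sorted(neighbors), 2)
            if b in adjacency.get(a, set())
        )
        result[node] = 2.0 * links / (k * (k - 1))
    return result

# Hypothetical undirected view of a small call graph.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
print(local_clustering(graph))
```

Nodes with similar coefficients and overlapping neighborhoods could then be grouped as "similar" for the purposes of tracing where dirty data is likely to travel.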
First, parse the graph file obtained in the previous step to obtain node, edge, and weight information, and compute node degrees and related measures.
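Parsing the graph file might be sketched as below, assuming the file is Graphviz DOT text whose edges have the simple form `caller -> callee [label="N"]`. The regular expression handles only that form and is not a full DOT parser; the function names are illustrative.

```python
import re

# Matches: caller -> callee, optionally followed by [label="N"].
EDGE = re.compile(r'"?([\w.]+)"?\s*->\s*"?([\w.]+)"?\s*(?:\[label="?(\d+)"?\])?')

def parse_dot(text):
    """Return (nodes, edges); edges maps (caller, callee) to its weight."""
    nodes, edges = set(), {}
    for match in EDGE.finditer(text):
        caller, callee, label = match.groups()
        weight = int(label) if label else 1  # unlabeled edges count once
        nodes.update((caller, callee))
        edges[(caller, callee)] = edges.get((caller, callee), 0) + weight
    return nodes, edges

def degrees(nodes, edges):
    """In-degree and out-degree per node, counting each distinct edge once."""
    in_deg = {node: 0 for node in nodes}
    out_deg = {node: 0 for node in nodes}
    for caller, callee in edges:
        out_deg[caller] += 1
        in_deg[callee] += 1
    return in_deg, out_deg

dot_text = """digraph calls {
  main -> parse [label="1"];
  main -> log [label="1"];
  parse -> log [label="2"];
}"""
nodes, edges = parse_dot(dot_text)
print(degrees(nodes, edges))
```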
The complex network model of the software system is built in the following steps:
(1) take all functions of the binary file obtained in step 2 as the nodes of the network;
(2) build an unweighted network according to whether a relation exists between nodes;
(3) calculate the degree of correlation between nodes;
(4) calculate the weights from the degree of correlation;
(5) build the weighted network graph.
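The five steps above can be sketched as a single function. The patent does not define how the degree of correlation or the weights are computed; this sketch assumes the correlation of an edge is the share of the caller's calls that go to that callee, and uses the rounded correlation directly as the weight. Both choices, and all names, are placeholders.

```python
def build_weighted_network(functions, call_counts):
    """Follow steps (1) to (5) under the assumptions stated in the comments."""
    # (1) every function is a node
    nodes = sorted(functions)
    # (2) unweighted network: an edge exists if at least one call was observed
    unweighted = {edge for edge, count in call_counts.items() if count > 0}
    # (3) correlation (assumption): share of the caller's calls going to the callee
    out_total = {}
    for (caller, _), count in call_counts.items():
        out_total[caller] = out_total.get(caller, 0) + count
    correlation = {
        (caller, callee): call_counts[(caller, callee)] / out_total[caller]
        for caller, callee in unweighted
    }
    # (4) weights (assumption): the correlation itself, rounded
    weights = {edge: round(value, 3) for edge, value in correlation.items()}
    # (5) weighted network as an adjacency dictionary
    graph = {node: {} for node in nodes}
    for (caller, callee), weight in weights.items():
        graph.setdefault(caller, {})[callee] = weight
    return unweighted, graph

counts = {("main", "parse"): 1, ("main", "log"): 3, ("parse", "log"): 2}
unweighted, graph = build_weighted_network({"main", "parse", "log"}, counts)
print(graph)
```

Any other correlation measure, such as one based on shared neighbors, could be substituted in step (3) without changing the surrounding structure.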
Based on the heterogeneity of the power-law distribution of the complex network, and using the complex network model generated in the previous step, find the nodes related to the constructed dirty data and the nodes with very high call frequency. The key nodes are marked in different colors and the dirty data path is marked in a special color, thereby revealing potential software defects.
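The marking step can be sketched as follows. The patent does not specify how the path is traced; this example assumes that dirty data constructed in one function can reach every function reachable from it along call edges, which over-approximates real data flow. The colors, the example graph, and the function names are hypothetical.

```python
from collections import deque

def propagation_edges(graph, source):
    """Breadth-first walk from the node where the dirty data is constructed."""
    seen, edges, queue = {source}, [], deque([source])
    while queue:
        node = queue.popleft()
        for nxt in sorted(graph.get(node, ())):
            edges.append((node, nxt))
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return edges

def colored_dot(graph, key_nodes, dirty_edges):
    """Emit DOT text: key nodes in red, dirty data path edges in orange."""
    dirty = set(dirty_edges)
    lines = ["digraph network {"]
    for node in sorted(graph):
        color = "red" if node in key_nodes else "black"
        lines.append(f'  "{node}" [color={color}];')
    for caller in sorted(graph):
        for callee in sorted(graph[caller]):
            color = "orange" if (caller, callee) in dirty else "gray"
            lines.append(f'  "{caller}" -> "{callee}" [color={color}];')
    lines.append("}")
    return "\n".join(lines)

# Hypothetical call graph; dirty data is constructed in "read".
graph = {"main": ["read", "log"], "read": ["parse"], "parse": ["log"], "log": []}
path = propagation_edges(graph, "read")
print(colored_dot(graph, key_nodes={"log"}, dirty_edges=path))
```

In the rendered graph, a key node that also lies on an orange path is a natural candidate for closer inspection.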
Although embodiments of the present invention have been described with reference to the accompanying drawings, those skilled in the art can make variations, substitutions, and improvements without departing from the principles of the present invention, and these shall also be regarded as falling within the scope of protection of the present invention.
Claims (2)
1. A method for discovering dirty data propagation paths based on complex networks, characterized by comprising the following steps:
Step 1: decompile the binary file using the IDA plug-in Hex-Rays to obtain C-like intermediate code;
Step 2: use the tools provided by the GNU compiler to collect data from the intermediate code generated in step 1 and capture the function call paths; after the paths are obtained, resolve function addresses to function names using Addr2line, then process and simplify the trace data and generate it in matrix form; finally, generate the function call graph using Graphviz;
Step 3: parse the function call graph obtained in the previous step to obtain node, edge, and weight information, compute node degrees, and build a complex network graph with key nodes;
Step 4: based on the heterogeneity of the power-law distribution of the complex network graph, and using the complex network graph generated in the previous step, find the nodes related to the constructed dirty data and the nodes with high call frequency; mark the key nodes in different colors and the dirty data path in a special color, thereby revealing potential software defects.
2. The method for discovering dirty data propagation paths based on complex networks according to claim 1, characterized in that step 3 builds the complex network graph by the following method, with the specific steps:
(1) take all functions of the binary file obtained in step 2 as the nodes of the network;
(2) build an unweighted network according to whether a relation exists between nodes;
(3) calculate the degree of correlation between nodes;
(4) calculate the weights from the degree of correlation;
(5) build the weighted network graph.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310750367.0A CN104731705B (en) | 2013-12-31 | 2013-12-31 | Method for discovering dirty data propagation paths based on complex networks |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310750367.0A CN104731705B (en) | 2013-12-31 | 2013-12-31 | Method for discovering dirty data propagation paths based on complex networks |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104731705A CN104731705A (en) | 2015-06-24 |
CN104731705B true CN104731705B (en) | 2017-09-01 |
Family
ID=53455615
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310750367.0A Active CN104731705B (en) | Method for discovering dirty data propagation paths based on complex networks | 2013-12-31 | 2013-12-31 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104731705B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104932865B (en) * | 2015-07-10 | 2017-10-10 | 武汉工程大学 | A kind of component agreement method for digging, apparatus and system |
CN105068928A (en) * | 2015-08-04 | 2015-11-18 | 中国人民解放军理工大学 | Complex network theory based software test use-case generating method |
CN114748875B (en) * | 2022-05-20 | 2023-03-24 | 一点灵犀信息技术(广州)有限公司 | Data saving method, device, equipment, storage medium and program product |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101140588A (en) * | 2007-10-10 | 2008-03-12 | 华为技术有限公司 | Method and apparatus for ordering incidence relation search result |
US7434046B1 (en) * | 1999-09-10 | 2008-10-07 | Cisco Technology, Inc. | Method and apparatus providing secure multicast group communication |
CN101330417A (en) * | 2008-07-24 | 2008-12-24 | 安徽大学 | Quotient space overlay model for calculating network shortest path and building method thereof |
CN102841844A (en) * | 2012-07-13 | 2012-12-26 | 北京航空航天大学 | Method for binary code vulnerability discovery on basis of simple symbolic execution |
CN103200096A (en) * | 2013-03-13 | 2013-07-10 | 南京理工大学 | Heuristic routing method avoiding key nodes in complex network |
Also Published As
Publication number | Publication date |
---|---|
CN104731705A (en) | 2015-06-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106503496B (en) | Based on operation code replacement and combined Python shell script anti-reversal method | |
CN104572072B (en) | A kind of language transfer method and equipment to the program based on MVC pattern | |
Buinevich et al. | The life cycle of vulnerabilities in the representations of software for telecommunication devices | |
CN105677574B (en) | Android application leak detection method and system based on function control stream | |
CN103377045B (en) | Method and system for Translation Verification Test | |
CN106371887A (en) | System and method for MSVL compiling | |
CN110196720B (en) | Optimization method for generating dynamic link library by Simulink | |
CN112163420A (en) | NLP technology-based RPA process automatic generation method | |
CN112104709A (en) | Intelligent contract processing method, device, medium and electronic equipment | |
CN110196815A (en) | Software fuzzy test method | |
CN104731705B (en) | Method for discovering dirty data propagation paths based on complex networks | |
CN112540767A (en) | Program code generation method, program code generation device, electronic device and storage medium | |
CN106777529A (en) | Integrated circuit fault-resistant injection attacks capability assessment method based on FPGA | |
Martinez et al. | Recovering sequence diagrams from object-oriented code: An ADM approach | |
CN109155129B (en) | Language program control system | |
Balsamo et al. | Deriving performance models from software architecture specifications | |
CN117093222A (en) | Code parameter abstract generation method and system based on improved converter model | |
CN111176995B (en) | Test method and test system based on big data test case | |
CN112685291A (en) | System joint test method and related device | |
Zhang et al. | Automated extraction of grammar optimization rule configurations for metamodel-grammar co-evolution | |
Lerchner et al. | An open S-BPM runtime environment based on abstract state machines | |
Zhang | An Approach for Extracting UML Diagram from Object-Oriented Program Based on J2X | |
Berti et al. | Evaluating Large Language Models in Process Mining: Capabilities, Benchmarks, Evaluation Strategies, and Future Challenges | |
Haga et al. | Inconsistency Checking of UML Sequence Diagrams and State Machines Using the Structure-Behavior Coalescence Method | |
CN111651773B (en) | Automatic binary security vulnerability mining method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |