Detailed Description of the Embodiments
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.
Fig. 1 is a schematic flowchart of the application topology graph identification method provided by an embodiment of the present invention. As shown in Fig. 1, this embodiment provides an application topology graph identification method, comprising:
S1, obtaining a plurality of nodes of an application system to be identified and the communication associations between the nodes;
Specifically, the application topology graph identification device may use web crawler technology: starting from the host of any node in the application system to be identified, a crawler extracts the plurality of nodes included in the application system and the communication associations between the nodes. It can be understood that a node is defined by "IP address + communication port", and that a communication association between nodes is directional and can be defined as "IP address + communication port" (source node) → "IP address + communication port" (destination node). The device may also obtain the nodes of the application system and the communication associations between them in other ways.
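The node and association definitions above can be sketched as two small record types. This is only an illustration under stated assumptions: the names `Node` and `Link` and the sample addresses and ports are hypothetical and do not come from the embodiment.

```python
from typing import NamedTuple


class Node(NamedTuple):
    """A node identified by "IP address + communication port"."""
    ip: str
    port: int


class Link(NamedTuple):
    """A directed communication association: source node -> destination node."""
    source: Node
    destination: Node


# Hypothetical sample data; the addresses and ports are placeholders.
m1 = Node("10.0.0.1", 8080)
n1 = Node("10.0.0.2", 3306)
link = Link(source=m1, destination=n1)

# Direction matters: the reverse link is a different association.
reverse = Link(source=n1, destination=m1)
```

Because both types are tuples, they are hashable and can be used directly as keys or set members when the directed graph is built.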
S2, generating an initial directed graph according to the plurality of nodes and the communication associations between the nodes, and aggregating the initial directed graph according to a directed graph aggregation algorithm to obtain an aggregated directed graph;
Specifically, the device generates the initial directed graph from the plurality of nodes and the communication associations between the nodes, and aggregates the initial directed graph according to the directed graph aggregation algorithm to obtain the aggregated directed graph. It can be understood that the initial directed graph includes a plurality of source nodes, a plurality of destination nodes, and, between each source node and its corresponding destination node, a directed edge indicating the communication association between that source node and that destination node. The initial directed graph is generated in the same way as in the prior art, which is not repeated here. The aggregated directed graph includes a plurality of aggregate nodes and directed edges indicating the communication associations between the aggregate nodes. An aggregate node is obtained by aggregating nodes of the initial directed graph, according to the directed graph aggregation algorithm, on the basis of the communication associations between the nodes in the initial directed graph; each aggregate node includes one or more nodes.
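As a minimal sketch of how an initial directed graph might be held in memory, the following builds an adjacency mapping from (source, destination) pairs. The function name `build_initial_digraph` and the sample links are assumptions made for illustration; the embodiment itself defers to prior-art graph construction.

```python
def build_initial_digraph(links):
    """Map every node to the set of destination nodes it communicates with."""
    graph = {}
    for source, destination in links:
        graph.setdefault(source, set()).add(destination)
        # Destination-only nodes still appear in the graph, with no outgoing edges.
        graph.setdefault(destination, set())
    return graph


# Hypothetical links; each node is an ("IP", "port") pair.
links = [
    (("IP1", "A1"), ("IP2", "B1")),
    (("IP1", "A2"), ("IP3", "C1")),
    (("IP1", "A1"), ("IP3", "C1")),
]
graph = build_initial_digraph(links)
```

A mapping from source node to destination set keeps the edge direction explicit and makes it cheap to compare the destination sets of different source nodes later.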
S3, obtaining attribute information of each node, and verifying the aggregated directed graph according to the attribute information of each node to obtain the application topology graph of the application system to be identified.
Specifically, the device obtains the attribute information of each node, determines from it whether the nodes included in each aggregate node of the aggregated directed graph have consistent attribute information, and thereby verifies the aggregated directed graph to obtain the application topology graph of the application system to be identified. The attribute information may include system attribution information, type attribution information and cluster attribution information, and may also include other information; it can be set and adjusted according to the actual situation and is not specifically limited here.
In the application topology graph identification method provided by this embodiment of the present invention, an initial directed graph is generated from the obtained nodes of the application system to be identified and the communication associations between them, and the initial directed graph is aggregated according to a directed graph aggregation algorithm to obtain an aggregated directed graph; the attribute information of each node is then obtained, and the aggregated directed graph is verified according to that attribute information to obtain the application topology graph of the application system to be identified, which improves the efficiency of identifying the application topology graph.
On the basis of the above embodiment, further, obtaining the plurality of nodes of the application system to be identified and the communication associations between the nodes comprises:
selecting a preset node in the application system to be identified as a starting point, and obtaining the communication ports of the host of the preset node;
crawling step by step according to the communication peer of the node corresponding to each communication port, to obtain the plurality of nodes of the application system to be identified and the communication associations between the nodes.
Specifically, by analogy with web crawler technology, a node in the application system is likened to a page, and a communication association between two nodes is likened to a URL link from one page to another. The device selects the preset node in the application system to be identified as the starting point and obtains the communication ports of the host of the preset node. According to the communication peer of the node corresponding to each communication port, the crawler replicates itself onto the host of that communication peer and obtains the communication ports of the new host, and so on; by replicating repeatedly, the crawler finally obtains the plurality of nodes of the application system to be identified and the communication associations between them. The preset node may be any of the nodes included in the application system, and is preferably a node at the top layer of the application system; it can be set and adjusted according to the actual situation and is not specifically limited here.
For example, as shown in Fig. 2, the preset node M1 is "IP1 + port A1". The device finds that the host of the preset node M1 has three ports, A1, A2 and A3, so the nodes corresponding to the three ports are M1 ("IP1 + port A1"), M2 ("IP1 + port A2") and M3 ("IP1 + port A3"). The communication peers of nodes M1, M2 and M3 are node N1 ("IP2 + port B1"), node G1 ("IP3 + port C1") and node H1 ("IP4 + port D1"), respectively. The device finds that the host of node N1 has two communication ports, B1 and B2, and obtains the corresponding nodes N1 ("IP2 + port B1") and N2 ("IP2 + port B2"). Likewise, the communication ports of the host of the communication peer of node M2 are C1 and C2, with corresponding nodes G1 ("IP3 + port C1") and G2 ("IP3 + port C2"); the communication ports of the host of the communication peer of node M3 are D1 and D2, with corresponding nodes H1 ("IP4 + port D1") and H2 ("IP4 + port D2"). The process continues in this way until all nodes of the application system have been traversed, which finally yields the plurality of nodes of the application system to be identified and the communication associations between them.
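The step-by-step crawl of the Fig. 2 example can be sketched as a breadth-first traversal over hosts. This is an illustration only: the tables `HOST_PORTS` and `PEERS` are hypothetical in-memory stand-ins for what a crawler would observe on each host, and `crawl` is an assumed name; a real crawler would query the hosts themselves.

```python
from collections import deque

# Hypothetical observations, mirroring the Fig. 2 example:
# HOST_PORTS maps a host IP to its communication ports,
# PEERS maps a node ("IP", "port") to the nodes it communicates with.
HOST_PORTS = {"IP1": ["A1", "A2", "A3"], "IP2": ["B1", "B2"],
              "IP3": ["C1", "C2"], "IP4": ["D1", "D2"]}
PEERS = {("IP1", "A1"): [("IP2", "B1")],
         ("IP1", "A2"): [("IP3", "C1")],
         ("IP1", "A3"): [("IP4", "D1")]}


def crawl(start_ip):
    """Breadth-first crawl from one host; returns (nodes, directed links)."""
    nodes, links = set(), set()
    visited_hosts = {start_ip}
    queue = deque([start_ip])
    while queue:
        ip = queue.popleft()
        for port in HOST_PORTS.get(ip, []):
            node = (ip, port)
            nodes.add(node)
            for peer in PEERS.get(node, []):
                links.add((node, peer))
                nodes.add(peer)
                if peer[0] not in visited_hosts:
                    # Move on to the peer's host, visiting each host only once.
                    visited_hosts.add(peer[0])
                    queue.append(peer[0])
    return nodes, links


nodes, links = crawl("IP1")
```

Tracking visited hosts is what guarantees termination when the communication associations contain cycles, which the text does not rule out.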
On the basis of the above embodiment, further, the initial directed graph includes a plurality of source nodes, a plurality of destination nodes, and a directed edge between each source node and its corresponding destination node. Correspondingly, aggregating the initial directed graph according to the directed graph aggregation algorithm to obtain the aggregated directed graph comprises:
S201, obtaining, from the initial directed graph, a plurality of source node sets and the destination node set corresponding to each source node set, wherein a source node set is the set of source nodes whose corresponding destination nodes are identical, and the destination node set corresponding to a source node set is the set of destination nodes corresponding to the source nodes included in that source node set;
S202, generating a first collision set from the source node sets and a second collision set from the destination node sets, wherein each source node set is one element of the first collision set and the destination node set corresponding to that source node set is one element of the second collision set;
S203, selecting the first source node set in the first collision set as a first candidate set, and taking out the first destination node set in the second collision set as a second candidate set;
S204, performing a difference-intersection operation on the first candidate set and the second candidate set to obtain one target source node set, and updating the second candidate set to the difference of the second candidate set and the first candidate set;
S205, taking the next source node set in the first collision set as the first candidate set and repeating step S204 until the first collision set has been traversed or the second candidate set is empty, and then performing step S206;
S206, selecting the first source node set in the first collision set as the first candidate set, taking out the next destination node set in the second collision set as the second candidate set, and returning to step S204, until the second collision set is empty, and then performing step S207;
S207, taking the plurality of target source node sets as aggregate nodes, obtaining the communication associations between the aggregate nodes, and generating the aggregated directed graph according to the communication associations between the aggregate nodes.
Specifically, if the device determines from the initial directed graph that several source nodes have the same corresponding destination nodes, it takes those source nodes as one source node set and those destination nodes as the destination node set corresponding to that source node set; in this way it obtains a plurality of source node sets and the destination node set corresponding to each. The device then generates the first collision set by taking each source node set as one element, and generates the second collision set by taking the destination node set corresponding to each element of the first collision set as one element. Next, the device selects the first element of the first collision set (the first source node set) as the first candidate set and takes out the first element of the second collision set (the first destination node set) as the second candidate set. The device then performs the difference-intersection operation on the first and second candidate sets: it computes the difference of the first candidate set and the second candidate set, computes the intersection of the first candidate set and the second candidate set, and takes the union of that intersection and that difference as one target source node set; the difference of the second candidate set and the first candidate set becomes the new second candidate set. The device then selects the next source node set in the first collision set as the first candidate set and performs the difference-intersection operation again, and so on, until the first collision set has been traversed or the new second candidate set is empty. The device then again selects the first source node set in the first collision set as the first candidate set, takes out the next destination node set in the second collision set as the second candidate set, and performs the difference-intersection operation on them, and so on, until the second collision set is empty. This yields a plurality of target source node sets, which the device takes as a plurality of aggregate nodes; it obtains the communication associations between the aggregate nodes and generates the aggregated directed graph from them.
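The loop of steps S203 to S206 can be sketched with Python sets. This is one reading of the text, not a definitive implementation: the names `difference_intersection` and `aggregate` are hypothetical, "taking out" a destination node set is modelled as iterating over the second collision set, and the sample input uses the collision sets of the Fig. 3 example. As a set identity, the union of the difference and the intersection equals the first candidate set itself; the sketch keeps the two-step form only to mirror the description.

```python
def difference_intersection(l1, l2):
    """Union of (L1 minus L2) and (L1 intersect L2), as described for S204."""
    return (l1 - l2) | (l1 & l2)


def aggregate(first_collision, second_collision):
    """Run S203-S206 and return the target source node sets in order."""
    targets = []
    for destination_set in second_collision:       # S203/S206: next destination set
        l2 = set(destination_set)
        for source_set in first_collision:         # S203/S205: walk the source sets
            l1 = set(source_set)
            targets.append(difference_intersection(l1, l2))  # S204: target set
            l2 = l2 - l1                           # S204: update second candidate
            if not l2:                             # S205: stop once L2 is empty
                break
    return targets


first = [{"Q1"}, {"Q2"}, {"Q3", "Q4"}]
second = [{"Q2", "Q3", "Q4"}, {"Q4", "Q5", "Q6"}, {"Q6", "Q7", "Q8"}]
targets = aggregate(first, second)
```

The returned list contains repeats, because every pass over the first collision set emits the same source node sets again; collecting the distinct sets is left to the step that forms the aggregate nodes.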
For example, the initial directed graph of the application system to be identified is shown in Fig. 3, and the source node sets obtained from Fig. 3 and their corresponding destination node sets are listed in Table 1. From these sets the device generates the first collision set {{Q1}, {Q2}, {Q3, Q4}} and the second collision set {{Q2, Q3, Q4}, {Q4, Q5, Q6}, {Q6, Q7, Q8}}. Below, L1 denotes the first candidate set, L2 denotes the second candidate set, and \ denotes set difference.
The device takes the first source node set {Q1} from the first collision set as L1 and the first destination node set {Q2, Q3, Q4} from the second collision set as L2. It computes (L1 \ L2) ∪ (L1 ∩ L2) = {Q1} as the first target source node set, and (L2 \ L1) = {Q2, Q3, Q4} as the new L2; at this point L2 ≠ ∅. The device then takes the second source node set {Q2} from the first collision set as L1, computes (L1 \ L2) ∪ (L1 ∩ L2) = {Q2} as the second target source node set, and (L2 \ L1) = {Q3, Q4} as the new L2; at this point L2 ≠ ∅. The device then takes the third source node set {Q3, Q4} from the first collision set as L1, computes (L1 \ L2) ∪ (L1 ∩ L2) = {Q3, Q4} as the third target source node set, and (L2 \ L1) = ∅ as the new L2; L2 is now the empty set.
The device then again takes the first source node set {Q1} from the first collision set as L1 and takes the second destination node set {Q4, Q5, Q6} from the second collision set as L2. It computes (L1 \ L2) ∪ (L1 ∩ L2) = {Q1} as the fourth target source node set, and (L2 \ L1) = {Q4, Q5, Q6} as the new L2; at this point L2 ≠ ∅. It takes the second source node set {Q2} as L1, computes (L1 \ L2) ∪ (L1 ∩ L2) = {Q2} as the fifth target source node set, and (L2 \ L1) = {Q4, Q5, Q6} as the new L2; at this point L2 ≠ ∅. It takes the third source node set {Q3, Q4} as L1, computes (L1 \ L2) ∪ (L1 ∩ L2) = {Q3, Q4} as the sixth target source node set, and (L2 \ L1) = {Q5, Q6} as the new L2. Although L2 ≠ ∅ at this point, the first collision set has been fully traversed.
The device therefore takes the first source node set {Q1} from the first collision set as L1 again and selects the third destination node set {Q6, Q7, Q8} from the second collision set as L2, and so on, until the second collision set is empty. Each target source node set obtained is taken as an aggregate node.
Table 1
Source node set | Destination node set
Q1 | Q2, Q3, Q4
Q2 | Q4, Q5, Q6
Q3, Q4 | Q6, Q7, Q8
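The grouping of steps S201 and S202 that produces Table 1 can be sketched as follows. The edge list is an assumption chosen to be consistent with Table 1, since Fig. 3 is not reproduced here, and `group_sources` is a hypothetical name.

```python
def group_sources(graph):
    """S201: group source nodes that share exactly the same destination nodes."""
    groups = {}
    for source, destinations in graph.items():
        if destinations:  # only nodes with outgoing edges act as source nodes
            groups.setdefault(frozenset(destinations), []).append(source)
    # S202: element i of the first collision set pairs with element i of the second.
    first_collision = [set(sources) for sources in groups.values()]
    second_collision = [set(destinations) for destinations in groups]
    return first_collision, second_collision


# Assumed adjacency, consistent with Table 1 (not taken from Fig. 3 itself).
graph = {
    "Q1": {"Q2", "Q3", "Q4"},
    "Q2": {"Q4", "Q5", "Q6"},
    "Q3": {"Q6", "Q7", "Q8"},
    "Q4": {"Q6", "Q7", "Q8"},
    "Q5": set(), "Q6": set(), "Q7": set(), "Q8": set(),
}
first_collision, second_collision = group_sources(graph)
```

Keying the groups by a frozenset of destinations makes two source nodes fall into the same source node set only when their destination sets are identical, which is the condition stated in S201.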
On the basis of the above embodiment, further, obtaining the attribute information of each node comprises:
matching each node against a pre-stored node attribute database; if the node matches the pre-stored node attribute database, looking up the attribute information of the node in the node attribute database; otherwise, obtaining the attribute information of the node according to a pre-established machine learning model.
Specifically, the device matches each node against the pre-stored node attribute database. If a node matches, the device looks up its attribute information in the node attribute database; otherwise, it obtains the attribute information from the pre-established machine learning model. It should be noted that the attribute information of each node includes system attribution information, type attribution information and cluster attribution information. Accordingly, when the system attribution information of a node is obtained from the pre-established machine learning model, a logistic regression algorithm is mainly used: the system attribution information is output from the number of nodes in each known system that have a communication association with the node and from preset weights. When the type attribution information of the node is obtained from the model, a K-Means clustering algorithm may be used to determine it. When the cluster attribution information of the node is obtained from the model, a regression fitting algorithm is mainly used: a regression curve is computed from the number of nodes in each known cluster that have a communication association with the node and from the communication interaction frequency, the curve is matched against the regression curves of the known clusters, and the cluster attribution information of the node is output.
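The lookup-then-fallback logic described above can be sketched as follows. The database contents, the attribute keys and the function names are hypothetical, and `placeholder_model` merely stands in for the pre-established machine learning model; it does not implement logistic regression, K-Means or regression fitting.

```python
# Hypothetical pre-stored node attribute database keyed by ("IP", "port").
ATTRIBUTE_DB = {
    ("IP1", "A1"): {"system": "billing", "type": "web", "cluster": "c1"},
}


def placeholder_model(node):
    """Stand-in for the pre-established model; returns fixed placeholder values."""
    return {"system": "unknown", "type": "unknown", "cluster": "unknown"}


def get_attributes(node, attribute_db, predict):
    """Use the database when the node matches it, otherwise fall back to the model."""
    if node in attribute_db:
        return attribute_db[node]
    return predict(node)


known = get_attributes(("IP1", "A1"), ATTRIBUTE_DB, placeholder_model)
inferred = get_attributes(("IP9", "Z9"), ATTRIBUTE_DB, placeholder_model)
```

Passing the model in as a callable keeps the lookup independent of how the three kinds of attribution information are actually predicted.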
On the basis of the above embodiment, further, verifying the aggregated directed graph according to the attribute information of each node comprises:
determining, for each aggregate node of the aggregated directed graph, whether the source nodes included in the corresponding target source node set have consistent attribute information; if not, separating the source nodes with inconsistent attribute information from the aggregate node to obtain topology nodes;
updating the aggregated directed graph according to the communication associations between the topology nodes, and taking the updated aggregated directed graph as the application topology graph of the application system to be identified.
Specifically, if the device determines that the source nodes included in the target source node set corresponding to an aggregate node of the aggregated directed graph have inconsistent attribute information, it separates the source nodes with inconsistent attribute information from that aggregate node. The aggregate node remaining after the separation and the separated nodes are taken as the topology nodes. The device then updates the aggregated directed graph according to the communication associations between the topology nodes and takes the updated aggregated directed graph as the application topology graph of the application system to be identified.
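The separation step can be sketched as below. The text does not say how the inconsistent nodes are singled out, so this sketch assumes one simple rule: members of an aggregate node are regrouped by identical attribute information, and each resulting group becomes a topology node. The name `verify` and the sample attributes are hypothetical.

```python
def verify(aggregate_nodes, attributes):
    """Split every aggregate node whose members disagree on attribute information."""
    topology_nodes = []
    for members in aggregate_nodes:
        by_attribute = {}
        for node in sorted(members):
            # Assumed rule: nodes with identical attributes stay together.
            key = tuple(sorted(attributes[node].items()))
            by_attribute.setdefault(key, set()).add(node)
        topology_nodes.extend(by_attribute.values())
    return topology_nodes


# Hypothetical attribute information for three source nodes.
attributes = {
    "Q1": {"system": "s1", "type": "web", "cluster": "c1"},
    "Q3": {"system": "s1", "type": "db", "cluster": "c2"},
    "Q4": {"system": "s2", "type": "db", "cluster": "c2"},
}
topology_nodes = verify([{"Q1"}, {"Q3", "Q4"}], attributes)
```

An aggregate node whose members all agree passes through unchanged, which corresponds to the case where the aggregated directed graph is used directly as the application topology graph.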
On the basis of the above embodiment, further, verifying the aggregated directed graph according to the attribute information of each node comprises:
if it is determined that the source nodes included in the target source node set corresponding to each aggregate node of the aggregated directed graph have consistent attribute information, taking the aggregated directed graph as the application topology graph of the application system to be identified.
In the above embodiments, the attribute information of each node includes system attribution information, type attribution information and cluster attribution information.
In the application topology graph identification method provided by this embodiment of the present invention, an initial directed graph is generated from the obtained nodes of the application system to be identified and the communication associations between them, and the initial directed graph is aggregated according to a directed graph aggregation algorithm to obtain an aggregated directed graph; the attribute information of each node is then obtained, and the aggregated directed graph is verified according to that attribute information to obtain the application topology graph of the application system to be identified, which improves the efficiency of identifying the application topology graph.
Fig. 4 is a schematic structural diagram of the application topology graph identification device provided by an embodiment of the present invention. As shown in Fig. 4, this embodiment provides an application topology graph identification device comprising an obtaining unit 401, an aggregation unit 402 and a processing unit 403, wherein:
The obtaining unit 401 is configured to obtain a plurality of nodes of an application system to be identified and the communication associations between the nodes. The aggregation unit 402 is configured to generate an initial directed graph according to the plurality of nodes and the communication associations between the nodes, and to aggregate the initial directed graph according to a directed graph aggregation algorithm to obtain an aggregated directed graph. The processing unit 403 is configured to obtain the attribute information of each node and to verify the aggregated directed graph according to the attribute information of each node to obtain the application topology graph of the application system to be identified.
Optionally, the obtaining unit 401 is specifically configured to select a preset node in the application system to be identified as a starting point and obtain the communication ports of the host of the preset node, and to crawl step by step according to the communication peer of the node corresponding to each communication port, so as to obtain the plurality of nodes of the application system to be identified and the communication associations between the nodes.
Optionally, the aggregation unit 402 is specifically configured to perform the following steps. S201, obtaining, from the initial directed graph, a plurality of source node sets and the destination node set corresponding to each source node set, wherein a source node set is the set of source nodes whose corresponding destination nodes are identical, and the destination node set corresponding to a source node set is the set of destination nodes corresponding to the source nodes included in that source node set. S202, generating a first collision set from the source node sets and a second collision set from the destination node sets, wherein each source node set is one element of the first collision set and the destination node set corresponding to that source node set is one element of the second collision set. S203, selecting the first source node set in the first collision set as a first candidate set, and taking out the first destination node set in the second collision set as a second candidate set. S204, performing a difference-intersection operation on the first candidate set and the second candidate set to obtain one target source node set, and updating the second candidate set to the difference of the second candidate set and the first candidate set. S205, taking the next source node set in the first collision set as the first candidate set and repeating step S204 until the first collision set has been traversed or the second candidate set is empty, and then performing step S206. S206, selecting the first source node set in the first collision set as the first candidate set, taking out the next destination node set in the second collision set as the second candidate set, and returning to step S204, until the second collision set is empty, and then performing step S207. S207, taking the plurality of target source node sets as aggregate nodes, obtaining the communication associations between the aggregate nodes, and generating the aggregated directed graph according to the communication associations between the aggregate nodes.
Optionally, the processing unit 403 is specifically configured to match each node against a pre-stored node attribute database; if the node matches the pre-stored node attribute database, to look up the attribute information of the node in the node attribute database; otherwise, to obtain the attribute information of the node according to a pre-established machine learning model.
Optionally, the processing unit 403 is specifically configured to determine, for each aggregate node of the aggregated directed graph, whether the source nodes included in the corresponding target source node set have consistent attribute information; if not, to separate the source nodes with inconsistent attribute information from the aggregate node to obtain topology nodes; and to update the aggregated directed graph according to the communication associations between the topology nodes and take the updated aggregated directed graph as the application topology graph of the application system to be identified.
Optionally, the processing unit 403 is specifically configured to take the aggregated directed graph as the application topology graph of the application system to be identified if it is determined that the source nodes included in the target source node set corresponding to each aggregate node of the aggregated directed graph have consistent attribute information.
Optionally, the attribute information of each node includes system attribution information, type attribution information and cluster attribution information.
The device embodiment provided by the present invention can be used to carry out the processing flows of the above method embodiments. Its functions are not repeated here; reference may be made to the detailed description of the above method embodiments.
Fig. 5 is a schematic diagram of the physical structure of an electronic device provided by an embodiment of the present invention. As shown in Fig. 5, the electronic device may include a processor 501, a memory 502 and a bus 503, wherein the processor 501 and the memory 502 communicate with each other through the bus 503. The processor 501 can call a computer program in the memory 502 to perform the following method: S1, obtaining a plurality of nodes of an application system to be identified and the communication associations between the nodes; S2, generating an initial directed graph according to the plurality of nodes and the communication associations between the nodes, and aggregating the initial directed graph according to a directed graph aggregation algorithm to obtain an aggregated directed graph; S3, obtaining attribute information of each node, and verifying the aggregated directed graph according to the attribute information of each node to obtain the application topology graph of the application system to be identified.
An embodiment of the present invention discloses a computer program product. The computer program product includes a computer program stored on a non-transitory computer-readable storage medium, and the computer program includes program instructions which, when executed by a computer, enable the computer to perform the methods provided by the above method embodiments, for example: S1, obtaining a plurality of nodes of an application system to be identified and the communication associations between the nodes; S2, generating an initial directed graph according to the plurality of nodes and the communication associations between the nodes, and aggregating the initial directed graph according to a directed graph aggregation algorithm to obtain an aggregated directed graph; S3, obtaining attribute information of each node, and verifying the aggregated directed graph according to the attribute information of each node to obtain the application topology graph of the application system to be identified.
An embodiment of the present invention provides a non-transitory computer-readable storage medium that stores a computer program, the computer program causing a computer to perform the methods provided by the above method embodiments, for example: S1, obtaining a plurality of nodes of an application system to be identified and the communication associations between the nodes; S2, generating an initial directed graph according to the plurality of nodes and the communication associations between the nodes, and aggregating the initial directed graph according to a directed graph aggregation algorithm to obtain an aggregated directed graph; S3, obtaining attribute information of each node, and verifying the aggregated directed graph according to the attribute information of each node to obtain the application topology graph of the application system to be identified.
In addition, when the logic instructions in the memory 502 are implemented as software functional units and sold or used as an independent product, they may be stored in a computer-readable storage medium. On this understanding, the technical solution of the present invention in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to perform all or some of the steps of the methods of the embodiments of the present invention. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.
The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment. A person of ordinary skill in the art can understand and implement this without creative effort.
From the above description of the embodiments, a person skilled in the art can clearly understand that each embodiment can be implemented by software together with a necessary general-purpose hardware platform, or of course by hardware. On this understanding, the above technical solutions in essence, or the part of them that contributes to the prior art, may be embodied in the form of a software product. The computer software product may be stored in a computer-readable storage medium, such as a ROM/RAM, a magnetic disk or an optical disc, and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to perform the methods described in the embodiments or in certain parts of the embodiments.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in those embodiments may still be modified, or some of their technical features may be equivalently replaced, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.