CN110399722A - Virus family generation method, device, server and storage medium - Google Patents

Publication number
CN110399722A
Authority
CN
China
Prior art keywords
node
virus family
classification
relevant
virus
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910127243.4A
Other languages
Chinese (zh)
Other versions
CN110399722B (en)
Inventor
谭昱
程虎
杨耀荣
许天胜
曹有理
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN201910127243.4A priority Critical patent/CN110399722B/en
Publication of CN110399722A publication Critical patent/CN110399722A/en
Application granted granted Critical
Publication of CN110399722B publication Critical patent/CN110399722B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/561 Virus type analysis

Abstract

The application provides a virus family analysis method, device, server and storage medium. At least one seed node of a virus family to be determined is obtained; target nodes related to the at least one seed node are selected from a knowledge graph, yielding a subgraph composed of the seed nodes and the target nodes; the nodes in the subgraph are clustered with a clustering algorithm to obtain at least one node category; and if no virus family related to a node category currently exists, the nodes in that node category are determined to belong to the same new virus family. Virus families are thereby generated automatically without relying on manual analysis. Providing the user with information related to the virus family then makes in-depth analysis of security incidents more convenient and shortens the time such analysis takes.

Description

Virus family generation method, device, server and storage medium
Technical field
The present invention relates to the field of information security technology, and more specifically to a virus family generation method, device, server and storage medium.
Background
At present, companies in the industry rely mainly on two approaches for discovering and responding to virus threats. The first operates at the sample level: an analyst, relying on personal experience, uses sample attributes to analyze and dig deeply into a security incident. Although this approach enables in-depth analysis of security incidents, it depends heavily on human experience and has a long response cycle. The second targets suspicious behaviors that an analyst has discovered manually: monitoring points are deployed, and once an event triggers the suspicious behavior an alert is raised so that analysts can handle it promptly. This approach responds quickly, but it is likewise limited by the analyst, who finds it difficult to analyze the security incident in depth and to uncover the full picture of the attack.
It can be seen that how to facilitate in-depth analysis of security incidents without relying on manual analysis is a problem that needs to be solved.
Summary of the invention
In view of this, to solve the above problem, the present invention provides a virus family analysis method, device, server and storage medium. The technical solution is as follows:
A virus family generation method, comprising:
Obtaining at least one seed node of a virus family to be determined;
Selecting, from a knowledge graph, target nodes related to the at least one seed node, to obtain a subgraph composed of the seed nodes and the target nodes; the knowledge graph stores at least one security-related node, and association relationships exist between nodes of the at least one node;
Clustering the nodes in the subgraph with a pre-trained clustering algorithm to obtain at least one node category;
Determining, according to the nodes in the node category, whether a virus family related to the node category currently exists;
If no virus family related to the node category currently exists, determining that the nodes in the node category belong to the same new virus family.
A virus family generation device, comprising:
A seed node acquiring unit, configured to obtain at least one seed node of a virus family to be determined;
A subgraph generation unit, configured to select, from a knowledge graph, target nodes related to the at least one seed node, to obtain a subgraph composed of the seed nodes and the target nodes; the knowledge graph stores at least one security-related node, and association relationships exist between nodes of the at least one node;
A clustering unit, configured to cluster the nodes in the subgraph with a pre-trained clustering algorithm to obtain at least one node category;
A virus family determination unit, configured to determine, according to the nodes in the node category, whether a virus family related to the node category currently exists;
A virus family generation unit, configured to determine, if no virus family related to the node category currently exists, that the nodes in the node category belong to the same new virus family.
A server, comprising: at least one memory and at least one processor; the memory stores a program, the processor calls the program stored in the memory, and the program is used to implement the virus family generation method.
A storage medium, configured to store a program that implements the virus family generation method.
The application provides a virus family analysis method, device, server and storage medium. At least one seed node of a virus family to be determined is obtained; target nodes related to the at least one seed node are selected from a knowledge graph, yielding a subgraph composed of the seed nodes and the target nodes; the nodes in the subgraph are clustered with a clustering algorithm to obtain at least one node category; and if no virus family related to a node category currently exists, the nodes in that node category are determined to belong to the same new virus family. Virus families are thereby generated automatically without relying on manual analysis. Providing the user with information related to the virus family then makes in-depth analysis of security incidents more convenient and shortens the time such analysis takes.
Brief description of the drawings
In order to explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from the provided drawings without creative effort.
Fig. 1 is a hardware structure block diagram of a server to which the virus family generation method provided by the embodiments of the present application is applicable;
Fig. 2 is a flow chart of a clustering algorithm generation method provided by the embodiments of the present application;
Fig. 3 (a) is a schematic diagram of virus family samples provided by the embodiments of the present application;
Fig. 3 (b) is a schematic diagram of a subgraph provided by the embodiments of the present application;
Fig. 3 (c) is a schematic diagram of a clustering result for the nodes of a subgraph provided by the embodiments of the present application;
Fig. 4 is a flow chart of a virus family generation method provided by the embodiments of the present application;
Fig. 5 is a flow chart of a method for obtaining at least one seed node of a virus family to be determined provided by the embodiments of the present application;
Fig. 6 is a flow chart of a method for determining whether a node category is related to a virus family provided by the embodiments of the present application;
Fig. 7 is a flow chart of another virus family generation method provided by the embodiments of the present application;
Fig. 8 is a structural schematic diagram of a virus family generation device provided by the embodiments of the present application.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments of the present invention. Obviously, the described embodiments are only some of the embodiments of the present invention rather than all of them. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Embodiment:
The virus family generation method provided by the embodiments of the present application generates virus families automatically. By providing analysts with content related to a virus family, it makes in-depth analysis of security incidents more convenient for analysts and shortens the time such analysis takes.
A virus family generation method provided by the embodiments of the present application is described in detail below:
The virus family generation method provided by the embodiments of the present application can be applied to a server. The server can be a service device on the network side that provides services to users; it may be a server cluster composed of multiple servers, or a single server.
Optionally, Fig. 1 shows a hardware structure block diagram of the server. Referring to Fig. 1, the hardware structure of the server may include: a processor 11, a communication interface 12, a memory 13 and a communication bus 14;
In the embodiments of the present invention, there may be at least one each of the processor 11, the communication interface 12, the memory 13 and the communication bus 14, and the processor 11, the communication interface 12 and the memory 13 communicate with one another through the communication bus 14;
The processor 11 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC, Application Specific Integrated Circuit), or one or more integrated circuits configured to implement the embodiments of the present invention, etc.;
The memory 13 may include high-speed RAM, and may also include non-volatile memory, for example at least one disk memory;
The memory stores a program, the processor can call the program stored in the memory, and the program is used to:
Obtain at least one seed node of a virus family to be determined;
Select, from a knowledge graph, target nodes related to the at least one seed node, to obtain a subgraph composed of the seed nodes and the target nodes; the knowledge graph stores at least one security-related node, and association relationships exist between nodes of the at least one node;
Cluster the nodes in the subgraph with a pre-trained clustering algorithm to obtain at least one node category;
Determine, according to the nodes in the node category, whether a virus family related to the node category currently exists;
If no virus family related to the node category currently exists, determine that the nodes in the node category belong to the same new virus family.
Optionally, for the refined and extended functions of the program, refer to the description below.
To make the virus family generation method applicable to the above server easier to understand, a virus family generation method provided by the embodiments of the present application is now described in detail.
In terms of implementation, the virus family generation method provided by the embodiments of the present application is mainly divided into two processes: one trains a to-be-trained clustering algorithm to generate the clustering algorithm; the other generates and updates virus families based on the pre-trained clustering algorithm.
Further, in terms of implementation, the virus family generation method provided by the embodiments of the present application also includes: a process of correcting the pre-trained clustering algorithm, and a process of providing the user with information related to a virus family (for example, a process of responding to an information viewing request for a virus family by displaying the information in the virus family that relates to the information category indicated by the request).
The process of training the to-be-trained clustering algorithm to generate the clustering algorithm is described in detail first. Fig. 2 is a flow chart of a clustering algorithm generation method provided by the embodiments of the present application.
As shown in Fig. 2, the method includes:
S201, obtaining multiple virus family samples, each virus family sample including at least one node sample;
In the embodiments of the present application, a virus family sample is a manually labeled virus family, generated by manually analyzing node samples. For example, a user finds through analysis that node sample 1, node sample 2 and node sample 3 belong to the same node category 1, that node sample 4 and node sample 5 belong to the same node category 2, and that node sample 6, node sample 7 and node sample 8 belong to the same node category 3. Node category 1 can then be labeled as one virus family sample (containing node sample 1, node sample 2 and node sample 3), node category 2 as one virus family sample (containing node sample 4 and node sample 5), and node category 3 as one virus family sample (containing node sample 6, node sample 7 and node sample 8).
In addition, a virus family library can be set up in the embodiments of the present application. The library is initially empty; once virus family samples exist, they can be stored in the library. Accordingly, the embodiments of the present application can subsequently update the virus families in the library, or store newly generated virus families in it.
S202, selecting, from the knowledge graph, target nodes related to each node sample, to obtain a subgraph composed of the target nodes and the node samples;
The knowledge graph stores at least one security-related node, and association relationships exist between nodes of the at least one node. The association relationships can be set by giving each node in the knowledge graph a node attribute, where a node's node attribute characterizes each node that has an association relationship with that node.
In the embodiments of the present application, a node can be a file, a domain name, an IP address, and so on. An association relationship between nodes can be embodied by a parent-child relationship between files, an access relationship between a file and a domain name, an access relationship between a file and an IP address, and so on; as long as a parent-child relationship or an access relationship exists between two nodes, the two nodes can be considered to have an association relationship.
The above is only a preferred way, provided by the embodiments of the present application, in which association relationships exist between nodes. The specific content of the association relationship between two nodes can be set by the inventor as required and is not limited here.
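As a rough illustration of the node attributes described above, the following Python sketch derives, for each node, the set of nodes it is associated with from a list of relation records. The record format, the relation names `parent_of` and `accesses`, the example node identifiers, and the choice to treat associations as symmetric are all assumptions made for illustration; the text does not specify how the knowledge graph is stored.

```python
from collections import defaultdict

# Hypothetical relation records: (source node, relation type, target node).
RELATIONS = [
    ("file:a.exe", "parent_of", "file:b.dll"),          # parent-child between files
    ("file:a.exe", "accesses", "domain:example.test"),  # file accesses a domain name
    ("file:b.dll", "accesses", "ip:192.0.2.10"),        # file accesses an IP address
]

def build_node_attributes(relations):
    """Return, for every node, the set of nodes it is associated with."""
    attributes = defaultdict(set)
    for source, _relation, target in relations:
        # Any parent-child or access relation counts as an association.
        # Recording it in both directions is an assumption of this sketch.
        attributes[source].add(target)
        attributes[target].add(source)
    return dict(attributes)

NODE_ATTRIBUTES = build_node_attributes(RELATIONS)
```

Here `NODE_ATTRIBUTES["file:a.exe"]` holds the two nodes that the file is related to, which is the information a node attribute is said to characterize.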
In the embodiments of the present application, raw data can be monitored and captured, and the captured raw data is stored in the knowledge graph and the perception system. The raw data includes nodes, and each node in the raw data carries a node attribute; after a node is captured, it can be stored in the knowledge graph and the perception system, thereby expanding both.
The perception system is mainly used to perceive candidate nodes, which are then used in the process of generating and updating virus families based on the pre-trained clustering algorithm. For details of the perception system, see the description of the virus family generation process below.
Taking the case where the obtained virus family samples are virus family sample 1 and virus family sample 2 as an example, each node that has an association relationship with node sample 1, node sample 2, node sample 3, node sample 4, node sample 5, node sample 6, node sample 7 or node sample 8 is obtained from the knowledge graph, and each obtained node is called a target node. A topological expansion can be performed in the knowledge graph according to the node attribute of each node sample to determine the target nodes in the knowledge graph that are associated with each node sample.
For example, suppose the node attribute carried by node sample 1 characterizes that node sample 1 has an association relationship with node 5. The target nodes associated with node sample 1 are then obtained from the knowledge graph as follows. Based on the node attribute of node sample 1, the node in the knowledge graph associated with node sample 1 (node 5) is determined to be a target node. The node attribute carried by node 5 characterizes that node 5 is associated with node 3 and with node 6, so node 3 and node 6 are each taken as a target node. The knowledge graph is then used to determine the nodes associated with node 3, as characterized by the node attribute of node 3, and the nodes associated with node 6, as characterized by the node attribute of node 6. If the node attribute of node 3 characterizes no node associated with node 3, the analysis of node 3 stops. If the node attribute of node 6 characterizes that node 4 is associated with node 6, node 4 is taken as a target node. The knowledge graph is then used to determine the nodes associated with node 4, as characterized by the node attribute of node 4; if there are none, the analysis of node 4 stops. The target nodes obtained in this way are the target nodes associated with node sample 1.
In the embodiments of the present application, the target nodes and the node samples are used as graph information, and the subgraph composed of the target nodes and the node samples can be constructed from this graph information.
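The expansion walked through above can be sketched as a breadth-first traversal over node attributes, followed by collecting the edges among the visited nodes. This is a minimal sketch under stated assumptions: the text does not say whether the traversal is breadth-first or depth-first, nor how cycles are handled, so the traversal order and the `seen` set are choices of this example. The function names are hypothetical; the node names follow the worked example.

```python
from collections import deque

# Node attributes from the worked example: each node lists its associated nodes.
NODE_ATTRIBUTES = {
    "node sample 1": ["node 5"],
    "node 5": ["node 3", "node 6"],
    "node 3": [],
    "node 6": ["node 4"],
    "node 4": [],
}

def expand_targets(start_nodes, node_attributes):
    """Breadth-first expansion; returns target nodes in discovery order."""
    seen = set(start_nodes)
    queue = deque(start_nodes)
    targets = []
    while queue:
        current = queue.popleft()
        for neighbor in node_attributes.get(current, []):
            if neighbor not in seen:  # guard against cycles (an assumption)
                seen.add(neighbor)
                targets.append(neighbor)
                queue.append(neighbor)
    return targets

def build_subgraph(start_nodes, node_attributes):
    """Subgraph = start nodes plus target nodes, with the edges among them."""
    nodes = list(start_nodes) + expand_targets(start_nodes, node_attributes)
    node_set = set(nodes)
    edges = [
        (a, b)
        for a in nodes
        for b in node_attributes.get(a, [])
        if b in node_set
    ]
    return nodes, edges

targets = expand_targets(["node sample 1"], NODE_ATTRIBUTES)
nodes, edges = build_subgraph(["node sample 1"], NODE_ATTRIBUTES)
```

With the example data, the traversal stops at node 3 and node 4 because their attributes list no further associated nodes, matching the stopping rule in the text.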
S203, clustering the nodes in the subgraph with the to-be-trained clustering algorithm to obtain at least one predicted node category;
In the embodiments of the present application, a to-be-trained clustering algorithm is set up in advance. After the subgraph is input into the to-be-trained clustering algorithm, the algorithm clusters the nodes in the subgraph to obtain at least one node category. For ease of distinction, each such node category is provisionally called a predicted node category, and each predicted node category includes at least one node of the subgraph.
S204, adjusting the to-be-trained clustering algorithm in reverse based on the at least one predicted node category, so that the way the at least one predicted node category classifies the node samples approaches the way the multiple virus family samples classify the node samples, thereby generating the clustering algorithm.
In the embodiments of the present application, after the at least one predicted node category is obtained, the parameters of the to-be-trained clustering algorithm are adjusted in reverse based on the at least one predicted node category, so that the way the at least one predicted node category classifies the node samples approaches the way the multiple virus family samples classify the node samples. The clustering algorithm is then generated by training the to-be-trained clustering algorithm multiple times in this manner.
To make the clustering algorithm generation method provided by the embodiments of the present application easier to understand, an application scenario of the method is now given.
Referring to Fig. 3 (a), three virus family samples are obtained for one round of training the to-be-trained clustering algorithm: virus family sample 1, virus family sample 2 and virus family sample 3. Virus family sample 1 includes three node samples, namely node 1, node 2 and node 3; virus family sample 2 includes two node samples, namely node 4 and node 5; virus family sample 3 includes three node samples, namely node 6, node 7 and node 8.
The node samples in these three virus family samples are expanded according to the association relationships between nodes in the knowledge graph, yielding a subgraph; see Fig. 3 (b). The clustering result obtained after inputting the subgraph into the to-be-trained clustering algorithm is shown in Fig. 3 (c). The clustering result shown in Fig. 3 (c) includes two predicted node categories, and each part circled by a dotted line in Fig. 3 (c) represents one predicted node category. Node 6 could be regarded on its own as a predicted node category, and so could node 7; the virus family generation method provided by the embodiments of the present application simply does not consider this case, that is, the predicted node categories corresponding to node 6 and to node 7 are ignored.
It can be seen that the clustering result of the to-be-trained clustering algorithm for the subgraph is currently inaccurate. An accurate result would place node 1, node 2 and node 3 in the same predicted node category, node 4 and node 5 in the same predicted node category, and node 6, node 7 and node 8 in the same predicted node category. On this basis, the parameters of the to-be-trained clustering algorithm are adjusted in reverse based on the clustering result, so that the way the at least one predicted node category produced by the algorithm for the subgraph classifies nodes 1 to 8 approaches the way the multiple virus family samples classify nodes 1 to 8.
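The text does not say how the closeness between the predicted node categories and the labeled virus family samples is measured, nor how the parameters are adjusted. One common stand-in, used here purely as an assumption, is a pair-counting agreement score (the Rand index): the fraction of node pairs that both groupings treat the same way, either together or apart. In the sketch below the labeled grouping follows Fig. 3 (a) as described in the text, while the predicted grouping is hypothetical and is not read from Fig. 3 (c).

```python
from itertools import combinations

def pair_agreement(labeled, predicted):
    """Fraction of node pairs on which two groupings agree (the Rand index)."""
    nodes = sorted(labeled)
    pairs = list(combinations(nodes, 2))
    agree = sum(
        (labeled[a] == labeled[b]) == (predicted[a] == predicted[b])
        for a, b in pairs
    )
    return agree / len(pairs)

# Labeled virus family samples: nodes 1-3, nodes 4-5, nodes 6-8.
LABELED = {1: "s1", 2: "s1", 3: "s1", 4: "s2", 5: "s2", 6: "s3", 7: "s3", 8: "s3"}

# Hypothetical prediction with two multi-node categories and two singletons.
PREDICTED = {1: "a", 2: "a", 3: "a", 4: "a", 5: "b", 8: "b", 6: "c", 7: "d"}

perfect = pair_agreement(LABELED, LABELED)
current = pair_agreement(LABELED, PREDICTED)
```

A training loop could, for instance, keep whichever parameter setting raises this score, so that repeated rounds move the predicted grouping toward the labeled one; that loop is likewise an assumption rather than something the text specifies.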
In the embodiments of the present application, the clustering algorithm can be pre-trained through the above clustering algorithm generation process, so that the process of generating and updating virus families can be executed based on the pre-trained clustering algorithm.
To make the virus family generation method provided by the embodiments of the present application easier to understand, the process of generating and updating virus families based on the pre-trained clustering algorithm is now described in detail.
Fig. 4 is a flow chart of a virus family generation method provided by the embodiments of the present application.
As shown in Fig. 4, the method includes:
S401, obtaining at least one seed node of a virus family to be determined;
In the embodiments of the present application, for the method of obtaining at least one seed node of a virus family to be determined, refer to Fig. 5.
As shown in Fig. 5, the method includes:
S501, obtaining multiple candidate nodes perceived by the perception system;
In the embodiments of the present application, the perception system performs security monitoring based on the raw data it stores, perceives nodes that may pose a security risk, and takes each such monitored node as a candidate node.
S502, for each candidate node, determining the score of the candidate node according to its attribute value in each preset dimension;
In the embodiments of the present application, multiple dimensions are set in advance, and a weight is set in advance for each dimension. A dimension can be range, virus type, and so on. For example, the weight of the "range" dimension is preset to 1, and the weight of the "virus type" dimension is preset to 2.
Taking preset dimensions that include "range" and "virus type" as an example: for a candidate node, if its attribute in "range" is "11000" and its attribute in "virus type" is "Trojan virus", its attribute value in "range" can be determined to be "2" and its attribute value in "virus type" can be determined to be "3".
In the embodiments of the present application, the attribute value of a candidate node in a dimension can be determined from the candidate node's attribute in that dimension. For example, for a given dimension, the correspondence between attributes and attribute values under that dimension is set in advance; after the candidate node's attribute in the dimension is determined, its attribute value in the dimension is obtained by looking up the attribute value that corresponds to that attribute under the dimension.
For example, for the "range" dimension, an attribute in the range "0~1000" is set to correspond to attribute value 1, and an attribute in the range "1001~12000" corresponds to attribute value 2. For the "virus type" dimension, the attribute "rogue-software-type virus" is set to correspond to attribute value 1, the attribute "Trojan virus" corresponds to attribute value 3, and the attribute "persistent-attack-type virus" corresponds to attribute value 5.
Then, for a candidate node whose attribute value in "range" is determined to be "2" and whose attribute value in "virus type" is determined to be "3", with the weight of the "range" dimension preset to 1 and the weight of the "virus type" dimension preset to 2, the score of the candidate node = 1*2 + 2*3 = 8.
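The scoring rule just described is a weighted sum of per-dimension attribute values, and can be sketched as follows. The weights, ranges and attribute values are the ones given in the example; the dictionary layout, the function names, the English renderings of the dimension and virus-type labels, and the error raised for a range the text does not cover are assumptions of this sketch.

```python
# Preset weight of each dimension, as in the example.
WEIGHTS = {"range": 1, "virus type": 2}

# Preset attribute -> attribute value mapping for the "virus type" dimension.
VIRUS_TYPE_VALUES = {
    "rogue-software-type virus": 1,
    "Trojan virus": 3,
    "persistent-attack-type virus": 5,
}

def range_value(raw):
    """Map a raw "range" attribute to its attribute value."""
    if 0 <= raw <= 1000:
        return 1
    if 1001 <= raw <= 12000:
        return 2
    # The text gives no mapping beyond 12000, so this sketch refuses to guess.
    raise ValueError(f"no attribute value defined for range {raw}")

def attribute_values(candidate):
    """Look up the candidate's attribute value in every preset dimension."""
    return {
        "range": range_value(candidate["range"]),
        "virus type": VIRUS_TYPE_VALUES[candidate["virus type"]],
    }

def score(candidate):
    """Score = sum over dimensions of weight * attribute value."""
    values = attribute_values(candidate)
    return sum(WEIGHTS[dim] * values[dim] for dim in WEIGHTS)

candidate = {"range": 11000, "virus type": "Trojan virus"}
candidate_score = score(candidate)  # 1*2 + 2*3
```

For the candidate in the example, the attribute values are 2 and 3, so the score works out to 8 as in the text.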
S503, selecting at least one top-scoring candidate node from the multiple candidate nodes, each of the selected candidate nodes being a seed node.
In the embodiments of the present application, the number of seed nodes can be set in advance; that number of candidate nodes is then selected from the multiple candidate nodes in descending order of score, and each selected candidate node is taken as a seed node.
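Selecting a preset number of seed nodes in descending order of score can be sketched as below. The candidate names and scores are hypothetical, and the text does not say how ties are broken; this sketch relies on Python's stable sort, so tied candidates keep their original order.

```python
def select_seeds(scores, seed_count):
    """Return the `seed_count` highest-scoring candidates, best first."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:seed_count]

# Hypothetical candidate scores, e.g. produced by a weighted-sum scoring step.
CANDIDATE_SCORES = {"candidate 1": 8, "candidate 2": 3, "candidate 3": 11, "candidate 4": 5}

seeds = select_seeds(CANDIDATE_SCORES, seed_count=2)
```

With these scores and a preset count of 2, the seed nodes are candidate 3 and candidate 1.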
S402, selecting, from the knowledge graph, target nodes related to the at least one seed node, to obtain a subgraph composed of the seed nodes and the target nodes; the knowledge graph stores at least one security-related node, and association relationships exist between nodes of the at least one node;
S403, clustering the nodes in the subgraph with the pre-trained clustering algorithm to obtain at least one node category;
S404, determining, according to the nodes in the node category, whether a virus family related to the node category currently exists; if no virus family related to the node category currently exists, executing step S405; if a virus family related to the node category currently exists, executing step S406;
In the embodiments of the present application, this can be done by determining whether the virus family library contains a virus family related to the node category.
The embodiments of the present application provide a flow chart of a method for determining whether a node category is related to a virus family; see Fig. 6.
As shown in Fig. 6, the method includes:
S601, obtaining a first quantity, namely the number of nodes in the node category that belong to the virus family;
S602, obtaining a second quantity, namely the number of all nodes in the node category;
S603, judging whether the first quantity and the second quantity satisfy a preset condition; if the first quantity and the second quantity satisfy the preset condition, executing step S604;
In the embodiments of the present application, if the result of dividing the first quantity by the second quantity is greater than a first threshold, the first quantity and the second quantity are considered to satisfy the preset condition; otherwise, they are considered not to satisfy it.
S604, determining that the node category is related to the virus family.
In the embodiments of the present application, if it is determined that the first quantity and the second quantity do not satisfy the preset condition, it can further be determined that the node category is unrelated to the virus family.
The above is only a preferred way, provided by the embodiments of the present application, of determining whether a node category is related to a virus family. The specific way of making this determination can be set by the inventor as required, for example by determining that the node category is related to the virus family when the first quantity satisfies a second threshold; it is not limited here.
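The ratio test of steps S601 to S604 can be sketched as follows. The function name and the default threshold of 0.5 are hypothetical, since the text does not give a value for the first threshold; treating an empty node category as unrelated is also an assumption, made only to avoid dividing by zero.

```python
def is_related(node_category, virus_family, first_threshold=0.5):
    """Return True when the node category is related to the virus family.

    first_threshold is a hypothetical value; the text leaves it unspecified.
    """
    if not node_category:
        return False  # assumption: an empty category is related to nothing
    first_quantity = len(node_category & virus_family)  # S601
    second_quantity = len(node_category)                # S602
    return first_quantity / second_quantity > first_threshold  # S603

category = {"node 1", "node 2", "node 3"}
related = is_related(category, {"node 1", "node 3"})  # ratio 2/3
unrelated = is_related(category, {"node 7"})          # ratio 0
```

With two of the three nodes in the category already belonging to the family, the ratio 2/3 exceeds the assumed threshold and the category is judged related.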
S405, determine that the nodes in the node category belong to the same new virus family.
In the embodiment of the present application, for a node category, if no virus family related to the node category exists in the virus family library, the node category is determined to be a new virus family, and the new virus family is added to the virus family library.
S406, add each node in the node category that is not currently included in the virus family to the virus family.
In the embodiment of the present application, for a node category, if a virus family associated with the node category exists in the virus family library, each node in the node category that is not included in that virus family may be determined, and each determined node is added to the virus family, thereby updating the virus family.
For example, in the virus family library, the virus family associated with node category 1 is virus family 1; node category 1 includes node 1, node 2, and node 3, while virus family 1 includes node 1 and node 3. Node 2 is therefore added to virus family 1, thereby updating virus family 1 in the virus family library.
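The branch between steps S405 and S406 can be sketched as follows. This is an illustrative sketch only: the virus family library is modeled as a dictionary from family name to a set of node identifiers, and `assign_category`, `new_family_name`, and the 0.5 threshold are hypothetical names and values not taken from the application.

```python
def assign_category(category_nodes, family_library, new_family_name, first_threshold=0.5):
    """Merge a node category into a related family, or register it as a new family.

    family_library maps a family name to the set of nodes in that family
    (an assumed representation). Returns the name of the affected family.
    """
    category = set(category_nodes)
    for name, members in family_library.items():
        # Relatedness test in the style of S601 to S604 (ratio above a threshold).
        if category and len(category & members) / len(category) > first_threshold:
            # S406: add the nodes that the family does not yet contain.
            members |= category
            return name
    # S405: no related family exists, so the category becomes a new family.
    family_library[new_family_name] = set(category)
    return new_family_name
```

Applied to the example above, a library holding virus family 1 with node 1 and node 3 would gain node 2 when node category 1 is processed, while a category with no overlap would be stored under the new family name.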
See Fig. 7, which is a flowchart of another virus family generation method provided by the embodiment of the present application.
As shown in Fig. 7, the method includes:
S701, obtain at least one seed node of a virus family to be determined;
S702, filter out, from a knowledge graph, target nodes related to the at least one seed node, to obtain a subgraph composed of the seed nodes and the target nodes; the knowledge graph stores at least one security-related node, and association relationships exist between nodes among the at least one node;
S703, cluster the nodes in the subgraph using a pre-trained clustering algorithm to obtain at least one node category;
S704, perform secondary confirmation cleaning on the at least one node category to obtain at least one target node category;
In the embodiment of the present application, to further ensure the accuracy of the clustering result, secondary confirmation cleaning may be performed on the clustering result of the clustering algorithm based on an automated experience evaluator, to obtain a final clustering result; the final clustering result includes at least one target node category.
Further, after secondary confirmation cleaning is performed on the clustering result to obtain the at least one target node category, the clustering algorithm may also be corrected based on the at least one target node category and the at least one node category included in the clustering result, so that the corrected clustering algorithm clusters the subgraph more accurately.
S705, determine, according to the nodes in the target node category, whether a virus family related to the target node category currently exists; if no virus family related to the target node category currently exists, perform step S705; if a virus family related to the target node category currently exists, perform step S706;
S705, determine that the nodes in the target node category belong to the same new virus family;
S706, add each node in the target node category that is not currently included in the virus family to the virus family.
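The subgraph construction of step S702 can be illustrated with a short Python sketch. The application does not state how "related" target nodes are selected, so the sketch assumes one possible criterion: a node is related if it is reachable from a seed node within a fixed number of hops. The adjacency-dictionary representation, the function name `extract_subgraph`, and the parameter `max_hops` are all assumptions for illustration.

```python
from collections import deque


def extract_subgraph(knowledge_graph, seed_nodes, max_hops=2):
    """Return the subgraph induced by the seeds and the nodes within max_hops of them.

    knowledge_graph maps a node to an iterable of its associated nodes.
    The hop limit is an assumed relatedness criterion, not the patented one.
    """
    distance = {seed: 0 for seed in seed_nodes}
    queue = deque(distance)
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in knowledge_graph.get(node, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    kept = set(distance)
    # Keep only the associations whose two endpoints are both in the subgraph.
    return {
        node: sorted(n for n in knowledge_graph.get(node, ()) if n in kept)
        for node in kept
    }
```

The resulting dictionary holds the seed nodes and the target nodes together with the associations among them, which is the input that step S703 would cluster.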
Further, the virus family generation method provided by the embodiment of the present application can generate and update virus families, and the present application can therefore also provide users with a virus family information viewing function, including: receiving an information viewing request for a virus family, the information viewing request indicating an information category; and displaying the information in the virus family that is related to the information category.
In the embodiment of the present application, the information category indicated by the information viewing request may be a virus family category, a virus propagation source category, an accessed malicious domain name category, and so on.
When the information category indicated by the information viewing request is the virus family category, a topology graph of the virus family the user requested to view is displayed to the user. The topology graph of the virus family is constructed from the association relationships between the nodes that make up the virus family.
When the information category indicated by the information viewing request is the virus propagation source category, the virus propagation source of the virus family may be determined according to the association relationships between the nodes in the virus family, and the virus propagation source is displayed.
When the information category indicated by the information viewing request is the accessed malicious domain name category, each node that is a domain name may be obtained from the virus family, and the obtained nodes are displayed. For example, the virus family includes three nodes, namely node 1, node 2, and node 3, where node 1 is a file, node 2 is a domain name, and node 3 is an IP address; node 2 is then taken as the node obtained from the virus family, and node 2 is displayed.
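The malicious domain name case can be sketched in Python as below. The representation of a virus family as a dictionary from node name to node type, the category string `"malicious_domains"`, and the function name `view_family_info` are hypothetical; only the domain name category is covered, and other categories are left unimplemented in this sketch.

```python
def view_family_info(family_nodes, info_category):
    """Return the information in a virus family that matches the requested category.

    family_nodes maps a node name to its type ("file", "domain", "ip", ...),
    which is an assumed representation for illustration.
    """
    if info_category == "malicious_domains":
        # Keep only the nodes whose type is a domain name.
        return sorted(
            name for name, node_type in family_nodes.items() if node_type == "domain"
        )
    raise ValueError(f"unsupported information category: {info_category}")
```

For the three-node example above, a request for the accessed malicious domain name category would return only node 2.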
Fig. 8 is a schematic structural diagram of a virus family generation device provided by the embodiment of the present application.
As shown in Fig. 8, the device includes:
a seed node acquiring unit 81, configured to obtain at least one seed node of a virus family to be determined;
a first subgraph generation unit 82, configured to filter out, from a knowledge graph, target nodes related to the at least one seed node, to obtain a subgraph composed of the seed nodes and the target nodes; the knowledge graph stores at least one security-related node, and association relationships exist between nodes among the at least one node;
a first clustering unit 83, configured to cluster the nodes in the subgraph using a pre-trained clustering algorithm to obtain at least one node category;
a virus family determination unit 84, configured to determine, according to the nodes in the node category, whether a virus family related to the node category currently exists;
a virus family generation unit 85, configured to, if no virus family related to the node category currently exists, determine that the nodes in the node category belong to the same new virus family.
Further, the virus family generation device provided by the embodiment of the present application also includes a virus family updating unit, configured to, if a virus family related to the node category currently exists, add each node in the node category that is not currently included in the virus family to the virus family.
In the embodiment of the present application, the seed node acquiring unit includes:
a candidate node acquiring unit, configured to obtain multiple candidate nodes perceived by a perception system;
an analysis and calculation unit, configured to determine, for each candidate node, a score of the candidate node according to the attribute values of the candidate node in each preset dimension;
a seed node selection unit, configured to select, from the multiple candidate nodes, at least one candidate node with the highest scores, each of the at least one candidate node being a seed node.
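The scoring and selection performed by the seed node acquiring unit can be sketched as follows. The application only says that a score is derived from attribute values in preset dimensions, so the weighted sum used here, the dimension names in the usage note, and the names `select_seed_nodes`, `weights`, and `top_k` are assumptions for illustration.

```python
def select_seed_nodes(candidates, weights, top_k=1):
    """Pick the top_k highest-scoring candidate nodes as seed nodes.

    candidates maps a node name to its attribute values per preset dimension;
    weights maps each preset dimension to a weight. A weighted sum is one
    possible scoring rule, assumed here for illustration.
    """
    def score(attributes):
        return sum(weights[dim] * attributes.get(dim, 0.0) for dim in weights)

    ranked = sorted(candidates, key=lambda name: score(candidates[name]), reverse=True)
    return ranked[:top_k]
```

For instance, with hypothetical dimensions `prevalence` and `severity`, a candidate that scores high on the more heavily weighted dimension would be ranked ahead of the others and chosen as a seed node.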
Further, the virus family generation device provided by the embodiment of the present application also includes a secondary confirmation cleaning unit, configured to, before it is determined according to the nodes in the node category whether a virus family related to the node category currently exists, perform secondary confirmation cleaning on the at least one node category to obtain at least one target node category.
Correspondingly, the virus family determination unit is specifically configured to determine, according to each node in the target node category, whether a virus family related to the target node category currently exists.
In the embodiment of the present application, the virus family determination unit includes:
a first quantity acquiring unit, configured to obtain a first quantity, namely the number of nodes in the node category that belong to the virus family;
a second quantity acquiring unit, configured to obtain a second quantity, namely the number of all nodes in the node category;
a virus family determination subunit, configured to determine that the node category is related to the virus family if the first quantity and the second quantity meet a preset condition.
Further, the virus family generation device provided by the embodiment of the present application also includes a virus family information display unit, configured to receive an information viewing request for a virus family, the information viewing request indicating an information category, and to display the information in the virus family that is related to the information category.
Further, the virus family generation device provided by the embodiment of the present application also includes a clustering algorithm training unit, including:
a virus family sample acquiring unit, configured to obtain multiple virus family samples, each virus family sample including at least one node sample;
a second subgraph generation unit, configured to filter out, from the knowledge graph, target nodes related to each node sample, to obtain a subgraph composed of the target nodes and the node samples;
a second clustering unit, configured to cluster the nodes in the subgraph using a clustering algorithm to be trained, to obtain at least one predicted node category;
a clustering algorithm generation subunit, configured to reversely adjust the clustering algorithm to be trained based on the at least one predicted node category, so that the classification of the node samples by the at least one predicted node category approaches the classification of the node samples by the multiple virus family samples, thereby generating the clustering algorithm.
Further, the embodiment of the present application also provides a storage medium that may store a program suitable for execution by a processor, the program being used to:
obtain at least one seed node of a virus family to be determined;
filter out, from a knowledge graph, target nodes related to the at least one seed node, to obtain a subgraph composed of the seed nodes and the target nodes; the knowledge graph stores at least one security-related node, and association relationships exist between nodes among the at least one node;
cluster the nodes in the subgraph using a pre-trained clustering algorithm to obtain at least one node category;
determine, according to the nodes in the node category, whether a virus family related to the node category currently exists;
if no virus family related to the node category currently exists, determine that the nodes in the node category belong to the same new virus family.
Optionally, for the refined and extended functions of the program, refer to the description above.
The present application provides a virus family generation method, device, server, and storage medium. At least one seed node of a virus family to be determined is obtained; target nodes related to the at least one seed node are filtered out from a knowledge graph to obtain a subgraph composed of the seed nodes and the target nodes; the nodes in the subgraph are clustered using a clustering algorithm to obtain at least one node category; and, if no virus family related to the node category currently exists, the nodes in the node category are determined to belong to the same new virus family. In this way, virus families are generated automatically without relying on manual analysis. Providing users with the relevant information of a virus family in turn makes it more convenient for users to perform in-depth analysis of security incidents and shortens the time such analysis takes.
The virus family generation method, device, server, and storage medium provided by the present invention have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present invention, and the description of the above embodiments is intended only to help in understanding the method of the present invention and its core idea. At the same time, those of ordinary skill in the art may, following the idea of the present invention, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.
It should be noted that the embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for the same or similar parts the embodiments may be referred to one another. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for relevant details, refer to the description of the method.
It should also be noted that, herein, relational terms such as "first" and "second" are used merely to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the statement "including a ..." does not exclude the existence of other identical elements in the process, method, article, or device that includes the element.
The foregoing description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A virus family generation method, characterized by comprising:
obtaining at least one seed node of a virus family to be determined;
filtering out, from a knowledge graph, target nodes related to the at least one seed node, to obtain a subgraph composed of the seed nodes and the target nodes; wherein the knowledge graph stores at least one security-related node, and association relationships exist between nodes among the at least one node;
clustering the nodes in the subgraph using a pre-trained clustering algorithm to obtain at least one node category;
determining, according to the nodes in the node category, whether a virus family related to the node category currently exists;
if no virus family related to the node category currently exists, determining that the nodes in the node category belong to the same new virus family.
2. The method according to claim 1, characterized in that, if a virus family related to the node category currently exists, the method further comprises:
adding each node in the node category that is not currently included in the virus family to the virus family.
3. The method according to claim 1, characterized in that obtaining at least one seed node of a virus family to be determined comprises:
obtaining multiple candidate nodes perceived by a perception system;
for each candidate node, determining a score of the candidate node according to the attribute values of the candidate node in each preset dimension;
selecting, from the multiple candidate nodes, at least one candidate node with the highest scores, each of the at least one candidate node being a seed node.
4. The method according to claim 3, characterized in that, before determining, according to the nodes in the node category, whether a virus family related to the node category currently exists, the method further comprises:
performing secondary confirmation cleaning on the at least one node category to obtain at least one target node category;
wherein determining, according to the nodes in the node category, whether a virus family related to the node category currently exists comprises: determining, according to each node in the target node category, whether a virus family related to the target node category currently exists.
5. The method according to claim 1, characterized in that the manner of determining whether the node category is related to a virus family comprises:
obtaining a first quantity, namely the number of nodes in the node category that belong to the virus family;
obtaining a second quantity, namely the number of all nodes in the node category;
if the first quantity and the second quantity meet a preset condition, determining that the node category is related to the virus family.
6. The method according to claim 1, characterized by further comprising:
receiving an information viewing request for the virus family, the information viewing request indicating an information category;
displaying the information in the virus family that is related to the information category.
7. The method according to claim 1, characterized by further comprising a clustering algorithm training process, the process comprising:
obtaining multiple virus family samples, each virus family sample including at least one node sample;
filtering out, from the knowledge graph, target nodes related to each node sample, to obtain a subgraph composed of the target nodes and the node samples;
clustering the nodes in the subgraph using a clustering algorithm to be trained, to obtain at least one predicted node category;
reversely adjusting the clustering algorithm to be trained based on the at least one predicted node category, so that the classification of the node samples by the at least one predicted node category approaches the classification of the node samples by the multiple virus family samples, thereby generating the clustering algorithm.
8. A virus family generation device, characterized by comprising:
a seed node acquiring unit, configured to obtain at least one seed node of a virus family to be determined;
a subgraph generation unit, configured to filter out, from a knowledge graph, target nodes related to the at least one seed node, to obtain a subgraph composed of the seed nodes and the target nodes; wherein the knowledge graph stores at least one security-related node, and association relationships exist between nodes among the at least one node;
a clustering unit, configured to cluster the nodes in the subgraph using a pre-trained clustering algorithm to obtain at least one node category;
a virus family determination unit, configured to determine, according to the nodes in the node category, whether a virus family related to the node category currently exists;
a virus family generation unit, configured to, if no virus family related to the node category currently exists, determine that the nodes in the node category belong to the same new virus family.
9. A server, characterized by comprising at least one memory and at least one processor; wherein the memory stores a program, the processor calls the program stored in the memory, and the program is used to implement the virus family generation method according to any one of claims 1 to 7.
10. A storage medium, characterized in that it is used to store a program implementing the virus family generation method according to any one of claims 1 to 7.
CN201910127243.4A 2019-02-20 2019-02-20 Virus family generation method, device, server and storage medium Active CN110399722B (en)

Publications (2)

Publication Number Publication Date
CN110399722A true CN110399722A (en) 2019-11-01
CN110399722B CN110399722B (en) 2024-03-26

GR01 Patent grant