CN113656535B - Abnormal session detection method and device and computer storage medium - Google Patents
- Publication number
- CN113656535B (CN202111008418.3A)
- Authority
- CN
- China
- Prior art keywords
- session
- information
- white list
- abnormal
- preset
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/31—Indexing; Data structures therefor; Storage structures
- G06F16/316—Indexing structures
- G06F16/322—Trees
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24568—Data stream processing; Continuous queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3329—Natural language query formulation or dialogue systems
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- Human Computer Interaction (AREA)
- Artificial Intelligence (AREA)
- Software Systems (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
An embodiment of the invention discloses an abnormal session detection method comprising: extracting sessions from a data stream together with the device interconnection information corresponding to each session; judging, according to the device interconnection information, whether the corresponding session exists in a preset white list; classifying sessions that do not exist in the preset white list step by step according to session content to obtain a session tree; calculating the data type confidence of each leaf node in the session tree according to the device interconnection information; determining the session set corresponding to leaf nodes whose data type confidence is greater than a preset confidence threshold as the latest white list; and judging sessions that do not exist in the latest white list to be abnormal sessions, thereby improving the accuracy of abnormal session detection.
Description
Technical Field
The present invention relates to the field of information security, and in particular, to a method and apparatus for detecting an abnormal session, and a computer storage medium.
Background
There are many ways to discover network security threats in networks, such as Intrusion Detection Systems (IDS), network traffic analysis systems (NTA), etc. NTA, an emerging technology for network threat detection, has emerged in the network security market.
In addition, prior-art detection systems require a heavy investment of funds and manpower to merge, compress and mine, by means of upper-layer association analysis, the security logs they generate, which consist mainly of false alarms; the investment is large and the results are poor.
Aiming at the problem of inaccurate abnormal flow detection in the prior art, no effective solution exists at present.
Disclosure of Invention
To solve these problems, the invention provides an abnormal session detection method, an abnormal session detection device and a computer storage medium. A white list is determined according to device interconnection information; sessions that do not exist in the white list are divided into a session tree; feature information is extracted for each leaf node of the session tree; the data type confidence of each leaf node is determined from multiple items of feature information; and whether the sessions under a leaf node are abnormal is judged from that confidence, which addresses the problem of inaccurate judgment of abnormal traffic.
In order to achieve the above object, the present invention provides an abnormal session detection method, including: extracting sessions from a data stream together with the device interconnection information corresponding to each session; judging, according to the device interconnection information, whether the corresponding session exists in a preset white list; classifying sessions that do not exist in the preset white list step by step according to session content to obtain a session tree; calculating the data type confidence of each leaf node in the session tree according to the device interconnection information; determining the session set corresponding to leaf nodes whose data type confidence is greater than a preset confidence threshold as the latest white list; and judging sessions that do not exist in the latest white list to be abnormal sessions.
Further optionally, the calculating the confidence of the data type of each leaf node in the session tree according to the device interconnection information includes: extracting and counting multidimensional feature information in the equipment interconnection information; determining the session initial confidence corresponding to the feature information of each dimension according to the coincidence degree of the feature information of each dimension and the data type feature; and integrating a plurality of session initial confidence degrees to obtain the data type confidence degrees.
Further optionally, the multidimensional feature information in the device interconnection information includes at least two of the following: convergence information of the source address; convergence information of the destination address; convergence information of the destination port; characteristic information on the session access frequency of a single source address; characteristic information on the periodic variation pattern of single source address sessions.
Further optionally, after the determining, according to the device interconnection information, whether the corresponding session exists in the preset whitelist, the method includes: acquiring the data volume of the session existing in the preset white list; comparing the data volume with a preset standard data volume; and determining the session corresponding to the data volume larger than the preset standard data volume as an abnormal session.
Further optionally, after determining the session set corresponding to the leaf node with the data type confidence coefficient greater than the preset confidence coefficient threshold as the latest white list, the method includes: and supplementing the latest white list into the preset white list.
On the other hand, the invention also provides an abnormal session detection device, which comprises: the extraction module is used for extracting the session in the data stream and the equipment interconnection information corresponding to the session; the judging module is used for judging whether the corresponding session exists in a preset white list or not according to the equipment interconnection information; the session tree generation module is used for classifying the sessions which do not exist in the preset white list step by step according to the session content to obtain a session tree; the confidence coefficient calculating module is used for calculating the data type confidence coefficient of each leaf node in the session tree according to the equipment interconnection information; the latest white list determining module is used for determining a session set corresponding to the leaf node with the data type confidence coefficient larger than a preset confidence coefficient threshold value as a latest white list; and the first abnormal session judging module is used for judging the session which does not exist in the latest white list as an abnormal session.
Further optionally, the confidence calculating module includes: the multi-dimensional feature information extraction sub-module is used for extracting and counting multi-dimensional feature information in the equipment interconnection information; the initial confidence determining submodule is used for determining the session initial confidence corresponding to the feature information of each dimension according to the coincidence degree of the feature information of each dimension and the data type feature; and the data type confidence determining submodule is used for synthesizing a plurality of session initial confidence degrees to obtain the data type confidence degrees.
Further optionally, the apparatus further comprises: the data volume determining module is used for obtaining the data volume of the session existing in the preset white list; the comparison module is used for comparing the data volume with a preset standard data volume; and the second abnormal session judging module is used for determining the session corresponding to the data volume larger than the preset standard data volume as an abnormal session.
Further optionally, the apparatus further comprises: a supplementing module, configured to supplement the latest white list into the preset white list.
On the other hand, the present invention also provides a computer storage medium having stored thereon a computer program which, when executed by a processor, implements the abnormal session detection method described above.
The technical scheme has the following beneficial effects: filtering abnormal sessions which do not accord with legal connection relations by setting a white list; after dividing the session into session trees, calculating the characteristics of each leaf node, calculating the confidence coefficient of each leaf node, and determining the latest white list according to the confidence coefficient. The white list is dynamically generated according to the session, so that manpower is saved, and the detection efficiency of the abnormal session is improved. In addition, the white list is dynamically updated according to different sessions, so that the scheme can accurately detect each session.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of an abnormal session detection method provided by an embodiment of the present invention;
FIG. 2 is a flowchart of a data type confidence calculation method provided by an embodiment of the present invention;
FIG. 3 is a flow chart of a method for determining abnormal sessions based on data volume according to an embodiment of the present invention;
FIG. 4 is a block diagram of an abnormal session detection apparatus according to an embodiment of the present invention;
FIG. 5 is a block diagram of a confidence computation module provided by an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a data amount determining module, a comparing module, and a second abnormal session judging module according to an embodiment of the present invention.
Reference numerals: 100-extraction module; 200-judgment module; 300-session tree generation module; 400-confidence calculation module; 4001-multidimensional feature information extraction sub-module; 4002-initial confidence determination sub-module; 4003-data type confidence determination sub-module; 500-latest white list determination module; 600-first abnormal session judgment module; 700-data amount determination module; 800-comparison module; 900-second abnormal session judgment module.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
In the prior art, detecting abnormal sessions requires a system with very high equipment configuration to perform feature analysis and matching on traffic, yet such a system has low detection accuracy for abnormal sessions, and a large amount of labor is needed to merge, compress and mine all sessions, so the investment is large and the results are poor.
In order to solve the above problems, the present invention provides an abnormal session detection method. FIG. 1 is a flowchart of the abnormal session detection method provided by an embodiment of the present invention. As shown in FIG. 1, the method includes:
S101, extracting sessions from a data stream together with the device interconnection information corresponding to each session;
The device interconnection information includes at least: source address (IMSI), source port, destination address, destination port, protocol type, connection establishment time, periodicity, number of interconnections, and the like.
S102, judging whether a corresponding session exists in a preset white list or not according to the equipment interconnection information;
The preset white list is a set of sessions conforming to legal connection relations; it can be set manually or derived by the system from data analysis.
S103, classifying the sessions which do not exist in the preset white list step by step according to the session content to obtain a session tree;
All sessions are classified step by step according to information such as the source address, source port, destination address, destination port, transmission protocol, instructions of the session initiator and the account used, to obtain a session tree.
S104, calculating the data type confidence of each leaf node in the session tree according to the equipment interconnection information;
For each leaf node in the session tree, the credibility of the node can be scored according to dimensional features in the device interconnection information such as the number of sessions, frequency, periodicity, instructions, the number of sessions per IP address, and the 24-hour distribution pattern.
S105, determining a session set corresponding to the leaf node with the data type confidence coefficient larger than a preset confidence coefficient threshold as a latest white list;
S106, judging sessions that do not exist in the latest white list to be abnormal sessions.
If the data type confidence of a leaf node is greater than the preset confidence threshold, the sessions under that node conform to the legal connection relation, and the corresponding session set is determined to be the latest white list; sessions that do not belong to the latest white list are determined to be abnormal sessions.
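Steps S102 to S106 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `Session` fields, the choice of classification keys, the representation of the session tree as a flat mapping from each leaf's path to its sessions, and the `score_leaf` callable are all assumptions made for the example. The default threshold of 80 is the value used in the worked example of this description.

```python
from collections import defaultdict
from typing import NamedTuple


class Session(NamedTuple):
    # Hypothetical minimal session record; the description lists more fields.
    src: str        # source address (e.g. IMSI)
    dst: str        # destination address
    dst_port: int
    protocol: str


def split_by_whitelist(sessions, whitelist):
    """S102: separate sessions found in the preset white list from the rest."""
    known = [s for s in sessions if s in whitelist]
    unknown = [s for s in sessions if s not in whitelist]
    return known, unknown


def build_session_tree(sessions, keys=("protocol", "dst", "dst_port")):
    """S103: classify step by step; each leaf is keyed by its path of values."""
    leaves = defaultdict(list)
    for s in sessions:
        leaves[tuple(getattr(s, k) for k in keys)].append(s)
    return dict(leaves)


def detect(sessions, whitelist, score_leaf, threshold=80):
    """S104-S106: score each leaf, build the latest white list, flag the rest."""
    _, unknown = split_by_whitelist(sessions, whitelist)
    latest, abnormal = set(), []
    for leaf_sessions in build_session_tree(unknown).values():
        if score_leaf(leaf_sessions) > threshold:
            latest.update(leaf_sessions)      # S105: leaf joins the latest white list
        else:
            abnormal.extend(leaf_sessions)    # S106: not in the latest white list
    return latest, abnormal
```

Here `score_leaf` stands in for the data type confidence calculation of S104; any function that maps the sessions of a leaf to a score on the same scale as the threshold can be plugged in.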
As an optional implementation, FIG. 2 is a flowchart of a data type confidence calculation method provided by an embodiment of the present invention. As shown in FIG. 2, S104, calculating the data type confidence of each leaf node in the session tree according to the device interconnection information, includes:
S1041, extracting and counting multidimensional feature information in the device interconnection information;
S1042, determining the session initial confidence corresponding to the feature information of each dimension according to the degree to which the feature information of that dimension matches the data type feature;
S1043, combining the plurality of session initial confidences to obtain the data type confidence.
Traffic can be roughly divided into two types: service data flows between production terminals and the local server, and management data flows between management terminals and the local server.
Service data flows between production terminals and the local server include production terminal state acquisition and reporting, operation instructions issued to the terminals by the local server computing center, and the like. They have the following characteristics:
1) The source address (IMSI) and destination address are highly convergent
2) The number of sessions follows a 24-hour daily periodic variation pattern
3) Access from a single source address is periodic
Management data flows between management terminals and the local server are mainly used for state query and synchronization, task synchronization, maintenance, and the like. They have the following characteristics:
1) The destination address and port are highly convergent
2) The source address is highly random over the session time sequence, since different internal staff perform management operations at times of their own choosing
3) The number of sessions follows a 24-hour daily periodic variation pattern, concentrated mainly in working hours on working days
Multidimensional feature information is extracted for each leaf node according to the characteristics of the actual data types. The feature information of each dimension is compared with the characteristics of the actual data type, and an initial confidence is obtained from the resulting similarity. Once the feature information of all dimensions has been evaluated, the data type confidence of the session is calculated using the weight of each dimension; the higher the data type confidence, the more likely the session belongs to that data type.
As an alternative embodiment, all sessions may be preliminarily filtered before the session tree is generated. Service data flows account for the bulk of the traffic, and their labeled ports are fixed and limited in number, so the type of such a session can be judged quickly from these characteristics. Sessions whose transport protocol or service type is, for example, HTTP, MQTT or recursive DNS are all service data flows. Sessions judged in advance to be service flows are added to the white list and do not take part in the subsequent session tree division, which reduces the amount of data to be processed later.
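The preliminary filtering described above might look like the following sketch. The dictionary-based session records, the protocol labels in `SERVICE_PROTOCOLS` and the tuple form of white list entries are illustrative assumptions; the description only states that HTTP, MQTT and recursive DNS sessions are treated as service traffic.

```python
# Hypothetical protocol labels; only HTTP, MQTT and recursive DNS are named in the text.
SERVICE_PROTOCOLS = {"http", "mqtt", "dns"}


def prefilter(sessions, whitelist):
    """Add obvious service sessions to the white list; return the remaining sessions."""
    remaining = []
    for s in sessions:
        if s["protocol"].lower() in SERVICE_PROTOCOLS:
            # Assumed white list entry: the connection relation of the session.
            whitelist.add((s["src"], s["dst"], s["dst_port"], s["protocol"]))
        else:
            remaining.append(s)  # still has to go through session tree division
    return remaining
```

Only the sessions returned by `prefilter` would then be passed to the session tree division step.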
As an optional implementation, the multidimensional feature information in the device interconnection information includes at least two of the following: convergence information of the source address; convergence information of the destination address; convergence information of the destination port; characteristic information on the session access frequency of a single source address; characteristic information on the periodic variation pattern of single source address sessions.
As a specific embodiment, each leaf node may be analyzed in five dimensions.
Dimension 1: convergence information of the source address. The source addresses of the service data flow are highly convergent; the source addresses of the management data flow essentially converge within a limited address range.
Dimension 2: convergence information of the destination address. The destination addresses of the service data flow are highly convergent; the destination addresses of the management data flow are highly random but converge into a finite set.
Dimension 3: convergence information of the destination port. The destination ports of the service data flow converge into a finite set; the destination ports of the management data flow converge into a finite set, and the daily destination address sets are substantially the same.
Dimension 4: characteristic information on the session access frequency of a single source address. For the service data flow, the access frequency of a single source address is high and a reference distribution exists; for the management data flow, the weekly or daily access frequency of a single source address is, statistically, substantially evenly distributed.
Dimension 5: characteristic information on the periodic variation pattern of single source address sessions. A single source address session in the service data flow follows a 24-hour daily periodic variation rule (baseline rule); a single source address session in the management data flow also follows a 24-hour daily periodic rule (baseline rule), with nine-to-five working hours from Monday to Friday as the main traffic distribution period.
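The description does not define how convergence is measured. One possible measure, shown here purely as an assumption, is the share of observations covered by the few most frequent values: a value near 1.0 suggests that addresses or ports converge into a small set, and a low value suggests that they are scattered. The function name and the `top_k` parameter are hypothetical.

```python
from collections import Counter


def convergence(values, top_k=3):
    """Fraction of observations covered by the top_k most frequent values."""
    if not values:
        return 0.0
    counts = Counter(values)
    covered = sum(n for _, n in counts.most_common(top_k))
    return covered / len(values)
```

Applied to the destination ports or source addresses of the sessions under a leaf node, such a score could feed the comparison with the data type characteristics in dimensions 1 to 3.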
As a specific implementation, the data type confidence of a leaf node is determined dimension by dimension as follows.
First, the source address convergence of the leaf node is determined. If it matches the high-convergence characteristic, the dimension 1 initial confidence of the leaf node is set to 100 for the service data flow and 0 for the management data flow.
Second, the destination address convergence of the leaf node is determined. If it matches the high-convergence characteristic, the dimension 2 initial confidence of the leaf node is set to 100 for the service data flow and 0 for the management data flow.
Third, the destination port convergence of the leaf node is determined. If the destination ports converge into a finite set and the daily destination address sets differ greatly, the dimension 3 initial confidence of the leaf node is set to 100 for the service data flow and 0 for the management data flow.
Fourth, the single source address session access frequency of the leaf node is determined. If the frequency is high and a reference distribution exists, the dimension 4 initial confidence of the leaf node is set to 100 for the service data flow and 0 for the management data flow.
Fifth, the periodic variation pattern of single source address sessions of the leaf node is determined. If the sessions follow the 24-hour daily periodic variation rule (baseline rule) but are not concentrated in nine-to-five working hours from Monday to Friday, the dimension 5 initial confidence of the leaf node is set to 100 for the service data flow and 0 for the management data flow.
The weights of dimensions 1 to 5 for the service data flow are 0.2, 0.2, 0.2, 0.3 and 0.1 respectively; according to the data type determination rule for the service data flow, the data type confidence that the leaf node belongs to the service data flow is: 0.2×100+0.2×100+0.2×100+0.3×100+0.1×100=100;
The weights of dimensions 1 to 5 for the management data flow are each 0.2; according to the data type determination rule for the management data flow, the data type confidence that the leaf node belongs to the management data flow is: 0.2×0+0.2×0+0.2×0+0.2×0+0.2×0=0;
The data type confidences for the two data flow types are then compared. Here the confidence for the service data flow is the larger, so that value, 100, is compared with the preset confidence threshold, 80. If the data type confidence of the leaf node is greater than the preset confidence threshold, the leaf node is judged to be a service data flow that conforms to the legal connection relation, and the sessions under the leaf node are added to the white list.
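The weighted calculation in this worked example can be reproduced with a short sketch. The weights, the initial confidences of 100 and 0, and the threshold of 80 come from the example above; the function names and the tuple returned by `classify_leaf` are assumptions made for illustration.

```python
# Per-dimension weights taken from the worked example (dimensions 1 to 5).
SERVICE_WEIGHTS = [0.2, 0.2, 0.2, 0.3, 0.1]
MANAGEMENT_WEIGHTS = [0.2, 0.2, 0.2, 0.2, 0.2]


def data_type_confidence(initial, weights):
    """Weighted sum of the per-dimension initial confidences."""
    if len(initial) != len(weights):
        raise ValueError("one initial confidence is needed per dimension")
    return sum(w * c for w, c in zip(weights, initial))


def classify_leaf(service_initial, management_initial, threshold=80):
    """Pick the data type with the larger confidence and test it against the threshold."""
    service = data_type_confidence(service_initial, SERVICE_WEIGHTS)
    management = data_type_confidence(management_initial, MANAGEMENT_WEIGHTS)
    label, best = max((("service", service), ("management", management)),
                      key=lambda pair: pair[1])
    return label, best, best > threshold


label, confidence, whitelisted = classify_leaf([100] * 5, [0] * 5)
```

With the values of the example, the leaf is labelled as a service data flow with a confidence of about 100, which exceeds the threshold, so its sessions would be added to the white list.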
As an optional implementation, FIG. 3 is a flowchart of a method for determining an abnormal session according to data volume provided by an embodiment of the present invention. As shown in FIG. 3, after S102, judging whether the corresponding session exists in the preset white list according to the device interconnection information, the method includes:
S108, acquiring the data volume of the sessions existing in the preset white list;
S109, comparing the data volume with a preset standard data volume;
S110, determining a session whose data volume is larger than the preset standard data volume to be an abnormal session.
For sessions that exist in the white list, the data volume of each session within a preset period is counted, and any session whose data volume exceeds the preset standard data volume is judged to be an abnormal session. Sessions whose data volume surges because of equipment failure, network failure or software problems are thereby flagged as abnormal, which improves the accuracy of abnormal session detection and reduces missed detections.
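Steps S108 to S110 amount to a simple comparison per whitelisted session. In the sketch below, the mapping from session identifier to data volume and the unit of the standard volume are assumptions; the description does not specify how the volume is counted.

```python
def oversized_sessions(volume_by_session, standard_volume):
    """Return whitelisted sessions whose volume in the period exceeds the standard."""
    return [sid for sid, volume in volume_by_session.items()
            if volume > standard_volume]


# Hypothetical volumes (bytes) counted over one preset period.
volumes = {"imsi-1->10.0.0.5:1883": 12_000, "imsi-2->10.0.0.5:1883": 9_500_000}
flagged = oversized_sessions(volumes, standard_volume=1_000_000)
```

In this example only the second session exceeds the assumed standard volume and would be reported as abnormal even though it is in the white list.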
As an optional implementation manner, after determining, as the latest whitelist, the session set corresponding to the leaf node with the data type confidence coefficient greater than the preset confidence coefficient threshold in S105, the method includes: and S107, supplementing the latest white list into the preset white list.
In this embodiment, the preset white list is dynamically updated: after the latest white list is obtained, its entries are added to the preset white list, and the supplemented preset white list serves as the basis for judging whether subsequent sessions are legal connections.
As an optional implementation, FIG. 4 is a block diagram of an abnormal session detection apparatus provided by an embodiment of the present invention. As shown in FIG. 4, the present invention further provides an abnormal session detection apparatus, including:
an extracting module 100, configured to extract a session in a data stream and device interconnection information corresponding to the session;
The device interconnection information includes at least: source address (IMSI), source port, destination address, destination port, protocol type, connection establishment time, periodicity, number of interconnections, and the like.
A judging module 200, configured to judge whether a corresponding session exists in a preset whitelist according to the device interconnection information;
The preset white list is a set of sessions conforming to legal connection relations; it can be set manually or derived by the system from data analysis.
The session tree generation module 300 is configured to classify sessions that do not exist in the preset whitelist step by step according to session content, so as to obtain a session tree;
All sessions are classified step by step according to information such as the source address, source port, destination address, destination port, transmission protocol, instructions of the session initiator and the account used, to obtain a session tree.
A confidence calculating module 400, configured to calculate a data type confidence of each leaf node in the session tree according to the device interconnection information;
For each leaf node in the session tree, the credibility of the node can be scored according to dimensional features in the device interconnection information such as the number of sessions, frequency, periodicity, instructions, the number of sessions per IP address, and the 24-hour distribution pattern.
The latest white list determining module 500 is configured to determine, as a latest white list, a session set corresponding to a leaf node whose data type confidence coefficient is greater than a preset confidence coefficient threshold;
A first abnormal session judging module 600, configured to judge sessions that do not exist in the latest white list to be abnormal sessions.
If the data type confidence of a leaf node is greater than the preset confidence threshold, the sessions under that node conform to the legal connection relation, and the corresponding session set is determined to be the latest white list; sessions that do not belong to the latest white list are determined to be abnormal sessions.
As an alternative implementation, FIG. 5 is a block diagram of a confidence calculation module provided by an embodiment of the present invention. As shown in FIG. 5, the confidence calculating module 400 includes:
a multidimensional feature information extraction submodule 4001, configured to extract and count multidimensional feature information in the device interconnection information;
an initial confidence determining submodule 4002, configured to determine, according to the coincidence degree of the feature information of each dimension and the data type feature, a session initial confidence corresponding to the feature information of each dimension;
a data type confidence determining submodule 4003, configured to synthesize a plurality of session initial confidences to obtain the data type confidence.
The traffic can be roughly divided into two types: traffic between a production terminal and a local server, and traffic between a management terminal and a local server.
The service data flow between the production terminal and the local server includes the collection and reporting of production terminal status, operation instructions issued to the terminal by the computing center of the local server, and the like. Its characteristics are as follows:
1) The source address (IMSI) and the destination address are highly convergent
2) The number of sessions varies periodically over the 24 hours of each day
3) Access from a single source address is periodic
The management data flow between the management terminal and the local server is mainly used for status query and synchronization, task synchronization, maintenance, and the like. Its characteristics are as follows:
1) The destination address and port are highly convergent
2) The source address is highly random over the session time sequence, because different internal personnel choose the time of their management operations at random
3) The number of sessions varies periodically over the 24 hours of each day, and is concentrated mainly in the working hours of working days
For each leaf node, the corresponding multidimensional feature information is extracted according to the characteristics of the actual data types. The feature information of each dimension is compared with the characteristics of the actual data type, and an initial confidence is obtained from the resulting similarity. After the initial confidences of all dimensions are obtained, the data type confidence of the session is calculated according to the weight of each dimension; the higher the data type confidence, the higher the probability that the session belongs to that data type.
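The patent does not give concrete formulas for the similarity or for the weighting, so the following is only one possible sketch. It assumes every feature is normalised to the range [0, 1], measures convergence as the share of the most common value in a non-empty list, takes similarity as one minus the absolute difference from the expected value, and combines dimensions by a weighted average. All names, formulas, and weights here are assumptions.

```python
from collections import Counter

def convergence(values):
    """Share of the most common value in a non-empty list (1.0 = fully convergent)."""
    counts = Counter(values)
    return max(counts.values()) / len(values)

def initial_confidences(observed, profile):
    """Per-dimension similarity between observed features and the expected
    profile of a data type; assumes both are normalised to [0, 1]."""
    return {dim: 1.0 - abs(observed[dim] - expected)
            for dim, expected in profile.items()}

def data_type_confidence(initial, weights):
    """Weighted average of the per-dimension initial confidences."""
    total = sum(weights[dim] for dim in initial)
    return sum(initial[dim] * weights[dim] for dim in initial) / total
```

A leaf node would be scored against each data type profile (for example, a service flow profile expecting high destination convergence), and the resulting data type confidence compared with the preset threshold.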
As an optional embodiment, all sessions may be pre-filtered before the session tree is generated. Service data flows account for the bulk of the traffic, and the ports they use are fixed and limited in number, so the type of such a session can be determined quickly from these characteristics. For example, sessions whose transport protocol or service type is HTTP, MQTT, or recursive DNS are all service data flows. Sessions identified in advance as service flows are added to the white list, so that they do not take part in the subsequent session tree division step, which reduces the amount of data to be processed later.
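A minimal sketch of this pre-filtering step follows, assuming each session carries a `protocol` field. The field name and the contents of `SERVICE_PROTOCOLS` are illustrative; the description names HTTP, MQTT, and recursive DNS only as examples.

```python
# Hypothetical set of protocol labels treated as service data flows.
SERVICE_PROTOCOLS = {"http", "mqtt", "dns"}

def prefilter(sessions, service_protocols=SERVICE_PROTOCOLS):
    """Split sessions into (whitelisted, remaining) by protocol.

    Whitelisted sessions skip session tree generation; the remaining
    sessions go on to the classification step.
    """
    whitelisted, remaining = [], []
    for session in sessions:
        if session["protocol"].lower() in service_protocols:
            whitelisted.append(session)
        else:
            remaining.append(session)
    return whitelisted, remaining
```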
As an optional implementation, Fig. 6 is a schematic structural diagram of the data amount determining module, the comparison module, and the second abnormal session judging module provided in an embodiment of the present invention. As shown in Fig. 6, the apparatus further includes:
a data amount determining module 700, configured to obtain the data amount of the session existing in the preset whitelist;
a comparison module 800, configured to compare the data amount with a preset standard data amount;
and a second abnormal session judging module 900, configured to determine a session corresponding to a data amount greater than the preset standard data amount as an abnormal session.
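The data amount check performed by modules 700 to 900 can be sketched as follows, assuming each whitelisted session records its data amount in a `bytes` field. The field name, the unit, and the function name are assumptions.

```python
def oversized_sessions(whitelisted_sessions, standard_bytes):
    """Return whitelisted sessions whose data amount exceeds the preset
    standard data amount; these are treated as abnormal despite being
    on the white list."""
    return [s for s in whitelisted_sessions if s["bytes"] > standard_bytes]
```

This second check means that a session on the preset white list can still be flagged when it transfers more data than expected.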
As an alternative embodiment, the device further comprises:
and the supplementing module is used for supplementing the latest white list into the preset white list.
In this embodiment, the preset whitelist is updated dynamically: after the latest whitelist is obtained, its data is added to the preset whitelist, and the supplemented preset whitelist is used as the basis for judging whether subsequent sessions are legal connections.
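One way to picture the dynamic update is to store the whitelist as a set of keys derived from the device interconnection information. The key fields chosen in `session_key` are an assumption, since the patent does not specify how whitelist entries are represented.

```python
def session_key(session):
    """Hypothetical whitelist key built from device interconnection information."""
    return (session["src_addr"], session["dst_addr"],
            session["dst_port"], session["protocol"])

def supplement(preset_whitelist, latest_sessions):
    """Return the preset whitelist extended with the latest whitelist entries."""
    return preset_whitelist | {session_key(s) for s in latest_sessions}

def in_whitelist(session, whitelist):
    """Check whether a later session matches an entry in the whitelist."""
    return session_key(session) in whitelist
```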
As an alternative embodiment, the present invention also provides a computer storage medium having stored thereon a computer program which when executed by a processor implements the abnormal session detection method described above.
The software described above is stored in the storage medium, which includes, but is not limited to, optical discs, floppy disks, hard disks, erasable memory, and the like.
The technical scheme has the following beneficial effects: abnormal sessions that do not conform to legal connection relationships are filtered out by means of a white list; after the sessions are divided into a session tree, the features of each leaf node are computed, the confidence of each leaf node is calculated, and the latest white list is determined according to the confidence. Because the white list is generated dynamically from the sessions, manual effort is saved and the efficiency of abnormal session detection is improved. In addition, because the white list is updated dynamically as sessions change, the scheme can detect each session accurately.
The foregoing description of the embodiments is provided to illustrate the general principles of the invention and is not intended to limit the scope of the invention to the particular embodiments; any modifications, equivalents, improvements, etc. that fall within the spirit and principles of the invention are intended to be included within the scope of the invention.
Claims (7)
1. An abnormal session detection method, comprising:
extracting session and device interconnection information corresponding to the session in the data stream;
judging whether the corresponding session exists in a preset white list or not according to the equipment interconnection information;
classifying the sessions which do not exist in the preset white list step by step according to the session content to obtain a session tree;
calculating the data type confidence of each leaf node in the session tree according to the equipment interconnection information, which comprises: extracting and counting multidimensional feature information in the equipment interconnection information; determining the session initial confidence corresponding to the feature information of each dimension according to the coincidence degree of the feature information of each dimension and the data type feature; and synthesizing a plurality of session initial confidences to obtain the data type confidence; wherein the multidimensional feature information in the device interconnection information comprises at least two of the following: convergence condition information of the source address; convergence condition information of the destination address; convergence condition information of the destination port; session frequency characteristic information of single source address access; and periodic variation law characteristic information of single source address sessions; and wherein the step of extracting and counting multidimensional feature information in the device interconnection information, and determining the session initial confidence corresponding to the feature information of each dimension according to the coincidence degree of the feature information of each dimension and the data type feature, comprises: extracting, according to the characteristics of the actual data types, corresponding multidimensional feature information from each leaf node, comparing the feature information of each dimension with the feature information of the actual data types, and obtaining an initial confidence according to the feature information similarity after comparison;
determining a session set corresponding to a leaf node with the data type confidence coefficient larger than a preset confidence coefficient threshold as a latest white list;
and judging the session which does not exist in the latest white list as an abnormal session.
2. The abnormal session detection method according to claim 1, wherein after determining whether the corresponding session exists in a preset whitelist according to the device interconnection information, the method comprises:
acquiring the data volume of the session existing in the preset white list;
comparing the data volume with a preset standard data volume;
and determining the session corresponding to the data volume larger than the preset standard data volume as an abnormal session.
3. The abnormal session detection method according to claim 1, wherein after determining the session set corresponding to the leaf node with the data type confidence greater than the preset confidence threshold as the latest whitelist, the method comprises:
and supplementing the latest white list into the preset white list.
4. An abnormal session detection apparatus, comprising:
the extraction module is used for extracting the session in the data stream and the equipment interconnection information corresponding to the session;
the judging module is used for judging whether the corresponding session exists in a preset white list or not according to the equipment interconnection information;
the session tree generation module is used for classifying the sessions which do not exist in the preset white list step by step according to the session content to obtain a session tree;
the confidence coefficient calculating module is used for calculating the data type confidence coefficient of each leaf node in the session tree according to the equipment interconnection information; the confidence calculation module comprises: the multi-dimensional feature information extraction sub-module is used for extracting and counting multi-dimensional feature information in the equipment interconnection information; the initial confidence determining submodule is used for determining the session initial confidence corresponding to the feature information of each dimension according to the coincidence degree of the feature information of each dimension and the data type feature; a data type confidence determining submodule, configured to synthesize a plurality of session initial confidences to obtain the data type confidence; the multidimensional feature information in the device interconnection information comprises at least two of the following: convergence condition information of the source address; convergence condition information of the destination address; convergence condition information of the destination port; session frequency characteristic information of single source address access; periodic variation law characteristic information of single source address sessions; the extracting and counting the multidimensional feature information in the device interconnection information, and determining the session initial confidence corresponding to the feature information of each dimension according to the coincidence degree of the feature information of each dimension and the data type feature comprises: according to the characteristics of the actual data types, extracting corresponding multidimensional characteristic information from each leaf node, comparing the characteristic information of each dimension with the characteristic information of the actual data types, and obtaining an initial confidence coefficient according to the characteristic information similarity after comparison;
the latest white list determining module is used for determining a session set corresponding to the leaf node with the data type confidence coefficient larger than a preset confidence coefficient threshold value as a latest white list;
and the first abnormal session judging module is used for judging the session which does not exist in the latest white list as an abnormal session.
5. The abnormal session detection apparatus according to claim 4, further comprising:
the data volume determining module is used for obtaining the data volume of the session existing in the preset white list;
the comparison module is used for comparing the data volume with a preset standard data volume;
and the second abnormal session judging module is used for determining the session corresponding to the data volume larger than the preset standard data volume as an abnormal session.
6. The abnormal session detection apparatus according to claim 4, further comprising:
and the supplementing module is used for supplementing the latest white list into the preset white list.
7. A computer storage medium having stored thereon a computer program, wherein the program when executed by a processor implements the abnormal session detection method according to any of claims 1-3.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111008418.3A CN113656535B (en) | 2021-08-31 | 2021-08-31 | Abnormal session detection method and device and computer storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111008418.3A CN113656535B (en) | 2021-08-31 | 2021-08-31 | Abnormal session detection method and device and computer storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113656535A (en) | 2021-11-16 |
CN113656535B (en) | 2023-11-14 |
Family
ID=78482456
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111008418.3A Active CN113656535B (en) | 2021-08-31 | 2021-08-31 | Abnormal session detection method and device and computer storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113656535B (en) |
Citations (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105991587A (en) * | 2015-02-13 | 2016-10-05 | 中国移动通信集团山西有限公司 | Intrusion detection method and system |
CN108259482A (en) * | 2018-01-04 | 2018-07-06 | 平安科技(深圳)有限公司 | Network Abnormal data detection method, device, computer equipment and storage medium |
CN109558951A (en) * | 2018-11-23 | 2019-04-02 | 北京知道创宇信息技术有限公司 | A kind of fraud account detection method, device and its storage medium |
CN109587000A (en) * | 2018-11-14 | 2019-04-05 | 上海交通大学 | High latency method for detecting abnormality and system based on collective intelligence network measurement data |
CN109889547A (en) * | 2019-03-29 | 2019-06-14 | 新华三信息安全技术有限公司 | A kind of detection method and device of abnormal network equipment |
CN110149343A (en) * | 2019-05-31 | 2019-08-20 | 国家计算机网络与信息安全管理中心 | A kind of abnormal communications and liaison behavioral value method and system based on stream |
CN110430226A (en) * | 2019-09-16 | 2019-11-08 | 腾讯科技(深圳)有限公司 | Network attack detecting method, device, computer equipment and storage medium |
CN110730195A (en) * | 2019-12-18 | 2020-01-24 | 腾讯科技(深圳)有限公司 | Data processing method and device and computer readable storage medium |
CN110995769A (en) * | 2020-02-27 | 2020-04-10 | 上海飞旗网络技术股份有限公司 | Deep data packet detection method and device and readable storage medium |
CN111666502A (en) * | 2020-07-08 | 2020-09-15 | 腾讯科技(深圳)有限公司 | Abnormal user identification method and device based on deep learning and storage medium |
CN112118261A (en) * | 2020-09-21 | 2020-12-22 | 杭州迪普科技股份有限公司 | Session violation access detection method and device |
CN112313657A (en) * | 2018-10-26 | 2021-02-02 | 谷歌有限责任公司 | Method, system and computer program product for detecting automatic sessions |
CN112784024A (en) * | 2021-01-11 | 2021-05-11 | 软通动力信息技术(集团)股份有限公司 | Man-machine conversation method, device, equipment and storage medium |
CN113127639A (en) * | 2020-01-14 | 2021-07-16 | 北京京东振世信息技术有限公司 | Abnormal session text detection method and device |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP5088403B2 (en) * | 2010-08-02 | 2012-12-05 | 横河電機株式会社 | Unauthorized communication detection system |
US11615144B2 (en) * | 2018-05-31 | 2023-03-28 | Microsoft Technology Licensing, Llc | Machine learning query session enhancement |
Non-Patent Citations (1)
Title |
---|
Data stream anomaly detection algorithm based on random space trees; Ye Lianlian; Computer Engineering and Design; Vol. 38 (No. 09); 2414-2419 * |
Also Published As
Publication number | Publication date |
---|---|
CN113656535A (en) | 2021-11-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108965347B (en) | Distributed denial of service attack detection method, device and server | |
CN111277570A (en) | Data security monitoring method and device, electronic equipment and readable medium | |
CN111935170B (en) | Network abnormal flow detection method, device and equipment | |
CN107566163B (en) | Alarm method and device for user behavior analysis association | |
CN108718298B (en) | Malicious external connection flow detection method and device | |
CN111367874B (en) | Log processing method, device, medium and equipment | |
US11847122B2 (en) | Unique SQL query transfer for anomaly detection | |
CN107145779B (en) | Method and device for identifying offline malicious software log | |
CN105704259B (en) | A kind of domain name authority services source IP recognition methods and system | |
CN112416872A (en) | Cloud platform log management system based on big data | |
CN111885106A (en) | Internet of things safety management and control method and system based on terminal equipment characteristic information | |
CN110958231A (en) | Industrial control safety event monitoring platform and method based on Internet | |
CN115865525B (en) | Log data processing method, device, electronic equipment and storage medium | |
CN113037567A (en) | Network attack behavior simulation system and method for power grid enterprise | |
CN111654486A (en) | Server equipment judgment and identification method | |
CN114640504B (en) | CC attack protection method, device, equipment and storage medium | |
CN113656535B (en) | Abnormal session detection method and device and computer storage medium | |
CN117221423A (en) | Flow analysis method and device, electronic equipment and storage medium | |
CN112261019A (en) | Distributed denial of service attack detection method, device and storage medium | |
CN110071898B (en) | Method for removing center to detect node validity | |
CN114844712B (en) | Edge node safety detection system and method based on knowledge graph | |
CN115801307A (en) | Method and system for carrying out port scanning detection by using server log | |
CN113872931A (en) | Method and system for detecting port scanning behavior, server and proxy node | |
CN112583817A (en) | Network oscillation monitoring and early warning method, device and medium | |
CN111224916A (en) | DDOS attack detection method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||