WO2023207548A1 - Traffic detection method, apparatus, device and storage medium - Google Patents

Traffic detection method, apparatus, device and storage medium

Info

Publication number
WO2023207548A1
WO2023207548A1 PCT/CN2023/086763 CN2023086763W WO2023207548A1 WO 2023207548 A1 WO2023207548 A1 WO 2023207548A1 CN 2023086763 W CN2023086763 W CN 2023086763W WO 2023207548 A1 WO2023207548 A1 WO 2023207548A1
Authority
WO
WIPO (PCT)
Prior art keywords
network traffic
traffic
identification model
target
feature
Prior art date
Application number
PCT/CN2023/086763
Inventor
张晨
罗辉
郭建新
Original Assignee
北京火山引擎科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京火山引擎科技有限公司
Priority to EP23794981.3A (EP4344134A1)
Publication of WO2023207548A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425 Traffic logging, e.g. anomaly detection
    • H04L63/1441 Countermeasures against malicious traffic
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/50 Reducing energy consumption in communication networks in wire-line communication networks, e.g. low power modes or reduced link rate

Definitions

  • the present disclosure relates to the field of Internet technology, and specifically to a traffic detection method, device, equipment and storage medium.
  • Container network is an open network architecture.
  • General network defense solutions mainly rely on universal defenses, for example matching risky data packets against predefined regular expressions, where a successful match indicates an intrusion risk. This detection method depends on analyzing the network traffic of historical attack methods to form rules, from which the regular expressions used for risk matching are predefined.
  • Each container has a specific business meaning.
  • Each container generally handles only network requests related to a single business. If a traditional unified security policy is adopted in the container environment, it may cause a large amount of invalid filtering, and unknown risks cannot be identified.
  • The traditional method of detecting abnormal network traffic in a container environment therefore suffers from low detection accuracy and is prone to missed detections and false positives.
  • Embodiments of the present disclosure provide at least a traffic detection method, device, equipment and storage medium.
  • an embodiment of the present disclosure provides a traffic detection method, which method includes:
  • calling the target traffic identification model corresponding to the target service container from the pre-trained traffic identification model set, and detecting, based on the called target traffic identification model, whether the network traffic is abnormally accessed network traffic, where the target traffic identification model is trained based on the network traffic associated with the target business container;
  • the network traffic is intercepted.
  • the network status information includes IP five-tuple information; searching for the target service container associated with the network traffic according to the network status information, and calling the target traffic identification model corresponding to the target business container from the pre-trained traffic identification model set, includes:
  • a target service container group associated with the network traffic is determined; the target service container group includes a source container matching the source IP address and a destination container matching the destination IP address;
  • the traffic identification models in the traffic identification model set are trained according to the following steps:
  • the acquired network traffic is aggregated to obtain a network traffic set corresponding to the business container group; the business containers included in each business container group have the same business information;
  • the feature matrix is composed of feature vectors corresponding to the network traffic included in the network traffic set;
  • a traffic identification model corresponding to the business container group is calculated; the traffic identification model is used to characterize the aggregated characteristics corresponding to normal access network traffic.
  • the method further includes:
  • the training process of the target traffic identification model is re-executed to update the target traffic identification model.
  • feature extraction of multiple feature dimensions is performed on the network traffic set to obtain a feature matrix corresponding to the network traffic set, including:
  • the traffic identification model corresponding to the business container group is calculated based on the feature matrix, including:
  • the traffic identification model is formed by using the URL feature set of the network traffic set and the confidence interval of the Body parameter in each feature dimension.
  • feature extraction is performed on the request body parameters of the network traffic collection, including:
  • a Body parameter feature set corresponding to the network traffic set is obtained.
  • detecting whether the network traffic is abnormally accessed network traffic based on the called target traffic identification model includes:
  • determining whether the network traffic is abnormally accessed network traffic includes:
  • the network traffic is determined to be abnormally accessed network traffic; otherwise, it is determined that the network traffic is normally accessed network traffic.
  • an embodiment of the present disclosure also provides a traffic detection device, which includes:
  • a data acquisition module used to obtain network traffic, and analyze the network traffic to obtain network status information related to the network traffic;
  • a traffic detection module configured to search for a target service container associated with the network traffic according to the network status information, call a target traffic identification model corresponding to the target service container from a set of pre-trained traffic identification models, and detect, based on the called target traffic identification model, whether the network traffic is abnormally accessed network traffic, where the target traffic identification model is trained based on the network traffic associated with the target business container;
  • a traffic interception module is configured to intercept the network traffic when detecting that the network traffic is abnormally accessed.
  • embodiments of the present disclosure also provide an electronic device, including: a processor, a memory, and a bus.
  • the memory stores machine-readable instructions executable by the processor.
  • the processor communicates with the memory through the bus, and when the machine-readable instructions are executed by the processor, the steps of the traffic detection method of the above-mentioned first aspect, or of any possible implementation of the first aspect, are performed.
  • embodiments of the present disclosure also provide a computer-readable storage medium.
  • a computer program is stored on the computer-readable storage medium.
  • the computer program executes the steps of the traffic detection method of the above-mentioned first aspect, or of any possible implementation of the first aspect.
  • The traffic detection method can obtain network traffic and analyze it to obtain network status information related to the network traffic; find, based on the network status information, the target service container associated with the network traffic; call the target traffic identification model corresponding to the target business container from the pre-trained traffic identification model set; and detect, based on the called target traffic identification model, whether the network traffic is abnormally accessed network traffic.
  • The target traffic identification model is obtained by training on the network traffic associated with the target service container; when the network traffic is detected to be abnormally accessed network traffic, it is intercepted.
  • A traffic identification model matching each business container is trained on the network traffic corresponding to the business information of that business container. When detecting network traffic, the business container associated with the network traffic can be found, the target traffic identification model corresponding to that business container can be called from the pre-trained traffic identification model set, and the network traffic can be detected with the called model. Embodiments of the present disclosure can thus apply different traffic identification models to business containers with different business meanings, so that the network traffic of each business container is detected in a way adapted to that container's business characteristics. Compared with the traditional unified detection strategy, risk identification and filtering are more accurate, reducing missed detections and false positives for abnormal traffic.
  • Figure 1 shows a schematic diagram of an application scenario provided by an embodiment of the present disclosure
  • Figure 2 shows a flow chart of a traffic detection method provided by an embodiment of the present disclosure
  • Figure 3 shows a flow chart of a traffic identification model training method provided by an embodiment of the present disclosure
  • Figure 4 shows a flow chart of another traffic detection method provided by an embodiment of the present disclosure
  • Figure 5 shows a flow chart of a traffic identification model update method provided by an embodiment of the present disclosure
  • Figure 6 shows the first schematic diagram of a traffic detection device provided by an embodiment of the present disclosure
  • Figure 7 shows the second schematic diagram of a traffic detection device provided by an embodiment of the present disclosure
  • FIG. 8 shows a schematic diagram of an electronic device provided by an embodiment of the present disclosure.
  • "A and/or B" can mean three cases: A exists alone, A and B exist simultaneously, or B exists alone.
  • "At least one" herein means any one of a plurality, or any combination of at least two of a plurality; for example, "including at least one of A, B, and C" can mean including any one or more elements selected from the set composed of A, B and C.
  • the traditional access detection method is to use a unified security policy, that is, using the same detection method to detect the traffic of business access in all container environments.
  • Each container has a specific business meaning and generally handles only network requests related to a single business. Using the same detection method for diverse businesses may therefore cause a large amount of invalid filtering and make unknown risks difficult to identify, so that detection cannot be targeted at the business characteristics of each container. This in turn lowers the detection accuracy for abnormal network traffic in the container environment, makes missed detections and false positives likely, and makes it difficult to ensure normal network access in the container environment.
  • The present disclosure provides a traffic detection method that can obtain network traffic and analyze it to obtain network status information related to the network traffic; find, based on the network status information, the target business container associated with the network traffic; call the target traffic identification model corresponding to the target business container from the pre-trained traffic identification model set; and detect, based on the called target traffic identification model, whether the network traffic is abnormally accessed network traffic.
  • The target traffic identification model is obtained by training on the network traffic associated with the target business container; when the network traffic is detected to be abnormally accessed network traffic, it is intercepted.
  • A traffic identification model matching each business container is trained on the network traffic corresponding to the business information of that business container. When detecting network traffic, the business container associated with the network traffic can be found, the target traffic identification model corresponding to that business container can be called from the pre-trained traffic identification model set, and the network traffic can be detected with the called model. Embodiments of the present disclosure can thus apply different traffic identification models to business containers with different business meanings, so that the network traffic of each business container is detected in a way adapted to that container's business characteristics. Compared with the traditional unified detection strategy, risk identification and filtering are more accurate, reducing missed detections and false positives for abnormal traffic.
  • The execution subject of the traffic detection method provided by the embodiment of the disclosure is generally a computer device with certain computing capabilities.
  • The computer device includes, for example, a terminal device, a server, or another processing device.
  • the access detection method can be implemented by the processor calling computer-readable instructions stored in the memory.
  • Figure 1 is a schematic diagram of an application scenario provided by an embodiment of the present disclosure.
  • Training can be performed based on the business information of each business container sent by the container orchestration system, together with sample network traffic and its corresponding business information, thereby obtaining the traffic identification model. Once the traffic identification model is trained, a business container can send the network traffic to be detected to the model, which detects whether that network traffic is abnormally accessed network traffic.
  • FIG 2 is a flow chart of a traffic detection method provided by an embodiment of the present disclosure.
  • the traffic detection method provided by the embodiment of the present disclosure includes steps S201 to S204, wherein:
  • S201 Obtain network traffic, and analyze the network traffic to obtain network status information related to the network traffic.
  • The network status information related to network traffic may include Internet Protocol (IP) five-tuple information, the Uniform Resource Locator (URL) address, request body (Body) parameter information, etc.
  • The network traffic is generally in the form of Transmission Control Protocol (TCP) data packets, which can be converted from TCP form into Hyper Text Transfer Protocol (HTTP) form through Deep Packet Inspection (DPI) technology.
  • network traffic can be obtained through a network hook.
  • S202 Find the target service container associated with the network traffic according to the network status information, call the target traffic identification model corresponding to the target service container from the pre-trained traffic identification model set, and detect, based on the called target traffic identification model, whether the network traffic is abnormally accessed network traffic; the target traffic identification model is trained based on the network traffic associated with the target business container.
  • The traffic identification model set includes multiple traffic identification models, and each traffic identification model records its corresponding business container, so the target business container associated with the network traffic can be used to determine the corresponding target traffic identification model, on which subsequent traffic detection is based.
  • the network status information includes IP five-tuple information
  • the method of searching for a target service container associated with the network traffic according to the network status information, and calling a target traffic identification model corresponding to the target service container from a set of pre-trained traffic identification models including:
  • a target service container group associated with the network traffic is determined; the target service container group includes a source container matching the source IP address and a destination container matching the destination IP address;
  • the network status information includes IP five-tuple information
  • The source IP address and the destination IP address are a pair of IP addresses with a business access relationship, so a target service container group with a business access relationship can be determined, namely the target service container group associated with the network traffic.
  • the target service container group includes a source container matching the source IP address and a destination container matching the destination IP address.
  • each target service container group corresponds to a pre-trained traffic identification model.
  • the target traffic identification model corresponding to the target service container group can be determined from a set of pre-trained traffic identification models with the help of the target service container group.
  • In order to determine the target service container group associated with the network traffic, not only the source IP address and destination IP address in the IP five-tuple information can be obtained, but also the URL request indicated by the URL address and the Body parameters. These help determine the service characteristics of the target service container group and improve the accuracy of subsequently determining the target traffic identification model that matches the group.
  • For example, the source IP address 10.224.41.163 and the destination IP address 10.224.60.24 can be extracted from the IP five-tuple information in the network traffic, so that the source container trade matching the source IP address and the destination container order matching the destination IP address can be determined. At the same time, the URL request /order/detail indicated by the URL address and the Body parameter {orderNo:23} can also be extracted from the network traffic, from which it can be determined that the source container has transaction characteristics and the destination container has order characteristics.
  • The two form a target business container group with a business access relationship for order transactions. Once the target business container group is determined, the corresponding target traffic identification model can be determined from the pre-trained traffic identification model set.
  • Each business container group corresponds to a pre-trained traffic identification model, which enables targeted detection of network traffic close to the business source and achieves precise protection in a container network environment.
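  • The lookup described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the `FiveTuple` class, the `IP_TO_CONTAINER` and `MODEL_SET` tables, the model name and the port numbers used in the usage note are all hypothetical; only the two IP addresses and the container names trade and order come from the example in the text.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class FiveTuple:
    """IP five-tuple parsed from one piece of network traffic (hypothetical type)."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str


# Hypothetical table mapping container IP addresses to container names.
IP_TO_CONTAINER = {
    "10.224.41.163": "trade",
    "10.224.60.24": "order",
}

# Hypothetical model set keyed by (source container, destination container).
MODEL_SET = {
    ("trade", "order"): "model-trade-order",
}


def find_container_group(ft: FiveTuple) -> Optional[Tuple[str, str]]:
    """Return the (source, destination) container group, or None if either IP is unknown."""
    src = IP_TO_CONTAINER.get(ft.src_ip)
    dst = IP_TO_CONTAINER.get(ft.dst_ip)
    if src is None or dst is None:
        return None
    return (src, dst)


def find_target_model(ft: FiveTuple) -> Optional[str]:
    """Look up the pre-trained model that corresponds to the container group."""
    group = find_container_group(ft)
    if group is None:
        return None
    return MODEL_SET.get(group)
```

  • For instance, `find_target_model(FiveTuple("10.224.41.163", 40000, "10.224.60.24", 8080, "TCP"))` resolves the trade/order group and returns its model; the ports here are placeholders chosen for illustration.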
  • When the source container accesses the destination container, the ports used for business access can also be determined from the IP five-tuple information, namely the source port of the source container and the destination port of the destination container, so that network traffic is transmitted through the source port of the source container and the destination port of the destination container to achieve business access.
  • FIG. 3 is a flow chart of a traffic identification model training method provided by the embodiment of the present disclosure, including steps S301 to S303:
  • S301 Aggregate the obtained network traffic according to the business information corresponding to the obtained network traffic and the business information of each business container, to obtain a network traffic set corresponding to each business container group; the business containers in each business container group have the same business information.
  • The network traffic used for training and its corresponding business information can be obtained, together with the business information of each business container.
  • The business containers in each business container group have the same business information, so identical business information can be identified by comparison. For each business container group, the obtained network traffic corresponding to that business information can then be aggregated into a network traffic set, that is, the network traffic set corresponding to that business container group.
  • The business information of each business container can be obtained through the application programming interface (API) of the container orchestration system, and the business information includes the business attribute information of each business container.
  • The business attribute information of each business container is, for example, nginx, mysql, kafka, etc.
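  • A minimal sketch of the aggregation in S301, assuming each piece of training traffic is a dictionary carrying a `business` field and that the orchestration system has already supplied a container-to-business mapping. The function names, the field names and the sample container names are hypothetical.

```python
from collections import defaultdict


def group_containers(container_business):
    """Group container names by their business information."""
    groups = defaultdict(list)
    for container, business in container_business.items():
        groups[business].append(container)
    return dict(groups)


def aggregate_traffic(flows, container_business):
    """Build one network traffic set per business container group.

    flows: list of dicts, each with a 'business' key (assumed layout).
    container_business: dict mapping container name -> business information.
    Traffic whose business information matches no container group is skipped.
    """
    groups = group_containers(container_business)
    traffic_sets = {business: [] for business in groups}
    for flow in flows:
        business = flow["business"]
        if business in traffic_sets:
            traffic_sets[business].append(flow)
    return groups, traffic_sets
```

  • Keying both the container groups and the traffic sets by the same business information keeps the one-to-one correspondence between a business container group and its network traffic set.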
  • S302 Perform feature extraction on multiple feature dimensions of the network traffic set to obtain a feature matrix corresponding to the network traffic set; the feature matrix is composed of feature vectors corresponding to each network traffic in the network traffic set.
  • Feature extraction can be performed separately on the Uniform Resource Locator (URL) addresses and the request body (Body) parameters in the network traffic set, to obtain the URL feature set and the Body parameter feature set contained in the feature matrix.
  • a feature matrix corresponding to the network traffic set can be formed.
  • Feature extraction on the URL addresses in the network traffic set may use regular expressions to extract the accessed resource path from the URL address of each network traffic, which yields the URL feature of each network traffic. The URL features of all network traffic in the set are then deduplicated to obtain the URL feature set corresponding to the network traffic set.
  • A regular expression can be used to detect whether the URL address of each network traffic in the set contains a string matching the resource path pattern. If it does, the URL address is filtered to extract the accessed resource path, which gives the URL feature of that network traffic. In practice, the same URL address and the same resource path occur repeatedly in the network traffic, so to reduce resource usage and speed up training of the traffic identification model, the URL features in the set can be deduplicated, removing repeated URL features and yielding the URL feature set corresponding to the network traffic set.
  • character cleaning can also be performed using a regular expression method.
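  • A minimal sketch of the URL feature extraction and deduplication described above. The text does not give the actual regular expression, so the `RESOURCE_PATH` pattern below (optional scheme and host, then a path up to any query string or fragment) is an assumption, as are the function names.

```python
import re

# Assumed resource-path pattern: an optional "scheme://host[:port]" prefix,
# followed by a path that starts with "/" and stops before "?" or "#".
RESOURCE_PATH = re.compile(r"^(?:[a-zA-Z][a-zA-Z0-9+.-]*://[^/?#]+)?(/[^?#]*)")


def extract_url_feature(url):
    """Return the accessed resource path, or None if no path pattern matches."""
    match = RESOURCE_PATH.match(url)
    return match.group(1) if match else None


def build_url_feature_set(urls):
    """Extract the resource path of every URL and deduplicate via a set."""
    features = set()
    for url in urls:
        path = extract_url_feature(url)
        if path is not None:
            features.add(path)
    return features
```

  • Using a set gives deduplication for free and makes the later membership check on a URL feature a constant-time lookup.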
  • feature extraction is performed on the request body Body parameters of the network traffic collection, including:
  • a Body parameter feature set corresponding to the network traffic set is obtained.
  • the multiple character-related dimensions include string length, number of special characters, proportion of letters, and proportion of numbers, etc.
  • The string length, number of special characters, proportion of letters, and proportion of numbers of the Body parameter of each network traffic can be extracted to obtain the features of the Body parameter in multiple character-related dimensions, and these features can then be combined to obtain the Body parameter feature set corresponding to the network traffic set.
  • the Body parameter is generally a Key-Value parameter structure.
  • the Key value is related to the input parameter type, and the Value value is related to the business category.
  • Extracting the string length in the Body parameter means determining the string length of the Value.
  • In general, the string length of attack network traffic is not fixed and deviates considerably from the string length of normal access network traffic.
  • Extracting the number of special characters in the Body parameter means determining the number of special characters in the Value. If the Value contains special characters such as *, $, %, & and so on, the traffic generally corresponds to attack network traffic.
  • Extracting the proportion of letters in the Body parameter means determining the proportion of letter characters among all characters of the Value; extracting the proportion of numbers in the Body parameter means determining the proportion of numeric characters among all characters of the Value.
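  • The four character-related dimensions can be computed per Value as in the sketch below. The text does not define "special character" precisely, so treating every character that is neither alphanumeric nor whitespace as special is an assumption; the function name and the dictionary keys are also hypothetical.

```python
def body_value_features(value):
    """Compute the character-related features of one Body parameter Value.

    Returns a dict with the string length, the number of special characters
    (assumed here: neither alphanumeric nor whitespace), the proportion of
    letters and the proportion of digits.
    """
    text = str(value)
    length = len(text)
    if length == 0:
        # Avoid division by zero for empty values.
        return {"length": 0, "special_count": 0,
                "letter_ratio": 0.0, "digit_ratio": 0.0}
    letters = sum(ch.isalpha() for ch in text)
    digits = sum(ch.isdigit() for ch in text)
    special = sum((not ch.isalnum()) and (not ch.isspace()) for ch in text)
    return {
        "length": length,
        "special_count": special,
        "letter_ratio": letters / length,
        "digit_ratio": digits / length,
    }
```

  • Applied to the Value 23 of the orderNo parameter from the earlier example, this yields a length of 2, no special characters, a letter proportion of 0 and a digit proportion of 1.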
  • S303 Based on the feature matrix, calculate a traffic identification model corresponding to the service container group; the traffic identification model is used to characterize the aggregated characteristics corresponding to normal access network traffic.
  • a traffic identification model corresponding to the business container group can be calculated.
  • An intermediate traffic identification model corresponding to a single business container can be calculated separately from the feature matrix of the source container and from the feature matrix of the destination container, and the two can then be integrated to obtain the traffic identification model corresponding to the business container group.
  • Alternatively, the feature matrix of the source container and the feature matrix of the destination container can be integrated first to obtain an intermediate feature matrix corresponding to the business container group, from which the traffic identification model corresponding to the business container group is then calculated.
  • the traffic identification model corresponding to the business container group is calculated based on the feature matrix, including:
  • the traffic identification model is formed by using the URL feature set of the network traffic set and the confidence interval of the Body parameter in each feature dimension.
  • The confidence interval of the Body parameters of the network traffic set in each feature dimension can be calculated based on the Body parameter feature set, and the traffic identification model is then formed from the URL feature set of the network traffic set and the confidence interval of the Body parameter in each feature dimension.
  • The Body parameter feature set includes the features of the Body parameters of the network traffic in character-related dimensions such as string length, number of special characters, proportion of letters, and proportion of numbers. These are all numerical features, so the calculation method is the same in every dimension.
  • The mean $\mu$ and standard deviation $\sigma$ of the feature can first be calculated. Then, according to Chebyshev's theorem, given a set of data $\{x_1, x_2, \ldots, x_n\}$ with mean $\mu$ and standard deviation $\sigma$, for any $k > 1$ the proportion of data lying within the interval $[\mu - k\sigma, \mu + k\sigma]$ is $p \geq 1 - 1/k^2$.
  • k is the tolerance. The larger the k value, the greater the probability that the feature value falls within the interval.
  • where $\mu$ is the mean, $\sigma$ is the standard deviation, $k$ is any value ($k > 0$), and $P$ is the probability estimate of sample $X$.
  • In this way, the confidence interval under the current feature dimension, that is, its lower and upper bounds, can be calculated.
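  • A minimal sketch of the per-dimension interval calculation, assuming the interval is taken as $[\mu - k\sigma, \mu + k\sigma]$ from the Chebyshev bound above. The use of the population standard deviation, the default tolerance `k=3.0`, the function names and the row layout (one dictionary of numerical features per piece of traffic) are assumptions for illustration.

```python
from statistics import fmean, pstdev


def confidence_interval(values, k=3.0):
    """Return (mu - k*sigma, mu + k*sigma) for one feature dimension.

    By Chebyshev's inequality, at least 1 - 1/k**2 of the data lies inside
    this interval. k is the tolerance and must be positive.
    """
    if not values:
        raise ValueError("values must not be empty")
    if k <= 0:
        raise ValueError("k must be positive")
    mu = fmean(values)
    sigma = pstdev(values)  # population standard deviation (assumed choice)
    return (mu - k * sigma, mu + k * sigma)


def build_intervals(feature_rows, k=3.0):
    """Compute one interval per feature dimension.

    feature_rows: non-empty list of dicts that all share the same keys.
    """
    dims = list(feature_rows[0].keys())
    return {dim: confidence_interval([row[dim] for row in feature_rows], k)
            for dim in dims}
```

  • A larger `k` widens every interval, which lowers false positives at the cost of letting more unusual values pass; the right tolerance depends on the traffic and is not specified here.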
  • the target traffic identification model corresponding to the target service container can be called from the pre-trained traffic identification model set, and then the network traffic can be detected through the target traffic identification model.
  • detecting whether the network traffic is abnormally accessed network traffic based on the invocation of the target traffic identification model includes:
  • feature extraction can first be performed on the network traffic to obtain a feature vector of the network traffic.
  • the feature extraction performed on the network traffic here is similar to the multi-dimensional feature extraction performed on the network traffic set introduced above.
  • likewise, performing feature extraction separately on the uniform resource locator (URL) address and the request body (Body) parameters in the network traffic is similar to the method described above for the network traffic set, and will not be described again here.
  • the feature vector of the network traffic can be formed from the extracted URL features and the parameter features of the Body parameters in multiple feature dimensions.
  • because the target traffic identification model characterizes the aggregated features of normally accessed network traffic, the feature vector is compared with the target traffic identification model to determine whether the network traffic is abnormally accessed network traffic.
  • the target traffic identification model is formed by using the URL feature set of the network traffic set and the confidence interval of the Body parameter in each feature dimension.
  • the extracted feature vector can be compared and matched against the URL feature set and the confidence interval of the Body parameters under each feature dimension indicated in the target traffic identification model, thereby determining whether the network traffic is abnormally accessed network traffic.
  • based on the URL feature of the network traffic indicated by the feature vector and the parameter features of the Body parameters of the network traffic in multiple feature dimensions, it can be determined whether the URL feature of the network traffic belongs to the URL feature set and whether the parameter feature of the Body parameters in each feature dimension falls within the confidence interval;
  • according to the judgment results and the weight values respectively corresponding to the URL feature and each feature dimension of the Body parameters, the abnormality probability corresponding to the network traffic is determined;
  • when the abnormality probability is greater than the set threshold, the network traffic is determined to be abnormally accessed network traffic; otherwise, the network traffic is determined to be normally accessed network traffic.
  • in this step, it may first be determined, based on the URL feature of the network traffic indicated by the feature vector, whether that URL feature belongs to the URL feature set indicated in the target traffic identification model, and, based on the parameter features of the Body parameters indicated by the feature vector, whether the parameter feature in each feature dimension falls within the confidence interval, so as to obtain the judgment results.
  • if the URL feature of the network traffic indicated by the feature vector does not belong to the URL feature set indicated in the target traffic identification model, it can be determined that the network traffic is abnormal; if a parameter feature of the Body parameters does not fall within the confidence interval, it can likewise be determined that the network traffic is abnormal.
  • according to the judgment results and the weight values, the abnormality probability corresponding to the network traffic is determined through mathematical transformation by a formula in which p i is the feature abnormality determination result (1 for abnormal, 0 for normal) and σ i is the weight of the feature dimension.
  • further, a set threshold configured in advance for the network traffic can be obtained.
  • when the abnormality probability corresponding to the network traffic is greater than the set threshold, the access detection result corresponding to the network traffic is determined to be abnormal access, and the current network traffic is judged to be abnormally accessed network traffic.
  • otherwise, when the abnormality probability corresponding to the network traffic is less than the set threshold, the access detection result is determined to be normal access, and the current network traffic is judged to be normally accessed network traffic.
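The weighting step can be sketched as follows. The text does not reproduce the probability formula itself, so this sketch assumes a weighted sum of the 0/1 abnormality flags normalized by the total weight; the function names, the dimension names, and that normalization are all illustrative assumptions rather than the disclosed method.

```python
from typing import Mapping


def anomaly_probability(flags: Mapping[str, int],
                        weights: Mapping[str, float]) -> float:
    """Weighted share of abnormal feature dimensions, in [0, 1].

    flags[name] is 1 if that dimension was judged abnormal, else 0.
    Normalizing by the total weight is an assumption of this sketch.
    """
    total = sum(weights[name] for name in flags)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[name] * flags[name] for name in flags) / total


def is_abnormal(flags: Mapping[str, int],
                weights: Mapping[str, float],
                threshold: float) -> bool:
    """Abnormal when the probability is strictly greater than the threshold."""
    return anomaly_probability(flags, weights) > threshold
```

With a heavier weight on the URL dimension, a request to an unknown resource path alone can cross the threshold, while a single mildly unusual Body feature may not.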
  • abnormally accessed network traffic can be intercepted to ensure that all accessed network traffic passes detection.
  • the traffic detection method provided by the embodiment of the present disclosure can obtain network traffic and analyze it to obtain network status information related to the network traffic; find the target business container associated with the network traffic based on the network status information; call the target traffic identification model corresponding to the target business container from the pre-trained traffic identification model set; and detect whether the network traffic is abnormally accessed network traffic based on the called target traffic identification model.
  • the target traffic identification model is obtained by training on the network traffic associated with the target business container; when the network traffic is detected to be abnormally accessed network traffic, it is intercepted.
  • in this way, for the business containers of each business form, a traffic identification model matching the business container is first trained on the network traffic corresponding to that container's business information. When network traffic is detected, the business container associated with the network traffic can be found, the target traffic identification model corresponding to that business container can be called from the pre-trained traffic identification model set, and the network traffic can be detected with the called model. Embodiments of the present disclosure can thus use different traffic identification models for business containers with different business meanings, and can perform detection adapted to the business characteristics of each business container. Compared with a traditional unified detection strategy, risk identification and filtering can be performed more accurately, reducing missed reports and false positives of abnormal traffic.
  • FIG. 4 is a flow chart of another traffic detection method provided by an embodiment of the present disclosure.
  • the traffic detection method provided by the embodiment of the present disclosure includes steps S401 to S405, in which:
  • S401: Obtain network traffic, and analyze the network traffic to obtain network status information related to the network traffic.
  • S402: Find the target business container associated with the network traffic according to the network status information, call the target traffic identification model corresponding to the target business container from the pre-trained traffic identification model set, and detect, based on the called target traffic identification model, whether the network traffic is abnormally accessed network traffic; the target traffic identification model is obtained by training on network traffic associated with the target business container.
  • steps S401 to S403 can refer to the description of steps S201 to S203, and can achieve the same technical effects and solve the same technical problems, and will not be described again here.
  • S405: Based on the network traffic generated within the preset time period, re-execute the training process of the target traffic identification model to update the target traffic identification model.
  • the target traffic identification model corresponds to the target business container group, and the target business container group is associated with the network traffic. Therefore, if the source container and destination container in the business container group change, that is, if the target business container generates new network traffic after the target traffic identification model has been trained, the target traffic identification model also needs to be adjusted and updated accordingly, which effectively enhances the robustness of the target traffic identification model.
  • FIG. 5 is a flow chart of a traffic identification model update method provided by an embodiment of the present disclosure.
  • if no corresponding target traffic identification model exists, the model can be trained; the training method is similar to the method introduced above and will not be repeated here. If a corresponding target traffic identification model exists, it can be judged whether the target traffic identification model needs to be updated.
  • specifically, it can be detected whether the identifiers of the images corresponding to the source container and the destination container in the target business container group have changed. If the identifiers have not changed, it is determined that the target traffic identification model does not need to be updated, and the existing target traffic identification model can continue to be used. If the identifiers have changed, it is determined that the target traffic identification model needs to be updated;
  • the target traffic identification model then enters the model adjustment period.
  • the length of the model adjustment period can be set according to the specific business conditions of the target business container group, for example one hour. If the target traffic identification model is within the model adjustment period, it is determined that the model has not deviated, that is, the existing near-source model can continue to be used. If the near-source model is no longer within the model adjustment period, it is determined that the near-source model has deviated; the network traffic generated by the target container within the preset time period can then be obtained, and the training process of the target traffic identification model can be re-executed on that traffic to update the model. It can be understood that the updated target traffic identification model corresponds to the current business container group.
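One reading of the update decision above can be sketched as a small function. The record layout, the action names, and the rule that the adjustment period starts when an image change is first observed are assumptions made for illustration; the disclosure only states that image identifiers are compared and that an adjustment period (for example one hour) applies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelRecord:
    """Hypothetical bookkeeping kept alongside a trained model."""
    source_image_id: str
    dest_image_id: str
    adjustment_started_at: Optional[float] = None  # None: not adjusting


def decide(record: Optional[ModelRecord], source_image_id: str,
           dest_image_id: str, now: float,
           adjustment_seconds: float = 3600.0) -> str:
    """Return 'train', 'keep', or 'retrain' for a business container group."""
    if record is None:
        return "train"  # no model exists yet for this group
    current = (source_image_id, dest_image_id)
    if (record.source_image_id, record.dest_image_id) == current:
        return "keep"  # image identifiers unchanged
    if record.adjustment_started_at is None:
        record.adjustment_started_at = now  # enter the adjustment period
        return "keep"
    if now - record.adjustment_started_at < adjustment_seconds:
        return "keep"  # still inside the adjustment period
    return "retrain"  # period elapsed: retrain on recent traffic
```

After a `'retrain'` result, the caller would retrain on traffic from the preset time window and store a fresh record with the new image identifiers.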
  • the traffic detection method can obtain network traffic and analyze it to obtain network status information related to the network traffic; find the target business container associated with the network traffic based on the network status information; call the target traffic identification model corresponding to the target business container from the pre-trained traffic identification model set; and detect whether the network traffic is abnormally accessed network traffic based on the called target traffic identification model.
  • the target traffic identification model is obtained by training on the network traffic associated with the target business container; when the network traffic is detected to be abnormally accessed network traffic, it is intercepted.
  • in this way, for the business containers of each business form, a traffic identification model matching the business container is first trained on the network traffic corresponding to that container's business information. When network traffic is detected, the business container associated with the network traffic can be found, the target traffic identification model corresponding to that business container can be called from the pre-trained traffic identification model set, and the network traffic can be detected with the called model. Embodiments of the present disclosure can thus use different traffic identification models for business containers with different business meanings, and can perform detection adapted to the business characteristics of each business container. Compared with a traditional unified detection strategy, risk identification and filtering can be performed more accurately, reducing missed reports and false positives of abnormal traffic.
  • access interception can be performed on accurately identified abnormal access network traffic, thereby improving access security, avoiding unnecessary interception, and ensuring normal network access in a container environment.
  • the order in which the steps are written does not imply a strict execution order and does not constitute any limitation on the implementation process.
  • the specific execution order of each step should be determined by its function and possible internal logic.
  • the embodiment of the present disclosure also provides a traffic detection device corresponding to the traffic detection method. Since the principle by which the device solves the problem is similar to that of the traffic detection method described above, the implementation of the device can refer to the implementation of the method, and repeated parts will not be described again.
  • FIG. 6 is a first schematic diagram of a traffic detection device provided by an embodiment of the present disclosure
  • FIG. 7 is a second schematic diagram of a traffic detection device provided by an embodiment of the present disclosure.
  • the traffic detection device 600 provided by the embodiment of the present disclosure includes:
  • the data acquisition module 601 is used to obtain network traffic, and analyze the network traffic to obtain network status information related to the network traffic;
  • the traffic detection module 602 is configured to search for the target service container associated with the network traffic according to the network status information, and call the target traffic identification model corresponding to the target service container from the pre-trained traffic identification model set, based on The called target traffic identification model detects whether the network traffic is abnormally accessed network traffic, and the target traffic identification model is trained based on the network traffic associated with the target business container;
  • the traffic interception module 603 intercepts the network traffic when detecting that the network traffic is abnormally accessed.
  • the network status information includes IP five-tuple information
  • when the traffic detection module 602 is configured to find the target business container associated with the network traffic according to the network status information and to call the target traffic identification model corresponding to the target business container from the pre-trained traffic identification model set, it is specifically used for:
  • determining, according to the source IP address and the destination IP address in the IP five-tuple information, a target business container group associated with the network traffic; the target business container group includes a source container matching the source IP address and a destination container matching the destination IP address;
  • the traffic detection module 602 trains the traffic identification models in the traffic identification model set according to the following steps:
  • according to the business information corresponding to the acquired network traffic and the business information of one or more business containers, the acquired network traffic is aggregated to obtain a network traffic set corresponding to a business container group; the business containers included in each business container group have the same business information;
  • feature extraction in multiple feature dimensions is performed on the network traffic set to obtain the corresponding feature matrix; the feature matrix is composed of the feature vectors corresponding to the network traffic included in the network traffic set;
  • based on the feature matrix, a traffic identification model corresponding to the business container group is calculated; the traffic identification model is used to characterize the aggregated features corresponding to normally accessed network traffic.
  • the device further includes a model update module 604, which is used to:
  • the training process of the target traffic identification model is re-executed to update the target traffic identification model.
  • when the traffic detection module 602 is used to extract features of multiple feature dimensions from the network traffic set to obtain the feature matrix corresponding to the network traffic set, it is specifically used to:
  • when the traffic detection module 602 is used to calculate the traffic identification model corresponding to the business container group based on the feature matrix, it is specifically used to:
  • form the traffic identification model from the URL feature set of the network traffic set and the confidence interval of the Body parameters in each feature dimension.
  • when the traffic detection module 602 is used to extract features from the request body (Body) parameters of the network traffic set, it is specifically used to:
  • obtain a Body parameter feature set corresponding to the network traffic set.
  • when the traffic detection module 602 is used to detect whether the network traffic is abnormally accessed network traffic based on the called target traffic identification model, it is specifically used to:
  • when the traffic detection module 602 is used to determine whether the network traffic is abnormally accessed network traffic based on the extracted feature vector and on the URL feature set and the confidence interval of the Body parameters in each feature dimension indicated in the target traffic identification model, it is specifically used for:
  • the network traffic is determined to be network traffic of abnormal access; otherwise, the network traffic is determined to be network traffic of normal access.
  • FIG. 8 is a schematic structural diagram of an electronic device 800 provided by an embodiment of the present disclosure, which includes:
  • a processor 810, a memory 820, and a bus 830. The memory 820 is used to store execution instructions and includes an internal memory 821 and an external memory 822; the internal memory 821 temporarily stores operation data for the processor 810 and data exchanged with the external memory 822, such as a hard disk, and the processor 810 exchanges data with the external memory 822 through the internal memory 821.
  • when the electronic device 800 runs, the processor 810 and the memory 820 communicate through the bus 830, so that the processor 810 can execute the execution instructions mentioned in the above traffic detection method embodiment.
  • Embodiments of the present disclosure also provide a computer-readable storage medium.
  • a computer program is stored on the computer-readable storage medium. When the computer program is run by a processor, the steps of the traffic detection method described in the above method embodiment are executed.
  • the storage medium may be a volatile or non-volatile computer-readable storage medium.
  • Embodiments of the present disclosure also provide a computer program product.
  • the computer program product includes computer instructions.
  • when the computer instructions are executed, the steps of the traffic detection method described in the above method embodiments can be performed. For details, refer to the above method embodiments, which will not be described again here.
  • the above-mentioned computer program product can be specifically implemented by hardware, software or a combination thereof.
  • the computer program product is embodied as a computer storage medium.
  • the computer program product is embodied as a software product, such as a Software Development Kit (SDK).
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or they may be distributed to multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in various embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • if the functions are implemented in the form of software functional units and sold or used as independent products, they can be stored in a non-volatile computer-readable storage medium executable by a processor.
  • the technical solution of the present disclosure, in essence, or the part that contributes to the existing technology, or a part of the technical solution, can be embodied in the form of a software product.
  • the computer software product is stored in a storage medium and includes several instructions that cause a computer device (which can be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the embodiments of the present disclosure.
  • the aforementioned storage media include USB flash drives, removable hard disks, read-only memory (ROM), random access memory (RAM), magnetic disks, optical disks, and other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

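The present disclosure provides a traffic detection method, apparatus, device, and storage medium. Network traffic can be obtained and parsed to obtain network status information related to the network traffic; according to the network status information, a target business container associated with the network traffic is found, a target traffic identification model corresponding to the target business container is called from a pre-trained set of traffic identification models, whether the network traffic is abnormally accessed network traffic is detected based on the called target traffic identification model, and abnormally accessed network traffic is intercepted. In this way, different traffic identification models can be used for business containers with different business meanings, and detection adapted to the business characteristics of each kind of business container can be performed on that container's network traffic. Compared with a traditional unified detection strategy, risk identification and filtering can be performed more accurately, thereby reducing missed reports and false positives of abnormal traffic.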

Description

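A traffic detection method, apparatus, device, and storage medium
The present disclosure claims priority to Chinese patent application No. 202210468301.1, filed with the China Patent Office on April 29, 2022 and entitled "A traffic detection method, apparatus, device, and storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the field of Internet technology, and in particular to a traffic detection method, apparatus, device, and storage medium.
Background
A container network is an open network architecture. Common network defense schemes are mainly general-purpose defenses, for example matching risky data packets against predefined regular expressions, where a successful match indicates an intrusion risk. This detection approach depends on analyzing the network traffic of historical attacks to form rules, from which the regular expressions used for risk matching are predefined.
However, in a container environment, because containers are organized as microservices, each container has a specific business meaning and generally handles only network requests related to a single business. If a traditional unified security policy is adopted, it may cause a large amount of ineffective filtering in the container environment, and unknown risks cannot be identified.
That is, detecting abnormal network traffic in a container environment in the traditional way suffers from low detection accuracy and is prone to missed reports and false positives.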
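Summary
Embodiments of the present disclosure provide at least a traffic detection method, apparatus, device, and storage medium.
In a first aspect, an embodiment of the present disclosure provides a traffic detection method, the method including:
obtaining network traffic, and parsing the network traffic to obtain network status information related to the network traffic;
according to the network status information, finding a target business container associated with the network traffic, calling a target traffic identification model corresponding to the target business container from a pre-trained set of traffic identification models, and detecting, based on the called target traffic identification model, whether the network traffic is abnormally accessed network traffic, the target traffic identification model being trained on network traffic associated with the target business container;
intercepting the network traffic when it is detected that the network traffic is abnormally accessed network traffic.
In an optional implementation, the network status information includes IP five-tuple information; finding the target business container associated with the network traffic according to the network status information, and calling the target traffic identification model corresponding to the target business container from the pre-trained set of traffic identification models, includes:
determining, according to the source IP address and the destination IP address in the IP five-tuple information, a target business container group associated with the network traffic; the target business container group includes a source container matching the source IP address and a destination container matching the destination IP address;
obtaining, from the pre-trained set of traffic identification models, the target traffic identification model corresponding to the target business container group.
In an optional implementation, the traffic identification models in the set of traffic identification models are trained according to the following steps:
aggregating the obtained network traffic according to the business information corresponding to the obtained network traffic and the business information of one or more business containers, to obtain a network traffic set corresponding to a business container group; the business containers included in each business container group have the same business information;
performing feature extraction in multiple feature dimensions on the network traffic set to obtain a feature matrix corresponding to the network traffic set; the feature matrix is composed of the feature vectors respectively corresponding to the network traffic included in the network traffic set;
calculating, based on the feature matrix, the traffic identification model corresponding to the business container group; the traffic identification model is used to characterize the aggregated features corresponding to normally accessed network traffic.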
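In an optional implementation, after the target traffic identification model corresponding to the target business container is determined, the method further includes:
obtaining network traffic generated within a preset time period;
re-executing, based on the network traffic generated within the preset time period, the training process of the target traffic identification model, so as to update the target traffic identification model.
In an optional implementation, performing feature extraction in multiple feature dimensions on the network traffic set to obtain the feature matrix corresponding to the network traffic set includes:
performing feature extraction separately on the uniform resource locator (URL) addresses and the request body (Body) parameters in the network traffic set, to obtain the URL feature set and the Body parameter feature set contained in the feature matrix;
calculating, based on the feature matrix, the traffic identification model corresponding to the business container group includes:
calculating, based on the Body parameter feature set, the confidence interval of the Body parameters of the network traffic set in each feature dimension;
forming the traffic identification model from the URL feature set of the network traffic set and the confidence interval of the Body parameters in each feature dimension.
In an optional implementation, performing feature extraction on the request body (Body) parameters of the network traffic set includes:
for the network traffic included in the network traffic set, extracting the features of the Body parameters of that network traffic in multiple character-related dimensions;
obtaining the Body parameter feature set corresponding to the network traffic set based on the features, in the multiple character-related dimensions, of the Body parameters of each piece of network traffic in the set.
In an optional implementation, detecting, based on the called target traffic identification model, whether the network traffic is abnormally accessed network traffic includes:
performing feature extraction separately on the URL address and the Body parameters in the network traffic, to obtain the URL feature and the parameter features of the Body parameters in multiple feature dimensions contained in the feature vector of the network traffic;
determining whether the network traffic is abnormally accessed network traffic based on the extracted feature vector and on the URL feature set and the confidence interval of the Body parameters in each feature dimension indicated in the target traffic identification model.
In an optional implementation, determining whether the network traffic is abnormally accessed network traffic based on the extracted feature vector and on the URL feature set and the confidence interval of the Body parameters in each feature dimension indicated in the target traffic identification model includes:
determining, based on the URL feature of the network traffic indicated by the feature vector and the parameter features of the Body parameters of the network traffic in multiple feature dimensions, whether the URL feature of the network traffic belongs to the URL feature set and whether the parameter feature of the Body parameters in each feature dimension falls within the confidence interval;
determining the abnormality probability corresponding to the network traffic according to the judgment results and the weight values respectively corresponding to the URL feature and each feature dimension of the Body parameters;
determining that the network traffic is abnormally accessed network traffic when the abnormality probability is greater than a set threshold, and otherwise determining that the network traffic is normally accessed network traffic.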
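In a second aspect, an embodiment of the present disclosure further provides a traffic detection apparatus, the apparatus including:
a data acquisition module, configured to obtain network traffic and parse the network traffic to obtain network status information related to the network traffic;
a traffic detection module, configured to find, according to the network status information, a target business container associated with the network traffic, call a target traffic identification model corresponding to the target business container from a pre-trained set of traffic identification models, and detect, based on the called target traffic identification model, whether the network traffic is abnormally accessed network traffic, the target traffic identification model being trained on network traffic associated with the target business container;
a traffic interception module, configured to intercept the network traffic when it is detected that the network traffic is abnormally accessed network traffic.
In a third aspect, an embodiment of the present disclosure further provides an electronic device, including a processor, a memory, and a bus. The memory stores machine-readable instructions executable by the processor; when the electronic device runs, the processor and the memory communicate through the bus, and when the machine-readable instructions are executed by the processor, the steps of the traffic detection method of the first aspect, or of any possible implementation of the first aspect, are performed.
In a fourth aspect, an embodiment of the present disclosure further provides a computer-readable storage medium storing a computer program; when the computer program is run by a processor, the steps of the traffic detection method of the first aspect, or of any possible implementation of the first aspect, are performed.
For a description of the effects of the above traffic detection apparatus, electronic device, and computer-readable storage medium, refer to the description of the above traffic detection method, which is not repeated here.
The traffic detection method provided by the embodiments of the present disclosure can obtain network traffic and parse it to obtain network status information related to the network traffic; according to the network status information, find the target business container associated with the network traffic, call the target traffic identification model corresponding to the target business container from the pre-trained set of traffic identification models, and detect, based on the called model, whether the network traffic is abnormally accessed network traffic, the target traffic identification model being trained on network traffic associated with the target business container; and intercept the network traffic when it is detected to be abnormally accessed network traffic.
In this way, in the embodiments of the present disclosure, for the business containers of each business form, a traffic identification model matching the business container is first trained on the network traffic corresponding to that container's business information. When network traffic is detected, the business container associated with the network traffic can be found, the target traffic identification model corresponding to that business container can be called from the pre-trained set of traffic identification models, and the network traffic can be detected with the called model. The embodiments of the present disclosure can thus use different traffic identification models for business containers with different business meanings, and can perform, on the network traffic of each kind of business container, detection adapted to that container's business characteristics. Compared with a traditional unified detection strategy, risk identification and filtering can be performed more accurately, reducing missed reports and false positives of abnormal traffic.
On this basis, access interception can be applied to the accurately identified abnormally accessed network traffic, which improves access security while avoiding unnecessary interception and ensures that network access in the container environment proceeds normally.
To make the above objects, features, and advantages of the present disclosure more apparent and easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.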
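Brief Description of the Drawings
To explain the technical solutions of the embodiments of the present disclosure more clearly, the drawings required by the embodiments are briefly introduced below. The drawings here are incorporated into and form part of this specification; they show embodiments consistent with the present disclosure and, together with the specification, serve to explain its technical solutions. It should be understood that the following drawings show only some embodiments of the present disclosure and should therefore not be regarded as limiting its scope; a person of ordinary skill in the art can obtain other related drawings from these drawings without creative effort.
FIG. 1 shows a schematic diagram of an application scenario provided by an embodiment of the present disclosure;
FIG. 2 shows a flow chart of a traffic detection method provided by an embodiment of the present disclosure;
FIG. 3 shows a flow chart of a traffic identification model training method provided by an embodiment of the present disclosure;
FIG. 4 shows a flow chart of another traffic detection method provided by an embodiment of the present disclosure;
FIG. 5 shows a flow chart of a traffic identification model update method provided by an embodiment of the present disclosure;
FIG. 6 shows a first schematic diagram of a traffic detection apparatus provided by an embodiment of the present disclosure;
FIG. 7 shows a second schematic diagram of a traffic detection apparatus provided by an embodiment of the present disclosure;
FIG. 8 shows a schematic diagram of an electronic device provided by an embodiment of the present disclosure.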
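Detailed Description
To make the objects, technical solutions, and advantages of the embodiments of the present disclosure clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present disclosure. The components of the embodiments generally described and illustrated in the drawings here can be arranged and designed in a variety of configurations. Therefore, the following detailed description of the embodiments provided in the drawings is not intended to limit the scope of the claimed disclosure but merely represents selected embodiments. All other embodiments obtained by those skilled in the art based on the embodiments of the present disclosure without creative effort fall within the scope of protection of the present disclosure.
It should be noted that similar reference numerals and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it does not need to be further defined or explained in subsequent drawings.
The term "and/or" herein merely describes an association relationship and indicates that three relationships may exist; for example, A and/or B may mean that A exists alone, that both A and B exist, or that B exists alone. In addition, the term "at least one" herein means any one of multiple items or any combination of at least two of them; for example, including at least one of A, B, and C may mean including any one or more elements selected from the set formed by A, B, and C.
Research has found that the traditional access detection method uses a unified security policy, that is, the same detection method is applied to all business access in the container environment. However, because containers are organized as microservices, each container has a specific business meaning and generally handles only network requests related to a single business. Applying the same detection method to diverse businesses may therefore cause a large amount of ineffective filtering, and unknown risks may be hard to identify. As a result, detection cannot be targeted at the business characteristics of each container, the accuracy of detecting abnormal network traffic in the container environment is low, missed reports and false positives are likely, and normal network access in the container environment is hard to guarantee.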
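Based on the above research, the present disclosure provides a traffic detection method that can obtain network traffic and parse it to obtain network status information related to the network traffic; according to the network status information, find the target business container associated with the network traffic, call the target traffic identification model corresponding to the target business container from the pre-trained set of traffic identification models, and detect, based on the called model, whether the network traffic is abnormally accessed network traffic, the target traffic identification model being trained on network traffic associated with the target business container; and intercept the network traffic when it is detected to be abnormally accessed network traffic.
In this way, in the embodiments of the present disclosure, for the business containers of each business form, a traffic identification model matching the business container is first trained on the network traffic corresponding to that container's business information. When network traffic is detected, the business container associated with the network traffic can be found, the target traffic identification model corresponding to that business container can be called from the pre-trained set of traffic identification models, and the network traffic can be detected with the called model. The embodiments of the present disclosure can thus use different traffic identification models for business containers with different business meanings, and can perform, on the network traffic of each kind of business container, detection adapted to that container's business characteristics. Compared with a traditional unified detection strategy, risk identification and filtering can be performed more accurately, reducing missed reports and false positives of abnormal traffic.
On this basis, access interception can be applied to the accurately identified abnormally accessed network traffic, which improves access security while avoiding unnecessary interception and ensures that network access in the container environment proceeds normally.
To facilitate understanding of this embodiment, a traffic detection method disclosed in the embodiments of the present disclosure is first introduced in detail. The method is generally executed by a computer device with a certain computing capability, for example a terminal device, a server, or another processing device. In some possible implementations, the traffic detection method can be implemented by a processor calling computer-readable instructions stored in a memory.
The traffic detection method provided by the embodiments of the present disclosure is described below.
Refer to FIG. 1, which is a schematic diagram of an application scenario provided by an embodiment of the present disclosure. As shown in FIG. 1, to train a traffic identification model, training can be performed on the business information of each business container sent by the container orchestration system, together with sample network traffic and its corresponding business information. Once a traffic identification model has been trained, a business container can send the network traffic to be detected to the model, so that the model can detect whether the network traffic is abnormally accessed network traffic.
Refer to FIG. 2, which is a flow chart of a traffic detection method provided by an embodiment of the present disclosure. As shown in FIG. 2, the traffic detection method provided by the embodiment of the present disclosure includes steps S201 to S204, in which: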
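S201: Obtain network traffic, and parse the network traffic to obtain network status information related to the network traffic.
Here, the network status information related to the network traffic may include Internet Protocol (IP) five-tuple information, the uniform resource locator (URL) address, request body (Body) parameter information, and so on.
In practice, the network traffic is generally in the form of Transmission Control Protocol (TCP) data packets. Deep Packet Inspection (DPI) can be used to parse the TCP messages into Hypertext Transfer Protocol (HTTP) messages, from which the network status information related to the network traffic is obtained.
In some possible implementations, the network traffic can be obtained through a network hook.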
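The parsing in S201 can be illustrated with a minimal sketch that turns an already reassembled HTTP request payload into the network status fields named above. Packet capture, TCP reassembly, and DPI are outside the sketch; the class names, field names, and port numbers are illustrative assumptions, not part of the disclosure.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class FiveTuple:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str


@dataclass(frozen=True)
class NetworkState:
    five_tuple: FiveTuple
    url_path: str
    body: str


def parse_http_request(five_tuple: FiveTuple, payload: bytes) -> NetworkState:
    """Split a raw HTTP/1.x request into request-target path and body."""
    head, _, body = payload.partition(b"\r\n\r\n")
    request_line = head.split(b"\r\n", 1)[0].decode("latin-1")
    parts = request_line.split(" ")
    if len(parts) != 3:
        raise ValueError("not an HTTP/1.x request line")
    _method, target, _version = parts
    return NetworkState(
        five_tuple=five_tuple,
        url_path=urlsplit(target).path,  # drop query string and fragment
        body=body.decode("utf-8", errors="replace"),
    )
```

The resulting `NetworkState` carries the three inputs used later in the method: the five-tuple for container lookup, and the URL path and Body for feature extraction.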
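S202: According to the network status information, find the target business container associated with the network traffic, call the target traffic identification model corresponding to the target business container from the pre-trained set of traffic identification models, and detect, based on the called target traffic identification model, whether the network traffic is abnormally accessed network traffic; the target traffic identification model is trained on network traffic associated with the target business container.
Here, the pre-trained set of traffic identification models includes multiple traffic identification models, and each model is stored together with its corresponding business container. The target traffic identification model corresponding to the target business container can therefore be determined from the target business container associated with the network traffic, and traffic detection can subsequently be performed with that model.
Accordingly, when the network status information includes IP five-tuple information, in order to call the target traffic identification model corresponding to the target business container from the pre-trained set, in some possible implementations, finding the target business container associated with the network traffic according to the network status information and calling the corresponding target traffic identification model includes:
determining, according to the source IP address and the destination IP address in the IP five-tuple information, a target business container group associated with the network traffic; the target business container group includes a source container matching the source IP address and a destination container matching the destination IP address;
obtaining, from the pre-trained set of traffic identification models, the target traffic identification model corresponding to the target business container group.
In this step, when the network status information includes IP five-tuple information, the source IP address and the destination IP address in the five-tuple form a pair of IP addresses with a business access relationship, so a group of business containers with a business access relationship can be determined, namely the target business container group associated with the network traffic. The target business container group includes a source container matching the source IP address and a destination container matching the destination IP address. Each target business container group corresponds to one pre-trained traffic identification model, so once the target business container group associated with the network traffic has been determined, the corresponding target traffic identification model can be selected from the pre-trained set.
Optionally, in some possible implementations, to determine the target business container group associated with the network traffic, not only the source and destination IP addresses in the IP five-tuple information but also the URL request indicated by the URL address and the Body parameters can be obtained. These help determine the business characteristics of the target business container group and improve the accuracy of subsequently determining the matching target traffic identification model.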
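As an example, for a given piece of network traffic, the data shown in Table 1 below can be extracted from it:
Table 1

Specifically, taking the first row of Table 1 as an example, the source IP address 10.224.41.163 and the destination IP address 10.224.60.24 can be extracted from the IP five-tuple information of the network traffic, from which the source container trade matching the source IP address and the destination container order matching the destination IP address can be determined. At the same time, the URL request /order/detail indicated by the URL address and the Body parameter {orderNo:23} can be extracted from the network traffic, from which it can be determined that the source container has trading characteristics and the destination container has order characteristics; together they form a target business container group with an order-trading business access relationship. Once the target business container group has been determined, the corresponding target traffic identification model can be selected from the pre-trained set of traffic identification models.
In this way, dividing business containers into groups achieves near-source aggregation of business containers within a business scenario. Each business container group corresponds to one pre-trained traffic identification model, which enables targeted detection of near-source business traffic and precise protection in the container network environment.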
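The lookup described above can be sketched as two dictionary lookups: IP addresses to containers, then container group to model. The mapping structures and the model placeholder are assumptions for illustration; in practice the IP-to-container mapping would come from the container orchestration system.

```python
from typing import Dict, Optional, Tuple

# (source container, destination container)
ContainerGroup = Tuple[str, str]


def find_container_group(ip_to_container: Dict[str, str],
                         src_ip: str, dst_ip: str) -> Optional[ContainerGroup]:
    """Map the five-tuple's source and destination IPs to a container group."""
    src = ip_to_container.get(src_ip)
    dst = ip_to_container.get(dst_ip)
    if src is None or dst is None:
        return None  # traffic does not belong to a known container pair
    return (src, dst)


def select_model(models: Dict[ContainerGroup, dict],
                 group: Optional[ContainerGroup]) -> Optional[dict]:
    """Pick the pre-trained model for the group, if one exists."""
    if group is None:
        return None
    return models.get(group)
```

Using the addresses from the example row, `10.224.41.163` and `10.224.60.24` would resolve to the group `("trade", "order")` and from there to that group's model.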
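Further, when the source container accesses the destination container, both containers have multiple ports. To determine the ports used for business access, the source port and destination port in the IP five-tuple information can be determined, so that network traffic is transmitted between the source port of the source container and the destination port of the destination container to carry out the business access.
The traffic identification models in the set described above are trained in advance. FIG. 3 is a flow chart of a traffic identification model training method provided by an embodiment of the present disclosure, which includes steps S301 to S303:
S301: According to the business information corresponding to the obtained network traffic and the business information of each business container, aggregate the obtained network traffic to obtain a network traffic set corresponding to each business container group; the business containers in each business container group have the same business information.
In this step, when model training is required, the network traffic used for training and its corresponding business information can be obtained, along with the business information of each business container. The business containers in each business container group have the same business information; identical business information can be identified by comparison, so that for each business container group, the obtained network traffic corresponding to the same business information can be aggregated into a network traffic set, that is, the network traffic set corresponding to that business container group.
The business information of each business container can be obtained through the application programming interface (API) of the container orchestration system; the business information includes the business attribute information of each business container.
As an example, the business attribute information of each business container, such as nginx, mysql, or kafka, can be obtained from the Deployment or ReplicaSet information in the K8S container orchestration system.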
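The aggregation in S301 can be sketched as grouping traffic records by the business information of their endpoints, so that replicas of the same workload contribute to one traffic set. The record fields and the `ip_to_business` mapping are hypothetical; how the mapping is filled from Deployment or ReplicaSet data is not shown.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def aggregate_by_group(records: Iterable[dict],
                       ip_to_business: Dict[str, str]
                       ) -> Dict[Tuple[str, str], List[dict]]:
    """Group traffic by (source business, destination business).

    Records whose endpoints have no known business information are skipped.
    """
    groups: Dict[Tuple[str, str], List[dict]] = defaultdict(list)
    for record in records:
        src = ip_to_business.get(record["src_ip"])
        dst = ip_to_business.get(record["dst_ip"])
        if src is None or dst is None:
            continue
        groups[(src, dst)].append(record)
    return dict(groups)
```

Because the key is the business information rather than the IP address, two `trade` replicas with different addresses feed the same `("trade", "order")` traffic set.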
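S302: Perform feature extraction in multiple feature dimensions on the network traffic set to obtain the feature matrix corresponding to the network traffic set; the feature matrix is composed of the feature vectors respectively corresponding to each piece of network traffic in the set.
In this step, once the network traffic set has been generated, feature extraction in multiple feature dimensions can be performed on it to obtain feature vectors in those dimensions, and the feature vectors can be combined into the feature matrix corresponding to the network traffic set.
Specifically, in some possible implementations, feature extraction can be performed separately on the uniform resource locator (URL) addresses and the request body (Body) parameters in the network traffic set, to obtain the URL feature set and the Body parameter feature set contained in the feature matrix.
It can be understood that the feature matrix corresponding to the network traffic set can be formed from the extracted URL feature set and Body parameter feature set.
In some possible implementations, feature extraction on the URL addresses in the network traffic set can use regular expressions to extract the accessed resource path from the URL address of each piece of network traffic, giving the URL feature of each piece of network traffic; the URL features of all network traffic in the set are then deduplicated to obtain the URL feature set corresponding to the network traffic set.
In this step, a regular expression can be used to check whether the URL address of each piece of network traffic contains a string matching the resource path pattern. If so, the URL address can be filtered to extract the accessed resource path, giving the URL feature of that network traffic. In practice, different network traffic often shares the same URL address and resource path, so to reduce resource usage and speed up model training, the URL features of the network traffic in the set can be deduplicated, removing repeated URL features, to obtain the URL feature set corresponding to the network traffic set.
Further, in some possible implementations, random characters in the URL address can also be cleaned using regular expressions.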
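The URL feature extraction can be sketched with two regular expressions: one to pull the resource path out of the URL address, one to clean random-looking path segments. The concrete patterns, the `{id}` placeholder, and the rule that purely numeric or long hexadecimal segments count as random are assumptions of this sketch; the disclosure does not specify them.

```python
import re
from typing import Iterable, Set

# Optional scheme and authority, then the resource path up to '?' or '#'.
_URL_PATH = re.compile(r"^(?:[a-z][a-z0-9+.\-]*://[^/?#]*)?(/[^?#]*)?",
                       re.IGNORECASE)
# Segments treated as random: all digits, or 16+ hexadecimal characters.
_RANDOM_SEGMENT = re.compile(r"^(?:\d+|[0-9a-f]{16,})$", re.IGNORECASE)


def url_feature(url: str) -> str:
    """Extract the resource path and replace random segments with '{id}'."""
    match = _URL_PATH.match(url)
    path = (match.group(1) or "") if match else ""
    segments = [
        "{id}" if _RANDOM_SEGMENT.match(segment) else segment
        for segment in path.split("/") if segment
    ]
    return "/" + "/".join(segments)


def url_feature_set(urls: Iterable[str]) -> Set[str]:
    """Deduplicated URL features of a traffic set (a set removes repeats)."""
    return {url_feature(url) for url in urls}
```

Collapsing identifiers before deduplication keeps the feature set small even when every request carries a different order or session number in its path.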
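Correspondingly, in some possible implementations, performing feature extraction on the request body (Body) parameters of the network traffic set includes:
for each access data packet in the network traffic set, extracting the features of the Body parameters of that network traffic in multiple character-related dimensions;
obtaining the Body parameter feature set corresponding to the network traffic set based on the features, in the multiple character-related dimensions, of the Body parameters of each piece of network traffic in the set.
Here, the character-related dimensions include string length, number of special characters, proportion of letters, proportion of digits, and so on. For each piece of network traffic in the set, the string length, number of special characters, proportion of letters, and proportion of digits of its Body parameters can be extracted to obtain the features of that traffic's Body parameters in the character-related dimensions; the features in these dimensions can then be combined to obtain the Body parameter feature set corresponding to the network traffic set.
Here, Body parameters generally have a Key-Value structure, in which the Key relates to the input parameter type and the Value relates to the business domain. To improve the effectiveness and business relevance of feature extraction, features can be extracted mainly from the Values in the Body parameters.
Extracting the string length of the Body parameters means determining the string length of the Value. In practice, the string length of attack traffic is generally not fixed and deviates considerably from that of normally accessed network traffic.
Extracting the number of special characters in the Body parameters means determining the number of special characters in the Value. If the Value contains special characters such as *, $, %, or &, it generally corresponds to attack traffic.
Extracting the proportion of letters in the Body parameters means determining the share of letter characters among all characters of the Value; extracting the proportion of digits means determining the share of digit characters among all characters of the Value.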
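The four character-related features of a Body Value can be sketched as below. Treating every character that is neither alphanumeric nor whitespace as "special" is an assumption of this sketch (the text only lists *, $, %, & as examples), and the dimension names are illustrative.

```python
from typing import Dict, List, Mapping


def value_features(value: str) -> Dict[str, float]:
    """Character-related features of one Body parameter Value."""
    length = len(value)
    letters = sum(ch.isalpha() for ch in value)
    digits = sum(ch.isdigit() for ch in value)
    # Assumption: anything that is not alphanumeric or whitespace is special.
    special = sum(not (ch.isalnum() or ch.isspace()) for ch in value)
    return {
        "length": length,
        "special_chars": special,
        "letter_ratio": letters / length if length else 0.0,
        "digit_ratio": digits / length if length else 0.0,
    }


def body_features(body: Mapping[str, object]) -> List[Dict[str, float]]:
    """Features for every Value of an already parsed Key-Value Body."""
    return [value_features(str(value)) for value in body.values()]
```

For a normal Value such as `23` the digit ratio is 1.0 with no special characters, whereas an injection-style Value is longer and contains several special characters, which is the contrast the features are meant to capture.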
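S303: Based on the feature matrix, calculate the traffic identification model corresponding to the business container group; the traffic identification model is used to characterize the aggregated features corresponding to normally accessed network traffic.
In this step, the traffic identification model corresponding to the business container group can be calculated from the feature matrix corresponding to the source container and the feature matrix corresponding to the destination container in the group.
Here, in some possible implementations, an intermediate traffic identification model for each single business container can first be calculated from the feature matrix of the source container and that of the destination container respectively, and these can then be combined into the traffic identification model corresponding to the business container group.
Optionally, in other possible implementations, the feature matrix of the source container and that of the destination container can first be combined into an intermediate feature matrix for the business container group, from which the traffic identification model corresponding to the group is then calculated.
Specifically, in some possible implementations, calculating the traffic identification model corresponding to the business container group based on the feature matrix includes:
calculating, based on the Body parameter feature set, the confidence interval of the Body parameters of the network traffic set in each feature dimension;
forming the traffic identification model from the URL feature set of the network traffic set and the confidence interval of the Body parameters in each feature dimension.
In this step, once the URL feature set and the Body parameter feature set corresponding to the network traffic set have been determined, the confidence interval of the Body parameters of the network traffic set in each feature dimension can be calculated from the Body parameter feature set, and the traffic identification model can then be formed from the URL feature set and these confidence intervals.
As described above, the Body parameter feature set includes the features of the Body parameters of the network traffic in character-related dimensions such as string length, number of special characters, proportion of letters, and proportion of digits. These are all numerical features, so the calculation method is the same in each dimension.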
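Specifically, in some possible implementations, to calculate the confidence interval of the Body parameters in each feature dimension, the mean μ and standard deviation σ of the features in that dimension can be calculated first. According to Chebyshev's theorem, given a set of data {x1, x2, …, xn} with mean μ and standard deviation σ, for any k ≥ 1 the proportion of data located in the interval [μ − k·σ, μ + k·σ] is p ≥ 1 − 1/k². Here, k is the tolerance: the larger the k value, the greater the probability that a feature value falls within the interval. If the current feature value does not fall within the interval, the feature in that dimension is less similar to normal features, and the corresponding access data packet is more likely to be abnormal. Through mathematical transformation, the confidence threshold ρ = μ ± k·σ is obtained,
where μ is the mean, σ is the standard deviation, k is any value (k > 0), and P is the probability estimate of sample X.
By setting the value of k, the confidence interval ρ = [ρ1, ρ2] in the current feature dimension can be calculated; the larger the k value is set, the lower the tolerance for abnormal access.
Further, once the confidence interval of the Body parameters in each feature dimension has been determined, they can be combined into the integrated confidence interval of the Body parameters, P = {ρ1, ρ2, …, ρj}, so that the traffic identification model Y = {U, P} can be formed from the URL feature set U of the network traffic set and the integrated confidence interval P of the Body parameters.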
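The construction of Y = {U, P} can be sketched as follows, using the interval [μ − k·σ, μ + k·σ] per feature dimension. Using the population standard deviation, representing the model as a dictionary with keys `"U"` and `"P"`, and the default k = 3 are assumptions of this sketch.

```python
from statistics import fmean, pstdev
from typing import Dict, Iterable, List, Sequence, Tuple

Interval = Tuple[float, float]


def confidence_interval(values: Sequence[float], k: float = 3.0) -> Interval:
    """Chebyshev-style interval [mu - k*sigma, mu + k*sigma]."""
    if k <= 0:
        raise ValueError("k must be positive")
    mu = fmean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    return (mu - k * sigma, mu + k * sigma)


def build_model(url_features: Iterable[str],
                body_rows: List[Dict[str, float]],
                k: float = 3.0) -> dict:
    """Form Y = {U, P}: URL feature set plus one interval per dimension."""
    dimensions = body_rows[0].keys()
    intervals = {
        dim: confidence_interval([row[dim] for row in body_rows], k)
        for dim in dimensions
    }
    return {"U": set(url_features), "P": intervals}
```

By Chebyshev's theorem, with k = 3 at least 1 − 1/9 of the training values in each dimension lie inside the stored interval, regardless of how the feature is distributed.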
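As described above, the target traffic identification model corresponding to the target business container can be called from the pre-trained set of traffic identification models, and the network traffic can then be detected with that model.
Correspondingly, in some possible implementations, detecting, based on the called target traffic identification model, whether the network traffic is abnormally accessed network traffic includes:
performing feature extraction separately on the URL address and the Body parameters in the network traffic, to obtain the URL feature and the parameter features of the Body parameters in multiple feature dimensions contained in the feature vector of the network traffic;
determining whether the network traffic is abnormally accessed network traffic based on the extracted feature vector and on the URL feature set and the confidence interval of the Body parameters in each feature dimension indicated in the target traffic identification model.
In this step, feature extraction can first be performed on the network traffic to obtain its feature vector. The feature extraction here is similar to the multi-dimensional feature extraction on the network traffic set described above, and extracting features separately from the URL address and the Body parameters in the network traffic is similar to the corresponding extraction on the network traffic set, so it is not repeated here.
It can be understood that the feature vector of the network traffic can be formed from the extracted URL feature and the parameter features of the Body parameters in multiple feature dimensions.
Then, once the feature vector has been extracted from the network traffic and the target traffic identification model has been obtained, because the target traffic identification model characterizes the aggregated features of normally accessed network traffic, the feature vector is compared with the model to determine whether the network traffic is abnormally accessed network traffic.
As described above, the target traffic identification model is formed from the URL feature set of the network traffic set and the confidence interval of the Body parameters in each feature dimension.
Therefore, the extracted feature vector can be compared and matched against the URL feature set and the confidence interval of the Body parameters in each feature dimension indicated in the target traffic identification model, thereby determining whether the network traffic is abnormally accessed network traffic.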
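Specifically, in some possible implementations, it may be determined, based on the URL feature of the network traffic indicated by the feature vector and the parameter features of its Body parameters in multiple feature dimensions, whether the URL feature of the network traffic belongs to the URL feature set and whether the parameter feature of the Body parameters in each feature dimension falls within the corresponding confidence interval;
the anomaly probability corresponding to the network traffic is determined according to the results of these checks and the weight values corresponding to the URL feature and to each feature dimension of the Body parameters;
when the anomaly probability is greater than a set threshold, the network traffic is determined to be abnormal-access network traffic; otherwise, it is determined to be normal-access network traffic.
In this step, it is first determined, based on the URL feature indicated by the feature vector, whether the URL feature of the network traffic belongs to the URL feature set indicated in the target traffic identification model, and, based on the parameter features indicated by the feature vector, whether the parameter feature of the Body parameters in each feature dimension falls within the corresponding confidence interval. This yields the check results. If the URL feature of the network traffic does not belong to the URL feature set indicated in the target traffic identification model, the network traffic is considered anomalous in that respect; if a parameter feature of the Body parameters does not fall within its confidence interval, the network traffic is likewise considered anomalous in that dimension.
The anomaly probability corresponding to the network traffic is then computed, by mathematical transformation, from the check results and the weight values of the URL feature and of each feature dimension of the Body parameters, using a formula in which $p_i$ is the anomaly decision for feature $i$ (1 for abnormal, 0 for normal) and $\sigma_i$ is the weight of that feature dimension.
Further, a set threshold configured in advance for the network traffic can be obtained. When the anomaly probability corresponding to the network traffic is greater than the set threshold, the access detection result is abnormal access, and the current network traffic is judged to be abnormal-access network traffic; otherwise, the access detection result is normal access, and the current network traffic is judged to be normal-access network traffic.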
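The decision procedure can be sketched as follows. The exact formula for the anomaly probability is not reproduced in this text, so the sketch assumes one plausible form, a weighted sum of the per-feature decisions $p_i$ normalized by the total weight; the function names, the `"url"` key, and the default threshold of 0.5 are likewise assumptions for illustration only.

```python
def anomaly_probability(url_feature, body_features, url_set, intervals, weights):
    """Weighted share of features judged abnormal (assumed form: sum(w_i * p_i) / sum(w_i))."""
    # p_i = 1 when the feature is abnormal, 0 when it is normal.
    flags = {"url": 0 if url_feature in url_set else 1}
    for dim, (low, high) in intervals.items():
        flags[dim] = 0 if low <= body_features[dim] <= high else 1
    total_weight = sum(weights[name] for name in flags)
    return sum(weights[name] * flag for name, flag in flags.items()) / total_weight


def is_abnormal(probability, threshold=0.5):
    """Abnormal only when the probability is strictly greater than the set threshold."""
    return probability > threshold
```

A request whose URL is known but whose Body length falls outside its interval would, with weights of 0.5 for the URL and 0.25 for each of two Body dimensions, score 0.25 and pass a threshold of 0.5; a request that fails all three checks would score 1.0 and be flagged.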
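S203: When the network traffic is detected to be abnormal-access network traffic, intercept the network traffic.
Here, to safeguard the security and stability of data access, abnormal-access network traffic can be intercepted, so that all network traffic allowed through has passed detection.
The traffic detection method provided by the embodiments of the present disclosure can acquire network traffic and parse it to obtain the network state information related to the network traffic; according to the network state information, find the target service container associated with the network traffic, retrieve the target traffic identification model corresponding to the target service container from the set of pre-trained traffic identification models, and detect, based on the retrieved model, whether the network traffic is abnormal-access network traffic, the target traffic identification model having been trained on network traffic associated with the target service container; and intercept the network traffic when it is detected to be abnormal-access network traffic.
In this way, in the embodiments of the present disclosure, a matching traffic identification model is first trained for the service container of each business type, based on the network traffic corresponding to that container's business information. During detection, the service container associated with the network traffic is found, the corresponding target traffic identification model is retrieved from the set of pre-trained models, and the network traffic is checked with that model. The embodiments can thus use different traffic identification models for service containers with different business meanings and perform detection adapted to the business characteristics of each service container. Compared with a traditional uniform detection policy, this allows more accurate risk identification and filtering, and reduces missed and false reports of abnormal traffic.
On this basis, the accurately identified abnormal-access network traffic can be intercepted, which improves access security while avoiding unnecessary interception, and keeps network access in the container environment running normally.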
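Referring to FIG. 4, FIG. 4 is a flowchart of another traffic detection method provided by an embodiment of the present disclosure. As shown in FIG. 4, the traffic detection method provided by the embodiment includes steps S401 to S405, in which:
S401: Acquire network traffic, and parse the network traffic to obtain the network state information related to the network traffic.
S402: According to the network state information, find the target service container associated with the network traffic, retrieve the target traffic identification model corresponding to the target service container from the set of pre-trained traffic identification models, and detect, based on the retrieved target traffic identification model, whether the network traffic is abnormal-access network traffic, the target traffic identification model having been trained on network traffic associated with the target service container.
S403: When the network traffic is detected to be abnormal-access network traffic, intercept the network traffic.
For steps S401 to S403, reference may be made to the description of steps S201 to S203; they achieve the same technical effects and solve the same technical problems, and are not repeated here.
S404: Acquire the network traffic generated within a preset time period.
S405: Based on the network traffic generated within the preset time period, re-execute the training process of the target traffic identification model, so as to update the target traffic identification model.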
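As described above, the target traffic identification model corresponds to the target service container group, and the target service container group is associated with the network traffic. Therefore, if the source container and destination container in the service container group change, that is, if the target service container generates network traffic again after the target traffic identification model has been trained, the target traffic identification model needs to be adjusted and updated accordingly, which effectively strengthens the robustness of the model.
Referring also to FIG. 5, which is a flowchart of a traffic identification model update method provided by an embodiment of the present disclosure. As shown in FIG. 5, when the source container and destination container in the target service container group are detected to have started, that is, when business access between them is detected, it can be checked whether a target traffic identification model currently exists for the target service container group. If no such model exists, model training can be performed, in a manner similar to that described above, which is not repeated here. If such a model exists, it can be determined whether the model needs to be updated. Specifically, it can be checked whether the identifiers of the images corresponding to the source container and destination container in the target service container group have changed. If the identifiers have not changed, the target traffic identification model does not need to be updated and the existing model can continue to be used. If the identifiers have changed, the target traffic identification model needs to be updated.
The target traffic identification model then enters a model adjustment period, whose length can be set according to the specific business situation of the target service container group, for example one hour. While the target traffic identification model is within the model adjustment period, it is considered not to have deviated, and the existing model can continue to be used. Once the target traffic identification model is no longer within the model adjustment period, it is considered to have deviated. In that case, the network traffic generated by the target container within a preset time period can be acquired, and the training process of the target traffic identification model is re-executed on that traffic to update the model. It will be understood that the updated target traffic identification model corresponds to the current service container group.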
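The update decision described for FIG. 5 can be summarized in a small decision function. This is a sketch under stated assumptions: the `Action` enum, the function name `decide_model_action`, the representation of image identifiers as comparable values, and measuring the adjustment period in seconds since the identifier change was observed are all hypothetical; only the one-hour example period comes from the text.

```python
from enum import Enum


class Action(Enum):
    TRAIN = "train"      # no model exists yet for the container group
    KEEP = "keep"        # keep using the existing model
    RETRAIN = "retrain"  # re-run training on traffic from the preset time period


def decide_model_action(model_exists, trained_image_ids, current_image_ids,
                        seconds_since_change=0.0, adjustment_period=3600.0):
    """Decide what to do with a container group's model when its containers start."""
    if not model_exists:
        return Action.TRAIN
    if trained_image_ids == current_image_ids:
        return Action.KEEP  # image identifiers unchanged: no update needed
    if seconds_since_change < adjustment_period:
        return Action.KEEP  # still inside the model adjustment period
    return Action.RETRAIN   # adjustment period over: treat the model as deviated
```

Here `trained_image_ids` and `current_image_ids` could be, for instance, tuples of the source and destination container image identifiers recorded at training time and at container start.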
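The traffic detection method provided by the embodiments of the present disclosure can acquire network traffic and parse it to obtain the network state information related to the network traffic; according to the network state information, find the target service container associated with the network traffic, retrieve the target traffic identification model corresponding to the target service container from the set of pre-trained traffic identification models, and detect, based on the retrieved model, whether the network traffic is abnormal-access network traffic, the target traffic identification model having been trained on network traffic associated with the target service container; and intercept the network traffic when it is detected to be abnormal-access network traffic.
In this way, in the embodiments of the present disclosure, a matching traffic identification model is first trained for the service container of each business type, based on the network traffic corresponding to that container's business information. During detection, the service container associated with the network traffic is found, the corresponding target traffic identification model is retrieved from the set of pre-trained models, and the network traffic is checked with that model. The embodiments can thus use different traffic identification models for service containers with different business meanings and perform detection adapted to the business characteristics of each service container. Compared with a traditional uniform detection policy, this allows more accurate risk identification and filtering, and reduces missed and false reports of abnormal traffic.
Further, the accurately identified abnormal-access network traffic can be intercepted, which improves access security while avoiding unnecessary interception, and keeps network access in the container environment running normally.
Those skilled in the art will understand that, in the above methods of the specific embodiments, the order in which the steps are written does not imply a strict order of execution and places no limitation on the implementation; the specific order of execution of the steps should be determined by their functions and possible internal logic.
Based on the same inventive concept, the embodiments of the present disclosure also provide a traffic detection apparatus corresponding to the traffic detection method. Because the apparatus solves the problem on a principle similar to that of the above traffic detection method, the implementation of the apparatus can refer to the implementation of the method, and repeated description is omitted.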
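Referring to FIG. 6 and FIG. 7, FIG. 6 is a first schematic diagram of a traffic detection apparatus provided by an embodiment of the present disclosure, and FIG. 7 is a second schematic diagram of the traffic detection apparatus. As shown in FIG. 6, the traffic detection apparatus 600 provided by the embodiment includes:
a data acquisition module 601, configured to acquire network traffic and parse the network traffic to obtain the network state information related to the network traffic;
a traffic detection module 602, configured to find, according to the network state information, the target service container associated with the network traffic, retrieve the target traffic identification model corresponding to the target service container from the set of pre-trained traffic identification models, and detect, based on the retrieved target traffic identification model, whether the network traffic is abnormal-access network traffic, the target traffic identification model having been trained on network traffic associated with the target service container;
a traffic interception module 603, configured to intercept the network traffic when the network traffic is detected to be abnormal-access network traffic.
In an optional implementation, the network state information includes IP five-tuple information. When finding, according to the network state information, the target service container associated with the network traffic and retrieving the corresponding target traffic identification model from the set of pre-trained traffic identification models, the traffic detection module 602 is specifically configured to:
determine, according to the source IP address and destination IP address in the IP five-tuple information, the target service container group associated with the network traffic, the target service container group including a source container matching the source IP address and a destination container matching the destination IP address;
obtain, from the set of pre-trained traffic identification models, the target traffic identification model corresponding to the target service container group.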
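The lookup performed by the traffic detection module can be sketched as two dictionary lookups. The `FiveTuple` type, the `ip_to_container` mapping, and keying the model set by a `(source container, destination container)` pair are assumptions of this example; the disclosure does not specify how the container group or the model set is stored.

```python
from typing import NamedTuple


class FiveTuple(NamedTuple):
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str


def find_target_model(five_tuple, ip_to_container, models):
    """Return the model for the (source, destination) container group, or None if unknown."""
    source = ip_to_container.get(five_tuple.src_ip)
    destination = ip_to_container.get(five_tuple.dst_ip)
    if source is None or destination is None:
        return None  # traffic is not associated with a known container group
    return models.get((source, destination))
```

How traffic without a matching container group or model is handled (for example, passed through or used to trigger training) is a policy decision outside this sketch.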
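In an optional implementation, the traffic detection module 602 trains the traffic identification models in the set of traffic identification models according to the following steps:
aggregating the acquired network traffic, according to the business information corresponding to the acquired network traffic and the business information of one or more service containers, to obtain the network traffic set corresponding to each service container group, the service containers included in each service container group having the same business information;
performing feature extraction in multiple feature dimensions on the network traffic set to obtain the feature matrix corresponding to the network traffic set, the feature matrix being composed of the feature vectors corresponding to the individual pieces of network traffic in the set;
computing, based on the feature matrix, the traffic identification model corresponding to the service container group, the traffic identification model characterizing the aggregated features of normal-access network traffic.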
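The aggregation step can be sketched as a grouping of traffic records by the business information of the container they belong to. The record layout (a dictionary with a `"container"` key), the `container_service` mapping from container name to business label, and the function name are assumptions of this example.

```python
from collections import defaultdict


def aggregate_by_service(records, container_service):
    """Group traffic records into one traffic set per business label (container group)."""
    groups = defaultdict(list)
    for record in records:
        service = container_service.get(record["container"])
        if service is not None:  # skip traffic from containers with no known business info
            groups[service].append(record)
    return dict(groups)
```

Each resulting list is the network traffic set of one service container group, from which the feature matrix and then the model for that group would be computed.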
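In an optional implementation, the apparatus further includes a model update module 604, the model update module 604 being configured to:
acquire the network traffic generated within a preset time period;
re-execute, based on the network traffic generated within the preset time period, the training process of the target traffic identification model, so as to update the target traffic identification model.
In an optional implementation, when performing feature extraction in multiple feature dimensions on the network traffic set to obtain the feature matrix corresponding to the network traffic set, the traffic detection module 602 is specifically configured to:
perform feature extraction separately on the uniform resource locator (URL) addresses and the request-body (Body) parameters in the network traffic set, to obtain the URL feature set and the Body parameter feature set contained in the feature matrix;
when computing, based on the feature matrix, the traffic identification model corresponding to the service container group, the traffic detection module 602 is specifically configured to:
compute, based on the Body parameter feature set, the confidence interval of the Body parameters of the network traffic set in each feature dimension;
form the traffic identification model from the URL feature set of the network traffic set and the confidence interval of the Body parameters in each feature dimension.
In an optional implementation, when performing feature extraction on the request-body (Body) parameters of the network traffic set, the traffic detection module 602 is specifically configured to:
for each piece of network traffic in the network traffic set, extract features of the Body parameters of that network traffic in multiple character-related dimensions;
obtain the Body parameter feature set corresponding to the network traffic set based on the features, in the multiple character-related dimensions, of the Body parameters of the network traffic included in the set.
In an optional implementation, when detecting, based on the retrieved target traffic identification model, whether the network traffic is abnormal-access network traffic, the traffic detection module 602 is specifically configured to:
perform feature extraction separately on the uniform resource locator (URL) address and the request-body (Body) parameters in the network traffic, to obtain the URL feature and the parameter features of the Body parameters in multiple feature dimensions that make up the feature vector of the network traffic;
determine whether the network traffic is abnormal-access network traffic based on the extracted feature vector and on the URL feature set and the confidence interval of the Body parameters in each feature dimension indicated in the target traffic identification model.
In an optional implementation, when determining whether the network traffic is abnormal-access network traffic based on the extracted feature vector and on the URL feature set and the confidence interval of the Body parameters in each feature dimension indicated in the target traffic identification model, the traffic detection module 602 is specifically configured to:
determine, based on the URL feature of the network traffic indicated by the feature vector and the parameter features of the Body parameters of the network traffic in multiple feature dimensions, whether the URL feature of the network traffic belongs to the URL feature set and whether the parameter feature of the Body parameters in each feature dimension falls within the corresponding confidence interval;
determine the anomaly probability corresponding to the network traffic according to the check results and the weight values corresponding to the URL feature and to each feature dimension of the Body parameters;
when the anomaly probability is greater than a set threshold, determine that the network traffic is abnormal-access network traffic; otherwise, determine that the network traffic is normal-access network traffic.
For the processing flow of each module in the apparatus and the interaction flow between the modules, reference may be made to the relevant description in the above method embodiments, which is not detailed here.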
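Based on the same technical concept, the embodiments of the present disclosure also provide an electronic device. Referring to FIG. 8, a schematic structural diagram of the electronic device 800 provided by an embodiment of the present disclosure, the device includes:
a processor 810, a memory 820, and a bus 830. The memory 820 is used to store execution instructions and includes an internal memory 821 and an external memory 822. The internal memory 821 temporarily stores operation data of the processor 810 and data exchanged with the external memory 822, such as a hard disk; the processor 810 exchanges data with the external memory 822 through the internal memory 821. When the electronic device 800 runs, the processor 810 and the memory 820 communicate through the bus 830, so that the processor 810 can execute the execution instructions mentioned in the above traffic detection method embodiments.
The embodiments of the present disclosure also provide a computer-readable storage medium storing a computer program which, when run by a processor, performs the steps of the traffic detection method described in the above method embodiments. The storage medium may be a volatile or non-volatile computer-readable storage medium.
The embodiments of the present disclosure also provide a computer program product including computer instructions which, when executed by a processor, can perform the steps of the traffic detection method described in the above method embodiments; for details, refer to the above method embodiments, which are not repeated here.
The above computer program product may be implemented by hardware, software, or a combination thereof. In one optional embodiment, the computer program product is embodied as a computer storage medium; in another optional embodiment, it is embodied as a software product, such as a software development kit (SDK).
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the devices and apparatuses described above may refer to the corresponding processes in the foregoing method embodiments and are not repeated here. In the several embodiments provided by the present disclosure, it should be understood that the disclosed devices, apparatuses, and methods may be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in actual implementation; as another example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be indirect coupling or communication connection through some communication interfaces, apparatuses, or units, and may be electrical, mechanical, or of other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, the functional units in the embodiments of the present disclosure may be integrated into one processing unit, may each exist physically on their own, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as an independent product, they may be stored in a processor-executable non-volatile computer-readable storage medium. Based on this understanding, the technical solution of the present disclosure in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present disclosure. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Finally, it should be noted that the above embodiments are merely specific implementations of the present disclosure, intended to illustrate its technical solutions rather than to limit them, and the scope of protection of the present disclosure is not limited thereto. Although the present disclosure has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that anyone familiar with the technical field may, within the technical scope disclosed herein, still modify the technical solutions recorded in the foregoing embodiments, readily conceive of changes, or make equivalent replacements of some of the technical features; such modifications, changes, or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present disclosure, and shall all be covered by the scope of protection of the present disclosure. Therefore, the scope of protection of the present disclosure shall be subject to the scope of protection of the claims.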

Claims (12)

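  1. A traffic detection method, the method comprising:
    acquiring network traffic, and parsing the network traffic to obtain network state information related to the network traffic;
    finding, according to the network state information, a target service container associated with the network traffic, retrieving a target traffic identification model corresponding to the target service container from a set of pre-trained traffic identification models, and detecting, based on the retrieved target traffic identification model, whether the network traffic is abnormal-access network traffic, the target traffic identification model being trained on network traffic associated with the target service container;
    intercepting the network traffic when the network traffic is detected to be abnormal-access network traffic.
  2. The method according to claim 1, wherein the network state information comprises Internet Protocol (IP) five-tuple information; and the finding, according to the network state information, a target service container associated with the network traffic and retrieving a target traffic identification model corresponding to the target service container from a set of pre-trained traffic identification models comprises:
    determining, according to a source IP address and a destination IP address in the IP five-tuple information, a target service container group associated with the network traffic, wherein the target service container group comprises a source container matching the source IP address and a destination container matching the destination IP address;
    obtaining, from the set of pre-trained traffic identification models, the target traffic identification model corresponding to the target service container group.
  3. The method according to claim 1, wherein the traffic identification models in the set of traffic identification models are trained according to the following steps:
    aggregating acquired network traffic, according to business information corresponding to the acquired network traffic and business information of one or more service containers, to obtain a network traffic set corresponding to a service container group, the service containers included in each service container group having the same business information;
    performing feature extraction in multiple feature dimensions on the network traffic set to obtain a feature matrix corresponding to the network traffic set, the feature matrix being composed of feature vectors respectively corresponding to the network traffic included in the network traffic set;
    computing, based on the feature matrix, the traffic identification model corresponding to the service container group, the traffic identification model characterizing aggregated features of normal-access network traffic.
  4. The method according to claim 1, wherein, after the determining of the target traffic identification model corresponding to the target service container, the method further comprises:
    acquiring network traffic generated within a preset time period;
    re-executing, based on the network traffic generated within the preset time period, the training process of the target traffic identification model, so as to update the target traffic identification model.
  5. The method according to claim 3, wherein the performing feature extraction in multiple feature dimensions on the network traffic set to obtain a feature matrix corresponding to the network traffic set comprises:
    performing feature extraction separately on uniform resource locator (URL) addresses and request-body (Body) parameters in the network traffic set, to obtain a URL feature set and a Body parameter feature set contained in the feature matrix;
    the computing, based on the feature matrix, the traffic identification model corresponding to the service container group comprises:
    computing, based on the Body parameter feature set, a confidence interval of the Body parameters of the network traffic set in each feature dimension;
    forming the traffic identification model from the URL feature set of the network traffic set and the confidence interval of the Body parameters in each feature dimension.
  6. The method according to claim 5, wherein the performing feature extraction on the request-body (Body) parameters of the network traffic set comprises:
    for each piece of network traffic in the network traffic set, extracting features of the Body parameters of that network traffic in multiple character-related dimensions;
    obtaining the Body parameter feature set corresponding to the network traffic set based on the features, in the multiple character-related dimensions, of the Body parameters of the network traffic included in the network traffic set.
  7. The method according to claim 1, wherein the detecting, based on the retrieved target traffic identification model, whether the network traffic is abnormal-access network traffic comprises:
    performing feature extraction separately on a uniform resource locator (URL) address and request-body (Body) parameters in the network traffic, to obtain a URL feature and parameter features of the Body parameters in multiple feature dimensions contained in a feature vector of the network traffic;
    determining whether the network traffic is abnormal-access network traffic based on the extracted feature vector and on a URL feature set and a confidence interval of the Body parameters in each feature dimension indicated in the target traffic identification model.
  8. The method according to claim 7, wherein the determining whether the network traffic is abnormal-access network traffic based on the extracted feature vector and on the URL feature set and the confidence interval of the Body parameters in each feature dimension indicated in the target traffic identification model comprises:
    determining, based on the URL feature of the network traffic indicated by the feature vector and the parameter features of the Body parameters of the network traffic in multiple feature dimensions, whether the URL feature of the network traffic belongs to the URL feature set and whether the parameter feature of the Body parameters in each feature dimension falls within the confidence interval;
    determining an anomaly probability corresponding to the network traffic according to the determination results and weight values respectively corresponding to the URL feature and to each feature dimension of the Body parameters;
    when the anomaly probability is greater than a set threshold, determining that the network traffic is abnormal-access network traffic; otherwise, determining that the network traffic is normal-access network traffic.
  9. A traffic detection apparatus, the apparatus comprising:
    a data acquisition module, configured to acquire network traffic and parse the network traffic to obtain network state information related to the network traffic;
    a traffic detection module, configured to find, according to the network state information, a target service container associated with the network traffic, retrieve a target traffic identification model corresponding to the target service container from a set of pre-trained traffic identification models, and detect, based on the retrieved target traffic identification model, whether the network traffic is abnormal-access network traffic, the target traffic identification model being trained on network traffic associated with the target service container;
    a traffic interception module, configured to intercept the network traffic when the network traffic is detected to be abnormal-access network traffic.
  10. An electronic device, comprising a processor, a memory, and a bus, the memory storing machine-readable instructions executable by the processor, wherein, when the electronic device runs, the processor and the memory communicate through the bus, and the machine-readable instructions, when executed by the processor, perform the steps of the traffic detection method according to any one of claims 1 to 9.
  11. A computer-readable storage medium storing a computer program which, when run by a processor, performs the steps of the traffic detection method according to any one of claims 1 to 9.
  12. A computer program product which, when run on a device, causes the device to perform the steps of the traffic detection method according to any one of claims 1 to 9.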
PCT/CN2023/086763 2022-04-29 2023-04-07 Traffic detection method and apparatus, device and storage medium WO2023207548A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP23794981.3A EP4344134A1 (en) 2022-04-29 2023-04-07 Traffic detection method and apparatus, device and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210468301.1A CN114666162B (zh) 2022-04-29 2022-04-29 Traffic detection method and apparatus, device and storage medium
CN202210468301.1 2022-04-29

Publications (1)

Publication Number Publication Date
WO2023207548A1 true WO2023207548A1 (zh) 2023-11-02

Family

ID=82037364

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/086763 WO2023207548A1 (zh) 2022-04-29 2023-04-07 Traffic detection method and apparatus, device and storage medium

Country Status (3)

Country Link
EP (1) EP4344134A1 (zh)
CN (1) CN114666162B (zh)
WO (1) WO2023207548A1 (zh)

Also Published As

Publication number Publication date
CN114666162B (zh) 2023-05-05
CN114666162A (zh) 2022-06-24
EP4344134A1 (en) 2024-03-27

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23794981

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 18573252

Country of ref document: US

Ref document number: 2023794981

Country of ref document: EP

Ref document number: 23794981.3

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2023794981

Country of ref document: EP

Effective date: 20231221