WO2021063368A1

WO2021063368A1 - Cdn system-based source station state detection method and device

Info

Publication number: WO2021063368A1
Application number: PCT/CN2020/119009
Authority: WO
Inventors: 朱刚
Original assignee: 华为技术有限公司
Priority date: 2019-09-30
Filing date: 2020-09-29
Publication date: 2021-04-08
Also published as: CN110753041A

Abstract

The present application discloses a content delivery network (CDN) system-based source station state detection method and a device. Said method comprises: receiving log information sent by a node in a CDN system, the log information recording therein a URL of a source station and historical traffic information of the source station; predicting, according to the log information, a traffic curve of the source station, the traffic curve comprising a future time and a predicted traffic value at the future time; and receiving real-time traffic information sent by the source station, and confirming, according to the real-time traffic information and the traffic curve, whether the current working state of the source station is normal. The described solution is able to predict a traffic curve of a source station, and can thereby better defend against an attack message.

Description

CDN system-based source station state detection method and equipment

Technical field

This application relates to the IT field, and in particular to a method and equipment for detecting the status of a source station based on a content distribution network CDN system.

Background technique

A content delivery network (CDN) uses clusters of node servers distributed across different regions to form a traffic distribution and management platform that provides users with decentralized storage and high-speed caching of content. According to dynamic network traffic and load conditions, content is distributed to fast and stable cache servers to improve the response speed of user content access and the availability of services. Content providers can offer users a large amount of content through a CDN, such as video, audio, and text, and earn revenue through advertising or content playback fees.

Those skilled in the art have discovered through long-term research that under the existing technical conditions, it is easy for the client to attack the origin site through the CDN.

Summary of the invention

In order to solve the above-mentioned problems, the present application provides a method and equipment for detecting the status of a source station based on a content distribution network CDN system, which can predict the traffic curve of the source station, thereby better resisting attack packets.

In the first aspect, a method for detecting the status of an origin station based on a content distribution network CDN system is provided, which is characterized in that it includes:

Receiving log information sent by a node in the CDN system, where the log information records the URL of the source station and historical traffic information of the source station;

Predicting a flow curve of the source station according to the log information, the flow curve including a future time and a predicted flow value at a future time;

Receiving real-time flow information sent by the source station, and confirming whether the current working state of the source station is normal according to the real-time flow information and the flow curve.

In some possible designs, the receiving real-time traffic information sent by the source station, and confirming whether the working state of the source station is normal according to the real-time traffic information and the traffic curve includes:

Acquiring the predicted flow value corresponding to the flow curve at the current moment;

In the case where the flow value recorded in the real-time flow information exceeds the corresponding predicted flow value in the flow curve at the current moment, it is confirmed that the current working state of the source station is abnormal.

In some possible designs, after confirming that the current working state of the source station is abnormal, the method further includes:

Confirming whether the flow value recorded in the flow information exceeds the endurance capacity of the source station; if not, sending alarm information; and if so, notifying the node in the CDN system to discard the message of the source station.

In some possible designs, the receiving real-time flow information sent by the source station, and confirming whether the working state of the source station is normal according to the real-time flow information and the flow curve includes:

In the case that the flow value recorded in the real-time flow information does not exceed the corresponding predicted flow value in the flow curve at the current moment, it is confirmed that the current working state of the source station is normal.

In some possible designs, the log information also records service type information of the source station. After receiving the real-time traffic information sent by the source station, the method further includes:

Determining whether the service type recorded in the real-time traffic information is consistent with the service type information of the source station recorded in the log information; if yes, confirming that the current working status of the source station is normal; if not, confirming that the current working status of the source station is abnormal.

In the second aspect, an intelligent defense device is provided, including: a receiving module, a prediction module, and a confirmation module,

The receiving module is configured to receive log information sent by a node in the CDN system, and the log information records the URL of the source station and historical traffic information of the source station;

The prediction module is configured to predict a flow curve of the source station according to the log information, the flow curve including a future time and a predicted flow value at a future time;

The confirmation module is configured to receive real-time flow information sent by the source station, and confirm whether the current working state of the source station is normal according to the real-time flow information and the flow curve.

In some possible designs, the confirmation module is also used to:

In some possible designs, the device further includes an alarm module for confirming whether the flow value recorded in the flow information exceeds the endurance capacity of the source station; if the endurance capacity of the source station is not exceeded, an alarm message is sent, and if the endurance capacity of the source station is exceeded, the node in the CDN system is notified to discard the message of the source station.

In some possible designs, the confirmation module is used to obtain the predicted flow value corresponding to the current moment in the flow curve and, in the case where the flow value recorded in the real-time flow information does not exceed the predicted flow value corresponding to the current moment in the flow curve, to confirm that the current working state of the source station is normal.

In some possible designs, the confirmation module is used to determine whether the service type recorded in the real-time traffic information is consistent with the service type information of the source station recorded in the log information; if the service types are consistent, it confirms that the current working state of the source station is normal, and if the service types are inconsistent, it confirms that the current working state of the source station is abnormal.

In a third aspect, an intelligent defense device is provided, including: a processor and a memory, and the processor executes the code in the memory to execute the method according to any one of the first aspect.

In a fourth aspect, a readable storage medium is provided, which is characterized by including instructions which, when run on an intelligent defense device, cause the intelligent defense device to execute the method described in any one of the first aspect.

In a fifth aspect, a computer program product is provided. When the computer program product is read and executed by a computer, the method described in any one of the first aspects will be executed.

Description of the drawings

In order to more clearly illustrate the technical solutions in the embodiments of the present application or the background art, the following will describe the drawings that need to be used in the embodiments of the present application or the background art.

Figure 1 is a schematic diagram of the structure of a content distribution network involved in this application;

Figure 2 is a schematic diagram of a client requesting content data from a source site node in a content distribution network related to this application;

Figure 3 is a schematic diagram of a cloud service involved in this application;

Fig. 4 is a schematic structural diagram of a cloud CDN involved in this application;

Figure 5 is a schematic structural diagram of another cloud CDN provided by this application;

FIG. 6 is a schematic flowchart of a method for detecting the status of a source station based on a content distribution network CDN system provided by the present application;

Fig. 7 is a schematic diagram of the flow curves of source station 1 and source station 2 in the three cases of working days, weekends and big holidays in this application;

FIG. 8 is a schematic diagram of the structure of a deep neural network provided by the present application;

Fig. 9 is a schematic structural diagram of an intelligent defense device provided by the present application;

Fig. 10 is a schematic structural diagram of another intelligent defense device provided by the present application.

Detailed description

Referring to Fig. 1, Fig. 1 is a schematic structural diagram of a content delivery network (CDN) involved in this application. The CDN system includes a source site node 10, a control platform 20, a content distribution network CDN, and clients 101-105. Among them, the content distribution network CDN includes central cache nodes 60-61 and edge cache nodes 70-74.

The clients 101-105 are usually private devices of the user, which are used by the user to access the content data of the origin node 10. For example, the terminal device may be a smart phone, a tablet computer, a desktop computer, a vehicle-mounted device, a wearable device, etc., which are not specifically limited here.

The origin node 10 is usually set in a data center far away from the clients 101-105, and is used to store a large amount of content data. For example, the origin node 10 may be a node of a website that provides video viewing or downloading, such as entertainment, sports, news, or movies; may be a node of a website that provides audio playback of music or books; or may be a node of a website that provides reading of text such as news, articles, and books. This is not specifically limited here.

The central cache node is the upper-level node of the edge cache nodes 73-74. At the same time, the central cache nodes 60-61 are also lower-level nodes of the origin node 10. That is, the central cache nodes act as a link between the origin node above them and the edge cache nodes below them.

The edge cache nodes 70-74, also called proxy caches (surrogates), are only a "single hop" away from the terminal device. They cache the content data that the origin node 10 delivers to the edge cache nodes 70-74, so that the clients 101-105 can access it from a nearby node. Specifically, the edge cache nodes 70-74 store a mirror image of the origin node 10 and are usually set at the edge of the network. Therefore, the edge cache nodes 70-74 can provide content data to the clients 101-105 in place of the origin node 10, so as to realize edge storage and dissemination of content data, relieve network congestion, and improve the response speed when the clients 101-105 access the source site node 10.

In order to ensure that data can be sent from the edge cache nodes 70-74 to the clients 101-105 as quickly as possible, the edge cache nodes 70-74 and the clients 101-105 are deployed as follows.

The edge cache nodes 70-74 are located in different regions. For example, the edge cache node 70 may be located in South China, the edge cache node 71 is located in Central China, the edge cache node 72 is located in West China, the edge cache node 73 is located in North China, and the edge cache node 74 is located in East China.

Clients 101-105 are set in different regions. For example, the client 101 may be set in the South China region, the client 102 is set in the Central China region, the client 103 is set in the West China region, the client 104 is set in the North China region, and the client 105 is set in the East China region.

That is, the client 101 is located in South China, so the client 101 and the edge cache node 70 are in the same area, and the distance between the two is the closest; the client 102 is located in Central China, so the client 102 and the edge cache node 71 are in the same area, and the distance between the two is the closest; the client 103 is located in West China, so the client 103 and the edge cache node 72 are in the same area, and the distance between the two is the closest; the client 104 is located in North China, so the client 104 and the edge cache node 73 are in the same area, and the distance between the two is the closest; the client 105 is located in East China, so the client 105 and the edge cache node 74 are in the same area, and the distance between the two is the closest.

In other embodiments, the number of origin site nodes is not limited to 2 and can be another positive integer, the number of central cache nodes is not limited to 2 and can be another positive integer, and the number of edge cache nodes is not limited to 5 and can be another positive integer. This is not specifically limited here.

Refer to Figure 2. Figure 2 is a schematic diagram of a CDN content data request process involved in this application. Based on the CDN shown in FIG. 1, as shown in FIG. 2, the content data request process of the CDN of this application includes the following steps:

S101: The client sends a request message to the edge cache node. Correspondingly, the edge cache node receives the request message sent by the client. Wherein, the request message is used for the client to request the content data in the source station from the source station node.

S102: The edge cache node judges whether it has cached the content data in the source station requested by the request message, if yes, go to step S103, if not, go to step S104.

S103: The edge cache node sends the content data requested by the request message, and ends the process.

S104: The edge cache node sends a request message to the source station node. Correspondingly, the source site node receives the request message sent by the edge cache node. Wherein, the request message is used for the edge cache node to request the content data in the source station from the source station node.

S105: The source site node sends the content data to the edge cache node. Correspondingly, the edge cache node receives the content data sent by the source station node.

S106: The edge cache node sends the content data in the source station to the client. Correspondingly, the client receives the content data in the source station sent by the edge cache node, and ends the process.

It is understandable that when any client in FIG. 1 requests content data from any corresponding source site node in FIG. 1, it follows the above-mentioned request process, and no further description is given here.
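The request process in steps S101 to S106 can be sketched roughly as follows. This is only an illustrative sketch: the class names `OriginNode` and `EdgeCacheNode`, the method names, and the sample URL are hypothetical, and storing the fetched content in the edge cache after a miss is an assumption that the steps above do not state explicitly.

```python
class OriginNode:
    """Hypothetical stand-in for the source site node."""

    def __init__(self, content):
        self.content = dict(content)
        self.requests_served = 0

    def fetch(self, url):
        # S104/S105: the source site node receives the request and returns the content data.
        self.requests_served += 1
        return self.content[url]


class EdgeCacheNode:
    """Hypothetical stand-in for an edge cache node."""

    def __init__(self, origin):
        self.origin = origin
        self.cache = {}

    def handle_request(self, url):
        # S102: check whether the requested content data is already cached.
        if url in self.cache:
            # S103: send the cached content data and end the process.
            return self.cache[url]
        # S104-S105: request the content data from the source site node.
        data = self.origin.fetch(url)
        # Assumption: keep a copy so that later requests are served from the edge.
        self.cache[url] = data
        # S106: send the content data to the client.
        return data


origin = OriginNode({"/movie.mp4": b"movie-bytes"})
edge = EdgeCacheNode(origin)
first = edge.handle_request("/movie.mp4")   # cache miss: forwarded to the source site node
second = edge.handle_request("/movie.mp4")  # cache hit: the source site node is not contacted
```

In this sketch the second request never reaches the source site node, which is the behavior that lets the edge cache node answer on its behalf.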

Refer to Figure 3, which is a schematic diagram of a cloud service involved in this application. The cloud owner deploys the cloud computing infrastructure itself, that is, it deploys computing resources (for example, servers) 110, storage resources (for example, storage) 120, network resources (for example, network cards) 130, and so on. Then, the owner of the public cloud (for example, an operator) virtualizes the computing resources, storage resources, and network resources of the cloud computing infrastructure, and provides corresponding services for cloud users to use. The operator can provide the following three services to users: infrastructure as a service (IaaS), platform as a service (PaaS), and software as a service (SaaS).

The service provided by IaaS to users is the utilization of cloud computing infrastructure, including processing, storage, network and other basic computing resources. Users can deploy and run any software, including operating systems and applications. Users do not manage or control any cloud computing infrastructure, but can control the choice of operating system, storage space, deployment applications, and may also gain control of restricted network components (for example, firewalls, load balancers, etc.).

The service provided by PaaS to users is to deploy applications that users develop or acquire, using development languages and tools provided by vendors (such as Java, Python, .NET, etc.), onto the cloud computing infrastructure. Users do not need to manage or control the underlying cloud computing infrastructure, including networks, servers, operating systems, storage, etc., but users can control the deployed applications and may also control the configuration of the hosting environment for running applications.

The services provided by SaaS to users are applications run by operators on cloud computing infrastructure. Users can access applications on cloud computing infrastructure on various devices through client interfaces, such as browsers. Users do not need to manage or control any cloud computing infrastructure, including networks, servers, operating systems, storage, and so on.

It is understandable that operators use any one of IaaS, PaaS, and SaaS to provide leasing services for different tenants, and the data and configuration of different tenants are isolated from each other, thereby ensuring the security and privacy of each tenant's data.

Refer to Fig. 4, which is a schematic structural diagram of a cloud CDN involved in the present application. The cloud CDN of this embodiment implements the CDN shown in FIG. 1 on the basis of the cloud service shown in FIG. 3. The tenants of the cloud computing infrastructure are content providers, which set up their source sites on the cloud computing infrastructure (including computing resources, storage resources, and network resources). Storage virtualization technology can be used to flexibly provide tenants with storage solutions, so as to better store the content data in the tenant's source site node; network virtualization technology can be used to flexibly provide tenants with traffic solutions, so as to better transmit the content data of the tenant's source site; and server virtualization technology can be used to flexibly provide tenants with computing power solutions, so as to better manage the content data of the tenant's source site.

In a specific embodiment, the cloud CDN is a multi-tenant, multi-source site scenario. In other words, a cloud CDN may include multiple tenants, each tenant may include one source site node or multiple source site nodes, and each source site node may have one or more source sites. Taking the cloud CDN shown in FIG. 4 as an example, tenant 1 may be a content provider that specializes in providing movies and videos, and the tenant may set up a dedicated source station node for movies and videos, namely the source station node 10, to provide users with movies and videos. Tenant 2 may be a content provider that provides a variety of content, and the tenant may set up a dedicated book origin node, namely origin node 11, to provide users with book reading, and a dedicated current affairs origin node, namely origin node 12, to provide users with current affairs information.

In other embodiments, the number of tenants is not limited to 2 and can be another positive integer, the number of source site nodes is not limited to 3 and can be another positive integer, the number of central cache nodes is not limited to 2 and can be another positive integer, and the number of edge cache nodes is not limited to 5 and can be another positive integer. This is not specifically limited here.

To defend against attack messages, the prior art sets up a firewall between the client and the edge cache node. However, the firewall can only apply a simple preset threshold; that is, the same preset threshold is used for different source sites and different time points. The normal access traffic of different origin sites varies greatly. For example, some large origin sites have an average normal access traffic of 20G, while some small origin sites have an average normal access traffic of 2G. In addition, the normal access traffic of the same source station also differs greatly between time points. For example, the average normal access traffic of a source station can reach 20G on major holidays, while its average normal access traffic on working days is 2G. Therefore, using the same preset threshold for different source stations and different time points causes many problems. The following examples assume that the preset threshold of the source station is 5G:

(1) At 8 o'clock in the morning on a working day, the normal access traffic of the source site is 1G, and the attack traffic is 3G, but since the total traffic does not reach 5G, the firewall cannot alarm and block it.

(2) At 20 o'clock in the evening on a working day, the access peak of the source station arrives and the normal access traffic exceeds 5G. Because the normal access traffic exceeds the preset threshold, the firewall raises a false alarm and blocks the traffic.

(3) At 12 o'clock on a major holiday, the normal access traffic of the source station exceeds 5G. Because the normal access traffic exceeds the preset threshold, the firewall raises a false alarm and blocks the traffic.

(4) The capacity of the source site node after expansion is 8G, and the capacity of the edge cache node is 20G. When the normal access traffic of the source site node is about 1G and the attack traffic is 5G, the firewall alarms and blocks because the total traffic exceeds the preset threshold. In fact, however, the endurance of both the source site node and the edge cache node is greater than the sum of the normal access traffic and the attack traffic, so blocking causes a large amount of normal access to be blocked as well.

(5) The capacity of the source site node after expansion is 3G, and the capacity of the edge cache node is 20G. When the normal access traffic of the source site is about 1G and the attack traffic is 3G, the firewall does not alarm or block because the sum of the normal access traffic and the attack traffic does not exceed the preset threshold. In fact, however, the endurance of the source site node is less than the sum of the normal access traffic and the attack traffic, so the failure to alarm and block causes the source site node to crash due to access overload.
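The mismatch between a single preset threshold and the actual endurance of a source site node in examples (1), (4), and (5) can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, the normal access traffic in examples (4) and (5) is assumed to be exactly 1G, and the 6G value used for the false-alarm case of examples (2) and (3) is an assumed figure above the 5G threshold.

```python
PRESET_THRESHOLD_G = 5.0  # the single preset threshold assumed in the examples


def firewall_blocks(normal_g, attack_g, threshold_g=PRESET_THRESHOLD_G):
    """Fixed-threshold rule: alarm and block once the total traffic exceeds the threshold."""
    return normal_g + attack_g > threshold_g


def origin_overloaded(normal_g, attack_g, endurance_g):
    """True if the total traffic exceeds what the source site node can withstand."""
    return normal_g + attack_g > endurance_g


# Example (1): 1G normal + 3G attack stays under 5G, so the attack is missed.
case1_blocked = firewall_blocks(1.0, 3.0)

# Examples (2)/(3): purely normal traffic above 5G (6G is an assumed value) is blocked.
case2_blocked = firewall_blocks(6.0, 0.0)

# Example (4): 1G normal + 5G attack is blocked although an 8G node could absorb it.
case4_blocked = firewall_blocks(1.0, 5.0)
case4_overloaded = origin_overloaded(1.0, 5.0, endurance_g=8.0)

# Example (5): 1G normal + 3G attack is not blocked although a 3G node is overloaded.
case5_blocked = firewall_blocks(1.0, 3.0)
case5_overloaded = origin_overloaded(1.0, 3.0, endurance_g=3.0)
```

The two checks disagree in examples (4) and (5), which is the motivation for comparing traffic against a per-station prediction and the station's own endurance instead of one global threshold.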

In order to solve the above-mentioned problems, the present application provides a method and equipment for detecting the status of a source station based on a content distribution network CDN system, which can predict the traffic curve of the source station, thereby better resisting attack packets. The detailed introduction will be given below.

Refer to FIG. 5, which is a schematic structural diagram of another cloud CDN provided by the present application. In this embodiment, the operator can add an intelligent defense device on the basis of the cloud CDN shown in FIG. 4, thereby obtaining the cloud CDN shown in Figure 5. Refer to FIG. 6, which is a schematic flowchart of a method for detecting the status of a source station based on a content distribution network CDN system provided by the present application. As shown in Fig. 6, on the basis of the cloud CDN shown in Fig. 5, the source station status detection method based on the content distribution network CDN system of this application includes the following steps:

S201: The intelligent defense device receives log information sent by a node in the CDN system, where the log information records the URL of the source station and historical traffic information of the source station;

S202: The intelligent defense device predicts the flow curve of the source station according to the log information, where the flow curve includes a future time and a predicted flow value at a future time;

S203: The intelligent defense device receives the real-time traffic information sent by the source station, and confirms whether the current working state of the source station is normal according to the real-time traffic information and the traffic curve.

In the specific implementation of the present application, the intelligent defense device obtains the predicted flow value corresponding to the current moment in the flow curve. In the case where the flow value recorded in the real-time flow information exceeds the predicted flow value corresponding to the current moment in the flow curve, it confirms that the current working status of the source station is abnormal; in the case where the flow value recorded in the real-time flow information does not exceed that predicted flow value, it confirms that the current working status of the source station is normal.
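The comparison described above can be sketched as a simple lookup followed by a threshold test. This is a minimal sketch under stated assumptions: the flow curve is represented as a hypothetical mapping from hour of day to predicted flow in G, the values in it are placeholders chosen for illustration, and the function name is not taken from this application.

```python
# Hypothetical flow curve: hour of day -> predicted flow value in G (placeholder values).
predicted_curve_g = {0: 2.5, 4: 0.5, 8: 1.6, 12: 21.5, 16: 21.3, 20: 24.2}


def is_working_state_normal(realtime_flow_g, current_hour, curve_g):
    """Normal if the real-time flow does not exceed the predicted flow at the current moment."""
    predicted_flow_g = curve_g[current_hour]
    return realtime_flow_g <= predicted_flow_g


# At 8 o'clock the placeholder curve predicts 1.6G.
state_ok = is_working_state_normal(1.2, 8, predicted_curve_g)       # below the prediction
state_abnormal = not is_working_state_normal(4.0, 8, predicted_curve_g)  # above the prediction
```

Because the threshold is read from the curve at the current moment, the same 4.0G reading would be treated as normal at 12 o'clock, where the placeholder prediction is much higher.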

In the specific implementation of this application, when the current working status of the source station is confirmed to be abnormal, the intelligent defense device may handle it in either of the following two ways: (1) The intelligent defense device directly notifies the node in the CDN system to discard the message of the source station. (2) The intelligent defense device confirms whether the flow value recorded in the flow information exceeds the endurance of the source station; if not, it sends alarm information, and if so, it notifies the node in the CDN system to discard the message of the source station. The endurance of the source station is determined by the utilization of the source station node's CPU, memory, network bandwidth, and so on, and by the capacity of each of these items. For example, although the predicted traffic value of the source station is 3.2G and the current actual traffic value is 8G, if the source station node where the source station is located can withstand 20G of traffic, the intelligent defense device can first send alarm information instead of notifying the nodes in the CDN system to discard the message of the source station. This ensures that normal services are not interrupted and improves user experience.
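Handling method (2) amounts to a two-stage decision: first compare the real-time flow with the prediction, then compare it with the endurance of the source station. The following sketch illustrates this under assumptions: the `Action` enumeration and the function `decide_action` are hypothetical names, and the endurance is passed in as a single number in G rather than being derived from CPU, memory, and bandwidth figures.

```python
from enum import Enum


class Action(Enum):
    NONE = "none"        # working state is normal, nothing to do
    ALARM = "alarm"      # abnormal, but within the endurance of the source station
    DISCARD = "discard"  # abnormal and beyond the endurance of the source station


def decide_action(realtime_g, predicted_g, endurance_g):
    """Two-stage decision for handling method (2)."""
    if realtime_g <= predicted_g:
        return Action.NONE
    if realtime_g <= endurance_g:
        # The source station can still withstand the traffic: only send alarm information.
        return Action.ALARM
    # The traffic exceeds the endurance: notify the CDN node to discard the messages.
    return Action.DISCARD


# The example from the text: predicted 3.2G, actual 8G, endurance 20G.
example_action = decide_action(8.0, 3.2, 20.0)
```

With the figures from the example, the result is an alarm rather than a discard, so normal services continue while operators are notified.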

For ease of understanding, the flow curves of source station 1 and source station 2 are described in detail below for working days, weekends, and major holidays. Working days are days on which people go to work or school; weekends are regular rest days, for example, Sundays; and major holidays usually refer to public holidays of three or more days, such as Christmas, Spring Festival, and National Day.

1. Working day

(1) The traffic curve of source station 1 is as follows:

Assume that the historical traffic information carried in the log information that the intelligent defense device receives from the CDN system includes: the historical traffic of source station 1 at 0:00 on previous working day 1 is 2.5G, the historical traffic at 0:00 on previous working day 2 is 2.3G, ..., and the historical traffic at 0:00 on previous working day n is 2.7G. The intelligent defense device can then input these data into the working-day traffic prediction model and predict that the traffic value of source station 1 at 0:00 on a future working day is 2.55G.

Assume that the historical traffic information carried in the log information that the intelligent defense device receives from the CDN system includes: the historical traffic of source station 1 at 4:00 on previous working day 1 is 0.71G, the historical traffic at 4:00 on previous working day 2 is 0.52G, ..., and the historical traffic at 4:00 on previous working day n is 0.57G. The intelligent defense device can then input these data into the working-day traffic prediction model and predict that the traffic value of source station 1 at 4:00 on a future working day is 0.53G.

Assume that the historical traffic information carried in the log information that the intelligent defense device receives from the CDN system includes: the historical traffic of source station 1 at 8:00 on previous working day 1 is 1.59G, the historical traffic at 8:00 on previous working day 2 is 1.62G, ..., and the historical traffic at 8:00 on previous working day n is 1.75G. The intelligent defense device can then input these data into the working-day traffic prediction model and predict that the traffic value of source station 1 at 8:00 on a future working day is 1.63G.

Assume that the historical traffic information carried in the log information that the intelligent defense device receives from the CDN system includes: the historical traffic of source station 1 at 12:00 on previous working day 1 is 20.5G, the historical traffic at 12:00 on previous working day 2 is 20.05G, ..., and the historical traffic at 12:00 on previous working day n is 22.43G. The intelligent defense device can then input these data into the working-day traffic prediction model and predict that the traffic value of source station 1 at 12:00 on a future working day is 21.53G.

Assume that the historical traffic information carried in the log information that the intelligent defense device receives from the CDN system includes: the historical traffic of source station 1 at 16:00 on previous working day 1 is 22.12G, the historical traffic at 16:00 on previous working day 2 is 18.45G, ..., and the historical traffic at 16:00 on previous working day n is 21.32G. The intelligent defense device can then input these data into the working-day traffic prediction model and predict that the traffic value of source station 1 at 16:00 on a future working day is 21.28G.

Assume that the historical traffic information carried in the log information that the intelligent defense device receives from the CDN system includes: the historical traffic of source station 1 at 20:00 on previous working day 1 is 23.52G, the historical traffic at 20:00 on previous working day 2 is 25.38G, ..., and the historical traffic at 20:00 on previous working day n is 23.05G. The intelligent defense device can then input these data into the working-day traffic prediction model and predict that the traffic value of source station 1 at 20:00 on a future working day is 24.23G.

Assume that the historical traffic information carried in the log information that the intelligent defense device receives from the CDN system includes: the historical traffic of source station 1 at 24:00 on previous working day 1 is 0.55G, the historical traffic at 24:00 on previous working day 2 is 0.62G, ..., and the historical traffic at 24:00 on previous working day n is 0.51G. The intelligent defense device can then input these data into the working-day traffic prediction model and predict that the traffic value of source station 1 at 24:00 on a future working day is 0.55G.

Therefore, as shown in Figure 7(a), the traffic curve of the source station 1 in the future working day can be based on the above-mentioned predicted values: 2.55G, 0.53G, 1.63G, 21.53G, 21.28G, 24.23G and The curve formed by 0.55G.

(2) The traffic curve of source station 2 is as follows:

Assume that the historical traffic information carried in the log information that the intelligent defense device receives from the CDN system includes: the historical traffic of source station 2 at 0 o'clock on previous working day 1 is 0.19G, at 0 o'clock on previous working day 2 is 0.22G, ..., and at 0 o'clock on previous working day n is 0.09G. The intelligent defense device can then input these data into the working day traffic prediction model and predict that the traffic value of source station 2 at 0 o'clock on a future working day is 0.13G.

Assume that the historical traffic information carried in the log information that the intelligent defense device receives from the CDN system includes: the historical traffic of source station 2 at 4 o'clock on previous working day 1 is 0.07G, at 4 o'clock on previous working day 2 is 0.12G, ..., and at 4 o'clock on previous working day n is 0.15G. The intelligent defense device can then input these data into the working day traffic prediction model and predict that the traffic value of source station 2 at 4 o'clock on a future working day is 0.12G.

Assume that the historical traffic information carried in the log information that the intelligent defense device receives from the CDN system includes: the historical traffic of source station 2 at 8 o'clock on previous working day 1 is 0.82G, at 8 o'clock on previous working day 2 is 0.87G, ..., and at 8 o'clock on previous working day n is 0.95G. The intelligent defense device can then input these data into the working day traffic prediction model and predict that the traffic value of source station 2 at 8 o'clock on a future working day is 0.83G.

Assume that the historical traffic information carried in the log information that the intelligent defense device receives from the CDN system includes: the historical traffic of source station 2 at 12 o'clock on previous working day 1 is 2.49G, at 12 o'clock on previous working day 2 is 2.82G, ..., and at 12 o'clock on previous working day n is 1.79G. The intelligent defense device can then input these data into the working day traffic prediction model and predict that the traffic value of source station 2 at 12 o'clock on a future working day is 2.62G.

Assume that the historical traffic information carried in the log information that the intelligent defense device receives from the CDN system includes: the historical traffic of source station 2 at 16 o'clock on previous working day 1 is 1.63G, at 16 o'clock on previous working day 2 is 2.48G, ..., and at 16 o'clock on previous working day n is 2.19G. The intelligent defense device can then input these data into the working day traffic prediction model and predict that the traffic value of source station 2 at 16 o'clock on a future working day is 2.42G.

Assume that the historical traffic information carried in the log information that the intelligent defense device receives from the CDN system includes: the historical traffic of source station 2 at 20 o'clock on previous working day 1 is 2.67G, at 20 o'clock on previous working day 2 is 3.56G, ..., and at 20 o'clock on previous working day n is 3.15G. The intelligent defense device can then input these data into the working day traffic prediction model and predict that the traffic value of source station 2 at 20 o'clock on a future working day is 3.26G.

Assume that the historical traffic information carried in the log information that the intelligent defense device receives from the CDN system includes: the historical traffic of source station 2 at 24 o'clock on previous working day 1 is 0.21G, at 24 o'clock on previous working day 2 is 0.17G, ..., and at 24 o'clock on previous working day n is 0.13G. The intelligent defense device can then input these data into the working day traffic prediction model and predict that the traffic value of source station 2 at 24 o'clock on a future working day is 0.15G.

Therefore, as shown in Figure 7(b), the traffic curve of source station 2 on a future working day can be the curve formed by the above predicted values: 0.13G, 0.12G, 0.83G, 2.62G, 2.42G, 3.26G, and 0.15G.

2. Weekend

(1) The traffic curve of source station 1 is as follows:

Assume that the historical traffic information carried in the log information that the intelligent defense device receives from the CDN system includes: the historical traffic of source station 1 at 0 o'clock on previous weekend 1 is 4.53G, at 0 o'clock on previous weekend 2 is 4.81G, ..., and at 0 o'clock on previous weekend n is 4.92G. The intelligent defense device can then input these data into the weekend traffic prediction model and predict that the traffic value of source station 1 at 0 o'clock on a future weekend is 4.78G.

Assume that the historical traffic information carried in the log information that the intelligent defense device receives from the CDN system includes: the historical traffic of source station 1 at 4 o'clock on previous weekend 1 is 2.45G, at 4 o'clock on previous weekend 2 is 2.83G, ..., and at 4 o'clock on previous weekend n is 2.51G. The intelligent defense device can then input these data into the weekend traffic prediction model and predict that the traffic value of source station 1 at 4 o'clock on a future weekend is 2.73G.

Assume that the historical traffic information carried in the log information that the intelligent defense device receives from the CDN system includes: the historical traffic of source station 1 at 8 o'clock on previous weekend 1 is 3.07G, at 8 o'clock on previous weekend 2 is 3.39G, ..., and at 8 o'clock on previous weekend n is 5.15G. The intelligent defense device can then input these data into the weekend traffic prediction model and predict that the traffic value of source station 1 at 8 o'clock on a future weekend is 4.15G.

Assume that the historical traffic information carried in the log information that the intelligent defense device receives from the CDN system includes: the historical traffic of source station 1 at 12 o'clock on previous weekend 1 is 24.75G, at 12 o'clock on previous weekend 2 is 27.55G, ..., and at 12 o'clock on previous weekend n is 22.48G. The intelligent defense device can then input these data into the weekend traffic prediction model and predict that the traffic value of source station 1 at 12 o'clock on a future weekend is 26.29G.

Assume that the historical traffic information carried in the log information that the intelligent defense device receives from the CDN system includes: the historical traffic of source station 1 at 16 o'clock on previous weekend 1 is 28.12G, at 16 o'clock on previous weekend 2 is 28.41G, ..., and at 16 o'clock on previous weekend n is 30.38G. The intelligent defense device can then input these data into the weekend traffic prediction model and predict that the traffic value of source station 1 at 16 o'clock on a future weekend is 29.25G.

Assume that the historical traffic information carried in the log information that the intelligent defense device receives from the CDN system includes: the historical traffic of source station 1 at 20 o'clock on previous weekend 1 is 35.25G, at 20 o'clock on previous weekend 2 is 38.38G, ..., and at 20 o'clock on previous weekend n is 37.08G. The intelligent defense device can then input these data into the weekend traffic prediction model and predict that the traffic value of source station 1 at 20 o'clock on a future weekend is 37.09G.

Assume that the historical traffic information carried in the log information that the intelligent defense device receives from the CDN system includes: the historical traffic of source station 1 at 24 o'clock on previous weekend 1 is 20.58G, at 24 o'clock on previous weekend 2 is 20.33G, ..., and at 24 o'clock on previous weekend n is 25.57G. The intelligent defense device can then input these data into the weekend traffic prediction model and predict that the traffic value of source station 1 at 24 o'clock on a future weekend is 23.88G.

Therefore, as shown in Figure 7(c), the traffic curve of source station 1 on a future weekend can be the curve formed by the above predicted values: 4.78G, 2.73G, 4.15G, 26.29G, 29.25G, 37.09G, and 23.88G.

(2) The traffic curve of source station 2 is as follows:

Assume that the historical traffic information carried in the log information that the intelligent defense device receives from the CDN system includes: the historical traffic of source station 2 at 0 o'clock on previous weekend 1 is 1.85G, at 0 o'clock on previous weekend 2 is 0.99G, ..., and at 0 o'clock on previous weekend n is 1.53G. The intelligent defense device can then input these data into the weekend traffic prediction model and predict that the traffic value of source station 2 at 0 o'clock on a future weekend is 1.01G.

Assume that the historical traffic information carried in the log information that the intelligent defense device receives from the CDN system includes: the historical traffic of source station 2 at 4 o'clock on previous weekend 1 is 0.53G, at 4 o'clock on previous weekend 2 is 0.75G, ..., and at 4 o'clock on previous weekend n is 1.01G. The intelligent defense device can then input these data into the weekend traffic prediction model and predict that the traffic value of source station 2 at 4 o'clock on a future weekend is 0.99G.

Assume that the historical traffic information carried in the log information that the intelligent defense device receives from the CDN system includes: the historical traffic of source station 2 at 8 o'clock on previous weekend 1 is 2.11G, at 8 o'clock on previous weekend 2 is 1.75G, ..., and at 8 o'clock on previous weekend n is 1.06G. The intelligent defense device can then input these data into the weekend traffic prediction model and predict that the traffic value of source station 2 at 8 o'clock on a future weekend is 1.83G.

Assume that the historical traffic information carried in the log information that the intelligent defense device receives from the CDN system includes: the historical traffic of source station 2 at 12 o'clock on previous weekend 1 is 3.69G, at 12 o'clock on previous weekend 2 is 2.52G, ..., and at 12 o'clock on previous weekend n is 3.72G. The intelligent defense device can then input these data into the weekend traffic prediction model and predict that the traffic value of source station 2 at 12 o'clock on a future weekend is 3.62G.

Assume that the historical traffic information carried in the log information that the intelligent defense device receives from the CDN system includes: the historical traffic of source station 2 at 16 o'clock on previous weekend 1 is 3.88G, at 16 o'clock on previous weekend 2 is 2.91G, ..., and at 16 o'clock on previous weekend n is 3.04G. The intelligent defense device can then input these data into the weekend traffic prediction model and predict that the traffic value of source station 2 at 16 o'clock on a future weekend is 3.76G.

Assume that the historical traffic information carried in the log information that the intelligent defense device receives from the CDN system includes: the historical traffic of source station 2 at 20 o'clock on previous weekend 1 is 4.19G, at 20 o'clock on previous weekend 2 is 4.94G, ..., and at 20 o'clock on previous weekend n is 3.25G. The intelligent defense device can then input these data into the weekend traffic prediction model and predict that the traffic value of source station 2 at 20 o'clock on a future weekend is 4.85G.

Assume that the historical traffic information carried in the log information that the intelligent defense device receives from the CDN system includes: the historical traffic of source station 2 at 24 o'clock on previous weekend 1 is 2.16G, at 24 o'clock on previous weekend 2 is 1.88G, ..., and at 24 o'clock on previous weekend n is 1.79G. The intelligent defense device can then input these data into the weekend traffic prediction model and predict that the traffic value of source station 2 at 24 o'clock on a future weekend is 2.07G.

Therefore, as shown in Figure 7(d), the traffic curve of source station 2 on a future weekend can be the curve formed by the above predicted values: 1.01G, 0.99G, 1.83G, 3.62G, 3.76G, 4.85G, and 2.07G.

3. Big holiday

(1) The traffic curve of source station 1 is as follows:

Assume that the historical traffic information carried in the log information that the intelligent defense device receives from the CDN system includes: the historical traffic of source station 1 at 0 o'clock on previous big holiday 1 is 5.06G, at 0 o'clock on previous big holiday 2 is 4.55G, ..., and at 0 o'clock on previous big holiday n is 6.12G. The intelligent defense device can then input these data into the big holiday traffic prediction model and predict that the traffic value of source station 1 at 0 o'clock on a future big holiday is 5.77G.

Assume that the historical traffic information carried in the log information that the intelligent defense device receives from the CDN system includes: the historical traffic of source station 1 at 4 o'clock on previous big holiday 1 is 2.14G, at 4 o'clock on previous big holiday 2 is 2.08G, ..., and at 4 o'clock on previous big holiday n is 2.87G. The intelligent defense device can then input these data into the big holiday traffic prediction model and predict that the traffic value of source station 1 at 4 o'clock on a future big holiday is 2.68G.

Assume that the historical traffic information carried in the log information that the intelligent defense device receives from the CDN system includes: the historical traffic of source station 1 at 8 o'clock on previous big holiday 1 is 15.85G, at 8 o'clock on previous big holiday 2 is 14.09G, ..., and at 8 o'clock on previous big holiday n is 17.11G. The intelligent defense device can then input these data into the big holiday traffic prediction model and predict that the traffic value of source station 1 at 8 o'clock on a future big holiday is 16.88G.

Assume that the historical traffic information carried in the log information that the intelligent defense device receives from the CDN system includes: the historical traffic of source station 1 at 12 o'clock on previous big holiday 1 is 30.45G, at 12 o'clock on previous big holiday 2 is 35.22G, ..., and at 12 o'clock on previous big holiday n is 32.55G. The intelligent defense device can then input these data into the big holiday traffic prediction model and predict that the traffic value of source station 1 at 12 o'clock on a future big holiday is 33.75G.

Assume that the historical traffic information carried in the log information that the intelligent defense device receives from the CDN system includes: the historical traffic of source station 1 at 16 o'clock on previous big holiday 1 is 34.12G, at 16 o'clock on previous big holiday 2 is 39.53G, ..., and at 16 o'clock on previous big holiday n is 38.06G. The intelligent defense device can then input these data into the big holiday traffic prediction model and predict that the traffic value of source station 1 at 16 o'clock on a future big holiday is 37.25G.

Assume that the historical traffic information carried in the log information that the intelligent defense device receives from the CDN system includes: the historical traffic of source station 1 at 20 o'clock on previous big holiday 1 is 40.15G, at 20 o'clock on previous big holiday 2 is 38.66G, ..., and at 20 o'clock on previous big holiday n is 42.43G. The intelligent defense device can then input these data into the big holiday traffic prediction model and predict that the traffic value of source station 1 at 20 o'clock on a future big holiday is 40.77G.

Assume that the historical traffic information carried in the log information that the intelligent defense device receives from the CDN system includes: the historical traffic of source station 1 at 24 o'clock on previous big holiday 1 is 25.18G, at 24 o'clock on previous big holiday 2 is 27.23G, ..., and at 24 o'clock on previous big holiday n is 27.17G. The intelligent defense device can then input these data into the big holiday traffic prediction model and predict that the traffic value of source station 1 at 24 o'clock on a future big holiday is 26.66G.

Therefore, as shown in Figure 7(e), the traffic curve of source station 1 on a future big holiday can be the curve formed by the above predicted values: 5.77G, 2.68G, 16.88G, 33.75G, 37.25G, 40.77G, and 26.66G.

(2) The traffic curve of source station 2 is as follows:

Assume that the historical traffic information carried in the log information that the intelligent defense device receives from the CDN system includes: the historical traffic of source station 2 at 0 o'clock on previous big holiday 1 is 2.52G, at 0 o'clock on previous big holiday 2 is 1.75G, ..., and at 0 o'clock on previous big holiday n is 2.78G. The intelligent defense device can then input these data into the big holiday traffic prediction model and predict that the traffic value of source station 2 at 0 o'clock on a future big holiday is 2.57G.

Assume that the historical traffic information carried in the log information that the intelligent defense device receives from the CDN system includes: the historical traffic of source station 2 at 4 o'clock on previous big holiday 1 is 1.61G, at 4 o'clock on previous big holiday 2 is 1.69G, ..., and at 4 o'clock on previous big holiday n is 1.22G. The intelligent defense device can then input these data into the big holiday traffic prediction model and predict that the traffic value of source station 2 at 4 o'clock on a future big holiday is 1.45G.

Assume that the historical traffic information carried in the log information that the intelligent defense device receives from the CDN system includes: the historical traffic of source station 2 at 8 o'clock on previous big holiday 1 is 3.22G, at 8 o'clock on previous big holiday 2 is 3.79G, ..., and at 8 o'clock on previous big holiday n is 2.98G. The intelligent defense device can then input these data into the big holiday traffic prediction model and predict that the traffic value of source station 2 at 8 o'clock on a future big holiday is 3.03G.

Assume that the historical traffic information carried in the log information that the intelligent defense device receives from the CDN system includes: the historical traffic of source station 2 at 12 o'clock on previous big holiday 1 is 4.35G, at 12 o'clock on previous big holiday 2 is 4.12G, ..., and at 12 o'clock on previous big holiday n is 5.09G. The intelligent defense device can then input these data into the big holiday traffic prediction model and predict that the traffic value of source station 2 at 12 o'clock on a future big holiday is 4.66G.

Assume that the historical traffic information carried in the log information that the intelligent defense device receives from the CDN system includes: the historical traffic of source station 2 at 16 o'clock on previous big holiday 1 is 5.81G, at 16 o'clock on previous big holiday 2 is 4.93G, ..., and at 16 o'clock on previous big holiday n is 4.88G. The intelligent defense device can then input these data into the big holiday traffic prediction model and predict that the traffic value of source station 2 at 16 o'clock on a future big holiday is 5.26G.

Assume that the historical traffic information carried in the log information that the intelligent defense device receives from the CDN system includes: the historical traffic of source station 2 at 20 o'clock on previous big holiday 1 is 5.88G, at 20 o'clock on previous big holiday 2 is 6.04G, ..., and at 20 o'clock on previous big holiday n is 6.25G. The intelligent defense device can then input these data into the big holiday traffic prediction model and predict that the traffic value of source station 2 at 20 o'clock on a future big holiday is 6.17G.

Assume that the historical traffic information carried in the log information that the intelligent defense device receives from the CDN system includes: the historical traffic of source station 2 at 24 o'clock on previous big holiday 1 is 3.17G, at 24 o'clock on previous big holiday 2 is 2.94G, ..., and at 24 o'clock on previous big holiday n is 3.09G. The intelligent defense device can then input these data into the big holiday traffic prediction model and predict that the traffic value of source station 2 at 24 o'clock on a future big holiday is 3.01G.

Therefore, as shown in Figure 7(f), the traffic curve of source station 2 on a future big holiday can be the curve formed by the above predicted values: 2.57G, 1.45G, 3.03G, 4.66G, 5.26G, 6.17G, and 3.01G.

For ease of presentation, the above example predicts the traffic value of each time node at a time interval of 4 hours. In actual applications, to make the curve more accurate, the time interval can be shortened to 2 hours, 1 hour, 30 minutes, 15 minutes, 10 minutes, 5 minutes, and so on. When the curve does not need to be so precise, the time interval can also be increased. This is not specifically limited here.
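As a minimal sketch, a traffic curve can be held as a list of time nodes with one predicted value per node. The example below uses the working-day values of source station 1 at the 4-hour interval. The function name `predicted_traffic` and the linear interpolation between time nodes are assumptions made for illustration; the text only states that the interval can be shortened or lengthened.

```python
from bisect import bisect_right

# Predicted traffic values (in G) at 0, 4, 8, 12, 16, 20 and 24 o'clock,
# copied from the working-day example for source station 1.
HOURS = [0, 4, 8, 12, 16, 20, 24]
WORKDAY_STATION_1 = [2.55, 0.53, 1.63, 21.53, 21.28, 24.23, 0.55]


def predicted_traffic(hour, hours=HOURS, values=WORKDAY_STATION_1):
    """Return the predicted traffic (G) at `hour`.

    Values between two time nodes are interpolated linearly; this is an
    assumption, not something the described method prescribes.
    """
    if not hours[0] <= hour <= hours[-1]:
        raise ValueError("hour must lie within the curve's time range")
    i = bisect_right(hours, hour)
    if i == len(hours):           # hour equals the last time node
        return values[-1]
    if hours[i - 1] == hour:      # hour falls exactly on a time node
        return values[i - 1]
    t0, t1 = hours[i - 1], hours[i]
    v0, v1 = values[i - 1], values[i]
    return v0 + (v1 - v0) * (hour - t0) / (t1 - t0)
```

With a shorter sampling interval, the two lists simply grow longer and the lookup stays the same.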

The working day traffic prediction model, the weekend traffic prediction model, and the big holiday traffic prediction model in the above example can be implemented using a deep neural network or a segmented model, as described in detail below.

In the first manner, the working day traffic prediction model, the weekend traffic prediction model, and the big holiday traffic prediction model can be implemented using deep neural networks.

(1) In a specific implementation of this application, the working day traffic prediction model can be expressed as:

$b_1 = g_1(a_1)$

where $b_1$ is the predicted traffic value of the working day, $a_1$ is the historical traffic of the source station at the current sampling time on the working day, and $g_1$ is the mapping relationship between the predicted traffic value of the working day and the historical traffic of the source station at the current sampling time on the working day. The mapping relationship $g_1$ may be obtained by training with a large number of predicted traffic values of known working days and a large amount of historical traffic of the source station at the current sampling time on known working days. In a specific embodiment, the historical traffic of the source station at the current sampling time on known working days may be the historical traffic at this time point on working days in the last six months. Correspondingly, the predicted traffic values of known working days may be the predicted traffic values at this time point on working days in the last six months.

(2) The weekend traffic prediction model can be expressed as:

$b_2 = g_2(a_2)$

where $b_2$ is the predicted traffic value of the weekend, $a_2$ is the historical traffic of the source station at the current sampling time on the weekend, and $g_2$ is the mapping relationship between the predicted traffic value of the weekend and the historical traffic of the source station at the current sampling time on the weekend. The mapping relationship $g_2$ may be obtained by training with a large number of predicted traffic values of known weekends and a large amount of historical traffic of the source station at the current sampling time on known weekends. In a specific embodiment, the historical traffic of the source station at the current sampling time on known weekends may be the historical traffic at this time point on weekends in the last year. Correspondingly, the predicted traffic values of known weekends may be the predicted traffic values at this time point on weekends in the last year.

(3) The big holiday traffic prediction model can be expressed as:

$b_3 = g_3(a_3)$

where $b_3$ is the predicted traffic value of the big holiday, $a_3$ is the historical traffic of the source station at the current sampling time on the big holiday, and $g_3$ is the mapping relationship between the predicted traffic value of the big holiday and the historical traffic of the source station at the current sampling time on the big holiday. The mapping relationship $g_3$ may be obtained by training with a large number of predicted traffic values of known big holidays and a large amount of historical traffic of the source station at the current sampling time on known big holidays. In a specific embodiment, the historical traffic of the source station at the current sampling time on known big holidays may be the historical traffic at this time point on big holidays in the last two years. Correspondingly, the predicted traffic values of known big holidays may be the predicted traffic values at this time point on big holidays in the last two years.
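To illustrate how a mapping of the form b = g(a) might be learned from known pairs of historical traffic and traffic values, the following sketch trains a very small one-hidden-layer network in plain Python. It is a toy stand-in for a deep neural network, not the network used in this application; the function name `train_g`, the layer size, the learning rate, and the training data in the usage note are all hypothetical.

```python
import math
import random


def train_g(samples, hidden=4, epochs=2000, lr=0.05, seed=0):
    """Fit a tiny network g so that g(a) approximates b.

    `samples` is a list of (a, b) pairs: historical traffic a and the known
    traffic value b for the same time point, both in G. Returns a callable g.
    All hyperparameters are illustrative assumptions.
    """
    rng = random.Random(seed)
    scale = max(max(a, b) for a, b in samples) or 1.0  # normalise to about [0, 1]
    data = [(a / scale, b / scale) for a, b in samples]
    w1 = [rng.uniform(-1, 1) for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-1, 1) for _ in range(hidden)]
    b2 = 0.0

    for _ in range(epochs):
        for a, target in data:
            # Forward pass: tanh hidden layer, linear output.
            h = [math.tanh(w1[j] * a + b1[j]) for j in range(hidden)]
            out = sum(w2[j] * h[j] for j in range(hidden)) + b2
            err = out - target  # gradient of 0.5 * err**2 w.r.t. out
            # Backward pass: plain stochastic gradient descent.
            for j in range(hidden):
                grad_h = err * w2[j] * (1 - h[j] ** 2)
                w2[j] -= lr * err * h[j]
                w1[j] -= lr * grad_h * a
                b1[j] -= lr * grad_h
            b2 -= lr * err

    def g(a):
        x = a / scale
        h = [math.tanh(w1[j] * x + b1[j]) for j in range(hidden)]
        return scale * (sum(w2[j] * h[j] for j in range(hidden)) + b2)

    return g
```

A call such as `g = train_g([(2.0, 2.1), (20.0, 20.5)])` returns a function that can then be evaluated on new historical traffic, for example `g(21.0)`. One such model would be trained per day type (working day, weekend, big holiday), each on its own history window.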

In the second manner, the working day traffic prediction model, the weekend traffic prediction model, and the big holiday traffic prediction model can be implemented using a segmented model.

(1) The working day traffic prediction model can be expressed as follows.

Find the average:

$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$

where $\bar{x}$ is the average value, $x_1$ to $x_{n-1}$ are the historical traffic at this time point on working days in the last six months, $x_n$ is the historical traffic of the source station at the current sampling time on the working day, and $n$ is the sum of the number of historical traffic values at this time point on working days in the last six months and the number of historical traffic values at the current sampling time of the source station.

Find the variance $\sigma_1$ of $x_1$ to $x_n$, where $x_1$ to $x_{n-1}$ are the historical traffic at this time point on working days in the last six months, $x_n$ is the historical traffic of the source station at the current sampling time on the working day, and $n$ is the sum of the number of historical traffic values at this time point on working days in the last six months and the number of historical traffic values at the current sampling time of the source station.

Find the confidence interval, where $p$ is the lower limit of the confidence interval, $q$ is the upper limit of the confidence interval, $\bar{x}$ is the average value, $t$ is a natural number greater than zero, and $\sigma_1$ is the variance.

Here, the predicted traffic value can be set equal to the upper limit of the confidence interval.

(2) The weekend traffic prediction model can be expressed as follows.

Find the average:

$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$

where $\bar{y}$ is the average value, $y_1$ to $y_{n-1}$ are the historical traffic at this time point on weekends in the last year, $y_n$ is the historical traffic of the source station at the current sampling time on the weekend, and $n$ is the sum of the number of historical traffic values at this time point on weekends in the last year and the number of historical traffic values at the current sampling time of the source station.

Find the variance $\sigma_2$ of $y_1$ to $y_n$, where $y_1$ to $y_{n-1}$ are the historical traffic at this time point on weekends in the last year, $y_n$ is the historical traffic of the source station at the current sampling time on the weekend, and $n$ is the sum of the number of historical traffic values at this time point on weekends in the last year and the number of historical traffic values at the current sampling time of the source station.

Find the confidence interval, where $\bar{y}$ is the average value, $t$ is a natural number greater than zero, and $\sigma_2$ is the variance.

(3) The big holiday traffic prediction model can be expressed as follows.

Find the average:

$\bar{z} = \frac{1}{n}\sum_{i=1}^{n} z_i$

where $\bar{z}$ is the average value, $z_1$ to $z_{n-1}$ are the historical traffic at this time point on big holidays in the last two years, $z_n$ is the historical traffic of the source station at the current sampling time on the big holiday, and $n$ is the sum of the number of historical traffic values at this time point on big holidays in the last two years and the number of historical traffic values at the current sampling time of the source station.

Find the variance $\sigma_3$ of $z_1$ to $z_n$, where $z_1$ to $z_{n-1}$ are the historical traffic at this time point on big holidays in the last two years, $z_n$ is the historical traffic of the source station at the current sampling time on the big holiday, and $n$ is the sum of the number of historical traffic values at this time point on big holidays in the last two years and the number of historical traffic values at the current sampling time of the source station.

Find the confidence interval, where $\bar{z}$ is the average value, $t$ is a natural number greater than zero, and $\sigma_3$ is the variance.

It can be understood that the above example takes the predicted traffic value equal to the upper limit of the confidence interval as an example. In practical applications, however, the predicted traffic value can also be equal to the lower limit of the confidence interval, or to any value between the lower limit and the upper limit of the confidence interval, which is not specifically limited here.
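The segmented model can be sketched in a few lines of Python. The exact forms of the spread measure and of the interval are assumptions here: the sketch uses the population standard deviation for σ and takes the interval as average ± t·σ, which is one common choice. The function name `segmented_forecast` and the `pick` parameter are hypothetical.

```python
import statistics


def segmented_forecast(history, current, t=2, pick="upper"):
    """Return (p, q, predicted) for one time point.

    history -- historical traffic x_1 .. x_{n-1} at this time point (G)
    current -- historical traffic x_n at the current sampling time (G)
    t       -- interval width multiplier (a natural number greater than zero)
    pick    -- "upper", "lower" or "middle": which value of the interval
               is used as the predicted traffic value
    """
    samples = list(history) + [current]
    mean = statistics.fmean(samples)
    sigma = statistics.pstdev(samples)  # assumption: population std. deviation
    p, q = mean - t * sigma, mean + t * sigma  # assumption: mean +/- t * sigma
    predicted = {"lower": p, "upper": q, "middle": mean}[pick]
    return p, q, predicted
```

For example, `segmented_forecast([20.5, 20.05], 22.43)` returns the interval limits and, by default, the upper limit as the predicted value. The same function serves all three day types; only the history window changes (six months, one year, or two years).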

After adopting the above-mentioned source station status detection method based on the content distribution network CDN system, the problems existing in the prior art can be solved.

(1) At 8:00 a.m. on a working day, the normal access traffic of the source station is 1G and the attack traffic is 3G. According to the working day traffic prediction model, the predicted traffic value at this time on a working day can be calculated to be about 1G. With the attack traffic superimposed, the access traffic collected in real time is about 4G, which seriously deviates from the normal access level, so an alarm is raised.

(2) At 20 o'clock in the evening on a working day, the access peak of the source station arrives suddenly and the normal access traffic exceeds 5G. According to the working day traffic prediction model, the normal access traffic at 20 o'clock on a working day can be calculated to be about 5G, and the access traffic collected in real time is also about 5G. The deviation between the two values is small and within the normal range.

(3) At 12 noon on the big holiday, the normal access traffic of the source station exceeded 5G. According to the traffic forecast model of the big holiday, the normal access traffic at 12 o’clock on the big holiday can be calculated to be about 33G, and the access data collected in real time is about 30G. The value deviation is small and belongs to the normal range.

(4) The capacity of the source station node after expansion is 8G, and the capacity of the edge cache node is 20G. When the normal access traffic of the source station exceeds 1G and the attack traffic is 5G, the access traffic of about 6G is judged to be lower than the 8G endurance capacity of the source station node and far lower than the 20G endurance capacity of the edge cache node, so no defensive blocking is performed and only an alarm is raised.

(5) The capacity of the source station node after expansion is 3G, and the capacity of the edge cache node is 20G. When the normal access traffic of the source station exceeds 1G and the attack traffic is 3G, the total access traffic is judged to have exceeded the endurance capacity of the source station node, so active blocking defense is performed to prevent the source station node from going down.
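The decision pattern running through scenarios (1) to (5) can be sketched in Python as follows. This is a minimal illustration, not the implementation of the application: the names `Action` and `decide` are hypothetical, only the endurance capacity of the source station is considered (the edge cache capacity is ignored), and the scenario values are the approximate figures quoted above.

```python
from enum import Enum


class Action(Enum):
    NORMAL = "normal"  # real-time traffic does not exceed the prediction
    ALARM = "alarm"    # abnormal, but within the endurance capacity
    BLOCK = "block"    # abnormal and beyond the endurance capacity


def decide(real_time_g, predicted_g, source_capacity_g):
    """Compare real-time traffic with the prediction, then with the capacity."""
    if real_time_g <= predicted_g:
        return Action.NORMAL
    if real_time_g <= source_capacity_g:
        return Action.ALARM
    return Action.BLOCK


# Scenario (3): about 30G observed against about 33G predicted.
print(decide(30.0, 33.0, source_capacity_g=40.0))  # capacity value is hypothetical
# Scenario (4): 1G normal + 5G attack, source station capacity 8G.
print(decide(6.0, 1.0, source_capacity_g=8.0))
# Scenario (5): 1G normal + 3G attack, source station capacity 3G.
print(decide(4.0, 1.0, source_capacity_g=3.0))
```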

The intelligent defense device can also identify the service type of the real-time traffic through the service type recognition model, and determine whether the service type of the real-time traffic is consistent with the service type information of the source station recorded in the log information. If so, it confirms that the current working state of the source station is normal; if not, it confirms that the current working state of the source station is abnormal.

In the specific implementation manner of this application, the service type identification model can be expressed as:

$y_1 = f_1(x)$

Here, $y_1$ is the service type, $x$ is the real-time traffic, and $f_1$ is the mapping relationship between the real-time traffic and the service type. The mapping relationship $f_1$ may be obtained by training on a large amount of known historical traffic and the service types corresponding to that traffic.

In the specific implementation manner of the present application, as shown in FIG. 8, the service type recognition model may be implemented by using deep neural networks (DNN). In a specific embodiment, the deep neural network includes an input layer, one or more hidden layers, and an output layer.

Input layer:

Assume that the input of the input layer is the real-time traffic $I_i$. For simplicity of presentation, it is assumed here that the input layer performs no processing, so its output equals its input. In practical applications, however, the input layer may perform normalization and similar processing, which is not specifically limited here.

Hidden layer:

The real-time traffic $I_i$ output by the input layer is taken as the input of the hidden layer. Assume that there are $L$ ($L \geq 2$) hidden layers in total, and let $Z^l$ denote the output of the $l$-th layer, where $1 \leq l \leq L$. When $l = 1$, $Z^1 = I_i$. The relationship between the $l$-th layer and the $(l+1)$-th layer is:

$a^{l+1} = W^l Z^l + b^l$

$Z^{l+1} = f^{l+1}(a^{l+1})$

Here, $W^l$ is the weight vector of the $l$-th layer, $b^l$ is the bias vector of the $l$-th layer, $a^{l+1}$ is the intermediate vector of the $(l+1)$-th layer, $f^{l+1}$ is the excitation function of the $(l+1)$-th layer, and $Z^{l+1}$ is the hidden layer result of the $(l+1)$-th layer. The excitation function can be any of a sigmoid function, a hyperbolic tangent function, a ReLU function, an ELU (Exponential Linear Units) function, and so on.

Output layer:

Assuming that the output of the $L$-th layer is $Z^L$, input $Z^L$ into the softmax function to obtain the service type:

$y = \mathrm{softmax}(Z^L)$

Here, $y$ is the output of the output layer, $Z^L$ is the output of the $L$-th hidden layer, and the softmax function is the classification function. It can be understood that the softmax function is used above only as an example. In practical applications, a logistic function and the like can also be used, which is not specifically limited here.
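The forward pass described by the input layer, hidden layers, and output layer can be sketched in plain Python as follows. This is an illustrative sketch only: the weights, biases, the choice of ReLU as the excitation function, and the two service type labels are made-up values, and the helper names (`affine`, `forward`) are hypothetical.

```python
import math


def relu(v):
    """Excitation function f applied element-wise."""
    return [max(0.0, x) for x in v]


def affine(W, z, b):
    """Compute a = W z + b for one layer (W is a list of rows)."""
    return [sum(w * x for w, x in zip(row, z)) + bias for row, bias in zip(W, b)]


def softmax(v):
    """Classification function of the output layer (numerically stabilised)."""
    m = max(v)
    exps = [math.exp(x - m) for x in v]
    total = sum(exps)
    return [e / total for e in exps]


def forward(layers, x):
    """Z^1 = x; Z^{l+1} = f(W^l Z^l + b^l); y = softmax(Z^L)."""
    z = list(x)
    for W, b in layers:
        z = relu(affine(W, z, b))
    return softmax(z)


# Hypothetical parameters: 2 input features -> 3 units -> 2 service types.
layers = [
    ([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [0.0, 0.0, 0.0]),
    ([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]], [0.0, 0.0]),
]
service_types = ["video", "download"]  # hypothetical labels
probs = forward(layers, [1.0, 2.0])
predicted_type = service_types[probs.index(max(probs))]
print(probs, predicted_type)
```

The predicted service type is simply the class with the highest softmax probability.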

In the specific implementation of this application, the nature of the training of the service type recognition model can be understood as follows. The work of each layer in the deep neural network can be described by the mathematical expression

$y = a(W \cdot x + b)$

At the physical level, the work of each layer in the deep neural network can be understood as completing the transformation from the input space to the output space (that is, from the row space of the matrix to its column space) through five operations on the input space (the set of input vectors): 1. raising/lowering the dimension; 2. enlarging/reducing; 3. rotating; 4. translating; 5. "bending". Operations 1, 2, and 3 are completed by $W \cdot x$, operation 4 is completed by $+b$, and operation 5 is realized by $a()$.

The word "space" is used here because the object to be classified is not a single thing but a class of things, and space refers to the collection of all individuals of this class. Here, W is a weight vector, and each value in the vector represents the weight of one neuron in that layer of the neural network. The vector W determines the spatial transformation from the input space to the output space described above, that is, the weight W of each layer controls how the space is transformed. The purpose of training a deep neural network is to finally obtain the weight matrix of all layers of the trained neural network (the weight matrix formed by the vectors W of the many layers). Therefore, the training process of the neural network is essentially learning how to control the spatial transformation, and more specifically, learning the weight matrix.

In the specific implementation of this application, the training process of the service type recognition model can be as follows: the known historical traffic is input into the service type recognition model to obtain a predicted value, and the known service type is taken as the truly desired target value. The predicted value of the current network is compared with the truly desired target value, and the weight vector of each layer of the neural network is updated according to the difference between the two (of course, there is usually an initialization process before the first update, in which parameters are pre-configured for each layer of the deep neural network). For example, if the predicted value of the network is too high, the weight vector is adjusted to make the prediction lower, and the adjustment continues until the neural network can predict the truly desired target value. It is therefore necessary to predefine how to compare the difference between the predicted value and the target value. This is the role of the loss function or objective function, which are important equations used to measure the difference between the predicted value and the target value. Taking the loss function as an example, the higher its output value (loss), the greater the difference, so training the deep neural network becomes a process of reducing this loss as much as possible.
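The idea of repeatedly adjusting weights to reduce a loss can be illustrated with a deliberately small Python sketch. It is a simplification under stated assumptions: it trains a single softmax layer rather than a multi-layer network, uses cross-entropy as the loss function and plain gradient descent as the update rule (the application names neither), and the four training samples and their labels are invented.

```python
import math


def softmax(v):
    m = max(v)
    exps = [math.exp(x - m) for x in v]
    total = sum(exps)
    return [e / total for e in exps]


def predict(W, b, x):
    """Predicted class probabilities for one traffic feature vector."""
    return softmax([sum(w * xi for w, xi in zip(row, x)) + bk
                    for row, bk in zip(W, b)])


def cross_entropy(W, b, data):
    """Loss: average negative log-probability of the known service type."""
    return -sum(math.log(predict(W, b, x)[y]) for x, y in data) / len(data)


def train(data, n_features, n_classes, lr=0.5, epochs=200):
    # Initialization before the first update (all zeros in this toy example).
    W = [[0.0] * n_features for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        for x, y in data:
            p = predict(W, b, x)
            for k in range(n_classes):
                # Gradient of the loss w.r.t. the pre-softmax value of class k.
                grad = p[k] - (1.0 if k == y else 0.0)
                for j in range(n_features):
                    W[k][j] -= lr * grad * x[j]
                b[k] -= lr * grad
    return W, b


# Hypothetical (features, service type index) pairs.
data = [([0.1, 0.9], 0), ([0.2, 0.8], 0), ([0.9, 0.1], 1), ([0.8, 0.3], 1)]
loss_before = cross_entropy([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0], data)
W, b = train(data, n_features=2, n_classes=2)
loss_after = cross_entropy(W, b, data)
print(loss_before, loss_after)
```

The loss printed after training should be lower than the initial loss, which is the behavior the paragraph above describes.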

Since attack traffic can vary greatly while normal access traffic is limited, the above solution trains the service type recognition model on a large amount of known historical traffic and the corresponding known service types, so that the model learns the rules for recognizing the correct service type. The model can thus identify normal access traffic and treat request packets that cannot be recognized as normal access traffic as attack traffic, which effectively prevents the source station from being attacked and maintains the security of the entire system. In addition, newly recognized historical traffic and its service types can be used online to train the service type recognition model in real time, so as to update the knowledge base of the model in time.

Since the access traffic belongs to the tenant while the smart defense device belongs to the operator, the source station removes the key information from the access traffic before sending it to the smart defense device. As a result, when the smart defense device recognizes that a request packet is attack traffic, where the attack traffic belongs can only be investigated over a large scope.

To solve the above problem, the service type of the abnormal message can be identified, so that the abnormal message only needs to be searched for within that service type, which effectively reduces the workload of checking for abnormal messages.

In a specific implementation manner of the present application, the intelligent defense device may further include a data type identification model, which is used to identify the data type of the attack traffic. In the specific implementation of this application, the data type identification model can be expressed as:

$y_2 = f_2(x)$

Here, $y_2$ is the data type, $x$ is the attack traffic, and $f_2$ is the mapping relationship between the attack traffic and the data type. The mapping relationship $f_2$ may be obtained by training on a large amount of known attack traffic and the corresponding known data types. It can be understood that the prediction process and training process of the data type identification model are similar to those of the service type recognition model and will not be further described here.

In the specific implementation manner of this application, the data type recognition model and the service type recognition model can be integrated in the same model.

In the above method, the data type of the attack flow can be identified through the data type identification model, so that only the access flow of the data type needs to be checked, which greatly reduces the workload of the check.

Refer to FIG. 9, which is a schematic structural diagram of an intelligent defense device provided by the present application. As shown in FIG. 9, the intelligent defense device of the present application includes: a receiving module 310, a prediction module 320, a confirmation module 330, and an alarm module 340.

The receiving module 310 is configured to receive log information sent by nodes in the CDN system, and the log information records the URL of the source station and historical traffic information of the source station;

The prediction module 320 is configured to predict a flow curve of the source station according to the log information, the flow curve including a future time and a predicted flow value at a future time;

The confirmation module 330 is configured to receive real-time flow information sent by the source station, and confirm whether the current working state of the source station is normal according to the real-time flow information and the flow curve.

The alarm module 340 is configured to confirm whether the flow value recorded in the flow information exceeds the endurance capacity of the source station; if the endurance capacity of the source station is not exceeded, send alarm information; and if the endurance capacity of the source station is exceeded, notify the node in the CDN system to discard the message of the source station.

In the specific implementation manner of the present application, the confirmation module 330 is further configured to obtain the predicted flow value corresponding to the flow curve at the current moment, and, in the case where the flow value recorded in the real-time flow information exceeds the corresponding predicted flow value in the flow curve at the current moment, confirm that the current working state of the source station is abnormal.

In the specific implementation manner of this application, the confirmation module 330 is used to obtain the predicted flow value corresponding to the flow curve at the current moment, and, in the case where the flow value recorded in the real-time flow information does not exceed the corresponding predicted flow value in the flow curve at the current moment, confirm that the current working state of the source station is normal.

In the specific implementation manner of this application, the confirmation module 330 is used to determine whether the service type recorded in the real-time traffic information is consistent with the service type information of the source station recorded in the log information. If the service types are consistent, it confirms that the current working state of the source station is normal; if the service types are inconsistent, it confirms that the current working state of the source station is abnormal.

It can be understood that the smart defense device shown in FIG. 9 can implement the source station status detection method based on the content distribution network CDN system shown in FIG. 6. For brevity, refer to FIG. 6 and the related descriptions for details, which are not repeated here.
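How the four modules of FIG. 9 might fit together can be sketched as a small Python class. This is only an illustration under assumptions: the class and method names are hypothetical, log entries are reduced to `(url, hour, flow)` tuples, the example URL is invented, and the prediction step uses the historical maximum per hour as a stand-in for the traffic prediction model of the application.

```python
from collections import defaultdict


class IntelligentDefenseDevice:
    """Toy composition of the receiving, prediction, confirmation and alarm modules."""

    def __init__(self, source_capacity):
        self.source_capacity = source_capacity  # endurance capacity of the source station
        self.history = defaultdict(list)        # hour of day -> historical flow values
        self.flow_curve = {}                    # hour of day -> predicted flow value

    def receive_log(self, entries):
        """Receiving module: store (url, hour, flow) records from CDN node logs."""
        for _url, hour, flow in entries:
            self.history[hour].append(flow)

    def predict(self):
        """Prediction module: placeholder using the historical maximum per hour."""
        self.flow_curve = {hour: max(flows) for hour, flows in self.history.items()}

    def confirm(self, hour, real_time_flow):
        """Confirmation module: True if the working state is considered normal."""
        return real_time_flow <= self.flow_curve[hour]

    def alarm(self, real_time_flow):
        """Alarm module: alarm within the endurance capacity, otherwise discard."""
        return "discard" if real_time_flow > self.source_capacity else "alarm"


device = IntelligentDefenseDevice(source_capacity=8.0)
device.receive_log([("http://origin.example/video", 10, 0.9),
                    ("http://origin.example/video", 10, 1.0)])
device.predict()
if not device.confirm(10, 4.0):
    print(device.alarm(4.0))
```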

Refer to FIG. 10, which is a schematic structural diagram of another intelligent defense device provided by the present application. As shown in FIG. 10, the intelligent defense device of the present application includes a processing unit 410 and a communication interface 420. The processing unit 410 is used to execute functions defined by various software programs, for example, to implement the functions of the intelligent defense device. The communication interface 420 is used to communicate and interact with other computing nodes, which may be other physical servers. Specifically, the communication interface 420 may be a network adapter card.

Optionally, the smart defense device may further include an input/output interface 430, and the input/output interface 430 is connected to an input/output device for receiving input information and outputting operation results. The input/output device may be a mouse, a keyboard, a display, or an optical drive, etc. Optionally, the smart defense device may also include auxiliary storage 440, which is generally also referred to as external storage. The storage medium of auxiliary storage 440 may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, an optical disk), or a semiconductor medium (for example, a solid state drive), etc.

Optionally, the smart defense device may further include a bus 450. The processing unit 410, the communication interface 420, the input/output interface 430, and the auxiliary storage 440 may be connected through the bus 450. The bus 450 may be a peripheral component interconnect (PCI) bus or an extended industry standard architecture (EISA) bus, etc. The bus 450 can be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, the bus is shown as a single line in FIG. 10, but this does not mean that there is only one bus or one type of bus.

The processing unit 410 may have a variety of specific implementation forms. For example, the processing unit 410 may include a processor 411 and a memory 412, and the processor 411 performs the related operations of the embodiment shown in FIG. 6 according to program instructions stored in the memory 412. The processor 411 may be a central processing unit (CPU). The processor may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. The general-purpose processor may be a microprocessor or any conventional processor. Alternatively, the processor 411 adopts one or more integrated circuits to execute related programs to implement the technical solutions provided in the embodiments of the present application.

The above-mentioned embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented by software, they can be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present application are generated in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (such as coaxial cable, optical fiber, or digital subscriber line) or a wireless manner (such as infrared, radio, or microwave). The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, a storage disk, or a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)).

Claims

A method for detecting the status of a source station based on a content distribution network CDN system, which is characterized in that it includes:

Receiving log information sent by a node in the CDN system, where the log information records the URL of the source station and historical traffic information of the source station;

Predicting a flow curve of the source station according to the log information, the flow curve including a future time and a predicted flow value at a future time;

Receiving real-time flow information sent by the source station, and confirming whether the current working state of the source station is normal according to the real-time flow information and the flow curve.
The method according to claim 1, wherein the receiving real-time traffic information sent by the source station, and confirming whether the working state of the source station is normal according to the real-time traffic information and the traffic curve, comprises:

Acquiring the predicted flow value corresponding to the flow curve at the current moment;

In the case where the flow value recorded in the real-time flow information exceeds the corresponding predicted flow value in the flow curve at the current moment, it is confirmed that the current working state of the source station is abnormal.
The method according to claim 2, wherein after confirming that the current working state of the source station is abnormal, the method further comprises:

Confirming whether the flow value recorded in the flow information exceeds the endurance capacity of the source station; if not, sending alarm information; and if so, notifying the node in the CDN system to discard the message of the source station.
The method according to claim 1, wherein the receiving real-time traffic information sent by the source station, and confirming whether the working state of the source station is normal according to the real-time traffic information and the traffic curve, comprises:

Acquiring the predicted flow value corresponding to the flow curve at the current moment;

In the case that the flow value recorded in the real-time flow information does not exceed the corresponding predicted flow value in the flow curve at the current moment, it is confirmed that the current working state of the source station is normal.
The method according to any one of claims 1 to 4, wherein the log information also records service type information of the source station, and after receiving real-time traffic information sent by the source station, the method further comprises:

Determining whether the service type recorded in the real-time traffic information is consistent with the service type information of the source station recorded in the log information; if yes, confirming that the current working status of the source station is normal; if not, confirming that the current working status of the source station is abnormal.
An intelligent defense device, characterized by comprising: a receiving module, a prediction module, and a confirmation module,

The receiving module is configured to receive log information sent by a node in the CDN system, and the log information records the URL of the source station and historical traffic information of the source station;

The prediction module is configured to predict a flow curve of the source station according to the log information, the flow curve including a future time and a predicted flow value at a future time;

The confirmation module is configured to receive real-time flow information sent by the source station, and confirm whether the current working state of the source station is normal according to the real-time flow information and the flow curve.
The device according to claim 6, wherein the confirmation module is further configured to:

Acquiring the predicted flow value corresponding to the flow curve at the current moment;

In the case where the flow value recorded in the real-time flow information exceeds the corresponding predicted flow value in the flow curve at the current moment, it is confirmed that the current working state of the source station is abnormal.
The device according to claim 7, wherein the device further comprises an alarm module configured to confirm whether the flow value recorded in the flow information exceeds the endurance capacity of the source station; if the endurance capacity of the source station is not exceeded, send alarm information; and if the endurance capacity of the source station is exceeded, notify the node in the CDN system to discard the message of the source station.
The device according to claim 6, wherein the confirmation module is configured to obtain the predicted flow value corresponding to the flow curve at the current moment, and, in the case where the flow value recorded in the real-time flow information does not exceed the corresponding predicted flow value in the flow curve at the current moment, confirm that the current working state of the source station is normal.
The device according to any one of claims 6 to 9, wherein the confirmation module is configured to determine whether the service type recorded in the real-time traffic information is consistent with the service type information of the source station recorded in the log information; if the service types are consistent, confirm that the current working status of the source station is normal; and if the service types are inconsistent, confirm that the current working status of the source station is abnormal.
An intelligent defense device, comprising: a processor and a memory, and the processor runs the code in the memory to execute the method according to any one of claims 1 to 5.
A readable storage medium, characterized by comprising instructions, which when running on an intelligent defense device, cause the intelligent defense device to execute the method according to any one of claims 1 to 5.