CN113852643B - Content distribution network cache pollution defense method based on content popularity - Google Patents

Content distribution network cache pollution defense method based on content popularity

Info

Publication number
CN113852643B
Authority
CN
China
Prior art keywords
resource
cache
resources
user
content
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111227105.7A
Other languages
Chinese (zh)
Other versions
CN113852643A (en)
Inventor
朱笑岩
樊甜甜
韩雪雪
冯鹏斌
马建峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University
Priority to CN202111227105.7A
Publication of CN113852643A
Application granted
Publication of CN113852643B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/12 Applying verification of the received information
    • H04L63/123 Applying verification of the received information received data contents, e.g. message integrity
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/56 Provisioning of proxy services
    • H04L67/568 Storing data temporarily at an intermediate stage, e.g. caching
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/50 Reducing energy consumption in communication networks in wire-line communication networks, e.g. low power modes or reduced link rate

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Information Transfer Between Computers (AREA)
  • Computer And Data Communications (AREA)

Abstract

The invention discloses a content distribution network cache pollution defense method based on content popularity, implemented in the following steps: 1. calculate the content popularity of all cached resources; 2. calculate the source station hash values of all cached resources; 3. determine which cached resources are hit; 4. calculate the hash value of each hit cached resource; 5. judge whether the hash value of the hit cached resource equals the source station hash value stored by the cache server, and if so execute step 6, otherwise execute step 7; 6. determine that the cached resource is a benign resource and return it to the user; 7. update the malignant resource. The invention can detect all polluted resources, return originally cached benign resources to the user, and return a malignant resource to the user only after it has been updated with the latest resource from the source server, thereby ensuring that users access correct resources.

Description

Content distribution network cache pollution defense method based on content popularity
Technical Field
The invention belongs to the technical field of communication, and further relates to a content distribution network cache pollution defense method based on content popularity in the technical field of network communication. The invention can be used to detect caches subjected to pollution attacks in a content distribution network and to clear the cached malignant resources.
Background
To meet the modern Internet's demand for fast communication, a content delivery network (CDN) caches website content at the network "edge" closest to users, namely on cache servers, so that users can obtain the content they need from a nearby node. This technically addresses the problems of limited network bandwidth, heavy user access, and uneven distribution of websites, and greatly improves the response speed when users access websites. As a result, the network resources cached on a content distribution network have become a target for attackers seeking to breach network security. Common cache pollution attacks, such as cache poisoning, replace the cached resources on a cache server of the content distribution network so that the cache server returns harmful files to users or denies service. Existing defenses against cache pollution attacks are each designed for one particular attack and are limited to that specific attack mode, so they cannot cope with the many variants of cache pollution attacks.
In its patent application "An attack defense method based on a content distribution network" (application number: 202110178012.3, application publication number: CN 113037716A), the Hangzhou Seaman science and technology company discloses an attack defense method based on a content distribution network. The method comprises the following steps: (1) setting edge nodes in a content distribution network; (2) determining the number of high-protection groups according to the number of links, and establishing high-protection cluster groups; (3) after the domain name is resolved to the content distribution network, setting thresholds for the request count and the bandwidth in each edge node and link of the content distribution network, and performing exception handling when at least one of the request count and the traffic of an edge node exceeds its threshold; (4) after the high-protection clusters are switched in, monitoring the request count and traffic of each high-protection IP of the clusters, while also monitoring the request count and traffic of the affected edge nodes, links, and IPs in the content distribution network. The drawback of this method is that it only handles requests or traffic once an edge node's threshold has been exceeded; requests and traffic below the threshold are not analyzed or detected.
A patent application entitled "Content distribution network security detection method and system" (application number: 201710882559.5, application publication number: CN 109561051A), filed by a communications joint-stock limited company, discloses a content distribution network security detection method. The method comprises the following steps: (1) acquiring the network traffic data copied by content distribution network nodes to obtain network-wide traffic data; (2) performing security analysis on the acquired network-wide traffic data according to preset intrusion detection rules, and judging whether malicious resources exist; and (3) determining whether to raise a security alarm according to the analysis result. The drawback of this method is that it only identifies malicious resources and raises an alert without processing them in any way, so a user who accesses a malicious resource by mistake cannot obtain the required web page resource.
Disclosure of Invention
The invention aims to overcome the above shortcomings of the prior art by providing a content distribution network cache pollution defense method based on content popularity. It addresses two problems: the prior art relies only on thresholds and therefore detects only some malicious resources, which leaves a risk of missed detection; and because malicious resources are not processed, a user who accesses one by mistake cannot obtain the required web page resource, so the user request cannot be answered correctly.
The idea of the invention for achieving the above purpose is as follows: the cache server calculates the content popularity of all resources for which user requests hit the cache, and stores the source station hash values of all resources in order of content popularity. Hit cached resources are identified from the header information of the response returned to the user, and the hash value of each hit cached resource is then compared with the stored source station hash value, which reveals all polluted resources. Depending on the attribute of the hit cached resource, either the originally cached benign resource is returned to the user, or the malignant resource is first updated with the latest resource from the source server and then returned to the user.
The technical scheme of the invention comprises the following steps:
step 1, calculating popularity of all cache resource contents:
for each time period, the content popularity of each resource in the cache server, together with the website to which it belongs, is calculated according to the following formula:
wherein $P(i, j)$ represents the content popularity of the i-th resource in the cache server, which belongs to the j-th website; $\omega$ represents the preset coefficient of the content popularity $P(i, j)$, that is, the weight used when calculating $P(i, j)$, and is a constant taken from the range $[0, 0.5]$; $N_i$ represents the number of times the i-th resource of the cache server is requested by users in the T-th time period; $N$ represents the total number of resources requested from the cache server in the T-th time period; $k$ represents the serial number of a resource requested from the cache server in the T-th time period; $N_k$ represents the number of times the k-th resource is requested by users in the T-th time period; $\Sigma$ represents the summation operation; $j$ represents the serial number of the website to which the i-th resource belongs; and $m$ represents the total number of websites to which all resources requested by users from the cache server in the T-th time period belong;
step 2, calculating the hash values of the source stations of all the cache resources:
sequencing all the content popularity from big to small, sequentially calculating source station hash values of cache resources corresponding to each content popularity, and storing each source station hash value in a cache server in a key value pair set mode;
step 3, determining hit cache resources:
determining response resources with the field value of 'X-Cache' in the resource header information of the response user being 'HIT' as Cache resources hitting the Cache server;
step 4, calculating a hash value of the hit cache resource;
step 5, judging whether the hash value of the hit cache resource is equal to the hash value of the source station stored by the cache server, if so, executing step 6, otherwise, executing step 7;
step 6, after the cache resource is determined to be a benign resource, returning the benign resource to the user;
step 7, updating malignant resources:
and determining the cache resource as a malignant resource, and returning the malignant resource updated by the latest resource of the source server to the user.
Compared with the prior art, the invention has the following advantages:
First, the invention calculates and stores the source station hash values of all resources in order of the content popularity of all cached resources, identifies hit cached resources from the header information of the resources returned to users, and compares the hash value of each hit cached resource with the stored source station hash value. This detects all polluted resources and overcomes the risk of missed detection in the prior art, so the invention has the advantage of detecting all malicious resources on the cache server.
Second, by judging the attribute of the hit cached resource, the invention either returns the originally cached benign resource to the user or updates the malignant resource with the latest resource from the source server before returning it. This overcomes the problems of the prior art, which only identifies malicious resources and raises an alert without processing them and therefore cannot answer user requests correctly, so the invention has the advantage of ensuring that users access correct resources.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a flow chart of an embodiment of the present invention.
Detailed Description
The invention is further described below with reference to the drawings and examples.
The specific steps of an implementation of the present invention are further described with reference to fig. 1.
Step 1: calculate the popularity of all cached resource contents.
For each time period, the content popularity of each resource in the cache server, together with the website to which it belongs, is calculated according to the following formula:
wherein $P(i, j)$ represents the content popularity of the i-th resource in the cache server, which belongs to the j-th website; $\omega$ represents the preset coefficient of the content popularity $P(i, j)$, that is, the weight used when calculating $P(i, j)$, and is a constant taken from the range $[0, 0.5]$; $N_i$ represents the number of times the i-th resource of the cache server is requested by users in the T-th time period; $N$ represents the total number of resources requested from the cache server in the T-th time period; $k$ represents the serial number of a resource requested from the cache server in the T-th time period; $N_k$ represents the number of times the k-th resource is requested by users in the T-th time period; $\Sigma$ represents the summation operation; $j$ represents the serial number of the website to which the i-th resource belongs; and $m$ represents the total number of websites to which all resources requested by users from the cache server in the T-th time period belong.
The serial number of the website to which a resource belongs is taken from data collected according to the global comprehensive ranking of Alexa website traffic, $[x_1, m], \ldots, [x_j, m-j+1], \ldots, [x_m, 1]$, where $x_j$ denotes the name of the website to which the i-th resource belongs.
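As an illustration of step 1, the following Python sketch builds the website ranking pairs $[x_j, m-j+1]$ and combines them with per-resource request counts. The formula itself does not survive in this text, so the combination used here ($\omega$ times the request share $N_i / \Sigma N_k$, plus $(1-\omega)$ times the normalized website score $(m-j+1)/m$) is an assumption made purely for illustration and should not be read as the patented formula. The function names site_scores and popularity are hypothetical.

```python
from typing import Dict, List


def site_scores(ranked_sites: List[str]) -> Dict[str, int]:
    """Map websites, listed best-ranked first, to the pairs [x_j, m - j + 1]."""
    m = len(ranked_sites)
    return {site: m - j + 1 for j, site in enumerate(ranked_sites, start=1)}


def popularity(resource: str, site: str, counts: Dict[str, int],
               scores: Dict[str, int], omega: float = 0.5) -> float:
    """ASSUMED form of P(i, j); the original formula is not available here.

    counts maps each resource name to N_k, its request count in the period.
    """
    if not 0.0 <= omega <= 0.5:
        raise ValueError("omega must lie in [0, 0.5]")
    request_share = counts[resource] / sum(counts.values())  # N_i / sum of N_k
    site_share = scores[site] / len(scores)                  # (m - j + 1) / m
    return omega * request_share + (1 - omega) * site_share


scores = site_scores(["a.example", "b.example", "c.example"])
p = popularity("x", "a.example", {"x": 3, "y": 1}, scores, omega=0.5)
```

Under this assumed form, a resource that is requested often and that belongs to a highly ranked website receives a larger popularity value.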
Step 2: calculate the source station hash values of all cached resources.
Sort all content popularity values from largest to smallest, calculate in that order the source station hash value of the cached resource corresponding to each content popularity value, and store each source station hash value in the cache server as a set of key-value pairs.
The key-value pair set has the form $[N_1, H_1], \ldots, [N_t, H_t], \ldots, [N_n, H_n]$, where $N_t$ denotes the resource name of the t-th resource in the cache server, $t$ ranges over $[1, n]$, and $H_t$ denotes the source station hash value of the resource corresponding to the t-th resource name in the cache server.
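Step 2 can be sketched in Python as follows. This is a minimal illustration, assuming that the source station hash is an MD5 digest of the resource body fetched from the source server (MD5 is the algorithm named in the embodiment) and that a plain dictionary serves as the key-value pair set. The names build_source_hashes and fetch_from_origin are hypothetical.

```python
import hashlib
from typing import Callable, Dict


def build_source_hashes(popularity: Dict[str, float],
                        fetch_from_origin: Callable[[str], bytes]) -> Dict[str, str]:
    """Return {resource name N_t: source station hash H_t}, most popular first."""
    # Sort resource names by content popularity, largest first.
    ordered = sorted(popularity, key=popularity.get, reverse=True)
    # Python dicts preserve insertion order, so the popularity order is kept.
    return {name: hashlib.md5(fetch_from_origin(name)).hexdigest()
            for name in ordered}


origin = {"a.js": b"alpha", "b.js": b"beta"}  # stand-in for the source server
table = build_source_hashes({"a.js": 0.1, "b.js": 0.9}, origin.__getitem__)
```

Computing the hashes in popularity order means that the most frequently requested resources are protected first if the computation is interrupted or staged over time.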
Step 3: determine the hit cached resources.
A response resource is determined to be a cached resource that hit the cache server when the 'X-Cache' field in the header information of the response returned to the user has the value 'HIT'.
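The header check of step 3 can be sketched as below. The sketch assumes the response headers are available as a simple name-to-value mapping. Matching the field name and value case-insensitively is an added convenience and is not stated in the source. The function name is_cache_hit is hypothetical.

```python
from typing import Mapping


def is_cache_hit(headers: Mapping[str, str]) -> bool:
    """Return True when the response carries an X-Cache field equal to HIT."""
    for name, value in headers.items():
        if name.lower() == "x-cache":
            return value.strip().upper() == "HIT"
    # No X-Cache field at all: treat the response as not served from cache.
    return False
```

For example, is_cache_hit({"X-Cache": "HIT"}) is true, while a response carrying 'MISS' or no 'X-Cache' field is treated as coming from the source server.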
Step 4: calculate the hash value of the hit cached resource.
Step 5: judge whether the hash value of the hit cached resource equals the source station hash value stored by the cache server; if so, execute step 6, otherwise execute step 7.
Step 6: determine that the cached resource is a benign resource and return it to the user.
Step 7: update the malignant resource.
The cached resource is determined to be a malignant resource; it is updated with the latest resource from the source server, and the updated resource is returned to the user.
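Steps 4 to 7 can be sketched together as one function. This is an illustrative sketch only: it assumes MD5 digests, an in-memory dictionary standing in for the cache, and a callable that fetches the latest resource from the source server. The names serve_hit, cache, and fetch_from_origin are hypothetical.

```python
import hashlib
from typing import Callable, Dict, Tuple


def serve_hit(name: str, cache: Dict[str, bytes], source_hashes: Dict[str, str],
              fetch_from_origin: Callable[[str], bytes]) -> Tuple[bytes, bool]:
    """Return (body to send to the user, whether the cached copy was benign)."""
    cached_body = cache[name]
    # Step 4: hash the hit cached resource.
    cached_hash = hashlib.md5(cached_body).hexdigest()
    # Step 5: compare with the stored source station hash value.
    if cached_hash == source_hashes.get(name):
        # Step 6: benign, return the cached copy unchanged.
        return cached_body, True
    # Step 7: malignant, replace it with the latest resource from the source.
    fresh = fetch_from_origin(name)
    cache[name] = fresh
    return fresh, False


hashes = {"test.jsp": hashlib.md5(b"good").hexdigest()}
store = {"test.jsp": b"evil"}                      # a polluted cache entry
first = serve_hit("test.jsp", store, hashes, lambda n: b"good")
second = serve_hit("test.jsp", store, hashes, lambda n: b"good")
```

In this usage the first call detects the polluted entry and repairs it, so the second call finds a benign cached copy.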
The specific steps of an embodiment of the present invention are further described with reference to fig. 2.
First step: calculate the content popularity of each cached resource that is required by the user and stored in the user's edge cache server;
Second step: calculate the source station hash value of each cached resource that is required by the user and stored in the user's edge cache server, and store the calculated source station hash values in order of content popularity from largest to smallest;
Third step: the user sends a URL (Uniform Resource Locator) request to the cache server;
Fourth step: the cache server determines whether the user's URL request hits a cached resource. If the URL request contains a random number (in this embodiment of the invention, url=test.jnumber=math.range()), the request fetches the latest resource directly from the server, so the resource cannot have been polluted and no subsequent processing is required. Otherwise, the cache server reads the value of the 'X-Cache' field in the header information of the response returned to the user. If the value is 'MISS', the response did not hit a cached resource of the cache server and was returned from the source server, so no subsequent processing is required; if the value is 'HIT', the response is judged to have hit a cached resource;
Fifth step: calculate the hash value of the hit cached resource; in this embodiment, the hash value of the hit cached resource (test.jsp) is calculated as (MD5: 99B05058C3848023AD83760A61DB9FF 25);
Sixth step: compare the hash value of the hit cached resource with the stored source station hash value; if the two values are equal, execute the seventh step, otherwise execute the eighth step;
Seventh step: return the benign resource directly to the user;
Eighth step: update the malignant resource: the cache server forwards the user request to the source server to obtain the latest resource, updates the malignant resource with the latest resource, and returns it to the user.
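The decision logic of the embodiment (fourth to sixth steps) can be sketched as a single classifier. The sketch assumes that a "random number" in the URL appears as a query parameter; the parameter name number, the host cdn.example, and the function names are hypothetical, and MD5 is used because the embodiment names it.

```python
import hashlib
from typing import Dict, Mapping
from urllib.parse import parse_qs, urlsplit


def has_random_number(url: str, param: str = "number") -> bool:
    """True if the URL carries the (hypothetical) cache-busting parameter."""
    return param in parse_qs(urlsplit(url).query)


def classify(url: str, headers: Mapping[str, str], body: bytes,
             source_hashes: Dict[str, str]) -> str:
    """Return 'bypass', 'miss', 'benign', or 'polluted' for one response."""
    if has_random_number(url):
        return "bypass"    # fetched fresh from the server, nothing to verify
    if headers.get("X-Cache", "").upper() != "HIT":
        return "miss"      # served by the source server, nothing to verify
    name = urlsplit(url).path.lstrip("/")
    digest = hashlib.md5(body).hexdigest()
    return "benign" if digest == source_hashes.get(name) else "polluted"


known = {"test.jsp": hashlib.md5(b"ok").hexdigest()}
hit = {"X-Cache": "HIT"}
```

A 'polluted' result corresponds to the eighth step, in which the cache server forwards the request to the source server and replaces the cached copy.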

Claims (3)

1. A content distribution network cache pollution defense method based on content popularity, characterized in that a cache server calculates the content popularity of cached resources, calculates and stores the source station hash values of all the resources in the resulting order, identifies the hit cached resources, compares the hash value of each hit cached resource with the stored source station hash value, and responds to the user request after determining the attribute of the hit cached resource; the defense method comprises the following steps:
step 1, calculating popularity of all cache resource contents:
for each time period, the content popularity of each resource in the cache server, together with the website to which it belongs, is calculated according to the following formula:
wherein $P(i, j)$ represents the content popularity of the i-th resource in the cache server, which belongs to the j-th website; $\omega$ represents the preset coefficient of the content popularity $P(i, j)$, that is, the weight used when calculating $P(i, j)$, and is a constant taken from the range $[0, 0.5]$; $N_i$ represents the number of times the i-th resource of the cache server is requested by users in the T-th time period; $N$ represents the total number of resources requested from the cache server in the T-th time period; $k$ represents the serial number of a resource requested from the cache server in the T-th time period; $N_k$ represents the number of times the k-th resource is requested by users in the T-th time period; $\Sigma$ represents the summation operation; $j$ represents the serial number of the name of the website to which the i-th resource belongs; and $m$ represents the total number of websites to which all resources requested by users from the cache server in the T-th time period belong;
step 2, calculating the hash values of the source stations of all the cache resources:
sequencing all the content popularity from big to small, sequentially calculating source station hash values of cache resources corresponding to each content popularity, and storing each source station hash value in a cache server in a key value pair set mode;
step 3, determining hit cache resources:
determining response resources with the field value of 'X-Cache' in the resource header information of the response user being 'HIT' as Cache resources hitting the Cache server;
step 4, calculating a hash value of the hit cache resource;
step 5, judging whether the hash value of the hit cache resource is equal to the hash value of the source station stored by the cache server, if so, executing step 6, otherwise, executing step 7;
step 6, after the cache resource is determined to be a benign resource, returning the benign resource to the user;
step 7, updating malignant resources:
and determining the cache resource as a malignant resource, and returning the malignant resource updated by the latest resource of the source server to the user.
2. The content distribution network cache pollution defense method based on content popularity of claim 1, wherein the serial number of the website to which a resource belongs in step 1 is taken from data collected according to the global comprehensive ranking of Alexa website traffic, $[x_1, m], \ldots, [x_j, m-j+1], \ldots, [x_m, 1]$, where $x_j$ denotes the name of the website to which the i-th resource belongs.
3. The content distribution network cache pollution defense method based on content popularity of claim 1, wherein the key-value pair set in step 2 has the form $[N_1, H_1], \ldots, [N_t, H_t], \ldots, [N_n, H_n]$, where $N_t$ denotes the resource name of the t-th resource in the cache server, $t$ ranges over $[1, n]$, and $H_t$ denotes the source station hash value of the resource corresponding to the t-th resource name in the cache server.
CN202111227105.7A 2021-10-21 2021-10-21 Content distribution network cache pollution defense method based on content popularity Active CN113852643B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111227105.7A CN113852643B (en) 2021-10-21 2021-10-21 Content distribution network cache pollution defense method based on content popularity

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111227105.7A CN113852643B (en) 2021-10-21 2021-10-21 Content distribution network cache pollution defense method based on content popularity

Publications (2)

Publication Number Publication Date
CN113852643A CN113852643A (en) 2021-12-28
CN113852643B true CN113852643B (en) 2023-11-14

Family

ID=78982556

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111227105.7A Active CN113852643B (en) 2021-10-21 2021-10-21 Content distribution network cache pollution defense method based on content popularity

Country Status (1)

Country Link
CN (1) CN113852643B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115277080B (en) * 2022-06-22 2023-11-14 西安电子科技大学 Content distribution network cache pollution defense method based on Merkle tree

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9167049B2 (en) * 2012-02-02 2015-10-20 Comcast Cable Communications, Llc Content distribution network supporting popularity-based caching

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9326261B2 (en) * 2011-07-15 2016-04-26 Huawei Technologies Co., Ltd. Method and apparatus for synchronizing popularity value of cache data and method, apparatus, and system for distributed caching
CN105376229A (en) * 2015-11-13 2016-03-02 中国人民解放军信息工程大学 Method for actively defending against cache pollution attack of content-centric network
CN106657249A (en) * 2016-10-25 2017-05-10 杭州迪普科技股份有限公司 Method and device for updating cache resources
CN108667799A (en) * 2018-03-28 2018-10-16 中国科学院信息工程研究所 It is a kind of to be directed to the defence method and system that browser rs cache is poisoned
CN110083761A (en) * 2018-10-18 2019-08-02 中国电子科技集团公司电子科学研究院 A kind of data distributing method based on content popularit, system and storage medium
CN109936633A (en) * 2019-03-11 2019-06-25 重庆邮电大学 Based on the cooperation caching strategy of content different degree in content center network
CN111541722A (en) * 2020-05-22 2020-08-14 哈尔滨工程大学 Information center network cache pollution attack detection and defense method based on density clustering
CN111917853A (en) * 2020-07-24 2020-11-10 山东云缦智能科技有限公司 Optimization method for distributed cache scaling of content distribution network

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
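Koen Holtman, TUE; Andrew Mutz, Hewlett-Packard. Transparent Content Negotiation in HTTP. IETF. 1997, full text. *
Research on cache pollution defense technology for content-centric networks; 朱轶; 糜正琨; 王文鼐; Journal of Nanjing University of Posts and Telecommunications (Natural Science Edition) (No. 02); full text *
Research on a cache replacement mechanism for CDN-P2P hybrid distribution networks based on differences in content popularity; 聂华; 张敏; 郭敬荣; 阳小龙; Journal on Communications (No. S1); full text *
Browser cache pollution defense strategy; 戴成瑞; 陈伟; Journal of Computer Applications (No. 03); full text *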

Also Published As

Publication number Publication date
CN113852643A (en) 2021-12-28

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant