CN106027642A - Method and system for determining number of disks of CDN (Content Delivery Network) node - Google Patents

Method and system for determining number of disks of CDN (Content Delivery Network) node

Info

Publication number
CN106027642A
Authority
CN
China
Prior art keywords
accessed
value
hit rate
size
buffered
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610334357.2A
Other languages
Chinese (zh)
Inventor
李洪福
马宙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
LeTV Holding Beijing Co Ltd
LeTV Cloud Computing Co Ltd
Original Assignee
LeTV Holding Beijing Co Ltd
LeTV Cloud Computing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by LeTV Holding Beijing Co Ltd, LeTV Cloud Computing Co Ltd filed Critical LeTV Holding Beijing Co Ltd
Priority to CN201610334357.2A priority Critical patent/CN106027642A/en
Publication of CN106027642A publication Critical patent/CN106027642A/en
Pending legal-status Critical Current

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/50Network services
    • H04L67/56Provisioning of proxy services
    • H04L67/568Storing data temporarily at an intermediate stage, e.g. caching
    • H04L67/5682Policies or rules for updating, deleting or replacing the stored data
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00Reducing energy consumption in communication networks
    • Y02D30/50Reducing energy consumption in communication networks in wire-line communication networks, e.g. low power modes or reduced link rate

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The disclosure provides a method for determining the number of disks of a CDN (Content Delivery Network) node. The method comprises: analyzing user access logs to determine at least all accessed files and the size of each accessed file; setting a constraint condition, namely that the sum of the sizes of all cached accessed files is not greater than a preset disk space, where the preset disk space takes a plurality of space values in turn; measuring the cache hit rate as the ratio of the sum of the sizes of the cached accessed files, taken over every access, to the sum of the sizes of the accessed files, taken over every access; determining, under the constraint condition, a plurality of maximum cache hit rates corresponding to the plurality of space values; and determining the space value corresponding to the largest of the plurality of maximum cache hit rates, from which the number of disks is determined. Correspondingly, the disclosure further provides a system for determining the number of disks of a CDN node. With the method and system of the disclosure, the number of disks to be deployed at a CDN node is determined quantitatively by analyzing user access logs, so that the resulting CDN node has a more reasonable amount of storage space.

Description

Method and system for determining the number of disks of a CDN node
Technical field
The disclosure relates to the technical field of CDNs, and in particular to a method and system for determining the number of disks of a CDN node.
Background
A CDN (Content Delivery Network) is a layer of intelligent virtual network built on top of the existing Internet by placing node servers throughout the network. Based on combined information such as network traffic, the connection and load status of each node, the distance to the user and the response time, a CDN redirects a user's request in real time to the service node closest to the user. Its purpose is to deliver the requested content from a CDN node relatively close to the user, which relieves network congestion and improves website response speed. Each CDN node includes multiple disks that store the content users access, so the number of disks deployed at a CDN node directly determines how much accessed content the node can store and therefore how good a service it can provide. Because of the purchase cost of disks, however, a CDN node cannot blindly be given too many disks.
In the prior art, when a new CDN node is built or an existing CDN node is expanded, disks are deployed simply according to the experience of engineers.
However, engineers differ in experience, and a CDN node whose disk count is chosen mainly by experience may fail to provide users with high-quality service. Either too many disks are deployed, which wastes resources and increases cost, or too few are deployed, which leaves insufficient space for the content that needs to be cached, so that content cannot be served to users directly and requests frequently go back to the origin, degrading the user experience.
Summary of the invention
The disclosure provides a method and system for determining the number of disks of a CDN node, to solve at least one of the above technical problems.
In one aspect, the disclosure provides a method for determining the number of disks of a CDN node, comprising:
analyzing user access logs to determine at least all accessed files and the size of each accessed file;
setting a constraint condition: the sum of the sizes of all cached accessed files is not greater than a preset disk space, where the preset disk space takes a plurality of increasing space values in turn;
measuring the cache hit rate based on the ratio of the sum of the sizes of the cached accessed files, taken over every access, to the sum of the sizes of the accessed files, taken over every access;
determining, according to the constraint condition and the above measure of the cache hit rate, a plurality of maximum cache hit rates corresponding to the plurality of space values;
determining the space value of the preset disk space corresponding to the largest of the plurality of maximum cache hit rates, so as to determine the number of disks.
In another aspect, the disclosure also provides a system for determining the number of disks of a CDN node, comprising:
a log analysis module, configured to analyze user access logs to determine at least all accessed files and the size of each accessed file;
a constraint setting module, configured to set a constraint condition: the sum of the sizes of all cached accessed files is not greater than a preset disk space, where the preset disk space takes a plurality of increasing space values in turn;
a cache hit rate measurement module, configured to measure the cache hit rate based on the ratio of the sum of the sizes of the cached accessed files, taken over every access, to the sum of the sizes of the accessed files, taken over every access;
a cache hit rate determination module, configured to determine, according to the constraint condition and the above measure of the cache hit rate, a plurality of maximum cache hit rates corresponding to the plurality of space values;
a disk number determination module, configured to determine the space value of the preset disk space corresponding to the largest of the plurality of maximum cache hit rates, so as to determine the number of disks.
The method and system of the embodiments of the disclosure analyze user access logs and determine the maximum cache hit rate for each of several preset disk space sizes, and in this way determine quantitatively the number of disks that a CDN node needs. The resulting CDN node has a more reasonable amount of disk space: it does not waste resources, yet it caches as much accessed content as possible so that user requests are satisfied to the greatest extent.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the disclosure more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show some embodiments of the disclosure; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of one embodiment of the method of the disclosure for determining the number of disks of a CDN node;
Fig. 2 is a flowchart of another embodiment of the method of the disclosure for determining the number of disks of a CDN node;
Fig. 3 is a schematic diagram of one embodiment of the system of the disclosure for determining the number of disks of a CDN node;
Fig. 4 is a schematic diagram of another embodiment of the system of the disclosure for determining the number of disks of a CDN node.
Detailed description
To make the purpose, technical solutions and advantages of the embodiments of the disclosure clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of the disclosure. All other embodiments obtained by those of ordinary skill in the art from the embodiments of the disclosure without creative effort fall within the scope of protection of the disclosure.
It should be noted that, where no conflict arises, the embodiments of this application and the features of the embodiments may be combined with one another.
It should also be noted that relational terms such as "first" and "second" are used herein only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "comprise" and "include" cover not only the listed elements but also other elements not expressly listed, or elements inherent to the process, method, article or device. Without further limitation, an element defined by the phrase "comprising ..." does not exclude the presence of additional identical elements in the process, method, article or device that comprises it.
As shown in Fig. 1, a method for determining the number of disks of a CDN node according to one embodiment of the disclosure comprises:
analyzing user access logs to determine at least all accessed files and the size of each accessed file;
setting a constraint condition: the sum of the sizes of all cached accessed files is not greater than a preset disk space, where the preset disk space takes a plurality of increasing space values in turn;
measuring the cache hit rate based on the ratio of the sum of the sizes of the cached accessed files, taken over every access, to the sum of the sizes of the accessed files, taken over every access;
determining, according to the constraint condition and the above measure of the cache hit rate, a plurality of maximum cache hit rates corresponding to the plurality of space values;
determining the space value of the preset disk space corresponding to the largest of the plurality of maximum cache hit rates, so as to determine the number of disks.
This embodiment analyzes user access logs and determines the maximum cache hit rate for each of several preset disk space sizes, and in this way determines quantitatively the number of disks that a CDN node needs. The resulting CDN node has a more reasonable amount of disk space: it does not waste resources, yet it caches as much accessed content as possible so that user requests are satisfied to the greatest extent.
In another embodiment of the method of the disclosure for determining the number of disks of a CDN node, the constraint condition is:
a_1·X_1 + a_2·X_2 + … + a_m·X_m ≤ D, where X_k is the size of the k-th accessed file and a_k takes the value 1 or 0: a_k = 1 indicates that the accessed file of size X_k is cached, and a_k = 0 indicates that it is not cached; m is the number of accessed files, k is a positive integer from 1 to m, and D is the preset disk space.
In this embodiment the preset disk space D takes a series of increasing space values in turn. For each space value, at least one group of parameters a_1 to a_m satisfying the above constraint can be determined, that is, at least one scheme for caching accessed files. The values of a_1 to a_m in each scheme determine which accessed files need to be cached. For example, suppose the accessed files are Y_1 to Y_5 with sizes X_1 to X_5 respectively. If, for some value of D, the parameters a_1 to a_5 determined under the constraint are (1, 0, 1, 1, 0) (other assignments are of course possible), then only files Y_1, Y_3 and Y_4 need to be cached. The cache hit rate for this value of D is computed for this scheme, further cache hit rates are computed in the same way for the other schemes, and the highest of them is the maximum cache hit rate for this value of D. Likewise, D takes every candidate space value in turn, which finally yields the maximum cache hit rate for each space value.
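As a rough illustration of the constraint only, the following Python sketch enumerates every 0/1 assignment (a_1, …, a_m) whose cached sizes fit in D. The function name `feasible_schemes`, the five file sizes and the capacity are assumptions made up for this example; the patent does not prescribe an enumeration strategy.

```python
from itertools import product

def feasible_schemes(sizes, capacity):
    """Yield every 0/1 vector (a_1..a_m) whose cached sizes sum to at most capacity."""
    for a in product((0, 1), repeat=len(sizes)):
        if sum(ak * xk for ak, xk in zip(a, sizes)) <= capacity:
            yield a

# Hypothetical sizes X_1..X_5 and preset disk space D (same unit for both).
sizes = [4, 3, 2, 5, 6]
schemes = list(feasible_schemes(sizes, capacity=11))

# (1, 0, 1, 1, 0) caches Y_1, Y_3 and Y_4: 4 + 2 + 5 = 11, which fits in D = 11.
print((1, 0, 1, 1, 0) in schemes)
```

Exhaustive enumeration grows as 2^m, so it is only practical for a handful of files.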
Another embodiment of the method of the disclosure for determining the number of disks of a CDN node further comprises:
analyzing user access logs to determine the number of times each accessed file is accessed;
the plurality of maximum cache hit rates being determined by the constraint condition and the following formula:
H = (Σ a_k·b_k·X_k) / (Σ b_k·X_k), where k runs over the positive integers from 1 to m;
where b_k is the number of times the accessed file of size X_k is accessed, and H is the cache hit rate.
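The formula for H can be written directly as a small function. This is a minimal sketch; the name `hit_rate` and the two-file workload are assumptions for illustration.

```python
def hit_rate(a, counts, sizes):
    """Cache hit rate H = sum(a_k * b_k * X_k) / sum(b_k * X_k)."""
    hit_bytes = sum(ak * bk * xk for ak, bk, xk in zip(a, counts, sizes))
    total_bytes = sum(bk * xk for bk, xk in zip(counts, sizes))
    return hit_bytes / total_bytes

# Hypothetical workload: file 1 (size 2) accessed 3 times, file 2 (size 4) accessed once.
# Caching only file 1 serves 3 * 2 = 6 of the 10 requested size units.
h = hit_rate(a=(1, 0), counts=(3, 1), sizes=(2, 4))
print(h)
```

Note that H is weighted by size, so it behaves like a byte hit rate rather than a per-request hit rate.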
To make the embodiments of the disclosure clearer, they are further illustrated by analyzing a log of 6 accesses (n = 6). The 6 accesses are first identified as S1, S2, S3, S4, S5 and S6, and the files they access are txt1, txt2, txt3, txt2, txt4 and txt3 respectively. The 6 accesses therefore involve 4 files in total (m = 4): txt1 is accessed once, txt2 twice, txt3 twice and txt4 once (that is, b_1 = 1, b_2 = 2, b_3 = 2, b_4 = 1). The sizes of txt1, txt2, txt3 and txt4 are 1M, 2M, 5M and 3M respectively (that is, X_1, X_2, X_3, X_4), and a_1, a_2, a_3 and a_4 indicate whether txt1, txt2, txt3 and txt4 respectively are cached.
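The log analysis in this example can be sketched with a counter over the accessed file names. The in-memory list and the size dictionary below stand in for a real access log, whose format the patent does not specify.

```python
from collections import Counter

# The six accesses S1..S6 from the example, in order.
accesses = ["txt1", "txt2", "txt3", "txt2", "txt4", "txt3"]
# File sizes in M, as given in the example.
file_sizes = {"txt1": 1, "txt2": 2, "txt3": 5, "txt4": 3}

access_counts = Counter(accesses)
files = sorted(access_counts)            # txt1..txt4
b = [access_counts[f] for f in files]    # access counts b_1..b_4
X = [file_sizes[f] for f in files]       # sizes X_1..X_4
n, m = len(accesses), len(files)

print(n, m, b, X)
```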
From the above embodiment, the constraint condition is:
a_1·X_1 + a_2·X_2 + a_3·X_3 + a_4·X_4 = a_1 + 2a_2 + 5a_3 + 3a_4 ≤ D;
and the cache hit rate is:
H = (Σ a_k·b_k·X_k) / (Σ b_k·X_k)
  = (a_1·b_1·X_1 + a_2·b_2·X_2 + a_3·b_3·X_3 + a_4·b_4·X_4) / (b_1·X_1 + b_2·X_2 + b_3·X_3 + b_4·X_4)
  = (a_1·1·1 + a_2·2·2 + a_3·2·5 + a_4·1·3) / (1·1 + 2·2 + 2·5 + 1·3)
  = (a_1 + 4a_2 + 10a_3 + 3a_4) / 18
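As a check on the arithmetic, the sketch below compares the general formula with the simplified expression for all 16 possible assignments of (a_1, a_2, a_3, a_4). The function names are assumptions; exact fractions are used to avoid floating-point noise.

```python
from fractions import Fraction
from itertools import product

b = [1, 2, 2, 1]   # access counts b_1..b_4 from the example
X = [1, 2, 5, 3]   # file sizes X_1..X_4 in M

def hit_rate_general(a):
    """H = sum(a_k * b_k * X_k) / sum(b_k * X_k)."""
    return Fraction(sum(ak * bk * xk for ak, bk, xk in zip(a, b, X)),
                    sum(bk * xk for bk, xk in zip(b, X)))

def hit_rate_simplified(a):
    """H = (a_1 + 4*a_2 + 10*a_3 + 3*a_4) / 18."""
    return Fraction(a[0] + 4 * a[1] + 10 * a[2] + 3 * a[3], 18)

agree = all(hit_rate_general(a) == hit_rate_simplified(a)
            for a in product((0, 1), repeat=4))
print(agree, hit_rate_simplified((1, 1, 1, 0)))
```

For instance, caching txt1, txt2 and txt3 gives H = 15/18 = 5/6.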
In this embodiment, taking a preset disk space D of 48T as a first example, multiple groups of parameters (a_1, a_2, a_3, a_4) satisfying the constraint a_1 + 2a_2 + 5a_3 + 3a_4 ≤ 48T are determined; a cache hit rate is then computed for each group, and the maximum cache hit rate is determined. The maximum cache hit rates for D = …, 60T, 72T, 84T, 96T, … are then determined in the same way.
In the above embodiment, when the number m of parameters a_k is greater than 4, that is, when there are many accessed files, a knapsack algorithm can be used to compute the values of a_1, a_2, …, a_m.
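The patent names a knapsack algorithm without giving details. One plausible reading, sketched below, is the classic 0/1 knapsack dynamic program with file sizes as weights and b_k·X_k as values. The function name `best_cache_set`, the integer size unit and the capacity of 8 are assumptions for illustration.

```python
def best_cache_set(sizes, counts, capacity):
    """0/1 knapsack: choose a_k in {0, 1} maximizing sum(a_k * b_k * X_k)
    subject to sum(a_k * X_k) <= capacity. Sizes and capacity are integers."""
    m = len(sizes)
    values = [bk * xk for bk, xk in zip(counts, sizes)]
    # best[k][c]: most hit volume achievable with the first k files and space c.
    best = [[0] * (capacity + 1) for _ in range(m + 1)]
    for k in range(1, m + 1):
        w, v = sizes[k - 1], values[k - 1]
        for c in range(capacity + 1):
            best[k][c] = best[k - 1][c]
            if w <= c and best[k - 1][c - w] + v > best[k][c]:
                best[k][c] = best[k - 1][c - w] + v
    # Walk back through the table to recover which files were chosen.
    a, c = [0] * m, capacity
    for k in range(m, 0, -1):
        if best[k][c] != best[k - 1][c]:
            a[k - 1] = 1
            c -= sizes[k - 1]
    return best[m][capacity] / sum(values), a

# Example workload from the text, with a hypothetical D of 8 (in M).
h_max, a = best_cache_set(sizes=[1, 2, 5, 3], counts=[1, 2, 2, 1], capacity=8)
print(h_max, a)
```

The table has (m + 1) × (D + 1) entries, so for real workloads sizes would need to be expressed in a coarse unit to keep D manageable.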
In this embodiment, the space values taken by the preset disk space D are generally integer multiples of the size of a single disk. For example, if a single disk is 12T, then 60T, 72T, 84T and 96T are all multiples of 12T, which keeps the calculation close to practical use.
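Sweeping D over multiples of the single-disk size can be sketched as follows. To keep the numbers small, this example reuses the four-file workload and assumes a hypothetical disk size of 2M instead of 12T; `max_hit_rate` is an illustrative brute-force helper, not the patent's procedure.

```python
from fractions import Fraction
from itertools import product

def max_hit_rate(sizes, counts, capacity):
    """Brute-force maximum of H over all 0/1 assignments that fit in capacity."""
    total = sum(bk * xk for bk, xk in zip(counts, sizes))
    best = 0
    for a in product((0, 1), repeat=len(sizes)):
        if sum(ak * xk for ak, xk in zip(a, sizes)) <= capacity:
            hit = sum(ak * bk * xk for ak, bk, xk in zip(a, counts, sizes))
            best = max(best, hit)
    return Fraction(best, total)

sizes, counts = [1, 2, 5, 3], [1, 2, 2, 1]
disk_size = 2                                        # hypothetical single-disk size
candidates = [n * disk_size for n in range(1, 7)]    # 2, 4, ..., 12
curve = {d: max_hit_rate(sizes, counts, d) for d in candidates}

for d, h in curve.items():
    print(d, h)
```

Once D reaches the total size of all accessed files (11 here), the maximum cache hit rate is 1 and further disks add nothing.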
As shown in Fig. 2, in the method for determining the number of disks of a CDN node according to another embodiment of the disclosure, determining the space value of the preset disk space corresponding to the largest of the plurality of maximum cache hit rates so as to determine the number of disks comprises:
selecting the largest of the determined maximum cache hit rates;
weighting the space value of the preset disk space corresponding to that largest maximum cache hit rate, dividing the result by the size of a single disk, and taking the integer part as the number of disks.
In this embodiment, the largest value is selected from the maximum cache hit rates obtained for the different values of the preset disk space D, and the value of D corresponding to it is then determined.
However, plotting the maximum cache hit rate (y-axis) against the preset disk space D (x-axis) shows that, once D has grown beyond a certain point, the maximum cache hit rate increases more and more slowly. Given this marginal effect (the gain in maximum cache hit rate per unit of added disk space keeps shrinking), using the value of D at which the cache hit rate is highest as the disk space of the CDN node is in practice not ideal: beyond that point, a further increase in D brings only a negligible gain in maximum cache hit rate while wasting the added disk space. Therefore, in this embodiment the value of D corresponding to the largest cache hit rate is not used directly as the final disk space of the CDN node. Instead, it is weighted before being used to determine the number of disks. The weight coefficient in this embodiment is a positive number less than 1, preferably between 0.7 and 0.9.
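The marginal effect can be made concrete by computing the hit-rate gain per unit of added disk space between consecutive candidates. The curve values below are invented purely for illustration and do not come from the patent or from measurements.

```python
# Hypothetical (D in T, maximum cache hit rate) points, made up for illustration only.
curve = [(48, 0.60), (60, 0.75), (72, 0.84), (84, 0.88), (96, 0.89)]

def marginal_gains(points):
    """Hit-rate gain per extra T of disk space between consecutive candidates."""
    return [
        (d1, (h1 - h0) / (d1 - d0))
        for (d0, h0), (d1, h1) in zip(points, points[1:])
    ]

gains = marginal_gains(curve)
for d, g in gains:
    print(d, round(g, 4))
```

With numbers shaped like these, each additional 12T disk buys less hit rate than the previous one, which is the motivation for scaling D down by a weight coefficient.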
Further, take as an example a finally determined preset disk space D of 96T, a weight coefficient of 0.8 and a single-disk size of 12T: 96T × 0.8 / 12T = 6.4, whose integer part is 6. That is, deploying only six 12T disks at the CDN node achieves a relatively high cache hit rate with a better cost-performance ratio.
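The final step is a single calculation. A minimal sketch, assuming the integer part is obtained by rounding down; the function name `disk_count` is hypothetical.

```python
import math

def disk_count(best_space, weight, disk_size):
    """Number of disks: weighted space divided by single-disk size, rounded down."""
    return math.floor(best_space * weight / disk_size)

# Values from the example: D = 96T, weight coefficient 0.8, 12T per disk.
disks = disk_count(best_space=96, weight=0.8, disk_size=12)
print(disks)
```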
In the embodiments of the disclosure, the relevant functional modules may be implemented by a hardware processor.
It should be noted that, for brevity, each of the foregoing method embodiments is described as a series of combined actions. Those skilled in the art should understand, however, that the disclosure is not limited by the described order of actions, because according to the disclosure some steps may be performed in other orders or simultaneously. Those skilled in the art should also understand that the embodiments described in this specification are preferred embodiments, and that the actions and modules involved are not necessarily required by the disclosure.
In the above embodiments, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, refer to the related description of the other embodiments.
As shown in Fig. 3, a system for determining the number of disks of a CDN node according to one embodiment of the disclosure comprises:
a log analysis module, configured to analyze user access logs to determine at least all accessed files and the size of each accessed file;
a constraint setting module, configured to set a constraint condition: the sum of the sizes of all cached accessed files is not greater than a preset disk space, where the preset disk space takes a plurality of increasing space values in turn;
a cache hit rate measurement module, configured to measure the cache hit rate based on the ratio of the sum of the sizes of the cached accessed files, taken over every access, to the sum of the sizes of the accessed files, taken over every access;
a cache hit rate determination module, configured to determine, according to the constraint condition and the above measure of the cache hit rate, a plurality of maximum cache hit rates corresponding to the plurality of space values;
a disk number determination module, configured to determine the space value of the preset disk space corresponding to the largest of the plurality of maximum cache hit rates, so as to determine the number of disks.
This embodiment analyzes user access logs and determines the maximum cache hit rate for each of several preset disk space sizes, and in this way determines quantitatively the number of disks that a CDN node needs. The resulting CDN node has a more reasonable amount of disk space: it does not waste resources, yet it caches as much accessed content as possible so that user requests are satisfied to the greatest extent.
In another embodiment of the system of the disclosure for determining the number of disks of a CDN node, the constraint condition is:
a_1·X_1 + a_2·X_2 + … + a_m·X_m ≤ D, where X_k is the size of the k-th accessed file and a_k takes the value 1 or 0: a_k = 1 indicates that the accessed file of size X_k is cached, and a_k = 0 indicates that it is not cached; m is the number of accessed files, k is a positive integer from 1 to m, and D is the preset disk space.
In this embodiment the preset disk space D takes a series of increasing space values in turn. For each space value, at least one group of parameters a_1 to a_m satisfying the above constraint can be determined, that is, at least one scheme for caching accessed files. The values of a_1 to a_m in each scheme determine which accessed files need to be cached. For example, suppose the accessed files are Y_1 to Y_5 with sizes X_1 to X_5 respectively. If, for some value of D, the parameters a_1 to a_5 determined under the constraint are (1, 0, 1, 1, 0) (other assignments are of course possible), then only files Y_1, Y_3 and Y_4 need to be cached. The cache hit rate for this value of D is computed for this scheme, further cache hit rates are computed in the same way for the other schemes, and the highest of them is the maximum cache hit rate for this value of D. Likewise, D takes every candidate space value in turn, which finally yields the maximum cache hit rate for each space value.
In another embodiment of the system of the disclosure for determining the number of disks of a CDN node, the log analysis module is further configured to analyze user access logs to determine the number of times each accessed file is accessed;
the plurality of maximum cache hit rates being determined by the constraint condition and the following formula:
H = (Σ a_k·b_k·X_k) / (Σ b_k·X_k), where k runs over the positive integers from 1 to m;
where b_k is the number of times the accessed file of size X_k is accessed, and H is the cache hit rate.
As shown in Fig. 4, in the system for determining the number of disks of a CDN node according to another embodiment of the disclosure, the disk number determination module comprises:
a maximum cache hit rate selection unit, configured to select the largest of the determined maximum cache hit rates;
a disk number determination unit, configured to weight the space value of the preset disk space corresponding to that largest maximum cache hit rate, divide the result by the size of a single disk, and take the integer part as the number of disks.
In this embodiment, the largest value is selected from the maximum cache hit rates obtained for the different values of the preset disk space D, and the value of D corresponding to it is then determined.
However, plotting the maximum cache hit rate (y-axis) against the preset disk space D (x-axis) shows that, once D has grown beyond a certain point, the maximum cache hit rate increases more and more slowly. Given this marginal effect (the gain in maximum cache hit rate per unit of added disk space keeps shrinking), using the value of D at which the cache hit rate is highest as the disk space of the CDN node is in practice not ideal: beyond that point, a further increase in D brings only a negligible gain in maximum cache hit rate while wasting the added disk space. Therefore, in this embodiment the value of D corresponding to the largest cache hit rate is not used directly as the final disk space of the CDN node. Instead, it is weighted before being used to determine the number of disks. The weight coefficient in this embodiment is a positive number less than 1, preferably between 0.7 and 0.9.
Further, take as an example a finally determined preset disk space D of 96T, a weight coefficient of 0.8 and a single-disk size of 12T: 96T × 0.8 / 12T = 6.4, whose integer part is 6. That is, deploying only six 12T disks at the CDN node achieves a relatively high cache hit rate with a better cost-performance ratio.
In some embodiments, the space values taken by the preset disk space D are generally integer multiples of the size of a single disk. For example, if a single disk is 12T, then 60T, 72T, 84T and 96T are all multiples of 12T, which keeps the calculation close to practical use.
In another aspect, embodiments of the disclosure also provide a server, the server comprising:
a memory, configured to store a program;
a processor, configured to execute the program stored in the memory, the program causing the processor to perform the method for determining the number of disks of a CDN node according to any one of the above embodiments of the disclosure.
In another aspect, embodiments of the disclosure also provide a server on which the system for determining the number of disks of a CDN node according to any one of the above embodiments of the disclosure is deployed.
Embodiment of the method described above is only schematically, wherein said illustrates as separating component Unit can be or may not be physically separate, the parts shown as unit can be or Person may not be physical location, i.e. may be located at a place, or can also be distributed to multiple network On unit.Some or all of module therein can be selected according to the actual needs to realize the present embodiment The purpose of scheme.Those of ordinary skill in the art are not in the case of paying performing creative labour, the most permissible Understand and implement.
Through the above description of the embodiments, those skilled in the art is it can be understood that arrive each reality The mode of executing can add the mode of required general hardware platform by software and realize, naturally it is also possible to by firmly Part.Based on such understanding, the portion that prior art is contributed by technique scheme the most in other words Dividing and can embody with the form of software product, this computer software product can be stored in computer can Read in storage medium, such as ROM/RAM, magnetic disc, CD etc., including some instructions with so that one Computer equipment (can be personal computer, server, or the network equipment etc.) performs each to be implemented The method described in some part of example or embodiment.
Those skilled in the art it should be appreciated that embodiment of the present disclosure can be provided as method, system, Or computer program.Therefore, the disclosure can use complete hardware embodiment, complete software to implement Mode or the form of the embodiment in terms of combining software and hardware.And, the disclosure can use one Individual or multiple wherein include computer usable program code computer-usable storage medium (include but not It is limited to disk memory and optical memory etc.) form of the upper computer program implemented.
The disclosure is with reference to method, equipment (system) and the computer program according to disclosure embodiment The flow chart of product and/or block diagram describe.It should be understood that flow process can be realized by computer program instructions Flow process in each flow process in figure and/or block diagram and/or square frame and flow chart and/or block diagram And/or the combination of square frame.Can provide these computer program instructions to general purpose computer, special-purpose computer, The processor of Embedded Processor or other programmable data processing device is to produce a machine so that logical The instruction of the processor execution crossing computer or other programmable data processing device produces for realizing at stream The function specified in one flow process of journey figure or multiple flow process and/or one square frame of block diagram or multiple square frame Device.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture that includes instruction means, the instruction means implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram. These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce a computer-implemented process. The instructions executed on the computer or other programmable device thereby provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the disclosure, not to limit them. Although the disclosure has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they can still modify the technical solutions described in the foregoing embodiments, or make equivalent replacements of some of their technical features, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the disclosure.

Claims (10)

1. A method for determining the number of disks of a CDN node, comprising:
analyzing user access logs to determine at least all accessed files and the size of each accessed file;
setting a constraint: the sum of the sizes of all cached accessed files is not greater than a predetermined disk space, wherein the predetermined disk space successively takes multiple increasing space values;
measuring the cache hit rate as the ratio of the sum of the sizes of the cached accessed files on each access to the sum of the sizes of the accessed files on each access;
determining, according to the constraint and the above measure of the cache hit rate, multiple maximum cache hit rates corresponding to the multiple space values;
determining the space value of the predetermined disk space that corresponds to the largest of the multiple maximum cache hit rates, so as to determine the number of disks.
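The log analysis step of claim 1 can be sketched as follows. This is only a minimal illustration, not the applicant's implementation: the record layout (one `(file_id, size)` pair per access) and the function name `summarize_access_log` are assumptions, since the claims do not specify a log format. The sketch also counts accesses per file, which the hit-rate measure needs.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def summarize_access_log(
    records: Iterable[Tuple[str, int]],
) -> Dict[str, Tuple[int, int]]:
    """Map each accessed file to (size, access_count).

    Each record is a hypothetical (file_id, size) pair, one per access.
    """
    sizes: Dict[str, int] = {}
    counts: Counter = Counter()
    for file_id, size in records:
        sizes[file_id] = size   # size of the accessed file
        counts[file_id] += 1    # one more access to this file
    return {fid: (sizes[fid], counts[fid]) for fid in sizes}


# Hypothetical log: five accesses to three files (sizes in arbitrary units).
log = [("a.mp4", 4), ("b.mp4", 3), ("a.mp4", 4), ("c.mp4", 2), ("a.mp4", 4)]
summary = summarize_access_log(log)
```

The resulting sizes and access counts are the inputs that the constraint and the hit-rate measure operate on.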
2. The method according to claim 1, wherein the constraint is:
$a_1X_1 + a_2X_2 + \dots + a_mX_m \le D$, where $X_k$ is the size of an accessed file and $a_k$ takes the value 1 or 0: $a_k = 1$ indicates that the accessed file of size $X_k$ is cached, and $a_k = 0$ indicates that the accessed file of size $X_k$ is not cached; $m$ is the number of accessed files, $k$ is a positive integer from 1 to $m$, and $D$ is the predetermined disk space.
3. The method according to claim 2, further comprising:
analyzing user access logs to determine the number of times each accessed file was accessed;
wherein the multiple maximum cache hit rates are determined by the constraint and the following formula:
$H = \left(\sum_{k=1}^{m} a_k b_k X_k\right) / \left(\sum_{k=1}^{m} b_k X_k\right)$, where $k$ is a positive integer from 1 to $m$;
where $b_k$ is the number of times the accessed file of size $X_k$ was accessed, and $H$ is the cache hit rate.
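Maximizing H under the constraint of claim 2 has the shape of a 0/1 knapsack problem: each file has weight X_k and value b_k·X_k, and the capacity is D. The claims do not name a solving technique, so the dynamic-programming approach below is an assumption, as are the function name `max_hit_rate` and the requirement that sizes be non-negative integers in a common unit.

```python
from typing import List, Sequence, Tuple


def max_hit_rate(
    sizes: Sequence[int], counts: Sequence[int], capacity: int
) -> Tuple[float, List[int]]:
    """Return (H_max, a) maximizing H = sum(a_k*b_k*X_k) / sum(b_k*X_k)
    subject to sum(a_k*X_k) <= capacity, with each a_k in {0, 1}."""
    m = len(sizes)
    # best[k][d]: largest cached traffic using the first k files within space d
    best = [[0] * (capacity + 1) for _ in range(m + 1)]
    for k in range(1, m + 1):
        x = sizes[k - 1]
        value = counts[k - 1] * x  # traffic served from cache if file k is cached
        for d in range(capacity + 1):
            best[k][d] = best[k - 1][d]  # case a_k = 0
            if x <= d:                   # case a_k = 1, if the file fits
                best[k][d] = max(best[k][d], best[k - 1][d - x] + value)
    # Walk back through the table to recover the a_k indicators.
    a = [0] * m
    d = capacity
    for k in range(m, 0, -1):
        if best[k][d] != best[k - 1][d]:
            a[k - 1] = 1
            d -= sizes[k - 1]
    total = sum(b * x for b, x in zip(counts, sizes))
    return (best[m][capacity] / total if total else 0.0), a


# Sizes X = [4, 3, 2], access counts b = [3, 1, 1], disk space D = 6.
h, a = max_hit_rate([4, 3, 2], [3, 1, 1], 6)
```

In this example the files of size 4 and 2 are cached (a = [1, 0, 1]), giving H = 14/17. The table has (m + 1)·(D + 1) entries, so for real logs the sizes would need to be expressed in a coarse unit to keep D small.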
4. The method according to any one of claims 1-3, wherein determining the space value of the predetermined disk space that corresponds to the largest of the multiple maximum cache hit rates, so as to determine the number of disks, comprises:
selecting the largest maximum cache hit rate from the determined multiple maximum cache hit rates;
weighting the space value of the predetermined disk space that corresponds to the selected maximum cache hit rate, dividing the weighted value by the size of a single disk, and rounding the result to an integer to obtain the number of disks.
5. The method according to claim 4, wherein the space values taken by the predetermined disk space are integer multiples of the size of a single disk.
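The selection step of claims 4 and 5 can be sketched as a sweep over candidate disk spaces that are integer multiples of the single-disk size. Several details here are assumptions rather than part of the claims: the names `best_cached_traffic` and `choose_disk_count`, the upper bound `max_disks`, the default `weight` of 1.0, rounding up with `math.ceil`, and keeping the smallest space value when several candidates tie for the largest hit rate.

```python
import math
from typing import Sequence, Tuple


def best_cached_traffic(
    sizes: Sequence[int], counts: Sequence[int], capacity: int
) -> int:
    """0/1 knapsack: largest sum of b_k*X_k over cached files within capacity."""
    best = [0] * (capacity + 1)
    for x, b in zip(sizes, counts):
        # Iterate downwards so that each file is cached at most once.
        for d in range(capacity, x - 1, -1):
            best[d] = max(best[d], best[d - x] + b * x)
    return best[capacity]


def choose_disk_count(
    sizes: Sequence[int],
    counts: Sequence[int],
    disk_size: int,
    max_disks: int,
    weight: float = 1.0,
) -> Tuple[int, float]:
    """Return (number_of_disks, largest_max_hit_rate)."""
    total = sum(b * x for b, x in zip(counts, sizes))
    if total <= 0:
        raise ValueError("the access log contains no traffic")
    best_rate, best_space = -1.0, 0
    for n in range(1, max_disks + 1):
        space = n * disk_size  # candidate D, an integer multiple of one disk
        rate = best_cached_traffic(sizes, counts, space) / total
        if rate > best_rate:   # strict comparison keeps the smallest space on ties
            best_rate, best_space = rate, space
    disks = math.ceil(weight * best_space / disk_size)
    return disks, best_rate


# Sizes X = [4, 3, 2], access counts b = [3, 1, 1], disks of size 4, up to 4 disks.
result = choose_disk_count([4, 3, 2], [3, 1, 1], disk_size=4, max_disks=4)
```

Because extra space can never lower the best achievable hit rate, the maximum hit rates are non-decreasing in D. The tie-breaking rule therefore matters in practice: without it, the sweep would always return the largest candidate space.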
6. A system for determining the number of disks of a CDN node, comprising:
a log analysis module, configured to analyze user access logs to determine at least all accessed files and the size of each accessed file;
a constraint setting module, configured to set a constraint: the sum of the sizes of all cached accessed files is not greater than a predetermined disk space, wherein the predetermined disk space successively takes multiple increasing space values;
a cache hit rate measurement module, configured to measure the cache hit rate as the ratio of the sum of the sizes of the cached accessed files on each access to the sum of the sizes of the accessed files on each access;
a cache hit rate determination module, configured to determine, according to the constraint and the above measure of the cache hit rate, multiple maximum cache hit rates corresponding to the multiple space values;
a disk number determination module, configured to determine the space value of the predetermined disk space that corresponds to the largest of the multiple maximum cache hit rates, so as to determine the number of disks.
7. The system according to claim 6, wherein the constraint is:
$a_1X_1 + a_2X_2 + \dots + a_mX_m \le D$, where $X_k$ is the size of an accessed file and $a_k$ takes the value 1 or 0: $a_k = 1$ indicates that the accessed file of size $X_k$ is cached, and $a_k = 0$ indicates that the accessed file of size $X_k$ is not cached; $m$ is the number of accessed files, $k$ is a positive integer from 1 to $m$, and $D$ is the predetermined disk space.
8. The system according to claim 7, wherein the log analysis module is further configured to analyze user access logs to determine the number of times each accessed file was accessed;
the multiple maximum cache hit rates are determined by the constraint and the following formula:
$H = \left(\sum_{k=1}^{m} a_k b_k X_k\right) / \left(\sum_{k=1}^{m} b_k X_k\right)$, where $k$ is a positive integer from 1 to $m$;
where $b_k$ is the number of times the accessed file of size $X_k$ was accessed, and $H$ is the cache hit rate.
9. The system according to any one of claims 6-8, wherein the disk number determination module comprises:
a maximum cache hit rate selection unit, configured to select the largest maximum cache hit rate from the determined multiple maximum cache hit rates;
a disk number determination unit, configured to weight the space value of the predetermined disk space that corresponds to the selected maximum cache hit rate, divide the weighted value by the size of a single disk, and round the result to an integer to obtain the number of disks.
10. The system according to claim 9, wherein the space values taken by the predetermined disk space are integer multiples of the size of a single disk.
CN201610334357.2A 2016-05-19 2016-05-19 Method and system for determining number of disks of CDN (Content Delivery Network) node Pending CN106027642A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610334357.2A CN106027642A (en) 2016-05-19 2016-05-19 Method and system for determining number of disks of CDN (Content Delivery Network) node

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610334357.2A CN106027642A (en) 2016-05-19 2016-05-19 Method and system for determining number of disks of CDN (Content Delivery Network) node

Publications (1)

Publication Number Publication Date
CN106027642A true CN106027642A (en) 2016-10-12

Family

ID=57095306

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610334357.2A Pending CN106027642A (en) 2016-05-19 2016-05-19 Method and system for determining number of disks of CDN (Content Delivery Network) node

Country Status (1)

Country Link
CN (1) CN106027642A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107038127A (en) * 2017-02-08 2017-08-11 阿里巴巴集团控股有限公司 Application system and its buffer control method and device
CN109005056A (en) * 2018-07-16 2018-12-14 网宿科技股份有限公司 Storage capacity evaluation method and apparatus based on CDN application
CN109683816A (en) * 2018-12-14 2019-04-26 北京奇艺世纪科技有限公司 The disk configuration method and system of a kind of time source tree node
CN110401553A (en) * 2018-04-25 2019-11-01 阿里巴巴集团控股有限公司 The method and apparatus of server configuration
CN110933140A (en) * 2019-11-05 2020-03-27 北京字节跳动网络技术有限公司 CDN storage allocation method, system and electronic equipment

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101135994A (en) * 2007-09-07 2008-03-05 杭州华三通信技术有限公司 Method and apparatus for dividing cache space and cache controller thereof
CN102546716A (en) * 2010-12-23 2012-07-04 中国移动通信集团公司 Buffer management method, device and streaming media on-demand system
WO2013090126A1 (en) * 2011-12-16 2013-06-20 Microsoft Corporation Application-driven cdn pre-caching
US20130227051A1 (en) * 2012-01-10 2013-08-29 Edgecast Networks, Inc. Multi-Layer Multi-Hit Caching for Long Tail Content
CN104484134A (en) * 2014-12-23 2015-04-01 北京华胜天成科技股份有限公司 Method and device for allocating distributed type storage magnetic discs
CN104794064A (en) * 2015-04-21 2015-07-22 华中科技大学 Cache management method based on region heat degree
CN105022587A (en) * 2014-04-24 2015-11-04 中国移动通信集团设计院有限公司 Method for designing magnetic disk array and storage device for magnetic disk array

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107038127A (en) * 2017-02-08 2017-08-11 阿里巴巴集团控股有限公司 Application system and its buffer control method and device
CN110401553A (en) * 2018-04-25 2019-11-01 阿里巴巴集团控股有限公司 The method and apparatus of server configuration
CN110401553B (en) * 2018-04-25 2022-06-03 阿里巴巴集团控股有限公司 Server configuration method and device
US11431669B2 (en) 2018-04-25 2022-08-30 Alibaba Group Holding Limited Server configuration method and apparatus
CN109005056A (en) * 2018-07-16 2018-12-14 网宿科技股份有限公司 Storage capacity evaluation method and apparatus based on CDN application
US11005717B2 (en) 2018-07-16 2021-05-11 Wangsu Science & Technology Co., Ltd. Storage capacity evaluation method based on content delivery network application and device thereof
CN109683816A (en) * 2018-12-14 2019-04-26 北京奇艺世纪科技有限公司 The disk configuration method and system of a kind of time source tree node
CN109683816B (en) * 2018-12-14 2021-08-27 北京奇艺世纪科技有限公司 Disk configuration method and system for back source tree nodes
CN110933140A (en) * 2019-11-05 2020-03-27 北京字节跳动网络技术有限公司 CDN storage allocation method, system and electronic equipment
CN110933140B (en) * 2019-11-05 2021-12-24 北京字节跳动网络技术有限公司 CDN storage allocation method, system and electronic equipment

Similar Documents

Publication Publication Date Title
CN106027642A (en) Method and system for determining number of disks of CDN (Content Delivery Network) node
US20200050968A1 (en) Interactive interfaces for machine learning model evaluations
CA2953959C (en) Feature processing recipes for machine learning
CN103345514B (en) Streaming data processing method under big data environment
CN110019396A (en) A kind of data analysis system and method based on distributed multidimensional analysis
CN105593818A (en) Apparatus and method for scheduling distributed workflow tasks
CN105279240B (en) The metadata forecasting method and system of client origin information association perception
CN110175154A (en) A kind of processing method of log recording, server and storage medium
WO2009103221A1 (en) Effective relating theme model data processing method and system thereof
CN108268710A (en) A kind of IMA system dynamic restructuring policy optimization methods based on genetic algorithm
CN105488366A (en) Data permission control method and system
CN106202092A (en) The method and system that data process
CN107783734A (en) A kind of resource allocation methods, device and terminal based on super fusion storage system
CN109583799A (en) The method and device of region division, electronic equipment
CN110289994A (en) A kind of cluster capacity adjustment method and device
EP3076310A1 (en) Variable virtual split dictionary for search optimization
CN107562532A (en) A kind of method and device for the hardware resource utilization for predicting device clusters
CN107102896A (en) A kind of operating method of multi-level buffer, device and electronic equipment
CN106790529A (en) The dispatching method of computing resource, control centre and scheduling system
CN110297869A (en) A kind of AI Data Warehouse Platform and operating method
US20220019587A1 (en) Access path optimization
CN108733694A (en) Method and apparatus are recommended in retrieval
CN106790258B (en) A kind of method and system of screening server network request
CN104216887B (en) Method and apparatus for being summarized to sampled data
CN107016050A (en) Data processing method and device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20161012