CN106453546B - The method of distributed storage scheduling - Google Patents
- Publication number: CN106453546B
- Application number: CN201610875745.1A
- Authority: CN (China)
- Prior art keywords: evaluation, storage node, matrix, factor, value
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1097—Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L45/00—Routing or path finding of packets in data switching networks
- H04L45/12—Shortest path evaluation
- H04L45/122—Shortest path evaluation by minimising distances, e.g. by selecting a route with minimum of number of hops
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/50—Queue scheduling
Abstract
The present invention relates to a method of distributed storage scheduling, comprising: a. establishing evaluation indexes: obtaining an evaluation matrix for m storage nodes; b. data standardization: removing the effect of differing dimensions (units) from each datum in the evaluation matrix to obtain an evaluation index value matrix; c. pairwise comparison using triangular fuzzy numbers: obtaining a judgment matrix composed of triangular fuzzy numbers, the weight vector of the evaluation factors, a weighted evaluation matrix, and the weight of each evaluation factor; d. finding the positive and negative ideal values: finding the positive and negative ideal values, computing the distance from each storage node to the ideal values, ranking the storage nodes, and selecting the optimal storage node. The method analyzes the various factors that influence scheduling together, so that the most suitable storage node is selected to respond. It greatly improves the data transmission performance and storage efficiency of networked remote distributed storage and markedly improves the storage quality of distributed storage.
Description
Technical field
The present invention relates to distributed storage methods in cloud storage, and specifically to a method of scheduling distributed storage in cloud storage.
Background art
In the field of distributed storage, Cinder is a very widely used distributed storage architecture. Its scheduling is divided into two stages, filtering and weighting. When a storage request arrives, the filtering stage first screens for storage nodes that satisfy the requirements. Screening has only two outcomes for each storage node, satisfied or not satisfied: nodes that satisfy the requirements enter the queue awaiting weighting, and the rest are eliminated. The weighting stage then ranks the qualifying storage nodes and chooses the most suitable one, which provides the storage service for the request. At present, the criterion used by the filtering stage is whether the storage node has enough storage space to carry out the storage request: if so, the node is placed in the queue for the later weighting stage; otherwise it is not considered. The weighting stage then ranks the qualifying storage nodes by remaining storage space and chooses the node with the most remaining space to provide the service. After these two steps, the scheduling of a request sent to Cinder is complete.
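The two-stage process described above can be illustrated with a minimal sketch. This is not Cinder's actual code; the `Backend` class, the `schedule` function, and the node names are hypothetical and only mirror the filter-then-weigh logic as the text describes it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Backend:
    """Hypothetical storage node with its remaining capacity in GB."""
    name: str
    free_gb: float


def schedule(backends: List[Backend], request_gb: float) -> Optional[Backend]:
    """Filter by capacity, then weigh by remaining space only."""
    # Filtering stage: keep only nodes with enough space for the request.
    candidates = [b for b in backends if b.free_gb >= request_gb]
    if not candidates:
        return None
    # Weighting stage: the node with the most remaining space wins.
    return max(candidates, key=lambda b: b.free_gb)


backends = [
    Backend("node-a", 50.0),
    Backend("node-b", 400.0),
    Backend("node-c", 120.0),
]
chosen = schedule(backends, 100.0)
print(chosen.name if chosen else "no node available")
```

Nothing in this sketch looks at network or load conditions, which is exactly the limitation of single-objective scheduling.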
Cinder's scheduling method plays a key role in the service quality of distributed storage, but current Cinder scheduling has the following problem: a scheduling method whose only objective is remaining storage space cannot guarantee the service quality of cloud storage. For example, when a storage node suffers from serious network congestion but has the most remaining storage space, Cinder is not aware of the congestion and still chooses that node to handle the storage request, even though the congestion clearly makes it a poor service node. More generally, a scheduling method that considers only the remaining storage space of the storage nodes cannot achieve the best overall scheduling performance. Many factors besides remaining space affect the service quality of a storage node, and optimal scheduling can be achieved only by taking these factors into account together, that is, by multi-dimensional scheduling.
Summary of the invention
The present invention provides a method of distributed storage scheduling that overcomes the defect of current single-objective scheduling, which cannot dispatch requests to the service node with the best performance. It makes the selection of storage nodes more comprehensive and improves storage efficiency and quality.
The method of distributed storage scheduling of the present invention includes the following steps:
a. Establish evaluation indexes: according to the scheduling request, collect the evaluation factors that influence scheduling, then analyze the correlation between each evaluation factor and scheduling to obtain an evaluation matrix for m storage nodes, where m is a natural number;
b. Data standardization: standardization is a common way of processing data. In this method, a standardization formula removes the effect of differing dimensions (units) from each datum in the evaluation matrix, giving the standardized evaluation index value matrix;
c. Pairwise comparison using triangular fuzzy numbers: the storage nodes in the evaluation index value matrix are compared pairwise using triangular fuzzy numbers. For example, the comparison between two storage nodes m and n can be expressed as $r_{mn} = (a, b, c)$, where the middle value b indicates the degree of importance and the two boundary values a and c indicate the degree of fuzziness: the larger the difference b − a, the fuzzier the comparison between the two nodes, and a difference of 0 means the comparison is not fuzzy. By the same reasoning, $r_{nm} = r_{mn}^{-1} = (1/c, 1/b, 1/a)$ indicates the importance of node n relative to node m. The judgment matrix composed of triangular fuzzy numbers and the weight vector of the evaluation factors are then obtained, together with the weighted evaluation matrix; normalizing the weight vector gives the weight of each evaluation factor;
d. Find the positive and negative ideal values: find the positive and negative ideal values on the weighted evaluation matrix; use the weighted Manhattan distance formula to compute the distance from each storage node in the weighted evaluation matrix to the positive and negative ideal values; then use the closeness to define overall performance and compute the comprehensive evaluation value of each storage node; rank the storage nodes by comprehensive evaluation value and select the storage node with the smallest comprehensive evaluation value as the responding storage node for the scheduling request.
Establishing the evaluation indexes is the basis of the analysis. The factors that influence scheduling can be identified by following the whole process from the moment a scheduling request is issued until it is handled by a storage node. This process involves first the client that issues the request, then the network that transmits it, and finally the service node that handles it. The evaluation factors can therefore be taken to include client factors, network factors, and server-side factors.
Specifically, the client factor includes the number of hops from the client to the server, that is, the distance from the client to the storage node; the network factors include the packet loss rate during network transmission, whether the network is unobstructed, and so on; the server-side factors include processor load, memory usage, and storage space occupancy. Hop count, packet loss rate, processor load, memory usage, and storage space occupancy are thus the 5 main factors that jointly determine storage quality.
Optionally, step a uses the Pearson product-moment correlation coefficient to analyze the correlation between each evaluation factor and scheduling, which gives the correlation $r = \dfrac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i - \bar{y})^2}}$, where $\bar{x}$ is the mean of the test sample x, $\bar{y}$ is likewise the mean of y, and n is the sample size. The value of r lies between −1 and +1: a positive value indicates positive correlation and a negative value indicates negative correlation.
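As an illustration, the Pearson product-moment correlation coefficient can be computed as in the sketch below, which assumes the standard textbook definition. The function name `pearson_r` and the sample values (packet loss rate against an observed delay) are hypothetical and are not taken from the patent.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson product-moment correlation of two equally long samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("samples must have the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the two means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical samples: delay grows linearly with packet loss.
packet_loss = [0.0, 0.01, 0.02, 0.05]
delay_ms = [10.0, 12.0, 14.0, 20.0]
print(round(pearson_r(packet_loss, delay_ms), 6))
```

A result near +1 or −1 suggests the factor is strongly related to the scheduling outcome, while a value near 0 suggests the factor carries little information.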
Further, in step b, different standardization formulas are used to remove the effect of dimensions, depending on how the merit of each datum in the evaluation matrix relates to the size of its value. For data where a larger value is better, the standardized datum p(i, j) can be computed as $p(i,j) = n(i,j) / [n_{\max}(i) + n_{\min}(i)]$; for data where a smaller value is better, it can be computed as $p(i,j) = [n_{\max}(i) + n_{\min}(i) - n(i,j)] / [n_{\max}(i) + n_{\min}(i)]$, where n(i, j) is an element of the evaluation matrix N, $n_{\max}(i)$ is the maximum value of the i-th evaluation factor, and $n_{\min}(i)$ is the minimum value of the i-th evaluation factor.
Preferably, in step d the weighted Manhattan distance formula is used to compute the distance from each storage node in the weighted evaluation matrix to the positive and negative ideal values. $D_i^{+}$ can denote the distance from storage node i to the positive ideal value, and $D_i^{-}$ the distance from storage node i to the negative ideal value. The size of $D_i^{+}$ shows how far storage node i is from the positive ideal value: the smaller the value, the closer the node is to the positive ideal value. Similarly, $D_i^{-}$ shows the distance between storage node i and the negative ideal value. For example, when the calculation parameters of the candidate storage nodes are set to the hop count from client to server, network packet loss rate, CPU load, memory usage, and remaining storage capacity, these parameters are all the smaller the better, so the optimal node is the storage node nearest to the negative ideal value, that is, the one with the smallest $D_i^{-}$.
The method of distributed storage scheduling of the present invention analyzes the various factors that influence scheduling together, so that the most suitable storage node is selected to respond. This greatly improves the data transmission performance and storage efficiency of networked remote distributed storage and markedly improves the storage quality of distributed storage.
The above content of the invention is described in further detail below with reference to a specific embodiment. This should not be understood as limiting the scope of the above subject matter of the invention to the following example. Any replacement or change made on the basis of ordinary technical knowledge and customary means in the art, without departing from the above technical idea of the invention, falls within the scope of the invention.
Description of the drawings
Fig. 1 is a flow chart of the method of distributed storage scheduling of the present invention.
Specific embodiment
As shown in Fig. 1, the method of distributed storage scheduling of the present invention has the following steps:
a. Establish evaluation indexes: according to the scheduling request, collect the evaluation factors that influence scheduling. Following the whole scheduling process, the evaluation factors are divided into factors of the client that issues the request, factors of the network that transmits the request, and factors of the service node that handles the request. The client factor includes the number of hops from the client to the server; the network factors include the packet loss rate during network transmission and whether the network is unobstructed; the server-side factors include processor load, memory usage, and storage space occupancy.
Based on these 5 evaluation factors, the correlation between each influencing factor and scheduling is analyzed with the Pearson product-moment correlation coefficient:
$$r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i - \bar{y})^2}}$$
where $\bar{x}$ is the mean of the test sample x, $\bar{y}$ is likewise the mean of y, and n is the sample size. The value of r lies between −1 and +1: a positive value indicates positive correlation and a negative value indicates negative correlation.
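The evaluation objects are the m physical storage nodes that actually hold the storage volumes, which can be expressed as $k_i \in K$, where $i \in \{1, 2, 3, \dots, m\}$. Since 5 factors that influence scheduling are considered for each storage node, an m × 5 evaluation matrix N can be established:
$$N = \begin{pmatrix} n(1,1) & \cdots & n(1,5) \\ \vdots & \ddots & \vdots \\ n(m,1) & \cdots & n(m,5) \end{pmatrix}$$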
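A minimal sketch of assembling the m × 5 evaluation matrix from per-node measurements is given here. The factor keys in `FACTORS`, the function `build_evaluation_matrix`, and the sample readings are hypothetical names and values chosen for illustration; only the five factors themselves come from the text.

```python
from typing import Dict, List

# Column order of the evaluation matrix (hypothetical key names).
FACTORS = ("hops", "packet_loss", "cpu_load", "mem_usage", "disk_usage")


def build_evaluation_matrix(nodes: List[Dict[str, float]]) -> List[List[float]]:
    """Return one row per storage node and one column per evaluation factor."""
    return [[float(metrics[f]) for f in FACTORS] for metrics in nodes]


# Hypothetical readings for m = 2 storage nodes.
nodes = [
    {"hops": 3, "packet_loss": 0.01, "cpu_load": 0.40, "mem_usage": 0.55, "disk_usage": 0.70},
    {"hops": 5, "packet_loss": 0.00, "cpu_load": 0.20, "mem_usage": 0.35, "disk_usage": 0.90},
]
N = build_evaluation_matrix(nodes)
print(len(N), len(N[0]))  # 2 5
```

Keeping the column order fixed in one place makes it harder to mix up factors when the matrix is later standardized and weighted.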
b. Data standardization: a standardization formula removes the effect of dimensions from each datum in the evaluation matrix. Different standardization formulas are used depending on how the merit of each datum relates to the size of its value; for "transmission delay", for example, smaller is better. For data where a larger value is better, the standardized datum p(i, j) can be computed as $p(i,j) = n(i,j) / [n_{\max}(i) + n_{\min}(i)]$; for data where a smaller value is better, it can be computed as $p(i,j) = [n_{\max}(i) + n_{\min}(i) - n(i,j)] / [n_{\max}(i) + n_{\min}(i)]$, where n(i, j) is an element of the evaluation matrix N, $n_{\max}(i)$ is the maximum value of the i-th evaluation factor, and $n_{\min}(i)$ is the minimum value of the i-th evaluation factor.
The standardized evaluation index value matrix $N^{*} = \big(p(i,j)\big)_{m \times 5}$ is then obtained.
c. Pairwise comparison using triangular fuzzy numbers: the storage nodes in the evaluation index value matrix are compared pairwise using triangular fuzzy numbers, m(m−1)/2 comparisons in total. This gives the judgment matrix R composed of triangular fuzzy numbers and the weight vector of the evaluation factors, and from them the weighting matrix T; normalizing the weight vector gives the weight of each evaluation factor.
For example, with 5 candidate storage nodes, the weighting matrix T obtained is:
[Matrix T: given as a figure in the original publication; its entries are not recoverable from the text.]
The weighting matrix T is then converted to decimal form, and the sums and averages of the evaluation indexes are computed:
[Matrix in decimal form with sums and averages: given as a figure in the original publication; not recoverable from the text.]
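The fuzzy synthetic extent formula is applied to the judgment matrix: $S_i = \sum_{j=1}^{n} r_{ij} \otimes \left[ \sum_{i=1}^{n} \sum_{j=1}^{n} r_{ij} \right]^{-1}$, i = 1, 2, …, n, where $\sum_{j=1}^{n} r_{ij}$ is the sum of the corresponding items of the judgment matrix and $S_i$ is the weight of the item to be solved. This embodiment evaluates 5 evaluation indexes, so n = 5, which in turn gives the importance of each evaluation index relative to the other evaluation indexes:
[Values $S_1$ to $S_5$: given as a figure in the original publication; not recoverable from the text.]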
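The fuzzy synthetic extent step can be sketched as below. This is a sketch under stated assumptions: it assumes the usual extent-analysis arithmetic for triangular fuzzy numbers (component-wise addition, and the inverse of (a, b, c) taken as (1/c, 1/b, 1/a)), which the source text does not spell out. The function name `synthetic_extents` and the 2 × 2 judgment matrix are hypothetical.

```python
from typing import List, Tuple

TFN = Tuple[float, float, float]  # triangular fuzzy number (a, b, c)


def synthetic_extents(matrix: List[List[TFN]]) -> List[TFN]:
    """Fuzzy synthetic extent S_i of each row of a fuzzy judgment matrix."""
    # Row sums: add the triangular fuzzy numbers component by component.
    row_sums = [tuple(sum(t[k] for t in row) for k in range(3)) for row in matrix]
    # Grand total over all entries of the matrix.
    total = tuple(sum(r[k] for r in row_sums) for k in range(3))
    # Multiply each row sum by the inverse of the total, (1/c, 1/b, 1/a).
    return [(r[0] / total[2], r[1] / total[1], r[2] / total[0]) for r in row_sums]


# Hypothetical 2 x 2 judgment matrix; the (1, 0) entry is the reciprocal of (0, 1).
R = [
    [(1.0, 1.0, 1.0), (1.0, 2.0, 3.0)],
    [(1 / 3, 1 / 2, 1.0), (1.0, 1.0, 1.0)],
]
S = synthetic_extents(R)
print([tuple(round(v, 3) for v in s) for s in S])
```

In this arithmetic the middle components of all S_i add up to 1, which gives a quick sanity check on a computed result.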
The comparison between two storage nodes m and n can be expressed as $r_{mn} = (a, b, c)$, where b is the middle value between a and c and indicates the degree of importance, and the two boundary values a and c indicate the degree of fuzziness: the larger the difference b − a, the fuzzier the comparison between the two nodes, and a difference of 0 means the comparison is not fuzzy. Similarly, $r_{nm} = r_{mn}^{-1} = (1/c, 1/b, 1/a)$ indicates the importance of node n relative to node m. With $S_1 = (a_1, b_1, c_1)$ and $S_2 = (a_2, b_2, c_2)$, the formula
$$V(S_1 \ge S_2) = \begin{cases} 1, & b_1 \ge b_2 \\ 0, & a_2 \ge c_1 \\ \dfrac{a_2 - c_1}{(b_1 - c_1) - (b_2 - a_2)}, & \text{otherwise} \end{cases}$$
gives the degree of possibility of each evaluation index compared with every other evaluation index, for example $V(S_1 \ge S_2) = 0.65$. Similarly: $V(S_1 \ge S_5) = 0.417$, $V(S_2 \ge S_3) = 0.235$, $V(S_2 \ge S_4) = 0.228$, $V(S_2 \ge S_5) = 0.762$, $V(S_5 \ge S_3) = 0.396$, $V(S_5 \ge S_4) = 0.391$, and every remaining comparison value is 1.
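A comparison value V(S1 ≥ S2) between two triangular fuzzy numbers can be computed as in the sketch below. It assumes the degree-of-possibility definition commonly used in fuzzy extent analysis; the source gives only the resulting values, so the formula in the code is an assumption rather than the patent's stated method. The function name `possibility` and the input numbers are hypothetical.

```python
from typing import Tuple

TFN = Tuple[float, float, float]  # triangular fuzzy number (a, b, c)


def possibility(s1: TFN, s2: TFN) -> float:
    """Degree of possibility V(s1 >= s2) for two triangular fuzzy numbers."""
    a1, b1, c1 = s1
    a2, b2, c2 = s2
    if b1 >= b2:
        return 1.0  # s1 peaks at or above s2
    if a2 >= c1:
        return 0.0  # the two supports do not overlap
    # Height of the intersection point of the two membership functions.
    return (a2 - c1) / ((b1 - c1) - (b2 - a2))


print(possibility((1.0, 2.0, 3.0), (2.0, 3.0, 4.0)))  # 0.5
print(possibility((2.0, 3.0, 4.0), (1.0, 2.0, 3.0)))  # 1.0
```

The value is 1 whenever the first number is at least as important as the second, and falls toward 0 as the two fuzzy numbers overlap less.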
Then, using the formulas $V(P \ge P_1, P_2, \dots, P_n) = \min_{x} V(P \ge P_x)$ and $d(P) = \min V(P \ge P_x)$, x = 1, 2, …, n; $P \ne P_i$, the weight vector $d(C_i)$ of the evaluation factors is obtained:
d(C1)=V (S1≥S2,S3,S4,S5)=min (0.65,1,1,0.417)=0.417
d(C2)=V (S2≥S1,S3,S4,S5)=min (1,0.235,0.228,0.762)=0.228
d(C3)=V (S3≥S1,S2,S4,S5)=min (1,1,1,1)=1
d(C4)=V (S4≥S1,S2,S3,S5)=min (1,1,1,1)=1
d(C5)=V (S5≥S1,S2,S3,S4)=min (1,1,0.396,0.391)=0.391
Here P denotes a quantity measuring how likely a random event is to occur in the possibility formula, and $P_n$ is the value corresponding to $S_n$.
Checking the weights gives:
d′(C1)+d′(C2)+d′(C3)+d′(C4)+d′(C5)=0.137+0.075+0.329+0.329+0.13=1
where d′(Ci) is each of d(C1) to d(C5) divided by their sum.
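The minimum-and-normalize step can be replayed with the comparison values quoted in the text. In this sketch the dictionary `V` lists only the pairs whose value is below 1; the entry for (1, 2) is read off the d(C1) line, and every other pair is taken as 1, as stated. The names `V`, `v`, `d`, and `weights` are hypothetical.

```python
# Comparison values V(S_i >= S_j) below 1, as quoted in the text.
V = {
    (1, 2): 0.65, (1, 5): 0.417,
    (2, 3): 0.235, (2, 4): 0.228, (2, 5): 0.762,
    (5, 3): 0.396, (5, 4): 0.391,
}


def v(i: int, j: int) -> float:
    """Look up V(S_i >= S_j); pairs not listed are 1."""
    return V.get((i, j), 1.0)


# d(C_i) is the minimum comparison value of index i against all others.
d = [min(v(i, j) for j in range(1, 6) if j != i) for i in range(1, 6)]

# Normalize so that the weights sum to 1.
total = sum(d)
weights = [x / total for x in d]

print(d)
print([round(w, 3) for w in weights])
```

Rounded to three decimals, the last weight comes out as 0.129, which the text reports as 0.13.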
From the above formulas, solving for the 5 parameters (client hop count to each storage node, network packet loss rate, CPU load, memory usage, and disk space usage) gives the weight vector A:
A = (a1, a2, a3, a4, a5) = (0.137, 0.075, 0.329, 0.329, 0.13)
Replacing (r1, r2, r3, r4, r5) in the triangular fuzzy number judgment matrix R with the weight vector A gives the weighted evaluation matrix Z.
d. Find the positive and negative ideal values: the positive and negative ideal values are found on the weighted evaluation matrix Z, on which the TOPSIS algorithm can be used to rank the storage nodes. They are denoted $Z^{+}$ and $Z^{-}$ respectively, where $Z^{+}$ consists of the maximum value of each evaluation index in Z and $Z^{-}$ consists of the minimum value of each evaluation index in Z:
$$Z^{+} = \big(\max_i x_{i1}, \dots, \max_i x_{i5}\big), \qquad Z^{-} = \big(\min_i x_{i1}, \dots, \min_i x_{i5}\big)$$
The weighted Manhattan distance formula then gives the distance from each storage node to the positive and negative ideal values:
$$D_i^{+} = \sum_{j=1}^{5} a_j \left| x_{ij} - x_j^{+} \right|, \qquad D_i^{-} = \sum_{j=1}^{5} a_j \left| x_{ij} - x_j^{-} \right|$$
where i = 1, 2, …, m, $a_j$ is the weight of the j-th evaluation index, $x_{ij}$ is the value of the j-th evaluation index of the i-th storage node, $x_j^{+}$ and $x_j^{-}$ are the positive and negative ideal values of the j-th evaluation index, $D_i^{+}$ is the distance from storage node i to the positive ideal value, and $D_i^{-}$ is the distance from storage node i to the negative ideal value. The size of $D_i^{+}$ shows how far storage node i is from the positive ideal value: the smaller the value, the closer the node is to the positive ideal value. Similarly, $D_i^{-}$ shows the distance between storage node i and the negative ideal value.
The closeness is then used to define overall performance, and the comprehensive evaluation value $C_i$ of each storage node is computed: $C_i = \dfrac{D_i^{-}}{D_i^{+} + D_i^{-}}$. The overall performance of a storage node is negatively correlated with its $C_i$ value. In this embodiment the calculation parameters of each storage node are set to the client-to-server hop count, network packet loss rate, CPU load, memory usage, and remaining storage capacity, and these parameters are all the smaller the better, so the optimal storage node is the one nearest to the negative ideal value, that is, the one with the smallest $D_i^{-}$. When $C_i$ reaches 0, the storage node is at distance 0 from the negative ideal value and is the optimal node. The storage nodes are therefore ranked by their $C_i$ values, and the storage node with the smallest $C_i$ is selected as the responding storage node for the scheduling request.
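The ranking step can be sketched as follows. The closeness ratio used here, distance to the negative ideal divided by the sum of both distances, is an assumption: it is the usual TOPSIS-style form and it agrees with the statement that a node at distance 0 from the negative ideal value gets an evaluation value of 0. The function name `rank_nodes` and the matrix values are hypothetical; the weights are the ones given in the text.

```python
from typing import List, Sequence, Tuple


def rank_nodes(z: List[Sequence[float]], weights: Sequence[float]) -> List[Tuple[float, int]]:
    """Return (closeness, node index) pairs sorted with the best node first."""
    columns = list(zip(*z))
    z_pos = [max(col) for col in columns]  # positive ideal: per-index maximum
    z_neg = [min(col) for col in columns]  # negative ideal: per-index minimum
    ranked = []
    for i, row in enumerate(z):
        # Weighted Manhattan distances to the two ideal values.
        d_pos = sum(a * abs(x - p) for a, x, p in zip(weights, row, z_pos))
        d_neg = sum(a * abs(x - q) for a, x, q in zip(weights, row, z_neg))
        denom = d_pos + d_neg
        closeness = d_neg / denom if denom else 0.0
        ranked.append((closeness, i))
    return sorted(ranked)  # smallest closeness value first


weights = [0.137, 0.075, 0.329, 0.329, 0.13]
# Hypothetical matrix Z for three nodes (smaller is better everywhere).
Z = [
    [0.1, 0.1, 0.1, 0.1, 0.1],
    [0.5, 0.5, 0.5, 0.5, 0.5],
    [0.9, 0.9, 0.9, 0.9, 0.9],
]
ranking = rank_nodes(Z, weights)
print([index for _, index in ranking])  # [0, 1, 2]
```

Because every parameter is treated as smaller-is-better, the node that coincides with the negative ideal value comes first.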
Claims (5)
1. A method of distributed storage scheduling, characterized by comprising:
a. Establish evaluation indexes: according to the scheduling request, collect the evaluation factors that influence scheduling, then analyze the correlation between each evaluation factor and scheduling to obtain an evaluation matrix for m storage nodes, where m is a natural number;
b. Data standardization: after a standardization formula removes the effect of dimensions from each datum in the evaluation matrix, obtain the standardized evaluation index value matrix;
c. Pairwise comparison using triangular fuzzy numbers: after the storage nodes in the evaluation index value matrix are compared pairwise using triangular fuzzy numbers, obtain the judgment matrix composed of triangular fuzzy numbers and the weight vector of the evaluation factors, and obtain the weighted evaluation matrix; normalizing the weight vector gives the weight of each evaluation factor;
d. Find the positive and negative ideal values: find the positive and negative ideal values on the weighted evaluation matrix; use the weighted Manhattan distance formula to compute the distance from each storage node in the weighted evaluation matrix to the positive and negative ideal values; then use the closeness to define overall performance and compute the comprehensive evaluation value of each storage node; rank the storage nodes by comprehensive evaluation value and select the storage node with the smallest comprehensive evaluation value as the responding storage node for the scheduling request.
2. The method of distributed storage scheduling according to claim 1, characterized in that the evaluation factors include client factors, network factors, and server-side factors.
3. The method of distributed storage scheduling according to claim 2, characterized in that the client factor includes the number of hops from the client to the server; the network factors include the packet loss rate during network transmission and whether the network is unobstructed; and the server-side factors include processor load, memory usage, and storage space occupancy.
4. The method of distributed storage scheduling according to any one of claims 1 to 3, characterized in that in step a the correlation between each evaluation factor and scheduling is analyzed with the Pearson product-moment correlation coefficient.
5. The method of distributed storage scheduling according to any one of claims 1 to 3, characterized in that in step b different standardization formulas are used to remove the effect of dimensions, depending on how the merit of each datum in the evaluation matrix relates to the size of its value.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610875745.1A CN106453546B (en) | 2016-10-08 | 2016-10-08 | The method of distributed storage scheduling |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106453546A CN106453546A (en) | 2017-02-22 |
CN106453546B true CN106453546B (en) | 2019-05-07 |
Family
ID=58172006
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610875745.1A Active CN106453546B (en) | 2016-10-08 | 2016-10-08 | The method of distributed storage scheduling |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106453546B (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||