CN103677757A - Method for controlling parallelism degree of program capable of sensing band width of storage device - Google Patents


Info

Publication number
CN103677757A
Authority
CN
China
Prior art keywords
degree
parallelism
efficiency monitoring
function
efficiency
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201310477537.2A
Other languages
Chinese (zh)
Other versions
CN103677757B (en
Inventor
刘轶
王庆全
刘弢
李钦
高飞
朱延超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Kaixi Beijing Information Technology Co ltd
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University filed Critical Beihang University
Priority to CN201310477537.2A priority Critical patent/CN103677757B/en
Publication of CN103677757A publication Critical patent/CN103677757A/en
Application granted granted Critical
Publication of CN103677757B publication Critical patent/CN103677757B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a storage-device-bandwidth-aware method for controlling the degree of parallelism of a program. The method adaptively and dynamically adjusts the application's degree of parallelism, that is, its number of processes or threads, according to the overall storage-device performance and the real-time load of the platform on which the application runs, so that the application reaches and keeps the degree of parallelism that gives the best I/O efficiency. The application's I/O efficiency is monitored and recorded in real time and used as feedback to adjust its degree of parallelism. The degree of parallelism is probed and increased step by step until the inflection point of the actual I/O efficiency is reached, and is then fine-tuned to obtain the optimal degree of parallelism for the given platform; thereafter, according to the platform's real-time load, a periodic, dynamic, self-adaptive procedure combining active adjustment and passive adjustment maintains the application's optimal degree of parallelism.

Description

A storage-device-bandwidth-aware method for controlling program parallelism
Technical field
The present invention relates to controlling the degree of parallelism of parallel programs on chip multiprocessors, and in particular to a program parallelism control method that is aware of the I/O bandwidth of the storage device.
Background art
A chip multiprocessor (Chip Multi-Processor, CMP) integrates multiple computing cores on a single processor chip to increase computing power. Programs executing on the individual CPU cores of a CMP need to share and synchronize data, so the hardware must support inter-core communication.
Since the emergence of CMP technology, parallelizing conventional serial programs has become a research focus. Most parallel programs use the CMP core count as the main reference when choosing the degree of parallelism (the number of processes or threads). This strategy works well for compute-intensive applications. For I/O-intensive applications, however, considering only the CMP's computing performance may not yield the maximum speedup or throughput (I/O throughput is also called I/O bandwidth), because the application may be limited by the I/O (input/output) performance of the platform it runs on.
The I/O capability of different platforms varies greatly and depends mainly on the storage device (for example, disk I/O bandwidth differs widely among an ordinary PC, a server with a SCSI interface, and a server with a disk array and a RAID card). Differences in storage scale have given rise to a memory hierarchy built from storage of different speeds and capacities. For a multi-level memory hierarchy, see Fig. 5.1 in Chapter 5, "Memory Hierarchy Design," of Computer Architecture: A Quantitative Approach (Chinese edition translated by Zheng Weimin et al., July 2004), which explains it in detail. When an application that frequently accesses I/O devices runs on different parallel platforms, it is difficult to determine directly the degree of parallelism that gives maximum I/O efficiency. If the degree of parallelism is too low, I/O bandwidth is underused and processor cores are wasted. If it is too high, performance does not necessarily improve either: I/O bandwidth has already become the bottleneck, and excess parallelism increases interference among the parallel execution units, for example by disrupting the kernel's file prefetching and the disk operations, so that overall performance falls instead of rising.
For I/O-intensive parallel applications, choosing the degree of parallelism from the processing performance of the platform alone cannot fully exploit parallel processing. The key to the problem is therefore how to determine the best degree of parallelism adaptively and dynamically from the platform's overall hardware conditions and real-time load.
Summary of the invention
The object of the invention is to propose a method that adaptively and dynamically adjusts an application's degree of parallelism, that is, its number of processes or threads, according to the overall storage-device performance and the real-time load of the platform on which it runs, so that the application keeps the degree of parallelism that gives the best I/O efficiency. The method monitors and records the application's I/O efficiency in real time and uses it as feedback to adjust the degree of parallelism. It probes and increases the program's degree of parallelism step by step until the inflection point of the actual I/O efficiency is reached, then fine-tunes to obtain the best degree of parallelism for the given platform; it then maintains the application's best degree of parallelism through a periodic, dynamic, self-adaptive procedure that combines active adjustment with passive adjustment according to the platform's real-time load.
In the storage-device-bandwidth-aware program parallelism control method of the present invention, a program running on an application platform can be in one of three states according to its I/O efficiency: the initial state, the steady state, and the final state;
The initial state is the state in which the application, once started, runs at a low degree of parallelism;
The steady state is the state in which the application runs stably after leaving the initial state;
The final state is the state in which the application, after passing through the steady state, runs while maintaining high I/O efficiency;
The method is characterized in that it comprises the following processing steps:
Step 1: Real-time monitoring of I/O efficiency
Step 101: Record the byte counts of the read/write functions under address-interception efficiency monitoring
In the i-th I/O efficiency monitoring interval $T_i$, address interception is used. The entry function read of the read operation provided by the operating system is marked as the read function c_read with I/O efficiency monitoring, and the number of bytes read by c_read is recorded as $R_{c\_read}^{T_i}$. The entry function write of the write operation provided by the operating system is marked as the write function c_write with I/O efficiency monitoring, and the number of bytes written by c_write is recorded as $R_{c\_write}^{T_i}$. The byte count of the read/write functions under address-interception efficiency monitoring in $T_i$ is then expressed as $R_{SYS}^{T_i} = R_{c\_read}^{T_i} + R_{c\_write}^{T_i}$;
Over the application run time $T_{total}$, the byte counts of the read/write functions of all address-interception efficiency monitoring are recorded together [formula image not reproduced], referred to for short as the address-interception efficiency-monitoring byte count.
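As a rough illustration of Step 101, the sketch below swaps Python's os.read and os.write entry points for counting versions, which loosely mirrors marking read/write as c_read/c_write. The patent does not say how the interception is implemented; the ByteCounter class, the intercept_io helper, and the choice of Python's os module are assumptions made only for this example.

```python
import os
import threading
from contextlib import contextmanager


class ByteCounter:
    """Thread-safe byte counters for one monitoring interval (hypothetical helper)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.read_bytes = 0
        self.write_bytes = 0

    def add_read(self, n):
        with self._lock:
            self.read_bytes += n

    def add_write(self, n):
        with self._lock:
            self.write_bytes += n

    def total(self):
        # Corresponds to R_SYS = R_c_read + R_c_write for the interval.
        with self._lock:
            return self.read_bytes + self.write_bytes


@contextmanager
def intercept_io(counter):
    """Temporarily replace os.read/os.write with counting wrappers."""
    orig_read, orig_write = os.read, os.write

    def c_read(fd, n):
        data = orig_read(fd, n)
        counter.add_read(len(data))      # count bytes actually returned
        return data

    def c_write(fd, data):
        written = orig_write(fd, data)
        counter.add_write(written)       # count bytes actually written
        return written

    os.read, os.write = c_read, c_write
    try:
        yield counter
    finally:
        # Always restore the original entry points.
        os.read, os.write = orig_read, orig_write
```

Only calls that go through os.read and os.write are counted here; buffered file objects bypass these names, so a real system-level monitor would need a lower-level hook.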
Step 102: Record the byte counts of the read/write functions under function-overloading efficiency monitoring
In the (i+1)-th I/O efficiency monitoring interval $T_{i+1}$, function overloading is used to provide the user with a read function u_read and a write function u_write that have monitoring built in. The number of bytes read by u_read is recorded as $R_{u\_read}^{T_{i+1}}$ and the number of bytes written by u_write as $R_{u\_write}^{T_{i+1}}$. The byte count of the read/write functions under function-overloading efficiency monitoring in $T_{i+1}$ is then expressed as $R_{USR}^{T_{i+1}} = R_{u\_read}^{T_{i+1}} + R_{u\_write}^{T_{i+1}}$;
Over $T_{total}$, the byte counts of the read/write functions of all function-overloading efficiency monitoring are recorded together [formula image not reproduced], referred to for short as the function-overloading efficiency-monitoring byte count.
Step 103: Record the byte counts of the read/write functions under interface-information efficiency monitoring
In the i-th I/O efficiency monitoring interval $T_i$, the I/O statistics interface provided by the operating system is used: the raw data generated by the kernel is preprocessed to complete the I/O efficiency monitoring, which yields the system-monitored read function s_read and write function s_write. The number of bytes read by s_read is recorded as $R_{s\_read}^{T_i}$ and the number of bytes written by s_write as $R_{s\_write}^{T_i}$. The byte count of the read/write functions under interface-information efficiency monitoring in $T_i$ is then expressed as $R_{INT}^{T_i} = R_{s\_read}^{T_i} + R_{s\_write}^{T_i}$;
Over $T_{total}$, the byte counts of the read/write functions of all interface-information efficiency monitoring are recorded together [formula image not reproduced], referred to for short as the interface-information efficiency-monitoring byte count.
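Step 103 can be pictured with the following minimal sketch, which preprocesses two snapshots of kernel I/O statistics and derives the per-interval byte counts. The patent does not name the statistics interface; the sketch assumes the Linux /proc/<pid>/io text format and its read_bytes and write_bytes fields, and the function names parse_proc_io and interval_bytes are hypothetical.

```python
def parse_proc_io(text):
    """Parse 'key: value' lines (as found in /proc/<pid>/io) into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            stats[key.strip()] = int(value)
    return stats


def interval_bytes(before, after):
    """Return (s_read, s_write, R_INT) for one interval from two snapshots.

    Assumption: read_bytes/write_bytes are cumulative counters, so the
    difference between two snapshots is the traffic within the interval.
    """
    s_read = after["read_bytes"] - before["read_bytes"]
    s_write = after["write_bytes"] - before["write_bytes"]
    return s_read, s_write, s_read + s_write
```

In use, the two snapshots would be read at the start and end of a monitoring interval $T_i$, for example from /proc/self/io on Linux.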
Step 2: Denoising of the read/write byte counts
While the application runs, several I/O efficiency monitoring types MODE are in use: the address-interception efficiency monitoring type SYS, the function-overloading efficiency monitoring type USR, and the interface-information efficiency monitoring type INT, i.e. MODE = {SYS, USR, INT};
The read/write byte counts are filtered for interference according to the denoising condition E [formula image not reproduced], which yields the filtered byte counts. If E evaluates to "1", the sample is selected for parallelism extraction and quick location then follows; if E evaluates to "0", the sample is excluded from parallelism extraction and quick location proceeds directly;
q is the number of I/O efficiency monitoring intervals;
i denotes any one I/O efficiency monitoring interval;
$R_{MODE}^{T_i}$ is the byte count of the read/write functions of any I/O efficiency monitoring type MODE in the interval $T_i$;
$R_{MODE}^{T_{i+1}}$ is the byte count of the read/write functions of any I/O efficiency monitoring type MODE in the interval $T_{i+1}$;
$C_{threshold}$ is the critical value set for denoising, used to judge whether a fluctuation has occurred;
After the address-interception efficiency-monitoring byte count is denoised according to condition E, the result is recorded as the address-interception efficiency-monitoring filtered byte count [formula image not reproduced];
After the function-overloading efficiency-monitoring byte count is denoised according to condition E, the result is recorded as the function-overloading efficiency-monitoring filtered byte count [formula image not reproduced];
After the interface-information efficiency-monitoring byte count is denoised according to condition E, the result is recorded as the interface-information efficiency-monitoring filtered byte count [formula image not reproduced].
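The exact formula of the denoising condition E is not recoverable from this text, so the sketch below only illustrates the general idea under a stated assumption: E is taken to be 1 when the relative change between the byte counts of two consecutive intervals stays within $C_{threshold}$, and 0 otherwise. Both that form and the names denoise_flag and filter_samples are hypothetical.

```python
def denoise_flag(r_cur, r_next, c_threshold):
    """Assumed form of condition E: 1 if the relative change is within the threshold."""
    if r_cur == 0:
        return 0  # no baseline to compare against; treat the pair as noise
    change = abs(r_next - r_cur) / r_cur
    return 1 if change <= c_threshold else 0


def filter_samples(samples, c_threshold):
    """Keep the sample of T_{i+1} only when it is stable relative to T_i.

    samples: byte counts R_MODE for consecutive intervals T_1 .. T_q.
    Which sample is kept when E = 1 is an assumption of this sketch.
    """
    kept = []
    for r_cur, r_next in zip(samples, samples[1:]):
        if denoise_flag(r_cur, r_next, c_threshold) == 1:
            kept.append(r_next)
    return kept
```

A different fluctuation test (for example an absolute difference) would fit into the same structure by changing only denoise_flag.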
Step 3: Quick location of the degree of parallelism
The degree of parallelism P takes values that are powers of 2, i.e. $P = 1, 2, 4, 8, \ldots, 2^p$, where p is the exponent. After the application is started it begins running at P = 1, and the filtered byte count obtained at P = 1 is recorded [formula image not reproduced]. After P = 1, the application runs at $P = 2^p$, and the filtered byte count obtained at each value is recorded [formula image not reproduced];
The filtered byte counts for the different degrees of parallelism are arranged in ascending order of parallelism to build a histogram, the parallelism versus I/O efficiency histogram. By comparing the byte counts in this histogram, the maximum byte count is selected; this maximum byte count is the inflection point of the I/O efficiency monitoring;
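The quick-location step might be sketched as follows. The measure callback, which is assumed to return the filtered byte count observed at a given parallelism, and the early stop at the first drop in throughput are assumptions of this example; the text itself only states that the maximum of the histogram is selected.

```python
def quick_locate(measure, max_parallelism):
    """Probe P = 1, 2, 4, ... and return (coarse parallelism, histogram).

    measure(P) is a hypothetical callback returning the filtered byte count
    observed while running at parallelism P.
    """
    histogram = {}
    parallelism = 1
    previous = None
    while parallelism <= max_parallelism:
        current = measure(parallelism)
        histogram[parallelism] = current
        if previous is not None and current < previous:
            break  # throughput fell: the inflection point has been passed
        previous = current
        parallelism *= 2
    # The parallelism with the largest byte count is the coarse estimate.
    coarse = max(histogram, key=histogram.get)
    return coarse, histogram
```

Doubling keeps the number of probes logarithmic in the largest parallelism tried, at the cost of a coarse answer that Step 4 then refines.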
For address-interception efficiency monitoring, the filtered byte counts obtained at parallelism P = 1, P = 2, P = 4, P = 8, P = 16, P = 32 and, in general, $P = 2^p$ are each recorded [formula images not reproduced];
The filtered byte counts for the different degrees of parallelism are entered into a histogram with the parallelism P on the horizontal axis and the byte count on the vertical axis, giving the address-interception efficiency-monitoring histogram. By comparing the byte counts in this histogram, the maximum byte count is selected; this maximum byte count is the inflection point of the address-interception I/O efficiency monitoring;
For function-overloading efficiency monitoring, the filtered byte counts obtained at parallelism P = 1, P = 2, P = 4, P = 8, P = 16, P = 32 and, in general, $P = 2^p$ are each recorded [formula images not reproduced];
The filtered byte counts for the different degrees of parallelism are entered into a histogram with the parallelism P on the horizontal axis and the byte count on the vertical axis, giving the function-overloading efficiency-monitoring histogram. By comparing the byte counts in this histogram, the maximum byte count is selected; this maximum byte count is the inflection point of the function-overloading I/O efficiency monitoring;
For interface-information efficiency monitoring, the filtered byte counts obtained at parallelism P = 1, P = 2, P = 4, P = 8, P = 16, P = 32 and, in general, $P = 2^p$ are each recorded [formula images not reproduced];
The filtered byte counts for the different degrees of parallelism are entered into a histogram with the parallelism P on the horizontal axis and the byte count on the vertical axis, giving the interface-information efficiency-monitoring histogram. By comparing the byte counts in this histogram, the maximum byte count is selected; this maximum byte count is the inflection point of the interface-information I/O efficiency monitoring;
Step 4: Obtain the best degree of parallelism
Starting from the coarse parallelism $P_{coarse}$ recorded in the histogram of filtered byte counts under different degrees of parallelism, the read/write byte counts are recorded on both sides of $P_{coarse}$, changing the parallelism by 2 units each time, to obtain the best parallelism $P_{fine}$. Once the best parallelism has been obtained, the program running on the application platform restarts the operations of Step 1 to Step 4 from that parallelism, and continues until the application finishes running;
After the best parallelism has been obtained, the application can carry out active and passive dynamic self-adaptation, which improves I/O bandwidth and also gives the application a larger speedup and higher throughput.
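The fine-tuning step can be illustrated with the sketch below, which walks outward from the coarse parallelism in steps of 2 on both sides and keeps the best value seen. The search bounds (half and double the coarse value), the stop-on-first-drop rule, and the measure callback are assumptions of this example, not details taken from the text.

```python
def fine_tune(measure, coarse, step=2):
    """Search around the coarse parallelism in steps of `step` units.

    measure(P) is a hypothetical callback returning the filtered byte count
    at parallelism P. The search stays strictly between coarse/2 and
    2*coarse, the neighbouring powers of two probed during quick location.
    """
    lower, upper = coarse // 2, coarse * 2
    coarse_bytes = measure(coarse)
    best_p, best_bytes = coarse, coarse_bytes
    for direction in (-1, 1):
        previous = coarse_bytes
        candidate = coarse + direction * step
        while lower < candidate < upper:
            observed = measure(candidate)
            if observed < previous:
                break  # efficiency is falling on this side; stop walking
            if observed > best_bytes:
                best_p, best_bytes = candidate, observed
            previous = observed
            candidate += direction * step
    return best_p
```

For example, with a throughput curve that peaks at parallelism 10 and a coarse estimate of 8, the walk visits 6, 10 and 12 and settles on 10.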
The advantages of the storage-system-I/O-bandwidth-aware program parallelism control method of the present invention are:
1. The invention targets I/O-intensive applications; by monitoring the application's I/O efficiency, it enables the application to obtain a larger speedup and higher throughput.
2. The invention obtains the degree of parallelism with maximum I/O efficiency through a dynamic, self-adaptive method that combines active adjustment with passive adjustment, so the application's best degree of parallelism can be adjusted as the system load changes.
3. Unlike traditional feedback control, the parallelism control method of the invention does not rely on an externally supplied reference value; the reference is determined step by step from the relationship between the adjustment process and the change in I/O efficiency.
4. Active adjustment is a strategy absent from traditional feedback control; it effectively reduces the negative effects of being forced into passive adjustment by changes in the external environment.
Brief description of the drawings
Fig. 1 is a schematic diagram of the different states of a program running on an application platform according to the present invention.
Fig. 2 is a flow chart of the storage-device-bandwidth-aware program parallelism control of the present invention.
Fig. 3A is a histogram of the quick location carried out by degree of parallelism in the present invention.
Fig. 3B is a histogram of the reverse parallelism adjustment after the inflection point in the present invention.
Fig. 4 is a schematic diagram of the active and passive adjustment of the present invention.
Detailed description of the embodiments
The present invention is described in further detail below with reference to the accompanying drawings.
Referring to Fig. 1, a program running on an application platform can be in one of three states according to its I/O efficiency: the initial state, the steady state, and the final state.
The initial state is the state in which the application, once started, runs at a low degree of parallelism.
The steady state is the state in which the application runs stably after leaving the initial state.
The final state is the state in which the application, after passing through the steady state, runs while maintaining high I/O efficiency.
The transition from the initial state to the steady state is called quick location: the process of increasing the degree of parallelism through successive powers of 2.
The transition from the steady state to the final state is called fine tuning: the process of searching for the best degree of parallelism near the value determined in the steady state.
The transition from the final state to the steady state is called active adjustment: a periodically triggered operation that unconditionally moves the application from the final state to the steady state.
The transition from the final state to the initial state is called passive adjustment: an operation, triggered when the rate of change of the application's I/O efficiency in the final state exceeds a threshold, that moves the application from the final state to the initial state.
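The four transitions described above form a small state machine, sketched here for illustration. The State enum, the next_state function and its flag parameters are hypothetical names, and giving passive adjustment priority over active adjustment when both would fire is an assumption of this sketch.

```python
from enum import Enum


class State(Enum):
    INITIAL = "initial"
    STEADY = "steady"
    FINAL = "final"


def next_state(state, *, located=False, tuned=False, period_elapsed=False,
               change_rate=0.0, threshold=float("inf")):
    """Return the next state given the events observed in the current one."""
    if state is State.INITIAL and located:
        return State.STEADY        # quick location finished
    if state is State.STEADY and tuned:
        return State.FINAL         # fine tuning finished
    if state is State.FINAL:
        if abs(change_rate) > threshold:
            return State.INITIAL   # passive adjustment: efficiency changed sharply
        if period_elapsed:
            return State.STEADY    # active adjustment: periodic, unconditional
    return state
```

A controller loop would call next_state once per monitoring interval and run quick location or fine tuning according to the state it returns.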
Referring to Fig. 2, the present invention is a storage-device-bandwidth-aware program parallelism control method that adaptively and dynamically determines, from the overall hardware conditions and real-time load of the given application platform, the best I/O efficiency the application can obtain. The program parallelism control method comprises the following processing steps:
Step 1: Real-time monitoring of I/O efficiency
In the present invention, let the number of I/O efficiency monitoring intervals be q; the first interval is denoted $T_1$, the second $T_2$, and the last $T_q$. For convenience, any one interval is denoted $T_i$; the interval before $T_i$ is $T_{i-1}$ and the interval after $T_i$ is $T_{i+1}$. The time for which the program on the application platform runs, from the initial state through the steady state to the final state, is denoted $T_{total}$ (the application run time for short). One run time $T_{total}$ contains multiple I/O efficiency monitoring intervals, i.e. $T_{total} = T_1 + T_2 + \cdots + T_{i-1} + T_i + T_{i+1} + \cdots + T_q$. The time between two adjacent I/O efficiency monitoring intervals is denoted $T_p$, which is usually set to [formula image not reproduced].
Step 101: Record the byte counts of the read/write functions under address-interception efficiency monitoring
In the first I/O efficiency monitoring interval $T_1$, address interception is used. The entry function read of the read operation provided by the operating system is marked as the read function c_read with I/O efficiency monitoring, and the number of bytes read by c_read is recorded as $R_{c\_read}^{T_1}$. The entry function write of the write operation provided by the operating system is marked as the write function c_write with I/O efficiency monitoring, and the number of bytes written by c_write is recorded as $R_{c\_write}^{T_1}$. The byte count of the read/write functions under address-interception efficiency monitoring in $T_1$ is then expressed as $R_{SYS}^{T_1} = R_{c\_read}^{T_1} + R_{c\_write}^{T_1}$.
Likewise, in the second I/O efficiency monitoring interval $T_2$, the bytes read by c_read are recorded as $R_{c\_read}^{T_2}$ and the bytes written by c_write as $R_{c\_write}^{T_2}$, so that $R_{SYS}^{T_2} = R_{c\_read}^{T_2} + R_{c\_write}^{T_2}$.
Likewise, in the i-th I/O efficiency monitoring interval $T_i$, the bytes read by c_read are recorded as $R_{c\_read}^{T_i}$ and the bytes written by c_write as $R_{c\_write}^{T_i}$, so that $R_{SYS}^{T_i} = R_{c\_read}^{T_i} + R_{c\_write}^{T_i}$.
Likewise, in the (i+1)-th I/O efficiency monitoring interval $T_{i+1}$, the bytes read by c_read are recorded as $R_{c\_read}^{T_{i+1}}$ and the bytes written by c_write as $R_{c\_write}^{T_{i+1}}$, so that $R_{SYS}^{T_{i+1}} = R_{c\_read}^{T_{i+1}} + R_{c\_write}^{T_{i+1}}$.
Likewise, in the last I/O efficiency monitoring interval $T_q$, the bytes read by c_read are recorded as $R_{c\_read}^{T_q}$ and the bytes written by c_write as $R_{c\_write}^{T_q}$, so that $R_{SYS}^{T_q} = R_{c\_read}^{T_q} + R_{c\_write}^{T_q}$.
Over $T_{total}$, the byte counts of the read/write functions of all address-interception efficiency monitoring are recorded together [formula image not reproduced], referred to for short as the address-interception efficiency-monitoring byte count.
In the present invention, marking the read/write functions by address interception serves to carry out system-level I/O efficiency monitoring.
Step 102: Record the byte counts of the read/write functions under function-overloading efficiency monitoring
In the first I/O efficiency monitoring interval $T_1$, function overloading is used to provide the user with a read function u_read and a write function u_write that have monitoring built in. The number of bytes read by u_read is recorded as $R_{u\_read}^{T_1}$ and the number of bytes written by u_write as $R_{u\_write}^{T_1}$. The byte count of the read/write functions under function-overloading efficiency monitoring in $T_1$ is then expressed as $R_{USR}^{T_1} = R_{u\_read}^{T_1} + R_{u\_write}^{T_1}$.
In like manner can obtain, second I/O efficiency monitoring time T 2in, adopt function overloading mode to provide reading function u_read and writing function u_write with monitoring function to user; The byte number that described u_read reads function is designated as the byte number that described u_write writes function is designated as
Figure BDA0000395157840000105
have, at T 2the byte number of the read/write function of the function overloading-efficiency monitoring under condition is expressed as with set
Figure BDA0000395157840000106
Likewise, in the i-th I/O efficiency monitoring time $T_i$, function overloading is used to provide the user with the monitored read function u_read and write function u_write; the number of bytes read by u_read is denoted
Figure BDA0000395157840000107
and the number of bytes written by u_write is denoted
Figure BDA0000395157840000108
Then, within $T_i$, the byte count of the function overloading-efficiency monitoring read/write functions is expressed in aggregate as
Figure BDA0000395157840000109
Likewise, in the (i+1)-th I/O efficiency monitoring time $T_{i+1}$, function overloading is used to provide the user with the monitored read function u_read and write function u_write; the number of bytes read by u_read is denoted
Figure BDA00003951578400001010
and the number of bytes written by u_write is denoted
Figure BDA00003951578400001011
Then, within $T_{i+1}$, the byte count of the function overloading-efficiency monitoring read/write functions is expressed in aggregate as $R_{USR}^{T_{i+1}} = R_{u\_read}^{T_{i+1}} + R_{u\_write}^{T_{i+1}}$.
Likewise, in the last I/O efficiency monitoring time $T_q$, function overloading is used to provide the user with the monitored read function u_read and write function u_write; the number of bytes read by u_read is denoted $R_{u\_read}^{T_q}$, and the number of bytes written by u_write is denoted
Figure BDA00003951578400001014
Then, within $T_q$, the byte count of the function overloading-efficiency monitoring read/write functions is expressed in aggregate as
Figure BDA00003951578400001015
Over the total running time $T_{total}$, the byte counts of all function overloading-efficiency monitoring read/write functions are denoted
Figure BDA00003951578400001016
abbreviated as the function overloading-efficiency monitoring byte count
Figure BDA00003951578400001017
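The function-overloading monitoring of step 102 can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the class `IOCounter`, the module-level `counter` and the helper `demo` are hypothetical names, and wrapping `os.read`/`os.write` stands in for whatever overloading mechanism an actual implementation uses. Only the names u_read and u_write come from the text above.

```python
import os
import tempfile
import threading


class IOCounter:
    """Thread-safe byte counters for one monitoring interval (hypothetical helper)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.read_bytes = 0
        self.write_bytes = 0

    def add_read(self, n):
        with self._lock:
            self.read_bytes += n

    def add_write(self, n):
        with self._lock:
            self.write_bytes += n

    def snapshot_and_reset(self):
        """Return (read, write, read + write) for the interval that just ended."""
        with self._lock:
            r, w = self.read_bytes, self.write_bytes
            self.read_bytes = self.write_bytes = 0
        return r, w, r + w


counter = IOCounter()


def u_read(fd, n):
    """Wrapper around os.read that records the bytes actually read."""
    data = os.read(fd, n)
    counter.add_read(len(data))
    return data


def u_write(fd, data):
    """Wrapper around os.write that records the bytes actually written."""
    written = os.write(fd, data)
    counter.add_write(written)
    return written


def demo():
    """Write 11 bytes to a temporary file, read them back, return interval totals."""
    fd, path = tempfile.mkstemp()
    try:
        u_write(fd, b"hello world")
        os.lseek(fd, 0, os.SEEK_SET)
        u_read(fd, 1024)
    finally:
        os.close(fd)
        os.remove(path)
    return counter.snapshot_and_reset()
```

Calling `snapshot_and_reset` at the end of each monitoring time $T_i$ would yield the read count, the write count and their sum for that interval.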
Step 103: record the byte count of the interface information-efficiency monitoring read/write functions
In the first I/O efficiency monitoring time $T_1$, the I/O statistics interface provided by the operating system is used: the raw data generated by the kernel is preprocessed to complete the I/O efficiency monitoring, yielding the system-monitored read function s_read and write function s_write; the number of bytes read by s_read is denoted
Figure BDA00003951578400001018
and the number of bytes written by s_write is denoted
Figure BDA0000395157840000111
Then, within $T_1$, the byte count of the interface information-efficiency monitoring read/write functions is expressed in aggregate as
Figure BDA0000395157840000112
Likewise, in the second I/O efficiency monitoring time $T_2$, the I/O statistics interface provided by the operating system is used: the raw data generated by the kernel is preprocessed to complete the I/O efficiency monitoring, yielding the system-monitored read function s_read and write function s_write; the number of bytes read by s_read is denoted $R_{s\_read}^{T_2}$, and the number of bytes written by s_write is denoted
Figure BDA0000395157840000114
Then, within $T_2$, the byte count of the interface information-efficiency monitoring read/write functions is expressed in aggregate as
Figure BDA0000395157840000115
Likewise, in the i-th I/O efficiency monitoring time $T_i$, the I/O statistics interface provided by the operating system is used: the raw data generated by the kernel is preprocessed to complete the I/O efficiency monitoring, yielding the system-monitored read function s_read and write function s_write; the number of bytes read by s_read is denoted $R_{s\_read}^{T_i}$, and the number of bytes written by s_write is denoted
Figure BDA0000395157840000117
Then, within $T_i$, the byte count of the interface information-efficiency monitoring read/write functions is expressed in aggregate as $R_{INT}^{T_i} = R_{s\_read}^{T_i} + R_{s\_write}^{T_i}$.
Likewise, in the (i+1)-th I/O efficiency monitoring time $T_{i+1}$, the I/O statistics interface provided by the operating system is used: the raw data generated by the kernel is preprocessed to complete the I/O efficiency monitoring, yielding the system-monitored read function s_read and write function s_write; the number of bytes read by s_read is denoted
Figure BDA0000395157840000119
and the number of bytes written by s_write is denoted
Figure BDA00003951578400001110
Then, within $T_{i+1}$, the byte count of the interface information-efficiency monitoring read/write functions is expressed in aggregate as
Figure BDA00003951578400001111
Likewise, in the last I/O efficiency monitoring time $T_q$, the I/O statistics interface provided by the operating system is used: the raw data generated by the kernel is preprocessed to complete the I/O efficiency monitoring, yielding the system-monitored read function s_read and write function s_write; the number of bytes read by s_read is denoted
Figure BDA00003951578400001112
and the number of bytes written by s_write is denoted
Figure BDA00003951578400001113
Then, within $T_q$, the byte count of the interface information-efficiency monitoring read/write functions is expressed in aggregate as
Figure BDA00003951578400001114
Over the total running time $T_{total}$, the byte counts of all interface information-efficiency monitoring read/write functions are denoted
Figure BDA00003951578400001115
abbreviated as the interface information-efficiency monitoring byte count.
In the present invention, I/O efficiency is monitored by several means, which helps accommodate user requirements in different application scenarios.
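The interface-information monitoring of step 103 can be sketched as follows. The text does not name the statistics interface; this sketch assumes the Linux per-process report `/proc/<pid>/io`, whose `read_bytes` and `write_bytes` fields are cumulative counters. The function names `parse_proc_io`, `interval_bytes` and `read_self_io` are hypothetical.

```python
def parse_proc_io(text):
    """Parse 'key: value' lines of a /proc/<pid>/io style report into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            stats[key.strip()] = int(value.strip())
    return stats


def interval_bytes(before, after):
    """Return (s_read, s_write, total) bytes between two cumulative snapshots."""
    s_read = after["read_bytes"] - before["read_bytes"]
    s_write = after["write_bytes"] - before["write_bytes"]
    return s_read, s_write, s_read + s_write


def read_self_io(path="/proc/self/io"):
    """Take one snapshot for the current process (Linux only; assumed interface)."""
    with open(path) as f:
        return parse_proc_io(f.read())
```

Taking one snapshot at the start and one at the end of a monitoring time $T_i$ and passing both to `interval_bytes` would give the per-interval read count, write count and their sum.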
Step 2: denoising the byte counts of the read/write functions
While the application runs, different I/O efficiency monitoring types MODE are involved. In the present invention, MODE comprises the address interception-efficiency monitoring type SYS, the function overloading-efficiency monitoring type USR and the interface information-efficiency monitoring type INT, i.e. MODE={SYS, USR, INT}.
According to the denoising condition
Figure BDA0000395157840000121
interference filtering is applied to the byte counts of the read/write functions to obtain the filtered byte counts. If the value of the condition is "1",
Figure BDA0000395157840000122
is selected and degree-of-parallelism extraction is performed, followed by quick location; if the value is "0", the degree-of-parallelism extraction for
Figure BDA0000395157840000123
is abandoned and quick location is performed directly.
q denotes the number of I/O efficiency monitoring times;
i denotes any one I/O efficiency monitoring time;
Figure BDA0000395157840000124
denotes the byte count of the read/write functions of any I/O efficiency monitoring type MODE within I/O efficiency monitoring time $T_i$;
Figure BDA0000395157840000125
denotes the byte count of the read/write functions of any I/O efficiency monitoring type MODE within I/O efficiency monitoring time $T_{i+1}$;
$C_{threshold}$ denotes the critical value, set for denoising, used to judge that a fluctuation has occurred.
In the present invention, after the address interception-efficiency monitoring byte count
Figure BDA0000395157840000126
is denoised according to denoising condition E, the resulting address interception-efficiency monitoring filtered byte count is denoted
Figure BDA0000395157840000127
In the present invention, after the function overloading-efficiency monitoring byte count
Figure BDA0000395157840000128
is denoised according to denoising condition E, the resulting function overloading-efficiency monitoring filtered byte count is denoted
Figure BDA0000395157840000129
In the present invention, after the interface information-efficiency monitoring byte count is denoised according to denoising condition E, the resulting interface information-efficiency monitoring filtered byte count is denoted
Figure BDA0000395157840000131
In the present invention, the denoising condition is used to distinguish interference from normal changes in I/O efficiency. The property the condition captures is continuity over time: interference is usually instantaneous, whereas a disturbance that persists can be regarded as a fluctuation. By processing the several kinds of read/write functions above in a unified way, I/O efficiency can be monitored using the read/write efficiency over multiple consecutive unit time intervals.
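The idea of separating instantaneous interference from sustained fluctuation can be sketched in Python. The exact form of denoising condition E is not recoverable from the text here, so this sketch assumes a relative-change test against $C_{threshold}$ and treats a deviation that lasts `persist` consecutive intervals as a genuine fluctuation. The function name `denoise` and the parameter `persist` are hypothetical.

```python
def denoise(samples, c_threshold, persist=3):
    """Filter per-interval byte counts (assumed rule, not the patent's condition E).

    A sample is kept when it differs from the last kept sample by at most
    c_threshold (relative). Deviating samples are held back; if the deviation
    lasts for `persist` consecutive intervals it is treated as a real
    fluctuation and accepted, otherwise it is dropped as interference.
    """
    if not samples:
        return []
    kept = [samples[0]]
    pending = []
    for r in samples[1:]:
        base = kept[-1]
        if abs(r - base) <= c_threshold * base:
            kept.append(r)
            pending.clear()  # the held-back samples were a transient spike
        else:
            pending.append(r)
            if len(pending) >= persist:
                kept.extend(pending)  # sustained change: accept the new level
                pending.clear()
    return kept
```

With `c_threshold=0.1`, a single dip in an otherwise stable series is discarded, while a drop that lasts three intervals is accepted as the new level.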
Step 3: quick location of the degree of parallelism
As shown in Fig. 3A and Fig. 3B, in the present invention a histogram of I/O efficiency monitoring is built with the degree of parallelism P, i.e. the number of processes or threads of the application, on the abscissa and the byte count of the read/write functions on the ordinate.
The degree of parallelism P takes values that are powers of 2, i.e. P=1, 2, 4, 8, ..., $2^p$, where p denotes the exponent. After it is started, the application begins running with degree of parallelism P=1, and the filtered byte count obtained at P=1 is denoted
Figure BDA0000395157840000132
after P=1, the application runs with degree of parallelism P=$2^p$, and the filtered byte count obtained is denoted
Figure BDA0000395157840000133
The filtered byte counts for the different degrees of parallelism are arranged into a histogram in ascending order of degree of parallelism, giving the degree of parallelism-I/O efficiency histogram; by comparing the byte counts in this histogram, the maximum byte count
Figure BDA0000395157840000134
is selected; this maximum byte count
Figure BDA0000395157840000135
is the inflection point of I/O efficiency monitoring, as shown in Fig. 3A.
The address interception-efficiency monitoring filtered byte count obtained at degree of parallelism P=1 is denoted
Figure BDA0000395157840000136
The address interception-efficiency monitoring filtered byte count obtained at degree of parallelism P=2 is denoted
Figure BDA0000395157840000137
The address interception-efficiency monitoring filtered byte count obtained at degree of parallelism P=4 is denoted
Figure BDA0000395157840000138
The address interception-efficiency monitoring filtered byte count obtained at degree of parallelism P=8 is denoted
Figure BDA0000395157840000139
The address interception-efficiency monitoring filtered byte count obtained at degree of parallelism P=16 is denoted
Figure BDA00003951578400001310
The address interception-efficiency monitoring filtered byte count obtained at degree of parallelism P=32 is denoted
Figure BDA0000395157840000141
The address interception-efficiency monitoring filtered byte count obtained at degree of parallelism P=$2^p$ is recorded in the same way.
The filtered byte counts under the different degrees of parallelism are entered into a histogram with the degree of parallelism P on the abscissa and the byte count on the ordinate, giving the address interception-efficiency monitoring histogram; by comparing the byte counts in this histogram, the maximum byte count
Figure BDA0000395157840000143
is selected; this maximum byte count
Figure BDA0000395157840000144
is the inflection point of the I/O efficiency monitoring for address interception.
The function overloading-efficiency monitoring filtered byte count obtained at degree of parallelism P=1 is denoted
Figure BDA0000395157840000145
The function overloading-efficiency monitoring filtered byte count obtained at degree of parallelism P=2 is denoted
Figure BDA0000395157840000146
The function overloading-efficiency monitoring filtered byte count obtained at degree of parallelism P=4 is denoted
Figure BDA0000395157840000147
The function overloading-efficiency monitoring filtered byte count obtained at degree of parallelism P=8 is denoted
Figure BDA0000395157840000148
The function overloading-efficiency monitoring filtered byte count obtained at degree of parallelism P=16 is denoted
Figure BDA0000395157840000149
The function overloading-efficiency monitoring filtered byte count obtained at degree of parallelism P=32 is denoted
Figure BDA00003951578400001410
The function overloading-efficiency monitoring filtered byte count obtained at degree of parallelism P=$2^p$ is denoted
Figure BDA00003951578400001411
The filtered byte counts under the different degrees of parallelism are entered into a histogram with the degree of parallelism P on the abscissa and the byte count on the ordinate, giving the function overloading-efficiency monitoring histogram; by comparing the byte counts in this histogram, the maximum byte count
Figure BDA00003951578400001412
is selected; this maximum byte count
Figure BDA00003951578400001413
is the inflection point of the I/O efficiency monitoring for function overloading.
The interface information-efficiency monitoring filtered byte count obtained at degree of parallelism P=1 is denoted
Figure BDA0000395157840000151
The interface information-efficiency monitoring filtered byte count obtained at degree of parallelism P=2 is denoted
Figure BDA0000395157840000152
The interface information-efficiency monitoring filtered byte count obtained at degree of parallelism P=4 is denoted
Figure BDA0000395157840000153
The interface information-efficiency monitoring filtered byte count obtained at degree of parallelism P=8 is denoted
Figure BDA0000395157840000154
The interface information-efficiency monitoring filtered byte count obtained at degree of parallelism P=16 is denoted
Figure BDA0000395157840000155
The interface information-efficiency monitoring filtered byte count obtained at degree of parallelism P=32 is denoted
Figure BDA0000395157840000156
The interface information-efficiency monitoring filtered byte count obtained at degree of parallelism P=$2^p$ is denoted
Figure BDA0000395157840000157
The filtered byte counts under the different degrees of parallelism are entered into a histogram with the degree of parallelism P on the abscissa and the byte count on the ordinate, giving the interface information-efficiency monitoring histogram; by comparing the byte counts in this histogram, the maximum byte count
Figure BDA0000395157840000158
is selected; this maximum byte count
Figure BDA0000395157840000159
is the inflection point of the I/O efficiency monitoring for interface information. The degree of parallelism P at the inflection point is denoted the coarse degree of parallelism $P_{coarse}$.
In the present invention, with reference to the state machine of Fig. 1: when the application is started it is in the initial state, and after the quick location process it enters the steady state and subsequently the final state. With reference to Fig. 3A: following the principle above, the application runs successively with degrees of parallelism 1, 2, 4, 8, 16 and 32, the degree of parallelism growing exponentially. The I/O efficiency of the application increases step by step up to degree of parallelism 8; when the application then runs with degree of parallelism 16, the I/O efficiency begins to fall. To ensure a degree of reliability, the degree of parallelism is increased further to 32, and the I/O efficiency shows no tendency to recover. The degree of parallelism is therefore not increased any further, and the inflection point of I/O efficiency is found at degree of parallelism 8, i.e. $P_{coarse}$=8.
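The quick location walk described for Fig. 3A can be sketched as an exponential probe with one confirmation step. This is a minimal sketch under assumptions: `measure` is a hypothetical callback returning the filtered byte count at a given degree of parallelism, `max_p` is a hypothetical safety bound, and "no tendency to recover" is interpreted as two successive doublings that fail to beat the best value seen so far. The sample figures are invented for illustration and are not taken from Fig. 3A.

```python
def quick_locate(measure, max_p=1024):
    """Double P from 1 and return the coarse degree of parallelism P_coarse.

    Probing stops once two successive doublings fail to improve on the best
    filtered byte count seen so far (the second doubling is the reliability check).
    """
    best_p, best_r = 1, measure(1)
    drops = 0
    p = 2
    while p <= max_p:
        r = measure(p)
        if r > best_r:
            best_p, best_r = p, r
            drops = 0
        else:
            drops += 1
            if drops == 2:
                break
        p *= 2
    return best_p


# Hypothetical filtered byte counts (MB per monitoring interval) by degree of parallelism.
SAMPLE_COARSE = {1: 40, 2: 75, 4: 130, 8: 180, 16: 150, 32: 120}
```

With the sample data, `quick_locate(SAMPLE_COARSE.__getitem__)` probes 1, 2, 4, 8, 16 and 32 and stops there, reporting 8 as the coarse degree of parallelism.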
Step 4: obtaining the optimal degree of parallelism
After the inflection point $P_{coarse}$=8 appears in Fig. 3A, the degree of parallelism is fine-tuned linearly with a smaller adjustment stride to obtain the best read/write efficiency, i.e. the fine degree of parallelism $P_{fine}$. With reference to the state machine of Fig. 1: after entering the steady state, the application passes through the fine-tuning process and enters the final state. With reference to Fig. 3B: following the principle above, the application is fine-tuned on both sides of the inflection point $P_{coarse}$=8 in steps of 2 units of parallelism. The method used in the figure starts from degree of parallelism 14 and decreases linearly by 2 units each time until the degree of parallelism has dropped to 4. It can be seen clearly that the fine degree of parallelism giving the best I/O performance is 10 ($P_{fine}$=10); the application thereafter performs its read/write operations with the optimal degree of parallelism $P_{fine}$=10. In the present invention, the fine-tuning toward the optimal degree of parallelism $P_{fine}$ is carried out by recording the read/write byte counts on both sides of the coarse degree of parallelism $P_{coarse}$ while reducing the degree of parallelism by 2 units at a time.
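The linear fine-tuning of step 4 can be sketched as a scan over the neighbourhood of $P_{coarse}$. The bounds are an assumption that generalizes the Fig. 3B example (from 14 down to 4 for $P_{coarse}$=8): the scan starts one stride below the next power of two and ends at the previous power of two. `measure` is a hypothetical callback as before, and the sample figures are invented for illustration.

```python
def fine_tune(measure, p_coarse, step=2):
    """Scan linearly around p_coarse and return the degree with the best byte count.

    Assumed range: from 2 * p_coarse - step down to p_coarse // 2, in strides of
    `step` (14, 12, 10, 8, 6, 4 when p_coarse is 8 and step is 2).
    """
    high = 2 * p_coarse - step
    low = max(p_coarse // 2, 1)
    candidates = list(range(high, low - 1, -step)) or [p_coarse]
    return max(candidates, key=measure)


# Hypothetical filtered byte counts around P_coarse = 8.
SAMPLE_FINE = {14: 160, 12: 175, 10: 190, 8: 180, 6: 165, 4: 130}
```

With the sample data, `fine_tune(SAMPLE_FINE.__getitem__, 8)` returns 10, matching the outcome described for Fig. 3B.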
In the present invention, the degree of parallelism obtained by fine-tuning is denoted the optimal degree of parallelism $P_{fine}$. Once $P_{fine}$ has been obtained, the program running on the application platform repeats the operations of step 1 to step 4 on the basis of $P_{fine}$ until the application finishes running. After obtaining $P_{fine}$, the application can adapt dynamically, both actively and passively, which improves I/O bandwidth and at the same time gives the application a larger speedup and higher throughput.
The storage-device-bandwidth-aware method for controlling the degree of parallelism of a program proposed by the present invention is a single control process that runs from the start of the application until the application ends.
Dynamic adaptation of the optimal degree of parallelism
(1) Active adaptive adjustment: in the present invention, active adjustment refers to the periodically triggered operation that unconditionally moves the application from the final state to the steady state. The state machine of the running application has an initial state, a steady state and a final state. After the optimal degree of parallelism $P_{fine}$ has been obtained, active adjustment takes the application from the final state to the steady state, where real-time monitoring of I/O efficiency is performed and the processing of step 2 to step 4 is then repeated. With reference to the state machine of Fig. 1: after entering the final state, the application unconditionally enters the steady state whenever active adjustment is triggered. In this process the processing of step 1 to step 4 is carried out only with the optimal degree of parallelism $P_{fine}$, so as to track real-time changes in the application's I/O bandwidth while also giving the application a larger speedup and higher throughput.
(2) Passive adaptive adjustment: in the present invention, passive adjustment refers to the operation, triggered when the rate of change of the application's I/O efficiency in the final state exceeds a threshold, that moves the application from the final state to the initial state. After the optimal degree of parallelism $P_{fine}$ has been obtained, passive adjustment takes the application from the final state to the initial state, where real-time monitoring of I/O efficiency is performed and the processing of step 2 to step 4 is then repeated.
While the application on the application platform is running at the optimal degree of parallelism $P_{fine}$, whenever the rate of change B of the application's I/O read/write efficiency exceeds the given threshold parameter $B_{threshold}$ for triggering passive adjustment, a new round of quick location and fine-tuning is started immediately; this is passive adjustment. With reference to the state machine of Fig. 1: after entering the final state, the application unconditionally enters the initial state when passive adjustment is triggered, starting a new round of probing.
The condition for triggering passive adjustment is B > $B_{threshold}$, where
Figure BDA0000395157840000171
Figure BDA0000395157840000172
denotes the byte count of the read/write functions of any I/O efficiency monitoring type MODE within I/O efficiency monitoring time $T_i$.
Figure BDA0000395157840000173
denotes the byte count of the read/write functions of any I/O efficiency monitoring type MODE within I/O efficiency monitoring time $T_{i+1}$.
The techniques of active and passive adjustment are described below with reference to Fig. 4. Without active adjustment, the I/O efficiency of the application at I/O efficiency monitoring time T2 falls below the current threshold and passive adjustment is triggered: the application immediately performs a new round of quick location, probing upward from a low degree of parallelism, and the probing is not complete, with the final state reached, until I/O efficiency monitoring time T3. With active adjustment, the application triggers active adjustment already at I/O efficiency monitoring time T1; active adjustment causes only fine-tuning, after which the final state is reached at I/O efficiency monitoring time T3. As this process shows, once active adjustment is adopted, the active adjustment triggered at T1 prevents the application from triggering passive adjustment at T2, so the average I/O efficiency of the application between T1 and T3 improves considerably and the negative effect of dynamic adjustment on the application's normal operation is reduced.
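The three-state machine with its active and passive transitions can be sketched as follows. The formula for the rate of change B is not recoverable from the text here, so `change_rate` assumes a relative change between the byte counts of consecutive monitoring times. The names `State`, `change_rate` and `next_state`, the parameter `timer_fired` and the default `b_threshold` are hypothetical.

```python
from enum import Enum


class State(Enum):
    INITIAL = "initial"  # quick location (exponential probing)
    STEADY = "steady"    # fine-tuning around the coarse degree of parallelism
    FINAL = "final"      # running at the optimal degree of parallelism


def change_rate(r_prev, r_next):
    """Assumed form of B: relative change in byte count between consecutive intervals."""
    if r_prev == 0:
        return float("inf") if r_next else 0.0
    return abs(r_next - r_prev) / r_prev


def next_state(state, *, r_prev=None, r_next=None, b_threshold=0.3, timer_fired=False):
    """Return the next state of the controller.

    INITIAL -> STEADY once quick location has finished;
    STEADY  -> FINAL once fine-tuning has finished;
    FINAL   -> INITIAL on passive adjustment (B > b_threshold);
    FINAL   -> STEADY on active adjustment (periodic timer fired).
    """
    if state is State.INITIAL:
        return State.STEADY
    if state is State.STEADY:
        return State.FINAL
    if r_prev is not None and r_next is not None:
        if change_rate(r_prev, r_next) > b_threshold:
            return State.INITIAL  # passive adjustment restarts probing
    if timer_fired:
        return State.STEADY       # active adjustment only re-runs fine-tuning
    return State.FINAL
```

Checking the passive condition before the timer reflects one possible priority when both fire in the same interval; the text does not specify which takes precedence.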
For I/O-intensive applications, throughput is an important performance indicator, and at the physical level improving it means raising I/O efficiency. To control I/O efficiency better, the application's I/O efficiency can be supplied to the user application as feedback information, and a feedback regulation mechanism used to maximize I/O efficiency. In a feedback regulation mechanism, the controller's adjustment algorithm adjusts the input to produce an output, and the difference between that output and the reference output is supplied to the controller as the feedback quantity, so that the output eventually comes as close as possible to the reference.

Claims (4)

1. A storage-device-bandwidth-aware method for controlling the degree of parallelism of a program, wherein a program running on an application platform can be in one of three states according to program I/O efficiency, namely an initial state, a steady state and a final state;
The initial state is the state in which the application, on being started, runs with a low degree of parallelism;
The steady state is the state in which the application runs stably after leaving the initial state;
The final state is the state in which the application, after passing through the steady state, runs while maintaining high I/O efficiency;
characterized in that the method for controlling the degree of parallelism of a program comprises the following processing steps:
Step 1: real-time monitoring of I/O efficiency
Step 101: record the byte count of the address interception-efficiency monitoring read/write functions
In the i-th I/O efficiency monitoring time $T_i$, address interception is used: the entry function read of the read operation provided by the operating system is marked as the I/O-efficiency-monitored read function c_read, and the number of bytes read by c_read is recorded and denoted
Figure FDA0000395157830000011
and the entry function write of the write operation provided by the operating system is marked as the I/O-efficiency-monitored write function c_write, with the number of bytes written by c_write recorded and denoted
Figure FDA0000395157830000012
Then, within $T_i$, the byte count of the address interception-efficiency monitoring read/write functions is expressed in aggregate as $R_{SYS}^{T_i} = R_{c\_read}^{T_i} + R_{c\_write}^{T_i}$;
Over the total running time $T_{total}$ of the application, the byte counts of all address interception-efficiency monitoring read/write functions are recorded, abbreviated as the address interception-efficiency monitoring byte count
Figure FDA0000395157830000015
Step 102: record the byte count of the function overloading-efficiency monitoring read/write functions
In the (i+1)-th I/O efficiency monitoring time $T_{i+1}$, function overloading is used to provide the user with a read function u_read and a write function u_write that carry monitoring capability; the number of bytes read by u_read is denoted
Figure FDA0000395157830000016
and the number of bytes written by u_write is denoted
Figure FDA0000395157830000017
Then, within $T_{i+1}$, the byte count of the function overloading-efficiency monitoring read/write functions is expressed in aggregate as
Figure FDA0000395157830000018
Over the total running time $T_{total}$, the byte counts of all function overloading-efficiency monitoring read/write functions are recorded, abbreviated as the function overloading-efficiency monitoring byte count
Figure FDA00003951578300000110
Step 103: record the byte count of the interface information-efficiency monitoring read/write functions
In the i-th I/O efficiency monitoring time $T_i$, the I/O statistics interface provided by the operating system is used: the raw data generated by the kernel is preprocessed to complete the I/O efficiency monitoring, yielding the system-monitored read function s_read and write function s_write; the number of bytes read by s_read is denoted
Figure FDA00003951578300000111
and the number of bytes written by s_write is denoted $R_{s\_write}^{T_i}$; then, within $T_i$, the byte count of the interface information-efficiency monitoring read/write functions is expressed in aggregate as
Figure FDA00003951578300000113
Over the total running time $T_{total}$, the byte counts of all interface information-efficiency monitoring read/write functions are denoted
Figure FDA00003951578300000114
abbreviated as the interface information-efficiency monitoring byte count
Figure FDA0000395157830000021
Step 2: denoising the byte counts of the read/write functions
While the application runs, different I/O efficiency monitoring types MODE are involved; MODE comprises the address interception-efficiency monitoring type SYS, the function overloading-efficiency monitoring type USR and the interface information-efficiency monitoring type INT, i.e. MODE={SYS, USR, INT};
According to the denoising condition
Figure FDA0000395157830000022
interference filtering is applied to the byte counts of the read/write functions to obtain the filtered byte counts; if the value of the condition is "1",
Figure FDA0000395157830000023
is selected and degree-of-parallelism extraction is performed, followed by quick location; if the value is "0", the degree-of-parallelism extraction for
Figure FDA0000395157830000024
is abandoned and quick location is performed directly;
q denotes the number of I/O efficiency monitoring times;
i denotes any one I/O efficiency monitoring time;
Figure FDA0000395157830000025
denotes the byte count of the read/write functions of any I/O efficiency monitoring type MODE within I/O efficiency monitoring time $T_i$;
Figure FDA0000395157830000026
denotes the byte count of the read/write functions of any I/O efficiency monitoring type MODE within I/O efficiency monitoring time $T_{i+1}$;
$C_{threshold}$ denotes the critical value, set for denoising, used to judge that a fluctuation has occurred;
After the address interception-efficiency monitoring byte count
Figure FDA0000395157830000027
is denoised according to denoising condition E, the resulting address interception-efficiency monitoring filtered byte count is denoted
Figure FDA0000395157830000028
After the function overloading-efficiency monitoring byte count
Figure FDA0000395157830000029
is denoised according to denoising condition E, the resulting function overloading-efficiency monitoring filtered byte count is denoted
Figure FDA00003951578300000210
After the interface information-efficiency monitoring byte count is denoised according to denoising condition E, the resulting interface information-efficiency monitoring filtered byte count is denoted
Figure FDA00003951578300000212
Step 3: quick location of the degree of parallelism
The degree of parallelism P takes values that are powers of 2, i.e. P=1, 2, 4, 8, ..., $2^p$, where p denotes the exponent; after it is started, the application begins running with degree of parallelism P=1, and the filtered byte count obtained at P=1 is denoted
Figure FDA0000395157830000031
after P=1, the application runs with degree of parallelism P=$2^p$, and the filtered byte count obtained is denoted
Figure FDA0000395157830000032
The filtered byte counts for the different degrees of parallelism are arranged into a histogram in ascending order of degree of parallelism, giving the degree of parallelism-I/O efficiency histogram; by comparing the byte counts in this histogram, the maximum byte count
Figure FDA0000395157830000033
is selected; this maximum byte count is the inflection point of I/O efficiency monitoring, as shown in Fig. 3A;
The address interception-efficiency monitoring filtered byte count obtained at degree of parallelism P=1 is denoted
Figure FDA0000395157830000035
The address interception-efficiency monitoring filtered byte count obtained at degree of parallelism P=2 is recorded in the same way;
The address interception-efficiency monitoring filtered byte count obtained at degree of parallelism P=4 is denoted
Figure FDA0000395157830000037
The address interception-efficiency monitoring filtered byte count obtained at degree of parallelism P=8 is denoted
Figure FDA0000395157830000038
The address interception-efficiency monitoring filtered byte count obtained at degree of parallelism P=16 is denoted
Figure FDA0000395157830000039
The address interception-efficiency monitoring filtered byte count obtained at degree of parallelism P=32 is denoted
Figure FDA00003951578300000310
The address interception-efficiency monitoring filtered byte count obtained at degree of parallelism P=$2^p$ is denoted
Figure FDA00003951578300000311
The filtered byte counts under the different degrees of parallelism are entered into a histogram with the degree of parallelism P on the abscissa and the byte count on the ordinate, giving the address interception-efficiency monitoring histogram; by comparing the byte counts in this histogram, the maximum byte count
Figure FDA00003951578300000312
is selected; this maximum byte count
Figure FDA00003951578300000313
is the inflection point of the I/O efficiency monitoring for address interception;
The filtered byte count obtained by function-overloading efficiency monitoring at degree of parallelism P = 1 is denoted
Figure FDA00003951578300000314
The filtered byte count obtained by function-overloading efficiency monitoring at degree of parallelism P = 2 is denoted
Figure FDA0000395157830000041
The filtered byte count obtained by function-overloading efficiency monitoring at degree of parallelism P = 4 is denoted
The filtered byte count obtained by function-overloading efficiency monitoring at degree of parallelism P = 8 is denoted
Figure FDA0000395157830000043
The filtered byte count obtained by function-overloading efficiency monitoring at degree of parallelism P = 16 is denoted
Figure FDA0000395157830000044
The filtered byte count obtained by function-overloading efficiency monitoring at degree of parallelism P = 32 is denoted
Figure FDA0000395157830000045
The filtered byte count obtained by function-overloading efficiency monitoring at degree of parallelism P = 2^p is denoted
Figure FDA00003951578300000415
The filtered byte counts under the different degrees of parallelism are entered into a histogram with the degree of parallelism P on the horizontal axis and the byte count on the vertical axis, giving the function-overloading efficiency monitoring histogram; by comparing the byte counts in this histogram, the maximum byte count
Figure FDA0000395157830000046
is selected; this maximum byte count
Figure FDA0000395157830000047
is the inflection point of I/O efficiency monitoring for function overloading;
The filtered byte count obtained by interface-message efficiency monitoring at degree of parallelism P = 1 is denoted
The filtered byte count obtained by interface-message efficiency monitoring at degree of parallelism P = 2 is denoted
Figure FDA0000395157830000049
The filtered byte count obtained by interface-message efficiency monitoring at degree of parallelism P = 4 is denoted
Figure FDA00003951578300000410
The filtered byte count obtained by interface-message efficiency monitoring at degree of parallelism P = 8 is denoted
Figure FDA00003951578300000411
The filtered byte count obtained by interface-message efficiency monitoring at degree of parallelism P = 16 is denoted
Figure FDA00003951578300000412
The filtered byte count obtained by interface-message efficiency monitoring at degree of parallelism P = 32 is denoted
Figure FDA00003951578300000413
The filtered byte count obtained by interface-message efficiency monitoring at degree of parallelism P = 2^p is denoted
Figure FDA00003951578300000414
The filtered byte counts under the different degrees of parallelism are entered into a histogram with the degree of parallelism P on the horizontal axis and the byte count on the vertical axis, giving the interface-message efficiency monitoring histogram; by comparing the byte counts in this histogram, the maximum byte count
Figure FDA0000395157830000051
is selected; this maximum byte count
Figure FDA0000395157830000052
is the inflection point of I/O efficiency monitoring for interface messages;
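The histogram step can be illustrated with a minimal Python sketch. It is only an illustration, not the patented implementation: the function name `find_inflection`, the mode keys, and all byte counts are hypothetical, and the sketch assumes that ties are resolved in favour of the lower degree of parallelism, which the claim does not specify.

```python
from typing import Dict, Tuple


def find_inflection(histogram: Dict[int, int]) -> Tuple[int, int]:
    """Return (degree of parallelism, byte count) of the tallest histogram bar.

    `histogram` maps a degree of parallelism P to the filtered byte count
    measured at that P. Ties go to the lower P (an assumption of this sketch).
    """
    if not histogram:
        raise ValueError("histogram is empty")
    best_p = min(histogram)
    # Walk the bars in ascending order of parallelism, as in the claim.
    for p in sorted(histogram):
        if histogram[p] > histogram[best_p]:
            best_p = p
    return best_p, histogram[best_p]


# Hypothetical filtered byte counts for P = 1, 2, 4, ..., 32, one histogram
# per I/O efficiency monitoring type named in the claim.
samples = {
    "address_interception": {1: 120, 2: 230, 4: 410, 8: 640, 16: 590, 32: 470},
    "function_overloading": {1: 100, 2: 190, 4: 350, 8: 520, 16: 610, 32: 540},
    "interface_message": {1: 90, 2: 170, 4: 300, 8: 300, 16: 280, 32: 250},
}

inflections = {mode: find_inflection(h) for mode, h in samples.items()}
```

With these made-up numbers the inflection point falls at a different degree of parallelism for each monitoring type, which is why the claim builds one histogram per type.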
Step 4: obtain the optimal degree of parallelism
Starting from the coarse degree of parallelism P_coarse recorded in the histogram of filtered byte counts under the different degrees of parallelism, the read/write efficiency byte counts are recorded on both sides of P_coarse in steps of 2 parallelism units, to obtain the optimal degree of parallelism P_fine; after the optimal degree of parallelism P_coarse is obtained, the program running on the application platform restarts the operations of step 1 to step 4 according to P_coarse, until the application finishes running;
After the optimal degree of parallelism P_coarse is obtained, the application can carry out active and passive dynamic self-adaptation, which improves I/O bandwidth and also lets the application achieve a greater speedup and higher throughput.
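The fine-tuning around the coarse degree of parallelism can be sketched as follows. This is a hedged illustration under stated assumptions: `measure` is a hypothetical callback that returns the read/write byte count observed at a given degree of parallelism, the bounds `min_p` and `max_p` are invented for the example, and the rule "stop probing a side as soon as the byte count stops increasing" is an assumption of this sketch that the claim does not state.

```python
from typing import Callable


def fine_tune(
    p_coarse: int,
    measure: Callable[[int], int],
    step: int = 2,
    min_p: int = 1,
    max_p: int = 64,
) -> int:
    """Probe both sides of p_coarse in steps of `step` and return the best P."""
    center_bytes = measure(p_coarse)
    best_p, best_bytes = p_coarse, center_bytes
    for direction in (-1, 1):  # left side first, then right side
        last_bytes = center_bytes
        p = p_coarse + direction * step
        while min_p <= p <= max_p:
            observed = measure(p)
            if observed <= last_bytes:
                break  # efficiency stopped improving on this side
            if observed > best_bytes:
                best_p, best_bytes = p, observed
            last_bytes = observed
            p += direction * step
    return best_p


# Usage with a synthetic efficiency curve that peaks at P = 12.
p_fine = fine_tune(8, lambda p: 1000 - (p - 12) ** 2)
```

In this synthetic case the coarse search over powers of two stops at P = 8, and the step-of-2 probing moves the result to the true peak between 8 and 16.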
2. The method for controlling the degree of parallelism of a program with awareness of storage device bandwidth according to claim 1, characterized in that the active adaptive adjustment carried out by the application is as follows: after the optimal degree of parallelism P_fine is obtained, real-time I/O efficiency monitoring is carried out by active adjustment in the transition from the final state to the steady state, after which the processing of step 2 to step 4 is repeated.
3. The method for controlling the degree of parallelism of a program with awareness of storage device bandwidth according to claim 1, characterized in that the passive adaptive adjustment carried out by the application is as follows: after the optimal degree of parallelism P_fine is obtained, real-time I/O efficiency monitoring is carried out by passive adjustment in the transition from the final state to the initial state, after which the processing of step 2 to step 4 is repeated; while the application on the application platform runs at the optimal degree of parallelism P_fine, when the rate of change B of the application's I/O read/write efficiency exceeds the given threshold parameter B_threshold that triggers passive adjustment, a new round of fast positioning and fine-tuning is started immediately; the condition that triggers passive adjustment is B > B_threshold, and
Figure FDA0000395157830000053
Figure FDA0000395157830000054
denotes the number of bytes of the read/write functions of any I/O efficiency monitoring type MODE at I/O efficiency monitoring time T_i,
Figure FDA0000395157830000055
denotes the number of bytes of the read/write functions of any I/O efficiency monitoring type MODE at I/O efficiency monitoring time T_{i+1}.
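The trigger condition B > B_threshold can be sketched in Python. The formula for B survives here only as an image reference, so this sketch assumes, purely for illustration, that B is the relative change in byte count between two consecutive monitoring times T_i and T_{i+1}; the function names and the sample numbers are hypothetical.

```python
def rate_of_change(bytes_at_ti: int, bytes_at_ti1: int) -> float:
    """Relative change in read/write byte count between T_i and T_{i+1}.

    ASSUMPTION: B = |b(T_{i+1}) - b(T_i)| / b(T_i). The patent's own formula
    for B is not recoverable from this text, so this is a stand-in.
    """
    if bytes_at_ti <= 0:
        raise ValueError("byte count at T_i must be positive")
    return abs(bytes_at_ti1 - bytes_at_ti) / bytes_at_ti


def should_trigger_passive_adjustment(
    bytes_at_ti: int, bytes_at_ti1: int, b_threshold: float
) -> bool:
    """Return True when B > B_threshold, i.e. a new round of fast
    positioning and fine-tuning should start."""
    return rate_of_change(bytes_at_ti, bytes_at_ti1) > b_threshold


# A 30% drop in throughput exceeds a 20% threshold; a 5% drop does not.
big_drop = should_trigger_passive_adjustment(1000, 700, 0.2)
small_drop = should_trigger_passive_adjustment(1000, 950, 0.2)
```

Using a relative rather than absolute change keeps the threshold meaningful across platforms with very different raw bandwidth, though whether the patent normalizes in this way is not confirmed by the text.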
4. The method for controlling the degree of parallelism of a program with awareness of storage device bandwidth according to claim 1, characterized in that the method adaptively and dynamically determines, according to the overall hardware conditions and the real-time load of different application platforms, the best I/O efficiency that the application can obtain, and is a method for controlling the degree of parallelism of a program with awareness of storage device bandwidth.
CN201310477537.2A 2013-10-14 2013-10-14 Method for controlling parallelism degree of program capable of sensing band width of storage device Expired - Fee Related CN103677757B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310477537.2A CN103677757B (en) 2013-10-14 2013-10-14 Method for controlling parallelism degree of program capable of sensing band width of storage device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310477537.2A CN103677757B (en) 2013-10-14 2013-10-14 Method for controlling parallelism degree of program capable of sensing band width of storage device

Publications (2)

Publication Number Publication Date
CN103677757A true CN103677757A (en) 2014-03-26
CN103677757B CN103677757B (en) 2016-01-06

Family

ID=50315436

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310477537.2A Expired - Fee Related CN103677757B (en) 2013-10-14 2013-10-14 Method for controlling parallelism degree of program capable of sensing band width of storage device

Country Status (1)

Country Link
CN (1) CN103677757B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109324894A (en) * 2018-08-13 2019-02-12 中兴飞流信息科技有限公司 PC cluster method, apparatus and computer readable storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7437546B2 (en) * 2005-08-03 2008-10-14 Intel Corporation Multiple, cooperating operating systems (OS) platform system and method
CN102521047A (en) * 2011-11-15 2012-06-27 重庆邮电大学 Method for realizing interrupted load balance among multi-core processors

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
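Liu Yi et al.: "Task allocation problem and algorithms in large-scale parallel systems with multi-core processors", Journal of Chinese Computer Systems (《小型微型计算机系统》), vol. 29, no. 5, 15 May 2008 (2008-05-15), pages 972 - 975 *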

Also Published As

Publication number Publication date
CN103677757B (en) 2016-01-06

Legal Events

Date Code Title Description
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20210420

Address after: 100160, No. 4, building 12, No. 128, South Fourth Ring Road, Fengtai District, Beijing, China (1515-1516)

Patentee after: Kaixi (Beijing) Information Technology Co.,Ltd.

Address before: 100191 Haidian District, Xueyuan Road, No. 37,

Patentee before: BEIHANG University

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160106

Termination date: 20211014
