CN108762684A - Hot spot data migration flow control method, device, electronic equipment and storage medium - Google Patents

Hot spot data migration flow control method, device, electronic equipment and storage medium

Info

Publication number
CN108762684A
Authority
CN
China
Prior art keywords
data block
data
flow control
period
hot spot
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810565747.XA
Other languages
Chinese (zh)
Other versions
CN108762684B (en)
Inventor
陈学伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN201810565747.XA (patent CN108762684B)
Priority to PCT/CN2018/100168 (patent WO2019232925A1)
Publication of CN108762684A
Application granted
Publication of CN108762684B
Active (current legal status)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061 Improving I/O performance
    • G06F3/0611 Improving I/O performance in relation to response time
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0614 Improving the reliability of storage systems
    • G06F3/0619 Improving the reliability of storage systems in relation to data integrity, e.g. data losses, bit errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/064 Management of blocks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0647 Migration mechanisms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0655 Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • G06F3/0656 Data buffering arrangements

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Memory System Of A Hierarchy Structure (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A hot spot data migration flow control method, including: recording, every preset time period, the data set accessed by users; dividing the data set into multiple data blocks; determining whether any of the multiple data blocks is hot spot data; when a data block is determined to be hot spot data, determining whether that data block has been written into the cache; when the data block determined to be hot spot data has not been written into the cache, obtaining the flow control threshold corresponding to the current statistical period within the migration period; and writing the data block determined to be hot spot data into the cache based on the flow control threshold corresponding to the current statistical period. The present invention also provides a hot spot data migration flow control device, electronic equipment and a storage medium. By writing hot spot data into the cache, the present invention saves the time needed to read hot spot data while avoiding a noticeable impact on normal input/output service performance, and achieves a good flow control effect.

Description

Hot spot data migration flow control method, device, electronic equipment and storage medium
Technical field
The present invention relates to the field of computer technology, and in particular to a hot spot data migration flow control method, device, electronic equipment and storage medium.
Background technology
A cache is a buffer for data exchange. When a piece of hardware, such as a CPU, needs to read data, it first looks for the data in the cache; if the data is found, it is used directly, and if not, the hardware looks for it in memory. Because a cache runs faster than memory, its role is to help the hardware run faster.
However, a cache holds a copy of only a small part of the data in memory, so when the hardware looks for data in the cache, the data may not be found (because it has not been copied from memory into the cache). The hardware then has to fetch the data from memory, which slows down the whole system.
Hot spot data is data that the hardware needs to use frequently. Storing hot spot data in the cache in advance allows the hardware to obtain it directly from the cache when it is needed, which saves data acquisition time.
However, storing hot spot data into the cache generates a large amount of input/output (Input/Output, IO). If this coincides with the IO peak period of user applications, it affects the response time of those applications and gives users a poor experience.
Summary of the invention
In view of the foregoing, it is necessary to propose a hot spot data migration flow control method, device, electronic equipment and storage medium that, by writing hot spot data into the cache, save the time needed to read hot spot data while avoiding a noticeable impact on normal input/output service performance, and that achieve a good flow control effect.
A first aspect of the present invention provides a hot spot data migration flow control method, the method including:
recording, every preset time period, the data set accessed by users;
dividing the data set into multiple data blocks;
determining whether any of the multiple data blocks is hot spot data;
when a data block is determined to be hot spot data, determining whether the data block determined to be hot spot data has been written into the cache;
when the data block determined to be hot spot data has not been written into the cache, obtaining the flow control threshold corresponding to the current statistical period within the migration period;
writing the data block determined to be hot spot data into the cache based on the flow control threshold corresponding to the current statistical period.
Preferably, dividing the data set into multiple data blocks includes:
dividing the data set evenly into a preset number of data blocks; or
randomly dividing the data set into a preset number of data blocks; or
dividing the data set into multiple data blocks of a preset size.
Preferably, whether any of the multiple data blocks is hot spot data is determined by calculating the probability that each data block is accessed and predicting from that probability whether the data block is hot spot data, including: counting the number of times each data block is accessed within the preset time period; calculating, from the number of times each data block is accessed within the preset time period, the probability that each data block is accessed within the preset time period; determining whether the access probability of each data block is greater than a preset probability value; when the access probability of a data block is greater than the preset probability value, determining that the data block is hot spot data; and when the access probability of a data block is less than or equal to the preset probability value, determining that the data block is non-hot-spot data.
Preferably, obtaining the flow control threshold corresponding to the current statistical period within the migration period includes:
determining whether the current statistical period is the first statistical period;
when the current statistical period is the first statistical period, using the default flow control threshold as the flow control threshold corresponding to the current statistical period;
when the current statistical period is not the first statistical period, obtaining the IO load of user applications in the previous statistical period, and determining the flow control threshold corresponding to the current statistical period according to that IO load.
Preferably, determining the flow control threshold corresponding to the current statistical period according to the IO load of user applications in the previous statistical period includes:
obtaining the data block size of each IO of user applications in the previous statistical period, and calculating the average data block size of the IOs in the previous statistical period;
obtaining the transmission delay of each data block in the previous statistical period, and calculating the average data block delay of the IOs in the previous statistical period;
obtaining a preset reference value of the IO data block size and the corresponding reference value of the data block delay;
calculating the IO load intensity in the previous statistical period from the average data block size of the IOs in the previous statistical period, the average data block delay, the reference value of the data block size and the corresponding reference value of the data block delay;
determining the IO load category in the previous statistical period from the IO load intensity in the previous statistical period, using a pre-trained load classification model;
calculating the flow control threshold corresponding to the current statistical period according to the IO load category in the previous statistical period.
Preferably, the formula for calculating the IO load intensity in the previous statistical period from the average data block size of the IOs in the previous statistical period, the average data block delay, the reference value of the data block size and the corresponding reference value of the data block delay is: [formula image not reproduced], where X is the average data block size of the IOs in the previous statistical period, Y is the average data block delay, M is the reference value of the data block size, and N is the corresponding reference value of the data block delay.
Preferably, calculating the flow control threshold corresponding to the current statistical period according to the IO load category in the previous statistical period includes:
when the IO load category in the previous statistical period is the high-load category, reducing the flow control threshold corresponding to the previous statistical period by a first preset amplitude to obtain the flow control threshold corresponding to the current statistical period;
when the IO load category in the previous statistical period is the low-load category, raising the flow control threshold corresponding to the previous statistical period by a second preset amplitude to obtain the flow control threshold corresponding to the current statistical period;
when the IO load category in the previous statistical period is the normal-load category, using the flow control threshold corresponding to the previous statistical period as the flow control threshold corresponding to the current statistical period.
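The three adjustment rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not say whether the preset amplitudes are absolute or relative, so they are treated here as absolute amounts, and the names `LoadCategory`, `next_threshold`, `decrease_step`, `increase_step` and the `floor` guard are hypothetical.

```python
from enum import Enum


class LoadCategory(Enum):
    """Hypothetical labels for the three IO load categories."""
    HIGH = "high"
    NORMAL = "normal"
    LOW = "low"


def next_threshold(previous, category, decrease_step, increase_step, floor=0.0):
    """Derive the current period's threshold from the previous period's.

    Assumption: the first and second preset amplitudes are absolute amounts
    (decrease_step and increase_step). The floor is an added safety guard
    that is not part of the original description.
    """
    if category is LoadCategory.HIGH:
        # High load: lower the threshold by the first preset amplitude.
        return max(floor, previous - decrease_step)
    if category is LoadCategory.LOW:
        # Low load: raise the threshold by the second preset amplitude.
        return previous + increase_step
    # Normal load: keep the previous threshold unchanged.
    return previous
```

Using separate steps for lowering and raising lets migration back off quickly under load and recover more cautiously, but any such tuning is a deployment choice rather than something the text prescribes.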
A second aspect of the present invention provides a hot spot data migration flow control device, the device including:
a recording module, configured to record, every preset time period, the data set accessed by users;
a division module, configured to divide the data set into multiple data blocks;
a judgment module, configured to determine whether any of the multiple data blocks is hot spot data;
the judgment module being further configured to determine, when a data block is determined to be hot spot data, whether the data block determined to be hot spot data has been written into the cache;
an acquisition module, configured to obtain the flow control threshold corresponding to the current statistical period within the migration period when the judgment module determines that the data block determined to be hot spot data has not been written into the cache;
a migration module, configured to write the data block determined to be hot spot data into the cache based on the flow control threshold corresponding to the current statistical period.
A third aspect of the present invention provides an electronic device including a processor and a memory, the processor being configured to implement the hot spot data migration flow control method when executing a computer program stored in the memory.
A fourth aspect of the present invention provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the hot spot data migration flow control method.
With the hot spot data migration flow control method, device, electronic equipment and storage medium of the present invention, the data set accessed by users is recorded every preset time period and divided into multiple data blocks. When a data block is determined to be hot spot data and has not been written into the cache, the flow control thresholds corresponding to the different statistical periods within the migration period are obtained, and the data block determined to be hot spot data is written into the cache based on the flow control threshold corresponding to each statistical period. This improves the efficiency of migrating user data to the cache and reduces the risk of data loss, while avoiding a noticeable impact on normal input/output service performance and achieving a good flow control effect.
Description of the drawings
To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are only embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from the provided drawings without creative effort.
Fig. 1 is a flow chart of the hot spot data migration flow control method provided by Embodiment One of the present invention.
Fig. 2 is a flow chart of the method, provided by Embodiment Two of the present invention, for determining the flow control threshold corresponding to the current statistical period according to the IO load of user applications in the previous statistical period.
Fig. 3 is a functional block diagram of the hot spot data migration flow control device provided by Embodiment Three of the present invention.
Fig. 4 is a schematic diagram of the electronic device provided by Embodiment Four of the present invention.
The following detailed description further illustrates the present invention with reference to the above drawings.
Detailed description
To make the objects, features and advantages of the present invention easier to understand, the present invention is described in detail below with reference to the drawings and specific embodiments. It should be noted that, where there is no conflict, the embodiments of the present invention and the features in the embodiments can be combined with each other.
Many specific details are set forth in the following description to facilitate a thorough understanding of the present invention. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by those skilled in the technical field of the present invention. The terms used in the description of the present invention are only for the purpose of describing specific embodiments and are not intended to limit the present invention.
The hot spot data migration flow control method of the embodiments of the present invention is applied in one or more electronic devices. The method can also be applied in a hardware environment consisting of an electronic device and a server connected to the electronic device through a network. Networks include but are not limited to wide area networks, metropolitan area networks and local area networks. The method can be executed by the server, by the electronic device, or jointly by the server and the electronic device.
For an electronic device that needs to perform hot spot data migration flow control, the hot spot data migration flow control function provided by the method of the present invention can be integrated directly on the electronic device, or a client implementing the method of the present invention can be installed. Alternatively, the method provided by the present invention can run on a device such as a server in the form of a Software Development Kit (SDK), which exposes an interface to the hot spot data migration flow control function; the electronic device or other devices can then perform flow control on background disk writes through that interface.
Embodiment one
Fig. 1 is a flow chart of the hot spot data migration flow control method provided by Embodiment One of the present invention. Depending on requirements, the order of execution in the flow chart may change and some steps may be omitted.
S11, record the data set accessed by users every preset time period.
The preset time period is a time period set in advance, for example one week or 10 days. The present invention places no specific limit on the preset time period; it can be set according to the hardware or data access patterns of the electronic system.
When the electronic device detects a user's data access instruction, it responds to the instruction and returns the requested data to the user. All data accessed by users within the preset time period is recorded as the data set.
S12, divide the data set into multiple data blocks.
The recorded data set accessed by users is divided into multiple data blocks.
In a preferred embodiment of the present invention, dividing the data set into multiple data blocks may include one or a combination of the following:
1) Divide the data set evenly into a preset number of data blocks.
The preset number is the number of data blocks set in advance; for example, the data set is divided evenly into 10 data blocks, each of the same size.
2) Divide the data set randomly into a preset number of data blocks.
For example, the data set is randomly divided into 10 data blocks, each of a different size.
3) Divide the data set into multiple data blocks of a preset size.
The preset size is the data block size set in advance; for example, the data set is divided into multiple data blocks of 1Mb each. The preset size can also be 10Mb or larger.
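The three division options can be sketched as follows. This is a minimal illustration, assuming the data set is available as a single `bytes` object; the function names are hypothetical, and the random variant only guarantees non-empty blocks at random cut points (it does not force every block size to be distinct).

```python
import random


def split_evenly(data, count):
    """Option 1: divide into `count` blocks of (nearly) equal size."""
    base, extra = divmod(len(data), count)
    blocks, start = [], 0
    for i in range(count):
        # Spread any remainder over the first `extra` blocks.
        size = base + (1 if i < extra else 0)
        blocks.append(data[start:start + size])
        start += size
    return blocks


def split_randomly(data, count, seed=None):
    """Option 2: divide into `count` non-empty blocks at random cut points."""
    if not 1 <= count <= len(data):
        raise ValueError("count must be between 1 and len(data)")
    rng = random.Random(seed)
    cuts = sorted(rng.sample(range(1, len(data)), count - 1))
    bounds = [0, *cuts, len(data)]
    return [data[a:b] for a, b in zip(bounds, bounds[1:])]


def split_by_size(data, block_size):
    """Option 3: divide into blocks of `block_size` bytes (the last may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]
```

For instance, `split_by_size(data, 1024 * 1024)` would cut the data into 1 MiB pieces, while `split_evenly(data, 10)` corresponds to the ten-block example in the text.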
S13, determine whether any of the multiple data blocks is hot spot data.
In a preferred embodiment of the present invention, whether any of the multiple data blocks is hot spot data can be determined by calculating the probability that each data block is accessed and predicting from that probability whether the data block is hot spot data.
Determining whether any of the multiple data blocks is hot spot data may specifically include:
1) Count the number of times each data block is accessed within the preset time period.
2) Based on the number of times each data block is accessed within the preset time period, calculate the probability that each data block is accessed within the preset time period.
3) Determine whether the access probability of each data block is greater than a preset probability value.
4) When the access probability of a data block is greater than the preset probability value, determine that the data block is hot spot data; when the access probability of a data block is less than or equal to the preset probability value, determine that the data block is non-hot-spot data.
For example, suppose the preset time period is one week and the data set accessed by users during that week is divided into 20 data blocks, data block 1 through data block 20. The access counts for the week are: data block 1, 10 times; data block 2, 5 times; data block 3, 8 times; data block 4, 20 times; data block 5, 50 times; data block 6, 3 times; data block 7, 20 times; data block 8, 40 times; data block 9, 1 time; data block 10, 5 times; data block 11, 9 times; data block 12, 11 times; data block 13, 10 times; data block 14, 12 times; data block 15, 20 times; data block 16, 30 times; data block 17, 14 times; data block 18, 0 times; data block 19, 2 times; and data block 20, 50 times. The probability that each data block is accessed is calculated as $P_i = X_i / \sum_{j=1}^{n} X_j$, where $X_i$ is the number of times the i-th data block is accessed in one week, $n$ is the number of data blocks, and $P_i$ is the probability that the i-th data block is accessed in one week. The total number of accesses is 320, so the probability that data block 1 is accessed is $P_1 = 10 / 320 = 3.125\%$.
Similarly, the access probability of data block 2 is $P_2 = 1.56\%$, the access probability of data block 3 is $P_3 = 2.5\%$, and so on; the access probabilities of the other data blocks are not listed here.
In a preferred embodiment of the present invention, the preset probability value can be, for example, 20%, so the data contained in a data block whose access probability is greater than 20% can be regarded as hot spot data.
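The hot spot test above can be sketched as follows, assuming the access probability of a block is its access count divided by the total access count (which reproduces the 1.56% and 2.5% values quoted for data blocks 2 and 3). The function names are hypothetical.

```python
def access_probabilities(access_counts):
    """Return each block's share of all accesses in the preset time period."""
    total = sum(access_counts)
    if total == 0:
        # No accesses at all: no block can be hot.
        return [0.0] * len(access_counts)
    return [count / total for count in access_counts]


def hot_blocks(access_counts, preset_probability):
    """Return the 1-based numbers of blocks whose probability exceeds the preset value."""
    probabilities = access_probabilities(access_counts)
    return [i for i, p in enumerate(probabilities, start=1) if p > preset_probability]


# Access counts of data blocks 1..20 from the example in the text.
counts = [10, 5, 8, 20, 50, 3, 20, 40, 1, 5, 9, 11, 10, 12, 20, 30, 14, 0, 2, 50]
probs = access_probabilities(counts)
```

With these particular counts the largest share is 50/320 = 15.625%, so a preset value of 20% flags no block; a lower illustrative value such as `hot_blocks(counts, 0.10)` selects data blocks 5, 8 and 20.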
When a data block is determined to be hot spot data, step S14 is executed; when no data block is determined to be hot spot data, the flow can return to step S11.
S14, determine whether the data block determined to be hot spot data has been written into the cache.
When the data block determined to be hot spot data is hit in the cache, the data block has already been written into the cache; when it is not hit in the cache, the data block has not been written into the cache.
When the data block determined to be hot spot data has already been written into the cache, the flow can end directly; when it has not been written into the cache, step S15 is executed.
S15, obtain the flow control threshold corresponding to the current statistical period within the migration period.
The whole process of writing a data block that is determined to be hot spot data and has not been written into the cache, from the start of the write to its completion, is called a migration period. A migration period can be divided into multiple statistical periods, and a statistical period can be a preset length of time; for example, a statistical period is set to 1 second.
Flow control means controlling traffic. It can be implemented in two ways: one is to use the QoS modules of routers and switches to implement flow control based on source address, destination address, source port, destination port and protocol type; the other is to use dedicated flow control devices to implement application-layer flow control.
In this preferred embodiment, obtaining the flow control threshold corresponding to the current statistical period within the migration period may specifically include:
1) Determine whether the current statistical period is the first statistical period.
Whether the current statistical period is the first statistical period can be determined by checking whether the current time falls within the 1st second of the migration period.
2) When the current statistical period is the first statistical period, use the default flow control threshold as the flow control threshold corresponding to the current statistical period.
The flow control threshold corresponding to the first statistical period of the migration period is a preset flow control threshold, which can be set in advance by the system administrator based on experience. That is, a preset flow control threshold is used as the flow control threshold of the first statistical period in the migration period.
3) When the current statistical period is not the first statistical period, obtain the IO load of user applications in the previous statistical period, and determine the flow control threshold corresponding to the current statistical period according to that IO load.
Each remaining statistical period of the migration period, other than the first, has its own flow control threshold. These thresholds are adjusted dynamically: the flow control threshold corresponding to the current statistical period is calculated from the IO load in the previous statistical period, and the flow control threshold corresponding to the next statistical period is calculated from the IO load in the current statistical period. Specifically, the threshold for the second statistical period is calculated from the IO load in the first statistical period, the threshold for the third statistical period is calculated from the IO load in the second statistical period, and so on.
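The period-by-period chaining described above can be sketched as a simple fold: the first period uses the preset threshold, and each later period derives its threshold from the previous threshold and the previous period's observed IO load. The function name `threshold_schedule` and the `derive_threshold` callback are hypothetical; how a load value maps to a new threshold is left to the caller here.

```python
def threshold_schedule(default_threshold, observed_loads, derive_threshold):
    """Return the flow control threshold for each statistical period.

    observed_loads[k] is the IO load measured in period k + 1, so a list of
    n - 1 loads yields thresholds for n periods.
    derive_threshold(previous_threshold, previous_load) is a caller-supplied
    rule (an assumption of this sketch) that returns the next threshold.
    """
    thresholds = [default_threshold]  # first period: preset default
    for load in observed_loads:
        thresholds.append(derive_threshold(thresholds[-1], load))
    return thresholds


def halve_when_busy(previous_threshold, previous_load):
    """Toy rule for illustration only: halve the threshold when load exceeds 0.8."""
    return previous_threshold / 2 if previous_load > 0.8 else previous_threshold
```

For example, `threshold_schedule(100, [0.9, 0.2, 0.95], halve_when_busy)` yields `[100, 50.0, 50.0, 25.0]`: the thresholds for periods two and four drop because periods one and three were busy.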
For the detailed process of determining the flow control threshold corresponding to the current statistical period according to the IO load of user applications in the previous statistical period, see Fig. 2 and its description.
S16, write the data block determined to be hot spot data into the cache based on the flow control threshold corresponding to the current statistical period.
The data block determined to be hot spot data is written into the cache at the flow rate set by the flow control threshold of the current statistical period, so that hot spot data is written into the cache neither too fast nor too slowly. This avoids a noticeable impact on normal input/output service performance, and users can then access the hot spot data from the cache.
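A throttled write of this kind might look like the sketch below. The text does not define the unit of the flow control threshold, so this sketch assumes it is the maximum number of bytes that may be written per statistical period; the cache is modelled as a plain dictionary, and `migrate_block`, `threshold_for_period` and the injectable `sleep` parameter are all hypothetical.

```python
import time


def migrate_block(block, cache, key, threshold_for_period,
                  period_seconds=1.0, sleep=time.sleep):
    """Write `block` into cache[key] in slices, one slice per statistical period.

    threshold_for_period(k) returns the byte budget (an assumed unit) for the
    k-th statistical period, counting from 0. Returns the number of periods used.
    """
    buffer = bytearray()
    offset = 0
    period = 0
    while offset < len(block):
        limit = threshold_for_period(period)
        if limit <= 0:
            raise ValueError("flow control threshold must be positive")
        # Write at most `limit` bytes during this statistical period.
        buffer += block[offset:offset + limit]
        offset += limit
        period += 1
        if offset < len(block):
            # Wait for the next statistical period before writing more.
            sleep(period_seconds)
    cache[key] = bytes(buffer)
    return period
```

Passing `sleep` as a parameter keeps the sketch testable without real delays; a 2500-byte block with a constant budget of 1000 bytes per period takes three periods.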
Embodiment two
Fig. 2 is a flowchart of the method, provided by Embodiment 2 of the present invention, for determining the flow control threshold corresponding to the current measurement period according to the I/O load of the user application in the previous measurement period.
S21, obtaining the data block size of each I/O of the user application in the previous measurement period, and calculating the average data block size of the I/Os in the previous measurement period.
The average data block size of the I/Os in the previous measurement period may be calculated using the arithmetic mean, the geometric mean, or the root mean square.
For example, suppose that in one measurement period the user application is detected to issue ten I/Os whose data block sizes are 2M, 1M, 3M, 0.5M, 10M, 4M, 0.1M, 1.2M, 5M and 8M. Using the arithmetic mean, the average data block size of the I/Os in the previous measurement period is: (2M+1M+3M+0.5M+10M+4M+0.1M+1.2M+5M+8M)/10=3.48M.
S22, obtaining the transmission delay of each data block in the previous measurement period, and calculating the average data block delay of the I/Os in the previous measurement period.
The transmission delay (delay for short) is the time a node needs to put a data block onto the transmission medium when sending data, that is, the total time a sending station needs from starting to send a data frame until the frame has been completely sent, or the total time a receiving station needs from starting to receive a data frame until the frame has been completely received.
In a preferred embodiment of the present invention, the transmission delay of a data block may be obtained from a load measurement tool or performance monitoring tool installed on each storage node.
As described above, the average data block delay of the I/Os in the previous measurement period may also be calculated using the arithmetic mean, the geometric mean, or the root mean square. Suppose that in one measurement period the transmission delays of the ten I/Os are detected to be 1s, 0.8s, 1.5s, 0.4s, 5s, 2s, 0.02s, 0.6s, 3s and 4.5s. Using the arithmetic mean, the average data block delay of the I/Os in the previous measurement period is:
(1s+0.8s+1.5s+0.4s+5s+2s+0.02s+0.6s+3s+4.5s)/10≈1.88s.
It should be understood that the same averaging algorithm is used for both quantities. If the average data block size of the I/Os in the previous measurement period is calculated using the arithmetic mean, the average data block delay of the I/Os in the previous measurement period is also calculated using the arithmetic mean; if the average data block size is calculated using the geometric mean, the average data block delay is also calculated using the geometric mean; and if the average data block size is calculated using the root mean square, the average data block delay is also calculated using the root mean square.
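The three averaging options of S21 and S22, together with the rule that size and delay use the same algorithm, can be sketched as below. This is an illustrative sketch using Python's standard `statistics` module (Python 3.8 or later); the names `root_mean_square`, `AVERAGES` and `average_size_and_delay` are hypothetical and do not come from the text.

```python
import math
from statistics import fmean, geometric_mean
from typing import Callable, Dict, Sequence, Tuple


def root_mean_square(values: Sequence[float]) -> float:
    """Square root of the arithmetic mean of the squared values."""
    return math.sqrt(fmean([v * v for v in values]))


# Hypothetical lookup table: one entry per averaging algorithm named in the text.
AVERAGES: Dict[str, Callable[[Sequence[float]], float]] = {
    "arithmetic": fmean,
    "geometric": geometric_mean,  # requires strictly positive inputs
    "rms": root_mean_square,
}


def average_size_and_delay(sizes: Sequence[float],
                           delays: Sequence[float],
                           method: str = "arithmetic") -> Tuple[float, float]:
    """Average block size and block delay with the SAME algorithm for both."""
    average = AVERAGES[method]
    return average(sizes), average(delays)
```

Selecting the algorithm once and applying it to both lists makes it impossible to mix, say, an arithmetic mean of sizes with a geometric mean of delays. With the ten block sizes from the example above and `method="arithmetic"`, the first returned value is approximately 3.48.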
S23, obtaining the preset reference value of the I/O data block size and the reference value of the corresponding data block delay.
In a preferred embodiment of the present invention, the reference value of the I/O data block size and the reference value of the corresponding data block delay may be preset by the administrator of the storage system based on experience. For example, experience may show that a 4K data block has the lowest delay in transmission, ideally reaching 50ms; the reference value of the I/O data block size may then be set to 4K and the reference value of the corresponding data block delay to 50ms.
S24, calculating the I/O load intensity in the previous measurement period according to the average data block size of the I/Os in the previous measurement period, the average data block delay, the reference value of the data block size, and the reference value of the corresponding data block delay.
For example, suppose the average data block size of the I/Os in the previous measurement period is X, the average data block delay is Y, the reference value of the data block size is M, and the reference value of the corresponding data block delay is N. The calculation formula of the I/O load intensity in the previous measurement period is then:
S25, determining the I/O load category in the previous measurement period from the I/O load intensity in the previous measurement period, using a pre-trained load classification model.
In a preferred embodiment of the present invention, the I/O load categories include a high load category, a normal load category, and a low load category.
Preferably, the load classification model includes, but is not limited to, a support vector machine (SVM) model. The average data block size of the I/Os in the previous measurement period, the average data block delay of the I/Os in the previous measurement period, and the I/O load intensity in the previous measurement period are taken as the input of the load classification model, which computes and outputs the I/O load category in the previous measurement period.
In a preferred embodiment of the present invention, the training process of the load classification model includes:
1) Obtaining I/O load data of positive samples and I/O load data of negative samples, and annotating the I/O load data of the positive samples with load categories so that they carry I/O load category labels.
For example, 500 pieces of I/O load data are selected for each of the high load, normal load and low load categories, and each piece is annotated with its category: "1" may be used as the label for high-load I/O data, "2" as the label for normal-load I/O data, and "3" as the label for low-load I/O data.
2) Randomly dividing the I/O load data of the positive samples and of the negative samples into a training set of a first preset ratio and a validation set of a second preset ratio, training the load classification model with the training set, and verifying the accuracy of the trained load classification model with the validation set.
The training samples of the different load categories are first distributed into different folders: for example, the training samples of the high load category into a first folder, those of the normal load category into a second folder, and those of the low load category into a third folder. Training samples of the first preset ratio (for example, 70%) are then extracted from each folder as the overall training samples for training the load classification model, and the remaining samples of the second preset ratio (for example, 30%) are taken from each folder as the overall test samples for verifying the accuracy of the trained load classification model.
3) If the accuracy is greater than or equal to a preset accuracy, ending the training and using the trained load classification model as the classifier to identify the I/O load category in the current measurement period; if the accuracy is less than the preset accuracy, increasing the number of positive samples and negative samples and retraining the load classification model until the accuracy is greater than or equal to the preset accuracy.
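Steps 2) and 3) above can be sketched without committing to a particular classifier. The following illustrative sketch performs the per-category 70%/30% split and the accuracy check in plain Python; the SVM itself is not implemented here, and `predict` stands for whatever trained model is being verified. The names `stratified_split` and `accuracy` are hypothetical.

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (feature vector, load category label)


def stratified_split(samples_by_label: Dict[int, List[Sequence[float]]],
                     train_ratio: float = 0.7,
                     seed: int = 0) -> Tuple[List[Sample], List[Sample]]:
    """Split each category separately, mirroring the per-folder extraction."""
    rng = random.Random(seed)
    train: List[Sample] = []
    validation: List[Sample] = []
    for label, samples in samples_by_label.items():
        shuffled = list(samples)
        rng.shuffle(shuffled)
        cut = round(len(shuffled) * train_ratio)
        train += [(s, label) for s in shuffled[:cut]]
        validation += [(s, label) for s in shuffled[cut:]]
    return train, validation


def accuracy(predict: Callable[[Sequence[float]], int],
             validation: List[Sample]) -> float:
    """Fraction of validation samples whose predicted label is correct."""
    if not validation:
        raise ValueError("validation set is empty")
    correct = sum(1 for features, label in validation
                  if predict(features) == label)
    return correct / len(validation)
```

A training loop would call `accuracy` after each round and, if the result is below the preset accuracy, add samples and train again, as step 3) describes.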
S26, calculating the flow control threshold corresponding to the current measurement period according to the I/O load category in the previous measurement period.
Specifically, calculating the flow control threshold corresponding to the current measurement period according to the I/O load category in the previous measurement period may include:
1) When the I/O load category in the previous measurement period is the high load category, reducing the flow control threshold corresponding to the previous measurement period by a first preset amplitude to obtain the flow control threshold corresponding to the current measurement period.
When the I/O load in the previous measurement period is high, the flow control threshold is reduced by the first preset amplitude, so that within the current measurement period the data block determined to be hot spot data is written into the cache under a lower flow control threshold. Slowing down data migration in this way preserves efficient access for the user application.
In a preferred embodiment of the present invention, the first preset amplitude may be 1/2 of the flow control threshold corresponding to the previous measurement period. That is, the flow control threshold corresponding to the current measurement period is 1/2 of that of the previous measurement period, and the flow control threshold corresponding to the next measurement period is 1/2 of that of the current measurement period.
2) When the I/O load category in the previous measurement period is the low load category, increasing the flow control threshold corresponding to the previous measurement period by a second preset amplitude to obtain the flow control threshold corresponding to the current measurement period.
When the I/O load in the previous measurement period is low, the flow control threshold is increased by the second preset amplitude, so that within the current measurement period the data block determined to be hot spot data is written into the cache under a higher flow control threshold. This speeds up data migration while still guaranteeing the access quality of the user application.
In a preferred embodiment of the present invention, the second preset amplitude may be such that the flow control threshold becomes 1.5 times that of the previous measurement period. That is, the flow control threshold corresponding to the current measurement period is 1.5 times that of the previous measurement period, and the flow control threshold corresponding to the next measurement period is 1.5 times that of the current measurement period.
3) When the I/O load category in the previous measurement period is the normal load category, using the flow control threshold corresponding to the previous measurement period as the flow control threshold corresponding to the current measurement period.
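The three cases of S26 amount to a small update rule. The sketch below uses the factors of the preferred embodiment (1/2 for high load, 1.5 for low load) and chains the rule across a migration period starting from a default threshold. The function names and the string labels for the categories are illustrative assumptions.

```python
from typing import List, Sequence


def next_threshold(previous_threshold: float, load_category: str) -> float:
    """Threshold of the current period from the previous period's load."""
    if load_category == "high":
        return previous_threshold * 0.5   # slow migration down
    if load_category == "low":
        return previous_threshold * 1.5   # speed migration up
    if load_category == "normal":
        return previous_threshold         # keep the threshold unchanged
    raise ValueError(f"unknown load category: {load_category!r}")


def thresholds_for_migration(default_threshold: float,
                             load_categories: Sequence[str]) -> List[float]:
    """Thresholds of successive measurement periods in one migration period.

    load_categories[k] is the load category observed in period k; it
    determines the threshold of period k + 1. The first period uses the
    default threshold.
    """
    thresholds = [default_threshold]
    for category in load_categories:
        thresholds.append(next_threshold(thresholds[-1], category))
    return thresholds
```

For example, a default threshold of 100 followed by a high, a low and a normal period yields the sequence 100, 50, 75, 75.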
In conclusion hot spot data of the present invention migrates flow control method, every preset time period records user and accesses Data set, the data set is divided into multiple data blocks, is determining that it is hot spot data and to be not written into caching to have data block When, the corresponding flow control threshold value of different measurement periods in the period is migrated by acquisition, is corresponded to based on each described measurement period Flow control threshold value, the data block for being confirmed as hot spot data is written in caching, is improving migration of subscriber data to slow The efficiency deposited while reduce loss of data risk, can avoid causing obviously to impact to normal input and output service feature, have There is good fluid control effect.
Secondly, current statistic period corresponding flow control threshold value is the I/O load applied according to user in a upper measurement period Automatically it is adjusted into Mobile state, is not required to manager and adjusts manually, reduce the workload of manager, avoid the subjectivity because of manager The not accurate problem of adjustment caused by factor.
The above is only the specific implementation mode of the present invention, but scope of protection of the present invention is not limited thereto, for For those skilled in the art, without departing from the concept of the premise of the invention, improvement, but these can also be made It all belongs to the scope of protection of the present invention.
With reference to the 3rd to 4 figure, respectively to realizing that above-mentioned hot spot data migrates the function mould of the electronic equipment of flow control method Block and hardware configuration are introduced.
Embodiment three
Fig. 3 is a functional block diagram of a preferred embodiment of the hot spot data migration flow control apparatus of the present invention.
In some embodiments, the hot spot data migration flow control apparatus 30 runs in an electronic device. The hot spot data migration flow control apparatus 30 may include multiple functional modules composed of program code segments. The program code of each program segment in the hot spot data migration flow control apparatus 30 may be stored in a memory and executed by at least one processor to perform the hot spot data migration flow control method (see Figs. 1 and 2 and the related description).
In this embodiment, the hot spot data migration flow control apparatus 30 may be divided into multiple functional modules according to the functions it performs. The functional modules may include: a recording module 301, a division module 302, a judgment module 303, an acquisition module 304, a migration module 305, a computing module 306, a determining module 307 and a training module 308. A module in the present invention refers to a series of computer program segments that can be executed by at least one processor, can complete a fixed function, and are stored in a memory. The functions of each module are described in detail below.
The recording module 301 is configured to record the data set accessed by users every preset time period.
The preset time period is a time period set in advance, for example one week or 10 days. The present invention places no specific limit on the preset time period, which may be set according to the hardware of the electronic system or the data access situation.
When the electronic device detects an instruction from a user to access data, it responds to the instruction and returns the requested data to the user. The recording module 301 records the data set accessed by all users within the preset time period.
The division module 302 is configured to divide the data set into multiple data blocks.
The recorded data set accessed by users is divided into multiple data blocks.
In a preferred embodiment of the present invention, the division of the data set into multiple data blocks by the division module 302 may include one or a combination of the following:
1) Dividing the data set evenly into a preset number of data blocks.
The preset number is the number of data blocks set in advance. For example, the data set is divided evenly into 10 data blocks, each of the same size.
2) Dividing the data set randomly into a preset number of data blocks.
For example, the data set is randomly divided into 10 data blocks, each of a different size.
3) Dividing the data set into multiple data blocks of a preset size.
The preset size is the data block size set in advance. For example, the data set is divided into multiple data blocks of 1Mb each. The preset size may also be 10Mb or larger.
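The three division strategies can be sketched as follows, treating the data set as a byte string purely for illustration. The function names are hypothetical. Note that the random variant below guarantees non-empty blocks but does not force every block to have a different size, which the example in the text assumes.

```python
import random
from typing import List, Optional


def split_evenly(data: bytes, count: int) -> List[bytes]:
    """Divide data into `count` blocks whose sizes differ by at most one byte."""
    if not 1 <= count <= len(data):
        raise ValueError("count must be between 1 and len(data)")
    base, extra = divmod(len(data), count)
    blocks, start = [], 0
    for i in range(count):
        end = start + base + (1 if i < extra else 0)
        blocks.append(data[start:end])
        start = end
    return blocks


def split_randomly(data: bytes, count: int,
                   seed: Optional[int] = None) -> List[bytes]:
    """Divide data into `count` non-empty blocks at random cut points."""
    if not 1 <= count <= len(data):
        raise ValueError("count must be between 1 and len(data)")
    rng = random.Random(seed)
    cuts = sorted(rng.sample(range(1, len(data)), count - 1))
    bounds = [0] + cuts + [len(data)]
    return [data[a:b] for a, b in zip(bounds, bounds[1:])]


def split_by_size(data: bytes, block_size: int) -> List[bytes]:
    """Divide data into blocks of `block_size` bytes (the last may be shorter)."""
    if block_size < 1:
        raise ValueError("block_size must be positive")
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]
```

When the length of the data set is not a multiple of the preset number or preset size, the sketch lets the block sizes differ slightly rather than padding the data.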
The judgment module 303 is configured to judge whether any of the multiple data blocks is hot spot data.
In a preferred embodiment of the present invention, the judgment module 303 judges whether any of the multiple data blocks is hot spot data by calculating the probability that each data block is accessed and predicting from that probability whether the data block is hot spot data.
Specifically, the judgment by the judgment module 303 of whether any of the multiple data blocks is hot spot data may include:
1) counting the number of times each data block is accessed within the preset time period;
2) calculating, based on the number of times each data block is accessed within the preset time period, the probability that each data block is accessed within the preset time period;
3) judging whether the access probability of each data block is greater than a preset probability value;
4) when the access probability of a data block is greater than the preset probability value, determining that the data block is hot spot data; when the access probability of a data block is less than or equal to the preset probability value, determining that the data block is non-hot-spot data.
For example, suppose the preset time period is one week and the data set accessed by users during this week is divided into 20 data blocks, data block 1 through data block 20. Within the week, data block 1 is accessed 10 times, data block 2 5 times, data block 3 8 times, data block 4 20 times, data block 5 50 times, data block 6 3 times, data block 7 20 times, data block 8 40 times, data block 9 1 time, data block 10 5 times, data block 11 9 times, data block 12 11 times, data block 13 10 times, data block 14 12 times, data block 15 20 times, data block 16 30 times, data block 17 14 times, data block 18 0 times, data block 19 2 times, and data block 20 50 times. The probability that each data block is accessed is calculated as $P_i = \frac{X_i}{\sum_{j=1}^{20} X_j} \times 100\%$, where $X_i$ is the number of times the i-th data block is accessed within the week and $P_i$ is the probability that the i-th data block is accessed within the week. The access probability of data block 1 is therefore $P_1 = \frac{10}{320} \times 100\% \approx 3.13\%$.
Similarly, the access probability of data block 2 is P2=1.56%, that of data block 3 is P3=2.5%, and so on; the access probabilities of the other data blocks are not listed here.
In a preferred embodiment of the present invention, the preset probability value may be, for example, 20%, so the data contained in a data block whose access probability is greater than 20% can be regarded as hot spot data.
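The four judgment steps reduce to normalising the access counts and comparing each share with the preset probability value. A minimal sketch, with hypothetical function names and block numbers starting at 1 as in the example:

```python
from typing import List, Sequence


def access_probabilities(access_counts: Sequence[int]) -> List[float]:
    """Share of all accesses that fell on each data block (0.0 to 1.0)."""
    total = sum(access_counts)
    if total == 0:
        return [0.0] * len(access_counts)
    return [count / total for count in access_counts]


def hot_spot_blocks(access_counts: Sequence[int],
                    preset_probability: float) -> List[int]:
    """1-based numbers of the blocks whose access probability exceeds the preset value."""
    probabilities = access_probabilities(access_counts)
    return [number for number, p in enumerate(probabilities, start=1)
            if p > preset_probability]


# Access counts of data blocks 1 to 20 from the example (320 accesses in total).
WEEKLY_COUNTS = [10, 5, 8, 20, 50, 3, 20, 40, 1, 5,
                 9, 11, 10, 12, 20, 30, 14, 0, 2, 50]
```

With these counts no block exceeds a preset value of 20%; a lower preset value such as 10% (chosen here only for illustration) would select data blocks 5, 8 and 20.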
The judgment module 303 is further configured to judge, when a data block is determined to be hot spot data, whether the data block determined to be hot spot data has been written into the cache.
When the data block determined to be hot spot data is hit in the cache, the data block has already been written into the cache; when it is not hit in the cache, the data block has not been written into the cache.
The acquisition module 304 is configured to obtain the flow control threshold corresponding to the current measurement period in the migration period when the judgment module 303 judges that the data block determined to be hot spot data has not been written into the cache.
The whole process by which a data block that is determined to be hot spot data and has not been written into the cache goes from the start of writing to the cache until the writing is complete is called a migration period. One migration period can be divided into multiple measurement periods, and a measurement period may be a preset length of time, for example 1 second.
Flow control refers to the control of traffic. It can be implemented in two ways: one is flow control based on source address, destination address, source port, destination port and protocol type, implemented through the QoS module of a router or switch; the other is application-layer flow control implemented through a dedicated flow control device.
In this preferred embodiment, the obtaining by the acquisition module 304 of the flow control threshold corresponding to the current measurement period in the migration period may specifically include:
1) Judging whether the current measurement period is the first measurement period.
Whether the current measurement period is the first measurement period of the migration period can be judged by checking whether the current time falls within the first second of the migration period.
2) When the current measurement period is determined to be the first measurement period, determining a default flow control threshold as the flow control threshold corresponding to the current measurement period.
The flow control threshold corresponding to the first measurement period in the migration period of the present invention is a preset flow control threshold, which may be set in advance by the system administrator based on experience. That is, a preset flow control threshold is used as the flow control threshold of the first measurement period in the migration period.
3) When the current measurement period is determined not to be the first measurement period, the I/O load of the user application in the previous measurement period is obtained, and the flow control threshold corresponding to the current measurement period is determined according to that I/O load.
Each remaining measurement period in the migration period, other than the first, may correspond to its own flow control threshold. These thresholds are adjusted dynamically: the flow control threshold corresponding to the current measurement period may be calculated from the I/O load in the previous measurement period, and the flow control threshold corresponding to the next measurement period may be calculated from the I/O load in the current measurement period. Specifically, the flow control threshold corresponding to the second measurement period is calculated from the I/O load in the first measurement period, the flow control threshold corresponding to the third measurement period is calculated from the I/O load in the second measurement period, and so on.
The migration module 305 is configured to write the data block determined to be hot spot data into the cache based on the flow control threshold corresponding to the current measurement period.
The data block determined to be hot spot data is written into the cache according to the flow control threshold corresponding to the current measurement period, that is, at the flow rate permitted for that period, so that the hot spot data is written into the cache neither too fast nor too slowly. The hot spot data written into the cache is then available for access by the user.
The acquisition module 304 is further configured to obtain the data block size of each I/O of the user application in the previous measurement period, and to calculate the average data block size of the I/Os in the previous measurement period.
The average data block size of the I/Os in the previous measurement period may be calculated using the arithmetic mean, the geometric mean, or the root mean square.
For example, suppose that in one measurement period the user application is detected to issue ten I/Os whose data block sizes are 2M, 1M, 3M, 0.5M, 10M, 4M, 0.1M, 1.2M, 5M and 8M. Using the arithmetic mean, the average data block size of the I/Os in the previous measurement period is: (2M+1M+3M+0.5M+10M+4M+0.1M+1.2M+5M+8M)/10=3.48M.
The acquisition module 304 is further configured to obtain the transmission delay of each data block in the previous measurement period, and to calculate the average data block delay of the I/Os in the previous measurement period.
The transmission delay (delay for short) is the time a node needs to put a data block onto the transmission medium when sending data, that is, the total time a sending station needs from starting to send a data frame until the frame has been completely sent, or the total time a receiving station needs from starting to receive a data frame until the frame has been completely received.
In a preferred embodiment of the present invention, the transmission delay of a data block may be obtained from a load measurement tool or performance monitoring tool installed on each storage node.
As described above, the average data block delay of the I/Os in the previous measurement period may also be calculated using the arithmetic mean, the geometric mean, or the root mean square. Suppose that in one measurement period the transmission delays of the ten I/Os are detected to be 1s, 0.8s, 1.5s, 0.4s, 5s, 2s, 0.02s, 0.6s, 3s and 4.5s. Using the arithmetic mean, the average data block delay of the I/Os in the previous measurement period is:
(1s+0.8s+1.5s+0.4s+5s+2s+0.02s+0.6s+3s+4.5s)/10≈1.88s.
It should be understood that the same averaging algorithm is used for both quantities. If the average data block size of the I/Os in the previous measurement period is calculated using the arithmetic mean, the average data block delay of the I/Os in the previous measurement period is also calculated using the arithmetic mean; if the average data block size is calculated using the geometric mean, the average data block delay is also calculated using the geometric mean; and if the average data block size is calculated using the root mean square, the average data block delay is also calculated using the root mean square.
The acquisition module 304 is further configured to obtain the preset reference value of the I/O data block size and the reference value of the corresponding data block delay.
In a preferred embodiment of the present invention, the reference value of the I/O data block size and the reference value of the corresponding data block delay may be preset by the administrator of the storage system based on experience. For example, experience may show that a 4K data block has the lowest delay in transmission, ideally reaching 50ms; the reference value of the I/O data block size may then be set to 4K and the reference value of the corresponding data block delay to 50ms.
The computing module 306 is configured to calculate the I/O load intensity in the previous measurement period according to the average data block size of the I/Os in the previous measurement period, the average data block delay, the reference value of the data block size, and the reference value of the corresponding data block delay.
For example, suppose the average data block size of the I/Os in the previous measurement period is X, the average data block delay is Y, the reference value of the data block size is M, and the reference value of the corresponding data block delay is N. The calculation formula of the I/O load intensity in the previous measurement period is then:
The determining module 307 is configured to determine the I/O load category in the previous measurement period from the I/O load intensity in the previous measurement period, using a pre-trained load classification model.
In a preferred embodiment of the present invention, the I/O load categories include a high load category, a normal load category, and a low load category.
Preferably, the load classification model includes, but is not limited to, a support vector machine (SVM) model. The average data block size of the I/Os in the previous measurement period, the average data block delay of the I/Os in the previous measurement period, and the I/O load intensity in the previous measurement period are taken as the input of the load classification model, which computes and outputs the I/O load category in the previous measurement period.
The training module 308 is configured to train the load classification model.
The process by which the training module 308 trains the load classification model includes:
1) Obtaining I/O load data of positive samples and I/O load data of negative samples, and annotating the I/O load data of the positive samples with load categories so that they carry I/O load category labels.
For example, 500 pieces of I/O load data are selected for each of the high load, normal load and low load categories, and each piece is annotated with its category: "1" may be used as the label for high-load I/O data, "2" as the label for normal-load I/O data, and "3" as the label for low-load I/O data.
2) Randomly dividing the I/O load data of the positive samples and of the negative samples into a training set of a first preset ratio and a validation set of a second preset ratio, training the load classification model with the training set, and verifying the accuracy of the trained load classification model with the validation set.
The training samples of the different load categories are first distributed into different folders: for example, the training samples of the high load category into a first folder, those of the normal load category into a second folder, and those of the low load category into a third folder. Training samples of the first preset ratio (for example, 70%) are then extracted from each folder as the overall training samples for training the load classification model, and the remaining samples of the second preset ratio (for example, 30%) are taken from each folder as the overall test samples for verifying the accuracy of the trained load classification model.
3) If the accuracy is greater than or equal to a preset accuracy, ending the training and using the trained load classification model as the classifier to identify the I/O load category in the current measurement period; if the accuracy is less than the preset accuracy, increasing the number of positive samples and negative samples and retraining the load classification model until the accuracy is greater than or equal to the preset accuracy.
The computing module 306 is further configured to calculate the flow control threshold corresponding to the current measurement period according to the I/O load category in the previous measurement period.
Specifically, the calculation by the computing module 306 of the flow control threshold corresponding to the current measurement period according to the I/O load category in the previous measurement period may include:
1) When the I/O load category in the previous measurement period is the high load category, reducing the flow control threshold corresponding to the previous measurement period by a first preset amplitude to obtain the flow control threshold corresponding to the current measurement period.
When the I/O load in the previous measurement period is high, the flow control threshold is reduced by the first preset amplitude, so that within the current measurement period the data block determined to be hot spot data is written into the cache under a lower flow control threshold. Slowing down data migration in this way preserves efficient access for the user application.
In a preferred embodiment of the present invention, the first preset amplitude may be 1/2 of the flow control threshold corresponding to the previous measurement period. That is, the flow control threshold corresponding to the current measurement period is 1/2 of that of the previous measurement period, and the flow control threshold corresponding to the next measurement period is 1/2 of that of the current measurement period.
2) When the I/O load class in the previous statistics period is the low-load class, the flow control threshold corresponding to the previous statistics period is raised by a second predetermined amplitude to obtain the flow control threshold corresponding to the current statistics period.
When the I/O load in the previous statistics period is low, the flow control threshold is raised by the second predetermined amplitude, so that in the current statistics period the data blocks determined to be hotspot data are written to the cache under a high flow control threshold. This increases the speed of data migration while still guaranteeing the access quality of user applications.
In a preferred embodiment of the invention, the second predetermined amplitude may be such that the flow control threshold corresponding to the current statistics period is 1.5 times that of the previous statistics period, and, provided the I/O load in the current statistics period is again classified as low, the flow control threshold corresponding to the next statistics period is 1.5 times that of the current statistics period.
3) When the I/O load class in the previous statistics period is the normal-load class, the flow control threshold corresponding to the previous statistics period is used as the flow control threshold corresponding to the current statistics period.
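The three cases above amount to a small update rule, sketched below under the preferred-embodiment factors (halve on high load, multiply by 1.5 on low load, keep unchanged on normal load). The function name, the string labels for the load classes, and the parameter names are assumptions made for this illustration.

```python
def next_flow_control_threshold(previous_threshold, load_class,
                                decrease_factor=0.5, increase_factor=1.5):
    """Derive the current period's threshold from the previous period.

    load_class is the I/O load class observed in the previous statistics
    period: "high", "normal", or "low" (labels chosen for this sketch).
    """
    if load_class == "high":
        # High load: slow migration down to protect user I/O.
        return previous_threshold * decrease_factor
    if load_class == "low":
        # Low load: spare capacity is available, migrate faster.
        return previous_threshold * increase_factor
    if load_class == "normal":
        # Normal load: carry the threshold over unchanged.
        return previous_threshold
    raise ValueError(f"unknown load class: {load_class!r}")
```

For example, starting from a threshold of 100 and observing the classes high, high, low in successive periods yields thresholds of 50, 25, and 37.5.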
In conclusion hot spot data of the present invention migrates flow control apparatus, every preset time period records user and accesses Data set, the data set is divided into multiple data blocks, is determining that it is hot spot data and to be not written into caching to have data block When, the corresponding flow control threshold value of different measurement periods in the period is migrated by acquisition, is corresponded to based on each described measurement period Flow control threshold value, the data block for being confirmed as hot spot data is written in caching, is improving migration of subscriber data to slow The efficiency deposited while reduce loss of data risk, can avoid causing obviously to impact to normal input and output service feature, have There is good fluid control effect.
Second, the flow control threshold corresponding to the current statistics period is adjusted automatically and dynamically according to the I/O load of the user applications in the previous statistics period. No manual adjustment by an administrator is required, which reduces the administrator's workload and avoids inaccurate adjustment caused by the administrator's subjective judgment.
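The migration step summarized above can be sketched as a budgeted write loop. The description does not state the unit of the flow control threshold, so this sketch assumes it is a byte budget per statistics period; the function name, the `(block_id, payload)` representation of a hotspot data block, and the dictionary used as the cache are likewise assumptions for illustration only.

```python
def migrate_hot_blocks(pending, cache, threshold_bytes):
    """Write hotspot blocks to the cache within one period's budget.

    pending is a list of (block_id, payload) pairs already determined to
    be hotspot data; cache maps block_id to payload. Blocks that do not
    fit in the remaining budget are returned for a later period.
    """
    written = 0
    remaining = []
    for block_id, payload in pending:
        if block_id in cache:
            # Already written to the cache: nothing to migrate.
            continue
        if written + len(payload) > threshold_bytes:
            # Budget for this statistics period would be exceeded.
            remaining.append((block_id, payload))
            continue
        cache[block_id] = payload
        written += len(payload)
    return remaining
```

Blocks left over in `remaining` would be retried in the next statistics period under that period's threshold.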
The above integrated unit, when implemented in the form of a software function module, may be stored in a computer-readable storage medium. The software function module is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a dual-screen device, a network device, or the like) or a processor to execute part of the method of each embodiment of the present invention.
Embodiment Four
Fig. 4 is a schematic diagram of the electronic device provided by Embodiment Four of the present invention.
The electronic device 4 includes a memory 41, at least one processor 42, a computer program 43 stored in the memory 41 and executable on the at least one processor 42, and at least one communication bus 44.
The at least one processor 42 implements the steps of the above method embodiment when executing the computer program 43.
Illustratively, the computer program 43 may be divided into one or more modules/units, which are stored in the memory 41 and executed by the at least one processor 42 to complete the steps of the above method embodiment of the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, the instruction segments describing the execution of the computer program 43 in the electronic device 4.
The electronic device 4 may be a computing device such as a desktop computer, a notebook computer, a palmtop computer, or a cloud server. Those skilled in the art will understand that Fig. 4 is only an example of the electronic device 4 and does not limit it; the electronic device 4 may include more or fewer components than shown, may combine certain components, or may use different components. For example, the electronic device 4 may also include input/output devices, network access devices, buses, and so on.
The at least one processor 42 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The processor 42 may be a microprocessor or any conventional processor. The processor 42 is the control center of the electronic device 4 and connects the various parts of the entire electronic device 4 through various interfaces and lines.
The memory 41 may be used to store the computer program 43 and/or the modules/units. The processor 42 implements the various functions of the electronic device 4 by running or executing the computer program and/or modules/units stored in the memory 41 and by calling data stored in the memory 41. The memory 41 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system and the application programs required by at least one function (such as a sound playback function or an image playback function), and the data storage area may store data created through use of the electronic device 4 (such as audio data or a phone book). In addition, the memory 41 may include high-speed random access memory and may also include non-volatile memory, such as a hard disk, internal memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card, at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device.
If the integrated modules/units of the electronic device 4 are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, all or part of the flow of the above method embodiments may also be completed by a computer program instructing the relevant hardware. The computer program may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of each of the above method embodiments. The computer program includes computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and so on. It should be noted that the content included in the computer-readable medium may be increased or decreased as appropriate according to the requirements of legislation and patent practice in a jurisdiction; for example, in certain jurisdictions, according to legislation and patent practice, the computer-readable medium does not include electrical carrier signals and telecommunication signals.
In the several embodiments provided by the present invention, it should be understood that the disclosed electronic device and method may be implemented in other ways. For example, the electronic device embodiment described above is merely illustrative; the division into units is only a division by logical function, and other divisions are possible in an actual implementation.
In addition, the functional units in the embodiments of the present invention may be integrated in the same processing unit, each unit may exist physically on its own, or two or more units may be integrated in the same unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software function modules.
It is obvious to those skilled in the art that the invention is not limited to the details of the above exemplary embodiments, and that the invention can be realized in other specific forms without departing from its spirit or essential characteristics. The embodiments are therefore to be regarded in every respect as illustrative and not restrictive, and the scope of the invention is defined by the appended claims rather than by the above description; all changes falling within the meaning and range of equivalency of the claims are intended to be embraced within the invention. No reference sign in the claims should be construed as limiting the claim concerned. Furthermore, the word "comprising" clearly does not exclude other units, and the singular does not exclude the plural. Multiple units or devices recited in a system claim may also be implemented by a single unit or device through software or hardware. Words such as "first" and "second" are used to denote names and do not denote any particular order.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solution of the present invention and not to limit it. Although the invention has been described in detail with reference to preferred embodiments, those of ordinary skill in the art should understand that the technical solution of the invention may be modified or equivalently replaced without departing from the spirit and scope of the technical solution of the invention.

Claims (10)

1. A hotspot data migration flow control method, characterized in that the method comprises:
recording, at every preset time period, the data set accessed by a user;
dividing the data set into multiple data blocks;
judging whether any of the multiple data blocks is hotspot data;
when it is determined that a data block is hotspot data, judging whether the data block determined to be hotspot data has been written to a cache;
when it is judged that the data block determined to be hotspot data has not been written to the cache, obtaining the flow control threshold corresponding to the current statistics period in the migration period; and
writing the data block determined to be hotspot data to the cache based on the flow control threshold corresponding to the current statistics period.
2. The method according to claim 1, characterized in that dividing the data set into multiple data blocks comprises:
dividing the data set evenly into a preset number of data blocks; or
dividing the data set randomly into a preset number of data blocks; or
dividing the data set into multiple data blocks according to a preset size.
3. The method according to claim 1, characterized in that judging whether any of the multiple data blocks is hotspot data is performed by calculating the probability that a data block is accessed and predicting, based on that probability, whether the data block is hotspot data, comprising:
counting the number of times each data block is accessed within the preset time period;
calculating, based on the number of times each data block is accessed within the preset time period, the probability that each data block is accessed within the preset time period;
judging whether the access probability of each data block is greater than a preset probability value;
when the access probability of a data block is judged to be greater than the preset probability value, determining that the data block whose access probability is greater than the preset probability value is hotspot data; and
when the access probability of a data block is judged to be less than or equal to the preset probability value, determining that the data block whose access probability is less than or equal to the preset probability value is non-hotspot data.
4. The method according to claim 1, characterized in that obtaining the flow control threshold corresponding to the current statistics period in the migration period comprises:
judging whether the current statistics period is the first statistics period;
when it is determined that the current statistics period is the first statistics period, determining a default flow control threshold as the flow control threshold corresponding to the current statistics period; and
when it is determined that the current statistics period is not the first statistics period, obtaining the I/O load of the user applications in the previous statistics period, and determining the flow control threshold corresponding to the current statistics period according to the I/O load of the user applications in the previous statistics period.
5. The method according to claim 4, characterized in that determining the flow control threshold corresponding to the current statistics period according to the I/O load of the user applications in the previous statistics period comprises:
obtaining the data block size of each I/O of the user applications in the previous statistics period, and calculating the average data block size of the I/O in the previous statistics period;
obtaining the transmission latency of each data block in the previous statistics period, and calculating the average data block latency of the I/O in the previous statistics period;
obtaining a preset reference value of the I/O data block size and a reference value of the corresponding data block latency;
calculating the I/O load intensity in the previous statistics period according to the average data block size of the I/O in the previous statistics period, the average data block latency, the reference value of the data block size, and the reference value of the corresponding data block latency;
determining the I/O load class in the previous statistics period according to the I/O load intensity in the previous statistics period, using a pre-trained load classification model; and
calculating the flow control threshold corresponding to the current statistics period according to the I/O load class in the previous statistics period.
6. The method according to claim 5, characterized in that the formula for calculating the I/O load intensity in the previous statistics period according to the average data block size of the I/O in the previous statistics period, the average data block latency, the reference value of the data block size, and the reference value of the corresponding data block latency is: [formula not reproduced in this text], where X is the average data block size of the I/O in the previous statistics period, Y is the average data block latency, M is the reference value of the data block size, and N is the reference value of the corresponding data block latency.
7. The method according to claim 5 or 6, characterized in that calculating the flow control threshold corresponding to the current statistics period according to the I/O load class in the previous statistics period comprises:
when the I/O load class in the previous statistics period is the high-load class, reducing the flow control threshold corresponding to the previous statistics period by a first predetermined amplitude to obtain the flow control threshold corresponding to the current statistics period;
when the I/O load class in the previous statistics period is the low-load class, raising the flow control threshold corresponding to the previous statistics period by a second predetermined amplitude to obtain the flow control threshold corresponding to the current statistics period; and
when the I/O load class in the previous statistics period is the normal-load class, using the flow control threshold corresponding to the previous statistics period as the flow control threshold corresponding to the current statistics period.
8. A hotspot data migration flow control apparatus, characterized in that the apparatus comprises:
a recording module, configured to record, at every preset time period, the data set accessed by a user;
a division module, configured to divide the data set into multiple data blocks;
a judgment module, configured to judge whether any of the multiple data blocks is hotspot data;
the judgment module being further configured to judge, when it is determined that a data block is hotspot data, whether the data block determined to be hotspot data has been written to a cache;
an acquisition module, configured to obtain the flow control threshold corresponding to the current statistics period in the migration period when the judgment module judges that the data block determined to be hotspot data has not been written to the cache; and
a migration module, configured to write the data block determined to be hotspot data to the cache based on the flow control threshold corresponding to the current statistics period.
9. An electronic device, characterized in that the electronic device comprises a processor and a memory, the processor being configured to implement the hotspot data migration flow control method according to any one of claims 1 to 7 when executing a computer program stored in the memory.
10. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the hotspot data migration flow control method according to any one of claims 1 to 7.
CN201810565747.XA 2018-06-04 2018-06-04 Hot spot data migration flow control method and device, electronic equipment and storage medium Active CN108762684B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201810565747.XA CN108762684B (en) 2018-06-04 2018-06-04 Hot spot data migration flow control method and device, electronic equipment and storage medium
PCT/CN2018/100168 WO2019232925A1 (en) 2018-06-04 2018-08-13 Hotspot data migration flow control method and apparatus, and electronic device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810565747.XA CN108762684B (en) 2018-06-04 2018-06-04 Hot spot data migration flow control method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN108762684A true CN108762684A (en) 2018-11-06
CN108762684B CN108762684B (en) 2021-03-05

Family

ID=64002688

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810565747.XA Active CN108762684B (en) 2018-06-04 2018-06-04 Hot spot data migration flow control method and device, electronic equipment and storage medium

Country Status (2)

Country Link
CN (1) CN108762684B (en)
WO (1) WO2019232925A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020220739A1 (en) * 2019-04-28 2020-11-05 华为技术有限公司 Request control method, related device, and computer storage medium
CN113076339A (en) * 2021-03-18 2021-07-06 北京沃东天骏信息技术有限公司 Data caching method, device, equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160359746A1 (en) * 2015-06-04 2016-12-08 Dell Software Inc. Selectively suppress or throttle migration of data across wan connections
CN106682705A (en) * 2017-02-04 2017-05-17 武汉阿帕科技有限公司 Method and apparatus of identifying load characteristics
CN106775461A (en) * 2016-11-30 2017-05-31 华为技术有限公司 Hot spot data determines method, equipment and device
CN107454004A (en) * 2016-05-30 2017-12-08 阿里巴巴集团控股有限公司 A kind of flow control methods and device
CN107463514A (en) * 2017-08-16 2017-12-12 郑州云海信息技术有限公司 A kind of date storage method and device

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103092526B (en) * 2011-10-31 2016-03-30 国际商业机器公司 The method and apparatus of Data Migration is carried out between memory device
US9436406B2 (en) * 2014-07-07 2016-09-06 International Business Machines Corporation Migration decision window selection based on hotspot characteristics
CN107222426B (en) * 2016-03-21 2021-07-20 阿里巴巴集团控股有限公司 Flow control method, device and system
CN107341240B (en) * 2017-07-05 2019-11-15 中国人民大学 A kind of processing method for coping with tilt data stream on-line joining process

Also Published As

Publication number Publication date
WO2019232925A1 (en) 2019-12-12
CN108762684B (en) 2021-03-05

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant