CN108763107B - Background disk-writing flow control method and device, electronic device, and storage medium


Info

Publication number
CN108763107B
CN108763107B (application CN201810565748.4A)
Authority
CN
China
Prior art keywords
statistical period
cache
flow control
data block
load
Prior art date
Legal status
Active
Application number
CN201810565748.4A
Other languages
Chinese (zh)
Other versions
CN108763107A (en)
Inventor
陈学伟 (Chen Xuewei)
Current Assignee
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN201810565748.4A
Priority to PCT/CN2018/108129 (WO2019232994A1)
Publication of CN108763107A
Application granted
Publication of CN108763107B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 12/00: Accessing, addressing or allocating within memory systems or architectures
    • G06F 12/02: Addressing or allocation; Relocation
    • G06F 12/08: Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F 12/0802: Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F 12/0866: Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches, for peripheral storage systems, e.g. disk cache
    • G06F 12/0871: Allocation or management of cache space

Abstract

A background disk-writing flow control method comprises the following steps: when a storage command for user data is received, writing the user data into a configured cache; when it is detected that the cache information in the cache meets a first preset condition, acquiring a flow control threshold corresponding to the current statistical period in the write-in period; writing the user data carrying the first identifier in the cache into the hard disk based on the flow control threshold corresponding to the current statistical period; and when it is detected that the cache information in the cache does not meet the first preset condition but meets a second preset condition, clearing the user data carrying the second identifier from the cache. The invention also provides a background disk-writing flow control device, an electronic device, and a storage medium. The invention can improve the efficiency of writing data into the hard disk, reduce the risk of data loss, and avoid an obvious impact on normal input/output service performance; it also saves cache storage space and improves the efficiency of writing data into the cache.

Description

Background disk-writing flow control method and device, electronic device, and storage medium
Technical Field
The invention relates to the technical field of computers, and in particular to a background disk-writing flow control method and device, an electronic device, and a storage medium.
Background
A cache is a buffer used for data exchange. After the hard disk receives an instruction to write data, it does not write the data onto the disk immediately; instead, it temporarily stores the data in the cache and then sends a "data written" signal to the system. At this point the system considers the data written and continues with subsequent operations. The cache usually uses RAM (non-permanent storage whose contents are lost on power-off), so data held in the cache must eventually be stored in a memory such as a hard disk for permanent storage; this operation is called background disk writing.
However, while the data in the cache is being stored to a memory such as a hard disk, user applications may generate a large amount of input/output (IO). If this happens to coincide with the IO peak time of the user applications, the response time of those applications is affected, giving users a poor experience. The traditional solution is off-peak operation: when user applications are busy during the day and the system IO load is heavy, the operation of storing cached data to the hard disk is not performed, and it is instead executed during a few periods at night when user applications are idle. However, user-application IO may be busy even at night, so off-peak operation may not be applicable.
In addition, when cached data is written into the hard disk, some cached data that does not need to be written is also written, which wastes write time and occupies the time of the cached data that does need to be written into the hard disk.
Disclosure of Invention
In view of the above, it is necessary to provide a background disk-writing flow control method, a background disk-writing flow control device, an electronic device, and a storage medium that can improve the efficiency of writing cached data into the hard disk, reduce the risk of data loss, avoid a significant impact on normal input/output service performance, and achieve a good flow control effect.
A first aspect of the present invention provides a background disk-writing flow control method, the method comprising:
when a storage command of user data is received, writing the user data into a configured cache;
when detecting that the cache information in the cache meets a first preset condition, acquiring a flow control threshold corresponding to a current statistical period in a write-in period;
writing the user data in the cache whose data identifier is the first identifier into a hard disk based on the flow control threshold corresponding to the current statistical period;
and when detecting that the cache information in the cache does not meet the first preset condition but meets a second preset condition, clearing the user data in the cache whose data identifier is the second identifier.
Preferably, the obtaining of the flow control threshold corresponding to the current statistical period in the writing period includes:
judging whether the current statistical period is the first statistical period or not;
when the current statistical period is determined to be the first statistical period, determining a preset flow control threshold as a flow control threshold corresponding to the current statistical period;
and when the current statistical period is determined not to be the first statistical period, obtaining the IO load applied by the user in the last statistical period, and determining the flow control threshold corresponding to the current statistical period according to the IO load applied by the user in the last statistical period.
Preferably, determining the flow control threshold corresponding to the current statistical period according to the IO load applied by the user in the previous statistical period includes:
acquiring the data block size of each IO applied by a user in the previous statistical period, and calculating the average data block size of the IO in the previous statistical period;
acquiring the transmission delay of each data block in the previous statistical period, and calculating the average IO data block delay in the previous statistical period;
acquiring a preset reference value of the size of an IO data block and a reference value of corresponding data block time delay;
calculating the IO load intensity in the last statistical period according to the average data block size, the average data block time delay, the reference value of the data block size and the reference value of the corresponding data block time delay of the IO in the last statistical period;
determining the IO load category in the last statistical period by utilizing a pre-trained load classification model according to the IO load intensity in the last statistical period;
and calculating the flow control threshold corresponding to the current statistical period according to the IO load category in the last statistical period.
Preferably, the IO load intensity L in the previous statistical period is calculated from the average data block size, the average data block delay, the reference value of the data block size, and the reference value of the corresponding data block delay of the IO in the previous statistical period, according to L = f(X, Y, M, N) (the expression itself is given only as an image in the source), where X is the average IO data block size in the previous statistical period, Y is the average data block delay, M is the reference value of the data block size, and N is the reference value of the corresponding data block delay.
Preferably, the calculating the flow control threshold corresponding to the current statistical period according to the IO load class in the previous statistical period includes:
when the IO load category in the previous statistical period is a high load category, reducing the flow control threshold corresponding to the previous statistical period by a first preset amplitude to obtain the flow control threshold corresponding to the current statistical period;
when the IO load category in the previous statistical period is a low load category, increasing the flow control threshold corresponding to the previous statistical period by a second preset amplitude to obtain the flow control threshold corresponding to the current statistical period;
and when the IO load category in the last statistical period is a normal load category, taking the flow control threshold corresponding to the last statistical period as the flow control threshold corresponding to the current statistical period.
Preferably, the detecting whether the cache information in the cache meets the first preset condition includes one or more of the following combinations:
detecting whether the residual storage space in the cache is smaller than a preset space threshold value;
and detecting whether the total amount of the user data in the cache is larger than a preset limit threshold value.
Preferably, the detecting whether the cache information in the cache meets a second preset condition is: and detecting whether the caching time of the user data in the cache is earlier than a preset time threshold.
A second aspect of the present invention provides a background disk-writing flow control apparatus, the apparatus comprising:
the cache writing module is used for writing the user data into the configured cache when receiving a storage command of the user data;
the first detection module is used for detecting whether the cache information in the cache meets a first preset condition;
the flow control acquisition module is used for acquiring a flow control threshold corresponding to a current statistical period in a write-in period when the first detection module detects that the cache information in the cache meets a first preset condition;
the hard disk writing module is used for writing the user data in the cache whose data identifier is the first identifier into the hard disk based on the flow control threshold corresponding to the current statistical period;
the second detection module is used for detecting whether the cache information in the cache meets a second preset condition or not when the first detection module detects that the cache information in the cache does not meet the first preset condition;
and the cache clearing module is used for clearing the user data in the cache whose data identifier is the second identifier when the second detection module detects that the cache information in the cache meets the second preset condition.
A third aspect of the present invention provides an electronic device comprising a processor and a memory, wherein the processor is configured to implement the background disk-writing flow control method when executing a computer program stored in the memory.
A fourth aspect of the present invention provides a computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the background disk-writing flow control method.
According to the background disk-writing flow control method and device, electronic device, and storage medium described above, user data is written into a configured cache, and when the cache information in the cache meets the first preset condition, the user data in the cache whose data identifier is the first identifier can be written into the designated hard disk in the current statistical period according to different flow control thresholds. This improves the efficiency of writing user data into the hard disk and reduces the risk of data loss, while avoiding an obvious impact on normal input/output service performance and achieving a good flow control effect. In addition, when it is detected that the cache information in the cache does not meet the first preset condition but meets the second preset condition, the user data is removed from the cache, which saves cache storage space and improves the efficiency of writing user data into the cache.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. Obviously, the drawings in the following description show only embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a background disk-writing flow control method according to a first embodiment of the present invention.
Fig. 2 is a flowchart of a method for determining a flow control threshold corresponding to a current statistical period according to an IO load applied by a user in a previous statistical period according to a second embodiment of the present invention.
Fig. 3 is a functional block diagram of a background disk-writing flow control apparatus according to a third embodiment of the present invention.
Fig. 4 is a schematic diagram of an electronic device according to a fourth embodiment of the present invention.
The following detailed description will further illustrate the invention in conjunction with the above-described figures.
Detailed Description
In order that the above objects, features and advantages of the present invention can be more clearly understood, a detailed description of the present invention will be given below with reference to the accompanying drawings and specific embodiments. It should be noted that the embodiments of the present invention and features of the embodiments may be combined with each other without conflict.
In the following description, numerous specific details are set forth to provide a thorough understanding of the present invention, and the described embodiments are merely a subset of the embodiments of the present invention, rather than a complete embodiment. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention.
The background disk-writing flow control method provided by the embodiments of the present invention is applied to one or more electronic devices. The background disk-writing flow control method can also be applied in a hardware environment formed by an electronic device and a server connected to the electronic device through a network. Networks include, but are not limited to: a wide area network, a metropolitan area network, or a local area network. The background disk-writing flow control method can be executed by the server or by the electronic device, or by the server and the electronic device together.
For an electronic device that needs to perform the background disk-writing flow control method, the background disk-writing flow control function provided by the method of the present invention can be directly integrated on the electronic device, or a client for implementing the method of the present invention can be installed on it. Alternatively, the method provided by the present invention may run on a device such as a server in the form of a Software Development Kit (SDK), with an interface for the background disk-writing flow control function provided in the form of the SDK; an electronic device or another device can then implement flow control of background disk writing through the provided interface.
Example one
Fig. 1 is a flowchart of a background write disk flow control method according to an embodiment of the present invention. The execution sequence in the flowchart may be changed and some steps may be omitted according to different requirements.
And S11, when receiving the storage command of the user data, writing the user data into the configured cache.
When the electronic equipment receives a storage instruction of user data, a data writing instruction is generated, a cache memory is configured, and the user data is written into the configured cache memory.
The user data includes: data content, address information, and data identification. The address information includes information such as a source address and a destination address of the user data.
The data identifier is used to indicate whether the user data needs to be written into the hard disk, and in this embodiment, the data identifier may be a first identifier or a second identifier. When the data identification is a first identification, indicating that the user data needs to be written into a hard disk; and when the data identifier is a second identifier, indicating that the user data does not need to be written into the hard disk. For example, if the data flag is "1", it indicates that the user data needs to be written into the hard disk, and if the data flag is "0", it indicates that the user data does not need to be written into the hard disk, and the user data may be discarded directly after being used up.
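To make the data layout concrete, the following Python sketch models a cache entry carrying the fields listed above; only the "1"/"0" identifier convention comes from the description, and every name here is illustrative rather than taken from the patent.

```python
from dataclasses import dataclass, field
import time

WRITE_TO_DISK = "1"   # first identifier: the data must later be flushed to the hard disk
DISCARD_ONLY = "0"    # second identifier: cache-only data that is never written to disk

@dataclass
class CachedEntry:
    content: bytes       # data content
    source_addr: str     # address information: source address
    dest_addr: str       # address information: destination address
    data_id: str         # data identifier, WRITE_TO_DISK or DISCARD_ONLY
    cached_at: float = field(default_factory=time.time)  # time the entry entered the cache
```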
S12, detecting whether the cache information in the cache meets a first preset condition.
In a preferred embodiment of the present invention, the cache information includes: the remaining storage space in the cache, the total amount of user data in the cache, the cache time of the user data in the cache, and the user data. The total amount of user data refers to the total size of the user data stored in the cache.
The detecting whether the cache information in the cache meets a first preset condition includes one or more of the following combinations:
1) detecting whether the residual storage space in the cache is smaller than a preset space threshold value;
when detecting that the residual storage space in the cache is smaller than the preset space threshold, determining that the cache information in the cache meets a first preset condition; when detecting that the remaining storage space in the cache is greater than or equal to the preset space threshold, determining that the cache information in the cache does not meet a first preset condition.
2) And detecting whether the total amount of the user data in the cache is larger than a preset limit threshold value.
When detecting that the total amount of the user data in the cache is greater than the preset limit threshold, determining that the cache information in the cache meets a first preset condition; when detecting that the total amount of the user data in the cache is smaller than or equal to the preset limit threshold, determining that the cache information in the cache does not meet a first preset condition.
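The two checks above could be combined as in the sketch below, assuming a cache object that exposes its total capacity and current usage (hypothetical attribute names):

```python
def meets_first_condition(cache, space_threshold_bytes, total_limit_bytes):
    """First preset condition: flushing to the hard disk should start.

    Mirrors the two detections described above; either one being true is enough.
    """
    remaining_space = cache.capacity_bytes - cache.used_bytes
    total_user_data = cache.used_bytes
    return remaining_space < space_threshold_bytes or total_user_data > total_limit_bytes
```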
When it is detected that the cache information in the cache meets the first preset condition, executing step S13; otherwise, when it is detected that the cache information in the cache does not satisfy the first predetermined condition, step S15 is executed.
And S13, acquiring a flow control threshold corresponding to the current statistical period in the writing period.
The whole process from the beginning of writing the user data in the cache to the completion of writing is called a writing period. One writing period may be divided into a plurality of statistical periods, and one statistical period may be a preset time period, for example, one statistical period is set to 1 second.
Flow control here refers to controlling the flow rate of data. There are two common flow control methods: one implements flow control based on the source address, destination address, source port, destination port, and protocol type through the QoS module of a router or switch; the other implements application-layer flow control through a dedicated flow control device.
In this preferred embodiment, the acquiring the flow control threshold corresponding to the current statistical period in the write-in period may specifically include:
1) and judging whether the current statistical period is the first statistical period.
Whether the current statistical period is the first statistical period of the write-in period can be determined by checking whether the current time falls within the 1st second of the write-in period.
2) When the current statistical period is determined to be the first statistical period, determining a preset flow control threshold as a flow control threshold corresponding to the current statistical period;
the flow control threshold corresponding to the first statistical period in the write-in period is a preset flow control threshold, and can be preset by a system manager according to experience. Namely, a preset flow control threshold is adopted as the flow control threshold of the first statistical period in the writing period.
3) And when the current statistical period is determined not to be the first statistical period, obtaining the IO load applied by the user in the last statistical period, and determining the flow control threshold corresponding to the current statistical period according to the IO load applied by the user in the last statistical period.
Each of the remaining statistical periods within the write cycle, except for the first statistical period, may correspond to a flow control threshold. The flow control threshold corresponding to each of the remaining statistical periods is dynamically adjusted, the flow control threshold corresponding to the current statistical period may be calculated according to the IO load in the previous statistical period, and the flow control threshold corresponding to the next statistical period may be calculated according to the IO load in the current statistical period. Specifically, a flow control threshold corresponding to a second statistical period is calculated according to the IO load in the first statistical period; calculating a flow control threshold corresponding to a third statistical period according to the IO load in the second statistical period; and so on.
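The chaining of thresholds across statistical periods could be driven by a loop like the one below; the callables are injected so the sketch stays self-contained, and all of the names are assumptions rather than the patent's own interfaces.

```python
def run_write_cycle(periods, preset_threshold, measure_load_category, adjust_threshold, flush):
    """Chain flow control thresholds across the statistical periods of one write-in period.

    periods: number of statistical periods in the write-in period
    measure_load_category(): IO load category observed in the period that just finished
    adjust_threshold(prev_threshold, category): adjustment rule described in Example two
    flush(threshold): writes first-identifier data for one statistical period under `threshold`
    """
    threshold = preset_threshold                  # first statistical period: preset threshold
    for _ in range(periods):
        flush(threshold)
        # the next period's threshold is derived from the load just observed
        threshold = adjust_threshold(threshold, measure_load_category())
```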
The specific process of determining the flow control threshold corresponding to the current statistical period according to the IO load applied by the user in the previous statistical period may refer to fig. 2 and the corresponding description thereof.
And S14, writing the user data corresponding to the data identifier in the cache as the first identifier into a hard disk based on the flow control threshold corresponding to the current statistical period.
The user data in the cache whose data identifier is the first identifier is the data that needs to be written into the hard disk, and it is written into the hard disk based on the flow control threshold corresponding to the current statistical period. If the flow control threshold corresponding to the current statistical period is large, the speed at which this user data is written into the hard disk is governed by the large threshold; this increases the write speed, relieves the storage pressure in the cache, and reduces the chance that the user data in the cache is lost due to a system power failure or another abnormal condition. If the flow control threshold corresponding to the current statistical period is small, the write speed stays low, avoiding an obvious impact on normal input/output service performance.
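One way to enforce the threshold is a simple per-period byte budget, as sketched below; the budget mechanism and the cache/disk methods are assumptions, since the description only states that the write speed is governed by the period's flow control threshold.

```python
import time

WRITE_TO_DISK = "1"   # first identifier (see the cache-entry sketch above)

def flush_with_threshold(cache, disk, threshold_bytes_per_sec, period_seconds=1.0):
    """Write first-identifier entries to the hard disk, capped by the period's threshold."""
    budget = threshold_bytes_per_sec * period_seconds
    deadline = time.monotonic() + period_seconds
    while budget > 0 and time.monotonic() < deadline:
        entry = cache.next_entry_with_id(WRITE_TO_DISK)   # hypothetical cache accessor
        if entry is None:
            break
        disk.write(entry.dest_addr, entry.content)        # hypothetical disk interface
        cache.remove(entry)
        budget -= len(entry.content)
```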
S15, detecting whether the cache information in the cache meets a second preset condition.
The detecting whether the cache information in the cache meets a second preset condition is as follows: and detecting whether the caching time of the user data in the cache is earlier than a preset time threshold.
When the cache time of the user data in the cache is detected to be earlier than the preset time threshold, determining that the cache information in the cache meets a second preset condition; and when detecting that the caching time of the user data in the cache is not earlier than the preset time threshold, determining that the cache information in the cache does not meet a second preset condition.
When it is detected that the cache information in the cache meets the second preset condition, executing step S16; otherwise, when it is detected that the cache information in the cache does not satisfy the second preset condition, the process may return to step S12.
And S16, removing the user data corresponding to the data identifier in the cache as the second identifier.
The user data in the cache whose data identifier is the second identifier is data that does not need to be written into the hard disk; it is never written to the hard disk, and once its caching time is determined to be earlier than the preset time threshold it is cleared from the cache. This saves storage space in the cache, leaves more room for the user data that does need to be written into the hard disk, increases the speed at which user data is written into the cache, further reduces the impact on the IO of user applications, and improves the user experience.
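Reading "caching time earlier than a preset time threshold" as "the entry has stayed in the cache longer than a configured age", the clearing step could look like the sketch below; this interpretation and all names are assumptions.

```python
import time

DISCARD_ONLY = "0"   # second identifier (see the cache-entry sketch above)

def evict_stale_cache_only_entries(cache, max_age_seconds):
    """Second preset condition + clearing: drop old cache-only entries."""
    now = time.time()
    for entry in list(cache.entries()):                   # hypothetical cache accessor
        if entry.data_id == DISCARD_ONLY and (now - entry.cached_at) > max_age_seconds:
            cache.remove(entry)
```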
Example two
Fig. 2 is a flowchart of a method for determining a flow control threshold corresponding to a current statistical period according to an IO load applied by a user in a previous statistical period according to a second embodiment of the present invention.
S21, obtaining the data block size of each IO applied by the user in the previous statistical period, and calculating the average data block size of the IO in the previous statistical period.
The average data block size of the IO in the last statistical period may be calculated by using an arithmetic mean algorithm, a geometric mean algorithm, or a root mean square mean algorithm.
For example, suppose ten IOs are detected in the previous statistical period, with data block sizes of 2M, 1M, 3M, 0.5M, 10M, 4M, 0.1M, 1.2M, 5M, and 8M. Using the arithmetic mean algorithm, the average IO data block size in the previous statistical period is:
(2M + 1M + 3M + 0.5M + 10M + 4M + 0.1M + 1.2M + 5M + 8M) / 10 = 3.48M.
and S22, acquiring the transmission delay of each data block in the previous statistical period, and calculating the average data block delay of the IO in the previous statistical period.
Transmission delay (delay for short) refers to the time a node needs to push a data block onto the transmission medium when sending data, that is, the total time from when a sending station begins transmitting a data frame until the frame has been completely sent, or equivalently the total time from when a receiving station begins receiving a data frame until the frame has been completely received.
In a preferred embodiment of the present invention, the transmission delay of the data block may be obtained from a load measurement tool or a performance monitoring tool installed in each storage node.
As described above, the average IO data block delay in the previous statistical period may also be calculated with an arithmetic mean, geometric mean, or root-mean-square mean algorithm. Suppose the transmission delays of the ten IOs in the previous statistical period are 1s, 0.8s, 1.5s, 0.4s, 5s, 2s, 0.1s, 0.6s, 3s, and 4.4s. Using the arithmetic mean algorithm, the average IO data block delay in the previous statistical period is:
(1s + 0.8s + 1.5s + 0.4s + 5s + 2s + 0.1s + 0.6s + 3s + 4.4s) / 10 = 1.88s.
it should be understood that, if the average data block size of the IO in the previous statistical period is calculated by using an arithmetic mean algorithm, the average data block delay of the IO in the previous statistical period is also calculated by using the arithmetic mean algorithm; if the average data block size of the IO in the previous statistical period is calculated by adopting a geometric mean algorithm, calculating the average data block time delay of the IO in the previous statistical period by adopting the geometric mean algorithm; or if the average data block size of the IO in the previous statistical period is calculated by using the root mean square average algorithm, the average data block delay of the IO in the previous statistical period is also calculated by using the root mean square average algorithm.
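A small helper makes it easy to keep the averaging method consistent for both block size and delay, as required above; the function below is illustrative only.

```python
from math import prod, sqrt

def average(values, method="arithmetic"):
    """Average with a selectable method; use the same method for sizes and delays."""
    n = len(values)
    if method == "arithmetic":
        return sum(values) / n
    if method == "geometric":
        return prod(values) ** (1.0 / n)
    if method == "rms":
        return sqrt(sum(v * v for v in values) / n)
    raise ValueError(f"unknown averaging method: {method}")
```

For the block sizes above, average([2, 1, 3, 0.5, 10, 4, 0.1, 1.2, 5, 8]) gives 3.48, matching the 3.48M example.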
And S23, acquiring a preset reference value of the IO data block size and a corresponding reference value of the data block time delay.
In a preferred embodiment of the present invention, the reference value of the IO data block size and the reference value of the corresponding data block delay may be preset by an administrator of the storage system according to experience. For example, according to experience, when a 4K data block is transmitted, the delay is minimum, and may reach 50ms in an ideal state, then the reference value of the IO data block size may be set to 4K, and the reference value of the corresponding data block delay may be set to 50 ms.
And S24, calculating the IO load intensity in the last statistical period according to the average data block size, the average data block time delay, the reference value of the data block size and the reference value of the corresponding data block time delay of the IO in the last statistical period.
For example, suppose the average IO data block size in the previous statistical period is X, the average data block delay is Y, the reference value of the data block size is M, and the reference value of the corresponding data block delay is N. The IO load intensity L in the previous statistical period is then computed as L = f(X, Y, M, N) (the expression itself is given only as an image in the source).
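Because the expression for L is only available as an image, the sketch below uses an assumed form that multiplies the two normalized quantities; it is a stand-in, not the patent's actual formula.

```python
def io_load_intensity(avg_block_size, avg_block_delay, ref_block_size, ref_block_delay):
    """Assumed form: L = (X / M) * (Y / N); IOs that are larger and slower than the
    baseline produce a higher load intensity."""
    return (avg_block_size / ref_block_size) * (avg_block_delay / ref_block_delay)
```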
And S25, determining the IO load category in the last statistical period by using a pre-trained load classification model according to the IO load strength in the last statistical period.
In a preferred embodiment of the present invention, the IO load categories include: high load class, normal load class, low load class.
Preferably, the load classification model includes, but is not limited to: support Vector Machine (SVM) models. And taking the average data block size of the IO in the last statistical period, the average data block time delay of the IO in the last statistical period and the IO load intensity in the last statistical period as the input of the load classification model, and outputting the IO load category in the last statistical period after calculation of the load classification model.
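With scikit-learn's SVC as the SVM implementation (one possibility; the patent does not name a library), the classification step could be sketched as follows, using the three statistics of the previous period as the feature vector:

```python
from sklearn.svm import SVC

HIGH_LOAD, NORMAL_LOAD, LOW_LOAD = 1, 2, 3   # labels as in the training example below

def classify_load(model: SVC, avg_block_size, avg_block_delay, load_intensity):
    """Predict the IO load category of the previous statistical period."""
    return model.predict([[avg_block_size, avg_block_delay, load_intensity]])[0]
```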
In a preferred embodiment of the present invention, the training process of the load classification model includes:
1) and obtaining the IO load data of the positive sample and the IO load data of the negative sample, and labeling the load class of the IO load data of the positive sample so as to enable the IO load data of the positive sample to carry the IO load class label.
For example, 500 pieces of IO load data corresponding to a high load category, a normal load category, and a low load category are respectively selected, and each piece of IO load data is labeled with a category, "1" may be used as an IO data tag of a high load, "2" may be used as an IO data tag of a normal load, and "3" may be used as an IO data tag of a low load.
2) And randomly dividing the IO load data of the positive sample and the IO load data of the negative sample into a training set with a first preset proportion and a verification set with a second preset proportion, training the load classification model by using the training set, and verifying the accuracy of the trained load classification model by using the verification set.
The training samples in the training sets of different load classes are distributed to different folders. For example, training samples of a high load category are distributed into a first folder, training samples of a normal load category are distributed into a second folder, and training samples of a low load category are distributed into a third folder. Then, training samples with a first preset proportion (for example, 70%) are respectively extracted from different folders and used as total training samples to perform training of the load classification model, and training samples with a remaining second preset proportion (for example, 30%) are respectively extracted from different folders and used as total test samples to perform accuracy verification on the trained load classification model.
3) If the accuracy is greater than or equal to a preset accuracy, ending the training, and identifying the IO load category in the current statistical period by taking the trained load classification model as a classifier; and if the accuracy is smaller than the preset accuracy, increasing the number of positive samples and the number of negative samples to retrain the load classification model until the accuracy is larger than or equal to the preset accuracy.
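The train/validate loop described above could be sketched as follows; the 70/30 split comes from the example in the text, while the accuracy target and feature layout are assumptions:

```python
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

def train_load_classifier(features, labels, min_accuracy=0.9, train_ratio=0.7):
    """Train an SVM load classifier and verify it on a held-out validation set.

    Returns the trained model, or None if the accuracy target is not met and
    more labelled samples should be collected before retraining.
    """
    x_train, x_val, y_train, y_val = train_test_split(
        features, labels, train_size=train_ratio, random_state=0)
    model = SVC()
    model.fit(x_train, y_train)
    accuracy = model.score(x_val, y_val)   # fraction of correct predictions on the validation set
    return model if accuracy >= min_accuracy else None
```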
And S26, calculating the flow control threshold corresponding to the current statistical period according to the IO load category in the previous statistical period.
Specifically, the calculating a flow control threshold corresponding to the current statistical period according to the IO load class in the previous statistical period may include:
1) and when the IO load category in the last statistical period is a high load category, reducing the flow control threshold corresponding to the last statistical period by a first preset amplitude to obtain the flow control threshold corresponding to the current statistical period.
And when the IO load in the last statistical period is a high load, reducing the flow control threshold according to the first preset amplitude, so as to perform write-in operation on the data in the cache by using the low flow control threshold in the current statistical period, and ensuring the efficient access of the user application by reducing the data write-in speed.
In a preferred embodiment of the present invention, the first preset amplitude may be 1/2 of the flow control threshold corresponding to the previous statistical period. That is, the flow control threshold corresponding to the current statistical period is 1/2 of the flow control threshold corresponding to the previous statistical period, and the flow control threshold corresponding to the next statistical period is 1/2 of the flow control threshold corresponding to the current statistical period.
2) And when the IO load category in the previous statistical period is a low load category, increasing the flow control threshold corresponding to the previous statistical period by a second preset amplitude to obtain the flow control threshold corresponding to the current statistical period.
And when the IO load in the last statistical period is a low load, increasing the flow control threshold according to the second preset amplitude, so as to perform write-in operation on the data in the cache by using the high flow control threshold in the current statistical period, and increasing the speed of data write-in on the basis of ensuring the access quality of the user application.
In a preferred embodiment of the present invention, the second preset amplitude may be 1.5 times of a flow control threshold corresponding to a previous statistical period. That is, the flow control threshold corresponding to the current statistical period is 1.5 times of the flow control threshold corresponding to the previous statistical period, and the flow control threshold corresponding to the next statistical period is 1.5 times of the flow control threshold corresponding to the current statistical period.
3) And when the IO load category in the last statistical period is a normal load category, taking the flow control threshold corresponding to the last statistical period as the flow control threshold corresponding to the current statistical period.
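The three adjustment rules reduce to a small function; the 0.5 and 1.5 factors are the example amplitudes given above, and the category labels follow the classifier sketch.

```python
HIGH_LOAD, NORMAL_LOAD, LOW_LOAD = 1, 2, 3

def adjust_threshold(prev_threshold, prev_load_category):
    """Derive the current period's flow control threshold from the previous period's load."""
    if prev_load_category == HIGH_LOAD:
        return prev_threshold * 0.5    # high load: halve the threshold
    if prev_load_category == LOW_LOAD:
        return prev_threshold * 1.5    # low load: increase the threshold by 1.5x
    return prev_threshold              # normal load: keep the threshold unchanged
```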
In summary, in the background disk-writing flow control method of the present invention, user data is written into a configured cache. When it is detected that the cache information in the cache meets the first preset condition and the current statistical period is the first statistical period of the write-in period, the user data in the cache whose data identifier is the first identifier is written into the designated hard disk using a preset flow control threshold. When it is detected that the cache information meets the first preset condition and the current statistical period is not the first one, the flow control threshold corresponding to the current statistical period is dynamically adjusted according to the IO load of the user applications in the previous statistical period, and the first-identifier user data is written into the designated hard disk according to the different flow control thresholds. This improves the efficiency of writing user data into the hard disk, reduces the risk of data loss, avoids an obvious impact on normal input/output service performance, and achieves a good flow control effect.
Secondly, the flow control threshold corresponding to the current statistical period is adjusted automatically and dynamically according to the IO load of the user applications in the previous statistical period, without manual adjustment by an administrator. This reduces the administrator's workload and avoids inaccurate adjustments caused by the administrator's subjective judgment.
In addition, when it is detected that the cache information in the cache does not meet the first preset condition but meets the second preset condition, the user data is removed from the cache, which saves cache storage space and improves the efficiency of writing user data into the cache.
The above description is only a specific embodiment of the present invention, but the scope of the present invention is not limited thereto, and it will be apparent to those skilled in the art that modifications may be made without departing from the inventive concept of the present invention, and these modifications are within the scope of the present invention.
The functional modules and hardware structures of the electronic device for implementing the above background disc-writing flow control method are described below with reference to fig. 3 to 4.
Example three
Fig. 3 is a functional block diagram of a background disk-writing flow control apparatus according to a preferred embodiment of the present invention.
In some embodiments, the background disk-writing flow control apparatus 30 is implemented in an electronic device. The background disk-writing flow control apparatus 30 may include a plurality of functional modules composed of program code segments. The program code of each segment in the background disk-writing flow control apparatus 30 may be stored in a memory and executed by at least one processor to perform the background disk-writing flow control method (see Figs. 1-2 and the related description).
In this embodiment, the background disk-writing flow control apparatus 30 may be divided into a plurality of functional modules according to the functions it performs. The functional modules may include: a cache writing module 301, a first detection module 302, a flow control acquisition module 303, a hard disk writing module 304, a second detection module 305, a cache clearing module 306, a flow control calculation module 307, and a model training module 308. A module, as referred to herein, is a series of computer program segments that can be executed by at least one processor to perform a fixed function, and is stored in the memory. The functionality of the modules is described in detail below.
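As a rough illustration of how these modules could be wired together (module names follow the description; the injected callables stand for the helper sketches in Example one and Example two, and everything else is an assumption):

```python
class BackgroundWriteFlowController:
    """Illustrative composition of the apparatus's functional modules."""

    def __init__(self, cache, disk, preset_threshold,
                 check_first_condition, flush, adjust, evict_stale):
        self.cache = cache
        self.disk = disk
        self.threshold = preset_threshold          # flow control acquisition: first period
        self._check_first = check_first_condition  # first detection module
        self._flush = flush                        # hard disk writing module
        self._adjust = adjust                      # flow control calculation module
        self._evict = evict_stale                  # second detection + cache clearing modules

    def on_store_command(self, entry):             # cache writing module
        self.cache.add(entry)

    def on_period_end(self, load_category):
        if self._check_first(self.cache):
            self._flush(self.cache, self.disk, self.threshold)
            self.threshold = self._adjust(self.threshold, load_category)
        else:
            self._evict(self.cache)
```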
A cache writing module 301, configured to write the user data into the configured cache when a storage command for the user data is received.
When the electronic equipment receives a storage instruction of user data, a data writing instruction is generated, a cache memory is configured, and the user data is written into the configured cache memory.
The user data includes: data content, address information, and data identification. The address information includes information such as a source address and a destination address of the user data.
The data identifier is used to indicate whether the user data needs to be written into the hard disk, and in this embodiment, the data identifier may be a first identifier or a second identifier. When the data identification is a first identification, indicating that the user data needs to be written into a hard disk; and when the data identifier is a second identifier, indicating that the user data does not need to be written into the hard disk. For example, if the data flag is "1", it indicates that the user data needs to be written into the hard disk, and if the data flag is "0", it indicates that the user data does not need to be written into the hard disk, and the user data may be discarded directly after being used up.
The first detecting module 302 is configured to detect whether the cache information in the cache meets a first preset condition.
In a preferred embodiment of the present invention, the cache information includes: the remaining storage space in the cache, the total amount of user data in the cache, the cache time of the user data in the cache, and the user data. The total amount of user data refers to the total size of the user data stored in the cache.
The first detecting module 302 detects whether the cache information in the cache satisfies a first preset condition, including one or more of the following combinations:
1) detecting whether the residual storage space in the cache is smaller than a preset space threshold value;
when detecting that the residual storage space in the cache is smaller than the preset space threshold, determining that the cache information in the cache meets a first preset condition; when detecting that the remaining storage space in the cache is greater than or equal to the preset space threshold, determining that the cache information in the cache does not meet a first preset condition.
2) And detecting whether the total amount of the user data in the cache is larger than a preset limit threshold value.
When detecting that the total amount of the user data in the cache is greater than the preset limit threshold, determining that the cache information in the cache meets a first preset condition; when detecting that the total amount of the user data in the cache is smaller than or equal to the preset limit threshold, determining that the cache information in the cache does not meet a first preset condition.
The flow control obtaining module 303 is configured to obtain a flow control threshold corresponding to a current statistical period in a write-in period when the first detecting module 302 detects that the cache information in the cache meets the first preset condition.
The whole process from the beginning of writing the user data in the cache to the completion of writing is called a writing period. One writing period may be divided into a plurality of statistical periods, and one statistical period may be a preset time period, for example, one statistical period is set to 1 second.
Flow control here refers to controlling the flow rate of data. There are two common flow control methods: one implements flow control based on the source address, destination address, source port, destination port, and protocol type through the QoS module of a router or switch; the other implements application-layer flow control through a dedicated flow control device.
In this preferred embodiment, the acquiring, by the flow control acquiring module 303, the flow control threshold corresponding to the current statistical period in the write-in period may specifically include:
1) and judging whether the current statistical period is the first statistical period.
Whether the current statistical period is the first statistical period of the write-in period can be determined by checking whether the current time falls within the 1st second of the write-in period.
2) When the current statistical period is determined to be the first statistical period, determining a preset flow control threshold as a flow control threshold corresponding to the current statistical period;
the flow control threshold corresponding to the first statistical period in the write-in period is a preset flow control threshold, and can be preset by a system manager according to experience. Namely, a preset flow control threshold is adopted as the flow control threshold of the first statistical period in the writing period.
3) And when the current statistical period is determined not to be the first statistical period, obtaining the IO load applied by the user in the last statistical period, and determining the flow control threshold corresponding to the current statistical period according to the IO load applied by the user in the last statistical period.
Each of the remaining statistical periods within the write cycle, except for the first statistical period, may correspond to a flow control threshold. The flow control threshold corresponding to each of the remaining statistical periods is dynamically adjusted, the flow control threshold corresponding to the current statistical period may be calculated according to the IO load in the previous statistical period, and the flow control threshold corresponding to the next statistical period may be calculated according to the IO load in the current statistical period. Specifically, a flow control threshold corresponding to a second statistical period is calculated according to the IO load in the first statistical period; calculating a flow control threshold corresponding to a third statistical period according to the IO load in the second statistical period; and so on.
A hard disk writing module 304, configured to write the user data in the cache whose data identifier is the first identifier into the hard disk based on the flow control threshold corresponding to the current statistical period.
The user data in the cache whose data identifier is the first identifier is the data that needs to be written into the hard disk, and it is written into the hard disk based on the flow control threshold corresponding to the current statistical period. If the flow control threshold corresponding to the current statistical period is large, the speed at which this user data is written into the hard disk is governed by the large threshold; this increases the write speed, relieves the storage pressure in the cache, and reduces the chance that the user data in the cache is lost due to a system power failure or another abnormal condition. If the flow control threshold corresponding to the current statistical period is small, the write speed stays low, avoiding an obvious impact on normal input/output service performance.
A second detecting module 305, configured to detect whether the cache information in the cache meets a second predetermined condition when the first detecting module 302 detects that the cache information in the cache does not meet the first predetermined condition.
The second detecting module 305 detects whether the cache information in the cache satisfies a second predetermined condition: and detecting whether the caching time of the user data in the cache is earlier than a preset time threshold.
When the cache time of the user data in the cache is detected to be earlier than the preset time threshold, determining that the cache information in the cache meets a second preset condition; and when detecting that the caching time of the user data in the cache is not earlier than the preset time threshold, determining that the cache information in the cache does not meet a second preset condition.
A cache clearing module 306, configured to clear the user data in the cache whose data identifier is the second identifier when the second detecting module 305 detects that the cache information in the cache meets the second preset condition.
The user data in the cache whose data identifier is the second identifier is data that does not need to be written into the hard disk; it is never written to the hard disk, and once its caching time is determined to be earlier than the preset time threshold it is cleared from the cache. This saves storage space in the cache, leaves more room for the user data that does need to be written into the hard disk, increases the speed at which user data is written into the cache, further reduces the impact on the IO of user applications, and improves the user experience.
The flow control obtaining module 303 is further specifically configured to obtain a data block size of each IO applied by the user in a previous statistical period, and calculate an average data block size of the IO in the previous statistical period.
The average data block size of the IO in the last statistical period may be calculated by using an arithmetic mean algorithm, a geometric mean algorithm, or a root mean square mean algorithm.
For example, suppose ten IOs are detected in the previous statistical period, with data block sizes of 2M, 1M, 3M, 0.5M, 10M, 4M, 0.1M, 1.2M, 5M, and 8M. Using the arithmetic mean algorithm, the average IO data block size in the previous statistical period is:
(2M + 1M + 3M + 0.5M + 10M + 4M + 0.1M + 1.2M + 5M + 8M) / 10 = 3.48M.
the flow control obtaining module 303 is further specifically configured to obtain a transmission delay of each data block in the previous statistical period, and calculate an average data block delay of the IO in the previous statistical period.
Transmission delay (delay for short) refers to the time a node needs to push a data block onto the transmission medium when sending data, that is, the total time from when a sending station begins transmitting a data frame until the frame has been completely sent, or equivalently the total time from when a receiving station begins receiving a data frame until the frame has been completely received.
In a preferred embodiment of the present invention, the transmission delay of the data block may be obtained from a load measurement tool or a performance monitoring tool installed in each storage node.
As described above, the average IO data block delay in the previous statistical period may also be calculated with an arithmetic mean, geometric mean, or root-mean-square mean algorithm. Suppose the transmission delays of the ten IOs in the previous statistical period are 1s, 0.8s, 1.5s, 0.4s, 5s, 2s, 0.1s, 0.6s, 3s, and 4.4s. Using the arithmetic mean algorithm, the average IO data block delay in the previous statistical period is:
(1s + 0.8s + 1.5s + 0.4s + 5s + 2s + 0.1s + 0.6s + 3s + 4.4s) / 10 = 1.88s.
it should be understood that, if the average data block size of the IO in the previous statistical period is calculated by using an arithmetic mean algorithm, the average data block delay of the IO in the previous statistical period is also calculated by using the arithmetic mean algorithm; if the average data block size of the IO in the previous statistical period is calculated by adopting a geometric mean algorithm, calculating the average data block time delay of the IO in the previous statistical period by adopting the geometric mean algorithm; or if the average data block size of the IO in the previous statistical period is calculated by using the root mean square average algorithm, the average data block delay of the IO in the previous statistical period is also calculated by using the root mean square average algorithm.
The flow control obtaining module 303 is further specifically configured to obtain a preset reference value of the size of the IO data block and a reference value of the corresponding data block delay.
In a preferred embodiment of the present invention, the reference value of the IO data block size and the reference value of the corresponding data block delay may be preset by an administrator of the storage system according to experience. For example, according to experience, when a 4K data block is transmitted, the delay is minimum, and may reach 50ms in an ideal state, then the reference value of the IO data block size may be set to 4K, and the reference value of the corresponding data block delay may be set to 50 ms.
And a flow control calculation module 307, configured to calculate the IO load intensity in the last statistical period according to the average data block size, the average data block delay, the reference value of the data block size, and the reference value of the corresponding data block delay of the IO in the last statistical period.
For example, suppose the average IO data block size in the previous statistical period is X, the average data block delay is Y, the reference value of the data block size is M, and the reference value of the corresponding data block delay is N. The IO load intensity L in the previous statistical period is then computed as L = f(X, Y, M, N) (the expression itself is given only as an image in the source).
The flow control obtaining module 303 determines the IO load category in the previous statistical period by using a pre-trained load classification model according to the IO load intensity in the previous statistical period.
In a preferred embodiment of the present invention, the IO load categories include: high load class, normal load class, low load class.
Preferably, the load classification model includes, but is not limited to, a Support Vector Machine (SVM) model. The average data block size of the IO in the last statistical period, the average data block delay of the IO in the last statistical period, and the IO load intensity in the last statistical period are taken as the input of the load classification model, and after calculation the load classification model outputs the IO load category in the last statistical period.
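A minimal sketch of the inference step, assuming the pre-trained SVM model has been persisted with joblib and that the labels 1/2/3 denote high, normal, and low load as in the training example further below; the model file name is hypothetical.

```python
import joblib  # assumed persistence format for the pre-trained model

# Assumed feature order: [average data block size, average data block delay, load intensity]
features = [[64.0, 0.2, 64.0]]

clf = joblib.load("load_classifier.pkl")   # hypothetical model file
category = clf.predict(features)[0]        # e.g. 1 = high, 2 = normal, 3 = low load
```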
The model training module 308 is used for training the load classification model.
In a preferred embodiment of the present invention, the process of training the load classification model by the model training module 308 includes:
1) Obtain IO load data of positive samples and IO load data of negative samples, and label the load class of the positive-sample IO load data so that the positive-sample IO load data carries an IO load class label.
For example, 500 pieces of IO load data are selected for each of the high load category, the normal load category, and the low load category, and each piece of IO load data is labeled with its category: "1" may be used as the IO data tag for high load, "2" as the IO data tag for normal load, and "3" as the IO data tag for low load.
2) Randomly divide the positive-sample IO load data and the negative-sample IO load data into a training set with a first preset proportion and a verification set with a second preset proportion, train the load classification model with the training set, and verify the accuracy of the trained load classification model with the verification set.
The training samples in the training sets of different load classes are distributed to different folders. For example, training samples of a high load category are distributed into a first folder, training samples of a normal load category are distributed into a second folder, and training samples of a low load category are distributed into a third folder. Then, training samples with a first preset proportion (for example, 70%) are respectively extracted from different folders and used as total training samples to perform training of the load classification model, and training samples with a remaining second preset proportion (for example, 30%) are respectively extracted from different folders and used as total test samples to perform accuracy verification on the trained load classification model.
3) If the accuracy is greater than or equal to a preset accuracy, end the training and use the trained load classification model as a classifier to identify the IO load category in the current statistical period; if the accuracy is smaller than the preset accuracy, increase the number of positive samples and negative samples and retrain the load classification model until the accuracy is greater than or equal to the preset accuracy. A simplified sketch of this training and verification flow is given below.
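The following scikit-learn sketch illustrates steps 1) to 3) under stated assumptions: the labels 1/2/3 defined above, an illustrative target accuracy, and labeled samples passed directly instead of the per-category folders.

```python
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

def train_load_classifier(X, y, target_accuracy=0.9, train_ratio=0.7):
    """X: rows of [avg block size, avg delay, load intensity]; y: labels 1/2/3."""
    # First preset proportion (e.g. 70%) for training, the rest for verification.
    X_train, X_val, y_train, y_val = train_test_split(
        X, y, train_size=train_ratio, stratify=y, random_state=0)
    clf = SVC(kernel="rbf")
    clf.fit(X_train, y_train)
    accuracy = accuracy_score(y_val, clf.predict(X_val))
    # In the patent, an accuracy below the preset value triggers collecting
    # more positive and negative samples and retraining.
    return clf, accuracy
```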
The flow control calculating module 307 is further configured to calculate a flow control threshold corresponding to the current statistical period according to the IO load category in the previous statistical period.
Specifically, the calculating a flow control threshold corresponding to the current statistical period according to the IO load class in the previous statistical period may include:
1) When the IO load category in the last statistical period is a high load category, reduce the flow control threshold corresponding to the last statistical period by a first preset amplitude to obtain the flow control threshold corresponding to the current statistical period.
When the IO load in the last statistical period is a high load, the flow control threshold is reduced according to the first preset amplitude, so that the data in the cache is written with a lower flow control threshold in the current statistical period, and efficient access by the user application is ensured by reducing the data write speed.
In a preferred embodiment of the present invention, the first preset amplitude may be 1/2 of the flow control threshold corresponding to the previous statistical period. That is, the flow control threshold corresponding to the current statistical period is 1/2 of the flow control threshold corresponding to the previous statistical period, and, if the high load persists, the flow control threshold corresponding to the next statistical period is in turn 1/2 of the flow control threshold corresponding to the current statistical period.
2) When the IO load category in the last statistical period is a low load category, increase the flow control threshold corresponding to the last statistical period by a second preset amplitude to obtain the flow control threshold corresponding to the current statistical period.
When the IO load in the last statistical period is a low load, the flow control threshold is increased according to the second preset amplitude, so that the data in the cache is written with a higher flow control threshold in the current statistical period, increasing the data write speed while still ensuring the access quality of the user application.
In a preferred embodiment of the present invention, the second preset amplitude may raise the flow control threshold to 1.5 times the flow control threshold corresponding to the previous statistical period. That is, the flow control threshold corresponding to the current statistical period is 1.5 times the flow control threshold corresponding to the previous statistical period, and, if the low load persists, the flow control threshold corresponding to the next statistical period is in turn 1.5 times the flow control threshold corresponding to the current statistical period.
3) When the IO load category in the last statistical period is a normal load category, take the flow control threshold corresponding to the last statistical period as the flow control threshold corresponding to the current statistical period. These three adjustment rules are sketched below.
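A minimal sketch of the threshold adjustment, assuming the label encoding used above and the 1/2 and 1.5 factors given as preferred examples; the function name is illustrative only.

```python
HIGH, NORMAL, LOW = 1, 2, 3  # label encoding assumed above

def current_threshold(previous_threshold, load_category):
    """Flow control threshold for the current statistical period."""
    if load_category == HIGH:
        return previous_threshold * 0.5   # reduce by the first preset amplitude
    if load_category == LOW:
        return previous_threshold * 1.5   # raise by the second preset amplitude
    return previous_threshold             # normal load: keep the previous threshold
```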
In summary, the background disk-writing flow control apparatus according to the present invention writes user data into a configured cache. When it detects that the cache information in the cache meets a first preset condition and the current write cycle is the first statistical period, it writes the user data whose data identifier in the cache is the first identifier into the designated hard disk using a preset flow control threshold; when it detects that the cache information in the cache meets the first preset condition and the current write cycle is not the first statistical period, it dynamically adjusts the flow control threshold corresponding to the current statistical period according to the IO load applied by the user in the previous statistical period, and writes the user data whose data identifier in the cache is the first identifier into the designated hard disk according to the different flow control thresholds. This improves the efficiency of writing user data into the hard disk, reduces the risk of data loss, avoids an obvious impact on normal input and output service performance, and achieves a good flow control effect.
Secondly, the flow control threshold corresponding to the current statistical period is automatically and dynamically adjusted according to the IO load applied by the user in the previous statistical period, without manual adjustment by the administrator, which reduces the administrator's workload and avoids inaccurate adjustment caused by the administrator's subjective factors.
In addition, when it is detected that the cache information in the cache does not meet the first preset condition but meets the second preset condition, the user data is cleared from the cache, which saves the storage space of the cache and improves the efficiency of writing user data into the cache.
An integrated unit implemented in the form of a software functional module may be stored in a computer-readable storage medium. The software functional module is stored in a storage medium and includes several instructions for enabling a computer device (which may be a personal computer, a dual-screen device, or a network device) or a processor to execute parts of the methods according to the embodiments of the present invention.
Example four
Fig. 4 is a schematic diagram of an electronic device according to a fourth embodiment of the present invention.
The electronic device 4 includes: a memory 41, at least one processor 42, a computer program 43 stored in said memory 41 and executable on said at least one processor 42, and at least one communication bus 44.
The steps in the above-described method embodiments are implemented when the computer program 43 is executed by the at least one processor 42.
Illustratively, the computer program 43 may be divided into one or more modules/units, which are stored in the memory 41 and executed by the at least one processor 42 to perform the steps in the above-described method embodiments of the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution process of the computer program 43 in the electronic device 4.
The electronic device 4 may be a desktop computer, a notebook, a palm computer, a cloud server, or another computing device. It will be understood by those skilled in the art that Fig. 4 is merely an example of the electronic device 4 and does not constitute a limitation on the electronic device 4, which may include more or fewer components than those shown, combine some components, or use different components; for example, the electronic device 4 may further include an input/output device, a network access device, a bus, and so on.
The at least one processor 42 may be a Central Processing Unit (CPU), another general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The processor 42 may be a microprocessor or any conventional processor; it is the control center of the electronic device 4 and connects the various parts of the entire electronic device 4 through various interfaces and lines.
The memory 41 may be used for storing the computer program 43 and/or the modules/units, and the processor 42 implements various functions of the electronic device 4 by running or executing the computer program and/or the modules/units stored in the memory 41 and calling the data stored in the memory 41. The memory 41 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function or an image playing function), and the like, and the data storage area may store data created according to the use of the electronic device 4 (such as audio data or a phonebook) and the like. In addition, the memory 41 may include a high speed random access memory, and may also include a non-volatile memory, such as a hard disk, an internal memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a Flash memory card (Flash Card), at least one magnetic disk storage device, a Flash memory device, or another non-volatile solid state storage device.
If the integrated modules/units of the electronic device 4 are implemented in the form of software functional units and sold or used as separate products, they may be stored in a computer-readable storage medium. Based on such understanding, all or part of the flow of the methods according to the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium; when the computer program is executed by a processor, the steps of the method embodiments may be implemented. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content contained in the computer-readable medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in a jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunication signals.
In the embodiments provided in the present invention, it should be understood that the disclosed electronic device and method can be implemented in other ways. For example, the above-described embodiments of the electronic device are merely illustrative, and for example, the division of the units is only one logical functional division, and there may be other divisions when the actual implementation is performed.
In addition, functional units in the embodiments of the present invention may be integrated into the same processing unit, or each unit may exist alone physically, or two or more units are integrated into the same unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned. Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and that the singular does not exclude the plural. A plurality of units or means recited in the present invention may also be implemented by one unit or means through software or hardware. The terms first, second, etc. are used to denote names and do not denote any particular order.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present invention and not to limit them; although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made to the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.

Claims (9)

1. A background write stream control method, the method comprising:
when a storage command of user data is received, writing the user data into a configured cache;
when it is detected that the cache information in the cache meets a first preset condition, judging whether a current statistical period is a first statistical period, when it is determined that the current statistical period is not the first statistical period, obtaining an IO load applied by a user in a previous statistical period, determining a flow control threshold corresponding to the current statistical period according to the IO load applied by the user in the previous statistical period, wherein determining the flow control threshold corresponding to the current statistical period according to the IO load applied by the user in the previous statistical period comprises: acquiring the data block size of each IO applied by a user in the previous statistical period, and calculating the average data block size of the IO in the previous statistical period; acquiring the transmission delay of each data block in the previous statistical period, and calculating the average IO data block delay in the previous statistical period; acquiring a preset reference value of the size of an IO data block and a reference value of corresponding data block time delay; calculating the IO load intensity in the last statistical period according to the average data block size, the average data block time delay, the reference value of the data block size and the reference value of the corresponding data block time delay of the IO in the last statistical period; determining the IO load category in the last statistical period by utilizing a pre-trained load classification model according to the IO load intensity in the last statistical period; calculating a flow control threshold corresponding to the current statistical period according to the IO load category in the previous statistical period;
writing the user data corresponding to the data identifier in the cache as the first identifier into a hard disk based on the flow control threshold corresponding to the current statistical period;
and when detecting that the cache information in the cache does not meet the first preset condition but meets a second preset condition, clearing the user data corresponding to the data identifier in the cache as the second identifier.
2. The method of claim 1, wherein the method further comprises:
and when the current statistical period is determined to be the first statistical period, determining a preset flow control threshold value as a flow control threshold value corresponding to the current statistical period.
3. The method according to claim 2, wherein the calculation formula for calculating the IO load intensity in the previous statistical period according to the average data block size, the average data block delay, the reference value of the data block size, and the reference value of the corresponding data block delay of the IO in the previous statistical period is as follows: L =
[formula image in the original publication]
wherein X is the average data block size of the IO in the previous statistical period, Y is the average data block delay, M is the reference value of the data block size, and N is the reference value of the corresponding data block delay.
4. The method according to claim 2 or 3, wherein the calculating the flow control threshold corresponding to the current statistical period according to the IO load class in the last statistical period includes:
when the IO load category in the previous statistical period is a high load category, reducing the flow control threshold corresponding to the previous statistical period by a first preset amplitude to obtain the flow control threshold corresponding to the current statistical period;
when the IO load category in the previous statistical period is a low load category, increasing the flow control threshold corresponding to the previous statistical period by a second preset amplitude to obtain a flow control threshold corresponding to the current statistical period;
and when the IO load category in the last statistical period is a normal load category, taking the flow control threshold corresponding to the last statistical period as the flow control threshold corresponding to the current statistical period.
5. The method according to any one of claims 1 to 3, wherein detecting whether the cache information in the cache satisfies the first predetermined condition comprises one or more of the following:
detecting whether the residual storage space in the cache is smaller than a preset space threshold value;
and detecting whether the total amount of the user data in the cache is larger than a preset limit threshold value.
6. The method according to any one of claims 1 to 3, wherein detecting whether the cache information in the cache satisfies a second predetermined condition is: and detecting whether the caching time of the user data in the cache is earlier than a preset time threshold.
7. A background write disk flow control apparatus, the apparatus comprising:
the cache writing module is used for writing the user data into the configured cache when receiving a storage command of the user data;
the first detection module is used for detecting whether the cache information in the cache meets a first preset condition;
a flow control obtaining module, configured to, when the first detecting module detects that the cache information in the cache meets a first preset condition, determine whether a current statistics period is a first statistics period, when it is determined that the current statistics period is not the first statistics period, obtain an IO load applied by a user in a previous statistics period, determine, according to the IO load applied by the user in the previous statistics period, a flow control threshold corresponding to the current statistics period, and determine, according to the IO load applied by the user in the previous statistics period, a flow control threshold corresponding to the current statistics period includes: acquiring the data block size of each IO applied by a user in the previous statistical period, and calculating the average data block size of the IO in the previous statistical period; acquiring the transmission delay of each data block in the previous statistical period, and calculating the average IO data block delay in the previous statistical period; acquiring a preset reference value of the size of an IO data block and a reference value of corresponding data block time delay; calculating the IO load intensity in the last statistical period according to the average data block size, the average data block time delay, the reference value of the data block size and the reference value of the corresponding data block time delay of the IO in the last statistical period; determining the IO load category in the last statistical period by utilizing a pre-trained load classification model according to the IO load intensity in the last statistical period; calculating a flow control threshold corresponding to the current statistical period according to the IO load category in the previous statistical period;
the hard disk writing module is used for writing the user data corresponding to the data identifier in the cache as the first identifier into the hard disk based on the flow control threshold corresponding to the current statistical period;
the second detection module is used for detecting whether the cache information in the cache meets a second preset condition or not when the first detection module detects that the cache information in the cache does not meet the first preset condition;
and the cache clearing module is used for clearing the user data in the cache whose data identifier is the second identifier when the second detection module detects that the cache information in the cache meets the second preset condition.
8. An electronic device, comprising a processor and a memory, wherein the processor is configured to implement the background write disk flow control method according to any one of claims 1 to 6 when executing the computer program stored in the memory.
9. A computer-readable storage medium, on which a computer program is stored, wherein the computer program, when executed by a processor, implements the background write disk flow control method according to any one of claims 1 to 6.
CN201810565748.4A 2018-06-04 2018-06-04 Background disc writing flow control method and device, electronic equipment and storage medium Active CN108763107B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201810565748.4A CN108763107B (en) 2018-06-04 2018-06-04 Background disc writing flow control method and device, electronic equipment and storage medium
PCT/CN2018/108129 WO2019232994A1 (en) 2018-06-04 2018-09-27 Flow control method and apparatus for writing in disk in background, and electronic device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810565748.4A CN108763107B (en) 2018-06-04 2018-06-04 Background disc writing flow control method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN108763107A CN108763107A (en) 2018-11-06
CN108763107B true CN108763107B (en) 2022-03-01

Family

ID=64002695

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810565748.4A Active CN108763107B (en) 2018-06-04 2018-06-04 Background disc writing flow control method and device, electronic equipment and storage medium

Country Status (2)

Country Link
CN (1) CN108763107B (en)
WO (1) WO2019232994A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114449094B (en) * 2020-10-30 2023-04-11 华为技术有限公司 Control method, electronic device, chip and storage medium
CN114363640B (en) * 2022-01-05 2023-11-21 上海哔哩哔哩科技有限公司 Data storage method, device and system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101090323A (en) * 2006-06-14 2007-12-19 国际商业机器公司 Storage device allocation managing method and system in switches utilizing a flow control
CN103729313A (en) * 2012-10-11 2014-04-16 苏州捷泰科信息技术有限公司 Method and device for controlling input and output flow of SSD cache
CN104978335A (en) * 2014-04-04 2015-10-14 阿里巴巴集团控股有限公司 Data access control method and data access control device
CN107544862A (en) * 2016-06-29 2018-01-05 中兴通讯股份有限公司 A kind of data storage reconstructing method and device, memory node based on correcting and eleting codes

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7277982B2 (en) * 2004-07-27 2007-10-02 International Business Machines Corporation DRAM access command queuing structure
CN103543955A (en) * 2013-08-05 2014-01-29 记忆科技(深圳)有限公司 Method and system for reading cache with solid state disk as equipment and solid state disk
US9361145B1 (en) * 2014-06-27 2016-06-07 Amazon Technologies, Inc. Virtual machine state replication using DMA write records
CN104461935B (en) * 2014-11-27 2018-03-13 华为技术有限公司 A kind of method, apparatus and system for carrying out data storage
CN106547476B (en) * 2015-09-22 2021-11-09 伊姆西Ip控股有限责任公司 Method and apparatus for data storage system
CN105677258A (en) * 2016-02-23 2016-06-15 浪潮(北京)电子信息产业有限公司 Method and system for managing log data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on Packet Forwarding and Flow Control Based on a Network Processor in a Multi-network Gateway; Feng Xing; China Master's Theses Full-text Database (Information Science and Technology); 2013-04-15; full text *

Also Published As

Publication number Publication date
CN108763107A (en) 2018-11-06
WO2019232994A1 (en) 2019-12-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant