CN111198659B - Concurrent I/O stream model identification method and system based on multi-sliding window implementation - Google Patents

Concurrent I/O stream model identification method and system based on multi-sliding window implementation

Info

Publication number
CN111198659B
Authority
CN
China
Prior art keywords
request, window, recorded, requests, detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911371423.3A
Other languages
Chinese (zh)
Other versions
CN111198659A (en)
Inventor
沙方浩
沈海嘉
吴瑞强
范玉
张帆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin Zhongke Shuguang Storage Technology Co ltd
Original Assignee
Tianjin Zhongke Shuguang Storage Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin Zhongke Shuguang Storage Technology Co ltd
Priority to CN201911371423.3A
Publication of CN111198659A
Application granted
Publication of CN111198659B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06: Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601: Interfaces specially adapted for storage systems
    • G06F 3/0602: Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F 3/061: Improving I/O performance
    • G06F 3/0628: Interfaces specially adapted for storage systems making use of a particular technique
    • G06F 3/0638: Organizing or formatting or addressing of data
    • G06F 3/064: Management of blocks
    • G06F 3/0655: Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • G06F 3/0656: Data buffering arrangements
    • G06F 12/00: Accessing, addressing or allocating within memory systems or architectures
    • G06F 12/02: Addressing or allocation; Relocation
    • G06F 12/08: Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F 12/0802: Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F 12/0862: Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with prefetch
    • G06F 12/0866: Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches for peripheral storage systems, e.g. disk cache
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a concurrent I/O stream model identification method based on multiple sliding windows. Taking a fixed segment as the unit, all received I/O requests that fall within the same segment are mapped to the same hash table row; a detection window of controllable range is established with the recorded I/O request as a reference and slides as new I/O requests are recorded; several detection windows are maintained in the same hash table row, and I/O requests that hit the same detection window are judged to follow a sequential model. By expanding the hash mapping of individual I/O requests into processing in units of fixed segments, i.e. mapping all I/O requests of the same segment to the same hash table row, and then identifying an overall-ordered but locally unordered I/O model through multiple sliding windows, the method and system solve the local disorder introduced when an ordinary storage system dispatches sequential I/O requests across multiple threads.

Description

Concurrent I/O stream model identification method and system based on multi-sliding window implementation
Technical Field
The application relates to the technical field of computer data processing, and in particular to a concurrent I/O stream model identification method and system based on a multi-sliding window implementation.
Background
With the explosive growth of data volume, applications place ever more demanding high-performance I/O requirements on storage systems. The usual responses fall into two directions: expanding and upgrading the storage hardware, and optimizing the storage system software. In large-scale processing scenarios such as big data and cloud computing, considerations of stability, security and cost control often leave the hardware expansion and upgrade capability of a storage system weaker than that of the computing system. Optimizing the storage system software to squeeze more usable performance out of the storage hardware therefore becomes increasingly important, and targeted handling of specific I/O models is one of the optimization directions.
In existing open-source storage systems, I/O model identification is generally simple; the sequential detection that the Bcache cache system implements for read-ahead is a representative example.
Bcache provides a read-ahead (pre-reading) function and implements a sequential I/O stream identification module for it; the module is built on a hash table and supports detection of multiple concurrent I/O streams.
In an application scenario with multiple concurrent I/O streams, even if the I/O requests issued by the application side are strictly sequential, adjacent I/O requests received by a worker thread of the storage system do not necessarily belong to the same stream: the multi-path concurrency scatters the request order of the original I/O streams, ultimately randomizing the requests that reach the disk and greatly degrading disk performance.
We use two concurrent I/O streams to illustrate the recognition principle of Bcache, as shown in FIG. 1:
Rows 0, 1 and 2 represent hash table rows, and X, Y, Z represent adjacent received I/O requests.
Assuming the address ranges of the three I/O requests are X [100,200], Y [800,900] and Z [200,300], the processing flow is as follows:
I/O request X is received and its start address 100 is hashed, mapping (say) to row 1. No record ending at address 100 is found in row 1, so X is considered to be in a random I/O model; the end address 200 of X is then hashed, mapping (say) to row 0, and X is recorded in row 0;
I/O request Y is received and its start address 800 is hashed, mapping (say) to row 0. No record ending at address 800 is found in row 0, so Y is considered to be in a random I/O model; the end address 900 of Y is then hashed, mapping (say) to row 2, and Y is recorded in row 2;
I/O request Z is received and its start address 200 is hashed; like the end address of X, it maps to row 0, where the record of X ending at address 200 is found. Z and X are therefore regarded as one sequential I/O stream and belong to a sequential I/O model: Z is merged with X, the end address of the record is updated to 300 (the end of Z), and the merged record is re-hashed by the new end address 300; assuming it maps to row 1, the record is moved from row 0 to row 1. A simplified sketch of this scheme is given below.
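A minimal C sketch of this end-address matching, assuming single-slot hash rows and a trivial hash (the names and structure here are illustrative only; Bcache's real implementation keeps richer per-row state):

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_ROWS 64

/* One recorded stream tail: the end address of the last request of a stream. */
struct io_record {
    bool     used;
    uint64_t end_addr;
};

static struct io_record table[NR_ROWS];

static unsigned hash_addr(uint64_t addr)
{
    return (unsigned)(addr % NR_ROWS);           /* toy hash for illustration */
}

/* Returns true if request [start, end) extends a recorded stream (sequential),
 * false otherwise (random); in both cases the stream tail is re-recorded under
 * its new end address, as in the X/Y/Z walkthrough above. */
static bool bcache_style_detect(uint64_t start, uint64_t end)
{
    struct io_record *old = &table[hash_addr(start)];
    bool sequential = old->used && old->end_addr == start;

    if (sequential)
        old->used = false;                       /* consume the matched record */

    struct io_record *tail = &table[hash_addr(end)];
    tail->used = true;                           /* record moves to the row    */
    tail->end_addr = end;                        /* hashed from the new end    */

    return sequential;
}
```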
In this way the Bcache cache system judges whether an I/O request conforms to the sequential model, and if so it starts subsequent read-ahead based on the current I/O request. To improve throughput, a conventional storage system usually processes received I/O requests with multiple threads, so even a single-path sequential I/O stream from the front-end application may become locally unordered inside the storage system. Bcache's scheme for identifying the I/O model therefore has limitations, specifically:
According to Bcache's judgment logic, the two I/O requests X and Z in FIG. 1 are in fact sequentially adjacent, but after multithreaded dispatch Z may be processed before X; the two are then classified as belonging to a random I/O model, which greatly reduces identification accuracy.
Bcache's approach of hashing each individual I/O request is also unsuitable for an I/O aggregation function, because after multithreaded dispatch inside the storage system an originally sequential run of I/O requests is split by local disorder into many small fragments.
In view of this, the present application has been made.
Disclosure of Invention
To address these shortcomings of the prior art, the application provides a concurrent I/O stream model identification method and system based on a multi-sliding window implementation, enabling smarter analysis and judgment of I/O requests and improving identification accuracy.
In order to achieve the above purpose, the technical scheme of the application is as follows:
a concurrent I/O stream model identification method based on multi-sliding window implementation comprises
Mapping all I/O requests belonging to one segment in the received I/O requests to the same hash table row by taking the fixed segment as a unit;
establishing a detection window in a controllable range by taking the recorded I/O request as a reference, wherein the detection window slides along with the recorded I/O request;
a plurality of detection windows are arranged in the same hash table row, and I/O requests hitting the same detection window are judged to be in a sequence model.
Further, in the concurrent I/O stream model identification method based on the multi-sliding window implementation, in the step of "establishing a detection window in a controllable range with the recorded I/O request as a reference, the detection window sliding as I/O requests are recorded":
the detection window is measured in numbers of I/O requests, and within the overall window the leftmost and rightmost recorded I/O requests serve as the left and right axes, dividing the whole window into left, middle and right sub-windows; for the sub-window width, the number of worker threads is set as T and an adjustable coefficient N is defined, N being a positive integer; then:
left/right sub-window width = MAX(maximum recorded I/O granularity, new I/O granularity) × T × N.
Further, in the concurrent I/O stream model identification method based on the multi-sliding window implementation, if the current I/O request lies beyond the right axis and its mapped address falls within the right sub-window, i.e. within a few granularities to the right of the right axis, the current I/O request is judged to hit and is recorded, and the right axis is updated;
if the current I/O request hits the middle window, it is only recorded and the left and right axes are not updated.
In another aspect, the application also provides a concurrent I/O stream model identification system based on a multi-sliding window implementation, comprising a processor and a memory, the memory storing a program which, when run by the processor, performs the following steps:
mapping, in units of a fixed segment, all received I/O requests that belong to the same segment to the same hash table row;
establishing a detection window in a controllable range with the recorded I/O request as a reference, the detection window sliding as I/O requests are recorded;
setting a plurality of detection windows in the same hash table row, and judging I/O requests that hit the same detection window to be in a sequential model.
Further, in the concurrent I/O stream model identification system based on the multi-sliding window implementation, when the program performs the step of "establishing a detection window in a controllable range with the recorded I/O request as a reference, the detection window sliding as I/O requests are recorded":
the detection window is measured in numbers of I/O requests, and within the overall window the leftmost and rightmost recorded I/O requests serve as the left and right axes, dividing the whole window into left, middle and right sub-windows; for the sub-window width, the number of worker threads is set as T and an adjustable coefficient N is defined, N being a positive integer; then:
left/right sub-window width = MAX(maximum recorded I/O granularity, new I/O granularity) × T × N.
Further, in the concurrent I/O stream model identification system based on the multi-sliding window implementation, if the current I/O request lies beyond the right axis and its mapped address falls within the right sub-window, i.e. within a few granularities to the right of the right axis, the current I/O request is judged to hit and is recorded, and the right axis is updated;
if the current I/O request hits the middle window, it is only recorded and the left and right axes are not updated.
Compared with the prior art, the application has the following beneficial effects:
The method and the system expand the hash mapping of individual I/O requests into processing in units of fixed segments, i.e. all I/O requests of the same segment are mapped to the same hash table row, and an overall-ordered but locally unordered I/O model is then identified through multiple sliding windows, solving the local disorder introduced when an ordinary storage system dispatches sequential I/O requests across multiple threads. Identifying the order of I/O records belonging to the same fixed segment through multiple sliding windows enables smarter analysis and judgment and greatly improves identification accuracy. The identification algorithm also has a wide range of application: it suits overall optimization schemes of ordinary storage systems, such as cache read-ahead and I/O request aggregation, further improving disk and cache utilization.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. Like elements or portions are generally identified by like reference numerals throughout the several figures. In the drawings, elements or portions thereof are not necessarily drawn to scale.
FIG. 1 is a schematic diagram of Bcache I/O model identification in the prior art;
FIG. 2 is a schematic diagram illustrating hash processing in units of fixed segments in a concurrent I/O stream model identification method based on a multi-sliding window implementation according to an embodiment of the present application;
FIG. 3 is a schematic diagram of the operation of setting a detection window in the method of the present application;
FIG. 4 is a schematic diagram of multiple sliding windows disposed in the same segment row in the method of the present application.
Detailed Description
Embodiments of the technical scheme of the present application will be described in detail below with reference to the accompanying drawings. The following examples are only for more clearly illustrating the technical aspects of the present application, and thus are merely examples, and are not intended to limit the scope of the present application.
It is noted that unless otherwise indicated, technical or scientific terms used herein should be given the ordinary meaning as understood by one of ordinary skill in the art to which this application belongs.
Example 1
A concurrent I/O stream model identification method based on a multi-sliding window implementation comprises:
S1, mapping, in units of a fixed segment, all received I/O requests that belong to the same segment to the same hash table row;
S2, establishing a detection window in a controllable range with the recorded I/O request as a reference, the detection window sliding as I/O requests are recorded;
S3, setting a plurality of detection windows in the same hash table row, and judging I/O requests that hit the same detection window to be in a sequential model.
The hash mapping of individual I/O requests is thus expanded into processing in units of fixed segments: all I/O requests of the same segment are mapped to the same hash table row, and an overall-ordered but locally unordered I/O model is then identified through multiple sliding windows, solving the local disorder introduced when an ordinary storage system dispatches sequential I/O requests across multiple threads. One possible data layout for these steps is sketched below, followed by the detailed flow.
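The sketch below is an assumption about structure, not code from the patent; the constants and names (SEGMENT_SIZE, io_window, segment_row, row_for) are invented for illustration:

```c
#include <stdint.h>

#define SEGMENT_SIZE    (4u << 20)   /* assumed fixed segment length: 4 MiB  */
#define NR_ROWS         128          /* number of hash table rows            */
#define WINDOWS_PER_ROW 4            /* detection windows kept per row       */

/* One sliding detection window, bounded by its left and right axes. */
struct io_window {
    int      used;
    uint64_t left_axis;      /* start address of the leftmost recorded request */
    uint64_t right_axis;     /* end address of the rightmost recorded request  */
    uint32_t max_gran;       /* largest I/O granularity recorded so far        */
    uint32_t nr_recorded;    /* how many requests this window has recorded     */
};

/* One hash table row collects every request of one fixed segment. */
struct segment_row {
    uint64_t         segment_id;               /* start_addr / SEGMENT_SIZE   */
    struct io_window windows[WINDOWS_PER_ROW];
};

static struct segment_row hash_table[NR_ROWS];

/* Step S1: all requests of the same segment map to the same row. */
static struct segment_row *row_for(uint64_t start_addr)
{
    uint64_t seg = start_addr / SEGMENT_SIZE;
    return &hash_table[seg % NR_ROWS];
}
```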
s1, mapping all I/O requests which belong to a segment length in the received I/O requests into the same hash table row:
as shown in fig. 2, A, B, C, D is 4I/O requests connected in sequence, and the segments where the 4I/O requests are located range from the start address of a to the end address of D, after the multithreaded dispatch process, the sequence of A, B, C, D is disordered, and the process flow is as follows:
(1) if B arrives first, calculating the segment where B is located, carrying out hash mapping to a corresponding hash table row, searching whether the record of the same segment exists or not, if not, considering that B is in a random I/O model, and finally recording B to a corresponding position;
(2) after the second arrival of A, calculating the section where A is located, and the section which is the same as the section B (namely, the section length is equal to that of the section B, the section B is the same as that of the section B), if the record B is found, the A is considered to be in a sequential model, and finally the record A is recorded to the corresponding position;
(3) d, when the third arrival is achieved, calculating the section where the D is located, and then searching the section which is the same as A, B section, and considering the D to be in a sequential model, and finally recording the D to the corresponding position;
(4) and C finally, after the section where the C is located is calculated, the section is the same as A, B, D, and when the D record is found, the C is considered to be in the sequential model, and finally the C is recorded to the corresponding position.
Since the length of one segment in a hash table row is typically hundreds of times the granularity of an I/O request, several adjacently received I/O requests may belong to the same segment yet still be essentially random, so belonging to the same segment cannot by itself be taken as sequential. Therefore, in the application, a detection window is established in a controllable range with the recorded I/O request as a reference; the window slides as I/O requests continue to be recorded, and an I/O request that hits the detection window is considered to be in a sequential model.
Specifically, in S2, "establishing a detection window in a controllable range with the recorded I/O request as a reference, the detection window sliding as I/O requests are recorded": because I/O request granularity is variable, the detection window is measured in numbers of I/O requests, and within the overall window the leftmost and rightmost recorded I/O requests serve as the left and right axes, dividing the whole window into left, middle and right sub-windows. For the sub-window width, the number of worker threads is set as T and an adjustable coefficient N is defined, N being a positive integer chosen according to actual conditions; then:
left/right sub-window width = MAX(maximum recorded I/O granularity, new I/O granularity) × T × N;
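Written as a small C helper, with T and N as defined above (a sketch only; the function name is an assumption):

```c
#include <stdint.h>

/* Left/right sub-window width = MAX(max recorded granularity, new granularity) * T * N. */
static uint32_t edge_window_width(uint32_t max_recorded_gran, uint32_t new_gran,
                                  uint32_t T, uint32_t N)
{
    uint32_t gran = max_recorded_gran > new_gran ? max_recorded_gran : new_gran;
    return gran * T * N;
}
```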
the following description is of one specific embodiment: when t=2, n=1, a set of three granularity I/O requests of 4k, 8k, and 16k are processed, the detection window operation is as shown in fig. 3, each cell represents a granularity of 4k, and the window operation is as follows:
(1) the first I/O request has 8k granularity; being the first request, it is recorded and the left and right axes of the window (denoted L and R respectively) are generated;
(2) the second I/O request has 8k granularity and lies beyond the right axis R; the right window 301, extending two 8k lengths to the right of the current right axis, is determined first, then the I/O request is judged a hit and recorded, and finally the right axis R is updated;
(3) the third I/O request has 4k granularity and hits the middle window 302; it is only recorded, and the left and right axes are not updated;
(4) the fourth I/O request has 4k granularity and lies beyond the right axis R; the right window 301, extending two 8k lengths to the right of the right axis R, is determined first, then the I/O request is judged a hit and recorded, and finally the right axis R is updated;
(5) the fifth I/O request has 16k granularity and lies beyond the right axis R; the right window 301, extending two 16k lengths to the right of the right axis R, is determined, then the I/O request is judged a hit and recorded, and the right axis R is updated.
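Combining this walkthrough with the struct io_window and edge_window_width helpers sketched earlier, the per-window hit test might look as follows; this is an illustrative reading of the embodiment, not the patented implementation itself:

```c
enum window_hit { WINDOW_MISS, WINDOW_HIT_MIDDLE, WINDOW_HIT_RIGHT, WINDOW_HIT_LEFT };

/* Test request [start, end) against one detection window and update it on a hit.
 * Middle hits record only; right/left hits also slide the corresponding axis. */
static enum window_hit window_check(struct io_window *w,
                                    uint64_t start, uint64_t end,
                                    uint32_t T, uint32_t N)
{
    uint32_t gran = (uint32_t)(end - start);
    uint64_t edge = edge_window_width(w->max_gran, gran, T, N);

    if (start >= w->left_axis && end <= w->right_axis) {
        w->nr_recorded++;                       /* middle window: record only     */
        return WINDOW_HIT_MIDDLE;
    }
    if (start >= w->right_axis && end <= w->right_axis + edge) {
        w->right_axis = end;                    /* right window: slide right axis */
        if (gran > w->max_gran)
            w->max_gran = gran;
        w->nr_recorded++;
        return WINDOW_HIT_RIGHT;
    }
    if (end <= w->left_axis && start + edge >= w->left_axis) {
        w->left_axis = start;                   /* left window: slide left axis   */
        if (gran > w->max_gran)
            w->max_gran = gran;
        w->nr_recorded++;
        return WINDOW_HIT_LEFT;
    }
    return WINDOW_MISS;                         /* outside all three sub-windows  */
}
```

With T = 2 and N = 1 this reproduces steps (1) to (5): requests beyond R that still fall inside the right sub-window advance R, while requests inside [L, R] only add a record.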
Because the span of a hash table row belonging to one segment is relatively large, several sequential I/O streams may coexist within it, and misjudgment must then be reduced as far as possible. For I/O requests of the same segment, the method therefore further prevents misjudgment through step S3, by allowing several windows within the segment.
S3, a plurality of detection windows are set in the same hash table row, and I/O requests hitting the same detection window are judged to be in a sequential model, as shown in FIG. 4.
A, B, C and X, Y, Z are I/O requests belonging to the same segment but lying far apart, so the application assigns the two groups to two detection windows. The operation flow is as follows:
(1) A is the first I/O request; it is considered to be in a random model and is recorded at the corresponding position, generating the left and right axes of window A;
(2) X is the second I/O request; it does not hit window A, is considered to be in a random model, and is recorded at the corresponding position, generating the left and right axes of window X;
(3) C is the third I/O request; it hits window A, is considered to be in a sequential model, is recorded at the corresponding position, and the right axis of window A is updated;
(4) Z is the fourth I/O request; it hits window X, is considered to be in a sequential model, is recorded at the corresponding position, and the right axis of window X is updated;
(5) B is the fifth I/O request; it hits window A, is considered to be in a sequential model, and is recorded at the corresponding position;
(6) Y is the sixth I/O request; it hits window X, is considered to be in a sequential model, and is recorded at the corresponding position.
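Putting the pieces together, one plausible per-request classification over the multiple windows of a row is sketched below, again reusing the assumed helpers row_for and window_check; collision and window-eviction handling are omitted:

```c
enum io_model { IO_RANDOM, IO_SEQUENTIAL };

/* Classify one request: a hit in any detection window of its segment row means
 * sequential; a miss everywhere opens a new window and counts as random. */
static enum io_model classify_request(uint64_t start, uint64_t end,
                                      uint32_t T, uint32_t N)
{
    struct segment_row *row = row_for(start);
    struct io_window *spare = NULL;

    row->segment_id = start / SEGMENT_SIZE;     /* collision handling omitted */

    for (int i = 0; i < WINDOWS_PER_ROW; i++) {
        struct io_window *w = &row->windows[i];

        if (!w->used) {
            if (!spare)
                spare = w;                      /* remember a free window slot */
            continue;
        }
        if (window_check(w, start, end, T, N) != WINDOW_MISS)
            return IO_SEQUENTIAL;               /* e.g. C and B hit window A   */
    }

    if (spare) {                                /* e.g. A and X open new windows */
        spare->used        = 1;
        spare->left_axis   = start;
        spare->right_axis  = end;
        spare->max_gran    = (uint32_t)(end - start);
        spare->nr_recorded = 1;
    }
    return IO_RANDOM;
}
```

Replaying A, X, C, Z, B, Y through this routine yields the same judgments as steps (1) to (6) above, provided the two groups lie far enough apart that neither falls inside the other's sub-windows.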
By identifying the order of I/O records belonging to the same fixed segment through multiple sliding windows, the method and the device achieve smarter analysis and judgment, solve the local disorder introduced when an ordinary storage system dispatches sequential I/O requests across multiple threads, and greatly improve identification accuracy. The identification algorithm also has a wide range of application: it suits overall optimization schemes of ordinary storage systems, such as cache read-ahead and I/O request aggregation, further improving disk and cache utilization.
Example 2
The application also provides a concurrent I/O stream model identification system based on a multi-sliding window implementation for carrying out the above method. It comprises a processor and a memory, the memory storing a program which, when run by the processor, performs the following steps:
mapping, in units of a fixed segment, all received I/O requests that belong to the same segment to the same hash table row;
establishing a detection window in a controllable range with the recorded I/O request as a reference, the detection window sliding as I/O requests are recorded;
setting a plurality of detection windows in the same hash table row, and judging I/O requests that hit the same detection window to be in a sequential model.
The hash mapping of individual I/O requests is thus expanded into processing in units of fixed segments: all I/O requests of the same segment are mapped to the same hash table row, and an overall-ordered but locally unordered I/O model is then identified through multiple sliding windows, solving the local disorder introduced when an ordinary storage system dispatches sequential I/O requests across multiple threads. Specifically:
The program performs "mapping all received I/O requests that belong to the same segment to the same hash table row":
As shown in FIG. 2, A, B, C and D are four sequentially adjacent I/O requests, and the segment containing them runs from the start address of A to the end address of D. After multithreaded dispatch their order is scrambled, and the processing flow is as follows:
(1) B arrives first: the segment containing B is calculated and hash-mapped to the corresponding hash table row, which is searched for a record of the same segment; none is found, so B is considered to be in a random I/O model, and B is finally recorded at the corresponding position;
(2) A arrives second: the segment containing A is calculated and is the same fixed-length segment as B's, so the record of B is found, A is considered to be in a sequential model, and A is finally recorded at the corresponding position;
(3) D arrives third: the segment containing D is calculated and is the same segment as that of A and B; an existing record is found, so D is considered to be in a sequential model, and D is finally recorded at the corresponding position;
(4) C arrives last: the segment containing C is calculated and is the same segment as that of A, B and D; the record of D is found, so C is considered to be in a sequential model, and C is finally recorded at the corresponding position.
Since the length of one segment in a hash table row is typically hundreds of times the granularity of an I/O request, several adjacently received I/O requests may belong to the same segment yet still be essentially random, so belonging to the same segment cannot by itself be taken as sequential. Therefore, in the application, a detection window is established in a controllable range with the recorded I/O request as a reference; the window slides as I/O requests continue to be recorded, and an I/O request that hits the detection window is considered to be in a sequential model.
Specifically, when the program establishes a detection window in a controllable range with the recorded I/O request as a reference, the detection window sliding as I/O requests are recorded: because I/O request granularity is variable, the detection window is measured in numbers of I/O requests, and within the overall window the leftmost and rightmost recorded I/O requests serve as the left and right axes, dividing the whole window into left, middle and right sub-windows. For the sub-window width, the number of worker threads is set as T and an adjustable coefficient N is defined, N being a positive integer chosen according to actual conditions; then:
left/right sub-window width = MAX(maximum recorded I/O granularity, new I/O granularity) × T × N;
the following description is of one specific embodiment: when t=2, n=1, a set of three granularity I/O requests of 4k, 8k, and 16k are processed, the detection window operation is as shown in fig. 3, each cell represents a granularity of 4k, and the window operation is as follows:
(1) the first I/O request has 8k granularity; being the first request, it is recorded and the left and right axes of the window (denoted L and R respectively) are generated;
(2) the second I/O request has 8k granularity and lies beyond the right axis R; the right window 301, extending two 8k lengths to the right of the current right axis, is determined first, then the I/O request is judged a hit and recorded, and finally the right axis R is updated;
(3) the third I/O request has 4k granularity and hits the middle window 302; it is only recorded, and the left and right axes are not updated;
(4) the fourth I/O request has 4k granularity and lies beyond the right axis R; the right window 301, extending two 8k lengths to the right of the right axis R, is determined first, then the I/O request is judged a hit and recorded, and finally the right axis R is updated;
(5) the fifth I/O request has 16k granularity and lies beyond the right axis R; the right window 301, extending two 16k lengths to the right of the right axis R, is determined, then the I/O request is judged a hit and recorded, and the right axis R is updated.
Because the span of a hash table row belonging to one segment is relatively large, several sequential I/O streams may coexist within it, and misjudgment must then be reduced as far as possible. For I/O requests of the same segment, the system therefore further prevents misjudgment by allowing several windows within the segment.
The program sets a plurality of detection windows in the same hash table row, and I/O requests hitting the same detection window are judged to be in a sequential model, as shown in FIG. 4.
A, B, C and X, Y, Z are I/O requests belonging to the same segment but lying far apart, so the application assigns the two groups to two detection windows. The operation flow is as follows:
(1) A is the first I/O request; it is considered to be in a random model and is recorded at the corresponding position, generating the left and right axes of window A;
(2) X is the second I/O request; it does not hit window A, is considered to be in a random model, and is recorded at the corresponding position, generating the left and right axes of window X;
(3) C is the third I/O request; it hits window A, is considered to be in a sequential model, is recorded at the corresponding position, and the right axis of window A is updated;
(4) Z is the fourth I/O request; it hits window X, is considered to be in a sequential model, is recorded at the corresponding position, and the right axis of window X is updated;
(5) B is the fifth I/O request; it hits window A, is considered to be in a sequential model, and is recorded at the corresponding position;
(6) Y is the sixth I/O request; it hits window X, is considered to be in a sequential model, and is recorded at the corresponding position.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and not for limiting the same; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the application, and are intended to be included within the scope of the appended claims and description.

Claims (4)

1. A concurrent I/O stream model identification method based on a multi-sliding window implementation, characterized by comprising the following steps:
mapping, in units of a fixed segment, all received I/O requests that belong to the same segment to the same hash table row;
establishing a detection window in a controllable range with the recorded I/O request as a reference, the detection window sliding as I/O requests are recorded;
setting a plurality of detection windows in the same hash table row, and judging I/O requests that hit the same detection window to be in a sequential model,
wherein the step of "establishing a detection window in a controllable range with the recorded I/O request as a reference, the detection window sliding as I/O requests are recorded" comprises:
measuring the detection window in numbers of I/O requests and, within the overall window, taking the leftmost and rightmost recorded I/O requests as the left and right axes of the window, dividing the whole window into left, middle and right sub-windows; for the sub-window width, setting the number of worker threads as T and defining an adjustable coefficient N, N being a positive integer; then:
left/right sub-window width = MAX(maximum recorded I/O granularity, new I/O granularity) × T × N.
2. The concurrent I/O stream model identification method based on a multi-sliding window implementation of claim 1, wherein
if the current I/O request lies beyond the right axis and its mapped address falls within the right sub-window, i.e. within a few granularities to the right of the right axis, the current I/O request is judged to hit and is recorded, and the right axis is updated;
if the current I/O request hits the middle window, it is only recorded and the left and right axes are not updated.
3. A concurrent I/O stream model identification system based on a multi-sliding window implementation, characterized by comprising a processor and a memory, the memory storing a program which, when run by the processor, performs the following steps:
mapping, in units of a fixed segment, all received I/O requests that belong to the same segment to the same hash table row;
establishing a detection window in a controllable range with the recorded I/O request as a reference, the detection window sliding as I/O requests are recorded;
setting a plurality of detection windows in the same hash table row, and judging I/O requests that hit the same detection window to be in a sequential model,
wherein the step of "establishing a detection window in a controllable range with the recorded I/O request as a reference, the detection window sliding as I/O requests are recorded" comprises:
measuring the detection window in numbers of I/O requests and, within the overall window, taking the leftmost and rightmost recorded I/O requests as the left and right axes of the window, dividing the whole window into left, middle and right sub-windows; for the sub-window width, setting the number of worker threads as T and defining an adjustable coefficient N, N being a positive integer; then:
left/right sub-window width = MAX(maximum recorded I/O granularity, new I/O granularity) × T × N.
4. The concurrent I/O stream model identification system based on a multi-sliding window implementation of claim 3, wherein
if the current I/O request lies beyond the right axis and its mapped address falls within the right sub-window, i.e. within a few granularities to the right of the right axis, the current I/O request is judged to hit and is recorded, and the right axis is updated;
if the current I/O request hits the middle window, it is only recorded and the left and right axes are not updated.
CN201911371423.3A 2019-12-26 2019-12-26 Concurrent I/O stream model identification method and system based on multi-sliding window implementation Active CN111198659B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911371423.3A CN111198659B (en) 2019-12-26 2019-12-26 Concurrent I/O stream model identification method and system based on multi-sliding window implementation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911371423.3A CN111198659B (en) 2019-12-26 2019-12-26 Concurrent I/O stream model identification method and system based on multi-sliding window implementation

Publications (2)

Publication Number Publication Date
CN111198659A (en) 2020-05-26
CN111198659B (en) 2023-09-05

Family

ID=70744359

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911371423.3A Active CN111198659B (en) 2019-12-26 2019-12-26 Concurrent I/O stream model identification method and system based on multi-sliding window implementation

Country Status (1)

Country Link
CN (1) CN111198659B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102298556A (en) * 2011-08-26 2011-12-28 成都市华为赛门铁克科技有限公司 Data stream recognition method and device
CN103309966A (en) * 2013-06-04 2013-09-18 中国科学院信息工程研究所 Data flow point connection query method based on time slide windows
CN106294211A (en) * 2016-08-08 2017-01-04 浪潮(北京)电子信息产业有限公司 The detection method of a kind of multichannel sequential flow and device
US9612754B1 (en) * 2015-06-29 2017-04-04 EMC IP Holding Company LLC Data storage system with window allocation using window cache
CN106708865A (en) * 2015-11-16 2017-05-24 杭州华为数字技术有限公司 Method and device for accessing window data in stream processing system
CN108009111A (en) * 2016-11-01 2018-05-08 华为技术有限公司 Data flow connection method and device

Also Published As

Publication number Publication date
CN111198659A (en) 2020-05-26


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant