CN114168272A - Random-reading kernel IO optimization method during file caching and reading - Google Patents
- Publication number: CN114168272A
- Application number: CN202210131238.2A
- Authority: CN (China)
- Prior art keywords: file, reading, read, value, maximum
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/455—Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
- G06F9/45558—Hypervisor-specific management and integration aspects
- G06F2009/45579—I/O management, e.g. providing access to device drivers or storage
- G06F16/172—Caching, prefetching or hoarding of files
Abstract
The invention relates to a kernel IO optimization method for random reads when reading cached files. When the operating system reads a file, the kernel pre-read flow is entered; if the application has set, or the kernel has marked, the file as random mode, pre-reading is performed according to the following pre-read design. The file size is compared with the maximum IO size, and the file is read in different ways accordingly: when the file to be read is smaller than the maximum IO size and the read request is also smaller than the maximum IO size, the whole file is pre-read into the kernel page cache; when the file to be read is larger than the maximum IO size, the portion of the file to pre-read is set according to the file pointer and the amount of the file that remains unread. By pre-reading the file according to these rules once the random-read flow is entered, the invention improves the performance of random file reads and thereby the performance of the operating system.
Description
Technical Field
The invention relates to the random file reading mechanism in the field of Linux operating systems, and in particular to a kernel IO (input/output) optimization method for random reads when reading files through the cache.
Background
With the advent of the virtualization era, the operating system, as the soul of the virtual machine, is critically important, and performance is the key to virtualization: a high-performance operating system supports the development of virtualization. The file read/write performance of an operating system depends heavily on the length and characteristics of kernel IO. When the kernel reads a file through the cache, sequential reads trigger pre-reading (readahead): the content likely to be requested next is read into the cache ahead of time, so the next read is served directly from the cache, shortening the IO path. For random reads, however, the kernel reads only the requested amount, to avoid useless pre-reads. As a result, random-read performance can lag far behind sequential-read performance.
Disclosure of Invention
The main object of the invention is to provide a kernel IO optimization method for random reads when reading cached files. When the random-read flow is entered, the file is pre-read according to certain rules, improving the performance of random file reads and of the operating system. The method is provided as a kernel random pre-read interface that applications can use in a customized way, improving the usability of the operating system.
To accomplish the above object, the invention provides a kernel IO optimization method for random reads when reading cached files. When the operating system reads a file, the kernel pre-read flow is entered; if the application has set, or the kernel has marked, the file as random mode, pre-reading is performed according to the following pre-read design:
compare the file size with the maximum IO size, and read the file in different ways accordingly:
when the file to be read is smaller than the maximum IO size: if a single read request is smaller than the maximum IO size, pre-read the whole file into the kernel page cache;
when the file to be read is larger than the maximum IO size, set the portion of the file to pre-read according to the file pointer position and the amount of the file that remains unread:
if a single read request is larger than the maximum IO size, serve the request in n batches (n ≥ 2, n a natural number), where each of the first n−1 batches reads the maximum IO size; for the nth batch, compute the size of the file that remains unread: if it is smaller than the maximum IO size, set the nth batch's read amount to that remaining size; if it is larger than the maximum IO size, set the nth batch's read amount to the maximum IO size;
and if a single read request is smaller than the maximum IO size, enlarge the request to the maximum IO size.
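The sizing rules above can be sketched in user space as follows. This is an illustrative model of the design described in this disclosure, not kernel code; all names (`MAX_IO`, `preread_batches`) are our own:

```python
# Illustrative user-space model of the pre-read sizing rules described
# above; names (MAX_IO, preread_batches) are hypothetical.

MAX_IO = 2 * 1024 * 1024  # assumed per-IO maximum of 2M, as in the embodiment

def preread_batches(file_size: int, pos: int, request: int) -> list[int]:
    """Return the list of read sizes the pre-read design would issue."""
    remaining = file_size - pos  # unread part of the file
    if file_size < MAX_IO:
        # Small file: pre-read everything from pos to the end of the file.
        return [remaining]
    if request < MAX_IO:
        # Enlarge a small request to the per-IO maximum
        # (capped by what remains of the file).
        return [min(MAX_IO, remaining)]
    # Large request: n-1 full batches of MAX_IO, then a final batch
    # sized by whatever of the file remains unread at that point.
    batches = []
    while request > MAX_IO and remaining > MAX_IO:
        batches.append(MAX_IO)
        request -= MAX_IO
        remaining -= MAX_IO
    batches.append(min(MAX_IO, remaining))
    return batches
```

For example, an 11M request against an 11.5M file yields five 2M batches followed by a final 1.5M batch, matching the worked example later in the description.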
Preferably, the pre-read flow checks whether the current file pointer is in the last page of the file; if so, the pre-read design is skipped and the requested amount is read directly at the file pointer.
Further preferably, if the current file pointer is not in the last page of the file, the pre-read design is entered.
The invention has the beneficial effects that:
the method is equivalent to designing an in-core pre-reading interface for realizing RANDOM reading, when a file is read randomly, a file reading mode is marked as FMODE _ RANDOM (RANDOM mode), a pre-reading mechanism of RANDOM reading is entered, unread pages of the file are read into pagecache (in-core page cache) according to the pre-reading design, when the data is read next IO (input/output) and cannot be interrupted by missing pages, then the cache is filled from a disk, the cache can be directly hit and returned, IO (input/output) paths are shortened, the IO times are reduced by matching with adjustment bdi (Background dev info, rear-end equipment information) - > IO _ pages (the maximum number of the primary IO pages), and the performance is greatly improved. The invention can intuitively promote the random reading iops of mass files through fio test.
Drawings
The present invention will be described in further detail with reference to the accompanying drawings and specific embodiments.
FIG. 1 is an overall flow chart of the present invention.
Detailed Description
The technical solutions in the embodiments of the invention are described below clearly and completely with reference to the drawings. Numerous specific details are set forth to provide a thorough understanding of the invention; however, the invention may be practiced in ways other than those specifically described, as will be apparent to those of ordinary skill in the art, and is therefore not limited to the specific embodiments disclosed below.
As shown in FIG. 1, the invention provides a kernel IO optimization method for random reads when reading cached files. When the operating system reads a file, the kernel pre-read flow is entered; if the application has set, or the kernel has marked, the file as FMODE_RANDOM (random mode), pre-reading is performed according to the pre-read design. If neither the application nor the kernel has marked the file FMODE_RANDOM, the requested amount is read directly at the file pointer. Since in the prior art the maximum size of a single random-read IO is 2M, this embodiment also uses 2M as the maximum single-IO size.
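On stock Linux, an application can put an open file into random mode, which the kernel records as FMODE_RANDOM on the file, with `posix_fadvise(POSIX_FADV_RANDOM)`. The sketch below shows that call from Python; the file name is illustrative:

```python
# Marking a file as random-access from user space. On Linux,
# POSIX_FADV_RANDOM causes the kernel to set FMODE_RANDOM on the open
# file, the mark the random pre-read design keys on.
# The file path "data.bin" is illustrative.
import os

path = "data.bin"
with open(path, "wb") as f:       # create a small demo file
    f.write(os.urandom(8192))

fd = os.open(path, os.O_RDONLY)
try:
    # Advise the kernel that access to this file will be random.
    os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_RANDOM)
    data = os.read(fd, 4096)     # a read served under random mode
finally:
    os.close(fd)
os.remove(path)
```

On an unmodified kernel this advice simply disables sequential readahead; under the method described here, it would instead route reads through the random pre-read design.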
Specifically, in this embodiment the pre-read design is:
compare the file size with the maximum IO size (i.e. 2M), then read the file in different ways accordingly:
When the file to be read is smaller than the maximum IO size: if the read request is smaller than the maximum IO size, the request is enlarged to the current request plus the size of the remaining unread file, i.e. the whole file is pre-read into the kernel page cache. In other words, if the file is smaller than 2M, the current request covers everything from the current file position to the end of the file, so the entire file ends up in the kernel page cache; the next random-block read request is then likely to hit the cache, improving performance.
When the file to be read is larger than the maximum IO size, the portion of the file to pre-read is set according to the file pointer position and the amount of the file that remains unread. If a single read request is larger than the maximum IO size, the request is served in n batches (n ≥ 2, n a natural number): each of the first n−1 batches reads the maximum IO size; for the nth batch, the size of the remaining unread file is computed, and the nth batch's read amount is set to that remaining size if it is smaller than the maximum IO size, or to the maximum IO size otherwise. If a single read request is smaller than the maximum IO size, the request is enlarged to the maximum IO size. To illustrate, when the file is larger than 2M there are several cases:
If a single read request is larger than 2M, say 11M, the request is served in 6 reads (⌈11/2⌉ = 6, where 11 is the 11M request and 2 is the 2M maximum IO size). Each of the first 5 reads transfers 2M (the maximum IO size), leaving 1M of the request for the 6th read, which is less than 2M. At the 6th read, if the remaining unread file is at most 2M (e.g. the original file is 11.5M or 12M, so 1.5M or 2M remains unread after 5 reads), the remainder is read in one go, i.e. the remaining 1.5M or 2M is pre-read into the kernel page cache on the 6th read. If the remaining unread file is larger than 2M, the 6th read is set to 2M (e.g. the original file is 15M and the request is 11M; after 5 reads, 5M of the file remains, so the 6th read is set to 2M, the maximum IO size).
If a single read request is smaller than 2M, it is enlarged to 2M. For example, if the original file is 21M and the request is 1.5M (less than the 2M maximum IO size), the 1.5M request is enlarged to 2M.
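The batch arithmetic in the worked example can be checked directly; the numbers below are taken from the text (an 11M request, a 2M per-IO cap, an 11.5M file):

```python
# Checking the worked example: an 11M request with a 2M per-IO maximum
# is served in ceil(11/2) = 6 reads; the first 5 transfer 2M each.
from math import ceil

MAX_IO_MB = 2
request_mb = 11
n = ceil(request_mb / MAX_IO_MB)      # total number of reads
full_batches = n - 1                  # reads of exactly MAX_IO
served = full_batches * MAX_IO_MB     # 10M served by the full batches

# Final read for an 11.5M file: 11.5 - 10 = 1.5M remains unread,
# which is below the cap, so the 6th read is 1.5M.
file_mb = 11.5
last_read = min(MAX_IO_MB, file_mb - served)
```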
In this embodiment, the pre-read flow checks whether the current file pointer is in the last page of the file. If so, the pre-read design is skipped and the requested amount is read directly at the file pointer; otherwise the pre-read design is entered.
In this embodiment, since the random-read path reads at most one maximum IO size (2M) at a time, bdi->io_pages can be raised from the original 320 pages (i.e. 1.28M with 4K pages) to 2M to cooperate with the above strategy and improve random-read performance across the board. In addition, the kernel random pre-read interface is placed after congestion control: when IO congestion is detected, the pre-read returns immediately and stops.
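The page arithmetic behind that tuning, assuming the common 4K page size (bdi->io_pages counts pages, not bytes):

```python
# bdi->io_pages is counted in pages. With 4K pages, 320 pages is
# 1280 KB (the "1.28M" above), and a 2M per-IO cap needs 512 pages.
PAGE_SIZE = 4096                          # assumed 4K pages

old_io_pages = 320
old_bytes = old_io_pages * PAGE_SIZE      # 1,310,720 bytes = 1280 KB

target = 2 * 1024 * 1024                  # 2M per-IO maximum
new_io_pages = target // PAGE_SIZE        # 512 pages
```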
It is to be understood that the described embodiments are merely a few embodiments of the invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Claims (3)
1. A kernel IO optimization method for random reads when reading cached files, characterized in that the kernel pre-read flow is entered when the operating system reads a file, and pre-reading is performed according to a pre-read design if the application has set, or the kernel has marked, the file as random mode, the pre-read design being:
comparing the file size with the maximum IO size, and reading the file in different ways accordingly:
when the file to be read is smaller than the maximum IO size, if a single read request is smaller than the maximum IO size, pre-reading the whole file into the kernel page cache;
when the file to be read is larger than the maximum IO size, setting the portion of the file to pre-read according to the file pointer position and the amount of the file that remains unread:
if a single read request is larger than the maximum IO size, serving the request in n batches, wherein each of the first n−1 batches reads the maximum IO size, and for the nth batch the size of the remaining unread file is computed: if it is smaller than the maximum IO size, the nth batch's read amount is set to that remaining size; if it is larger than the maximum IO size, the nth batch's read amount is set to the maximum IO size, wherein n ≥ 2 and n is a natural number;
and if a single read request is smaller than the maximum IO size, enlarging the request to the maximum IO size.
2. The method of claim 1, characterized in that the pre-read flow determines whether the current file pointer is in the last page of the file, and if so, skips the pre-read design and reads the requested amount directly at the file pointer.
3. The method of claim 2, characterized in that if the current file pointer is not in the last page of the file, the pre-read design is entered.
Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN202210131238.2A | 2022-02-14 | 2022-02-14 | Random-reading kernel IO optimization method during file caching and reading
Publications (2)

Publication Number | Publication Date
---|---
CN114168272A | 2022-03-11
CN114168272B | 2022-04-19
Family ID: 80489800

Family Applications (1)

Application Number | Title | Priority Date | Filing Date | Status
---|---|---|---|---
CN202210131238.2A | Random-reading kernel IO optimization method during file caching and reading | 2022-02-14 | 2022-02-14 | Active

Country Status (1)

Country | Link
---|---
CN | CN114168272B
Citations (6)

Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
US20050154825A1 * | 2004-01-08 | 2005-07-14 | Fair Robert L. | Adaptive file readahead based on multiple factors
CN107590278A * | 2017-09-28 | 2018-01-16 | 郑州云海信息技术有限公司 | A CEPH-based file pre-reading method and related apparatus
CN109542361A * | 2018-12-04 | 2019-03-29 | 郑州云海信息技术有限公司 | A distributed storage system file reading method, system and related apparatus
CN111258967A * | 2020-02-11 | 2020-06-09 | 西安奥卡云数据科技有限公司 | Data reading method and device in a file system, and computer-readable storage medium
CN113377725A * | 2021-08-13 | 2021-09-10 | 苏州浪潮智能科技有限公司 | Pre-reading method and system for a kernel client, and computer-readable storage medium
WO2021238265A1 * | 2020-05-28 | 2021-12-02 | 广东浪潮智慧计算技术有限公司 | File pre-reading method, apparatus and device, and storage medium
Non-Patent Citations (1)

Title
---
吴峰光 (Wu Fengguang) et al., "A file prefetching algorithm supporting concurrent access streams" (一种支持并发访问流的文件预取算法), 《软件学报》 (Journal of Software) *
Similar Documents

Publication | Title
---|---
EP0848321B1 | Method of data migration
US5983324A | Data prefetch control method for main storage cache for protecting prefetched data from replacement before utilization thereof
US5522054A | Dynamic control of outstanding hard disk read requests for sequential and random operations
JP2004185349A | Update data writing method using journal log
US5201040A | Multiprocessor system having subsystems which are loosely coupled through a random access storage and which each include a tightly coupled multiprocessor
CN110673798B | Storage system and IO method and device for flushing data to disk
US6557088B2 | Information processing system and recording medium recording a program to cause a computer to execute steps
US11681623B1 | Pre-read data caching method and apparatus, device, and storage medium
WO2008020389A2 | Flash memory access circuit
CN110837480A | Processing method and device for cached data, computer storage medium, and electronic equipment
CN110413211A | Memory management method, electronic equipment, and computer-readable medium
CN110413413A | Data writing method, device, equipment, and storage medium
CN114168272B | Random-reading kernel IO optimization method during file caching and reading
CN117130792B | Processing method, device, equipment, and storage medium for cache objects
KR20060017816A | Method and device for transferring data between a main memory and a storage device
JP6944576B2 | Cache device, instruction cache, instruction processing system, data processing method, data processing device, computer-readable storage medium, and computer program
CN112136110A | Information processing apparatus, adjustment method, and adjustment program
WO2019206260A1 | Method and apparatus for reading file cache
US5835957A | System and method for a fast data write from a computer system to a storage system by overlapping transfer operations
US7571446B2 | Server, computer system, object management method, server control method, computer program
CN111858665B | Method, system, terminal, and storage medium for improving soft-copy reading performance
TWI416336B | NIC with shared buffer and method thereof
EP2677426A2 | Cyclic write area of a cache on a storage controller which can be adapted based on the amount of data requests
JP2526728B2 | Disk cache automatic usage method
CN115774621B | Request processing method, system, equipment, and computer-readable storage medium
Legal Events

Code | Title
---|---
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant