CN114168272A - Random-reading kernel IO optimization method during file caching and reading - Google Patents

Random-reading kernel IO optimization method during file caching and reading

Info

Publication number
CN114168272A
Authority
CN
China
Prior art keywords
file
reading
read
value
maximum
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210131238.2A
Other languages
Chinese (zh)
Other versions
CN114168272B (en)
Inventor
薛亚茅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Kirin Software Co Ltd
Original Assignee
Kirin Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kirin Software Co Ltd filed Critical Kirin Software Co Ltd
Priority to CN202210131238.2A priority Critical patent/CN114168272B/en
Publication of CN114168272A publication Critical patent/CN114168272A/en
Application granted granted Critical
Publication of CN114168272B publication Critical patent/CN114168272B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/172 Caching, prefetching or hoarding of files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G06F2009/45579 I/O management, e.g. providing access to device drivers or storage

Abstract

The invention relates to a kernel IO optimization method for random reads when reading files through the cache. When the operating system reads a file, the kernel pre-read flow is entered; if the application has set, or the kernel has marked, the file as random mode, pre-reading is performed according to the following pre-read design. The file size is compared with the single-IO maximum, and the file is read in different ways accordingly: when the file to be read is smaller than the single-IO maximum and the read request is also smaller than the single-IO maximum, the entire file is pre-read into the kernel page cache; when the file to be read is larger than the single-IO maximum, the portion of the file to pre-read is set according to the file pointer and the amount of the file left unread. By pre-reading files according to these rules once the random-read path is entered, the invention improves random file-read performance and thus the performance of the operating system.

Description

Random-reading kernel IO optimization method during file caching and reading
Technical Field
The invention relates to the random file-read mechanism in the field of Linux operating systems, and in particular to a kernel IO (input/output) optimization method for random reads during cached file reading.
Background
With the advent of the virtualization era, the operating system, as the soul of the virtual machine, is of great importance; performance is the key to virtualization, and a high-performance operating system helps virtualization develop. The file read/write performance of an operating system depends heavily on the length and characteristics of kernel IO. When the kernel reads a file through the cache, sequential reads trigger pre-reading: the content to be read next is fetched into the cache ahead of time, so the next read is served directly from the cache, shortening the IO path. For random reads, however, the kernel reads only the requested amount, in order to avoid useless pre-reading. As a result, random-read performance can lag far behind sequential-read performance.
Disclosure of Invention
The main object of the invention is to provide a kernel IO optimization method for random reads during cached file reading. When the random-read path is entered, files are pre-read according to a set of rules, improving random file-read performance and hence operating-system performance. The method is exposed as a kernel random pre-read interface, making it convenient for applications to use in a customized way and improving the usability of the operating system.
To achieve the above object, the invention provides a kernel IO optimization method for random reads during cached file reading. When the operating system reads a file, the kernel pre-read flow is entered; if the application has set, or the kernel has marked, the file as random mode, pre-reading is performed according to the following pre-read design:
Compare the file size with the single-IO maximum, and read the file in different ways accordingly:
When the file to be read is smaller than the single-IO maximum: if a single read request is smaller than the single-IO maximum, pre-read the entire file into the kernel page cache.
When the file to be read is larger than the single-IO maximum, set the portion of the file to pre-read according to the file pointer position and the amount of the file left unread:
If a single read request is larger than the single-IO maximum, serve the request in n batches, where each of the first n-1 batches reads the single-IO maximum. When the nth batch is reached, compute the size of the file still unread: if it is smaller than the single-IO maximum, the nth batch reads exactly that remaining amount; if it is larger than the single-IO maximum, the nth batch reads the single-IO maximum. Here n ≥ 2 and n is a natural number.
If a single read request is smaller than the single-IO maximum, update this read request to the single-IO maximum.
Preferably, during pre-reading it is determined whether the current file pointer is on the last page of the file. If it is, the pre-read design is not applied to the file, and the requested amount is read directly at the file pointer.
Further preferably, if the current file pointer is not on the last page of the file, the pre-read design is entered.
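The patent does not publish source code, but the sizing rule above can be sketched as a small C helper. The constant `MAX_IO` (2 MiB, per the embodiment below) and the function name `preread_len` are our illustrative choices, not names from the patent:

```c
#include <assert.h>
#include <stddef.h>

/* Single-IO maximum; 2 MiB in the embodiment described later. */
#define MAX_IO ((size_t)2 * 1024 * 1024)

/* Hypothetical helper (our naming): bytes to pre-read on one pass,
 * given the file size, the current file position, and the size of
 * the caller's read request. */
static size_t preread_len(size_t file_size, size_t pos, size_t request)
{
    size_t remaining = file_size - pos;   /* bytes of the file still unread */

    if (file_size < MAX_IO)
        return remaining;                 /* small file: cache all of it */

    /* Large file: a sub-maximum request is bumped up to one full IO,
     * and a larger request is served in full-IO batches, so every pass
     * issues min(MAX_IO, remaining) regardless of the request size. */
    (void)request;
    return MAX_IO < remaining ? MAX_IO : remaining;
}
```

Calling this once per batch until the request is satisfied reproduces the n-batch scheme: the first n-1 passes return `MAX_IO`, and the final pass is clamped to whatever part of the file is still unread.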
The invention has the beneficial effects that:
the method is equivalent to designing an in-core pre-reading interface for realizing RANDOM reading, when a file is read randomly, a file reading mode is marked as FMODE _ RANDOM (RANDOM mode), a pre-reading mechanism of RANDOM reading is entered, unread pages of the file are read into pagecache (in-core page cache) according to the pre-reading design, when the data is read next IO (input/output) and cannot be interrupted by missing pages, then the cache is filled from a disk, the cache can be directly hit and returned, IO (input/output) paths are shortened, the IO times are reduced by matching with adjustment bdi (Background dev info, rear-end equipment information) - > IO _ pages (the maximum number of the primary IO pages), and the performance is greatly improved. The invention can intuitively promote the random reading iops of mass files through fio test.
Drawings
The present invention will be described in further detail with reference to the accompanying drawings and specific embodiments.
FIG. 1 is an overall flow chart of the present invention.
Detailed Description
The technical solutions in the embodiments of the invention are described below clearly and completely with reference to the drawings. In the following description numerous specific details are set forth to provide a thorough understanding of the invention; however, the invention may be practiced in ways other than those specifically described here, as will be apparent to those of ordinary skill in the art without departing from its spirit, and the invention is therefore not limited to the specific embodiments disclosed below.
As shown in fig. 1, the invention provides a kernel IO optimization method for random reads during cached file reading. When the operating system reads a file, the kernel pre-read flow is entered; if the application has set, or the kernel has marked, the file as FMODE_RANDOM (random mode), pre-reading is performed according to the pre-read design. If the file is not marked FMODE_RANDOM, the requested amount is read directly at the file pointer. Since in the prior art the single-IO maximum for a random read is 2M, this embodiment likewise defines the single-IO maximum as 2M.
Specifically, in this embodiment the pre-read design is as follows:
The file size is compared with the single-IO maximum (i.e. 2M), and the file is then read in different ways:
When the file to be read is smaller than the single-IO maximum and the read request is smaller than the single-IO maximum, the read request is updated to the sum of the current request and the amount of the file left unread; that is, the entire file is pre-read into the kernel page cache. In other words, if the file is smaller than 2M, the current request is extended from the current file position to the end of the file, so the whole file is pre-read into the page cache at once. The next random-block read request then has a good probability of hitting the file cache, improving performance.
When the file to be read is larger than the single-IO maximum, the portion of the file to pre-read is set according to the file pointer position and the amount of the file left unread. If a single read request is larger than the single-IO maximum, the request is served in n batches: each of the first n-1 batches reads the single-IO maximum; at the nth batch the size of the file still unread is computed, and the nth batch reads either that remaining amount (if it is smaller than the single-IO maximum) or the single-IO maximum (if it is larger), where n ≥ 2 and n is a natural number. If a single read request is smaller than the single-IO maximum, the request is updated to the single-IO maximum. Concretely, when the file is larger than 2M there are several cases:
If the read request exceeds 2M, say 11M, the read is performed in 6 passes (11/2 rounded up, with 11 standing for the 11M request and 2 for the 2M single-IO maximum). Each of the first 5 passes reads 2M (the single-IO maximum). On the last (6th) pass the remaining request is 1M, below the 2M maximum. If the remaining unread part of the file is at most 2M (for example, a file of 11.5M or 12M has 1.5M or 2M left unread after 5 passes), the 6th pass reads that whole remainder at once, i.e. pre-reads the remaining 1.5M or 2M into the kernel page cache. If the remaining unread part is larger than 2M (for example, a 15M file with an 11M request still has 5M unread after 5 passes), the 6th read request is updated to 2M, the single-IO maximum.
If the read request is below 2M, it is updated to 2M. For example, with a 21M file and a 1.5M request (1.5M being less than the 2M single-IO maximum), the 1.5M request is updated to the 2M single-IO maximum.
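The worked examples above can be replayed mechanically. The helper below (our construction, purely for illustration; it assumes the request starts at file offset 0 and does not exceed the file size) counts the batches the design would issue and reports the size of the final one:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_IO ((size_t)2 * 1024 * 1024)  /* single-IO maximum, 2M */

/* Replay the batching rule for one large request.  Returns the number
 * of batches and stores the size of the final batch in *last. */
static size_t count_batches(size_t file_size, size_t request, size_t *last)
{
    size_t pos = 0, batches = 0, len = 0;

    while (pos < request) {
        size_t remaining_file = file_size - pos;
        size_t want = request - pos;

        if (want >= MAX_IO)
            len = MAX_IO;                              /* full-size batch */
        else                                           /* final, short batch */
            len = remaining_file <= MAX_IO ? remaining_file : MAX_IO;

        pos += len;
        batches++;
    }
    *last = len;
    return batches;
}
```

With an 11.5M file and an 11M request this yields 6 batches, the first 5 of 2M and the last of 1.5M; with a 15M file and the same 11M request, the 6th batch is capped at the 2M single-IO maximum, matching both cases in the example.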
In this embodiment, during pre-reading it is determined whether the current file pointer is on the last page of the file. If it is, the pre-read design is not applied, and the requested amount is read directly at the file pointer. If it is not, the pre-read design is entered.
In this embodiment, since a random read now issues at most the single-IO maximum of 2M per IO, bdi->io_pages can be raised from the original 320 pages (i.e. 1.28M) to 2M to work with the above strategy and improve random-read performance across the board. In addition, the kernel random pre-read interface is placed after congestion control, so pre-reading returns immediately and stops when IO congestion is encountered.
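The page arithmetic behind that tuning, assuming the common 4 KiB page size (an assumption on our part; the patent does not state the page size), works out as follows:

```c
#include <assert.h>
#include <stddef.h>

enum { PAGE_BYTES = 4096 };            /* assumed 4 KiB page size */

/* Convert a bdi->io_pages-style page count to bytes. */
static size_t pages_to_bytes(size_t pages)
{
    return pages * (size_t)PAGE_BYTES;
}
```

Here `pages_to_bytes(320)` is 1280 KiB, the old cap the text calls "1.28M", and the new 2M ceiling corresponds to 512 pages.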
It is to be understood that the described embodiments are merely a few embodiments of the invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Claims (3)

1. A kernel IO optimization method for random reads during cached file reading, characterized in that the kernel pre-read flow is entered when the operating system reads a file, and pre-reading is performed according to a pre-read design if the application has set, or the kernel has marked, the file as random mode, wherein the pre-read design is as follows:
comparing the file size with the single-IO maximum, and reading the file in different ways accordingly:
when the file to be read is smaller than the single-IO maximum, if a single read request is smaller than the single-IO maximum, pre-reading the entire file into the kernel page cache;
when the file to be read is larger than the single-IO maximum, setting the portion of the file to pre-read according to the file pointer position and the amount of the file left unread:
if a single read request is larger than the single-IO maximum, serving the request in n batches, each of the first n-1 batches reading the single-IO maximum; at the nth batch, computing the size of the file still unread, and setting the nth batch's read request to that remaining amount if it is smaller than the single-IO maximum, or to the single-IO maximum if it is larger, where n ≥ 2 and n is a natural number;
and if a single read request is smaller than the single-IO maximum, updating this read request to the single-IO maximum.
2. The kernel IO optimization method for random reads during cached file reading according to claim 1, characterized in that during pre-reading it is determined whether the current file pointer is on the last page of the file, and if it is, the pre-read design is not applied and the requested amount is read directly at the file pointer.
3. The method according to claim 2, characterized in that during pre-reading, if the current file pointer is not on the last page of the file, the pre-read design is entered.
CN202210131238.2A 2022-02-14 2022-02-14 Random-reading kernel IO optimization method during file caching and reading Active CN114168272B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210131238.2A CN114168272B (en) 2022-02-14 2022-02-14 Random-reading kernel IO optimization method during file caching and reading


Publications (2)

Publication Number Publication Date
CN114168272A (en) 2022-03-11
CN114168272B CN114168272B (en) 2022-04-19

Family

ID=80489800

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210131238.2A Active CN114168272B (en) 2022-02-14 2022-02-14 Random-reading kernel IO optimization method during file caching and reading

Country Status (1)

Country Link
CN (1) CN114168272B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050154825A1 (en) * 2004-01-08 2005-07-14 Fair Robert L. Adaptive file readahead based on multiple factors
CN107590278A (en) * 2017-09-28 2018-01-16 郑州云海信息技术有限公司 A kind of pre-reading method of files and relevant apparatus based on CEPH
CN109542361A (en) * 2018-12-04 2019-03-29 郑州云海信息技术有限公司 A kind of distributed memory system file reading, system and relevant apparatus
CN111258967A (en) * 2020-02-11 2020-06-09 西安奥卡云数据科技有限公司 Data reading method and device in file system and computer readable storage medium
CN113377725A (en) * 2021-08-13 2021-09-10 苏州浪潮智能科技有限公司 Pre-reading method and system of kernel client and computer readable storage medium
WO2021238265A1 (en) * 2020-05-28 2021-12-02 广东浪潮智慧计算技术有限公司 File pre-reading method, apparatus and device, and storage medium


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
吴峰光 (Wu Fengguang) et al., "一种支持并发访问流的文件预取算法" [A file prefetching algorithm supporting concurrent access streams], 《软件学报》 (Journal of Software) *

Also Published As

Publication number Publication date
CN114168272B (en) 2022-04-19

Similar Documents

Publication Publication Date Title
EP0848321B1 (en) Method of data migration
US5983324A (en) Data prefetch control method for main storage cache for protecting prefetched data from replacement before utilization thereof
US5522054A (en) Dynamic control of outstanding hard disk read requests for sequential and random operations
JP2004185349A (en) Update data writing method using journal log
US5201040A (en) Multiprocessor system having subsystems which are loosely coupled through a random access storage and which each include a tightly coupled multiprocessor
CN110673798B (en) Storage system and IO (input/output) tray-dropping method and device thereof
US6557088B2 (en) Information processing system and recording medium recording a program to cause a computer to execute steps
US11681623B1 (en) Pre-read data caching method and apparatus, device, and storage medium
WO2008020389A2 (en) Flash memory access circuit
CN110837480A (en) Processing method and device of cache data, computer storage medium and electronic equipment
CN110413211A (en) Memory management method, electronic equipment and computer-readable medium
CN110413413A (en) A kind of method for writing data, device, equipment and storage medium
CN114168272B (en) Random-reading kernel IO optimization method during file caching and reading
CN117130792B (en) Processing method, device, equipment and storage medium for cache object
KR20060017816A (en) Method and device for transferring data between a main memory and a storage device
JP6944576B2 (en) Cache device, instruction cache, instruction processing system, data processing method, data processing device, computer-readable storage medium and computer program
CN112136110A (en) Information processing apparatus, adjustment method, and adjustment program
WO2019206260A1 (en) Method and apparatus for reading file cache
US5835957A (en) System and method for a fast data write from a computer system to a storage system by overlapping transfer operations
US7571446B2 (en) Server, computer system, object management method, server control method, computer program
CN111858665B (en) Method, system, terminal and storage medium for improving soft copy reading performance
TWI416336B (en) Nic with sharing buffer and method thereof
EP2677426A2 (en) Cyclic write area of a cache on a storage controller which can be adapted based on the amount of data requests
JP2526728B2 (en) Disk cache automatic usage method
CN115774621B (en) Request processing method, system, equipment and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant