WO2022001133A1 - Method and system for improving soft copy read performance, terminal, and storage medium - Google Patents

Info

Publication number
WO2022001133A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
read
cache
reading
soft copy
Prior art date
Application number
PCT/CN2021/076951
Other languages
French (fr)
Chinese (zh)
Inventor
苏志恒
李文鹏
Original Assignee
苏州浪潮智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 苏州浪潮智能科技有限公司
Publication of WO2022001133A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24552 Database cache management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Definitions

  • the present application belongs to the technical field of distributed systems, and in particular relates to a method, system, terminal and storage medium for improving soft copy read performance.
  • the existing practice is: the read must first be associated with the source file, and the data is then read from the hard disk; when the snapshot data of the source file is read directly from disk, the data is not in the cache, and the upper-layer application must wait until the data has been read from disk successfully before the call returns.
  • reading data directly from the hard disk takes 50 times as long as reading it from the cache, which seriously affects the read performance and data transmission efficiency of the storage system. Reading data directly from disk becomes the bottleneck for the read bandwidth and read performance of distributed systems.
  • the present application provides a method, system, terminal and storage medium for improving soft copy read performance, so as to solve the above technical problems.
  • the present application provides a method for improving soft copy read performance, including:
  • the data in the cache is moved according to the data cache elimination algorithm.
  • the method also includes:
  • the data read range patterns are automatically collected and a cache elimination model is adapted; the read data is moved to the corresponding cache elimination queue according to the model match, and the data in the default cache elimination queue is moved into the corresponding cache elimination queue.
  • calculating the data segments involved in the pre-read data according to the pre-reading algorithm and the read range includes:
  • cyclically pre-reading the corresponding source file data according to the data segments and placing it in the cache includes:
  • the present application provides a system for improving soft copy read performance, including:
  • the file reading unit is configured to read the soft copy file and associate it with the source file data within the read range;
  • the data pre-reading unit is configured to calculate the data segments involved in the pre-reading data according to the pre-reading algorithm and the read range;
  • a cache writing unit configured to cyclically pre-read the corresponding source file data according to the data segment and put it into the cache
  • the cache elimination unit is configured to move the data in the cache according to the data cache elimination algorithm.
  • a terminal including:
  • a processor and a memory, wherein
  • the memory is used to store a computer program, and
  • the processor is used to call and run the computer program from the memory, so that the terminal executes the above-mentioned method.
  • a computer-readable storage medium is provided, and instructions are stored in the computer-readable storage medium, which, when executed on a computer, cause the computer to perform the methods described in the above aspects.
  • the present application provides a method, system, terminal and storage medium for improving soft copy read performance.
  • through the distributed system pre-reading algorithm and the self-learning technology of the cache module elimination mechanism, pre-reading of soft copy files is realized, and the read performance of soft copy files and the system throughput are improved.
  • FIG. 1 is a schematic flowchart of a method according to an embodiment of the present application.
  • FIG. 2 is a schematic block diagram of a system according to an embodiment of the present application.
  • FIG. 3 is a schematic structural diagram of a terminal according to an embodiment of the present application.
  • the execution body of FIG. 1 may be a system for improving soft copy read performance.
  • the method 100 includes:
  • Step 110 read the soft copy file and associate it with the source file data within the read range
  • Step 120 calculating the data segments involved in the pre-reading data according to the pre-reading algorithm and the read range;
  • Step 130 cyclically pre-read the corresponding source file data according to the data segment and put it into the cache;
  • Step 140 Move the data in the cache according to the data cache elimination algorithm.
  • the method further includes:
  • the data read range patterns are automatically collected and a cache elimination model is adapted; the read data is moved to the corresponding cache elimination queue according to the model match, and the data in the default cache elimination queue is moved into the corresponding cache elimination queue.
  • calculating the data segments involved in the pre-read data according to the pre-reading algorithm and the read range includes:
  • cyclically pre-reading the corresponding source file data according to the data segments and placing it in the cache includes:
  • to facilitate understanding of the present application, the method for improving soft copy read performance provided by the present application is further described below, based on its principle and in combination with the process of performing soft copy reads on a distributed system in the embodiment.
  • the method for improving soft copy read performance includes:
  • when the client reads the soft copy file, it first associates the read range (OFFSET to OFFSET+SIZE) with the corresponding source file snapshot data segment or data segment information, and then reads the data. If the read hits the cache, the data is read directly from the cache and returned, and the cached data that has been read is put into the default cache elimination queue Q1.
  • if the read data is not in the cache, the data is read directly from disk and then returned. During the read, the cache module automatically collects the data read range patterns and adapts the corresponding cache elimination mode; once the cache elimination mode has been selected, the data in Q1 is moved into the corresponding cache elimination queue Q2. After the read completes, the pre-reading algorithm of the cache module is called with the offset OFFSET of the currently read data and the length SIZE of the read data, and it calculates the data segments involved in the subsequent pre-read data. The source file corresponding to each pre-read data segment is then found, and the data of each data segment is read asynchronously in a loop and loaded into the cache. After the next read hits it, the data is put into the corresponding cache elimination queue for cache update.
  • the implementation of the client program mainly includes the following steps:
  • the read data is returned to the client.
  • the cache module handler mainly includes the following steps:
  • if not, go to step (5);
  • judge whether the request is a pre-read; if it is, go to step (6);
  • if it is not a pre-read, go to step (8);
  • self-learn from the data read ranges and match the corresponding cache elimination model; after a successful match, move the cached data.
  • the system 200 includes:
  • a file reading unit 210 configured to read the soft copy file and associate it with the source file data within the read range
  • the data pre-reading unit 220 is configured to calculate and obtain the data segments involved in the pre-reading data according to the pre-reading algorithm and the reading range;
  • the cache writing unit 230 is configured to cyclically pre-read the corresponding source file data according to the data segment and put it into the cache;
  • the cache elimination unit 240 is configured to move the data in the cache according to the data cache elimination algorithm.
  • FIG. 3 is a schematic structural diagram of a terminal system 300 provided by an embodiment of the present application.
  • the terminal system 300 may be used to execute a method for improving soft copy read performance provided by an embodiment of the present application.
  • the terminal system 300 may include a processor 310, a memory 320 and a communication unit 330. These components communicate through one or more buses. Those skilled in the art will understand that the structure of the server shown in the figure does not limit the present application: it may be a bus structure or a star structure, and it may include more or fewer components than shown, combine certain components, or arrange the components differently.
  • the memory 320 can be used to store the execution instructions of the processor 310, and the memory 320 can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk or an optical disk.
  • the processor 310 is the control center of the storage terminal; it uses various interfaces and lines to connect the parts of the entire electronic terminal, and performs the various functions of the electronic terminal and/or processes data by running or executing the software programs and/or modules stored in the memory 320 and calling the data stored in the memory.
  • the processor may be composed of an integrated circuit (Integrated Circuit, IC for short), for example, may be composed of a single packaged IC, or may be composed of a plurality of packaged ICs connected with the same function or different functions.
  • the processor 310 may only include a central processing unit (Central Processing Unit, CPU for short).
  • the CPU may be a single computing core, or may include multiple computing cores.
  • the communication unit 330 is used to establish a communication channel so that the storage terminal can communicate with other terminals, receiving user data sent by other terminals or sending user data to other terminals.
  • the present application also provides a computer storage medium, wherein the computer storage medium can store a program, and when the program is executed, the program can include some or all of the steps in the embodiments provided in the present application.
  • the storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a random access memory (RAM) or the like.
  • the present application realizes the pre-reading of soft-copy files through the self-learning technology of the distributed system pre-reading algorithm and the cache module elimination mechanism, and improves the reading performance and system throughput of soft-copy files.
  • the technical effects that can be achieved in this embodiment can be found in the above description and will not be repeated here.
  • the computer software product is stored in a storage medium, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disk or another medium that can store program code, and includes several instructions that cause a computer terminal (which may be a personal computer, a server, a second terminal, a network terminal, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • the units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.

Abstract

A method and system for improving soft copy read performance, a terminal and a storage medium, the method comprising: reading a soft copy file and associating it with source file data within a read range; calculating, according to a pre-reading algorithm and the read range, the data segments involved in the pre-read data; cyclically pre-reading the corresponding source file data according to the data segments and placing it into a cache; and moving the data in the cache according to a data cache elimination algorithm. By means of a distributed system pre-reading algorithm and the self-learning technology of a cache module elimination mechanism, the soft copy file is pre-read, and the read performance of the soft copy file and the system throughput are improved.

Description

A method, system, terminal and storage medium for improving soft copy read performance
This application claims priority to Chinese patent application No. 202010598742.4, filed with the China Patent Office on June 28, 2020 and entitled "A method, system, terminal and storage medium for improving soft copy read performance", the entire contents of which are incorporated herein by reference.
Technical Field
The present application belongs to the technical field of distributed systems, and in particular relates to a method, system, terminal and storage medium for improving soft copy read performance.
Background
In the era of big data, with the file soft copy function as currently implemented, reading a soft copy file works as follows: the read must first be associated with the source file, and the data is then read from the hard disk. When the snapshot data of the source file is read directly from disk, the data is not in the cache, and the upper-layer application must wait until the data has been read from disk successfully before the call returns. Reading data directly from the hard disk takes 50 times as long as reading it from the cache, which seriously affects the read performance and data transmission efficiency of the storage system. Reading directly from disk thus becomes the bottleneck for the read bandwidth and read performance of the distributed system.
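The cost of a cache miss can be made concrete with a small calculation. The sketch below assumes only the 50x disk-to-cache ratio stated above; the function name `effective_read_latency`, the unit of "one cache read" and the sample hit ratios are illustrative assumptions, not figures from the application.

```python
def effective_read_latency(hit_ratio: float,
                           cache_latency: float = 1.0,
                           disk_factor: float = 50.0) -> float:
    """Average latency per read, in units of one cache read.

    disk_factor=50.0 reflects the ratio stated in the background section;
    hit_ratio is a hypothetical input, not a measured value.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    disk_latency = cache_latency * disk_factor
    # Weighted average of the fast path (cache) and the slow path (disk).
    return hit_ratio * cache_latency + (1.0 - hit_ratio) * disk_latency


if __name__ == "__main__":
    for ratio in (0.0, 0.5, 1.0):
        print(ratio, effective_read_latency(ratio))
```

With no cache hits every read costs 50 units, while a fully pre-read range costs 1 unit per read, which is the gap that pre-reading is meant to close.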
Summary of the Invention
In view of the above deficiencies of the prior art, the present application provides a method, system, terminal and storage medium for improving soft copy read performance, so as to solve the above technical problems.
In a first aspect, the present application provides a method for improving soft copy read performance, including:
reading a soft copy file and associating it with the source file data within the read range;
calculating, according to a pre-reading algorithm and the read range, the data segments involved in the pre-read data;
cyclically pre-reading the corresponding source file data according to the data segments and putting it into the cache;
moving the data in the cache according to a data cache elimination algorithm.
Further, the method also includes:
creating a default cache elimination queue and a corresponding cache elimination queue;
reading the source file data associated with the soft copy file, and determining the source of the read data:
if the data is read directly from the cache, putting the cached data that has been read into the default cache elimination queue;
if the data is read directly from the disk, then during the read, automatically collecting the data read range patterns and adapting a cache elimination model, moving the read data into the corresponding cache elimination queue according to the model match, and moving the data in the default cache elimination queue into the corresponding cache elimination queue.
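The interplay between the default queue and the matched queue can be sketched as follows. The application does not say how read range patterns are collected or which elimination (eviction) models exist, so the matching rule here (three back-to-back sequential reads select a model named "sequential") and all class and method names are hypothetical.

```python
from collections import OrderedDict


class EvictionQueues:
    """Minimal sketch of a default queue (Q1) and a matched queue (Q2)."""

    def __init__(self):
        self.q1 = OrderedDict()   # default cache elimination queue
        self.q2 = OrderedDict()   # queue for the matched elimination model
        self.history = []         # observed (offset, size) read ranges
        self.model = None         # name of the matched model, if any

    def record_hit(self, key, data):
        """Data read from the cache goes to the default queue Q1."""
        self.q1[key] = data
        self.q1.move_to_end(key)

    def record_disk_read(self, offset, size, key, data):
        """Collect the read range; once a model matches, fill Q2 and drain Q1."""
        self.history.append((offset, size))
        self.model = self._match_model()
        if self.model is not None:
            self.q2[key] = data
            while self.q1:
                old_key, old_data = self.q1.popitem(last=False)
                self.q2[old_key] = old_data

    def _match_model(self):
        """Hypothetical rule: the last three reads are contiguous."""
        recent = self.history[-3:]
        if len(recent) < 3:
            return None
        contiguous = all(a[0] + a[1] == b[0]
                         for a, b in zip(recent, recent[1:]))
        return "sequential" if contiguous else None
```

In this sketch a disk read that matches no model is not queued at all, which mirrors the text: the data is simply returned until a model has been selected.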
Further, calculating the data segments involved in the pre-read data according to the pre-reading algorithm and the read range includes:
calling the pre-reading algorithm and, based on the read range, passing in the offset of the currently read data and the length of the read data, to calculate the data segments involved in the pre-read data.
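One plausible shape for such a calculation is sketched below. The application only states that the offset and length are passed in; the fixed segment size, the look-ahead window of two segments and the optional file-size clamp are assumptions made for illustration.

```python
SEGMENT_SIZE = 4 * 1024 * 1024   # hypothetical segment size in bytes
PREFETCH_WINDOW = 2              # hypothetical number of segments to look ahead


def prefetch_segments(offset, size, segment_size=SEGMENT_SIZE,
                      window=PREFETCH_WINDOW, file_size=None):
    """Return the indices of the segments that follow [offset, offset + size)."""
    if offset < 0 or size <= 0:
        raise ValueError("offset must be >= 0 and size must be > 0")
    # Index of the segment that holds the last byte of the current read.
    last_read = (offset + size - 1) // segment_size
    first = last_read + 1
    candidates = list(range(first, first + window))
    if file_size is not None:
        # Drop segments that start beyond the end of the file.
        last_valid = (file_size - 1) // segment_size
        candidates = [i for i in candidates if i <= last_valid]
    return candidates
```

For example, with a segment size of 10 bytes, a read of 10 bytes at offset 5 ends in segment 1, so segments 2 and 3 are the pre-read candidates.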
Further, cyclically pre-reading the corresponding source file data according to the data segments and putting it into the cache includes:
finding the source file corresponding to each pre-read data segment, asynchronously reading the source file data of each data segment in a loop and loading it into the cache; when the data is next read directly from the cache, it is put into the default cache elimination queue for cache update.
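The asynchronous loop can be illustrated with Python's `asyncio`. This is a sketch under stated assumptions: the source file is modelled as an in-memory `bytes` object, the cache as a plain dictionary keyed by (source name, segment index), and `read_segment` is a hypothetical stand-in for the real disk read.

```python
import asyncio


async def read_segment(source, index, segment_size):
    """Stand-in for a disk read of one segment (hypothetical helper)."""
    await asyncio.sleep(0)  # yield control, as real I/O would
    start = index * segment_size
    return source[start:start + segment_size]


async def prefetch_into_cache(cache, source_name, source, indices, segment_size):
    """Issue one read per uncached segment and load the results into the cache."""
    todo = [i for i in indices if (source_name, i) not in cache]
    # All reads are started together and awaited as a group.
    results = await asyncio.gather(
        *(read_segment(source, i, segment_size) for i in todo)
    )
    for i, data in zip(todo, results):
        cache[(source_name, i)] = data
    return todo


if __name__ == "__main__":
    cache = {}
    loaded = asyncio.run(
        prefetch_into_cache(cache, "src", b"abcdefghij", [1, 2, 3], 3)
    )
    print(loaded, cache)
```

Segments already present in the cache are skipped, so repeated pre-read requests for the same range do not trigger duplicate reads.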
In a second aspect, the present application provides a system for improving soft copy read performance, including:
a file reading unit, configured to read the soft copy file and associate it with the source file data within the read range;
a data pre-reading unit, configured to calculate the data segments involved in the pre-read data according to the pre-reading algorithm and the read range;
a cache writing unit, configured to cyclically pre-read the corresponding source file data according to the data segments and put it into the cache;
a cache elimination unit, configured to move the data in the cache according to the data cache elimination algorithm.
In a third aspect, a terminal is provided, including:
a processor and a memory, wherein
the memory is used to store a computer program, and
the processor is used to call and run the computer program from the memory, so that the terminal executes the above-mentioned method.
In a fourth aspect, a computer-readable storage medium is provided. Instructions are stored in the computer-readable storage medium which, when run on a computer, cause the computer to perform the methods described in the above aspects.
The beneficial effects of the present application are as follows.
The method, system, terminal and storage medium for improving soft copy read performance provided by the present application realize pre-reading of soft copy files through the distributed system pre-reading algorithm and the self-learning technology of the cache module elimination mechanism, and improve the read performance of soft copy files and the system throughput.
In addition, the design principle of the present application is reliable, the structure is simple, and the application prospects are very broad.
Brief Description of the Drawings
In order to illustrate the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, a person of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 is a schematic flowchart of a method according to an embodiment of the present application.
FIG. 2 is a schematic block diagram of a system according to an embodiment of the present application.
FIG. 3 is a schematic structural diagram of a terminal according to an embodiment of the present application.
Detailed Description
In order to enable those skilled in the art to better understand the technical solutions in the present application, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are only some of the embodiments of the present application, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without creative effort shall fall within the scope of protection of the present application.
Key terms appearing in this application are explained below.
FIG. 1 is a schematic flowchart of a method according to an embodiment of the present application. The method of FIG. 1 may be executed by a system for improving soft copy read performance.
As shown in FIG. 1, the method 100 includes:
Step 110: read the soft copy file and associate it with the source file data within the read range;
Step 120: calculate the data segments involved in the pre-read data according to the pre-reading algorithm and the read range;
Step 130: cyclically pre-read the corresponding source file data according to the data segments and put it into the cache;
Step 140: move the data in the cache according to the data cache elimination algorithm.
Optionally, as an embodiment of the present application, the method further includes:
creating a default cache elimination queue and a corresponding cache elimination queue;
reading the source file data associated with the soft copy file, and determining the source of the read data:
if the data is read directly from the cache, putting the cached data that has been read into the default cache elimination queue;
if the data is read directly from the disk, then during the read, automatically collecting the data read range patterns and adapting a cache elimination model, moving the read data into the corresponding cache elimination queue according to the model match, and moving the data in the default cache elimination queue into the corresponding cache elimination queue.
Optionally, as an embodiment of the present application, calculating the data segments involved in the pre-read data according to the pre-reading algorithm and the read range includes:
calling the pre-reading algorithm and, based on the read range, passing in the offset of the currently read data and the length of the read data, to calculate the data segments involved in the pre-read data.
Optionally, as an embodiment of the present application, cyclically pre-reading the corresponding source file data according to the data segments and putting it into the cache includes:
finding the source file corresponding to each pre-read data segment, asynchronously reading the source file data of each data segment in a loop and loading it into the cache; when the data is next read directly from the cache, it is put into the default cache elimination queue for cache update.
To facilitate understanding of the present application, the method for improving soft copy read performance provided by the present application is further described below, based on its principle and in combination with the process of performing soft copy reads on a distributed system in the embodiment.
Specifically, the method for improving soft copy read performance includes:
First, the distributed system and the client program are run, and a soft copy file is created.
When the client reads the soft copy file, it first associates the read range (OFFSET to OFFSET+SIZE) with the corresponding source file snapshot data segment or data segment information, and then reads the data. If the read hits the cache, the data is read directly from the cache and returned, and the cached data that has been read is put into the default cache elimination queue Q1. If the data is not in the cache, it is read directly from disk and then returned. During the read, the cache module automatically collects the data read range patterns and adapts the corresponding cache elimination mode; once the cache elimination mode has been selected, the data in Q1 is moved into the corresponding cache elimination queue Q2. After the read completes, the pre-reading algorithm of the cache module is called with the offset OFFSET of the currently read data and the length SIZE of the read data, and it calculates the data segments involved in the subsequent pre-read data. The source file corresponding to each pre-read data segment is then found, and the data of each data segment is read asynchronously in a loop and loaded into the cache. After the next read hits it, the data is put into the corresponding cache elimination queue for cache update.
The implementation of the client program mainly includes the following steps:
(1) run the distributed system and the client program, and create a soft copy file;
(2) read the soft copy file and wait for the read data to be returned;
(3) associate the read range with the corresponding source file snapshot data segment or data segment information;
(4) read the data according to the associated data segment and wait for it to be returned;
(5) after the read completes, call the cache module pre-reading algorithm with the range of the currently read data to calculate the data segments involved in the pre-read data;
(6) associate the pre-read data segments with the source file data segments or snapshot data segments involved;
(7) issue asynchronous messages in a loop to pre-read the data of each data segment;
(8) after the pre-read messages have been issued, return the read data to the client.
The cache module handler mainly includes the following steps:
(1) after receiving a read message from the client, determine whether the data to be read is in the cache;
(2) if so, go to step (3);
(3) take the data out of the cache and put the cached data that has been read into the default cache elimination queue Q1; after a subsequent cache elimination model is matched successfully, move it into the corresponding cache elimination queue;
(4) if not, go to step (5);
(5) determine whether the request is a pre-read; if it is, go to step (6);
(6) issue an asynchronous data read operation and return; after the data has been read successfully, put it into the corresponding cache queue;
(7) if it is not a pre-read, go to step (8);
(8) issue a data read operation and wait for it to return;
(9) self-learn from the data read ranges and match the corresponding cache elimination model; after a successful match, move the cached data.
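The branching in steps (1) to (8) can be sketched as a single handler. This is a minimal illustration, not the application's implementation: the backing store is a dictionary, asynchronous completion is simulated by an explicit `complete_prefetches` call, step (9) is left out, and all names are hypothetical.

```python
from collections import OrderedDict


class CacheModule:
    """Sketch of the cache module's read handler."""

    def __init__(self, disk):
        self.disk = disk            # hypothetical backing store: key -> bytes
        self.cache = {}             # cached segments
        self.q1 = OrderedDict()     # default cache elimination queue Q1
        self.pending = []           # pre-reads issued but not yet completed

    def handle_read(self, key, is_prefetch=False):
        # Steps (1)-(3): on a cache hit, return the data and queue it in Q1.
        if key in self.cache:
            self.q1[key] = True
            self.q1.move_to_end(key)
            return self.cache[key]
        # Steps (5)-(6): a pre-read is issued asynchronously and returns at once.
        if is_prefetch:
            self.pending.append(key)
            return None
        # Steps (7)-(8): an ordinary read waits for the disk and returns the data.
        return self.disk[key]

    def complete_prefetches(self):
        """Simulate completion of the asynchronous reads issued in step (6)."""
        while self.pending:
            key = self.pending.pop(0)
            self.cache[key] = self.disk[key]
```

A typical sequence is an ordinary read of one segment, a pre-read request for the next segment, and then a later read of that next segment that hits the cache and lands in Q1.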
As shown in FIG. 2, the system 200 includes:
a file reading unit 210, configured to read the soft copy file and associate it with the source file data within the read range;
a data pre-reading unit 220, configured to calculate, according to the pre-read algorithm and the read range, the data segments covered by the pre-read data;
a cache writing unit 230, configured to cyclically pre-read the corresponding source file data according to the data segments and place it in the cache;
a cache elimination unit 240, configured to move the data in the cache according to the data cache elimination algorithm.
FIG. 3 is a schematic structural diagram of a terminal system 300 provided by an embodiment of the present application. The terminal system 300 may be used to execute the method for improving soft copy read performance provided by an embodiment of the present application.
The terminal system 300 may include a processor 310, a memory 320, and a communication unit 330. These components communicate through one or more buses. Those skilled in the art will understand that the structure of the server shown in the figure does not limit the present application: it may be a bus structure or a star structure, and it may include more or fewer components than shown, combine certain components, or arrange the components differently.
The memory 320 may be used to store instructions executable by the processor 310. The memory 320 may be implemented by any type of volatile or non-volatile storage device, or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk. When the instructions in the memory 320 are executed by the processor 310, the terminal 300 is enabled to perform some or all of the steps in the method embodiments described above.
The processor 310 is the control center of the storage terminal. It connects the various parts of the entire electronic terminal through various interfaces and lines, and performs the functions of the electronic terminal and/or processes data by running or executing the software programs and/or modules stored in the memory 320 and by calling the data stored in the memory. The processor may consist of integrated circuits (ICs), for example a single packaged IC, or several connected packaged ICs with the same or different functions. For example, the processor 310 may include only a central processing unit (CPU). In the embodiments of the present application, the CPU may have a single computing core or multiple computing cores.
The communication unit 330 is used to establish a communication channel so that the storage terminal can communicate with other terminals, receiving user data sent by other terminals or sending user data to them.
The present application also provides a computer storage medium. The computer storage medium may store a program which, when executed, may perform some or all of the steps in the embodiments provided in the present application. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a random access memory (RAM), or the like.
Therefore, by combining a distributed-system pre-read algorithm with self-learning of the cache module's elimination mechanism, the present application implements pre-reading of soft copy files and improves soft copy read performance and system throughput. For the technical effects achievable by this embodiment, see the description above; they are not repeated here.
Those skilled in the art will clearly understand that the techniques in the embodiments of the present application can be implemented by software together with the necessary general-purpose hardware platform. On that understanding, the technical solutions in the embodiments of the present application, in essence or in the part that contributes to the prior art, can be embodied as a software product. The computer software product is stored in a storage medium capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk, and includes several instructions that cause a computer terminal (which may be a personal computer, a server, a second terminal, a network terminal, or the like) to execute all or some of the steps of the methods described in the embodiments of the present application.
For parts that are the same or similar among the embodiments in this specification, the embodiments may be referred to one another. In particular, because the terminal embodiment is essentially similar to the method embodiment, it is described only briefly; for the relevant details, see the description of the method embodiment.
In the several embodiments provided in this application, it should be understood that the disclosed systems and methods may be implemented in other ways. For example, the system embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in an actual implementation. Multiple units or components may be combined or integrated into another system, and some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, systems, or units, and may be electrical, mechanical, or of another form.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
Although the present application has been described in detail with reference to the accompanying drawings and in conjunction with the preferred embodiments, the present application is not limited thereto. Without departing from the spirit and essence of the present application, those of ordinary skill in the art may make various equivalent modifications or substitutions to the embodiments of the present application, and all such modifications or substitutions shall fall within the scope of the present application. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed herein shall likewise be covered by the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (7)

  1. A method for improving soft copy read performance, comprising:
    reading a soft copy file and associating it with the source file data within the read range;
    calculating, according to a pre-read algorithm and the read range, the data segments covered by the pre-read data;
    cyclically pre-reading the corresponding source file data according to the data segments and placing it in a cache;
    moving the data in the cache according to a data cache elimination algorithm.
  2. The method for improving soft copy read performance according to claim 1, wherein the method further comprises:
    creating a default cache elimination queue and corresponding cache elimination queues;
    reading the source file data associated with the soft copy file, and determining the source of the data read:
    if the data is read directly from the cache, putting the cache data that has been read into the default cache elimination queue;
    if the data is read directly from disk, automatically collecting the pattern of data read ranges during the read and fitting it to a cache elimination model, moving the read data into the corresponding cache elimination queue according to the model match, and moving the data in the default cache elimination queue into the corresponding cache elimination queue.
  3. The method for improving soft copy read performance according to claim 1, wherein calculating, according to the pre-read algorithm and the read range, the data segments covered by the pre-read data comprises:
    calling the pre-read algorithm, passing in the offset and the length of the data currently being read as given by the read range, and calculating the data segments covered by the pre-read data.
  4. The method for improving soft copy read performance according to claim 2, wherein cyclically pre-reading the corresponding source file data according to the data segments and placing it in the cache comprises:
    locating the source file corresponding to each pre-read data segment, asynchronously reading the source file data of each data segment in a loop and loading it into the cache, and, when that data is next read directly from the cache, putting it into the default cache elimination queue for cache update.
  5. A system for improving soft copy read performance, comprising:
    a file reading unit, configured to read a soft copy file and associate it with the source file data within the read range;
    a data pre-reading unit, configured to calculate, according to a pre-read algorithm and the read range, the data segments covered by the pre-read data;
    a cache writing unit, configured to cyclically pre-read the corresponding source file data according to the data segments and place it in a cache;
    a cache elimination unit, configured to move the data in the cache according to a data cache elimination algorithm.
  6. A terminal, comprising:
    a processor;
    a memory for storing instructions executable by the processor;
    wherein the processor is configured to perform the method according to any one of claims 1-4.
  7. A computer-readable storage medium storing a computer program, wherein the program, when executed by a processor, implements the method according to any one of claims 1-4.
PCT/CN2021/076951 2020-06-28 2021-02-19 Method and system for improving soft copy read performance, terminal, and storage medium WO2022001133A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010598742.4 2020-06-28
CN202010598742.4A CN111858665B (en) 2020-06-28 2020-06-28 Method, system, terminal and storage medium for improving soft copy reading performance

Publications (1)

Publication Number Publication Date
WO2022001133A1 true WO2022001133A1 (en) 2022-01-06

Family

ID=72988600

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/076951 WO2022001133A1 (en) 2020-06-28 2021-02-19 Method and system for improving soft copy read performance, terminal, and storage medium

Country Status (2)

Country Link
CN (1) CN111858665B (en)
WO (1) WO2022001133A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111858665B (en) * 2020-06-28 2022-12-06 苏州浪潮智能科技有限公司 Method, system, terminal and storage medium for improving soft copy reading performance

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108959519A (en) * 2018-06-28 2018-12-07 郑州云海信息技术有限公司 A kind of method, apparatus and computer readable storage medium reading data
CN109947720A (en) * 2019-04-12 2019-06-28 苏州浪潮智能科技有限公司 A kind of pre-reading method of files, device, equipment and readable storage medium storing program for executing
CN111258967A (en) * 2020-02-11 2020-06-09 西安奥卡云数据科技有限公司 Data reading method and device in file system and computer readable storage medium
CN111858665A (en) * 2020-06-28 2020-10-30 苏州浪潮智能科技有限公司 Method, system, terminal and storage medium for improving soft copy reading performance

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6775745B1 (en) * 2001-09-07 2004-08-10 Roxio, Inc. Method and apparatus for hybrid data caching mechanism
CN107590278A (en) * 2017-09-28 2018-01-16 郑州云海信息技术有限公司 A kind of pre-reading method of files and relevant apparatus based on CEPH

Also Published As

Publication number Publication date
CN111858665A (en) 2020-10-30
CN111858665B (en) 2022-12-06

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21833693

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21833693

Country of ref document: EP

Kind code of ref document: A1