WO2016078259A1 - Streaming data reading method based on an embedded file system - Google Patents

Streaming data reading method based on an embedded file system

Info

Publication number
WO2016078259A1
WO2016078259A1 (PCT/CN2015/074082)
Authority
WO
WIPO (PCT)
Prior art keywords
task
read
data
subtask
streaming data
Prior art date
Application number
PCT/CN2015/074082
Other languages
English (en)
French (fr)
Inventor
陈君
吴京洪
李明哲
樊皓
叶晓舟
Original Assignee
中国科学院声学研究所
北京中科智网科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中国科学院声学研究所 (Institute of Acoustics, Chinese Academy of Sciences) and 北京中科智网科技有限公司
Priority to US15/527,323 priority Critical patent/US20170322948A1/en
Publication of WO2016078259A1 publication Critical patent/WO2016078259A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638Organizing or formatting or addressing of data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/17Details of further file system functions
    • G06F16/176Support for shared access to files; File sharing support
    • G06F16/1767Concurrency control, e.g. optimistic or pessimistic approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/14Details of searching files based on file metadata
    • G06F16/148File search processing
    • G06F16/152File search processing using file content signatures, e.g. hash values
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061Improving I/O performance
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0655Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • G06F3/0656Data buffering arrangements
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671In-line storage system
    • G06F3/0673Single storage device
    • G06F3/0674Disk device
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806Task transfer initiation or dispatching
    • G06F9/4843Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061Partitioning or combining of resources
    • G06F9/5066Algorithms for mapping a plurality of inter-dependent sub-tasks onto a plurality of physical CPUs

Definitions

  • the present invention relates to the field of data storage technologies, and in particular, to a streaming data reading method based on an embedded file system.
  • Embedded systems have limited resources and simple structures. Because of their specialized, dedicated nature, general-purpose operating systems and file systems are rarely used in embedded systems; instead, the file system is customized for the specific application scenario.
  • Embedded systems are applied very widely, and no single file system can serve them all, from embedded servers down to embedded set-top boxes; a suitable file system must be built according to the system's application environment and goals. Different file systems differ in how they manage disks and how they read and write data.
  • The most pressing problem in the prior art is achieving high-throughput, highly concurrent data reading.
  • The rate at which a file system reads data depends on the I/O performance of the underlying interface on the one hand and on the scheduling efficiency of the file system itself on the other.
  • The concurrency of a file system's data reading is determined by its internal scheduling mechanism.
  • The object of the present invention is to provide a high-throughput, highly concurrent data reading service for embedded streaming services; to this end, a streaming data reading method based on an embedded file system is proposed.
  • the present invention provides a streaming data reading method based on an embedded file system, the method comprising the following steps:
  • each subtask is responsible for reading a piece of physically contiguous data and caching it;
  • Data is taken out of the subtask cache and encapsulated in the streaming data format; each encapsulated block is submitted to the caller of the read task, and once submission is complete the current subtask is released and the next one is triggered;
  • When a request to read streaming data is received, the hash value of the requested file name is computed and looked up to determine whether the requested data exists on disk.
  • The parameters of a streaming data read request include the file name and the starting and ending offsets of the data to be read. After a new read task is created for the request, storage space is allocated for it, and the file-name hash and the starting and ending offsets of the data to be read are stored in that space, completing the initialization of the read task.
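The existence check and task initialization described above can be sketched in C. This is a minimal illustration, not the patent's implementation: the hash function (FNV-1a), the flat metadata table, and all identifiers here are hypothetical stand-ins for whatever the embedded file system actually uses.

```c
#include <stdint.h>
#include <stdlib.h>

/* Illustrative file-name hash (FNV-1a); the patent does not specify one. */
static uint32_t fnv1a(const char *s) {
    uint32_t h = 2166136261u;
    while (*s) { h ^= (uint8_t)*s++; h *= 16777619u; }
    return h;
}

typedef struct read_task {
    uint32_t name_hash;   /* hash of the requested file name */
    uint64_t start_off;   /* starting byte offset of the data to read */
    uint64_t end_off;     /* ending byte offset of the data to read */
} read_task_t;

/* Probe a (hypothetical) metadata table of known file-name hashes. */
static int file_exists(const uint32_t *table, size_t n, uint32_t h) {
    for (size_t i = 0; i < n; i++)
        if (table[i] == h) return 1;
    return 0;
}

/* Create and initialize a read task; NULL means the read request fails. */
static read_task_t *new_read_task(const uint32_t *table, size_t n,
                                  const char *name,
                                  uint64_t start, uint64_t end) {
    uint32_t h = fnv1a(name);
    if (!file_exists(table, n, h)) return NULL;
    read_task_t *t = malloc(sizeof *t);
    if (!t) return NULL;
    t->name_hash = h;
    t->start_off = start;
    t->end_off   = end;
    return t;
}
```

A request for a file whose name hash is absent from the table is rejected immediately, mirroring the "notify the user that the read request failed" branch.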
  • The task length is computed from the starting and ending offsets of the read task, and the read task is decomposed into multiple subtasks according to the on-disk location of the streaming data to be read; all subtasks are linked in a list and triggered one after another in order.
  • After each subtask starts, the starting sector and length of the streaming data it is to read are obtained first; memory is requested for that data according to its length; the disk from which the streaming data will be read is then determined from the starting sector; and finally the lower-layer interface is called to read the specified segment of streaming data from the specified disk.
  • After each subtask completes, the underlying interface sends a message notifying the file system that the current subtask succeeded or failed; upon receiving the success message, the file system takes the data out of the current subtask's cache.
  • When each subtask executes, memory is pre-allocated for the streaming data to be read, to cache the data read from disk; the length of the data each subtask reads must be an integer multiple of the disk sector size, and subtasks read from disk in asynchronous, non-blocking I/O mode.
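Because each subtask's read length must be a whole number of sectors, a byte range is expanded outward to sector boundaries before the lower-layer read is issued. A small sketch of that arithmetic, assuming a 512-byte sector for illustration (the real size is device-specific):

```c
#include <stdint.h>

#define SECTOR_SIZE 512u  /* illustrative; the actual sector size is device-specific */

/* Round the byte range [off, off+len) outward to whole sectors, producing
 * the first sector number and the sector count the subtask must read. */
static void to_sector_span(uint64_t off, uint64_t len,
                           uint64_t *first_sector, uint64_t *n_sectors) {
    uint64_t first = off / SECTOR_SIZE;
    uint64_t last  = (off + len + SECTOR_SIZE - 1) / SECTOR_SIZE; /* exclusive */
    *first_sector = first;
    *n_sectors = last - first;
}
```

The expansion guarantees the read length handed to the disk is a sector multiple even when the caller's offsets are not aligned.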
  • After the previous subtask completes successfully, a message is sent to the file system; upon receiving it, the file system copies the data from the subtask's buffer into newly allocated memory, encapsulates it in the streaming data format, submits it to the caller of the read task, and then triggers the next subtask, until all subtasks have finished.
  • A task not yet completed can be ended early by moving its end position forward; a task whose reading has already completed can read additional data by moving its end position backward.
  • The end offset of the read task can be changed as needed: if the new end offset precedes what the current subtask has already read, the update is ignored; otherwise the new end offset replaces the read-data end offset in the task parameters, and the subtasks are regenerated from the new end offset.
  • By decomposing tasks, the present invention guarantees that each subtask reads a segment of data that is both logically and physically contiguous, and it limits the amount of data a single subtask reads, improving reading efficiency;
  • The present invention also allows the user to change the end offset while data is being read, enriching the ways the user can operate, which is a significant advantage in streaming service scenarios.
  • FIG. 1 is a schematic flowchart of a streaming data reading method based on an embedded file system according to an embodiment of the present invention;
  • FIG. 2 is a message-driven flowchart of the embodiment shown in FIG. 1;
  • FIG. 3 is a flowchart of the read task of the embodiment shown in FIG. 1;
  • FIG. 4 is a schematic diagram of the subtask linked list of the embodiment shown in FIG. 1.
  • The embodiment of the present invention provides a streaming data reading method based on an embedded file system, addressing the insufficient data reading efficiency and concurrency of existing embedded streaming services.
  • Reading efficiency is improved, an asynchronous read mechanism guarantees highly concurrent reading of streaming data, and the user is allowed to change the end offset while data is being read, enriching the ways the user can operate in streaming service applications.
  • FIG. 1 is a schematic flowchart of a streaming data reading method based on an embedded file system according to an embodiment of the present invention
  • FIG. 2 is a message driven flowchart.
  • The embodiment of the present invention adopts an event-driven mechanism in which every event is carried by a message: starting a task, updating a task, processing read data, and ending a task are all driven by messages.
  • the embodiment of the present invention is described in detail below with reference to FIG. 1 and FIG. 2. As shown in FIG. 1, the method includes steps 101-104:
  • In step 101, a request to read streaming data is received.
  • A new read task is created for the request, storage space is allocated for it, and the related parameters are initialized.
  • the message receiver is responsible for receiving all messages, judging the received messages, and responding according to the message type.
  • the message types include starting tasks, updating tasks, processing read data, and ending tasks.
  • When a user's call to the file system's read interface succeeds, the file system issues a start message.
  • The file system then executes the first branch of FIG. 2, "start task", to start the task.
  • a read task is created for the new request.
  • The determination works as follows: the hash value of the requested file name is computed and looked up. If it is found, the requested streaming data exists on disk, and a new read task is created immediately for the request, with storage space allocated and the related parameters initialized; if the requested streaming data does not exist on disk, the user is notified that the read request failed.
  • The parameters of a streaming data read request include the file name and the starting and ending offsets of the data to be read; after a new read task is created, memory is allocated for it, and the file-name hash and the starting and ending offsets of the data to be read are stored in the task space, completing task initialization.
  • The read task is decomposed into multiple subtasks, each responsible for reading and caching one physically contiguous segment of data;
  • The file system obtains the metadata of the requested file and, combining it with the on-disk location of the requested streaming data, divides the read task according to the starting offset and length of the data to be read.
  • The resulting subtasks are logically continuous: each subtask reads a segment of data that is both logically and physically contiguous, while the data read by adjacent subtasks is not necessarily physically contiguous.
  • The starting offset and task length of the current read task are extracted, and the file index information for the streaming data to be read is queried to obtain the disk locations where the data is stored. From the task length, the starting offset, and those disk locations, the read task is decomposed into several subtasks, each reading a segment of data that is both logically and physically contiguous, with a length that is an integer multiple of the sector size. The data read by adjacent subtasks is logically continuous but may be physically discontinuous, because a stream is often not stored contiguously on disk; the purpose of dividing subtasks is to guarantee that each read fetches one physically contiguous segment from the disk.
  • To keep reading efficient, the data length of a subtask is limited; a single subtask should not read too much data.
  • The subtask information is stored in a linked list. Each node holds the starting sector of the data its subtask reads and the length of that data, expressed as a number of sectors.
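The decomposition into a subtask list can be sketched as follows. The extent map, the node layout, and the per-subtask cap are illustrative assumptions; a real file system would derive the extents from its own index metadata.

```c
#include <stdint.h>
#include <stdlib.h>

/* One node per subtask, chained in trigger order (illustrative layout). */
typedef struct subtask {
    uint32_t disk_no;        /* disk holding this extent */
    uint64_t start_sector;   /* first sector to read */
    uint64_t n_sectors;      /* read length, expressed in sectors */
    struct subtask *next;    /* subtasks are triggered in list order */
} subtask_t;

/* A physically contiguous extent of the file on disk (hypothetical). */
typedef struct { uint32_t disk_no; uint64_t start_sector, n_sectors; } extent_t;

/* Build the subtask chain from the file's extent list, capping each
 * subtask at max_sectors so no single read is too long. */
static subtask_t *build_subtasks(const extent_t *ext, size_t n_ext,
                                 uint64_t max_sectors) {
    subtask_t head = {0}, *tail = &head;
    for (size_t i = 0; i < n_ext; i++) {
        uint64_t done = 0;
        while (done < ext[i].n_sectors) {
            uint64_t take = ext[i].n_sectors - done;
            if (take > max_sectors) take = max_sectors;
            subtask_t *s = calloc(1, sizeof *s);
            if (!s) return head.next;  /* sketch only: partial list on OOM */
            s->disk_no = ext[i].disk_no;
            s->start_sector = ext[i].start_sector + done;
            s->n_sectors = take;
            tail->next = s;
            tail = s;
            done += take;
        }
    }
    return head.next;
}
```

Each node covers one physically contiguous span, so adjacent nodes are logically consecutive even when their sectors (or disks) are not.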
  • The length a subtask is to read is computed from its sector count and the sector size, and memory of that length is requested for the subtask to cache the data read from disk. The disk storing the subtask's streaming data is then found from the starting sector number, and the lower-layer interface is called with the disk number, starting sector number, sector count, cache address, and other parameters, so the specified data can be read from the specified disk.
  • In step 103, data is taken out of the subtask cache and encapsulated in the streaming data format.
  • Each encapsulated block is submitted to the caller of the read task; once submission is complete, the current subtask is released and the next one is triggered;
  • The file system itself triggers the first subtask.
  • The file system first obtains the subtask parameters, including the starting sector number of the data to read and the number of sectors to read. The amount of data the subtask is to read is computed from the sector size and sector count, memory of that size is requested to cache the data, and the number of the disk where the subtask's starting sector resides is computed.
  • The lower-layer read interface is then called to read data from the specified disk, passing the disk number, starting sector number, sector count, and other parameters.
  • The lower-layer interface sends a message reporting that the subtask completed successfully; upon receiving it, the message receiver identifies it as a subtask-completion notification.
  • The file system then executes the third branch of FIG. 2, "process read data", the core flow of the whole read task: each message that the previous subtask completed successfully triggers the next subtask, and this loop continues until all subtasks have executed or one fails.
  • Subtasks read from disk in asynchronous, non-blocking I/O mode, returning immediately after the lower-layer interface is called without blocking during I/O.
  • This mechanism suits multi-core cooperation and benefits highly concurrent multi-tasking and efficient reading of streaming data. Once the data for a subtask has been fully read, the underlying interface sends a message reporting whether the subtask completed successfully. Upon receiving the success message, the file system takes the data out of the subtask cache and encapsulates it in the streaming data format.
  • Each encapsulated block is submitted to the caller of the current read task until all data read by the subtask has been submitted or the remainder is too short to submit; an insufficient remainder is cached temporarily and, after the next subtask reads its data from disk, is taken out, encapsulated, and submitted with that data.
  • FIG. 3 is a flow chart of a read task of the embodiment of the invention shown in FIG. 1.
  • This flow processes the read data, that is, encapsulates it in the streaming data format; the content length of each block after encapsulation is a fixed value that depends on the specific streaming service scenario.
  • Data read by a subtask may have a remainder after encapsulation in the streaming data format. If the remainder is too short to be encapsulated into one block of streaming data and submitted to the user, it is cached, and the data is encapsulated after the next subtask completes; this repeats until all subtasks have finished.
  • After all subtasks have finished, the data remaining after encapsulation may still be too short to form a final standard block; because it is the last piece of data of the entire read task and no further data follows, this last block is submitted to the user even though it is shorter than a standard block.
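The encapsulation step, with its carried-over remainder and short final block, can be sketched as a small packetizer. The fixed packet size (188 bytes, as in MPEG-TS) and all names here are assumptions for illustration; the patent only says the block length is a fixed, scenario-dependent value.

```c
#include <stddef.h>
#include <string.h>

#define PKT_SIZE 188  /* illustrative, e.g. an MPEG-TS packet; the real value is service-specific */

typedef void (*submit_fn)(const unsigned char *pkt, size_t len);

typedef struct {
    unsigned char carry[PKT_SIZE]; /* leftover bytes from the previous subtask */
    size_t carry_len;
} packetizer_t;

/* Feed one subtask's buffer to the packetizer. 'last' marks the final
 * subtask of the whole read task, whose short tail is still submitted.
 * Returns the number of blocks submitted to the caller. */
static size_t packetize(packetizer_t *p, const unsigned char *buf, size_t len,
                        int last, submit_fn submit) {
    size_t sent = 0;
    unsigned char pkt[PKT_SIZE];
    while (p->carry_len + len >= PKT_SIZE) {
        size_t need = PKT_SIZE - p->carry_len;
        memcpy(pkt, p->carry, p->carry_len);
        memcpy(pkt + p->carry_len, buf, need);
        submit(pkt, PKT_SIZE);
        sent++;
        buf += need;
        len -= need;
        p->carry_len = 0;
    }
    memcpy(p->carry + p->carry_len, buf, len);  /* too short: carry to next subtask */
    p->carry_len += len;
    if (last && p->carry_len > 0) {             /* final short block still goes out */
        submit(p->carry, p->carry_len);
        sent++;
        p->carry_len = 0;
    }
    return sent;
}
```

Between subtasks the remainder sits in `carry`; only when `last` is set is a short tail submitted as-is, matching the end-of-task behavior described above.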
  • The user can change the end offset of the read task as needed: if the user finds that only part of the data is needed rather than the whole file, the task end offset can be moved forward by calling the interface the embedded file system provides for updating task parameters.
  • After the interface is called, the file system sends an update-task message; when the message receiver gets it, the file system executes the second branch of FIG. 2, "update task".
  • If the new task end offset is smaller than the original task end offset, the update moves the end forward, that is, the task ends early.
  • The file system obtains the data offset read by the current subtask. If the new task end offset is smaller than that offset, the update cannot be completed and the request is simply ignored; if it is larger, the read-data end offset in the task parameters is replaced with the new one, the subtasks are regenerated from the new end offset, and the subtask list is updated.
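The accept-or-ignore rule for an end-offset update reduces to a single comparison. A minimal sketch, with hypothetical field names; rebuilding the subtask list after an accepted update is left to the caller:

```c
#include <stdint.h>

typedef struct {
    uint64_t end_off;    /* current task end offset */
    uint64_t read_off;   /* data offset already read by the current subtask */
} task_state_t;

/* Returns 1 if the update is accepted (caller must regenerate the subtask
 * list from the new end offset), 0 if the request is ignored. */
static int update_end_offset(task_state_t *t, uint64_t new_end) {
    if (new_end < t->read_off)   /* already read past it: cannot honor */
        return 0;
    t->end_off = new_end;        /* shrink (early finish) or grow (append data) */
    return 1;
}
```

The same function covers both directions: moving the end forward finishes the task early, moving it backward appends more data to read.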
  • In step 104, when all subtasks have completed successfully, normal completion is reported to the task caller, and the file system waits for the caller to end the current read task.
  • When a subtask fails to execute, processing of read data errs, or a task update errs, the file system actively reports the exception to the user.
  • When all subtasks complete successfully, the file system reports to the user that the read task completed normally.
  • After receiving the file system's exception or completion report, the user actively ends the task.
  • The interface for ending a task is also implemented by the file system for the user to call; in principle, the user may actively end a read task at any time.
  • A subtask is considered finished only after its read data has been encapsulated and submitted. When a subtask ends, its task space and data space are released: releasing the task space deletes the current head node of the subtask list, and the data space is the memory the subtask requested to cache the read data. Only after the previous subtask completes successfully can the next one be triggered. If a subtask fails, the file system, on receiving the failure message, actively reports the task exception to the task caller; when all subtasks have completed successfully, the file system reports normal completion to the task caller and waits for the caller to end the current read task.
  • The task caller can call the interface function the file system provides to end the task actively, even while the task is in progress. The embodiment of the present invention also supports updating task parameters midway: a task not yet completed can be ended early by moving its end offset forward, and a task whose reading has completed can read additional data by moving its end offset backward.
  • The method gives users flexible and varied ways of operating and suits many streaming data application scenarios.
  • Each node in the linked list represents one subtask, and the node holds the subtask parameters, such as the starting sector number, sector count, and disk number.
  • The task list is generated when the task starts.
  • Each time a subtask finishes executing, the head node is released and the "current subtask" pointer is moved to the next subtask.
  • The nodes in the dashed box in FIG. 4 are subtasks that have already finished executing.
  • Each time a subtask is triggered, its parameters are obtained through the "current subtask" pointer, which always points to the head node of the task list.
  • After the task end offset is updated, the pre-update task list is deleted first, and a new task list is recomputed and generated from the new end offset and the current task state.
  • By decomposing the read task, the embodiment of the present invention guarantees that each subtask reads a segment of data that is both logically and physically contiguous, and it limits the amount a single subtask reads, improving reading efficiency.
  • The asynchronous mechanism returns immediately after the lower-layer read interface is called, with no blocking while data is read, and it supports multi-core cooperation.
  • After a subtask executes successfully, the lower-layer interface sends a message reporting its success; this message drives the next subtask, which may be executed by another core, guaranteeing highly concurrent reading of streaming data.
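The message-driven control that ties the steps together (every event arrives as a message; the receiver dispatches on its type) can be sketched as a small dispatcher. The enum values and handler strings are illustrative names, not part of the patent text:

```c
/* The four message types described in the embodiment: start task,
 * update task, process read data (subtask completion), end task. */
typedef enum {
    MSG_START_TASK,
    MSG_UPDATE_TASK,
    MSG_READ_DONE,
    MSG_END_TASK
} msg_type_t;

typedef struct { msg_type_t type; void *payload; } msg_t;

/* The message receiver examines each message and picks the branch to
 * run; here each branch is represented by its name only. */
static const char *dispatch(const msg_t *m) {
    switch (m->type) {
    case MSG_START_TASK:  return "start task";        /* create read task, split subtasks */
    case MSG_UPDATE_TASK: return "update task";       /* adjust end offset, rebuild list */
    case MSG_READ_DONE:   return "process read data"; /* encapsulate, trigger next subtask */
    case MSG_END_TASK:    return "end task";          /* release task and data space */
    }
    return "unknown";
}
```

In the described design, a `MSG_READ_DONE` handler is what chains subtasks: each completion message triggers the next node of the subtask list, possibly on another core.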

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Human Computer Interaction (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Library & Information Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

A streaming data reading method based on an embedded file system, the method comprising: receiving a request to read streaming data and, when the requested streaming data exists on disk, creating a new read task for the request, allocating storage space for it, and initializing the related parameters (101); decomposing the read task into multiple subtasks, each responsible for reading and caching one physically contiguous segment of data (102); taking data out of the subtask cache, encapsulating it in the streaming data format, submitting each encapsulated block to the caller of the read task, and, once submission is complete, releasing the current subtask and triggering the next one (103); and, when all subtasks have completed successfully, reporting normal completion to the task caller and waiting for the caller to end the current read task (104). The method supports high-throughput, highly concurrent reading of streaming data and effectively alleviates poor server concurrency when large numbers of users access hot data.

Description

Streaming data reading method based on an embedded file system

Technical Field
The present invention relates to the field of data storage technology, and in particular to a streaming data reading method based on an embedded file system.
Background Art
With the rapid development of the Internet and the multimedia industry, storage technologies and storage systems have also advanced quickly. These storage systems provide convenient, fast, and efficient storage and access services for massive amounts of Internet information and multimedia data.
Embedded systems have limited resources and simple structures. Because of their specialized, dedicated nature, general-purpose operating systems and file systems are rarely used in embedded systems; instead, the file system is customized for the specific application scenario. Embedded systems are applied very widely, and no single file system can dominate them all, from embedded servers down to embedded set-top boxes; a suitable file system must be built according to the system's application environment and goals. Different file systems differ in how they manage disks and how they read and write data, and the most pressing problem in the prior art is achieving high-throughput, highly concurrent data reading.
The rate at which a file system reads data depends on the I/O performance of the underlying interface on the one hand and on the scheduling efficiency of the file system itself on the other; the concurrency of data reading is determined by the internal scheduling mechanism.
Summary of the Invention
The object of the present invention is to provide a high-throughput, highly concurrent data reading service for embedded streaming services; to this end, a streaming data reading method based on an embedded file system is proposed.
To achieve the above object, the present invention provides a streaming data reading method based on an embedded file system, comprising the following steps:
receiving a request to read streaming data and, when the requested streaming data exists on disk, creating a new read task for the request, allocating storage space for it, and initializing the related parameters;
decomposing the read task into multiple subtasks, each responsible for reading and caching one physically contiguous segment of data;
taking data out of the subtask cache, encapsulating it in the streaming data format, submitting each encapsulated block to the caller of the read task, and, once submission is complete, releasing the current subtask and triggering the next one;
when all subtasks have completed successfully, reporting normal completion to the task caller and waiting for the caller to end the current read task.
Preferably, when a request to read streaming data is received, the hash value of the requested file name is computed and looked up to determine whether the requested data exists on disk.
Preferably, the parameters of a streaming data read request include the file name and the starting and ending offsets of the data to be read. After a new read task is created for the request, storage space is allocated for it, and the file-name hash and the starting and ending offsets of the data to be read are stored in that space, completing the initialization of the read task.
Preferably, the task length is computed from the starting and ending offsets of the read task, and the read task is decomposed into multiple subtasks according to the on-disk location of the streaming data to be read; all subtasks are linked in a list and triggered one after another in order.
Preferably, after each subtask starts, the starting sector and length of the streaming data it is to read are obtained first; memory is requested for that data according to its length; the disk from which the streaming data will be read is then determined from the starting sector; and finally the lower-layer interface is called to read the specified segment of streaming data from the specified disk.
Preferably, after each subtask completes, the underlying interface sends a message notifying the file system that the current subtask succeeded or failed, and upon receiving the success message the file system takes the data out of the current subtask's cache.
Preferably, when each subtask executes, memory is pre-allocated for the streaming data to be read, to cache the data read from disk; the length of the data each subtask reads must be an integer multiple of the disk sector size, and subtasks read from disk in asynchronous, non-blocking I/O mode.
Preferably, after the previous subtask completes successfully, a message is sent to the file system; upon receiving it, the file system copies the data from the subtask's buffer into newly allocated memory, encapsulates it in the streaming data format, submits it to the caller of the read task, and then triggers the next subtask, until all subtasks have finished.
Preferably, a read task not yet completed is ended early by moving its end position forward, and a task whose reading has completed reads additional data when its end position is moved backward.
Preferably, while each subtask is in progress, the end offset of the read task can be changed as needed: if the new task end offset is smaller than the current subtask's end offset, the update is ignored; otherwise the new end offset replaces the read-data end offset in the task parameters, and the subtasks are regenerated from the new end offset.
Compared with the prior art, the advantages of the present invention are:
1. Efficiency: by decomposing the task, the present invention guarantees that each subtask reads a segment of data that is both logically and physically contiguous, while limiting the amount of data a single subtask reads, which improves reading efficiency;
2. High concurrency: an asynchronous read mechanism is adopted, returning immediately after the lower-layer read interface is called, with no blocking while data is read. Multi-core cooperation is also supported: after a subtask executes successfully, the lower-layer interface sends a message reporting its success, and this message drives the next subtask, which may be executed by another core. Together these two points guarantee highly concurrent reading of streaming data.
In addition, the present invention allows the user to change the end offset while data is being read, enriching the ways the user can operate, which is a significant advantage in streaming service scenarios.
Brief Description of the Drawings
FIG. 1 is a schematic flowchart of a streaming data reading method based on an embedded file system according to an embodiment of the present invention;
FIG. 2 is a message-driven flowchart of the embodiment shown in FIG. 1;
FIG. 3 is a flowchart of the read task of the embodiment shown in FIG. 1;
FIG. 4 is a schematic diagram of the subtask linked list of the embodiment shown in FIG. 1.
Detailed Description
The present invention is described in detail below with reference to the drawings and examples, to make the above advantages clearer.
Addressing the insufficient data reading efficiency and concurrency of existing embedded streaming services, the embodiment of the present invention proposes a streaming data reading method based on an embedded file system. The method improves reading efficiency by decomposing tasks, guarantees highly concurrent reading of streaming data through an asynchronous read mechanism, and also allows the user to change the end offset while data is being read, enriching the ways the user can operate, which is a significant advantage in streaming service scenarios.
FIG. 1 is a schematic flowchart of a streaming data reading method based on an embedded file system according to an embodiment of the present invention, and FIG. 2 is its message-driven flowchart. The embodiment adopts an event-driven mechanism in which all events are carried by messages: starting a task, updating a task, processing read data, and ending a task are all message-driven. The embodiment is described in detail below with reference to FIG. 1 and FIG. 2. As shown in FIG. 1, the method includes steps 101-104:
In step 101, a request to read streaming data is received; when the requested streaming data exists on disk, a new read task is created for the request, storage space is allocated for it, and the related parameters are initialized.
Specifically, a message receiver receives all messages, examines each one, and responds according to its type; the message types are start task, update task, process read data, and end task. When a user's call to the file system's read interface succeeds, the file system issues a start message; upon receiving it, the file system executes the first branch of FIG. 2, "start task", which creates a read task for the new request.
Preferably, when a request to read streaming data is received, whether the requested streaming data exists is determined first. The method is: compute the hash value of the requested file name and look it up. If it is found, the requested streaming data exists on disk, and a new read task is created immediately for the request, with storage space allocated and the related parameters initialized; if the requested streaming data does not exist on disk, the user is notified that the read request failed.
The parameters of a streaming data read request include the file name and the starting and ending offsets of the data to be read. After a new read task is created, memory is allocated for it, and the file-name hash and the starting and ending offsets of the data to be read are stored in the task space, completing task initialization.
In step 102, the read task is decomposed into multiple subtasks, each responsible for reading and caching one physically contiguous segment of data;
Specifically, after the read task is created, the file system obtains the metadata of the requested file and, combining it with the on-disk location of the requested streaming data, divides the read task according to the starting offset and length of the data to be read. The resulting subtasks are logically continuous: each subtask reads a segment of data that is both logically and physically contiguous, while the data read by adjacent subtasks is not necessarily physically contiguous.
Preferably, after the read task is created, its starting offset and task length are extracted, and the file index information for the streaming data to be read is queried to obtain the disk locations where the data is stored. From the task length, the starting offset, and those disk locations, the read task is decomposed into several subtasks, each reading a segment of data that is both logically and physically contiguous, with a length that is an integer multiple of the sector size. The data read by adjacent subtasks is logically continuous but may be physically discontinuous, because a stream is often not stored contiguously on disk; the purpose of dividing subtasks is to guarantee that each read fetches one physically contiguous segment from disk. At the same time, to keep reading efficient, the data length of a subtask is limited, and a single subtask should not read too much. The subtask information is stored in a linked list; each node holds the starting sector and the length (expressed in sectors) of the data its subtask reads. Once decomposition finishes, the first subtask is triggered automatically.
After a subtask is triggered, the starting sector and length of the data it is to read are obtained first, the length being computed from the sector count and sector size. Memory of that length is requested for the subtask to cache the data read from disk; the disk storing the subtask's streaming data is then found from the starting sector number; and the lower-layer interface is called, passing the disk number, starting sector number, sector count, cache address, and other parameters, to read the specified data from the specified disk.
In step 103, data is taken out of the subtask cache and encapsulated in the streaming data format; each encapsulated block is submitted to the caller of the read task, and once submission is complete the current subtask is released and the next one is triggered;
Specifically, after the subtasks are generated, the file system itself triggers the first one. When a subtask starts, the file system first obtains its parameters, including the starting sector number and the number of sectors to read; computes the amount of data to be read from the sector size and sector count; requests memory of that size to cache the data; computes the number of the disk where the starting sector resides; and finally calls the lower-layer read interface, passing the disk number, starting sector number, sector count, and other parameters, to read the data from the specified disk. The call returns immediately instead of waiting for the data to be fully read. Once the data has been fully read into the subtask cache, the lower-layer interface sends a message reporting that the subtask completed successfully; the message receiver identifies it as a subtask-completion notification, and the file system executes the third branch of FIG. 2, "process read data", the core flow of the whole read task. Each time the message that the previous subtask completed successfully arrives, it triggers the next subtask, and this loop continues until all subtasks have executed or one fails.
Preferably, subtasks read from disk in asynchronous, non-blocking I/O mode, returning immediately after the lower-layer interface is called without blocking during I/O. This mechanism suits multi-core cooperation and benefits highly concurrent multi-tasking and efficient reading of streaming data. After the data for the current subtask has been fully read, the underlying interface sends a message reporting whether the subtask completed successfully. Upon receiving the success message, the file system takes the data out of the subtask cache and encapsulates it in the streaming data format, submitting each encapsulated block to the caller of the read task, until all data read by the subtask has been submitted or the remainder is temporarily too short to submit; an insufficient remainder is cached temporarily and, after the next subtask reads its data from disk, is taken out, encapsulated, and submitted with that data.
FIG. 3 is the read-task flowchart of the embodiment shown in FIG. 1. This flow processes the read data, that is, encapsulates it in the streaming data format; after encapsulation, the content length of each block is a fixed value that depends on the specific streaming service scenario. Data read by a subtask may have a remainder after encapsulation in the streaming data format; if the remainder is too short to be encapsulated into one block of streaming data and submitted to the user, it is cached, and the data is encapsulated after the next subtask completes. This loop continues until all subtasks have finished. When they have, the data remaining after encapsulation may still be too short to form a final standard block; since this is the last piece of data of the entire read task and no further data follows, the last block is submitted to the user even though it is shorter than a standard block.
While a read task is in progress, the user may change its end offset as needed. For example, if the user finds that only part of the data rather than the whole file is needed, the end offset can be moved forward by calling the task-parameter update interface that the embedded file system provides. Calling the interface causes the file system to send an update-task message; on receiving this message, the file system executes the second branch of Figure 2, "update task".
The new task end offset is compared with the original one; if the new end offset is smaller than the original, the update moves the end forward, i.e., ends the task early. The file system obtains the data offset currently being read by the active subtask: if the new end offset is smaller than that offset, the update cannot be completed and the request is ignored; if the new end offset is larger than the current read offset, the end offset in the task parameters is replaced with the new value, the subtasks are regenerated according to the new end offset, and the subtask linked list is updated.
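The offset comparison of the "update task" branch reduces to a small function; subtask regeneration is elided in this sketch:

```python
def update_end_offset(task, current_read_offset, new_end):
    """Apply the 'update task' branch: ignore an update that falls behind
    the data already being read; otherwise install the new end offset.
    (Regenerating the subtask linked list is elided here.)"""
    if new_end < current_read_offset:     # update cannot complete: ignore it
        return False
    task["end_offset"] = new_end          # forward or backward adjustment
    return True
```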
In step 104, when all subtasks have completed successfully, the task caller is notified that the task completed normally, and the file system waits for the caller to end the current read task.
Specifically, when a subtask fails, processing of read-out data goes wrong, or a task update goes wrong, the file system actively reports the exception to the user. When all subtasks complete successfully and the read-out data is processed normally, the file system reports to the user that the read task completed normally. On receiving the exception or completion report, the user actively ends the task; the task-ending interface is also implemented by the file system for the user to call. In principle, the user may actively end a read task at any time.
Preferably, a subtask is considered finished only after its read-out data has been packaged and submitted. When a subtask finishes, its task space and data space are released: releasing the task space means deleting the current head node of the subtask linked list, and the data space is the memory allocated at the start of the subtask to buffer the read-out data. The next subtask can be triggered only after the previous one has completed successfully. If a subtask fails, the file system, on receiving the failure message, actively reports the task exception to the caller; when all subtasks have completed successfully, the file system likewise reports normal task completion to the caller and waits for the caller to end the current read task.
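The head-node release described here can be modeled with a minimal linked list; the node fields are illustrative:

```python
class SubtaskNode:
    """Node of the subtask linked list; the 'current subtask' is the head."""
    def __init__(self, sector, count, nxt=None):
        self.sector = sector   # starting sector of this subtask's data
        self.count = count     # data length, in sectors
        self.next = nxt        # next subtask in the list

def finish_current(head):
    """Release the finished subtask's node (the head of the list) and
    return the new head -- the next subtask to trigger -- or None when
    every subtask has been executed."""
    return head.next
```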
After receiving the file system's exception report or completion report, the task caller may call an interface function provided by the file system to actively end the task; the caller may even end the task while it is still in progress. In addition, the embodiment of the present invention supports updating task parameters midway: for an unfinished task, the end offset can be moved forward to end the task early; for a task whose reading has finished, the end offset can be moved backward to read additional data. This method offers the user flexible modes of operation and suits the many application scenarios of streaming data.
Figure 4 is a schematic diagram of the subtask linked list of the embodiment shown in Figure 1. As shown in Figure 4, each node in the list represents a subtask and contains the subtask parameters, such as the starting sector number, the sector count, and the disk number. The list is generated when the task is started; whenever a subtask finishes, the head node is released and the "current subtask" pointer moves to the next subtask. The nodes in the dashed box in Figure 4 represent subtasks that have already been executed; each time a subtask is triggered, its parameters are obtained through the "current subtask" pointer, which always points to the head node of the list. When the task end offset is updated, the pre-update list is deleted first, and a new list is recomputed and generated from the new end offset and the current task state.
By decomposing the read task, the embodiment of the present invention guarantees that each subtask reads a segment of data that is contiguous both logically and physically, while limiting the data length of a single subtask, which improves read efficiency. An asynchronous read mechanism is adopted: the lower-layer read interface returns immediately after being called, with no blocking in the read path. Multi-core cooperation is also supported: after a subtask executes successfully, the lower-layer interface sends a message reporting its success, and this message drives the next subtask, which may be executed by a different core. Together these two points guarantee the high-concurrency performance of streaming data reads.
Finally, it should be noted that the above embodiments are intended only to illustrate, not to limit, the technical solution of the present invention. Although the present invention has been described in detail with reference to the embodiments, those of ordinary skill in the art should understand that modifications or equivalent substitutions of the technical solution of the present invention do not depart from its spirit and scope, and all such changes shall fall within the scope of the claims of the present invention.

Claims (10)

  1. A streaming data reading method based on an embedded file system, characterized by:
    receiving a request to read streaming data; when the requested streaming data exists on disk, creating a new read task for the request, allocating storage space for the new read task, and initializing its parameters;
    decomposing the read task into multiple subtasks, each of which is responsible for reading and buffering a physically contiguous segment of data;
    taking data from the subtask buffer and packaging it according to the streaming data format, submitting each packaged block to the caller of the read task, and after submission releasing the current subtask and triggering the next subtask;
    when all subtasks have completed successfully, reporting normal task completion to the task caller and waiting for the caller to end the current read task.
  2. The method according to claim 1, characterized in that whether the requested streaming data exists on disk is determined by the following steps:
    when a request to read streaming data is received, computing the hash of the requested file name and looking up the hash in the file system metadata, thereby determining whether the requested data exists on disk.
  3. The method according to claim 1, characterized in that the parameters of the request to read streaming data include the file name and the start and end offsets of the data to be read; after a new read task is created for the request, storage space is allocated for the read task, and the file-name hash and the start and end offsets of the streaming data to be read are stored in that space, completing the read task initialization.
  4. The method according to claim 1, characterized in that the step of decomposing the read task into multiple subtasks comprises:
    computing the task length from the start and end offsets of the read task and, combining it with the on-disk location information of the streaming data to be read, decomposing the read task into multiple subtasks;
    linking all subtasks in a list and triggering them one after another in order.
  5. The method according to claim 1, characterized in that after each subtask starts, the starting sector and length of the streaming data to be read by the subtask are first obtained; memory space is allocated for the streaming data to be read according to its length; the disk from which the streaming data will be read is then computed from the starting sector; and finally the lower-layer interface is called to read the specified segment of streaming data from the specified disk.
  6. The method according to claim 1, characterized in that when each subtask is executed, memory space is pre-allocated for the streaming data to be read, to buffer the data read from disk; the length of the streaming data identified by each subtask must be an integer multiple of the disk sector size; and subtasks read data from disk in an asynchronous, non-blocking IO mode.
  7. The method according to claim 1, characterized in that after each subtask completes, the lower-layer interface sends a message notifying the file system whether the current subtask succeeded or failed, and on receiving a subtask-success message the file system takes the data from the current subtask's buffer.
  8. The method according to claim 1, characterized in that after the previous subtask ends successfully, a message is sent to the file system; on receiving the message, the file system copies the data from the subtask's data buffer into newly allocated memory, packages it according to the streaming data format, submits the packaged data to the caller of the read task, and then triggers the next subtask, until all subtasks have ended.
  9. The method according to claim 1, characterized in that for an unfinished read task, the task is ended early by moving the task end position forward; and for a task whose reading has finished, additional data is read by moving the task end position backward.
  10. The method according to claim 1, characterized in that while each subtask is in progress, the read task end offset may be changed as needed; if the new task end offset is smaller than the current subtask's end offset, the update is ignored; otherwise, the end offset in the task parameters is replaced with the new task end offset, and the subtasks are regenerated according to the new task end offset.
PCT/CN2015/074082 2014-11-17 2015-03-12 Streaming data reading method based on embedded file system WO2016078259A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US15/527,323 US20170322948A1 (en) 2014-11-17 2015-03-12 Streaming data reading method based on embedded file system

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201410653260.9 2014-11-17
CN201410653260.9A CN104331255B (zh) 2014-11-17 2014-11-17 Streaming data reading method based on embedded file system

Publications (1)

Publication Number Publication Date
WO2016078259A1 (zh)

Family

ID=52405990




Also Published As

Publication number Publication date
CN104331255A (zh) 2015-02-04
CN104331255B (zh) 2018-04-17
US20170322948A1 (en) 2017-11-09


Legal Events

121 (EP): the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 15860353; country: EP; kind code: A1.
NENP: non-entry into the national phase. Ref country code: DE.
WWE (WIPO information, entry into national phase): ref document number: 15527323; country: US.
122 (EP): PCT application non-entry in European phase. Ref document number: 15860353; country: EP; kind code: A1.