CN113568736A - Data processing method and device - Google Patents


Info

Publication number: CN113568736A
Application number: CN202110715843.XA
Authority: CN (China)
Other languages: Chinese (zh)
Inventor: 朴君
Current Assignee: Alibaba Innovation Co (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
Original Assignee: Alibaba Singapore Holdings Pte Ltd
Application filed by Alibaba Singapore Holdings Pte Ltd
Priority to CN202110715843.XA
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)


Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 — Arrangements for program control, e.g. control units
    • G06F 9/06 — Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 — Multiprogramming arrangements
    • G06F 9/50 — Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005 — Allocation of resources to service a request
    • G06F 9/5027 — Allocation of resources to service a request, the resource being a machine, e.g. CPUs, servers, terminals
    • G06F 2209/00 — Indexing scheme relating to G06F 9/00
    • G06F 2209/50 — Indexing scheme relating to G06F 9/50
    • G06F 2209/5018 — Thread allocation

Abstract

The embodiments of this specification provide a data processing method and a data processing apparatus. The data processing method includes: determining a file handle corresponding to each process in a plurality of processes, and issuing a plurality of data processing requests for a file based on the file handles, wherein each data processing request carries a data processing type and a data interval on which data processing is to be performed on the file; determining at least two target data processing requests from the plurality of data processing requests based on the data processing types and data intervals; and sending the at least two target data processing requests to a queue adaptation device, which processes the target data processing requests in parallel.

Description

Data processing method and device
Technical Field
The embodiments of this specification relate to the field of data processing technology, and in particular to a data processing method. One or more embodiments of this specification also relate to a data processing apparatus, a data processing system, a computing device, a computer-readable storage medium, and a computer program.
Background
In recent years, with the continuous development of storage hardware technology, hard disk performance has improved greatly: from the original HDD (Hard Disk Drive) mechanical disk, to the SSD (Solid State Disk) based on the SATA (Serial ATA) interface, and then to the Nvme SSD based on the PCIe (PCI-Express) interface, a bare disk can now provide throughput performance on the order of hundreds of thousands of IOPS (Input/Output Operations Per Second) and GB/s.
Currently, the mainstream single-machine file system (EXT4) designs its I/O flow around the inode (index node). This design is relatively friendly to the CPU but unfriendly to the storage device: for example, when multiple processes concurrently perform read and write operations on the same file, the file system enforces mutually exclusive access to the file's inode, which serializes the I/O. Meanwhile, the current Nvme protocol specifies that each hardware device can support up to 64K hardware queues, so the current file system architecture can exploit neither the concurrency of multiple processes nor the throughput capability of high-performance storage devices.
Disclosure of Invention
In view of this, the embodiments of this specification provide a data processing method. One or more embodiments of this specification also relate to a data processing apparatus, a data processing system, a computing device, a computer-readable storage medium, and a computer program, so as to remedy the technical deficiencies of the prior art.
According to a first aspect of embodiments herein, there is provided a data processing method including:
determining a file handle corresponding to each process in a plurality of processes, and issuing a plurality of data processing requests of a file based on the file handle, wherein each data processing request carries a data processing type and a data interval for performing data processing on the file;
determining at least two target data processing requests from the plurality of data processing requests based on the data processing type and the data interval;
and sending the at least two target data processing requests to a queue adaptation device, and performing parallel processing on the target data processing requests through the queue adaptation device.
According to a second aspect of embodiments herein, there is provided a data processing apparatus comprising:
a request receiving module, configured to determine a file handle corresponding to each process in a plurality of processes and to issue a plurality of data processing requests for a file based on the file handles, wherein each data processing request carries a data processing type and a data interval on which data processing is to be performed on the file;
a request determination module configured to determine at least two target data processing requests from the plurality of data processing requests based on the data processing type and the data interval;
the request processing module is configured to send the at least two target data processing requests to a queue adaptation device, and the target data processing requests are processed in parallel through the queue adaptation device.
According to a third aspect of the embodiments of the present specification, there is provided a data processing system comprising a file concurrency control means, a queue adaptation means, wherein,
the file concurrency control device is configured to determine a file handle corresponding to each process in a plurality of processes, issue a plurality of data processing requests of a file based on the file handle, wherein each data processing request carries a data processing type and a data interval for performing data processing on the file, determine at least two target data processing requests from the plurality of data processing requests based on the data processing type and the data interval, and send the at least two target data processing requests to a queue adaptation device;
the queue adaptation device is configured to receive the at least two target data processing requests sent by the file concurrency control device, and to process the target data processing requests in parallel.
According to a fourth aspect of embodiments of the present specification, there is provided a computer program, wherein when the computer program is executed in a computer, the computer is caused to execute the steps of the above-described data processing method.
According to a fifth aspect of embodiments herein, there is provided a computing device comprising:
a memory and a processor;
the memory is configured to store computer-executable instructions, and the processor is configured to execute the computer-executable instructions, which, when executed by the processor, implement the steps of the data processing method described above.
According to a sixth aspect of embodiments herein, there is provided a computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the steps of the data processing method described above.
One embodiment of this specification implements a data processing method and apparatus. The data processing method includes: determining a file handle corresponding to each process in a plurality of processes, and issuing a plurality of data processing requests for a file based on the file handles, where each data processing request carries a data processing type and a data interval on which data processing is to be performed on the file; determining at least two target data processing requests from the plurality of data processing requests based on the data processing types and data intervals; and sending the at least two target data processing requests to a queue adaptation device, which processes the target data processing requests in parallel. Specifically, the data processing method centers request handling on the process's file handle rather than the inode: it converts serialized data processing requests against the same file into parallel data processing requests based on file handles, fully exploiting the multi-queue concurrency capability of the hardware and greatly improving the throughput performance of the file system.
Drawings
FIG. 1 is a schematic diagram illustrating an EXT4 file system designing an i/o flow with an inode as a center to implement read and write operations on files;
FIG. 2 is a flow chart of a data processing method provided by an embodiment of the present specification;
fig. 3 is a schematic diagram of a handle interval tree in a data processing method according to an embodiment of the present specification;
FIG. 4 is a flowchart illustrating a data processing method according to an embodiment of the present disclosure;
fig. 5 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present specification;
FIG. 6 is a block diagram of a data processing system, according to one embodiment of the present disclosure;
fig. 7 is a block diagram of a computing device according to an embodiment of the present disclosure.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of this specification. However, this specification can be embodied in many forms other than those described herein, and those skilled in the art can make similar extensions without departing from its spirit and scope; the specification is therefore not limited to the specific embodiments disclosed below.
The terminology used in the description of the one or more embodiments is for the purpose of describing the particular embodiments only and is not intended to be limiting of the description of the one or more embodiments. As used in one or more embodiments of the present specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of the present specification refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It will be understood that, although the terms first, second, etc. may be used in one or more embodiments herein to describe various information, the information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, without departing from the scope of one or more embodiments of this specification, a "first" can also be referred to as a "second", and similarly a "second" can also be referred to as a "first". The word "if" as used herein may be interpreted as "when", "upon", or "in response to determining", depending on the context.
First, the noun terms to which one or more embodiments of the present specification relate are explained.
SATA: serial Advanced Technology Attachment is a computer bus responsible for data transfer between a motherboard and mass storage devices (e.g., hard disks and optical disk drives).
Nvme: Non-Volatile Memory Express, Non-Volatile Memory host controller interface specification.
Round Robin: and (6) rotating.
The i/o request: and inputting and outputting the request.
fd: file handle, in file i/o, to read data from a file, the application first calls the operating system function and passes the file name and selects a path to the file to open the file. The function retrieves a sequence number, the file handle, which is the unique identification for the open file.
LBA: logical Block Address, a common mechanism for describing the Block in which data is located on a computer storage device, is commonly used in auxiliary memory devices such as hard disks.
Glib library: the method is the most common C language function library under the Linux platform, and has good portability and practicability.
In recent years, with the continuous development of storage hardware technology, hard disk performance has improved greatly: from the original HDD mechanical disk, to the SSD based on the SATA interface, and then to the Nvme SSD based on the PCIe interface, a bare disk can now provide throughput performance on the order of hundreds of thousands of IOPS and GB/s. Currently, the mainstream single-machine file system (EXT4) builds its metadata layer on top of a generic block device layer. The generic block device layer is a software layer independent of EXT4; it presents standard block device semantics (such as LBA and Length) to the upper layer and connects downward to the various hardware disk device drivers, such as HDD and Nvme SSD. The metadata layer is not an independent software layer but part of the EXT4 file system's software stack, i.e., the inode management logic described in the background. The performance of the Nvme SSD is therefore limited both by this inode management logic (multiple processes concurrently reading and writing the same file are forced into mutually exclusive access to the file's inode, which serializes I/O) and by the generic block device layer; that is, because of the metadata layer and the generic block device layer, the throughput of the underlying hardware cannot be exploited, for the following main reasons:
1. the file system's i/o path is unaware of the underlying hardware queue resources, so the concurrency advantage of multiple queues cannot be fully exploited;
2. the file system designs its i/o flow around the inode. This design is friendly to the CPU (Central Processing Unit) but not to the storage device: for example, when multiple processes concurrently perform read and write operations on the same file, the file system enforces mutually exclusive access to the file's inode, which serializes the i/o. Although the current Nvme protocol specifies that each hardware device can support up to 64K hardware queues, the current file system architecture cannot utilize the concurrency of multiple processes to exploit the throughput performance of the storage device.
Referring to fig. 1, fig. 1 shows a schematic diagram of an EXT4 file system for designing an i/o flow with an inode as a center to implement read and write operations on a file.
Fig. 1 includes a host and an Nvme SSD controller, where the host includes a process, an EXT4 file system, a generic block device layer, and an Nvme device driver.
Take the example that the processes include process 1, process 2, process 3, and process 4.
In specific implementation, when process 1, process 2, process 3 and process 4 of the host receive a read-write request of the client for the same file, the read-write request is issued to the EXT4 file system through process 1, process 2, process 3 and process 4;
the EXT4 file system serializes the read-write request issued by the process 1, the process 2, the process 3 and the process 4 through an inode lock, and then sends the read-write request after serialization to a universal block device layer;
the universal block device layer converts the received read-write request into a read-write instruction aiming at the hardware disk and issues the read-write instruction to the Nvme device driver;
the Nvme device driver responds to the hardware-disk read-write instructions issued by the generic block device layer, allocates a corresponding hardware queue for each instruction, and issues the instructions through the hardware queues to the Nvme SSD controller; the Nvme SSD controller processes the read-write instructions received on the hardware queues to complete the data reads and writes on the file.
In the above scheme, when multiple processes concurrently perform read and write operations on the same file, the file system enforces mutually exclusive access to the file's inode, which serializes the i/o. Although the current Nvme protocol specifies that each hardware device can support up to 64K hardware queues, the current file system architecture cannot utilize the concurrency of multiple processes to exploit the throughput performance of the storage device.
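The prior-art serialization described above can be sketched in Python (an illustration of the behaviour, not code from the patent): a single per-inode lock forces concurrent i/o requests against the same file to run one at a time, no matter how many hardware queues the device offers.

```python
import threading

# Hypothetical sketch of the prior-art EXT4 behaviour described above:
# one lock per inode serializes every i/o request against the same file.
inode_lock = threading.Lock()
completion_order = []

def issue_write(process_id):
    # Each "process" must hold the inode lock while its i/o runs,
    # so the four writes below execute strictly one after another.
    with inode_lock:
        completion_order.append(process_id)

threads = [threading.Thread(target=issue_write, args=(i,)) for i in range(1, 5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(completion_order))  # [1, 2, 3, 4]: all four writes ran, serially
```

Removing this single point of serialization, while keeping reads and writes consistent, is the problem the handle-centered scheme below addresses.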
Based on this, in the present specification, a data processing method is provided. One or more embodiments of the present specification relate to a data processing apparatus, a data processing system, a computing device, a computer-readable storage medium, and a computer program, which are described in detail in the following embodiments one by one.
Referring to fig. 2, fig. 2 is a flowchart illustrating a data processing method according to an embodiment of the present disclosure, which specifically includes the following steps.
Step 202: determining a file handle corresponding to each process in a plurality of processes, and issuing a plurality of data processing requests of the file based on the file handles.
Each data processing request carries a data processing type and a data interval on which data processing is to be performed on the file. A data processing request may be understood as, for example, a data read request or a data write request.
In practical applications, the data processing type includes, but is not limited to, a data writing type, a data reading type, a data modification type, a data searching type, and the like. The data section may be understood as a file area where data processing such as writing, reading, modifying, deleting, and the like is performed for a file.
In addition, a file handle corresponds to the file it was opened for. During actual file operations, the file system exposes file operation interfaces to processes through the Glib library, such as the open, read, write, and close interfaces. When a process needs to access a file in response to a file request, it calls the open interface to open the file in the file system; the file system returns a file handle for that file to the process, and this file handle is the unique identifier of the open file. The process then calls the read/write interfaces to operate on the file. The file request itself may come from a command the user sends to the process (via an IPC channel) or may be initiated by the process itself.
Specifically, before determining a file handle corresponding to each process of the multiple processes, the method further includes:
the file is opened based on file operation requests which are issued by a plurality of processes and aim at the same file, and a file handle corresponding to the file is distributed to each process in the plurality of processes.
Wherein the plurality of processes includes two or more processes. A file is understood to be any type of file containing any data, such as a student information management file in the doc format, an employee information management file in the txt format, and so on.
In practical application, opening the file based on file operation requests issued by multiple processes for the same file, and allocating to each process a file handle corresponding to the file, can be understood as follows: the file is opened based on the read/write operation requests (i/o requests) issued by the multiple processes for the same file, and one file handle corresponding to the file is allocated to each of those processes.
For example, if there are 4 processes, the file may be opened based on the i/o requests issued by the 4 processes for the same file, and a file handle corresponding to the file is allocated to each of the 4 processes.
In the embodiments of this specification, before issuing data processing requests for the same file, each process opens the file and obtains a file handle corresponding to it. Subsequent processing of the data processing requests is then based on the file handles, converting serial data processing requests against the same inode into parallel data processing requests based on file handles and greatly improving the throughput performance of the file system.
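As a small illustration of the handle allocation above (a sketch, not the patent's implementation), each opener of the same file receives its own file handle; the hypothetical example below opens one file four times and obtains four distinct fds:

```python
import os
import tempfile

# Create a scratch file standing in for the shared file in the example above.
fd0, path = tempfile.mkstemp()
os.close(fd0)

# Each of the 4 "processes" opens the same file and gets its own handle.
handles = [os.open(path, os.O_RDWR) for _ in range(4)]
distinct = len(set(handles))
print(distinct)  # 4: four handles, one per process, all naming the same file

for fd in handles:
    os.close(fd)
os.remove(path)
```

Because each process works through its own handle rather than the shared inode, the requests it issues can be scheduled independently in the steps that follow.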
Step 204: determining at least two target data processing requests from the plurality of data processing requests based on the data processing type and the data interval.
Wherein the data processing type comprises a data writing type and a data reading type;
accordingly, the determining a target data processing request from the plurality of data processing requests based on the data processing type and the data interval comprises:
determining a data processing request with a data processing type being the data writing type in the plurality of data processing requests as a first data processing request;
determining a data processing request with a data processing type being the data reading type in the plurality of data processing requests as a second data processing request;
at least two target data processing requests are determined based on the first data processing request and the second data processing request.
The data writing type may be understood as the type of request that writes data into the file, and the data reading type as the type that reads data from the file. A target data processing request may be understood as a data processing request that, when actually executed, is not mutually exclusive with the other target requests; specifically, the target data processing requests are selected from among the requests that read and write the data intervals of the same file.
Specifically, when the data processing type includes a data writing type and a data reading type, a data processing request with the data processing type being the data writing type in the plurality of data processing requests is used as a first data processing request; and taking the data processing request with the data processing type being the data reading type in the plurality of data processing requests as a second data processing request.
Then, at least two target data processing requests are determined based on the first data processing request and the second data processing request. Specifically, the purpose of determining at least two target data processing requests is to enable subsequent parallel processing.
In this embodiment of the present description, when the data processing type includes a data writing type and a data reading type, the multiple data processing requests may be classified based on the data processing type, and then a target data processing request that can implement parallel processing may be quickly and accurately obtained from the data processing requests after the classification.
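The classification step above can be sketched as follows (the request representation and names here are assumptions for illustration, not the patent's data structures):

```python
WRITE, READ = "write", "read"  # the two data processing types

def classify(requests):
    """Split requests into first (write-type) and second (read-type) groups."""
    first = [r for r in requests if r["type"] == WRITE]
    second = [r for r in requests if r["type"] == READ]
    return first, second

# The requests used in the worked example later in this section.
requests = [
    {"id": "a",  "type": WRITE, "interval": (0, 3)},
    {"id": "aa", "type": WRITE, "interval": (4, 5)},
    {"id": "b",  "type": READ,  "interval": (2, 3)},
    {"id": "bb", "type": READ,  "interval": (6, 8)},
]
first, second = classify(requests)
print([r["id"] for r in first])   # ['a', 'aa']
print([r["id"] for r in second])  # ['b', 'bb']
```

With the requests grouped by type, the conflict analysis reduces to comparing data intervals within and across the two groups.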
In specific implementation, the manner of determining the target data processing request based on the classified data processing request is as follows:
said determining at least two target data processing requests based on said first data processing request and said second data processing request comprises:
determining a data interval corresponding to the first data processing request and a data interval corresponding to the second data processing request;
and determining the at least two target data processing requests based on the data interval corresponding to the first data processing request and the data interval corresponding to the second data processing request.
Wherein the at least two target data processing requests comprise a first data processing request and/or a second data processing request.
Specifically, after the data processing requests are classified into first data processing requests and second data processing requests according to their data processing types, the data interval corresponding to each first data processing request and each second data processing request is acquired; at least two target data processing requests can then be obtained from the first and second data processing requests based on the data interval corresponding to each request.
For example, suppose the first data processing requests are request a and request aa, and the second data processing requests are request b and request bb; the data interval of request a is 0-3, of request aa is 4-5, of request b is 2-3, and of request bb is 6-8. Here 0-3, 4-5, 2-3, and 6-8 may be understood as the line numbers at which the data is stored in the file, or as the sectors in which the data is stored, etc.
Then, after the data interval of each of the first and second data processing requests is obtained, it can be determined from those intervals which data processing requests are mutually exclusive and which can be executed in parallel; the requests that can be executed in parallel are then taken as the target data processing requests for subsequent parallel, efficient processing.
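Deciding whether two requests conflict reduces to an interval-overlap test; a minimal sketch, treating the data intervals above as closed ranges:

```python
def intervals_overlap(x, y):
    """True if closed intervals x = (lo, hi) and y = (lo, hi) share any point."""
    return x[0] <= y[1] and y[0] <= x[1]

# The worked example above: write a (0-3) and read b (2-3) overlap on 2-3,
# so they are mutually exclusive; aa (4-5) and bb (6-8) do not overlap.
print(intervals_overlap((0, 3), (2, 3)))  # True
print(intervals_overlap((4, 5), (6, 8)))  # False
```

The handle interval tree of fig. 3 can be seen as an indexed form of this same test, letting each new request be checked against existing intervals without scanning them all.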
In specific implementation, the determining the at least two target data processing requests based on the data interval corresponding to the first data processing request and the data interval corresponding to the second data processing request includes:
taking a first data processing request and a second data processing request whose data intervals do not intersect as first initial data processing requests;
taking the first data processing request or the second data processing request whose data intervals intersect as a second initial data processing request;
taking a first data processing request whose data interval does not intersect that of any other first data processing request as a third initial data processing request;
taking first data processing requests whose data intervals intersect each other as fourth initial data processing requests;
wherein the at least two target data processing requests comprise the first initial data processing request, a second initial data processing request, a third initial data processing request, and/or the fourth initial data processing request.
Specifically, a first data processing request and a second data processing request whose data intervals do not intersect means that the data interval of the write-type first data processing request and the data interval of the read-type second data processing request do not intersect; that is, the data read and the data write do not target the same data interval. Such requests are defined as first initial data processing requests. In practical applications, the first initial data processing requests can subsequently be processed in parallel.
A first data processing request and a second data processing request whose data intervals intersect means that the data interval of the write-type first data processing request and the data interval of the read-type second data processing request intersect; that is, the read and the write target the same data interval, or at least overlapping parts of it. In this case, the first or the second data processing request is defined as a second initial data processing request. In practical application, only one of the two (the first or the second data processing request) is processed at a time in subsequent processing, and the other is blocked, so as to avoid confusing data reads and writes.
First data processing requests whose data intervals do not intersect means that the data intervals of the write-type first data processing requests do not intersect one another; that is, the writes do not target the same data interval. Such requests are defined as third initial data processing requests. In practical applications, the third initial data processing requests can subsequently be processed in parallel.
First data processing requests whose data intervals intersect means that the data intervals of the write-type first data processing requests intersect one another; that is, the writes target the same data interval, or at least overlapping parts of it. First data processing requests whose data intervals intersect are defined as fourth initial data processing requests. In practical application, only one of the intersecting first data processing requests is processed at a time in subsequent processing, and the others are blocked, so as to avoid confusing data writes.
In practical application, if the data processing requests are all of the data reading type, they can all be processed in parallel. If they are all of the data writing type, one request must be selected from each set of requests with intersecting data intervals to be processed currently, while the other requests block and wait. If the requests include both read-type and write-type requests, the target data processing requests that can subsequently be processed in parallel are obtained in the manner described above, enabling efficient, parallel processing of those target requests.
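Putting the four cases together, a hedged sketch of the selection logic (the names and the representation are illustrative assumptions, not the patent's exact scheme): a request whose interval conflicts with no other request can run in parallel, while from each group of conflicting requests one is chosen to run and the rest block.

```python
def plan(writes, reads):
    """Split requests into those runnable in parallel and those that conflict.
    Reads never conflict with reads; a write conflicts with any overlapping
    read or write, per the four cases described above."""
    overlap = lambda x, y: x[0] <= y[1] and y[0] <= x[1]
    parallel, conflicting = [], []
    for w in writes:
        clash = any(overlap(w[1], r[1]) for r in reads) or \
                any(overlap(w[1], o[1]) for o in writes if o is not w)
        (conflicting if clash else parallel).append(w[0])
    for r in reads:
        clash = any(overlap(r[1], w[1]) for w in writes)
        (conflicting if clash else parallel).append(r[0])
    return parallel, conflicting

# The example above: writes a (0-3) and aa (4-5); reads b (2-3) and bb (6-8).
writes = [("a", (0, 3)), ("aa", (4, 5))]
reads = [("b", (2, 3)), ("bb", (6, 8))]
parallel, conflicting = plan(writes, reads)
print(parallel)     # ['aa', 'bb']: first initial requests, run in parallel
print(conflicting)  # ['a', 'b']: one is chosen to run while the other blocks
```

In a fuller implementation, a scheduler would then dispatch one request from each conflicting group (e.g. request a) and wake its blocked peers when it completes.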
Following the above example, the data interval of request a is 0-3, the data interval of request aa is 4-5, the data interval of request b is 2-3, and the data interval of request bb is 6-8.
As can be seen from the data intervals, the data interval of request a of the first data processing requests intersects that of request b of the second data processing requests, namely the interval 2-3, while the data intervals of request aa of the first data processing requests and request bb of the second data processing requests do not intersect any other data interval; request aa and request bb can therefore be regarded as first initial data processing requests.
Since the data intervals of request a of the first data processing requests and request b of the second data processing requests intersect, one of the two requests can be selected as the second initial data processing request.
In addition, it is determined whether any first data processing requests with intersecting data intervals exist among the first data processing requests.
For example, suppose the first data processing requests are request b, request bb and request bbb, where the data interval corresponding to request b is 0-3, the data interval corresponding to request bb is 4-5, and the data interval corresponding to request bbb is 5-6.
In this case, it can be determined that the data intervals of request bb and request bbb intersect, while the data interval of request b does not intersect that of either of the other two requests. Request b is then taken as a third initial data processing request, and request bb or request bbb is taken as a fourth initial data processing request.
Finally, the first initial data processing request, the second initial data processing request, the third initial data processing request and the fourth initial data processing request selected through the data intervals are taken as the target data processing requests, that is, the data processing requests that can subsequently be processed in parallel.
In this embodiment of the present specification, according to the data interval and the data processing type (data reading or data writing) corresponding to each data processing request, mutually exclusive data processing requests — those whose reads and writes conflict, or whose writes conflict with one another — are blocked, so as to avoid confusion when reading and writing the data of the file; meanwhile, the other, non-mutually-exclusive data processing requests are acquired, so that these target data processing requests can subsequently be processed in parallel.
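The selection rules above can be sketched in code. The following is a minimal illustration — not the patented implementation itself — of selecting target data processing requests from read/write requests with data intervals; the `Request` class and request names are assumptions made for the example, which reuses the intervals of requests a, aa, b and bb from the text.

```python
from dataclasses import dataclass

@dataclass
class Request:
    name: str
    kind: str  # "read" or "write"
    lo: int    # inclusive start of the data interval
    hi: int    # inclusive end of the data interval

def overlaps(a: Request, b: Request) -> bool:
    """Two closed intervals intersect iff neither ends before the other starts."""
    return a.lo <= b.hi and b.lo <= a.hi

def select_targets(requests):
    """Return (targets, blocked): requests that may run in parallel, and the
    mutually exclusive requests that must wait. A read never conflicts with
    another read; any pairing involving a write conflicts when the data
    intervals intersect."""
    targets, blocked = [], []
    for r in requests:
        conflict = any(
            overlaps(r, t) and (r.kind == "write" or t.kind == "write")
            for t in targets
        )
        (blocked if conflict else targets).append(r)
    return targets, blocked

# Worked example from the text: writes a/aa, reads b/bb.
reqs = [
    Request("a", "write", 0, 3), Request("aa", "write", 4, 5),
    Request("b", "read", 2, 3),  Request("bb", "read", 6, 8),
]
targets, blocked = select_targets(reqs)
print([r.name for r in targets])   # → ['a', 'aa', 'bb']
print([r.name for r in blocked])   # → ['b']  (its interval 2-3 crosses write a)
```

Here request b is blocked because its interval intersects that of write request a; the sketch keeps the first-arriving request of a conflicting pair as the representative, matching the text's "one of the two requests can be selected".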
Step 206: and sending the at least two target data processing requests to a queue adaptation device, and performing parallel processing on the target data processing requests through the queue adaptation device.
Specifically, the sending the at least two target data processing requests to a queue adaptation device, and performing parallel processing on the target data processing requests by the queue adaptation device includes:
and sending the at least two target data processing requests to a queue adapting device, and sending the target data processing requests to corresponding hardware queues through the queue adapting device to finish parallel processing of the target data processing requests.
In practical applications, the queue adapting device comprises a handle queue adapter, a hardware device driver and a hardware disk controller. The handle queue adapter provides a general hardware queue access adaptation layer and interfaces with the queues of various hardware storage devices, such as PCIe-based Nvme SSD queues; at the same time, it is responsible for scheduling and managing hardware queue resources and for receiving the upper-layer i/o requests based on file handles. The first function of the handle queue adapter is to send the file-handle-based i/o requests issued by processes and received by the file system layer to specific hardware queues through the hardware device driver; the second function is to interface with various hardware queues and provide a uniform, abstract, file-handle-based hardware queue operation interface to the upper file system layer, so that the file-handle-based i/o requests can subsequently be processed in parallel through their corresponding hardware queues. The hardware device driver converts each file-handle-based i/o request into a specific read-write request for the hardware disk, and then sends the read-write request to the hardware disk controller through the corresponding file-handle-based hardware queue. The hardware disk controller performs data processing based on the read-write requests for the hardware disk received through the hardware queues.
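The three layers described above can be sketched as follows. This is a minimal illustration under assumed names — the classes, the dict-based request format, and the handle-to-queue mapping are all hypothetical simplifications, not the actual driver stack.

```python
class HardwareDiskController:
    """Holds per-queue request lists; each queue can be drained independently,
    which is what enables parallel processing."""
    def __init__(self, num_queues):
        self.queues = [[] for _ in range(num_queues)]

    def submit(self, queue_id, rw_request):
        self.queues[queue_id].append(rw_request)

class HardwareDeviceDriver:
    """Converts a file-handle-based i/o request into a concrete read-write
    request against the hardware disk (here modeled as a dict)."""
    def __init__(self, controller):
        self.controller = controller

    def issue(self, queue_id, io_request):
        rw = {"op": io_request["op"],
              "lba": io_request["offset"] // 512,   # assumed 512-byte sectors
              "len": io_request["length"]}
        self.controller.submit(queue_id, rw)

class HandleQueueAdapter:
    """Uniform, file-handle-based queue interface for the file system layer."""
    def __init__(self, driver, num_queues):
        self.driver = driver
        self.num_queues = num_queues

    def send(self, io_request):
        # Stable mapping from file handle to hardware queue (illustrative).
        queue_id = io_request["fd"] % self.num_queues
        self.driver.issue(queue_id, io_request)

controller = HardwareDiskController(num_queues=4)
adapter = HandleQueueAdapter(HardwareDeviceDriver(controller), num_queues=4)
adapter.send({"fd": 7, "op": "write", "offset": 1024, "length": 512})
print(controller.queues[3])   # file handle 7 maps to queue 3
```

Because the queue is derived from the file handle, i/o requests issued on different handles land in different hardware queues and can proceed concurrently.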
In addition, the hardware resources (i.e. the number of the hardware queues) are pre-allocated, and the queue adaptation device only needs to be responsible for dynamically scheduling each target data processing request to each hardware queue for parallel processing.
In this embodiment of the present description, after at least two selected target data processing requests are sent to the queue adaptation device, a corresponding hardware queue may be matched for each target data processing request through a handle queue adapter, a hardware device driver, a hardware disk controller, and the like in the queue adaptation device, and then parallel processing on the target data processing requests is completed through the matched hardware queue, so as to improve the overall data processing performance.
In another embodiment of the present specification, after determining at least two target data processing requests from the plurality of data processing requests based on the data processing type and the data interval, the method further includes:
determining a file handle of a process issuing the target data processing request and determining a data interval corresponding to the target data processing request;
determining a read-write interval of a file handle of a process issuing the target data processing request based on a data interval corresponding to the target data processing request;
acquiring a handle interval tree corresponding to the file, and determining a current read-write interval of each file handle in the handle interval tree;
and determining a blocking data processing request from the target data processing request based on the incidence relation between the read-write interval of the file handle of the process issuing the target data processing request and the current read-write interval of the corresponding file handle in the handle interval tree.
Specifically, a handle interval tree of each file is pre-stored in the memory of the host, each handle interval tree is composed of a plurality of file handles, and each file handle corresponds to the current read-write interval. Each read-write interval represents a data area of a currently written or read file, and may be understood as the data interval of the above embodiment. The difference is that the data interval is an interval in which data is to be read or written from or to the file, and the read-write interval is an interval in which data is currently read or written from or to the file.
In practical application, if the data interval of the file handle of the current target data processing request is in conflict with the current read-write interval of the corresponding file handle in the handle interval tree, the conflicting target data processing request is blocked, so as to ensure the safe processing of the current data processing request.
Specifically, each target data processing request is issued by a process, and each process corresponds to a file handle, so each target data processing request has a corresponding file handle, and the data interval corresponding to the target data processing request can be regarded as the read-write interval of that file handle. The read-write interval of each file handle is compared with the current read-write intervals of the file handles in the interval tree; a target data processing request whose read-write interval conflicts is taken as a blocking data processing request and blocked during subsequent processing. Whether the read-write interval of a file handle conflicts with the current read-write intervals of the file handles in the interval tree can be judged in the same manner as the target data processing requests are determined in the foregoing embodiment, and details are not repeated here.
In specific implementation, when the target data processing request is of the data reading type and the current read-write interval of the file handle in the handle interval tree is also a data reading interval of the file, the target data processing request need not be blocked even if the read-write intervals conflict (intersect); if the target data processing request is of the data writing type and the current read-write interval of the file handle in the handle interval tree is a data writing interval of the file, the target data processing request is blocked and determined to be a blocking data processing request. The specific implementation is as follows:
specifically, the determining a blocking data processing request from the target data processing request based on the association relationship between the read-write interval of the file handle issuing the target data processing request and the current read-write interval of the corresponding file handle in the handle interval tree includes:
and under the condition that the read-write interval of the file handle of the process issuing the target data processing request is crossed with the read-write interval of the corresponding file handle in the handle interval tree, determining the target data processing request as a blocking data processing request. For example, the target data processing request is a data write type, and the file handle corresponding to the handle interval tree is also a data write type, so that when the two read-write intervals are crossed, the target data processing request is used as a blocking data processing request to perform blocking processing; in addition, when the target data processing request is a data writing type and the corresponding file handle in the handle interval tree is a data reading type, the target data processing request is also used as a blocking data processing request to perform blocking processing when the two reading and writing intervals are crossed; and when the target data processing request is a data reading type and the file handle corresponding to the handle interval tree is a data writing type, the target data processing request is also used as a blocking data processing request to perform blocking processing when the two reading and writing intervals are crossed.
In practical application, the target data processing request and the file handle corresponding to the handle interval tree do not have the condition of mutual exclusion when the file handle is read operation, and under the condition that the target data processing request is write operation, the read-write interval comparison is performed on the target data processing request according to the above mode, the mutually exclusive target data processing request is subjected to blocking processing, and data coverage or data loss is avoided.
Referring to fig. 3, fig. 3 is a schematic diagram illustrating a handle interval tree in a data processing method according to an embodiment of the present specification.
Fig. 3 shows a handle interval tree of a file maintained in the memory of the host, used to manage the i/o read-write intervals of the file handles of that file. With the handle interval tree, i/o conflicts between file handles can be judged efficiently; the leaf nodes represent the actual concurrent read-write intervals. If the i/o request of a file handle conflicts with a current read-write interval in the file's handle interval tree, issuance of the i/o is blocked, and the request is issued again after the conflict is resolved.
The first node "Root" in fig. 3 represents the length interval [0, End] of the whole file, and the nodes "FD1, FD2, FD3, FD4" are four file handle nodes. The [0,5] below file handle node "FD1" represents a read-write interval, and "RO" means read-only, that is, the node is currently processing a data read operation; the [6,8] below file handle node "FD2" represents a read-write interval, and "WR" means write, that is, the node is currently processing a data write operation; the [0,3] below file handle node "FD3" represents a read-write interval, and "RO" means read-only, that is, the node is currently processing a data read operation; the [9, END] below file handle node "FD4" represents a read-write interval, and "RO" means read-only, that is, the node is currently processing a data read operation. Unlabeled nodes can be understood as empty nodes, and the data interval below each empty node represents an interval that is not currently being read or written.
In specific implementation, the file concurrency controller of the file system inserts the handle requests (target data processing requests) being processed in parallel into the handle interval tree in sequence, and judges through the handle interval tree whether subsequent handle requests can be processed in parallel.
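The conflict judgment against the handle interval tree can be sketched as follows. For simplicity this illustration flattens the tree into a mapping from file handle to its current read-write interval and mode, mirroring the FD1-FD4 example of fig. 3; the file length `END` and the flat representation are assumptions, not the patented structure.

```python
END = 100  # assumed file length for the example

# Current read-write intervals per file handle, as in fig. 3.
tree = {
    "FD1": (0, 5, "RO"),
    "FD2": (6, 8, "WR"),
    "FD3": (0, 3, "RO"),
    "FD4": (9, END, "RO"),
}

def conflicts(lo, hi, mode):
    """True if a new request over [lo, hi] in mode 'RO'/'WR' must be blocked.
    A request conflicts with an entry unless the intervals are disjoint or
    both sides are read-only."""
    for (t_lo, t_hi, t_mode) in tree.values():
        disjoint = hi < t_lo or t_hi < lo
        both_reads = mode == "RO" and t_mode == "RO"
        if not disjoint and not both_reads:
            return True
    return False

print(conflicts(0, 3, "RO"))  # → False: overlaps FD1/FD3, but all are reads
print(conflicts(7, 9, "RO"))  # → True: overlaps FD2's write interval [6,8]
print(conflicts(4, 5, "WR"))  # → True: a write overlapping FD1's read [0,5]
```

A real interval tree would answer the same overlap query in logarithmic rather than linear time, which is why the embodiment maintains the handle intervals as a tree.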
In practical applications, in order to ensure the integrity of data processing, after a target data processing request is processed in parallel, data processing requests other than the target data processing request in a plurality of data processing requests blocked before are also processed in parallel.
Meanwhile, after the non-blocking data processing request in the target data processing request is completed, parallel data processing needs to be performed on the blocking data processing request to ensure the integrity of data processing, and the specific implementation manner is as follows:
after the determining the target data processing request as a blocking data processing request, further comprising:
and after the parallel processing of other target data processing requests except the blocking data processing request in the target data processing requests is finished, performing parallel data processing on the blocking data processing request.
In another embodiment of this specification, the sending, by the queue adapting device, the target data processing request to a corresponding hardware queue includes:
determining the number of the target data processing requests and the number of hardware queues;
determining whether the number of the target data processing requests is less than or equal to the number of hardware queues,
if yes, the processed target data processing requests are sequentially sent to corresponding hardware queues through the queue adapting device,
and if not, sending the processed target data processing request to a corresponding hardware queue in a polling scheduling mode through the queue adapting device.
In practical application, before parallel processing is performed on target data processing requests, a corresponding hardware queue is matched for each target data processing request, so that parallel processing of each target data processing request is realized through the corresponding hardware queue subsequently.
Specifically, the number of at least two target data processing requests and the number of preset hardware queues are determined, and if the target data processing requests and the preset hardware queues have a mapping relation, the target data processing requests are distributed to corresponding hardware queues; and if the target data processing requests do not have a mapping relation with the preset hardware queues, distributing corresponding hardware queues for each target data processing request according to the number of the target data processing requests and the number of the hardware queues.
That is, if the number of at least two target data processing requests is less than or equal to the number of hardware queues, a hardware queue may be randomly allocated to each target data processing request, and if the number of at least two target data processing requests is greater than the number of hardware queues, a hardware queue may be allocated to each target data processing request in a round robin manner.
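The scheduling rule above can be sketched as a small function; the name `assign_queues` and the string request identifiers are assumptions made for the example.

```python
def assign_queues(requests, num_queues):
    """Map each target data processing request to a hardware queue index.
    With no more requests than queues, each request gets its own queue in
    sequence; otherwise queues are assigned by round-robin (polling)."""
    if len(requests) <= num_queues:
        return {req: i for i, req in enumerate(requests)}
    return {req: i % num_queues for i, req in enumerate(requests)}

print(assign_queues(["r1", "r2", "r3"], num_queues=4))
# → {'r1': 0, 'r2': 1, 'r3': 2}
print(assign_queues(["r1", "r2", "r3", "r4", "r5"], num_queues=2))
# → {'r1': 0, 'r2': 1, 'r3': 0, 'r4': 1, 'r5': 0}
```

Round-robin keeps the load balanced across the pre-allocated hardware queues when the number of concurrent target requests exceeds the number of queues.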
The data processing method provided by the embodiment of the specification processes the data processing request by taking the file handle of the process as the center, converts the data processing request of the same file into the parallel data processing request based on the file handle, fully exerts the multi-queue concurrency capability of hardware, and greatly improves the throughput performance of the file system.
In another implementation scheme, when a file is opened based on a plurality of data processing requests and file handles are returned, a corresponding hardware queue may be matched to each file handle. Then, after the file concurrency controller of the file system performs concurrency conflict processing on the plurality of file-handle-based data processing requests and sends at least two file-handle-based target data processing requests to the queue adaptation device, the handle queue adapter in the queue adaptation device may determine the hardware queue corresponding to each target data processing request according to its file handle, and each target data processing request may subsequently be processed in parallel through that hardware queue.
Referring to fig. 4, fig. 4 is a flowchart illustrating a processing procedure of a data processing method according to an embodiment of the present specification, which specifically includes the following steps.
Step 402: process 1, process 2, process 3, and process 4 of the host issue an i/o request for a file to the EXT4 file system running on the host based on the file handle of the file.
Step 404: after receiving the i/o request, the file concurrency controller in the EXT4 file system processes the i/o request and blocks the issuing of mutually exclusive i/o requests; and meanwhile, issuing the non-mutually exclusive i/o request to the handle queue adapter.
Step 406: the handle queue adapter distributes corresponding hardware queues for the non-mutually exclusive i/o requests and issues the non-mutually exclusive i/o requests to the Nvme device driver.
Step 408: the Nvme device driver converts the non-mutually exclusive i/o request into a read-write request aiming at a specific hardware disk, and sends the read-write request to the Nvme SSD controller through a corresponding hardware queue.
Step 410: the Nvme SSD controller performs parallel processing on i/o requests received through the hardware queue for a particular hardware disk.
Specifically, the handle queue adapter may be understood as a handle queue plug-in fig. 4, and the Nvme device driver may be understood as an Nvme device driver in fig. 4.
The data processing method provided by the embodiment of the specification is a file system capable of perceiving and storing multiple queues of hardware, an i/o flow is designed by taking a process file handle as a center, the concurrent capability of the multiple queues of the hardware is fully exerted, and the throughput performance of the file system is greatly improved.
Corresponding to the above method embodiment, the present specification further provides an embodiment of a data processing apparatus, and fig. 5 shows a schematic structural diagram of a data processing apparatus provided in an embodiment of the present specification. As shown in fig. 5, the apparatus includes:
a request receiving module 502, configured to determine a file handle corresponding to each process of a plurality of processes, and issue a plurality of data processing requests for a file based on the file handle, where each data processing request carries a data processing type and a data interval for performing data processing on the file;
a request determination module 504 configured to determine at least two target data processing requests from the plurality of data processing requests based on the data processing type and the data interval;
a request processing module 506 configured to send the at least two target data processing requests to a queue adaptation device, and perform parallel processing on the target data processing requests through the queue adaptation device.
Optionally, the apparatus further comprises:
a handle assignment module configured to:
the file is opened based on file operation requests which are issued by a plurality of processes and aim at the same file, and a file handle corresponding to the file is distributed to each process in the plurality of processes.
Optionally, the data processing type includes a data writing type and a data reading type;
accordingly, the request determination module 504 is further configured to:
determining a data processing request with a data processing type being the data writing type in the plurality of data processing requests as a first data processing request;
determining a data processing request with a data processing type being the data reading type in the plurality of data processing requests as a second data processing request;
at least two target data processing requests are determined based on the first data processing request and the second data processing request.
Optionally, the request determining module 504 is further configured to:
determining a data interval corresponding to the first data processing request and a data interval corresponding to the second data processing request;
and determining the at least two target data processing requests based on the data interval corresponding to the first data processing request and the data interval corresponding to the second data processing request.
Optionally, the request determining module 504 is further configured to:
taking a first data processing request and a second data processing request of which data intervals are not crossed as a first initial data processing request;
taking the first data processing request or the second data processing request with the data section crossed as a second initial data processing request;
taking the first data processing request with the data intervals not crossed in the first data processing request as a third initial data processing request;
taking a first data processing request with data intervals crossed in the first data processing request as a fourth initial data processing request;
wherein the at least two target data processing requests comprise the first initial data processing request, the second initial data processing request, the third initial data processing request, and/or the fourth initial data processing request.
Optionally, the request processing module 506 is further configured to:
and sending the at least two target data processing requests to a queue adapting device, and sending the target data processing requests to corresponding hardware queues through the queue adapting device to finish parallel processing of the at least two target data processing requests.
Optionally, the apparatus further comprises:
a blocking module configured to:
determining a file handle of a process issuing the target data processing request and determining a data interval corresponding to the target data processing request;
determining a read-write interval of a file handle of a process issuing the target data processing request based on a data interval corresponding to the target data processing request;
acquiring a handle interval tree corresponding to the file, and determining a current read-write interval of each file handle in the handle interval tree;
and determining a blocking data processing request from the target data processing request based on the incidence relation between the read-write interval of the file handle of the process issuing the target data processing request and the current read-write interval of the corresponding file handle in the handle interval tree.
Optionally, the blocking module is further configured to:
and under the condition that the read-write interval of the file handle of the process issuing the target data processing request is crossed with the read-write interval of the corresponding file handle in the handle interval tree, determining the target data processing request as a blocking data processing request.
Optionally, the apparatus further comprises:
a data processing module configured to:
and after the parallel processing of other target data processing requests except the blocking data processing request in the target data processing requests is finished, performing parallel data processing on the blocking data processing request.
Optionally, the request processing module 506 is further configured to:
determining the number of the at least two target data processing requests and the number of hardware queues;
determining whether the number of the at least two target data processing requests is less than or equal to the number of hardware queues,
if yes, the queue adapting device sends the processed at least two target data processing requests to corresponding hardware queues in sequence,
and if not, sending the processed at least two target data processing requests to corresponding hardware queues in a polling scheduling mode through the queue adapting device.
The data processing device provided in the embodiments of the present description processes a data processing request with a file handle of a process as a center, converts the data processing request for the same file into a parallel data processing request based on the file handle, fully exerts the hardware multi-queue concurrency capability, and greatly improves the throughput performance of a file system.
The above is a schematic configuration of a data processing apparatus of the present embodiment. It should be noted that the technical solution of the data processing apparatus and the technical solution of the data processing method belong to the same concept, and details that are not described in detail in the technical solution of the data processing apparatus can be referred to the description of the technical solution of the data processing method.
Referring to fig. 6, fig. 6 is a schematic structural diagram of a data processing system according to an embodiment of the present disclosure, where the data processing system includes a file concurrency control device 602 and a queue adaptation device 604, where,
the file concurrency control device 602 is configured to determine a file handle corresponding to each process of a plurality of processes, and issue a plurality of data processing requests of a file based on the file handle, where each data processing request carries a data processing type and a data interval for performing data processing on the file, determine at least two target data processing requests from the plurality of data processing requests based on the data processing type and the data interval, and send the at least two target data processing requests to the queue adaptation device 604;
the queue adapting device 604 is configured to receive the at least two target data processing requests sent by the file concurrency device, and perform parallel processing on the target data processing requests.
The file concurrency control device may be understood as a file concurrency controller in a file system, and the execution subject of the data processing method may be understood as the file concurrency controller. Therefore, details of the file concurrency control device in the data processing system will not be described, and specific operation details can be referred to the data processing method.
Specifically, the queue adapting device includes: the device comprises a handle queue adapter, a hardware device driver and a hardware disk controller; wherein the content of the first and second substances,
the handle queue adapter is configured to receive the at least two target data processing requests sent by the file concurrency control device, call a hardware queue of the hardware disk controller through the hardware device driver, allocate corresponding hardware queues to the at least two target data processing requests, and send the at least two target data processing requests to the hardware device driver;
the hardware device driver is configured to convert the at least two target data processing requests into at least two processing requests for a hardware disk, and send the at least two processing requests for the hardware disk to the hardware disk controller through a hardware queue of the hardware disk controller corresponding to the at least two processing requests for the hardware disk;
the hardware disk controller is configured to process the at least two processing requests for the hardware disk received through the hardware queue in parallel.
In this embodiment of the present specification, after receiving the at least two target data processing requests issued by the file concurrency control device, the handle queue adapter initiates the target data processing requests toward the hardware queues of the hardware disk controller through the hardware device driver; the hardware device driver converts the target data processing requests into read-write requests for the specific hardware disk and sends them to the hardware disk controller through the hardware queues of the hardware disk controller; and the hardware disk controller performs parallel data processing after receiving the read-write requests for the hardware disk sent through the hardware queues. The hardware multi-queue concurrency capability is thereby fully exerted, and the throughput performance of the file system is improved.
The above is a schematic scheme of a data processing system of the present embodiment. It should be noted that the technical solution of the data processing system and some technical solutions of the data processing method belong to the same concept, and details that are not described in detail in the technical solution of the data processing system can be referred to the description of some technical solutions of the data processing method.
FIG. 7 illustrates a block diagram of a computing device 700 provided in accordance with one embodiment of the present description. The components of the computing device 700 include, but are not limited to, memory 710 and a processor 720. Processor 720 is coupled to memory 710 via bus 730, and database 750 is used to store data.
Computing device 700 also includes access device 740, which enables computing device 700 to communicate via one or more networks 760. Examples of such networks include the Public Switched Telephone Network (PSTN), a Local Area Network (LAN), a Wide Area Network (WAN), a Personal Area Network (PAN), or a combination of communication networks such as the internet. Access device 740 may include one or more of any type of network interface, wired or wireless (e.g., a Network Interface Card (NIC)), such as an IEEE 802.11 Wireless Local Area Network (WLAN) wireless interface, a Worldwide Interoperability for Microwave Access (WiMAX) interface, an Ethernet interface, a Universal Serial Bus (USB) interface, a cellular network interface, a Bluetooth interface, a Near Field Communication (NFC) interface, and so forth.
In one embodiment of the present description, the above-described components of computing device 700, as well as other components not shown in FIG. 7, may also be connected to each other, such as by a bus. It should be understood that the block diagram of the computing device architecture shown in FIG. 7 is for purposes of example only and is not limiting as to the scope of the present description. Those skilled in the art may add or replace other components as desired.
Computing device 700 may be any type of stationary or mobile computing device, including a mobile computer or mobile computing device (e.g., tablet, personal digital assistant, laptop, notebook, netbook, etc.), mobile phone (e.g., smartphone), wearable computing device (e.g., smartwatch, smartglasses, etc.), or other type of mobile device, or a stationary computing device such as a desktop computer or PC. Computing device 700 may also be a mobile or stationary server.
The processor 720 is configured to execute computer-executable instructions which, when executed by the processor, implement the steps of the data processing method described above.
The above is an illustrative scheme of a computing device of the present embodiment. It should be noted that the technical solution of the computing device and the technical solution of the data processing method belong to the same concept, and details that are not described in detail in the technical solution of the computing device can be referred to the description of the technical solution of the data processing method.
An embodiment of the present specification further provides a computer-readable storage medium storing computer-executable instructions, which when executed by a processor implement the steps of the data processing method described above.
The above is an illustrative scheme of a computer-readable storage medium of the present embodiment. It should be noted that the technical solution of the storage medium belongs to the same concept as the technical solution of the data processing method, and details that are not described in detail in the technical solution of the storage medium can be referred to the description of the technical solution of the data processing method.
An embodiment of the present specification further provides a computer program, wherein when the computer program is executed in a computer, the computer is caused to execute the steps of the data processing method.
The above is an illustrative scheme of a computer program of the present embodiment. It should be noted that the technical solution of the computer program and the technical solution of the data processing method belong to the same concept, and details that are not described in detail in the technical solution of the computer program can be referred to the description of the technical solution of the data processing method.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
The computer instructions comprise computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content contained in the computer-readable medium may be appropriately increased or decreased as required by legislation and patent practice in a jurisdiction; for example, in some jurisdictions, in accordance with legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunication signals.
It should be noted that, for simplicity of description, the foregoing method embodiments are described as a series of acts; however, those skilled in the art will appreciate that the present embodiment is not limited by the order of the acts described, because some steps may be performed in other orders or simultaneously. Further, those skilled in the art will also appreciate that the embodiments described in this specification are preferred embodiments, and that the acts and modules involved are not necessarily required by an embodiment of the specification.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
The preferred embodiments of the present specification disclosed above are intended only to aid in the description of the specification. Alternative embodiments are not exhaustive and do not limit the invention to the precise embodiments described. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the embodiments and the practical application, to thereby enable others skilled in the art to best understand and utilize the embodiments. The specification is limited only by the claims and their full scope and equivalents.

Claims (14)

1. A method of data processing, comprising:
determining a file handle corresponding to each process in a plurality of processes, and issuing a plurality of data processing requests of a file based on the file handle, wherein each data processing request carries a data processing type and a data interval for performing data processing on the file;
determining at least two target data processing requests from the plurality of data processing requests based on the data processing type and the data interval;
and sending the at least two target data processing requests to a queue adaptation device, and performing parallel processing on the target data processing requests through the queue adaptation device.
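To illustrate the request structure of claim 1, an operation type plus a data interval on the file, here is a minimal sketch; `DataRequest` and its field names are our illustrative choices, not terms from the claims:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DataRequest:
    handle: int  # file handle of the issuing process
    op: str      # data processing type: "read" or "write"
    start: int   # data interval on the file, inclusive start offset
    end: int     # data interval on the file, exclusive end offset

    def overlaps(self, other: "DataRequest") -> bool:
        # Two half-open intervals [start, end) cross iff each one starts
        # before the other one ends.
        return self.start < other.end and other.start < self.end

# Two requests from different handles over disjoint intervals do not cross:
a = DataRequest(handle=1, op="write", start=0, end=4096)
b = DataRequest(handle=2, op="read", start=8192, end=12288)
assert not a.overlaps(b)
```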
2. The data processing method of claim 1, wherein before the determining a file handle corresponding to each process in a plurality of processes, the method further comprises:
the file is opened based on file operation requests which are issued by a plurality of processes and aim at the same file, and a file handle corresponding to the file is distributed to each process in the plurality of processes.
3. The data processing method according to claim 1 or 2, wherein the data processing type comprises a data write type and a data read type;
accordingly, the determining at least two target data processing requests from the plurality of data processing requests based on the data processing type and the data interval comprises:
determining a data processing request with a data processing type being the data writing type in the plurality of data processing requests as a first data processing request;
determining a data processing request with a data processing type being the data reading type in the plurality of data processing requests as a second data processing request;
at least two target data processing requests are determined based on the first data processing request and the second data processing request.
4. The data processing method of claim 3, said determining at least two target data processing requests based on the first data processing request and the second data processing request, comprising:
determining a data interval corresponding to the first data processing request and a data interval corresponding to the second data processing request;
and determining the at least two target data processing requests based on the data interval corresponding to the first data processing request and the data interval corresponding to the second data processing request.
5. The data processing method of claim 4, wherein the determining the at least two target data processing requests based on the data interval corresponding to the first data processing request and the data interval corresponding to the second data processing request comprises:
taking a first data processing request and a second data processing request whose data intervals are not crossed as a first initial data processing request;
taking a first data processing request or a second data processing request whose data interval is crossed as a second initial data processing request;
taking, among the first data processing requests, a first data processing request whose data interval is not crossed with any other first data processing request as a third initial data processing request;
taking, among the first data processing requests, a first data processing request whose data interval is crossed with another first data processing request as a fourth initial data processing request;
wherein the at least two target data processing requests comprise the first initial data processing request, a second initial data processing request, a third initial data processing request, and/or the fourth initial data processing request.
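Read as pairwise interval overlap, which is our interpretation of "crossed" since the claim does not define it precisely, the four groups of claim 5 can be sketched as follows (`classify`, and representing a request as a bare `(start, end)` tuple, are illustrative choices):

```python
def overlaps(a, b):
    # Half-open intervals [start, end) cross iff each starts before the other ends.
    return a[0] < b[1] and b[0] < a[1]

def classify(writes, reads):
    """Sort write requests ("first") and read requests ("second") into the
    four initial groups of claim 5, keyed on interval crossing."""
    g1, g2, g3, g4 = [], [], [], []
    for w in writes:
        # Groups 1/2: does this write cross any read?
        (g2 if any(overlaps(w, r) for r in reads) else g1).append(("write", w))
        # Groups 3/4: does this write cross any *other* write?
        others = [o for o in writes if o is not w]
        (g4 if any(overlaps(w, o) for o in others) else g3).append(w)
    for r in reads:
        (g2 if any(overlaps(r, w) for w in writes) else g1).append(("read", r))
    return g1, g2, g3, g4

groups = classify(writes=[(0, 100), (50, 150)], reads=[(200, 300), (120, 160)])
```

Requests in the non-crossing groups can be dispatched in parallel immediately; crossing ones need ordering, which is what claims 7 to 9 address.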
6. The data processing method according to claim 1, wherein the sending the at least two target data processing requests to a queue adaptation device, and the parallel processing of the at least two target data processing requests by the queue adaptation device comprises:
and sending the at least two target data processing requests to a queue adapting device, and sending the target data processing requests to corresponding hardware queues through the queue adapting device to finish parallel processing of the at least two target data processing requests.
7. The data processing method of claim 1, further comprising, after determining at least two target data processing requests from the plurality of data processing requests based on the data processing type and the data interval:
determining a file handle of a process issuing the target data processing request and determining a data interval corresponding to the target data processing request;
determining a read-write interval of a file handle of a process issuing the target data processing request based on a data interval corresponding to the target data processing request;
acquiring a handle interval tree corresponding to the file, and determining a current read-write interval of each file handle in the handle interval tree;
and determining a blocking data processing request from the target data processing requests based on an association relationship between the read-write interval of the file handle of the process issuing the target data processing request and the current read-write interval of the corresponding file handle in the handle interval tree.
8. The data processing method according to claim 7, wherein the determining a blocking data processing request from the target data processing requests based on an association relationship between the read-write interval of the file handle of the process issuing the target data processing request and the current read-write interval of the corresponding file handle in the handle interval tree comprises:
determining the target data processing request as a blocking data processing request in a case that the read-write interval of the file handle of the process issuing the target data processing request is crossed with the current read-write interval of the corresponding file handle in the handle interval tree.
9. The data processing method of claim 8, after determining the target data processing request as a blocking data processing request, further comprising:
and after the parallel processing of other target data processing requests except the blocking data processing request in the target data processing requests is finished, performing parallel data processing on the blocking data processing request.
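A toy version of the blocking check in claims 7 to 9, using a plain dict of in-flight intervals per file handle in place of the patent's handle interval tree (all names are illustrative; a real implementation would use an interval tree for fast range queries):

```python
class HandleIntervalTable:
    """Tracks the current read-write interval(s) of each file handle."""

    def __init__(self):
        self.inflight = {}  # file handle -> list of in-flight (start, end)

    def is_blocking(self, handle, start, end):
        # Claims 7-8: a target request blocks when its read-write interval
        # crosses an interval currently registered for the corresponding handle.
        return any(start < e and s < end
                   for s, e in self.inflight.get(handle, []))

    def register(self, handle, start, end):
        self.inflight.setdefault(handle, []).append((start, end))

    def release(self, handle, start, end):
        # Per claim 9, a blocked request runs only after the other target
        # requests finish, i.e. after their intervals are released here.
        self.inflight[handle].remove((start, end))

table = HandleIntervalTable()
table.register(1, 0, 4096)                  # handle 1 is writing [0, 4096)
blocked = table.is_blocking(1, 1024, 2048)  # crosses, so it must wait
```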
10. The data processing method of claim 6, wherein the sending, by the queue adaptation device, the at least two target data processing requests to corresponding hardware queues comprises:
determining the number of the at least two target data processing requests and the number of hardware queues;
determining whether the number of the at least two target data processing requests is less than or equal to the number of hardware queues,
if yes, sending, through the queue adaptation device, the at least two target data processing requests to the corresponding hardware queues in sequence;
and if not, sending the at least two target data processing requests to the corresponding hardware queues in a round-robin scheduling manner through the queue adaptation device.
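Claim 10's dispatch rule can be sketched in a few lines; `assign_queues` and its arguments are illustrative names, not terms from the patent:

```python
from itertools import cycle

def assign_queues(requests, num_queues):
    """Map each target request to a hardware queue index: if the requests
    fit, hand them out in sequence; otherwise fall back to round-robin
    scheduling so every queue is reused evenly."""
    if len(requests) <= num_queues:
        return {req: i for i, req in enumerate(requests)}
    rr = cycle(range(num_queues))
    return {req: next(rr) for req in requests}

# Five requests over three hardware queues wrap around: 0, 1, 2, 0, 1
mapping = assign_queues(["r0", "r1", "r2", "r3", "r4"], num_queues=3)
```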
11. A data processing apparatus comprising:
the file processing system comprises a request receiving module, a file processing module and a file processing module, wherein the request receiving module is configured to determine a file handle corresponding to each process in a plurality of processes and issue a plurality of data processing requests of a file based on the file handle, and each data processing request carries a data processing type and a data interval for carrying out data processing on the file;
a request determination module configured to determine at least two target data processing requests from the plurality of data processing requests based on the data processing type and the data interval;
the request processing module is configured to send the at least two target data processing requests to a queue adaptation device, and the target data processing requests are processed in parallel through the queue adaptation device.
12. A data processing system, comprising a file concurrency control device and a queue adaptation device, wherein:
the file concurrency control device is configured to determine a file handle corresponding to each process in a plurality of processes, issue a plurality of data processing requests of a file based on the file handle, wherein each data processing request carries a data processing type and a data interval for performing data processing on the file, determine at least two target data processing requests from the plurality of data processing requests based on the data processing type and the data interval, and send the at least two target data processing requests to a queue adaptation device;
the queue adapting device is configured to receive the at least two target data processing requests sent by the file concurrency device, and perform parallel processing on the target data processing requests.
13. The data processing system of claim 12, wherein the queue adaptation device comprises a handle queue adapter, a hardware device driver, and a hardware disk controller; wherein:
the handle queue adapter is configured to receive the at least two target data processing requests sent by the file concurrency device, call a hardware queue of the hardware disk controller through the hardware device driver, allocate corresponding hardware queues to the at least two target data processing requests, and send the at least two target data processing requests to the hardware device driver;
the hardware device driver is configured to convert the at least two target data processing requests into at least two processing requests for a hardware disk, and send the at least two processing requests for the hardware disk to the hardware disk controller through a hardware queue of the hardware disk controller corresponding to the at least two processing requests for the hardware disk;
the hardware disk controller is configured to process the at least two processing requests for the hardware disk received through the hardware queue in parallel.
14. A computer program which, when executed on a computer, causes the computer to carry out the steps of the data processing method according to any one of claims 1 to 10.
CN202110715843.XA 2021-06-24 2021-06-24 Data processing method and device Pending CN113568736A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110715843.XA CN113568736A (en) 2021-06-24 2021-06-24 Data processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110715843.XA CN113568736A (en) 2021-06-24 2021-06-24 Data processing method and device

Publications (1)

Publication Number Publication Date
CN113568736A true CN113568736A (en) 2021-10-29

Family

ID=78162822

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110715843.XA Pending CN113568736A (en) 2021-06-24 2021-06-24 Data processing method and device

Country Status (1)

Country Link
CN (1) CN113568736A (en)

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101807144A (en) * 2010-03-17 2010-08-18 上海大学 Prospective multi-threaded parallel execution optimization method
CN103559020A (en) * 2013-11-07 2014-02-05 中国科学院软件研究所 Method for realizing parallel compression and parallel decompression on FASTQ file containing DNA (deoxyribonucleic acid) sequence read data
CN103761291A (en) * 2014-01-16 2014-04-30 中国人民解放军国防科学技术大学 Geographical raster data parallel reading-writing method based on request aggregation
CN107273339A (en) * 2017-06-21 2017-10-20 郑州云海信息技术有限公司 A kind of task processing method and device
CN107704194A (en) * 2016-08-08 2018-02-16 北京忆恒创源科技有限公司 Without lock I O process method and its device
WO2019134084A1 (en) * 2018-01-04 2019-07-11 深圳市天软科技开发有限公司 Code execution method and apparatus, terminal device, and computer-readable storage medium
CN110110154A (en) * 2018-02-01 2019-08-09 腾讯科技(深圳)有限公司 A kind of processing method of map file, device and storage medium
CN110162401A (en) * 2019-05-24 2019-08-23 广州中望龙腾软件股份有限公司 The parallel read method of DWG file, electronic equipment and storage medium
CN110442444A (en) * 2019-06-18 2019-11-12 中国科学院计算机网络信息中心 A kind of parallel data access method and system towards mass remote sensing image
CN111625254A (en) * 2020-05-06 2020-09-04 Oppo(重庆)智能科技有限公司 File processing method, device, terminal and storage medium
CN111796948A (en) * 2020-07-02 2020-10-20 长视科技股份有限公司 Shared memory access method and device, computer equipment and storage medium
CN112463306A (en) * 2020-12-03 2021-03-09 南京机敏软件科技有限公司 Method for sharing disk data consistency in virtual machine
CN112559210A (en) * 2020-12-16 2021-03-26 北京仿真中心 Shared resource read-write mutual exclusion method based on RTX real-time system
CN112839099A (en) * 2021-01-29 2021-05-25 苏州浪潮智能科技有限公司 Distributed byte lock detection control method and device


Similar Documents

Publication Publication Date Title
US10268716B2 (en) Enhanced hadoop framework for big-data applications
CN110663019B (en) File system for Shingled Magnetic Recording (SMR)
US11169710B2 (en) Method and apparatus for SSD storage access
EP3718008B1 (en) Provisioning using pre-fetched data in serverless computing environments
CN105893139B (en) Method and device for providing storage service for tenant in cloud storage environment
JP5276218B2 (en) Convert LUNs to files or files to LUNs in real time
US10437481B2 (en) Data access method and related apparatus and system
US9973394B2 (en) Eventual consistency among many clusters including entities in a master member regime
CN109753231A (en) Method key assignments storage equipment and operate it
Alshammari et al. H2hadoop: Improving hadoop performance using the metadata of related jobs
US10298649B2 (en) Guaranteeing stream exclusivity in a multi-tenant environment
US10901621B2 (en) Dual-level storage device reservation
US20200319797A1 (en) System and method for file processing from a block device
US20160140140A1 (en) File classification in a distributed file system
US11157456B2 (en) Replication of data in a distributed file system using an arbiter
US20230055511A1 (en) Optimizing clustered filesystem lock ordering in multi-gateway supported hybrid cloud environment
CN113568736A (en) Data processing method and device
CN104572638A (en) Data reading and writing method and device
Herodotou et al. Trident: task scheduling over tiered storage systems in big data platforms
US11720554B2 (en) Iterative query expansion for document discovery
US8225009B1 (en) Systems and methods for selectively discovering storage devices connected to host computing devices
US20220237694A1 (en) Price Superhighway
KR20210085674A (en) Storage device configured to support multi-streams and operation method thereof
US20240028477A1 (en) Systems and methods for backing up clustered and non-clustered data
US20220092049A1 (en) Workload-driven database reorganization

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40069942

Country of ref document: HK

TA01 Transfer of patent application right

Effective date of registration: 20240301

Address after: # 03-06, Lai Zan Da Building 1, 51 Belarusian Road, Singapore

Applicant after: Alibaba Innovation Co.

Country or region after: Singapore

Address before: Room 01, 45th Floor, AXA Building, 8 Shanton Road

Applicant before: Alibaba Singapore Holdings Ltd.

Country or region before: Singapore