CN116431590A - Data processing method and related equipment - Google Patents

Data processing method and related equipment

Info

Publication number
CN116431590A
CN116431590A CN202111015604.XA CN202111015604A
Authority
CN
China
Prior art keywords
requests
target
parent directory
inode
request
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111015604.XA
Other languages
Chinese (zh)
Inventor
Vijay Panai
Shubham Mamodia
Qin Huadong
Feng Yonggang
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Publication of CN116431590A publication Critical patent/CN116431590A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/176 Support for shared access to files; File sharing support
    • G06F16/1767 Concurrency control, e.g. optimistic or pessimistic approaches
    • G06F16/1774 Locking methods, e.g. locking methods for file systems allowing shared and concurrent access to files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G06F16/162 Delete operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/0643 Management of files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0673 Single storage device
    • G06F3/0674 Disk device
    • G06F3/0676 Magnetic disk device

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the present application disclose a data processing method and related equipment. While a parent directory is in a locked state, a plurality of target requests are acquired, each target request indicating creation or deletion of a child file under the parent directory. In response to the parent directory transitioning from the locked state to the unlocked state, the plurality of target requests are executed in parallel, with the parent directory held in the locked state while they execute. In this way, multiple target requests are executed simultaneously under a single locking of the parent directory, achieving parallel processing of multiple target requests against the same parent directory and improving data processing efficiency.

Description

Data processing method and related equipment
This application claims priority to Indian patent application No. IN202131032931, entitled "A METHOD AND APPARATUS FOR PROCESSING FILE OPERATION REQUEST" and filed with the Indian Patent Office on 22 July 2021, the entire contents of which are incorporated herein by reference.
Technical Field
The embodiments of the present application relate to the field of computer technology, and in particular to a data processing method and related equipment.
Background
A distributed storage system is a system that stores data dispersed across multiple independent storage nodes. It adopts a scalable architecture and shares the storage load among multiple storage nodes, which improves the reliability, availability, and access efficiency of the system and makes it easy to expand.
In the file system of a storage node, the metadata of a file (name, attributes, size, timestamps, etc.) is stored in a directory entry (Dentry) table and an index node (Inode) table. Every file creation or deletion updates the Dentry table and the Inode table on disk. Because the Dentry table and the Inode table are objects shared among the files within the same parent directory, a lock is taken on the file's parent directory to ensure metadata consistency.
If a storage node receives multiple creation or deletion requests for files under the same parent directory at the same time, the parent directory lock forces all the requests to queue and be processed sequentially on that node. That is, each executing request locks the parent directory with an exclusive lock; while the parent directory is locked it cannot be accessed by other requests, which must wait until the current request completes and the parent directory transitions from the locked state to the unlocked state before the next request can proceed. This sequential processing is inefficient and increases operation latency.
Disclosure of Invention
The embodiment of the application provides a data processing method and related equipment, which are used for improving the data processing efficiency.
In a first aspect, an embodiment of the present application provides a data processing method in which a server obtains a plurality of target requests while a parent directory is in a locked state. A locked parent directory indicates that it is currently being accessed by another request; it remains locked until that request completes, and cannot be accessed by other requests in the meantime. Therefore, after acquiring the target requests, the server temporarily does not execute them while the parent directory remains locked. Each target request indicates creation or deletion of a child file under the parent directory.
When the parent directory transitions from the locked state to the unlocked state, it may again be accessed by other requests. In the prior art, each unlocking admits only one request, and while that request accesses the parent directory, the directory remains locked and inaccessible to other requests, so the server can only execute the waiting requests one by one. In this embodiment, once the parent directory transitions from the locked state to the unlocked state, the server executes the acquired target requests in parallel, configuring the parent directory to the locked state while they execute. Thus, even though the parent directory is locked with a mutual-exclusion lock (i.e., it is in the locked state), multiple target requests effectively access it in parallel.
Further, if the server receives additional requests to create or delete child files under the parent directory while the target requests are executing, those requests must wait, because the parent directory is still locked and cannot be accessed. After all the target requests have been executed, the server transitions the parent directory from the locked state to the unlocked state; the parent directory can then be accessed again, and the server can execute the other waiting requests in parallel in the same way as the target requests.
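The queue-while-locked, drain-and-batch flow described above can be illustrated with a minimal single-threaded Python sketch. All class and method names here are illustrative assumptions, not from the patent, and the recorded batch list stands in for actual parallel execution:

```python
from collections import deque

class ParentDirBatcher:
    """Sketch: requests arriving while the parent directory is locked are
    queued; when the directory unlocks, the whole queue is drained and
    executed as one batch under a single re-lock (names illustrative)."""

    def __init__(self):
        self.locked = False
        self.waiting = deque()
        self.batches = []          # records each batch executed together

    def submit(self, request):
        if self.locked:
            self.waiting.append(request)      # wait: directory is busy
        else:
            self._execute_batch([request])

    def unlock(self):
        """The in-flight batch finished: unlock, then drain the queue."""
        self.locked = False
        if self.waiting:
            batch = list(self.waiting)
            self.waiting.clear()
            self._execute_batch(batch)        # all waiters run together

    def _execute_batch(self, batch):
        self.locked = True                    # one lock for the whole batch
        self.batches.append(batch)            # stand-in for parallel work

b = ParentDirBatcher()
b.submit("create a.txt")                      # runs at once, locks the dir
b.submit("create b.txt")                      # queued: dir still locked
b.submit("delete c.txt")                      # queued
b.unlock()                                    # drains both as one batch
```

The second and third requests execute as a single batch under one lock acquisition, rather than each locking and unlocking the directory separately.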
The embodiments of the present application do not limit the specific numbers of deletion requests and creation requests among the plurality of target requests. The target requests may include both creation requests and deletion requests; alternatively, they may all be creation requests, or all be deletion requests, where each deletion request indicates deletion of a child file under the parent directory and each creation request indicates creation of a child file under the parent directory. For example, if the server acquires 10 target requests, they may consist of 5 deletion requests and 5 creation requests; alternatively, all 10 may be creation requests, or all 10 may be deletion requests.
Moreover, in the embodiments of the present application, the target requests may each be initiated by a different client, or multiple target requests may be initiated by the same client; this is not limited here.
In the embodiments of the present application, while the parent directory is in a locked state, a plurality of target requests are acquired, each indicating creation or deletion of a child file under the parent directory; in response to the parent directory transitioning from the locked state to the unlocked state, the target requests are executed in parallel, with the parent directory in the locked state while they execute. In this way, multiple target requests are executed simultaneously under a single locking of the parent directory, achieving parallel processing of multiple target requests under the same parent directory and improving data processing efficiency.
Based on the first aspect, in an optional implementation, after the server receives the plurality of target requests, it may store them in a target queue to wait, because the parent directory is locked and cannot be accessed at that time. After the parent directory transitions from the locked state to the unlocked state, the server may retrieve the target requests from the target queue and then execute them in parallel.
Based on the first aspect, in an optional implementation, the target queue may be a semaphore queue, i.e., the server may store the target requests in a semaphore queue to wait. The semaphore queue uses the parent directory as its shared resource in order to manage access rights to the parent directory, and is configured to emit semaphore information when the parent directory transitions from the locked state to the unlocked state. Therefore, in this embodiment, when the first request completes and the parent directory (the shared resource) transitions from the locked state to the unlocked state, the server receives the semaphore information and cancels the waiting state of the target requests in the semaphore queue; that is, the server may retrieve the target requests from the semaphore queue and then execute them in parallel.
A semaphore is a synchronization tool used in multi-threaded environments to coordinate threads so that they use shared resources correctly and reasonably. Before entering a critical code section, a thread must acquire the semaphore, and once the critical section has executed (equivalent to the completion of the first request in this application), the thread must release the semaphore. For example, when a target thread needs to execute a critical code section but no semaphore permit is available, it must wait until another thread releases the semaphore before it can use the resource and execute the critical section.
In the embodiments of the present application, queuing the target requests in a semaphore queue makes them easier to process once the parent directory is unlocked, improving the implementation efficiency of the scheme.
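The semaphore-queue behavior described above can be sketched with Python's `threading.Semaphore` as a stand-in (all names are illustrative): a semaphore created with zero permits models the locked parent directory, and releasing n permits at unlock models the semaphore information releasing every waiting target request at once.

```python
import threading

executed = []
executed_lock = threading.Lock()
sem = threading.Semaphore(0)       # 0 permits: parent directory locked

def target_request(name):
    sem.acquire()                  # wait in the semaphore queue
    with executed_lock:
        executed.append(name)      # stand-in for create/delete work

names = ["req-a", "req-b", "req-c"]
threads = [threading.Thread(target=target_request, args=(n,)) for n in names]
for t in threads:
    t.start()                      # all three block on sem.acquire()

# The first request finishes: the parent directory unlocks, and the
# "semaphore information" releases every waiter at once so they can
# proceed in parallel.
sem.release(len(names))            # releasing n permits needs Python 3.9+
for t in threads:
    t.join()
```

Releasing all permits in one call is what distinguishes this scheme from the prior-art behavior, where each unlock would admit only a single waiter.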
Based on the first aspect, in an optional implementation, after the server has executed all the acquired target requests, it may transition the parent directory from the locked state to the unlocked state so that subsequent requests can access the parent directory.
Based on the first aspect, in an optional implementation, after each child file is created or deleted in the file system, the metadata of its parent directory must be updated. Because the prior art queues requests and processes them one by one, each time the server executes a creation or deletion request it must update the parent directory's metadata once for that request. Executing multiple requests therefore requires the same number of metadata update operations, resulting in excessive disk reads and writes.
The metadata of the parent directory includes attribute information about the child files under it, such as their access time and update count. This information is common to the child files under the parent directory and is also called the common keys in the parent directory's metadata. Because this embodiment processes the target requests in parallel, the metadata need not be updated once per executed target request. Specifically, the server may determine the update information corresponding to each target request, obtaining a plurality of update information items for the plurality of target requests, and then accumulate them into target update information, e.g., accumulating the file count of the current parallel batch and its modification time; the server then updates the parent directory's metadata once according to the target update information.
In this way, the metadata of the parent directory is updated only once while the plurality of target requests execute in parallel, reducing the amount of data processing.
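The accumulate-then-apply idea above can be sketched as follows. The request representation and metadata field names are illustrative assumptions; the point is that per-request deltas are folded into one update, which is applied to the parent directory's metadata in a single write:

```python
def accumulate_updates(ops, now):
    """Fold the per-request metadata deltas of one parallel batch into a
    single update for the parent directory (field names illustrative)."""
    # Each create adds one child file, each delete removes one.
    delta = sum(1 if op == "create" else -1 for op in ops)
    return {"child_count_delta": delta, "mtime": now}

parent_meta = {"child_count": 10, "mtime": 100}
batch = ["create", "create", "delete", "create"]
update = accumulate_updates(batch, now=200)

# One metadata write for the whole batch instead of one per request:
parent_meta["child_count"] += update["child_count_delta"]
parent_meta["mtime"] = update["mtime"]
```

For a batch of four requests this replaces four separate metadata updates with one, and the saving grows with the batch size.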
Based on the first aspect, in an optional implementation, the plurality of target requests may include a plurality of deletion requests, each indicating deletion of a child file under the same parent directory and each corresponding to a first inode (i.e., the inode of the child file to be deleted). In the file system, deleting a child file actually means deactivating that file's inode.
An inode table contains multiple inode pages, and each subfile's inode is stored in a corresponding inode page. An entire inode page is the smallest unit of reading, so reading an inode reads the whole page at once. Because the prior art queues requests, every deletion of a child file requires reading that file's inode page from disk and then deactivating the file's inode within the page. If 10 requests delete child files under the same parent directory, the server must read the corresponding inode page from disk for each deletion request and deactivate the matching inode, so the 10 deletion requests repeatedly read and write 10 inode pages, resulting in excessive disk reads and writes.
In this embodiment, because the deletion requests are processed in parallel, all the first inode pages containing the first inodes of the deletion requests can be read in parallel, after which the first inodes in each first inode page are deactivated.
In this way, during the parallel execution of multiple deletion requests, a single read pass over the disk retrieves all the inode pages containing the first inodes of all the deletion requests, reducing the number of disk reads and writes.
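The saving comes from reading each distinct inode page once for the whole batch. A small sketch, assuming a simple sequential inode-number-to-page layout and an illustrative page capacity (neither is specified by the patent):

```python
INODES_PER_PAGE = 30    # illustrative page capacity, not from the patent

def pages_for_deletes(inode_numbers):
    """Return the distinct inode pages that must be read to deactivate
    every inode in the batch; each page is read once, not once per
    deletion request (assumes inode i lives on page i // page_size)."""
    return sorted({ino // INODES_PER_PAGE for ino in inode_numbers})

# 10 deletion requests whose first inodes happen to fall on two pages:
first_inodes = [3, 7, 12, 15, 18, 33, 35, 40, 51, 58]
pages = pages_for_deletes(first_inodes)   # 2 page reads instead of 10
```

Sequential processing would read one page per request (10 reads here); batching collapses that to one read per distinct page.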
Based on the first aspect, in an optional implementation, the plurality of target requests may include a plurality of creation requests, each indicating creation of a child file under the same parent directory and each corresponding to a second inode (i.e., the inode of the child file to be created). In the file system, creating a child file naturally requires writing a new inode (the second inode) to disk.
An inode table contains multiple inode pages, and the inode of each newly created subfile must be written to a free inode page in the inode table (the second inode page in this application). An entire inode page is the smallest unit of reading, so reading an inode reads the whole page at once. Because the prior art queues requests, every creation of a child file requires reading a free inode page of the inode table from disk and then writing the new inode into it. If 10 requests create child files under the same parent directory, the server must read a free inode page from disk and write the new inode into it for each creation request, so the 10 creation requests repeatedly read and write 10 free inode pages, resulting in excessive disk reads and writes.
Since one inode page can hold multiple inodes, a free inode page in the inode table can generally accommodate writes of several new inodes. Moreover, the inodes within the inode pages are written sequentially: a subsequent inode page begins receiving new inodes only once the previous page is full. In this embodiment, because multiple creation requests are processed in parallel, multiple new inodes (the second inodes in this application) need to be written to free inode pages (the second inode pages). It is therefore necessary to determine, from the number of second inodes, which free pages they should be written to; that is, the second inode pages are determined according to the second inodes, the second inode pages being the free pages in the inode table used for writing the second inodes. After the second inode pages are determined, they are read, and the second inodes are then written into them.
For example, assume one inode page can hold 30 inodes and currently has 5 free slots available for new inodes. If 20 new inodes need to be written, 5 of them can be scheduled into the 5 free slots of that page and the remaining 15 written to the next free inode page; writing the 20 new inodes therefore requires reading 2 second inode pages (free inode pages) in total. As another example, if one free inode page in the inode table is sufficient to hold all the inodes to be created, only that free page needs to be read to perform the writes.
In this way, during the parallel execution of multiple creation requests, the free inode pages need to be read from disk only once, all the second inodes corresponding to all the creation requests can be written to them, and the number of disk reads and writes is reduced.
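The page-planning arithmetic in the worked example above can be sketched as a small function. The page capacity of 30 is the example's illustrative figure, and the function name is an assumption of this sketch:

```python
INODES_PER_PAGE = 30    # illustrative capacity, from the worked example

def free_pages_needed(num_new_inodes, free_slots_in_partial_page):
    """How many free inode pages a batch of creation requests must read
    so every new inode can be written, filling the partially used page
    first (inodes are written sequentially, as described above)."""
    if num_new_inodes <= 0:
        return 0
    pages = 0
    remaining = num_new_inodes
    if free_slots_in_partial_page > 0:
        pages += 1                          # the partially used page
        remaining -= min(remaining, free_slots_in_partial_page)
    while remaining > 0:
        pages += 1                          # a fully free inode page
        remaining -= min(remaining, INODES_PER_PAGE)
    return pages
```

With 20 new inodes and 5 free slots in the current page, this yields 2 page reads, matching the worked example; a sequential scheme would instead perform one page read per creation request.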
Based on the first aspect, in an optional implementation, the server receives a first request before acquiring the plurality of target requests, the first request indicating creation or deletion of a child file under the parent directory. The server may execute the first request directly when the parent directory is in the unlocked state, and the parent directory is in the locked state while the first request executes. Consequently, when the server later receives the plurality of target requests, the parent directory is locked, so the target requests temporarily cannot access the parent directory and cannot yet be executed; they wait instead.
Based on the first aspect, in an alternative implementation manner, after the execution of the first request is completed, the parent directory is converted from the locked state to the unlocked state, and then the parent directory can be accessed.
In a second aspect, an embodiment of the present application provides a data processing apparatus, including:
An obtaining unit, configured to obtain, when the parent directory is in a locked state, a plurality of target requests, each of which is used to indicate creation or deletion of a child file under the parent directory;
and an execution unit, configured to execute the plurality of target requests in parallel in response to the parent directory transitioning from the locked state to the unlocked state, the parent directory being in the locked state while the plurality of target requests are executed.
Based on the second aspect, in an optional implementation manner, the acquiring unit is specifically configured to:
receiving a plurality of target requests;
storing the plurality of target requests to a target queue;
the acquisition unit is further used for acquiring a plurality of target requests from the target queue.
Based on the second aspect, in an optional implementation manner, the target queue is a semaphore queue, and the semaphore queue is configured to issue semaphore information if the parent directory is converted from a locked state to an unlocked state;
and the acquisition unit is specifically used for acquiring a plurality of target requests from the semaphore queue in response to the received semaphore information.
Based on the second aspect, in an optional implementation manner, the data processing apparatus further includes:
and the conversion unit is used for converting the parent directory from the locking state to the unlocking state in response to completion of parallel execution of the plurality of target requests.
Based on the second aspect, in an optional implementation manner, the execution unit is specifically configured to:
determining the update information corresponding to each target request to obtain a plurality of update information corresponding to a plurality of target requests;
determining target update information according to the plurality of update information;
and updating the metadata of the parent directory according to the target update information.
Based on the second aspect, in an optional implementation manner, the plurality of target requests include a plurality of deletion requests, each deletion request is used for indicating deletion of a child file under a parent directory, each deletion request corresponds to one first inode, and the execution unit is specifically configured to:
reading an inode page of each first inode in parallel to obtain the first inode page;
the first inode in the first inode page is deactivated.
Based on the second aspect, in an optional implementation manner, the plurality of target requests include a plurality of creation requests, each creation request is used for indicating creation of a child file under the parent directory, and the execution unit is specifically configured to:
determining a plurality of second inodes according to the plurality of creation requests;
acquiring second inode pages corresponding to a plurality of second inodes, wherein the second inode pages are idle inode pages in an inode table;
and writing the second inode into the second inode page.
Based on the second aspect, in an optional implementation manner, the obtaining unit is further configured to obtain a first request, where the first request is used to indicate creation or deletion of a child file under the parent directory;
and the execution unit is also used for responding to the fact that the parent directory is in an unlocking state, executing the first request, and when the first request is executed, the parent directory is in a locking state.
Based on the second aspect, in an optional implementation manner, the conversion unit is further configured to convert the parent directory from the locked state to the unlocked state in response to completing the execution of the first request.
In a third aspect, an embodiment of the present invention provides a computer device including a memory, a communication interface, and a processor coupled to the memory and the communication interface; the memory is used for storing instructions, the processor is used for executing the instructions, and the communication interface is used for communicating with other devices under the control of the processor; wherein the processor, when executing the instructions, performs the method of data processing as described in any of the above aspects.
In a fourth aspect, embodiments of the present application provide a computer readable storage medium having a computer program stored therein, which when run on a computer, causes the computer to perform the method of data processing according to any of the above aspects.
In a fifth aspect, embodiments of the present application provide a computer program product or computer program comprising computer instructions which, when run on a computer, cause the computer to perform the method of data processing of any of the above aspects.
In a sixth aspect, embodiments of the present application provide a chip system, which includes a processor for implementing the functions involved in the above aspects, for example, transmitting or processing data and/or information involved in the above method. In one possible design, the chip system further includes a memory for holding program instructions and data necessary for the server or the communication device. The chip system can be composed of chips, and can also comprise chips and other discrete devices.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present application, and that other drawings may be obtained according to the provided drawings without inventive effort to a person skilled in the art.
FIG. 1 is a schematic diagram of a scenario in which a server receives data processing requests from a plurality of clients;
FIG. 2 is a flow chart of a process of a server for a request for creation or deletion of a plurality of files under the same parent directory in the prior art;
FIG. 3 is a schematic diagram of a scenario in which a server queues multiple requests in order;
FIG. 4 is a schematic flow chart of a data processing method in an embodiment of the present application;
FIG. 5 is a flow chart of parallel execution of multiple target requests according to an embodiment of the present application;
FIG. 6 is a schematic diagram of a scenario in which a server executes multiple target requests in parallel;
FIG. 7 is a schematic diagram of a scenario in which multiple inode pages are read in parallel in an embodiment of the present application;
FIG. 8 is a schematic diagram of a scenario in which multiple inodes are written in parallel in an embodiment of the present application;
FIG. 9 is a schematic diagram of a data processing apparatus according to an embodiment of the present application;
fig. 10 is a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
The embodiment of the application provides a data processing method and related equipment, which are used for improving the data processing efficiency.
Embodiments of the present invention will be described below with reference to the accompanying drawings in the embodiments of the present invention. The terminology used in the description of the embodiments of the invention herein is for the purpose of describing particular embodiments of the invention only and is not intended to be limiting of the invention. As one of ordinary skill in the art can appreciate, with the development of technology and the appearance of new scenes, the technical solutions provided in the embodiments of the present application are applicable to similar technical problems.
The terms "first," "second," "third," "fourth" and the like in the description and in the claims and in the above drawings, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented, for example, in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
With the rapid development of big data, corresponding business demands are also increasing: massive data grows explosively, its volume is huge, and it is updated frequently. Therefore, a storage system (e.g., a server) storing a large amount of service data often faces data processing requests from a plurality of clients, such as data creation requests, data deletion requests, or data update requests.
Referring to fig. 1, fig. 1 is a schematic view of a scenario in which a server receives data processing requests from a plurality of clients. As shown in fig. 1, a plurality of clients are shown; a client may be a computer device, and in an actual scenario more or fewer clients may participate in the data processing process, the specific number depending on the actual scenario and not being limited herein. The server in fig. 1 may serve as an independent storage system, or may be one of the storage nodes in a distributed storage system. When the server serves as a storage node, the metadata of each file (name, attributes, size, time, and the like) is stored in a directory entry (Dentry) table and an index node (Inode) table in the file system. Each file creation or deletion updates the Dentry table and the Inode table on disk. Since the Dentry table and the Inode table are objects shared between files within the same parent directory, a parent directory lock is taken on the parent directory of the file to ensure metadata consistency.
Referring to fig. 2, fig. 2 is a flowchart of how a server in the prior art processes requests for creating or deleting a plurality of files under the same parent directory. As shown in fig. 2, the server (storage node) faces a plurality of creation or deletion requests for files under the same parent directory, namely request 1, request 2, and request 3. Because of the parent directory lock, all of these requests are queued on the storage node in order. This sequential processing is inefficient and increases operating delay. That is, since the server received request 1 first, the server processes request 1 first. During the processing of request 1, the parent directory is locked, so it cannot be accessed by request 2 or request 3, which remain in a waiting state. After request 1 is processed, the parent directory is unlocked, and the server executes request 2, because request 2 was received before request 3. Similarly, during the processing of request 2, the parent directory remains locked, so it cannot be accessed by request 3, which stays in a waiting state. The parent directory is unlocked after request 2 is processed, and the server finally executes request 3.
For ease of understanding, referring to fig. 3, fig. 3 is a schematic diagram of a scenario in which a server queues a plurality of requests in order. As shown in fig. 3, in the sequential-queuing flow, the server can process only one request at a time: on one hand, the processing efficiency is low and the operation delay increases; on the other hand, since each request is processed separately, the metadata under the parent directory (the Inode table and the Dentry table) is updated during the processing of each request, which results in high CPU utilization, more disk writes, and heavy occupation of the server's computing resources.
In view of this, the present application provides a method for data processing for improving the efficiency of data processing. Referring to fig. 4, fig. 4 is a flow chart of a data processing method in an embodiment of the present application, as shown in fig. 4, the method for processing data in an embodiment of the present application includes:
101. Processing the first request;
in the file system of the server, when a request initiates creation or deletion of a child file under a certain parent directory, the server needs to update the metadata of the child file and the metadata of the parent directory. While executing the request, the server locks the parent directory with a mutual exclusion lock, which avoids the metadata under the parent directory becoming inconsistent if other requests were to create or delete child files under the parent directory during execution. It should be noted that the data processing method provided in the embodiments of the present application applies to creation or deletion of child files under the same parent directory. For convenience of description, the requests mentioned in the embodiments of the present application are all directed to creation or deletion of files under the same parent directory. Requests that only modify or update the byte contents of a file do not involve operations on the parent directory's metadata, so no mutual exclusion lock on the parent directory is needed during their execution, and the data processing method provided in the present application is not required for them.
The server in the embodiment of the present application may be an independent physical server, or may be a server cluster formed by a plurality of physical servers or one of storage nodes in a distributed system. The server, serving as a storage node for the service data, may receive a plurality of requests from a plurality of clients.
The server acts as a storage node, and any file in its file system is separated into data and metadata. Data refers to the actual contents of the file, while metadata refers to system data used to describe the characteristics of a file, such as access rights, file owner, and the distribution information of the file's data blocks. The distribution information includes the location of the file on the disk and the location of the disk in the cluster. To manipulate a file, a user must first obtain its metadata in order to locate the file and obtain its content or related attributes. Metadata includes index nodes (inodes) and directory entries (dentries), which record a file's meta-information and the directory hierarchy, respectively.
An inode records the meta-information of a file, such as the inode number, file size, access rights, creation time, modification time, and the location of the file's data on disk. The inode is the unique identifier of a file; inodes and files are in one-to-one correspondence. Inodes are also stored on the hard disk, so they occupy disk space. However, the inode of a file does not record the file's name; the name is stored in a directory entry.
A directory entry (dentry) records the name of a file, a pointer to its inode, and its hierarchical associations with other directory entries. Multiple associated directory entries form a directory structure. Unlike an inode, however, a directory entry is a data structure maintained by the kernel; it is not stored on disk but cached in memory.
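The inode/dentry split described above can be sketched as a pair of data structures. This is a minimal illustrative model only; the field names and types are assumptions for exposition, not the layout used by any real file system or by the method of this application.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Inode:
    ino: int                 # inode number (unique identifier of the file)
    size: int = 0            # file size in bytes
    mode: int = 0o644        # access rights
    ctime: float = 0.0       # creation time
    mtime: float = 0.0       # modification time
    block_locs: list = field(default_factory=list)  # data locations on disk
    # Note: the inode deliberately has no "name" field.

@dataclass
class Dentry:
    name: str                # file name lives here, not in the inode
    ino: int                 # pointer to the inode
    parent: Optional["Dentry"] = None  # link into the directory hierarchy

# A dentry maps a name to an inode; the inode holds the remaining metadata.
root = Dentry(name="/", ino=2)
f = Dentry(name="a.txt", ino=100, parent=root)
```

The dentry carries only the name and hierarchy links, mirroring the text: deleting or creating a file touches both structures, which is why the parent directory must be locked.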
After the server receives the first request, if the parent directory of the file operated on by the first request is in an unlocked state (i.e., the parent directory is not being accessed), the server may directly execute the first request; during execution of the first request, the parent directory is in a locked state. If other requests (i.e., the target requests in the present application) initiate creation or deletion of child files under the same parent directory, those requests cannot be executed for the time being and remain in a waiting state until the first request is processed.
102. Acquiring a plurality of target requests;
as seen in step 101, the parent directory remains in a locked state until processing of the first request is completed, and it cannot be accessed by other requests during this period. Thus, after acquiring a plurality of target requests while the parent directory is in the locked state, the server temporarily does not execute them, where each target request is used to indicate creation or deletion of a child file under the parent directory.
In the embodiment of the present application, the specific number of deletion requests and creation requests in the plurality of target requests is not limited. Among the plurality of target requests, a plurality of creation requests and a plurality of deletion requests can be included; alternatively, the plurality of target requests may all be create requests; alternatively, the plurality of target requests may all be delete requests; wherein each delete request is for indicating deletion of a child file under the parent directory, and each create request is for indicating creation of a child file under the parent directory. By way of example, assuming that there are 10 target requests acquired by the server, the 10 target requests may consist of 5 delete requests and 5 create requests; alternatively, the 10 target requests may all be create requests; alternatively, all of the 10 target requests may be delete requests.
On the other hand, in the embodiment of the present application, each target request may be initiated by a different client, but multiple target requests may also be initiated by the same client, which is not limited herein.
Further, after the server receives the target requests, the server may store the target requests in the target queue to wait because the parent directory is in the locked state and cannot be accessed. After the execution of the first request in step 101 is completed, the parent directory will transition from the locked state to the unlocked state, and the server may obtain the plurality of target requests from the target queue, and then execute step 103.
Further, the target queue may be a semaphore queue, i.e. the server may store the target requests to the semaphore queue for waiting. The semaphore queue uses the parent directory as a shared resource of the semaphore queue so as to manage the access authority of the parent directory, and the semaphore queue is configured to send out semaphore information when the parent directory is converted from a locking state to an unlocking state. Therefore, in the embodiment of the present application, when the execution of the first request is completed and the parent directory as the shared resource is converted from the locked state to the unlocked state, the server receives the semaphore information, so as to cancel the waiting state of the plurality of target requests in the semaphore queue, that is, the server may obtain the plurality of target requests from the semaphore queue, and then step 103 is performed.
A semaphore is a tool used in a multi-threaded environment that is responsible for coordinating threads to ensure that they use shared resources correctly and reasonably. Before entering a critical code segment, a thread must acquire a semaphore, and once the critical code segment has been executed (equivalent to the completion of the first request in this application), the thread must release the semaphore. For example, when a target thread needs to execute a critical code segment but no semaphore is available, it must wait until another thread releases the semaphore; until then, the target thread cannot execute the critical code segment that uses the resource.
In the embodiment of the application, the target requests can be processed more conveniently after the parent directory is unlocked by configuring the target requests as the semaphore queues, so that the implementation efficiency of the scheme is improved.
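The queue-then-drain behavior described in steps 101 and 102 can be sketched as follows. This is an illustrative model only: it uses a condition variable in place of a real semaphore queue, and the class and method names are assumptions, not the application's implementation.

```python
import threading

class ParentDirQueue:
    """Illustrative wait queue for one parent directory.

    Requests arriving while the directory is locked are queued; when the
    running request (or batch) finishes, the whole queue is drained as the
    next parallel batch.
    """
    def __init__(self):
        self._cond = threading.Condition()
        self._locked = False
        self._waiting = []            # target requests queued while locked

    def submit(self, request):
        with self._cond:
            if not self._locked:      # directory free: run immediately
                self._locked = True
                return [request]
            self._waiting.append(request)
            return []                 # caller must wait

    def unlock_and_drain(self):
        """Called when the running batch finishes; returns the next batch."""
        with self._cond:
            batch, self._waiting = self._waiting, []
            self._locked = bool(batch)   # re-lock only if a new batch runs
            self._cond.notify_all()      # analogous to the semaphore information
            return batch

q = ParentDirQueue()
first_batch = q.submit("req1")        # directory was free → runs at once
q.submit("req2"); q.submit("req3")    # queued while the directory is locked
next_batch = q.unlock_and_drain()     # request 1 done → drain the batch
```

Note that all queued requests are released together as one batch, rather than one at a time as in the prior-art flow of fig. 2.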
103. Processing a plurality of target requests in parallel;
as introduced in step 102, after the first request is completed, the parent directory transitions from the locked state to the unlocked state, at which point it may be accessed by other requests. In the prior art, each time the parent directory is unlocked, it can be accessed by only one request; while that request accesses the parent directory, the directory remains locked and cannot be accessed by others, so the server can only execute the multiple waiting requests one by one. In the embodiments of the present application, once the parent directory transitions from the locked state to the unlocked state, the server may schedule the multiple target requests to execute simultaneously within a single locked period of the parent directory. That is, the server may execute the multiple target requests of step 102 in parallel, and during their execution the server configures the parent directory to the locked state. Therefore, in the embodiments of the present application, even while the parent directory is locked by a mutual exclusion lock (i.e., in the locked state), the effect of multiple target requests accessing the parent directory in parallel is achieved.
For example, referring to fig. 5, fig. 5 is a schematic flow chart of parallel execution of multiple target requests in an embodiment of the present application. As shown in fig. 5, the server receives a total of 3 target requests (request 1, request 2, and request 3). The server receives request 1 first; the parent directory corresponding to the request is in an unlocked state at that moment and can therefore be accessed, so request 1 is executed directly by the server, with the parent directory in a locked state during execution. When the server then receives request 2 and request 3, they cannot access the parent directory because it is locked, so request 2 and request 3 are placed in the waiting queue (the semaphore queue). After request 1 is executed, the parent directory is unlocked; request 2 and request 3 may then be executed in parallel, and the parent directory is locked again.
For ease of understanding, referring to fig. 6, fig. 6 is a schematic diagram of a scenario in which a server executes multiple target requests in parallel. As shown in fig. 6, after the server processes the request 1, the subsequent requests can be processed in parallel, so that the efficiency of data processing is greatly improved.
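The parallel execution of a drained batch can be sketched with a thread pool. This is a minimal sketch under stated assumptions: the function name and the use of `ThreadPoolExecutor` are illustrative choices, not the application's actual scheduler.

```python
from concurrent.futures import ThreadPoolExecutor

def execute_batch(requests, apply_fn):
    """Execute one batch of target requests in parallel.

    The parent directory is held locked for the whole batch, so no
    per-request parent-directory lock is taken inside apply_fn.
    """
    if not requests:
        return []
    with ThreadPoolExecutor(max_workers=len(requests)) as pool:
        # pool.map preserves input order in its results
        return list(pool.map(apply_fn, requests))

results = execute_batch(["req2", "req3"], lambda r: f"done:{r}")
```

Compared with fig. 3, the batch completes in roughly the time of one request rather than the sum of all of them.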
Further, during the execution of the multiple target requests, if the server continues to receive requests for creating or deleting child files under the parent directory, the parent directory is still in the locked state and cannot be accessed, so the server makes those requests wait in the manner of step 102. After all the target requests are executed, the server converts the parent directory from the locked state to the unlocked state; the parent directory can then be accessed again, and the server can continue to execute the other waiting requests in parallel in a manner similar to the way the target requests were executed.
In the embodiment of the application, under the condition that the parent directory is in a locking state, a plurality of target requests are acquired, and each target request is used for indicating creation or deletion of a child file under the parent directory; in response to the parent directory transitioning from the locked state to the unlocked state, the plurality of target requests are executed in parallel, and the parent directory is in the locked state when the plurality of target requests are executed. By the method, a plurality of target requests can be executed in the locking state of the parent directory at the same time, parallel processing of the plurality of target requests under the same parent directory is achieved, and data processing efficiency is improved.
Specifically, a target request may be a delete request, where the delete request indicates deletion of a child file under the parent directory, or a create request, where the create request indicates creation of a child file under the parent directory. Since delete requests and create requests do not process metadata in the same manner, a delete request and a create request cannot be executed in parallel with each other when executing multiple target requests in parallel. Therefore, when the plurality of target requests includes both delete requests and create requests, the delete requests may be executed in parallel first and then the create requests in parallel; alternatively, the create requests may be executed in parallel first and then the delete requests in parallel.
The handling of delete requests and of create requests in the embodiments of the present application differs; each is described below.
Delete requests: in the embodiments of the present application, each delete request indicates deletion of a child file under the same parent directory, and each delete request corresponds to a first inode (i.e., the inode corresponding to the child file to be deleted). In the file system, deleting a child file actually deactivates the inode corresponding to that child file.
An inode table contains a plurality of inode pages, and the inode of each child file is stored in a corresponding inode page. When an inode is read, the entire inode page is read at once, the inode page being the smallest unit of reading. Because the prior art queues requests in order, every time a child file is deleted, the inode page of that child file must be read from the disk, and the inode of the child file in that page is then deactivated. Assuming 10 requests each need to delete a child file under the same parent directory, then every time the server executes a delete request, it must read the corresponding inode page from the disk and find the inode corresponding to that delete request in the page to deactivate it; the 10 delete requests therefore require repeatedly reading and writing 10 inode pages on the disk, resulting in excessive disk reads and writes.
In this embodiment of the present application, since a plurality of deletion requests are processed in parallel, all first inode pages where the first inodes corresponding to each deletion request are located may be read in parallel. For example, referring to fig. 7, fig. 7 is a schematic diagram of a scenario in which a plurality of inode pages are read in parallel in an embodiment of the present application. As shown in fig. 7, suppose that there are 3 deletion requests, deletion of file 1, file 2, and file 3 under the same parent directory is required, respectively. Whereas file 1 is on the 1 st inode page in FIG. 7, file 2 is on the 2 nd inode page in FIG. 7, and file 3 is on the 3 rd inode page in FIG. 7. The server may read the 1 st inode page, the 2 nd inode page, and the 3 rd inode page from disk at a time.
After the first inode pages in which the first inodes reside are read out in parallel, the first inodes in the respective first inode pages may be deactivated. In this way, while executing multiple delete requests in parallel, only one read pass to the disk is needed to fetch the inode pages of all the first inodes corresponding to all the delete requests, reducing the number of disk reads and writes.
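The batched-read idea can be sketched as grouping the inodes to be deactivated by page, so each needed page is fetched once for the whole batch. This is illustrative only: the page-layout formula (`page = ino // page_size`) and page size are assumptions, not the file system's actual layout.

```python
def batch_deactivate(delete_inos, page_size=32):
    """Group inode numbers to delete by inode page.

    Returns {page index: slots to deactivate within that page}. All pages
    in the result can then be read from disk in one parallel pass, instead
    of one read per delete request.
    """
    pages = {}
    for ino in delete_inos:
        pages.setdefault(ino // page_size, []).append(ino % page_size)
    return pages

# Four deletes touch only three pages; each page is read exactly once.
pages = batch_deactivate([5, 7, 40, 70])
```

In the sequential prior-art flow, the same four deletes would trigger four separate page reads even when two inodes share a page.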
Create requests: in the embodiments of the present application, each create request indicates creation of a child file under the same parent directory, and each create request corresponds to a second inode (i.e., the inode corresponding to the child file to be created). In the file system, when a child file is created, a new inode (i.e., a second inode) naturally needs to be written to the disk.
An inode table contains multiple inode pages, and the inode of each newly created child file must be written to a free inode page (i.e., the second inode page in the present application) in the inode table. When an inode is read, the entire inode page is read at once, the inode page being the smallest unit of reading. Because the prior art queues requests in order, every time a child file is created, a free inode page in the inode table must be read from the disk, and the newly built inode is then written into that free page. Assuming 10 requests each need to create a child file under the same parent directory, then every time the server executes a create request, it must read a free inode page in the inode table from the disk and write the newly built inode into it; the 10 create requests therefore require repeatedly reading and writing 10 free inode pages on the disk, resulting in excessive disk reads and writes.
Since one inode page can hold multiple inodes, a free inode page in the inode table can generally accommodate the writes of multiple newly built inodes. Furthermore, inodes within the inode pages are written in sequence: a subsequent inode page starts receiving new inodes only when the previous inode page is full. In the embodiments of the present application, since multiple create requests are processed in parallel, multiple newly built inodes (i.e., the second inodes in the present application) need to be written to free inode pages (i.e., the second inode pages in the present application). Therefore, it is necessary to determine, according to the number of second inodes, which free inode pages these second inodes should be written to; that is, the second inode pages are determined according to the second inodes, where the second inode pages are the free pages in the inode table used for writing the second inodes. After the second inode pages are determined, they are read, and the second inodes are then written into them.
By way of example, assume that one inode page can accommodate 30 inodes, with 5 more free slots available for new inodes to write. If there are 20 newly built inodes to be written at this time, 5 inodes may be scheduled to be written to 5 free slots in the above-mentioned inode page, and the remaining 15 inodes may be written to the next free inode page. Therefore, in the writing process of 20 newly built inodes, 2 second inode pages (free inode pages) need to be read in total. For another example, if one free inode page in the inode table is sufficient to accommodate all the inodes to be newly created, then only the free inode page needs to be read to perform the writing of the inode.
In this way, while executing multiple create requests in parallel, the free inode pages need to be read from the disk only once, and all the second inodes corresponding to all the create requests can be written into those free inode pages, reducing the number of disk reads and writes.
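The page-count arithmetic of the example above (20 new inodes, 5 free slots in the current page, 30 slots per page → 2 pages read) can be sketched as a small helper. The function name and the assumption that spill-over always goes into fully free pages are illustrative, not the application's implementation.

```python
import math

def pages_for_new_inodes(n_new, free_in_current_page, slots_per_page=30):
    """Number of free inode pages to read in order to write n_new inodes.

    The partially filled current page is filled first; the remainder
    spills into fully free pages (an assumed layout for illustration).
    """
    if n_new <= 0:
        return 0
    pages = 1 if free_in_current_page > 0 else 0
    remaining = max(0, n_new - free_in_current_page)
    return pages + math.ceil(remaining / slots_per_page)

# 5 inodes fill the current page's free slots; 15 go to one more page.
pages_needed = pages_for_new_inodes(20, free_in_current_page=5)
```

This matches the text: 20 parallel creates read 2 pages in total, where the sequential flow would have read a free page once per request.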
For ease of understanding, please refer to fig. 8; fig. 8 is a schematic diagram of a scenario in which multiple inodes are written in parallel in an embodiment of the present application. Suppose there are 3 create requests, which need to create file 1, file 2, and file 3 under the same parent directory, respectively. As shown in fig. 8, the inode page is free from slot 33 onward and can accommodate the writing of new inodes. Therefore, the free inode page in the figure can be read, and the inodes corresponding to file 1, file 2, and file 3 are then written into slots 33, 34, and 35 of that free inode page.
Further, in the file system, creating a child file also requires creating a directory entry (dentry) for it. In the embodiments of the present application, the directory structure of the file system may be built as a B+ tree data structure, and when multiple child files are newly created, their dentries may be sorted lexicographically by file name, so that child files whose names sort close together fall into the same leaf node of the B+ tree as much as possible. Subsequently, if the child files corresponding to the newly created dentries need to be read, written, or deleted in batches, the batch requests can likewise be sorted by file name before the directory structure is searched. Because the dentries in the same leaf node were ordered by file name when they were created, searching for the batch requests by file name greatly increases the likelihood that several different dentries share the same leaf node, thereby reducing the number of leaf nodes read and hence the number of disk reads.
Illustratively, assume that when multiple child files are created, their dentries are sorted by name and then all reside in the same leaf node, which contains the dentries for file names 1 through 9. If a subsequent batch of requests needs to read or delete files 1 through 9, then after the requests are sorted by name, all the dentries for files 1 through 9 can be obtained directly from that one leaf node, without searching other leaf nodes for them.
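The leaf-sharing effect above can be sketched by grouping a sorted batch of lookups by the B+-tree leaf that serves them, so each leaf is read once per batch. The leaf model (a sorted list of each leaf's largest key) is a deliberately simplified assumption for illustration.

```python
from bisect import bisect_left

def leaf_reads(names, leaf_max_keys):
    """Group a batch of name lookups by B+-tree leaf.

    leaf_max_keys: sorted largest key of each leaf (toy model of the
    directory's B+ tree). Returns {leaf index: names served by that leaf};
    len(result) is the number of leaf reads the batch needs.
    """
    groups = {}
    for name in sorted(names):           # sort the batch by file name
        leaf = bisect_left(leaf_max_keys, name)
        groups.setdefault(leaf, []).append(name)
    return groups

# Two leaves cover all three lookups → two leaf reads, not three.
g = leaf_reads(["file9", "file1", "file5"], ["file4", "file9"])
```

Because dentries created together are name-clustered into the same leaves, a name-sorted batch tends to hit few distinct leaves.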
In the file system, after each child file is created or deleted, the metadata of the parent directory in which the child file resides must also be updated. Since the prior art queues individual requests to be processed one by one, each time the server executes a create or delete request, it must update the parent directory's metadata once for that request. Therefore, when the server executes a plurality of requests, it must perform the same number of updates to the parent directory's metadata, again resulting in excessive disk reads and writes.
The metadata of the parent directory includes some attribute information about the child files under the parent directory, such as their access time and update count. This information is common to the child files under the parent directory and is also called the common key in the parent directory's metadata. Because the embodiments of the present application process the multiple target requests in parallel, the parent directory's metadata does not need to be updated once for each executed target request. Specifically, the server may determine the update information corresponding to each target request, thereby obtaining a plurality of pieces of update information corresponding to the plurality of target requests, and then accumulate the plurality of pieces of update information to obtain target update information, for example, accumulating the file count of the current parallel batch and the modification time of the current parallel batch; the server then updates the parent directory's metadata once according to the target update information.
In the above way, in the process of executing a plurality of target requests in parallel, the metadata of the parent directory only needs to be updated once, so that the data processing amount is reduced.
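The accumulation step can be sketched as a fold over the per-request updates into a single metadata write. The tuple shape of an update (file-count delta, modification time) and the merge rules are assumptions for illustration, not the application's actual metadata format.

```python
def merge_updates(updates):
    """Fold per-request parent-directory updates into one metadata write.

    Each update is (file_count_delta, mtime): +1 for a create, -1 for a
    delete. The batch applies a single accumulated count delta and the
    latest modification time seen in the batch.
    """
    delta = sum(d for d, _ in updates)
    mtime = max(t for _, t in updates)
    return {"file_count_delta": delta, "mtime": mtime}

# Two creates and one delete collapse into one parent-directory update.
merged = merge_updates([(+1, 100.0), (+1, 101.5), (-1, 101.0)])
```

Three requests thus cost one parent-metadata write instead of three, which is exactly the reduction the text describes.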
In order to better implement the above-described scheme of the embodiments of the present application, on the basis of the embodiments corresponding to fig. 4 and fig. 5, a related device for implementing the above-described scheme is further provided below. Referring to fig. 9, fig. 9 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application, where the data processing apparatus includes:
an obtaining unit 201, configured to obtain, in a case where the parent directory is in a locked state, a plurality of target requests, each of which is used to indicate creation or deletion of a child file under the parent directory;
the execution unit 202 is configured to execute a plurality of target requests in parallel in response to the parent directory being converted from the locked state to the unlocked state, and the parent directory is in the locked state when the plurality of target requests are executed.
In one possible design, the obtaining unit 201 is specifically configured to:
receiving a plurality of target requests;
storing the plurality of target requests to a target queue;
the obtaining unit 201 is further configured to obtain the plurality of target requests from the target queue.
In one possible design, the target queue is a semaphore queue configured to issue semaphore information if the parent directory transitions from a locked state to an unlocked state;
The obtaining unit 201 is specifically configured to obtain, in response to receiving the semaphore information, a plurality of target requests from the semaphore queue.
In one possible design, the data processing apparatus further comprises:
and a conversion unit 203, configured to convert the parent directory from the locked state to the unlocked state in response to completion of executing the plurality of target requests in parallel.
In one possible design, the execution unit 202 is specifically configured to:
determining the update information corresponding to each target request to obtain a plurality of update information corresponding to a plurality of target requests;
determining target update information according to the plurality of update information;
and updating the metadata of the parent directory according to the target update information.
In one possible design, the plurality of target requests include a plurality of delete requests, each delete request being for indicating deletion of a child file under the parent directory, each delete request corresponding to one of the first inodes, and the execution unit 202 is specifically configured to:
reading an inode page of each first inode in parallel to obtain the first inode page;
the first inode in the first inode page is deactivated.
In one possible design, the plurality of target requests includes a plurality of creation requests, each creation request indicating creation of a child file under a parent directory, and the execution unit 202 is specifically configured to:
Determining a plurality of second inodes according to the plurality of creation requests;
acquiring second inode pages corresponding to a plurality of second inodes, wherein the second inode pages are idle inode pages in an inode table;
and writing the second inode into the second inode page.
In one possible design, the obtaining unit 201 is further configured to obtain a first request, where the first request is used to indicate creation or deletion of a child file under the parent directory;
the execution unit 202 is further configured to execute the first request in response to the parent directory being in the unlocked state, and when the first request is executed, the parent directory is in the locked state.
In one possible design, the converting unit 203 is further configured to convert the parent directory from the locked state to the unlocked state in response to completing the execution of the first request.
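The overall lock discipline implemented by the obtaining, execution, and conversion units can be sketched as a small state machine. Everything below is a hypothetical single-threaded model, not the patent's implementation: a first request runs alone under the lock, requests arriving meanwhile are parked in a target queue, and releasing the lock drains the queue so the whole batch runs as one parallel group.

```python
from collections import deque

class ParentDir:
    def __init__(self):
        self.locked = False
        self.pending = deque()   # target queue for requests received while locked
        self.log = []            # each entry is one executed batch

    def submit(self, request):
        if self.locked:
            # Parent directory is locked: park the target request for later.
            self.pending.append(request)
        else:
            # Unlocked: lock, execute the single first request, then unlock.
            self.locked = True
            self.log.append([request])
            self.unlock()

    def unlock(self):
        # Completion releases the lock; any queued target requests are then
        # executed together as one (conceptually parallel) batch.
        batch = list(self.pending)
        self.pending.clear()
        if batch:
            self.log.append(batch)
        self.locked = False

d = ParentDir()
d.locked = True            # a first request is currently executing
d.submit("create a")       # parked: parent directory is locked
d.submit("delete b")       # parked
d.unlock()                 # lock released -> the batch runs as one group
# d.log == [["create a", "delete b"]]
```

In a real metadata server the drain would be triggered by the semaphore information mentioned in the claims rather than by a direct `unlock()` call.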
It should be noted that the information interaction and execution processes between the modules/units in the computer device are based on the same concept as the method embodiments corresponding to fig. 4 and fig. 5 of the present application; for details, refer to the descriptions in the foregoing method embodiments, which are not repeated herein.
Referring to fig. 10, fig. 10 is a schematic structural diagram of a computer device according to an embodiment of the present application. The data processing apparatus described in the embodiment corresponding to fig. 4 may be deployed on the computer device 300 to implement the functions of the server in that embodiment. Specifically, the computer device 300 is implemented by one or more servers, and may vary considerably depending on configuration or performance. It may include one or more central processing units (CPU) 322 (for example, one or more processors), a memory 332, and one or more storage media 330 (for example, one or more mass storage devices) storing application programs 342 or data 344. The memory 332 and the storage medium 330 may be transitory or persistent. The program stored on the storage medium 330 may include one or more modules (not shown), and each module may include a series of instruction operations for the computer device. Still further, the central processing unit 322 may be configured to communicate with the storage medium 330 and execute, on the computer device 300, the series of instruction operations in the storage medium 330.
The computer device 300 may also include one or more power supplies 326, one or more wired or wireless network interfaces 350, one or more input/output interfaces 358, and/or one or more operating systems 341, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, etc.
Embodiments of the present application also provide a computer program product including a computer program which, when run on a computer, causes the computer to perform the steps performed by the server in the method described in the embodiment of fig. 4 or fig. 5.
An embodiment of the present application also provides a computer-readable storage medium storing a program for signal processing which, when run on a computer, causes the computer to perform the steps performed by the server in the method described in the embodiment of fig. 4 or fig. 5.
The data processing apparatus provided in the embodiments of the present application may specifically be a chip. The chip includes a processing unit and a communication unit; the processing unit may be, for example, a processor, and the communication unit may be, for example, an input/output interface, a pin, or a circuit. The processing unit may execute the computer-executable instructions stored in the storage unit, so that the chip performs the data processing method described in the embodiment shown in fig. 4. Optionally, the storage unit is a storage unit in the chip, such as a register or a cache; the storage unit may also be a storage unit located outside the chip on the wireless access device side, such as a read-only memory (ROM) or another type of static storage device that can store static information and instructions, a random access memory (RAM), etc.
The processor mentioned in any of the above may be a general-purpose central processing unit, a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling execution of the program of the method of the first aspect.
It should be further noted that the apparatus embodiments described above are merely schematic. The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. In addition, in the drawings of the apparatus embodiments provided in the present application, the connection relationship between modules indicates that they have a communication connection between them, which may be specifically implemented as one or more communication buses or signal lines.
From the above description of the embodiments, it will be apparent to those skilled in the art that the present application may be implemented by software plus necessary general-purpose hardware, or by dedicated hardware including application-specific integrated circuits, dedicated CPUs, dedicated memories, dedicated components, and the like. Generally, any function performed by a computer program can be easily implemented by corresponding hardware, and the specific hardware structure used to implement the same function can vary: an analog circuit, a digital circuit, a dedicated circuit, etc. For the present application, however, a software implementation is the preferred embodiment in most cases. Based on such an understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, may be embodied in the form of a software product. The software product is stored in a readable storage medium, such as a floppy disk, a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disk of a computer, and includes several instructions for causing a computer device (which may be a personal computer, a training device, a network device, or the like) to perform the methods described in the embodiments of the present application.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When software is used, the embodiments may be implemented in whole or in part in the form of a computer program product.
The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the procedures or functions according to the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, training device, or data center to another website, computer, training device, or data center in a wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) manner. The computer-readable storage medium may be any available medium accessible to a computer, or a data storage device such as a training device or a data center integrating one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, or a magnetic tape), an optical medium (e.g., a DVD), a semiconductor medium (e.g., a solid state disk (SSD)), or the like.

Claims (22)

1. A method of data processing, the method comprising:
acquiring a plurality of target requests in a case that a parent directory is in a locked state, wherein each target request is used for indicating creation or deletion of a child file under the parent directory; and
executing the plurality of target requests in parallel in response to the parent directory transitioning from the locked state to the unlocked state, wherein the parent directory is in the locked state while the plurality of target requests are being executed.
2. The method of claim 1, wherein the obtaining a plurality of target requests comprises:
receiving a plurality of target requests;
storing the plurality of target requests to a target queue;
before the parallel execution of the plurality of target requests, the method further comprises:
the plurality of target requests are obtained from the target queue.
3. The method of claim 2, wherein the target queue is a semaphore queue configured to issue semaphore information when the parent directory transitions from the locked state to the unlocked state;
the obtaining the plurality of target requests from the target queue includes:
The plurality of target requests are retrieved from the semaphore queue in response to receiving the semaphore information.
4. A method according to claim 1, 2 or 3, wherein after said executing said plurality of target requests in parallel, the method further comprises:
converting the parent directory from the locked state to the unlocked state in response to completion of the parallel execution of the plurality of target requests.
5. The method of any of claims 1-4, wherein the executing the plurality of target requests in parallel comprises:
determining the update information corresponding to each target request, to obtain a plurality of pieces of update information corresponding to the plurality of target requests;
determining target update information according to the plurality of pieces of update information; and
updating the metadata of the parent directory according to the target update information.
6. The method of any of claims 1 to 5, wherein the plurality of target requests include a plurality of delete requests, each delete request indicating deletion of a child file under the parent directory, each delete request corresponding to a first inode, the executing the plurality of target requests in parallel comprising:
reading the inode page of each first inode in parallel to obtain the first inode pages; and
deactivating the first inodes in the first inode pages.
7. The method of any of claims 1-5, wherein the plurality of target requests comprises a plurality of creation requests, each of the creation requests to indicate creation of a child file under the parent directory, the executing the plurality of target requests in parallel comprising:
determining a plurality of second inodes according to the plurality of creation requests;
acquiring second inode pages corresponding to the plurality of second inodes, wherein the second inode pages are idle inode pages in an inode table;
and writing the second inode into the second inode page in parallel.
8. The method of any one of claims 1 to 7, wherein prior to the obtaining the plurality of target requests, the method further comprises:
acquiring a first request, wherein the first request is used for indicating creation or deletion of a child file under the parent directory;
executing the first request in response to the parent directory being in the unlocked state, wherein the parent directory is in the locked state while the first request is being executed.
9. The method of claim 8, wherein after the executing the first request, the method further comprises:
converting the parent directory from the locked state to the unlocked state in response to completion of the execution of the first request.
10. A data processing apparatus, characterized in that the data processing apparatus comprises:
an obtaining unit, configured to obtain a plurality of target requests in a case where a parent directory is in a locked state, where each target request is used to indicate creation or deletion of a child file under the parent directory; and
an execution unit, configured to execute the plurality of target requests in parallel in response to the parent directory transitioning from the locked state to the unlocked state, where the parent directory is in the locked state while the plurality of target requests are being executed.
11. The data processing device according to claim 10, wherein the obtaining unit is specifically configured to:
receiving a plurality of target requests;
storing the plurality of target requests to a target queue;
the obtaining unit is further configured to obtain the plurality of target requests from the target queue.
12. The data processing apparatus of claim 11, wherein the target queue is a semaphore queue configured to issue semaphore information if the parent directory transitions from a locked state to an unlocked state;
the obtaining unit is specifically configured to acquire the plurality of target requests from the semaphore queue in response to receiving the semaphore information.
13. The data processing apparatus according to claim 10, 11 or 12, characterized in that the data processing apparatus further comprises:
and the conversion unit is used for converting the parent directory from a locking state to an unlocking state in response to completion of the parallel execution of the plurality of target requests.
14. The data processing apparatus according to any one of claims 10 to 13, wherein the execution unit is specifically configured to:
determining the update information corresponding to each target request, to obtain a plurality of pieces of update information corresponding to the plurality of target requests;
determining target update information according to the plurality of pieces of update information; and
updating the metadata of the parent directory according to the target update information.
15. The data processing apparatus according to any one of claims 10 to 14, wherein the plurality of target requests comprises a plurality of delete requests, each delete request being for indicating deletion of a child file under the parent directory and corresponding to a first inode, the execution unit being specifically configured to:
reading the inode page of each first inode in parallel to obtain the first inode pages; and
deactivating the first inodes in the first inode pages.
16. The data processing apparatus according to any one of claims 10 to 14, wherein the plurality of target requests comprises a plurality of creation requests, each of the creation requests being for indicating creation of a child file under the parent directory, the execution unit being specifically configured to:
determining a plurality of second inodes according to the plurality of creation requests;
acquiring second inode pages corresponding to the plurality of second inodes, wherein the second inode pages are idle inode pages in an inode table;
and writing the second inode into the second inode page in parallel.
17. The data processing device according to any one of claims 10 to 16, wherein,
the obtaining unit is further configured to obtain a first request, where the first request is used to indicate creation or deletion of a child file under the parent directory;
the execution unit is further configured to execute the first request in response to the parent directory being in an unlocked state, and when the first request is executed, the parent directory is in a locked state.
18. The data processing apparatus according to claim 17, wherein
the conversion unit is further configured to convert the parent directory from a locked state to an unlocked state in response to completion of the executing the first request.
19. A computer device comprising a processor and a memory, the processor being coupled to the memory,
the memory is used for storing programs;
the processor is configured to execute the program in the memory, so that the computer device performs the method according to any one of claims 1 to 9.
20. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed by a processor, implements the method according to any one of claims 1 to 9.
21. A computer program product having computer readable instructions stored therein, which when executed by a processor, implement the method of any of claims 1 to 9.
22. A chip system comprising at least one processor, wherein program instructions, when executed in the at least one processor, cause the method of any one of claims 1 to 9 to be performed.
CN202111015604.XA 2021-07-22 2021-08-31 Data processing method and related equipment Pending CN116431590A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IN202131032931 2021-07-22

Publications (1)

Publication Number Publication Date
CN116431590A true CN116431590A (en) 2023-07-14

Family

ID=87106587

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111015604.XA Pending CN116431590A (en) 2021-07-22 2021-08-31 Data processing method and related equipment

Country Status (1)

Country Link
CN (1) CN116431590A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116662019A (en) * 2023-07-31 2023-08-29 苏州浪潮智能科技有限公司 Request distribution method and device, storage medium and electronic device
CN116662019B (en) * 2023-07-31 2023-11-03 苏州浪潮智能科技有限公司 Request distribution method and device, storage medium and electronic device

Similar Documents

Publication Publication Date Title
US10747752B2 (en) Space management for transactional consistency of in-memory objects on a standby database
EP2724236B1 (en) System and method for providing a unified storage system that supports file/object duality
US10853242B2 (en) Deduplication and garbage collection across logical databases
EP3822811A1 (en) Real-time cross-system database replication for hybrid-cloud elastic scaling and high-performance data virtualization
US9176867B2 (en) Hybrid DRAM-SSD memory system for a distributed database node
US20030217058A1 (en) Lock-free file system
US10489518B1 (en) Virtual machine object version control
CN113297320A (en) Distributed database system and data processing method
US11210006B2 (en) Distributed scalable storage
US10831719B2 (en) File consistency in shared storage using partial-edit files
US11960442B2 (en) Storing a point in time coherently for a distributed storage system
CN112334891B (en) Centralized storage for search servers
CN107408132B (en) Method and system for moving hierarchical data objects across multiple types of storage
JPH0950418A (en) System and method for control of customer information with temporary storage queuing function in loosely-coupled parallel processing environment
CN112867999A (en) Version-based table locking
CN114610680A (en) Method, device and equipment for managing metadata of distributed file system and storage medium
US10387384B1 (en) Method and system for semantic metadata compression in a two-tier storage system using copy-on-write
CN116431590A (en) Data processing method and related equipment
CN111459882B (en) Namespace transaction processing method and device for distributed file system
US10055139B1 (en) Optimized layout in a two tier storage
US10628391B1 (en) Method and system for reducing metadata overhead in a two-tier storage architecture
CN114281765A (en) Metadata processing method and equipment in distributed file system
US8838910B2 (en) Multi-part aggregated variable in structured external storage
CN116821058B (en) Metadata access method, device, equipment and storage medium
CN114297196B (en) Metadata storage method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication