CN114860655A - File processing method, device and storage medium - Google Patents

File processing method, device and storage medium

Info

Publication number
CN114860655A
Authority
CN
China
Prior art keywords
node
file
agent
unique identifier
global unique
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210283907.8A
Other languages
Chinese (zh)
Inventor
张伟
朱凌宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba China Co Ltd
Original Assignee
Alibaba China Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba China Co Ltd filed Critical Alibaba China Co Ltd
Priority to CN202210283907.8A priority Critical patent/CN114860655A/en
Publication of CN114860655A publication Critical patent/CN114860655A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/14 Details of searching files based on file metadata
    • G06F16/148 File search processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G06F16/162 Delete operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/172 Caching, prefetching or hoarding of files
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Library & Information Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a file processing method, a file processing device, and a storage medium, applied to the fields of high-performance computing, AI, and big data. The method comprises the following steps: a first node acquires a first file global unique identifier of a file to be processed; when the first node determines that the secondary directory includes the first file global unique identifier, the first node acquires the file corresponding to the first file global unique identifier locally; or, when the first node determines that the secondary directory does not include the first file global unique identifier and the primary directory includes the first file global unique identifier, the first node acquires the address of a second node from the primary directory, and acquires the file corresponding to the first file global unique identifier from the second node according to that address. In this way, the local storage space of the computing nodes is built into an additional elastic cache pool, which raises the capacity ceiling of the cluster system and improves the service capability of the file storage system.

Description

File processing method, device and storage medium
Technical Field
The present application relates to the field of high-performance computing, AI, and big data technologies, and in particular, to a method and an apparatus for processing a file, and a storage medium.
Background
With the development of internet technology, big data has become a major trend. In online and offline data center scenarios, the volume of computer data keeps growing, which places higher demands on computing capacity and storage capacity. In a possible implementation, to manage massive data, a file storage system is used to separate computing from storage, and the file storage system provides storage services to the computing nodes through its own storage capacity.
The storage space of a file storage system generally consists of multiple servers, each fitted with multiple hard disks. The storage capacity of the file storage system is constrained by factors such as the number of hard disks, the network access capability of the data center, and physical space, so both its total storage capacity and the number of computing nodes it can serve are limited. When the file storage system serves a large number of nodes, with a large volume of computing data and many elastic nodes, it struggles to meet the computing communication requirements of the network, which lowers its service efficiency and degrades the user experience.
Disclosure of Invention
The application provides a file processing method, a file processing device, and a storage medium, which address the problem that, when a file storage system serves a large number of nodes with a large volume of computing data and many elastic nodes, the file storage system struggles to meet the computing communication requirements of the network, so that its service efficiency is low.
In a first aspect, the present application provides a file processing method applied to a computing cluster. The computing cluster includes a file storage system and a plurality of computing nodes; the file storage system is provided with a primary directory, and each computing node is provided with a secondary directory and a computing node agent (Agent). The primary directory includes meta-information of the files cached in the plurality of computing nodes; the secondary directory of any computing node includes meta-information of the files cached locally by that computing node; the Agent of any computing node is used to proxy communication between that computing node and each node in the computing cluster. The method comprises the following steps:
a first node acquires a global unique identifier of a first file to be processed; the first node is any one of the computing nodes in the computing cluster;
when the first node determines that the secondary directory includes the first file global unique identifier, the first node acquires the file corresponding to the first file global unique identifier locally;
or, when the first node determines that the secondary directory does not include the first file global unique identifier and the primary directory includes the first file global unique identifier, the first node acquires the address of a second node from the primary directory, the second node being the node in which the file corresponding to the first file global unique identifier is stored;
and the first node acquires the file corresponding to the first file global unique identifier from the second node according to the address of the second node.
Optionally, the meta-information is stored as a plurality of metadata entries, and any metadata entry of the primary directory includes: an entry state, a file read counter, a file global unique identifier, an Agent global unique identifier, and the address information of the node storing the cached file.
Optionally, the method further includes:
when a first node is started, an Agent of the first node tests the access capability to a primary directory in a file storage system;
if the primary directory does not exist in the file storage system, the primary directory is created in the file storage system.
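The start-up check can be sketched as follows. This is a minimal illustration only: it assumes the primary directory is a single file at a path on a mounted file storage system, and the function name `ensure_primary_directory` is hypothetical. The source does not specify how the access test or the creation is performed.

```python
import os


def ensure_primary_directory(path: str) -> bool:
    """Return True if this call created the primary directory file.

    Assumption: the primary directory is one file on the shared file
    storage system, reachable at `path` from every computing node.
    """
    if os.path.exists(path):
        # Access test: the Agent must be able to read and update the directory.
        if not os.access(path, os.R_OK | os.W_OK):
            raise PermissionError(f"primary directory not accessible: {path}")
        return False
    try:
        # Mode "x" fails if another Agent created the file in the meantime,
        # so two Agents starting together do not overwrite each other.
        with open(path, "x"):
            pass
        return True
    except FileExistsError:
        return False
```

Exclusive creation is used here so that the first Agent to start creates the directory and later Agents simply find it; whether a given network file system honours exclusive creation reliably would need to be confirmed for a real deployment.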
Optionally, before the first node obtains the global unique identifier of the first file to be processed, the method further includes:
the Agent of the first node hijacks an Application Programming Interface (API) for accessing files stored by the file storage system.
Optionally, the obtaining, by the first node, the global unique identifier of the first file to be processed includes:
when the first node detects a file read operation, the first node obtains the first file global unique identifier corresponding to the file read operation.
Optionally, after the first node obtains the global unique identifier of the first file to be processed, the method further includes:
the Agent of the first node searches the secondary directory of the first node;
if the secondary directory contains the first file global unique identifier, the Agent of the first node increments by one the read counter corresponding to the first file global unique identifier in the secondary directory;
after the first node finishes reading the file locally, the Agent of the first node decrements by one the read counter corresponding to the first file global unique identifier in the secondary directory.
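The read-counter bookkeeping above can be illustrated with a context manager that increments the counter before a local read and decrements it afterwards, even if the read raises. This is a sketch under assumptions: the class `SecondaryDirectory`, its methods, and the in-process lock are hypothetical and stand in for whatever structure a real Agent would keep.

```python
import threading
from contextlib import contextmanager


class SecondaryDirectory:
    """In-memory stand-in for one node's secondary directory (hypothetical)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._read_counters = {}  # file GUID -> number of local reads in progress

    def add(self, guid: str) -> None:
        with self._lock:
            self._read_counters.setdefault(guid, 0)

    def read_count(self, guid: str) -> int:
        with self._lock:
            return self._read_counters[guid]

    @contextmanager
    def reading(self, guid: str):
        # Increment the read counter before the local read starts.
        with self._lock:
            if guid not in self._read_counters:
                raise KeyError(guid)
            self._read_counters[guid] += 1
        try:
            yield
        finally:
            # Decrement once the read finishes, whether or not it succeeded.
            with self._lock:
                self._read_counters[guid] -= 1
```

A local read would then be wrapped as `with directory.reading(guid): ...`, so the counter is nonzero exactly while the file is being read.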
Optionally, the obtaining, by the first node, the file corresponding to the first file global unique identifier from the second node according to the address of the second node includes:
the Agent of the first node requests file read permission from the Agent of the second node;
and when the Agent of the first node obtains the read permission, the Agent of the first node acquires the file corresponding to the first file global unique identifier from the second node.
Optionally, the method further includes:
if the primary directory does not include the first file global unique identifier, the Agent of the first node obtains the metadata entry address for the first file global unique identifier in the primary directory;
if that metadata entry address is already in use, the Agent of the first node continues searching the primary directory for a blank entry;
if the Agent of the first node finds a blank entry before N searches have been made, the Agent of the first node writes the metadata entry corresponding to the first file global unique identifier into the blank entry;
if the write by the Agent of the first node succeeds, the first node acquires the file from the file storage system and updates the secondary directory of the first node;
if the write by the Agent of the first node fails, the Agent of the first node writes the metadata entry corresponding to the first file global unique identifier into the next blank entry, until the write succeeds or a preset number of write attempts is reached;
and if the first node successfully writes the metadata entry corresponding to the first file global unique identifier, the first node acquires the file from the file storage system and updates the secondary directory of the first node.
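The search for a blank entry can be sketched as bounded probing over a fixed-size table. The sketch makes several assumptions that the source does not state: the entry address is derived from a hash of the file global unique identifier, probing is linear, and the table is modelled as a Python list in which `None` marks a blank entry. The names `home_slot` and `find_blank_slot` are hypothetical.

```python
import hashlib
from typing import List, Optional


def home_slot(guid: str, table_size: int) -> int:
    """Map a file GUID to its preferred entry address (assumed hash scheme)."""
    digest = hashlib.sha256(guid.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % table_size


def find_blank_slot(table: List[Optional[dict]], guid: str,
                    max_probes: int) -> Optional[int]:
    """Return the index of a blank entry, or None after max_probes searches."""
    start = home_slot(guid, len(table))
    for i in range(max_probes):
        slot = (start + i) % len(table)  # linear probing is an assumption
        if table[slot] is None:
            return slot
    return None
```

When `find_blank_slot` returns `None`, the caller has used up its N searches; in that case the sketch leaves the next step to the caller, since the source does not describe it here.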
Optionally, the method further includes:
the Agent of the first node checks the entry state bit of the blank entry; the entry state bit identifies whether the entry is occupied: a first value indicates that the entry is occupied, and a second value indicates that the entry is available;
when the entry state bit is the second value, the Agent of the first node inverts the entry state bit;
the Agent of the first node writes its own Agent global unique identifier into the blank entry;
the Agent of the first node checks the entry state bit of the blank entry again;
if the entry state bit is the first value, the Agent of the first node determines whether the Agent global unique identifier in the blank entry is the Agent global unique identifier of the first node;
and if so, the Agent of the first node has won the contention and writes the remaining meta-information into the blank entry.
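The contention steps can be walked through with a small in-memory model. This is a sketch only: `SharedEntry` is a hypothetical stand-in for one entry of the shared primary directory, the encodings `OCCUPIED = 1` and `AVAILABLE = 0` are assumed values for the first and second value, and a single-process dictionary does not reproduce the timing or atomicity behaviour of a real shared file.

```python
OCCUPIED, AVAILABLE = 1, 0  # assumed encodings of the first and second value


class SharedEntry:
    """In-memory stand-in for one entry of the shared primary directory."""

    def __init__(self):
        self.fields = {"state": AVAILABLE, "agent_guid": ""}

    def read(self, name):
        return self.fields[name]

    def write(self, name, value):
        self.fields[name] = value


def try_claim(entry: SharedEntry, my_agent_guid: str, meta: dict) -> bool:
    """Attempt to claim a blank entry; return True if this Agent won."""
    # Step 1: check the state bit; only an available entry can be claimed.
    if entry.read("state") != AVAILABLE:
        return False
    # Step 2: invert the state bit, marking the entry as occupied.
    entry.write("state", entry.read("state") ^ 1)
    # Step 3: write this Agent's global unique identifier.
    entry.write("agent_guid", my_agent_guid)
    # Steps 4-5: re-read the entry and confirm it still carries our identifier.
    if (entry.read("state") == OCCUPIED
            and entry.read("agent_guid") == my_agent_guid):
        # Step 6: contention won, so write the remaining meta-information.
        for name, value in meta.items():
            entry.write(name, value)
        return True
    return False
```

The re-read in steps 4 and 5 is what lets an Agent notice that a competing Agent wrote its identifier afterwards; the loser leaves the entry alone and moves on to the next blank entry.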
Optionally, the method further includes:
if the Agent of the first node loses the contention, the Agent of the first node checks the contention count;
and if the count does not exceed N, the Agent of the first node contends for the next blank entry, and the count is updated to N + 1.
Optionally, after the Agent of the first node hijacks an application programming interface API for accessing files stored in the file storage system, the method further includes:
when the first node detects a file write operation, the Agent of the first node obtains a second file global unique identifier corresponding to the file write operation;
the Agent of the first node searches the secondary directory of the first node;
if the secondary directory includes the second file global unique identifier, the Agent of the first node checks whether the read counter corresponding to the second file global unique identifier in the secondary directory has returned to zero;
and if the read counter has returned to zero, the Agent of the first node deletes the metadata entries corresponding to the second file global unique identifier from the primary directory and the secondary directory.
Optionally, the searching of the secondary directory of the first node by the Agent of the first node further includes:
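The write-path invalidation can be sketched with two dictionaries standing in for the directories. The function name `invalidate_on_write` and the dictionary layout are hypothetical. The source does not say what happens while the read counter is nonzero, so the sketch simply reports that nothing was deleted and leaves retrying to the caller.

```python
def invalidate_on_write(guid: str, secondary: dict, primary: dict) -> bool:
    """Delete the cached file's entries if no local read is in progress.

    `secondary` and `primary` map file GUIDs to metadata dictionaries;
    this layout is an assumption made for the sketch.
    Returns True if the entries were deleted.
    """
    entry = secondary.get(guid)
    if entry is None:
        # Not cached locally; the remote-deletion case is handled elsewhere.
        return False
    if entry["read_counter"] != 0:
        # Reads still in progress: do not delete yet (caller may retry).
        return False
    del secondary[guid]
    primary.pop(guid, None)
    return True
```

Checking the read counter first keeps a write from removing cache metadata underneath a read that is still using the cached copy.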
if the secondary directory does not comprise the second file global unique identifier and the primary directory comprises the second file global unique identifier, the Agent of the first node sends a file deletion request to the Agent of the second node;
and the Agent of the second node deletes the metadata table entry corresponding to the second file global unique identifier in the primary directory and the secondary directory based on the file deletion request.
Optionally, the method further includes:
and the Agent of the first node writes the file to the local storage of the first node.
In a second aspect, the present application provides a file processing apparatus applied to a computing cluster. The computing cluster includes a file storage system and a plurality of computing nodes; the file storage system is provided with a primary directory, and each computing node is provided with a secondary directory and a computing node agent (Agent). The primary directory includes meta-information of the files cached in the plurality of computing nodes; the secondary directory of any computing node includes meta-information of the files cached locally by that computing node; the Agent of any computing node is used to proxy communication between that computing node and each node in the computing cluster. The apparatus comprises:
the meta-information acquisition module of the first node is used for acquiring a global unique identifier of a first file to be processed; the first node is any one of the computing nodes in the computing cluster;
the file acquisition module of the first node is used for acquiring a file corresponding to the global unique identifier of the first file locally when the judgment module of the first node determines that the secondary directory comprises the global unique identifier of the first file;
or the meta-information acquisition module of the first node is further configured to acquire the address of a second node from the primary directory when the judgment module of the first node determines that the secondary directory does not include the first file global unique identifier and the primary directory includes the first file global unique identifier, where the second node is the node in which the file corresponding to the first file global unique identifier is stored;
and the file acquisition module of the first node is further used for acquiring a file corresponding to the global unique identifier of the first file from the second node according to the address of the second node.
In a third aspect, an embodiment of the present application provides a terminal device, including: a memory and a processor;
the memory is used for storing computer instructions; the processor is configured to execute the memory-stored computer instructions to implement the method of any of the first aspects.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium on which a computer program is stored, the computer program being executed by a processor to implement the method of any one of the first aspect.
In a fifth aspect, the present application provides a computer program product comprising a computer program that, when executed by a processor, implements the method of any one of the first aspects.
According to the file processing method, the file processing device and the storage medium, a secondary directory for recording cached files in a single computing node and a primary directory for recording cached files in a plurality of computing nodes are set. When the file to be processed is a cached file in a computing node of the cluster system, the first node may obtain the file from the local storage of the computing node corresponding to the first file global unique identifier based on the secondary directory and the primary directory. According to the method, the local storage of the computing nodes is constructed into an additional elastic cache pool, so that the storage capacity of the cluster system is improved, the storage pressure of the file storage system is reduced, and the service capability of the file storage system is improved.
Drawings
Fig. 1 is a schematic diagram of a scenario of a file processing method according to an embodiment of the present application;
Fig. 2 is a schematic flowchart of a file processing method according to an embodiment of the present application;
Fig. 3 is a schematic flowchart of a file processing method according to an embodiment of the present application;
Fig. 4 is a schematic flowchart of a file processing method according to an embodiment of the present application;
Fig. 5 is a schematic flowchart of a file processing method according to an embodiment of the present application;
Fig. 6 is a schematic flowchart of a file processing method according to an embodiment of the present application;
Fig. 7 is a schematic flowchart of a file processing method according to an embodiment of the present application;
Fig. 8 is a schematic diagram of a scenario of a file processing method according to an embodiment of the present application;
Fig. 9 is a schematic structural diagram of a file processing terminal device according to an embodiment of the present application.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present application, as detailed in the appended claims.
The terms referred to in this application are explained first:
File storage system: a system used to implement file storage in a computing cluster. The file storage system in the present application may be Network Attached Storage (NAS), a general term for file storage that shares its stored content externally over a network. It is understood that the file storage system of the present application may also be another storage system.
Compute node Agent: the agent in each compute node, typically a user process. Agents do not need to be managed centrally, and each has a globally unique identifier.
Kubernetes (K8s): an open-source system for managing containerized applications across multiple hosts in a cloud platform. Kubernetes provides mechanisms for application deployment, planning, updating, and maintenance.
In online and offline data center scenarios, virtual computing nodes are often built with a container management tool such as K8s, or a computing cluster is built from a large number of physical servers. In a centralized or distributed storage scenario, the storage system provides storage services for the compute nodes in the computing cluster. The present application takes a file storage system as an example. A file storage system is constrained by factors such as the physical space of the data center, the number of hard disks a server can hold, and the network access capability of the data center. In scenarios with a large number of nodes, massive computing data, and many nodes being elastically added and removed, the storage capacity and data scheduling capability of the file storage system may be unable to support the computing communication requirements of the network, which degrades the service quality of the file storage system and, in turn, the user experience.
The following examples describe problems that may arise in a file storage system in data center scenarios.
In a High Performance Computing (HPC) scenario, the Lustre file system may provide a cache prefetching mechanism and a cache aging mechanism based on the local storage resources of compute nodes, to reduce file access latency and relieve pressure on the Object Storage Target (OST) side. However, when multiple nodes load the same data concurrently, file loading may fail and read operations may be restricted; in addition, the cache capacity is small, so very large files cannot be cached. The hit rate is low in random-read scenarios, which reduces the service capability of the file storage system.
In another data center scenario, the Alluxio file system manages the storage of multiple computing nodes through a central management node to build a distributed storage pool. However, an architecture based on a central management node is prone to making the storage system unavailable when the central management node fails. When a large number of computing nodes are elastically added and removed, the central management node may also be limited by its own management and scheduling capability and be unable to allocate computing node resources efficiently and reasonably, which reduces the service capability of the file storage system.
In view of this, an embodiment of the present application provides a file processing method in which a primary directory is set in the file storage system and a secondary directory is set in each compute node. The primary directory includes meta-information of the files cached in the plurality of computing nodes, and the secondary directory includes meta-information of the files cached in a single computing node. The first node determines whether the file global unique identifier of the file to be processed exists in its own secondary directory; if so, the first node acquires the corresponding file locally; if not, the first node determines whether the file global unique identifier exists in the primary directory. When it exists in the primary directory, the first node determines the second node through the file global unique identifier and acquires the corresponding file from the second node. In this way, the local storage space of the computing nodes is built into an additional elastic cache pool, which raises the capacity ceiling of the cluster system; moreover, the first node can actively acquire file data according to the directories without management and scheduling by a central management node, which improves the service capability of the file storage system.
An application scenario to which the present application is applied is described below with reference to fig. 1.
As shown in part a of Fig. 1, the K8s cluster system may include a computing system and a file storage system. The computing system includes a plurality of computing nodes, which may be, but are not limited to, hosts, servers, virtual machines, or other devices with the same functions. The computing nodes may include a first node 101 and a second node 102. Part b of Fig. 1 illustrates the structure of a computing node, taking the first node 101 as an example. As shown in part b of Fig. 1, the compute node may include local storage, a secondary directory, a compute node agent (Agent), a NAS client (NAS Client), and the like. The secondary directory of any compute node includes meta-information of the files cached locally at that compute node.
The file storage system may include a primary directory and a file storage pool. The file storage pool may be a server configured with storage media such as hard disks, or may be the storage media themselves, which is not limited in this application. The primary directory includes meta-information of the files cached in the plurality of compute nodes. The primary directory may be stored in the file storage pool, in the memory of a server, in a memory cache, or in other storage media.
The computing nodes can communicate over a computing network, and the Agent of any computing node can be used to proxy communication between that computing node and each node in the computing cluster. For example, the Agent of a first node may access a second node through the computing network. The computing nodes and the file storage system can communicate over a storage network. For example, a first node may retrieve the primary directory over the storage network and retrieve file data from the file storage system. It is understood that the computing network and the storage network are named separately only to distinguish the two communication modes; in practice they may use the same network or different networks.
By way of example, the embodiments of the application can be applied to scenarios such as AI model training and HPC rendering. For example, when a user trains an AI model using the file processing method provided by the embodiments of the present application, a computing node may read pictures from the file storage system for a single training round. In the K-th training round, the computing node again reads pictures at random. Some of these pictures may have been used by the computing node in the J-th (J < K) training round and cached in the local storage of the computing node. In the K-th training round, the computing node can then quickly read those pictures from its own local storage and/or the local storage of other nodes, which reduces access to the back-end file storage system and improves the service capability of the file storage system.
As another example, in an HPC rendering scenario, a computing node loads from the file storage system the data needed for a single rendering task. In the K-th rendering task, part of the data required by the computing node may have been used by the computing node in the J-th (J < K) rendering task and cached in the local storage of the computing node. In the K-th rendering task, the computing node can then quickly obtain that data from its own local storage and/or the local storage of other nodes, which reduces access to the back-end file storage system and improves the service capability of the file storage system.
It should be noted that the above examples merely illustrate scenarios to which the embodiments of the present application are applicable.
The following describes the technical solutions of the present application and how to solve the above technical problems with specific embodiments. The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments. Embodiments of the present application will be described below with reference to the accompanying drawings.
Fig. 2 is a schematic flowchart illustrating a file processing method provided in an embodiment of the present application, and as shown in fig. 2:
S201, a first node acquires a global unique identifier of a first file to be processed; the first node is any one of the compute nodes in the compute cluster.
The first node is a node for performing a computing task, wherein the first node may be any computing node in a computing cluster. The file global unique identifier is used for identifying a file required for executing the computing task, and for example, the file global unique identifier can be obtained through hash value calculation. The first file global unique identifier is an identifier of a file to be processed of the first node. After determining a file to be processed, the first node acquires a corresponding file global unique identifier based on the file.
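As an illustration of obtaining an identifier through hash value calculation, the sketch below hashes a file's normalized path in the shared storage namespace. Which input is hashed and which hash function is used are assumptions; the source only says that a hash value calculation may be used. The function name `file_guid` is hypothetical.

```python
import hashlib
import posixpath


def file_guid(path: str) -> str:
    """Derive a file identifier from its path on the shared file storage.

    Assumption: every node sees the file under the same path, so hashing
    the normalized path yields the same identifier on every node.
    """
    normalized = posixpath.normpath(path)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()
```

Normalizing first means that `/nas/data//a.jpg` and `/nas/data/a.jpg` map to the same identifier, so different spellings of one path do not produce separate cache entries.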
S202, when the first node determines that the secondary directory comprises the first file global unique identifier, the first node locally obtains a file corresponding to the first file global unique identifier.
Each computing node is provided with a secondary directory, and the secondary directory of any computing node comprises meta information of a file cached locally by any computing node. The meta-information may include a file global unique identifier of the cached file. When the second-level directory of the first node comprises the first file global unique identifier, the file to be processed is a file stored in the local cache of the first node. When the first node reads the file to be processed, the file can be acquired based on local storage of the first node.
S203, when the first node determines that the secondary directory does not include the first file global unique identifier and the primary directory includes the first file global unique identifier, the first node acquires the address of the second node from the primary directory, and the second node is the node in which the file corresponding to the first file global unique identifier is stored.
The primary directory includes meta-information of the files cached in the plurality of compute nodes. The meta-information is stored as a plurality of metadata entries, and any metadata entry of the primary directory includes: an entry state, a file read counter, a file global unique identifier, an Agent global unique identifier, and the address information of the node storing the cached file. Table 1 shows an example format of the metadata entry:
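Table 1 Metadata entry (fields: entry state, file read counter, file global unique identifier, Agent global unique identifier, address information of the node storing the cached file)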
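For illustration, the metadata entry may be modelled as a small record. The Python field names, the field types, and the `is_blank` helper below are assumptions; only the five fields themselves and the 0/1 encoding of the entry state come from the description.

```python
from dataclasses import dataclass


@dataclass
class MetadataEntry:
    """One primary-directory entry holding the five fields of Table 1."""
    entry_state: int = 0      # 0 = available, 1 = occupied
    read_counter: int = 0     # number of readers currently using the file
    file_guid: str = ""       # file global unique identifier
    agent_guid: str = ""      # Agent global unique identifier
    node_address: str = ""    # address of the node that caches the file

    def is_blank(self) -> bool:
        # Hypothetical helper: an entry is blank while its state is "available".
        return self.entry_state == 0
```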
Each compute node is also provided with an Agent, and the Agent of a compute node acts as a proxy for communication between that compute node and the nodes in the compute cluster. When the primary directory includes the first file global unique identifier, the file to be processed is cached in one of the compute nodes managed by the cluster system. The compute node whose local cache holds the file to be processed is defined as the second node. The first node may determine the address information of the second node based on the primary directory.
S204, the first node acquires the file corresponding to the first file global unique identifier from the second node according to the address of the second node.
The first node can establish communication with the second node through the Agent, and the first node can acquire a file corresponding to the first file global unique identifier from the second node according to the address of the second node.
It can be understood that the second node and the first node may be the same computing node or different computing nodes, which is not limited in this application.
The file processing method provided by the embodiment of the application maintains a secondary directory that records the files cached in a single compute node and a primary directory that records the files cached in the plurality of compute nodes. When the file to be processed is cached in a compute node of the cluster system, the first node may obtain the file from the local storage of the compute node corresponding to the first file global unique identifier based on the secondary directory and the primary directory. In this way, the local storage of the compute nodes forms an additional elastic cache pool, which increases the storage capacity of the cluster system, reduces the storage pressure on the file storage system, and improves the service capability of the file storage system.
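The lookup order of steps S202 to S204 may be sketched as follows. This is a simplified model under stated assumptions: the secondary directory is represented as a dictionary that maps an identifier directly to cached data, the primary directory maps an identifier to a node address, and `fetch_from_node` and `fetch_from_storage` are hypothetical callbacks standing in for the Agent-to-Agent transfer and the NAS Client.

```python
from typing import Callable, Dict


def read_file(
    guid: str,
    secondary: Dict[str, bytes],          # local cache: GUID -> file data
    primary: Dict[str, str],              # shared directory: GUID -> node address
    fetch_from_node: Callable[[str, str], bytes],
    fetch_from_storage: Callable[[str], bytes],
) -> bytes:
    if guid in secondary:                 # S202: hit in the local cache
        return secondary[guid]
    if guid in primary:                   # S203: another node caches the file
        return fetch_from_node(primary[guid], guid)   # S204
    return fetch_from_storage(guid)       # miss in both directories
```

The file storage system is contacted for file data only on the last line, which is the case the elastic cache pool is meant to make rare.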
Exemplarily, fig. 3 shows a schematic flowchart of a file processing method provided in an embodiment of the present application, as shown in fig. 3:
S301, when the first node starts, the Agent of the first node tests whether the primary directory in the file storage system is accessible.
The Agent may be a user process. The Agents do not need to be managed centrally, and the Agent of each compute node has an Agent global unique identifier. For example, the Agent global unique identifier may be the network address of the compute node. The Agent has network access capability and can test access to the primary directory in the file storage system over the network, thereby determining whether the Agent of the first node can communicate with the Agents of other compute nodes and obtain the data cached on those nodes.
The Agent starts automatically when the first node starts, and may perform the following operations:
The Agent may register a certain amount of storage space in the first node. The storage space may be used to cache files to be processed.
The Agent may reserve a certain amount of contiguous space in the cache of the first node as entries for recording the meta-information of the secondary directory. For example, an entry may include a file path plus a file name, or a file global unique identifier plus the address information of the file in the cache pool, and the like.
The Agent may test access to the primary directory through the NAS Client. If the access succeeds, the primary directory is usable and the Agent can operate according to the file processing method provided by the embodiment of the application.
When the primary directory does not exist in the file storage system, the Agent creates the primary directory in the file storage system.
If the primary directory does not exist in the file storage system, the Agent of the first node is the first Agent to be started. This Agent can create the primary directory in the file storage system, and the size of the primary directory can be customized by the user.
S302, the Agent of the first node hijacks an Application Programming Interface (API) used for accessing files stored in the file storage system.
The Portable Operating System Interface (POSIX) component of the Agent can hijack the system API, so that the Agent can actively identify and handle accesses to the file storage system. Through the hijacked API, the Agent can monitor in real time the accesses of the first node to the file storage system.
S303, when the first node receives a file read operation, the first node obtains a first file global unique identifier corresponding to the file read operation.
When the Agent detects that the first node issues a file read operation to the file storage system, the Agent hijacks the file read operation and processes it to obtain the first file global unique identifier corresponding to the file read operation.
S304, the Agent of the first node retrieves the secondary directory in the first node.
The secondary directory may be stored locally at the first node and recorded and managed using a hash table. The secondary directory may be retrieved in a loop manner: the Agent searches forward from the current position of the secondary directory until it returns to that position. The secondary directory may be stored in the memory of the compute node to speed up traversal.
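The loop retrieval described above can be illustrated with a circular scan over a fixed-size table. The slot layout (a list in which each slot holds an identifier or `None`) and the function name `loop_search` are assumptions for illustration only.

```python
from typing import List, Optional


def loop_search(slots: List[Optional[str]], start: int, guid: str) -> Optional[int]:
    """Scan slots circularly from `start`; return the matching index or None."""
    n = len(slots)
    for step in range(n):
        index = (start + step) % n        # wrap around at the end of the table
        if slots[index] == guid:
            return index
    return None                           # returned to `start` without a match
```

The scan visits every slot at most once, so a miss is reported after exactly one pass over the directory.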
S305, if the secondary directory includes the first file global unique identifier, the Agent of the first node adds one to the read counter corresponding to the first file global unique identifier in the secondary directory.
When the first file global unique identifier exists in the secondary directory, the file to be processed is cached locally in the first node. The Agent of the first node can read the file based on the first file global unique identifier and adds one to the read counter corresponding to the first file global unique identifier in the secondary directory. The value of the read counter indicates the number of compute nodes currently reading the file to be processed.
S306, after the first node finishes reading the file locally, the Agent of the first node subtracts one from the read counter corresponding to the first file global unique identifier in the secondary directory.
The Agent of the first node can subtract one from the read counter corresponding to the first file global unique identifier in the secondary directory to indicate that the file to be processed is read completely.
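Steps S305 and S306 pair an increment with a decrement around each read. A minimal sketch of that pairing is given below; the class name `ReadCounter`, the use of a context manager, and the in-process lock are assumptions, since the application does not describe how the counter is protected.

```python
import threading
from contextlib import contextmanager


class ReadCounter:
    """Counts the readers currently using one cached file."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.value = 0

    @contextmanager
    def reading(self):
        with self._lock:
            self.value += 1          # S305: add one before reading
        try:
            yield
        finally:
            with self._lock:
                self.value -= 1      # S306: subtract one when reading ends
```

Placing the decrement in a `finally` clause keeps the counter from staying above zero if a read fails part-way, which matters because a non-zero counter blocks later write operations.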
The above embodiment describes the case in which the Agent of the first node can acquire the cached file from its local storage based on the secondary directory. If the secondary directory of the first node does not include the first file global unique identifier, the following operations may be performed:
S307, the Agent of the first node retrieves the primary directory in the file storage system.
The primary directory may be stored in a file storage pool of the file storage system and recorded and managed using a hash table. The primary directory may be retrieved in a loop manner. To further optimize performance, dedicated hardware may be used in the file storage system service to optimize reading and writing of the primary directory, for example, multi-level cache management based on an Application Specific Integrated Circuit (ASIC) and High Bandwidth Memory (HBM).
When the secondary directory does not include the first file global unique identifier of the file to be processed, the Agent of the first node may search for the first file global unique identifier in the primary directory of the file storage system.
S308, if the first file global unique identifier exists in the primary directory, the Agent of the first node acquires the address of the second node.
The metadata entry includes metadata, where the metadata may be a file global unique identifier and address information of a node where the file is stored. The meta information of the same file can be recorded in the same entry, and the meta information has a corresponding relationship. The Agent of the first node may determine address information of the second node based on the first file globally unique identifier.
S309, the Agent of the first node obtains the file reading permission from the Agent of the second node.
And the Agent of the first node acquires the file reading permission from the Agent of the second node based on the address information of the second node.
S310, when the Agent of the first node obtains the permission of reading, the Agent of the first node obtains the file corresponding to the global unique identifier of the first file from the second node.
The above embodiment describes the scenario in which the Agent of the first node finds the first file global unique identifier in the primary directory or the secondary directory. The case in which the first file global unique identifier exists in neither the primary directory nor the secondary directory is described below.
S311, if the primary directory does not include the first file global unique identifier, the Agent of the first node obtains the metadata entry address of the first file global unique identifier in the primary directory.
When the primary directory does not include the first file global unique identifier, the Agent of the first node may calculate the address of a metadata entry in the primary directory based on the first file global unique identifier. The calculated metadata entry address is an estimated address rather than an actual address. For example, the first node may calculate a hash value key based on the first file global unique identifier. The primary directory can record metadata entries for at most a compute nodes, while the number of compute nodes may far exceed a; the key values of b file global unique identifiers may therefore be mapped to one metadata entry according to a negotiation mechanism in the Agent. The Agent of the first node acquires the addresses of the corresponding metadata entries according to the first file global unique identifier.
S312, if the metadata entry address is already in use, the Agent of the first node continues to search for a blank entry in the primary directory.
After acquiring the metadata entry address, the Agent of the first node checks whether the entry at that address is blank; if so, step S313 is performed. If not, the Agent of the first node goes on to check whether the metadata entry at the next hop is blank.
S313, if the Agent of the first node obtains a blank table item before searching N times, writing a metadata table item corresponding to the first file global unique identifier into the blank table item.
N is a user-specified hop count and can be customized. After finding a blank entry, the Agent of the first node writes into it a metadata entry that records the meta-information of the file to be processed.
It can be understood that, because the number of compute nodes is greater than the number of metadata entry addresses in the primary directory, a blank entry may still not be found after the loop completes. In that case, the Agent of the first node may acquire the file to be processed from the file storage system through the NAS Client without recording a metadata entry for the file in the primary directory.
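The search in steps S311 to S313 resembles bounded probing in a hash table, and may be sketched as follows. The use of SHA-256 for the key, the step of one entry per hop, and the representation of the primary directory as a list of occupied flags are assumptions; the application only states that an estimated address is computed and that at most N hops are tried.

```python
import hashlib
from typing import List, Optional


def find_blank_entry(
    occupied: List[bool], guid: str, max_hops: int
) -> Optional[int]:
    """Return the index of a blank entry, or None after max_hops probes."""
    if not occupied:
        return None
    key = int(hashlib.sha256(guid.encode("utf-8")).hexdigest(), 16)
    index = key % len(occupied)          # estimated entry address (S311)
    for _ in range(max_hops):            # at most N probes (S312, S313)
        if not occupied[index]:
            return index
        index = (index + 1) % len(occupied)   # next hop: assumed linear step
    return None                          # caller falls back to the NAS Client
```

A return value of `None` corresponds to the case described above in which the file is read from the file storage system and no metadata entry is recorded.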
S314, if the Agent of the first node writes successfully, the first node acquires the file from the file storage system and caches it locally.
After the write succeeds, the Agent of the first node can read the file from the file storage system based on the first file global unique identifier and cache it locally in the first node.
S315, if the Agent of the first node does not write successfully, the Agent writes the metadata entry corresponding to the first file global unique identifier into the next blank entry, until the write succeeds or M attempts have been made.
M is a user-specified hop count and can be customized; M and N may be set to the same value or to different values.
S316, the Agent of the first node reads the file from the file storage system, caches it, and modifies the secondary directory of the first node.
When the Agent of the first node has successfully written the metadata entry into the blank entry, it can read the data from the file storage system through the NAS Client and cache the data locally. The Agent of the first node may modify the secondary directory of the first node according to the meta-information in the primary directory. The next time a compute node reads the file, the corresponding first file global unique identifier can be found directly in the primary directory or the secondary directory, so that the file is obtained from the local storage of a compute node.
Exemplarily, step S309 is described below with reference to fig. 4, and fig. 4 shows a schematic flow diagram of an Agent of a first node obtaining a file read permission from an Agent of a second node.
S401, the Agent of the first node sends a read request of the file to be processed to the Agent of the second node.
S402, the Agent of the second node judges whether the file to be processed is readable according to the reading request.
S403, if the file to be processed is readable, the Agent of the second node returns permission information allowing the file to be read to the Agent of the first node.
S404, if the Agent of the second node does not determine that the file is readable in the window time, reading failure information is returned to the Agent of the first node. The window time can be set by the user.
Optionally, when the Agent of the second node does not allow the Agent of the first node to perform the file read operation, the Agent of the first node may read the file to be processed from the file storage system by using the NAS Client. In this case, the first node does not record the meta-information of the file in the secondary directory or the primary directory.
Optionally, steps S401 to S404 may be freely combined. For example, steps S403 and S404 may be merged into a single step in which the Agent of the second node returns to the Agent of the first node whether the file to be processed is readable.
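The decision made by the Agent of the second node in steps S402 to S404 can be sketched as a check bounded by the window time. The polling loop, the `poll_interval` parameter, and the `is_readable` callback are assumptions; the application states only that failure is returned when readability is not determined within the window.

```python
import time
from typing import Callable


def request_read_permission(
    is_readable: Callable[[], bool],
    window_seconds: float,
    poll_interval: float = 0.01,
) -> bool:
    """Return True if the file becomes readable within the window (S403),
    otherwise False to signal read failure (S404)."""
    deadline = time.monotonic() + window_seconds
    while True:
        if is_readable():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

On a `False` result the Agent of the first node would fall back to reading the file through the NAS Client, as described above.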
The file processing method provided by the embodiment of the application maintains a secondary directory that records the files cached in a single compute node and a primary directory that records the files cached in the plurality of compute nodes. When the file to be processed is cached in a compute node of the cluster system, the Agent of the first node may obtain the file from the local storage of the compute node corresponding to the first file global unique identifier based on the secondary directory and the primary directory. In this way, the local storage of the compute nodes forms an additional elastic cache pool, which increases the storage capacity of the cluster system, reduces the storage pressure on the file storage system, and improves the service capability of the file storage system.
The present application may provide an entry contention mechanism by which the Agent of the first node performs steps S311 to S314. Exemplarily, the entry contention mechanism provided in the embodiment of the present application is described below with reference to fig. 5, and the specific steps are as follows:
It should be noted that the entry contention mechanism provided in the embodiment of the present application covers contention for entry resources in the primary directory.
S501, the Agent of the first node checks the table entry state bit of the blank table entry.
The entry status bit is used to identify whether the address of the current metadata entry is occupied. The entry status bit takes a first value to indicate that the entry is occupied and a second value to indicate that the entry is available. For example, in this embodiment of the present application, the entry status bit is 1 bit wide; the first value is 1, indicating that the entry is occupied, and the second value is 0, indicating that the entry is available.
S502, when the entry status bit is the second value, the Agent of the first node inverts the entry status bit.
The first value and the second value are two opposite values. When the entry status bit is the second value, the first node determines that the blank entry is available. The Agent then inverts the entry status bit, so that other compute nodes that detect the entry as unavailable do not contend for it.
S503, the Agent of the first node writes the Agent global unique identifier into the blank table entry.
S504, the Agent of the first node checks the table entry state bit of the blank table entry.
In one possible scenario, L compute nodes contend for the same blank entry at the same time. Taking L = 2 as an example, the 2 compute nodes simultaneously detect that the entry status bit is the second value, and both invert it. The entry status bit is inverted to the first value and then inverted back to the second value, so when the Agent of the first node checks the blank entry again, the entry status bit is the second value. If both compute nodes then wrote their Agent global unique identifiers into the blank entry, the entry would be written incorrectly. The Agent of the first node can detect this case by re-checking the entry status bit.
If the table entry status bit is the second value, step S507 is executed.
S505, if the entry status bit is the first value, the Agent of the first node determines whether the Agent global unique identifier in the blank entry is the Agent global unique identifier of the first node.
Taking L = 3 as an example, the entry status bit is still the first value after being inverted three times. In this case, another compute node may have written meta-information into the blank entry, so the Agent of the first node checks the Agent global unique identifier in the entry to determine whether it is the one written by the first node.
If the Agent global unique identifier in the blank table entry is not the Agent global unique identifier of the first node, step S507 is executed.
S506, if the Agent global unique identifier in the blank entry is the Agent global unique identifier of the first node, the Agent of the first node writes the other meta-information into the blank entry.
After determining that the Agent global unique identifier in the blank entry was written by the first node, the first node writes the remaining meta-information of the metadata entry into the blank entry. For example, the other meta-information may be the entry state, the file read counter, the file global unique identifier, and the address information of the node in which the cached file is stored.
The Agent of the first node successfully contends for the blank table entry, can read the file to be processed from the file storage system and cache the file to the local, and simultaneously updates the second-level directory.
S507, the Agent of the first node checks the count; if the count does not exceed N, the Agent performs S501 on the next blank entry and increments the count by one.
The count indicates the number of times the Agent of the first node has contended for a blank entry. After contention for the current entry fails, the first node contends for the next blank entry, until the contention succeeds or the count reaches the limit.
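Steps S501 to S507 may be sketched as a claim phase followed by a confirm phase. This is an in-memory model under stated assumptions: in the described system the entry resides in the primary directory of the file storage system and each check or write is a separate remote access, which is what allows the interleavings discussed above. The names `Entry`, `claim`, `confirm`, and `contend` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    status_bit: int = 0          # 0 = available, 1 = occupied
    agent_guid: str = ""
    file_guid: str = ""


def claim(entry: Entry, agent_guid: str) -> bool:
    """S501-S503: invert an available status bit and write the Agent GUID."""
    if entry.status_bit != 0:            # S501: entry already occupied
        return False
    entry.status_bit ^= 1                # S502: invert the status bit
    entry.agent_guid = agent_guid        # S503
    return True


def confirm(entry: Entry, agent_guid: str, file_guid: str) -> bool:
    """S504-S506: re-check the entry and write the remaining meta-information."""
    if entry.status_bit != 1:            # S504: bit flipped back by a competitor
        return False
    if entry.agent_guid != agent_guid:   # S505: another Agent holds the entry
        return False
    entry.file_guid = file_guid          # S506
    return True


def contend(entries: List[Entry], start: int, agent_guid: str,
            file_guid: str, max_tries: int) -> Optional[int]:
    """S507: try successive entries until one is won or max_tries is reached."""
    if not entries:
        return None
    for count in range(max_tries):
        index = (start + count) % len(entries)
        entry = entries[index]
        if claim(entry, agent_guid) and confirm(entry, agent_guid, file_guid):
            return index
    return None
```

With two competitors the status bit returns to 0 and `confirm` fails for both; with three competitors the bit stays at 1 and only the Agent whose identifier was written last passes the check in S505.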
According to the file processing method provided by the embodiment of the application, an entry contention mechanism is provided for the primary directory, which reduces cases in which entry meta-information is written incorrectly. Re-checking the entry status bit and the Agent global unique identifier reduces deadlock, starvation, and similar situations among multiple compute nodes, and improves the accuracy of obtaining files based on the primary directory in the embodiment of the application.
It can be understood that the file processing method in the embodiment of the present application improves the storage IO performance of the K8s cluster system through the local storage and Cache storage of the compute node. Meanwhile, the file processing method provided by the embodiment of the application can improve the performance of the elastic cache pool based on the read-write access capability of the Agent of the computing node to the primary directory and the secondary directory. The above embodiment performs a file read operation on the primary directory and the secondary directory by the Agent of the first node. The file write operation is described below with reference to fig. 6.
Illustratively, fig. 6 shows a schematic flowchart of a file writing operation provided by an embodiment of the present application. As shown in fig. 6:
S601, when the first node receives a file write operation, the first node obtains a second file global unique identifier corresponding to the file write operation.
The first node continues to hijack the POSIX API, and when it detects a file write operation, it obtains the second file global unique identifier corresponding to that operation. The second file global unique identifier is the identifier of the file to be written by the first node.
S602, the Agent of the first node searches the secondary directory of the first node.
S603, if the secondary directory comprises the second file global unique identifier, the Agent of the first node checks whether a read counter corresponding to the second file global unique identifier in the secondary directory returns to zero.
It will be appreciated that the primary directory and the secondary directory may support multiple compute nodes to perform file read operations simultaneously. However, when other compute nodes read a file based on a metadata entry, the first node can no longer write the file based on the metadata entry.
When the first node performs a file write operation, it confirms that the read counter corresponding to the second file global unique identifier in the secondary directory has returned to zero, and it rejects subsequent file read operations from compute nodes.
S604, if the read counter has returned to zero, the Agent of the first node deletes the metadata entry corresponding to the second file global unique identifier in the primary directory.
S605, the Agent of the first node deletes the metadata entry corresponding to the second file global unique identifier in the secondary directory.
For example, the embodiment of the present application may delete the metadata entry in the primary directory first. Deleting the entry in the secondary directory only after the entry in the primary directory has been deleted reduces the chance that the secondary directory entry is deleted successfully while deletion of the primary directory entry fails. The entry in the secondary directory may alternatively be deleted first, which is not limited in this application.
S606, if the secondary directory does not comprise the global unique identifier of the second file, the Agent of the first node checks whether the primary directory comprises the global unique identifier of the second file.
S607, if the primary directory includes the global unique identifier of the second file, the Agent of the first node sends a file deletion request to the second node.
When the first node deletes the metadata entry, the first node can only delete the metadata entry of the secondary directory and the primary directory corresponding to the first node, and cannot empty the metadata entry of the secondary directory in the second node based on the primary directory. On the basis, the first node can send a file deletion request corresponding to the second file global unique identifier to the second node.
S608, the Agent of the second node deletes the metadata entry corresponding to the second file global unique identifier in the primary directory based on the file deletion request.
S609, the Agent of the second node deletes the metadata entry corresponding to the second file global unique identifier in the secondary directory.
The first node may perform step S610 after determining that the metadata entry corresponding to the global unique identifier of the second file has been deleted by the second node. It will be appreciated that the computing node may delete the cached file in the local cache upon determining that the metadata entry in the primary directory and the secondary directory is deleted.
S610, the Agent of the first node writes the file in the local of the first node by using the NAS Client.
For example, the flow of the first node requesting the second node to delete the metadata entry corresponding to the globally unique identifier of the second file is shown in fig. 7.
S701, the Agent of the second node obtains a file deletion request.
S702, the Agent of the second node waits for the read counter to return to zero. During this period, the second node rejects subsequent file read operations by compute nodes on the metadata entry corresponding to the second file global unique identifier in the secondary directory of the second node.
S703, if the waiting time exceeds the time threshold, the second node notifies the first node that the file deletion has failed. The time threshold can be customized.
S704, if the waiting time does not exceed the time threshold, the Agent of the second node deletes the metadata entry corresponding to the second file global unique identifier in the primary directory and the secondary directory of the second node.
S705, after successfully deleting the metadata entry, the Agent of the second node notifies the first node that the file deletion has succeeded.
It is understood that steps S701-S705 may be performed in the order of fig. 7, or may be performed in other orders. For example, step S703 is executed after S702 is determined as "yes", and S704 and S705 are executed after S702 is determined as "no". Step S703 may be before S704, or after S704 and before S705, which is not limited in this application.
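The handling of a file deletion request by the Agent of the second node (steps S701 to S705) may be sketched as a bounded wait followed by two deletions. The dictionary-based directories, the `read_counter` callback, and the polling interval are assumptions for illustration; rejecting new readers during the wait is not modelled here.

```python
import time
from typing import Callable, Dict


def handle_delete_request(
    guid: str,
    primary: Dict[str, str],
    secondary: Dict[str, str],
    read_counter: Callable[[], int],
    time_threshold: float,
    poll_interval: float = 0.01,
) -> bool:
    """Delete the entries for guid once no reader remains; False on timeout."""
    deadline = time.monotonic() + time_threshold
    while read_counter() != 0:           # S702: wait for the counter to reach zero
        if time.monotonic() >= deadline:
            return False                 # S703: report deletion failure
        time.sleep(poll_interval)
    primary.pop(guid, None)              # S704: primary directory entry first
    secondary.pop(guid, None)            # S704: then the secondary directory entry
    return True                          # S705: report deletion success
```

The return value stands in for the success or failure notification sent back to the first node, which proceeds to step S610 only after a successful deletion.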
According to the file processing method provided by the embodiment of the application, the metadata table entry corresponding to the global unique identifier of the second file is deleted in the secondary directory and the primary directory corresponding to the computing node, so that an effective file deleting mechanism is constructed. During file writing operation, the computing node can utilize the file writing capacity of the file storage system, and the overall performance of the K8s cluster system is improved.
It can be understood that, in the file processing method provided in the embodiment of the present application, the first node may actively acquire a file from the elastic cache pool through the Agent when the conditions are satisfied, and no central manager needs to be configured to schedule the joining and removal of compute nodes. When any compute node is removed, the cached files in that node and the corresponding meta-information in the primary directory and the secondary directory are deleted, which improves the accuracy of the correspondence between the directories and the elastic cache pool.
The elastic cache pool provided in the embodiment of the present application is described below with reference to the above embodiments and fig. 8. Fig. 8 shows an example architecture diagram of a K8s cluster system, as shown in fig. 8:
The K8s cluster system may include elastic compute nodes and a file storage system. The elastic compute nodes may include elastic compute node #0, elastic compute node #1, ..., elastic compute node #n. Each elastic compute node may include a secondary directory, an Agent, POSIX, a NAS Client, and local storage. The secondary directory may be stored in the memory or the local storage of the compute node. The Agent acts as a proxy for communication between its compute node and the nodes in the compute cluster. POSIX is used for monitoring and hijacking, in real time, the read and write accesses of the compute node to the file storage system. The Agent may use the NAS Client to retrieve files from the file storage system. The file storage system may include a primary directory and a plurality of file storage pools, and the primary directory may be stored in a file storage pool.
For example, taking elastic compute node #0 as the first node and elastic compute node #1 as the second node, when POSIX in the first node hijacks a read-write access task that acquires a file to be processed from the file storage system, the Agent of the first node may acquire the first file global unique identifier and retrieve the secondary directory of the first node. If the secondary directory includes the first file global unique identifier, the Agent of the first node acquires the file from local storage. If the secondary directory does not include the first file global unique identifier, the Agent of the first node retrieves the primary directory through the storage network.
If the primary directory includes the first file global unique identifier, the Agent of the first node acquires the address of the second node from the primary directory. The Agent of the first node establishes communication with the Agent of the second node over the computing network or the storage network. After obtaining the file read permission, the Agent of the first node acquires the file from the local storage of the second node.
If neither the primary directory nor the secondary directory includes the first file global unique identifier, the Agent of the first node may acquire the file from the file storage pool by using the NAS Client over the storage network.
It should be noted that the elastic cache pool is the aggregate of the local storage of elastic compute nodes #0 to #n. Fig. 8 should be understood as defining, by way of example, the aggregate of the local storage of elastic compute nodes #0 to #n as the elastic cache pool, rather than as showing an elastic cache pool established separately from the elastic compute nodes.
According to the file processing method provided by the embodiment of the application, the local storage space of the computing node is constructed into the additional elastic cache pool, the upper limit of the capacity of the cluster system is expanded, meanwhile, the first node can actively acquire file data according to the directory, the management and scheduling of the central management node are not needed, and the service capacity of the file storage system is improved.
The method provided by the embodiment of the present application is described above with reference to fig. 1 to 8, and the related apparatus for performing the method provided by the embodiment of the present application is described below.
The file processing apparatus provided by the embodiments of the present application is applied to a computing cluster. The computing cluster comprises a network attached storage (NAS) file storage system and a plurality of compute nodes. The file storage system is provided with a primary directory, and each compute node is provided with a secondary directory and a compute node Agent. The primary directory includes meta-information of the files cached in the compute nodes; the secondary directory of any compute node includes meta-information of the files cached locally by that compute node; and the compute node Agent of any compute node acts as a proxy for communication between that compute node and each node in the computing cluster. The apparatus includes:
a meta-information acquisition module of the first node, configured to acquire the globally unique identifier of a first file to be processed, where the first node is any compute node in the computing cluster;
a file acquisition module of the first node, configured to acquire the file corresponding to the first file globally unique identifier locally at the first node when a judging module of the first node determines that the secondary directory includes the first file globally unique identifier.
Alternatively, the meta-information acquisition module of the first node is further configured to acquire the address of a second node from the primary directory when the judging module of the first node determines that the secondary directory does not include the first file globally unique identifier and the primary directory includes the first file globally unique identifier, where the second node is a node that stores the file corresponding to the first file globally unique identifier.
The file acquisition module of the first node is further configured to acquire the file corresponding to the first file globally unique identifier from the second node according to the address of the second node.
Optionally, the meta-information is stored in the form of a plurality of metadata entries, and any metadata entry of the primary directory includes: an entry state, a file read counter, a file globally unique identifier, an Agent globally unique identifier, and address information of the node where the cached file is stored.
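As an illustration only, the five fields of a metadata entry listed above could be modelled as a small record. The field names, the types, and the encoding of the entry state (1 for occupied, 0 for available) are assumptions made for this sketch; the text names the fields but does not fix their representation.

```python
from dataclasses import dataclass

# Assumed encoding of the entry state; the text only speaks of a
# "first value" (occupied) and a "second value" (available).
OCCUPIED = 1
AVAILABLE = 0


@dataclass
class MetadataEntry:
    entry_state: int = AVAILABLE  # whether the entry is occupied
    read_counter: int = 0         # number of reads currently in progress
    file_guid: str = ""           # file globally unique identifier
    agent_guid: str = ""          # Agent globally unique identifier
    node_address: str = ""        # address of the node storing the cached file

    def is_blank(self) -> bool:
        """A blank entry is one whose state marks it as available."""
        return self.entry_state == AVAILABLE
```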
Optionally, the file processing apparatus further includes:
a test module of the first node, configured to test access to the primary directory in the file storage system when the first node starts; and
a creating module of the first node, configured to create the primary directory in the file storage system if the primary directory does not exist in the file storage system.
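The start-up check can be pictured with the sketch below. It assumes, purely for illustration, that the primary directory is a single table file under a mount point of the file storage system; the function name `ensure_primary_directory` and the file name `primary_directory.tbl` are hypothetical. Exclusive creation is used so that two nodes starting at the same time do not both create the table, which is a design choice of this sketch rather than something stated in the text.

```python
import os


def ensure_primary_directory(storage_root: str,
                             name: str = "primary_directory.tbl") -> bool:
    """Return True if this call created the primary directory table."""
    path = os.path.join(storage_root, name)
    # Test access first: an existing, readable and writable table is reused.
    if os.access(path, os.R_OK | os.W_OK):
        return False
    try:
        # O_EXCL makes creation fail if another node created it meanwhile.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_RDWR, 0o644)
    except FileExistsError:
        return False
    os.close(fd)
    return True
```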
Optionally, the file processing apparatus further includes a module that operates before the meta-information acquisition module of the first node acquires the globally unique identifier of the first file to be processed:
a hijacking module of the first node, configured to hijack the application programming interface (API) used to access the files stored in the file storage system.
Optionally, the meta-information acquisition module of the first node is specifically configured to acquire, when the first node receives a file read operation, the first file globally unique identifier corresponding to that file read operation.
Optionally, the file processing apparatus further includes the following modules, which operate after the meta-information acquisition module of the first node acquires the globally unique identifier of the first file to be processed:
a retrieval module of the first node, configured to search the secondary directory in the first node; and
a counting module of the first node, configured to add one to the read counter corresponding to the first file globally unique identifier in the secondary directory if the secondary directory contains the first file globally unique identifier.
The counting module of the first node is further configured to subtract one from the read counter corresponding to the first file globally unique identifier in the secondary directory after the first node finishes reading the file locally.
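The paired increment and decrement of the read counter can be sketched with a context manager, which guarantees that the counter is decremented even if the read fails. The dictionary layout of the secondary directory and the name `counted_read` are assumptions of this sketch.

```python
from contextlib import contextmanager
from typing import Dict, Iterator


@contextmanager
def counted_read(secondary: Dict[str, dict], file_id: str) -> Iterator[dict]:
    """Hold the read counter of file_id above zero for the duration of a read."""
    entry = secondary[file_id]
    entry["read_counter"] += 1      # a local read is starting
    try:
        yield entry
    finally:
        entry["read_counter"] -= 1  # the local read has finished
```

A counter that is back at zero then indicates that no local read of the file is in progress, which is the condition checked before the metadata entries are deleted on a write.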
Optionally, when the file acquisition module of the first node acquires the file corresponding to the first file globally unique identifier from the second node according to the address of the second node, specifically:
a communication module of the first node requests file read permission from the Agent of the second node.
When the communication module of the first node obtains the read permission, the file acquisition module of the first node acquires the file corresponding to the first file globally unique identifier from the second node.
Optionally, in the file processing apparatus:
when the judging module of the first node determines that the primary directory does not include the first file globally unique identifier, the meta-information acquisition module of the first node acquires the metadata entry address that corresponds to the first file globally unique identifier in the primary directory.
When the judging module of the first node determines that this metadata entry address is already in use, the retrieval module of the first node continues to search the primary directory for a blank entry.
If the retrieval module of the first node finds a blank entry before the number of searches reaches N, the meta-information editing module of the first node writes the metadata entry corresponding to the first file globally unique identifier into the blank entry.
If the write succeeds, the file acquisition module of the first node acquires the file from the file storage system, and the meta-information editing module of the first node updates the secondary directory of the first node.
If the write fails, the meta-information editing module of the first node writes the metadata entry corresponding to the first file globally unique identifier into the next blank entry, and repeats this until the write succeeds or the preset number of writes is reached.
Once the meta-information editing module of the first node has successfully written the metadata entry corresponding to the first file globally unique identifier, the file acquisition module of the first node acquires the file from the file storage system, and the meta-information editing module of the first node updates the secondary directory of the first node.
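The search for a blank entry can be illustrated as a bounded probe over a fixed-size table. The text does not say how the metadata entry address is derived from the file globally unique identifier or in what order further entries are examined; hashing the identifier and probing linearly are assumptions of this sketch, as are the names `home_slot` and `find_blank_slot`.

```python
import hashlib
from typing import List, Optional


def home_slot(file_id: str, table_size: int) -> int:
    """Map a file identifier to its preferred entry address (assumed: a hash)."""
    digest = hashlib.sha256(file_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % table_size


def find_blank_slot(table: List[Optional[dict]], file_id: str,
                    max_probes: int) -> Optional[int]:
    """Probe at most max_probes entries, starting at the home slot.

    Returns the index of the first blank (None) entry, or None if every
    probed entry is already in use.
    """
    start = home_slot(file_id, len(table))
    for step in range(max_probes):
        slot = (start + step) % len(table)
        if table[slot] is None:
            return slot
    return None
```

Bounding the probe count keeps the cost of a miss predictable; when no blank entry is found, the caller can still read the file from the file storage system without registering it.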
Optionally, in the file processing apparatus:
an entry state module of the first node checks the entry state bit of the blank entry. The entry state bit identifies whether the blank entry is occupied: a first value indicates that the blank entry is occupied, and a second value indicates that the blank entry is available.
If the entry state bit is the second value, the entry state module of the first node inverts the entry state bit.
The meta-information editing module of the first node writes the Agent globally unique identifier of the first node into the blank entry.
The entry state module of the first node checks the entry state bit of the blank entry again.
If the entry state bit is the first value, the judging module of the first node determines whether the Agent globally unique identifier in the blank entry is the Agent globally unique identifier of the first node.
If so, the meta-information editing module of the first node has won the contention and writes the remaining meta-information into the blank entry.
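The contention steps above amount to a claim followed by a read-back check. The sketch below walks through them on an in-memory entry. It is single-process and therefore not a substitute for whatever atomicity the shared file storage system provides; the values 1 and 0 for the state bit, the dictionary layout, and the `interleave` hook (used only to model a rival Agent overwriting the entry between the write and the re-check) are assumptions.

```python
from typing import Callable, Optional

OCCUPIED, AVAILABLE = 1, 0  # assumed encoding of the first and second value


def blank_entry() -> dict:
    return {"state": AVAILABLE, "agent_guid": "", "file_guid": "",
            "node_address": "", "read_counter": 0}


def try_claim(entry: dict, agent_guid: str, file_guid: str, node_address: str,
              interleave: Optional[Callable[[dict], None]] = None) -> bool:
    """Attempt to claim a blank entry; return True if this Agent won."""
    # Step 1: check the state bit; only an available entry can be claimed.
    if entry["state"] != AVAILABLE:
        return False
    # Step 2: invert the state bit, then write our Agent identifier.
    entry["state"] = OCCUPIED
    entry["agent_guid"] = agent_guid
    if interleave is not None:
        interleave(entry)  # hypothetical hook: a rival writes concurrently
    # Step 3: re-check the state bit and read the Agent identifier back.
    if entry["state"] == OCCUPIED and entry["agent_guid"] == agent_guid:
        # Contention won: write the remaining meta-information.
        entry["file_guid"] = file_guid
        entry["node_address"] = node_address
        entry["read_counter"] = 0
        return True
    return False
```

The read-back in step 3 is what lets an Agent detect that another Agent overwrote its identifier after both saw the entry as available.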
Optionally, in the file processing apparatus:
if the meta-information editing module of the first node loses the contention, the counting module of the first node checks the count.
If the count does not exceed N, the meta-information editing module of the first node contends for the next blank entry, and the counting module updates the count to N + 1.
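The bounded retry can be sketched as a loop that moves on to the next blank entry after each lost contention and gives up once the attempt limit is reached. The function name, the `try_claim` callable, and the convention of counting attempts from zero and adding one per attempt are assumptions of this sketch.

```python
from typing import Callable, Iterable, Optional


def contend_with_retries(blank_slots: Iterable[int],
                         try_claim: Callable[[int], bool],
                         max_attempts: int) -> Optional[int]:
    """Contend for successive blank entries, at most max_attempts times.

    Returns the index of the entry that was won, or None if every attempt
    was lost or no blank entry was left.
    """
    attempts = 0
    for slot in blank_slots:
        if attempts >= max_attempts:
            break           # the count has reached the limit: stop contending
        attempts += 1       # one more contention attempt
        if try_claim(slot):
            return slot
    return None
```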
Optionally, after the hijacking module of the first node hijacks the application programming interface (API) used to access the files stored in the file storage system:
when the hijacking module of the first node receives a file write operation, the meta-information acquisition module of the first node acquires the second file globally unique identifier corresponding to the file write operation.
The retrieval module of the first node searches the secondary directory of the first node.
If the secondary directory includes the second file globally unique identifier, the counting module of the first node checks whether the read counter corresponding to the second file globally unique identifier in the secondary directory has returned to zero.
If the read counter has returned to zero, the meta-information deleting module of the first node deletes the metadata entries corresponding to the second file globally unique identifier from the primary directory and the secondary directory.
Optionally, when the retrieval module of the first node searches the secondary directory of the first node:
if the secondary directory does not include the second file globally unique identifier and the primary directory includes the second file globally unique identifier, the communication module of the first node sends a file deletion request to the communication module of the second node.
Based on the file deletion request, the meta-information deleting module of the second node deletes the metadata entries corresponding to the second file globally unique identifier from the primary directory and the secondary directory.
Optionally, the file processing apparatus further includes:
a file writing module of the first node, configured to write the file locally at the first node.
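The write handling described above can be summarised in one sketch: stale metadata entries are removed first, either locally or by asking the node that caches the file, and the file is then written locally. The in-memory dictionaries, the function name `handle_write`, and the direct dictionary update standing in for the file deletion request are assumptions. Returning "busy" when the local read counter is not zero is also an assumption, because the text does not say what happens in that case.

```python
from typing import Dict


def handle_write(file_id: str, data: bytes, node_name: str,
                 nodes: Dict[str, dict], primary: Dict[str, str]) -> str:
    """Invalidate cached metadata for file_id, then write the file locally."""
    node = nodes[node_name]
    secondary = node["secondary"]
    if file_id in secondary:
        # Cached on this node: only invalidate once no read is in progress.
        if secondary[file_id]["read_counter"] != 0:
            return "busy"  # assumed outcome; not specified in the text
        del secondary[file_id]
        primary.pop(file_id, None)
    elif file_id in primary:
        # Cached on another node: stands in for the file deletion request
        # sent to that node's Agent, which deletes both entries.
        owner = nodes[primary[file_id]]
        owner["secondary"].pop(file_id, None)
        del primary[file_id]
    # Finally, write the file locally at this node.
    node["files"][file_id] = data
    return "written"
```

In this sketch `primary` maps a file identifier to the name of the node that caches it, which simplifies the address information held in a real metadata entry.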
The file processing apparatus of the embodiment of the present application may be configured to execute the technical solution of the embodiment of the file processing method, and the implementation principle and the technical effect of the file processing apparatus are similar and will not be described herein again.
Fig. 9 is a schematic structural diagram of a file processing terminal device according to an embodiment of the present application. As shown in fig. 9, the file processing terminal device 90 provided in this embodiment may include:
a processor 901; and a memory 902 for storing executable instructions of the terminal device.
The processor is configured to execute the technical solution of the above-mentioned file processing method embodiment by executing the executable instructions, and the implementation principle and technical effect are similar, which are not described herein again.
The embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the technical solution of the embodiment of the file processing method is implemented, and the implementation principle and the technical effect are similar, and are not described herein again.
The embodiment of the present application further provides a computer program product, which includes a computer program, and when the computer program is executed by a processor, the computer program implements the technical solution of the embodiment of the file processing method, and the implementation principle and the technical effect are similar, and are not described herein again.
In the above specific implementation of the terminal device or the server, it should be understood that the processor may be a Central Processing Unit (CPU), or may be another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), or the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. The steps of a method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware processor, or may be implemented by a combination of hardware and software modules in a processor.
Those skilled in the art will appreciate that all or a portion of the steps of any of the above-described method embodiments may be performed by hardware associated with program instructions. The program may be stored in a computer-readable storage medium, and when executed, performs all or part of the steps of the above-described method embodiments.
If the technical solution of the present application is implemented in software and sold or used as a product, it can be stored in a computer-readable storage medium. Based on this understanding, all or part of the technical solutions of the present application may be embodied in the form of a software product stored in a storage medium, including a computer program or several instructions. The computer software product enables a computer device (which may be a personal computer, a server, a network device, or a similar electronic device) to perform all or part of the steps of the method according to an embodiment of the present application. The storage medium may be any medium capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
Finally, it should be noted that: the above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present application.

Claims (14)

1. A file processing method, characterized in that the method is applied to a computing cluster, the computing cluster comprises a file storage system and a plurality of computing nodes, the file storage system is provided with a primary directory, each computing node is provided with a secondary directory and a computing node Agent, and the primary directory comprises meta-information of files cached in the computing nodes; the secondary directory of any one of the computing nodes comprises meta-information of files cached locally by that computing node; the computing node Agent of any one of the computing nodes acts as a proxy for communication between that computing node and each node in the computing cluster; the method comprises the following steps:
a first node acquires a global unique identifier of a first file to be processed; the first node is any one computing node in the computing cluster;
when the first node determines that the secondary directory comprises the first file global unique identifier, the first node locally acquires a file corresponding to the first file global unique identifier;
or, when the first node determines that the second-level directory does not include the first file global unique identifier and the first-level directory includes the first file global unique identifier, the first node acquires an address of a second node in the first-level directory, wherein the second node is a node in which a file corresponding to the first file global unique identifier is stored;
and the first node acquires the file corresponding to the first file global unique identifier from the second node according to the address of the second node.
2. The method of claim 1, wherein the meta-information is stored in the form of a plurality of metadata entries, and any metadata entry of the primary directory comprises: an entry state, a file read counter, a file globally unique identifier, an Agent globally unique identifier, and address information of the node where the cached file is stored.
3. The method according to claim 1 or 2, characterized in that the method further comprises:
when the first node is started, the Agent of the first node tests the access capability of a primary directory in the file storage system;
if the primary directory does not exist in the file storage system, the primary directory is created in the file storage system.
4. The method according to claim 3, wherein before the first node obtains the globally unique identifier of the first file to be processed, the method further comprises:
and the Agent of the first node hijacks an Application Programming Interface (API) for accessing the files stored in the file storage system.
5. The method according to claim 1 or 2, wherein the first node obtains the globally unique identifier of the first file to be processed, and comprises:
when the first node obtains the file reading operation, the first node obtains a first file global unique identifier corresponding to the file reading operation.
6. The method according to claim 5, wherein after the first node obtains the globally unique identifier of the first file to be processed, the method further comprises:
the Agent of the first node retrieves a secondary directory in the first node;
if the second-level directory has the first file global unique identifier, the Agent of the first node adds one to a read counter corresponding to the first file global unique identifier in the second-level directory;
and after the first node finishes reading the file from the local, the Agent of the first node subtracts one from a reading counter corresponding to the first file global unique identifier in the secondary directory.
7. The method according to claim 6, wherein the obtaining, by the first node, the file corresponding to the first file globally unique identifier from the second node according to the address of the second node, comprises:
the Agent of the first node acquires file reading permission from the Agent of the second node;
and when the Agent of the first node obtains the permission of reading, the Agent of the first node acquires the file corresponding to the global unique identifier of the first file from the second node.
8. The method of claim 7, further comprising:
if the primary directory does not comprise the first file global unique identifier, the Agent of the first node acquires the metadata entry address corresponding to the first file global unique identifier in the primary directory;
if the metadata entry address is already in use, the Agent of the first node continues to search the primary directory for a blank table entry;
if the Agent of the first node finds a blank table entry before the number of searches reaches N, the Agent of the first node writes a metadata entry corresponding to the first file global unique identifier into the blank table entry;
if the write by the Agent of the first node succeeds, the first node acquires the file from the file storage system and modifies the secondary directory of the first node;
if the write by the Agent of the first node fails, the Agent of the first node writes the metadata entry corresponding to the first file global unique identifier into the next blank table entry, until the write succeeds or the preset number of writes is reached;
and if the first node successfully writes the metadata entry corresponding to the first file global unique identifier, the first node acquires the file from the file storage system and modifies the secondary directory of the first node.
9. The method of claim 8, further comprising:
the Agent of the first node checks the table entry state bit of the blank table entry; the table entry state bit is used for identifying whether the blank table entry is occupied or not; the first value is used for identifying that the blank table entry is occupied, and the second value is used for identifying that the blank table entry is available;
when the table entry state bit is a second value, the Agent of the first node inverts the table entry state bit;
the Agent of the first node writes the Agent global unique identifier of the first node into the blank table entry;
the Agent of the first node checks the table entry state bit of the blank table entry;
if the table entry state bit is a first value, the Agent of the first node judges whether an Agent global unique identifier in the blank table entry is the Agent global unique identifier of the first node;
and if so, the Agent of the first node has won the contention and writes the remaining meta-information into the blank table entry.
10. The method of claim 9, further comprising:
if the Agent of the first node does not successfully contend, the Agent of the first node checks the count;
and if the count does not exceed N, performing contention for the next blank table entry, and updating the count to be N + 1.
11. The method of claim 4, wherein after the Agent of the first node hijacks an Application Programming Interface (API) for accessing the files stored in the file storage system, the method further comprises:
when the first node obtains a file write operation, the first node Agent obtains a second file global unique identifier corresponding to the file write operation;
the Agent of the first node retrieves the secondary directory of the first node;
if the secondary directory comprises the second file global unique identifier, the Agent of the first node checks whether a read counter corresponding to the second file global unique identifier in the secondary directory returns to zero;
and if the reading counter returns to zero, the Agent of the first node deletes the metadata entry corresponding to the global unique identifier of the second file in the primary directory and the secondary directory.
12. The method of claim 11, wherein the Agent of the first node retrieving the secondary directory of the first node further comprises:
if the secondary directory does not comprise the second file global unique identifier and the primary directory comprises the second file global unique identifier, the Agent of the first node sends a file deletion request to the Agent of the second node;
and deleting the metadata table entry corresponding to the second file global unique identifier in the primary directory and the secondary directory by the Agent of the second node based on the file deletion request.
13. A terminal device, comprising: a memory and a processor; the memory is to store computer instructions; the processor is configured to execute the computer instructions stored by the memory to implement the method of any of claims 1-12.
14. A computer-readable storage medium, having stored thereon a computer program for execution by a processor to perform the method of any one of claims 1-12.
CN202210283907.8A 2022-03-21 2022-03-21 File processing method, device and storage medium Pending CN114860655A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210283907.8A CN114860655A (en) 2022-03-21 2022-03-21 File processing method, device and storage medium

Publications (1)

Publication Number Publication Date
CN114860655A true CN114860655A (en) 2022-08-05

Family

ID=82627283

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210283907.8A Pending CN114860655A (en) 2022-03-21 2022-03-21 File processing method, device and storage medium

Country Status (1)

Country Link
CN (1) CN114860655A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106326239A (en) * 2015-06-18 2017-01-11 阿里巴巴集团控股有限公司 Distributed file system and file meta-information management method thereof
US20200104216A1 (en) * 2018-10-01 2020-04-02 Rubrik, Inc. Fileset passthrough using data management and storage node
CN113868251A (en) * 2021-09-24 2021-12-31 北京百度网讯科技有限公司 Global secondary indexing method and device for distributed database

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
TORRES, E.等: "A quantitative justification to dynamic partial replication of Web contents through an agent architecture", 《INTERNATIONAL JOURNAL OF INTERACTIVE MULTIMEDIA AND ARTIFICIAL INTELLIGENCE》, 2 July 2015 (2015-07-02), pages 82 - 89 *
杨宇: "分布式存储系统中海量文件随机存取技术研究", 《中国优秀硕士学位论文全文数据库 信息科技辑》, 15 August 2015 (2015-08-15), pages 137 - 40 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination