CN113010479A - File management method, device and medium - Google Patents

Info

Publication number
CN113010479A
CN113010479A (Application CN202110290894.2A)
Authority
CN
China
Prior art keywords
file
file directory
directory
label
management method
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110290894.2A
Other languages
Chinese (zh)
Inventor
姬贵阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Yingxin Computer Technology Co Ltd
Original Assignee
Shandong Yingxin Computer Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Yingxin Computer Technology Co Ltd filed Critical Shandong Yingxin Computer Technology Co Ltd
Priority to CN202110290894.2A priority Critical patent/CN113010479A/en
Publication of CN113010479A publication Critical patent/CN113010479A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10: File systems; File servers
    • G06F16/16: File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G06F16/17: Details of further file system functions
    • G06F16/1734: Details of monitoring file system events, e.g. by the use of hooks, filter drivers, logs
    • G06F16/18: File system types
    • G06F16/182: Distributed file systems
    • G06F16/1824: Distributed file systems implemented using Network-attached Storage [NAS] architecture

Abstract

The file management method, device, and medium provided herein label the file directory of a file system with a file directory tag. Because the tag contains at least one parameter of the directory, such as its size, the number of folders it contains, and the number of files, an AI platform reading and transmitting files through a compute node, especially when operating on large files, can obtain the tag directly. The compute node therefore no longer has to count file sizes and numbers over the network, which preserves the I/O efficiency of file access on the network and increases the speed of traversing, reading, and writing the file directory. The higher read/write speed in turn reduces the share of the AI platform's I/O resources consumed by file output, improves model-training efficiency, and improves the experience of the platform's algorithm engineers.

Description

File management method, device and medium
Technical Field
The present application relates to the field of internet technologies, and in particular to a file management method, apparatus, and medium.
Background
With the rapid development of artificial intelligence (AI), more and more researchers at enterprises and universities train deep-learning models on AI platforms. An important function of such a platform is to read and write files on network storage through compute nodes, the network storage being mounted on each compute node over the network.
At present, an AI platform's file operations consist mainly of display operations and transfer operations. Before a directory can be displayed, its size must be repeatedly computed over the network; before a transfer, the directory must be split into blocks and packaged, which requires knowing the directory structure and file sizes in order to determine the remaining disk space. Because files are stored in blocks, reading, writing, and transferring files on a large-scale cluster AI platform is extremely slow in existing practice, for reasons including the network itself; directory traversal is very slow under concurrency, and reads are sometimes blocked by locks. Traversing the directory also occupies a large share of the AI platform's resources, driving cluster read/write I/O high, which disrupts the normal training of other models and the use of other modules on the platform.
Therefore, the technical problem to be solved by those skilled in the art is how to increase the speed of traversing, reading, and writing a file directory while reducing the AI-platform resources occupied by file output.
Disclosure of Invention
The application aims to provide a file management method, apparatus, and medium that increase the speed of traversing, reading, and writing a file directory and reduce the share of the AI platform's I/O resources consumed by file output.
In order to solve the above technical problem, the present application provides a file management method, including:
acquiring a file directory of a file system;
constructing a file directory label for the file directory;
when an acquisition request of a computing node is acquired, the file directory label is sent to the computing node;
wherein the file directory tag comprises at least one parameter of the file directory.
Preferably, after obtaining the file directory of the file system, the method further includes:
and ordering the file directories through a hash algorithm to construct an ordered file-directory queue.
Preferably, after constructing the file directory tag for the file directory, the method further includes:
monitoring the file directory;
and updating the file directory label when the file directory changes.
Preferably, updating the file directory tag when the file directory changes includes:
locking the file directory label;
modifying the file directory label according to the change condition of the file directory;
and releasing the file directory label.
Preferably, monitoring the file directory specifically includes:
and monitoring the file directory through Inotify.
Preferably, after monitoring the file directory through Inotify, the method further includes:
and acquiring a change list of the file directory sent by Inotify.
Preferably, the acquisition request is specifically sent when the computing node performs a file transfer.
In order to solve the above technical problem, the present application further provides a file management apparatus, including:
the acquisition module is used for acquiring a file directory of the file system;
the construction module is used for constructing a file directory label for the file directory;
the sending module is used for sending the file directory label to the computing node when an acquisition request of the computing node is acquired;
wherein the file directory tag comprises at least one parameter of the file directory.
In order to solve the above technical problem, the present application further provides a file management apparatus, including a memory for storing a computer program;
a processor, configured to implement the steps of the file management method described above when executing the computer program.
In order to solve the above technical problem, the present application further provides a computer-readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, implements the steps of the file management method as described above.
The file management method provided by the application creates a tag for the file directory of a file system. Because the file directory tag contains at least one parameter of the directory, such as its size, the number of folders it contains, and the number of files, an AI platform reading and transmitting files through a compute node, especially when operating on large files, can obtain the tag directly. This avoids the compute node counting file sizes and numbers over the network, preserves the I/O efficiency of file access on the network, and increases the speed of traversing, reading, and writing the file directory. The higher read/write speed in turn reduces the share of the AI platform's I/O resources consumed by file output, improves model-training efficiency, and improves the experience of algorithm engineers using the platform.
In addition, the file management apparatus and the medium correspond to the method and achieve the same effects.
Drawings
In order to more clearly illustrate the embodiments of the present application, the drawings needed for the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings can be obtained by those skilled in the art without inventive effort.
Fig. 1 is a component architecture of a file management system according to an embodiment of the present application;
FIG. 2 is a flowchart of a file management method according to an embodiment of the present application;
FIG. 3 is a schematic structural diagram of a file management apparatus according to an embodiment of the present application;
fig. 4 is a structural diagram of a file management apparatus according to another embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all the embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without any creative effort belong to the protection scope of the present application.
With the vigorous development of artificial-intelligence industries, researchers at enterprises and universities place ever higher demands on computing power and on unified management of computing resources, demands that AI training platforms effectively meet. The performance of traversing and reading file directories bears directly on the productivity of the platform's algorithm engineers, so efficiently reading and writing network-stored files from compute nodes, improving file read/write performance, and improving model-training efficiency are key indicators of AI-platform performance.
At present, the main bottleneck for file reading, writing, and transfer on a large-scale cluster lies in the cluster network, but for an AI platform's cluster the bottleneck also includes CPU usage by training tasks on the compute nodes; both factors degrade the file read/write efficiency experienced by algorithm researchers. Some organizations set up a single network storage through a management terminal and share it with each compute node, while better-resourced organizations choose distributed network storage. In either case, when the AI platform displays or transfers files it must repeatedly compute directory sizes over the network, split and package directories, and learn the directory structure and file sizes in order to judge the remaining disk space. Because files are stored in blocks, reading, writing, and transferring files on a large-scale cluster AI platform is extremely slow in existing practice, for reasons including the network itself; directory traversal is very slow under concurrency, and reads are sometimes blocked by locks. Traversing the directory also occupies a large share of the AI platform's resources, driving cluster read/write I/O high, which disrupts the normal training of other models and the use of other modules on the platform.
In view of the above problems, the present application provides a file management method, apparatus, and medium for increasing the speed of traversing and reading and writing a file directory and reducing the occupation of I/O resources of an AI platform by file output.
For ease of understanding, a system architecture to which the technical solution of the present application is applicable is described below. Referring to fig. 1, a constituent architecture of a file management system provided in the present application is shown.
As shown in FIG. 1, the file management system provided herein may include compute nodes 1 and a network storage 2. The network storage 2 is mounted on each compute node 1 over the network, and the compute nodes 1 are specifically the compute nodes of an AI platform; the AI platform reads and writes files on the network storage 2 through the compute nodes 1.
In a specific implementation, the file management method provided herein runs on a management node of the network storage 2, and the network storage 2 may take the form of direct-attached storage, network-attached storage, or a storage area network. The network storage 2 comprises storage devices (e.g., disk arrays, CD/DVD drives, tape drives, or removable storage media) and embedded system software that provides cross-platform file sharing. The network storage usually has its own management node on the LAN, allowing users to access data on the network without involving an application server; in this configuration the network storage centrally manages and processes all data on the network and offloads the application or enterprise servers, which effectively reduces the total cost of ownership and protects the user's investment.
Further, after acquiring the file directory of the file system, the management node of the network storage 2 monitors the files stored in the file system, chiefly in combination with Inotify, constructs file directory tags for the directories, and updates the tags in real time whenever a directory changes, for example when a user adds a file or folder, deletes a file or folder, or modifies file contents.
After the file directories in the network storage 2 are tagged, the AI platform uses the tags mainly in four ways. First, to display the file directory and its size in the platform's management interface. Second, when a user operates on a directory, the platform already knows its size and can warn that the directory holds too many files or files that are too large, letting the user choose whether to open it and preventing the network storage 2 from stalling on the file operation. Third, when a data set is cached from network storage to a compute node's local disk, the file sizes needed to compute the remaining cache space are read from the tag, avoiding the network traffic and I/O traversal otherwise needed to total up the data-set directory. Fourth, when a data-set directory is cached locally or a directory is copied and transferred, the AI platform must split the directory into blocks, compute per-block progress information, and support breakpoint resume, which requires exact knowledge of the number and size of files at every level of the directory on the storage side; the file directory tag supplies this and avoids the deadlocks caused by computing file counts and sizes over the network.
In order that those skilled in the art will better understand the disclosure, the following detailed description will be given with reference to the accompanying drawings.
Fig. 2 is a flowchart of a file management method according to an embodiment of the present application. Referring to fig. 2, the file management method includes:
s10: a file directory of a file system is obtained.
In a specific implementation, the file system is the storage system that manages files such as algorithm engineers' training scripts, models, data sets, and training logs. A management node of the network storage acquires the file directory of the file system through Inotify, a Linux kernel feature. Inotify reacts promptly to file events, is very simple to use, and is far more efficient than the busy polling of cron tasks.
It should be noted that Inotify is usually combined closely with application software such as inotify-tools or rsync to solve the file-consistency problem of distributed clusters. Inotify monitors a Linux system well: through the kernel interface it can record the differences among millions of files and observe every kind of change to the files in the file system, without performing real-time synchronization, thereby avoiding the frequent sending of file lists that real-time synchronization causes in current practice.
S11: a file directory tag is constructed for the file directory. Wherein the file directory tag comprises at least one parameter of the file directory.
In a specific implementation, the file directories include users' home directories, shared directories, and data-set directories. The home and shared directories store the AI platform's training scripts, trained models, and logs, while the data-set directories store massive numbers of pictures, file labels, and the like. File directory tags are built in the network storage for these directories; a tag's parameters may include the directory's size, the number, size, creation time, revision time, and revising user of the folders under it, and the same attributes of its files. In this embodiment, the information recorded is chiefly the size of the directory (SIZE), the number of files at each directory level (FILENUM), and the number of folders (DIRNUM). For example, for the file directory data, the tag structure might read: directory name: data; file size: 10 TB; file structure: 100 folders and 10000 files. The directory size is used to reserve disk space when a data set is cached and copied locally, and caching a directory containing very many files requires block transmission. The tag records the number of folders and files directly under each subdirectory (files in deeper subdirectories are not counted again), so no further statistical pass over the directory is needed.
It should be noted that the size of a file directory in this application is the storage space occupied by the files in the directory, which can also be understood as its byte count, usually expressed as a number of bytes with a unit prefix. The actual disk space a file occupies depends on the file system, and a file system's maximum file size depends on the number of bits reserved for storage-size information as well as on the total size of the file system. The size of a file directory may be the accumulated size of all files in the file system, or the size of the files contained in one folder of the file system.
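The tag construction described above can be sketched in Python. This is a minimal illustration under assumptions: the patent specifies no concrete data layout, so the dict fields `name`, `size`, `dirnum`, and `filenum` are hypothetical stand-ins for the SIZE, DIRNUM, and FILENUM parameters.

```python
import os

def build_directory_tag(path):
    """Build a hypothetical file-directory tag recording SIZE, FILENUM, DIRNUM.

    Sketch only: the walk totals file sizes and counts files and folders
    at all levels under `path`; the returned dict layout is illustrative.
    """
    total_size = 0
    file_count = 0
    dir_count = 0
    for root, dirs, files in os.walk(path):
        dir_count += len(dirs)
        file_count += len(files)
        for name in files:
            try:
                total_size += os.path.getsize(os.path.join(root, name))
            except OSError:
                pass  # file vanished mid-walk; skip it
    return {
        "name": os.path.basename(os.path.abspath(path)),
        "size": total_size,     # bytes occupied by files under the directory
        "dirnum": dir_count,    # folders at all levels
        "filenum": file_count,  # files at all levels
    }
```

Counting once on the storage side and serving the cached result is what lets compute nodes skip the network-wide traversal.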
S12: and when an acquisition request of the computing node is acquired, the file directory label is sent to the computing node.
Further, the acquisition request is specifically sent when the computing node performs a file transfer.
In a specific implementation, the AI platform must split the file directory into blocks, compute per-block progress information, and support breakpoint resume, which requires exact knowledge of the number and size of files at every level of the directory on the network-storage side. When the AI platform needs to read or write a file in the file system, that is, when a file must be transferred through a compute node, the compute node sends an acquisition request to the network storage; on receiving the request, the network storage sends the file directory tag to the compute node, avoiding the deadlocks caused by the compute node computing file counts and sizes over the network.
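The block planning and progress computation this step relies on can be sketched as follows. `plan_blocks` and `progress` are hypothetical helpers that derive the transfer plan purely from the SIZE recorded in the tag, so no network stat calls are needed and a transfer can resume from the last completed block.

```python
def plan_blocks(total_size, block_size):
    """Plan fixed-size transfer blocks for a directory of known total size.

    Returns (offset, length) pairs covering total_size bytes, so a
    transfer can report per-block progress and resume after a breakpoint.
    """
    blocks = []
    offset = 0
    while offset < total_size:
        length = min(block_size, total_size - offset)
        blocks.append((offset, length))
        offset += length
    return blocks

def progress(completed_blocks, blocks):
    """Fraction of bytes transferred after `completed_blocks` blocks."""
    done = sum(length for _, length in blocks[:completed_blocks])
    total = sum(length for _, length in blocks)
    return done / total if total else 1.0
```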
It can be understood that the acquisition request may also be sent by the compute node while it is idle; on receiving it, the network storage sends the file directory tag to the compute node so that the node can cache the tag in local storage.
In another embodiment, to secure the transmission, the network storage may encrypt the file directory tag before sending it; the encryption may be symmetric or asymmetric, and an MD5 digest may additionally be attached as an integrity check (MD5 being a hash function rather than an encryption algorithm). Where applicable, the compute node decrypts the received tag with the corresponding key.
Further, to relieve network pressure during transmission and reduce the amount of data transferred, the network storage may also compress the file directory tag before sending it; where applicable, the compute node decompresses the received tag in the corresponding manner.
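One way to combine the compression and integrity-check ideas above, sketched with Python's standard library. The wire format here (JSON, zlib compression, an MD5 digest of the raw bytes) is an assumption for illustration, not the patent's specification.

```python
import hashlib
import json
import zlib

def pack_tag(tag):
    """Compress a tag for transmission and attach an MD5 integrity digest.

    Note: MD5 here only detects corruption; it provides no confidentiality.
    """
    raw = json.dumps(tag, sort_keys=True).encode("utf-8")
    return zlib.compress(raw), hashlib.md5(raw).hexdigest()

def unpack_tag(payload, digest):
    """Decompress a received tag and verify it against the digest."""
    raw = zlib.decompress(payload)
    if hashlib.md5(raw).hexdigest() != digest:
        raise ValueError("file directory tag corrupted in transit")
    return json.loads(raw)
```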
The file management method provided by this embodiment creates a tag for the file directory of a file system. Because the tag contains at least one parameter of the directory, such as its size, the number of folders it contains, and the number of files, an AI platform reading and transmitting files through a compute node, especially when operating on large files, can obtain the tag directly. This avoids the compute node counting file sizes and numbers over the network, preserves the I/O efficiency of file access on the network, and increases the speed of traversing, reading, and writing the file directory. The higher read/write speed in turn reduces the share of the AI platform's I/O resources consumed by file output, improves model-training efficiency, and improves the experience of algorithm engineers using the platform.
Because the network storage is usually a distributed file system, many nodes form a file-system network. The nodes may be distributed across different locations and communicate and transfer data over the network, so the file directories a compute node obtains from them may be scattered and disordered. Therefore, as a preferred embodiment, after S10 the method further includes:
ordering the file directories through a hash algorithm to construct an ordered file-directory queue.
In a specific implementation, the hash function makes the AI platform's access to the ordered file-directory queue faster and more effective, since data elements are located more quickly through the hash function. The directories may be arranged by direct addressing or by a random-number method, although other algorithms may be used; the present application is not limited in this respect.
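A minimal sketch of such an ordering, assuming Python and an MD5-based key; the patent leaves the hash scheme open, so this is one illustrative choice. Sorting paths by digest yields a deterministic queue, and a digest-keyed dict locates an entry without scanning the whole queue.

```python
import hashlib

def hash_key(path):
    """Stable hash key for a directory path (MD5 chosen for illustration)."""
    return hashlib.md5(path.encode("utf-8")).hexdigest()

def build_ordered_queue(paths):
    """Arrange scattered directory paths into a deterministic ordered queue."""
    return sorted(paths, key=hash_key)

def build_index(paths):
    """Hash index for locating a directory entry without scanning."""
    return {hash_key(p): p for p in paths}
```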
Because the file directory is heavily read and written by the AI platform, monitoring of the directory must be configured, and the monitoring is performed in a distributed manner. Further, after S11, the method further includes:
monitoring a file directory;
in the event of a change in the file directory, the file directory tag is updated.
In a specific implementation, and in combination with the AI platform, corresponding monitoring paths are configured for the home directories, shared directories, and data-set directories according to how each directory in the network storage is used. Each monitored path is used differently, and the directory tags generated by monitoring differ accordingly. For example, a data-set directory only has items added or deleted, is never modified, and is only read, never written, while the AI platform uses it; a user's home directory is both read and written, so monitoring it generates more directory-tag updates than monitoring a data set does.
Further, the monitoring file directory specifically includes:
the file directory is monitored through Inotify.
After the file directory is monitored through Inotify, the method further includes:
and acquiring a change list of the file directory sent by Inotify.
The management node in the network storage monitors the file directory in real time using the efficient Inotify kernel feature. A file directory changes mainly in three ways: files and folders are added, files and folders are removed, or file contents are modified. For each kind of change, Inotify sends a change list for the directory to the management node so that the node can update the stored file directory tags.
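The tag update driven by such a change list can be sketched as follows. Because real inotify requires a Linux kernel watch, the `changes` argument here is a hypothetical stand-in for the event list inotify would deliver, covering the three kinds of change named above.

```python
def apply_change_list(tag, changes):
    """Update a file-directory tag from a change list.

    Each entry is (event, kind, size_delta), with event in
    {"create", "delete", "modify"} and kind in {"file", "dir"};
    the tuple layout is an illustrative assumption.
    """
    for event, kind, size_delta in changes:
        if event == "create":
            if kind == "file":
                tag["filenum"] += 1
            else:
                tag["dirnum"] += 1
        elif event == "delete":
            if kind == "file":
                tag["filenum"] -= 1
            else:
                tag["dirnum"] -= 1
        tag["size"] += size_delta  # a "modify" event only changes the size
    return tag
```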
The file management method provided by this embodiment monitors changes to the file directory and updates the file directory tag based on the efficient Inotify kernel feature. Because the network storage is mounted on every compute node over the network, the method supports the AI platform's file display, file copy, and disk-monitoring functions while reducing the passes needed to compute file sizes and counts over the network. This lowers the platform's CPU usage, makes full use of the network transmission rate, and increases the speed of file transfer, display, and verification without slowing AI-platform training, thereby improving training efficiency.
On the basis of the above embodiment, updating the file directory tag when the file directory changes specifically includes:
locking a file directory label;
modifying the file directory label according to the change condition of the file directory;
the file directory tag is released.
In a specific implementation, the file directory tag is first locked so that neither the compute node nor any other node may access it; the management node of the network storage then modifies and updates the tag according to the change list sent by Inotify, and releases the tag after the update, signaling that other users may access it again.
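The lock, modify, release sequence can be sketched with a mutex. `TagStore` is a hypothetical wrapper, since the patent names only the three steps; a per-store lock is one way to keep readers from observing a half-updated tag.

```python
import threading

class TagStore:
    """Tag store that locks the tag for the duration of an update."""

    def __init__(self, tag):
        self._tag = dict(tag)
        self._lock = threading.Lock()

    def update(self, changes):
        with self._lock:              # lock the file directory tag
            self._tag.update(changes)  # modify per the directory's changes
        # leaving the with-block releases the tag for other accessors

    def read(self):
        with self._lock:
            return dict(self._tag)    # copy so callers never share state
```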
The file management method provided by this embodiment locks the file directory tag before it is modified and releases it after the modification is finished, so that compute nodes always obtain the latest tag, improving the training accuracy of the AI platform.
In the above embodiments, the document management method is described in detail, and the present application also provides embodiments corresponding to the document management apparatus. It should be noted that the present application describes the embodiments of the apparatus portion from two perspectives, one from the perspective of the function module and the other from the perspective of the hardware.
Fig. 3 is a schematic structural diagram of a file management apparatus according to an embodiment of the present application. As shown in fig. 3, the apparatus includes, based on the angle of the function module:
an obtaining module 10, configured to obtain a file directory of a file system;
the building module 11 is used for building a file directory label for the file directory;
the sending module 12 is configured to send the file directory label to the computing node when the obtaining request of the computing node is obtained;
wherein the file directory tag comprises at least one parameter of the file directory.
On the basis of the above embodiment, as a preferred embodiment, the apparatus further includes:
the code arranging module is used for arranging codes of the file directories through a Hash algorithm to construct a file directory ordered queue;
the acquisition module 10 is further configured to monitor a file directory;
the building module 11 is further configured to update the file directory tag when the file directory changes.
Since the embodiments of the apparatus portion and the method portion correspond to each other, please refer to the description of the embodiments of the method portion for the embodiments of the apparatus portion, which is not repeated here.
The file management apparatus provided by this embodiment of the application builds a label for a file directory of the file system, and the file directory label contains at least one parameter of the file directory, such as its size, the number of folders it contains, and the number of files. When an AI platform reads and transfers files through a computing node, especially when it operates on large files, the node can obtain the file directory label directly instead of gathering file sizes and counts over the network. This preserves the I/O efficiency of file access over the network and speeds up traversing, reading, and writing the file directory. In turn, the faster reads and writes reduce the I/O resources that file output consumes on the AI platform, improve model-training efficiency, and improve the experience of the platform's algorithm engineers.
Fig. 4 is a structural diagram of a file management apparatus according to another embodiment of the present application. As shown in fig. 4, from the perspective of its hardware structure, the apparatus includes: a memory 20 for storing a computer program;
and a processor 21 for implementing the steps of the file management method of the above embodiments when executing the computer program.
The file management device provided by the embodiment may include, but is not limited to, a smart phone, a tablet computer, a notebook computer, or a desktop computer.
The processor 21 may include one or more processing cores, such as a 4-core or 8-core processor. The processor 21 may be implemented in at least one of the following hardware forms: DSP (Digital Signal Processor), FPGA (Field-Programmable Gate Array), or PLA (Programmable Logic Array). The processor 21 may also include a main processor and a coprocessor: the main processor, also called the CPU (Central Processing Unit), processes data in the awake state, while the coprocessor is a low-power processor that processes data in the standby state. In some embodiments, the processor 21 may integrate a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content to be shown on the display screen. In some embodiments, the processor 21 may further include an AI (Artificial Intelligence) processor for handling machine-learning computations.
The memory 20 may include one or more computer-readable storage media, which may be non-transitory. The memory 20 may also include high-speed random access memory and non-volatile memory, such as one or more magnetic disk storage devices or flash storage devices. In this embodiment, the memory 20 at least stores the computer program 201, which, once loaded and executed by the processor 21, implements the relevant steps of the file management method disclosed in any of the foregoing embodiments. The resources stored in the memory 20 may also include an operating system 202, data 203, and the like, stored either transiently or permanently. The operating system 202 may be Windows, Unix, Linux, or the like.
In some embodiments, the file management device may further include a display 22, an input/output interface 23, a communication interface 24, a power supply 25, and a communication bus 26.
Those skilled in the art will appreciate that the configuration shown in fig. 4 does not limit the file management apparatus, which may include more or fewer components than shown.
The file management apparatus provided by this embodiment of the application comprises a memory and a processor; when the processor executes the program stored in the memory, the following method can be implemented: building a label for a file directory of the file system, the file directory label containing at least one parameter of the file directory, such as its size, the number of folders it contains, and the number of files. When an AI platform reads and transfers files through a computing node, especially when it operates on large files, the node can obtain the file directory label directly instead of gathering file sizes and counts over the network. This preserves the I/O efficiency of file access over the network and speeds up traversing, reading, and writing the file directory. In turn, the faster reads and writes reduce the I/O resources that file output consumes on the AI platform, improve model-training efficiency, and improve the experience of the platform's algorithm engineers.
Finally, the present application also provides a corresponding embodiment of a computer-readable storage medium. The computer-readable storage medium stores a computer program which, when executed by a processor, implements the steps set forth in the method embodiments above.
It is to be understood that, if the method of the above embodiments is implemented in the form of software functional units and sold or used as a stand-alone product, it can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in whole or in the part that contributes to the prior art, may be embodied as a software product stored in a storage medium that executes all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes any medium capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The file management method, apparatus, and medium provided by the present application have been described in detail above. The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the others, and for the parts they share, the embodiments may be referred to one another. Since the apparatus disclosed in the embodiments corresponds to the method disclosed therein, its description is brief; for relevant details, refer to the description of the method. It should be noted that those skilled in the art can make several improvements and modifications to the present application without departing from its principle, and such improvements and modifications also fall within the scope of the claims of the present application.
It is further noted that, in the present specification, relational terms such as first and second are used solely to distinguish one entity or action from another, and do not necessarily require or imply any actual such relationship or order between those entities or actions. Moreover, the terms "comprises," "comprising," and any variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus comprising a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to it. Without further limitation, an element introduced by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises it.

Claims (10)

1. A file management method, comprising:
acquiring a file directory of a file system;
constructing a file directory label for the file directory;
when an acquisition request of a computing node is acquired, the file directory label is sent to the computing node;
wherein the file directory label comprises at least one parameter of the file directory.
2. The file management method according to claim 1, further comprising, after said obtaining the file directory of the file system:
and code arrangement is carried out on the file directories through a Hash algorithm so as to construct a file directory ordered queue.
3. The file management method according to claim 2, further comprising, after said building a file directory tag for said file directory:
monitoring the file directory;
and updating the file directory label when the file directory changes.
4. The file management method according to claim 3, wherein the updating the file directory label when the file directory changes specifically comprises:
locking the file directory label;
modifying the file directory label according to the change condition of the file directory;
and releasing the file directory label.
5. The file management method according to claim 3, wherein the monitoring of the file directory specifically is:
and monitoring the file directory through Inotify.
6. The file management method according to claim 5, further comprising, after said monitoring of said file directory by Inotify:
and acquiring a change list of the file directory sent by Inotify.
7. The file management method according to claim 1, wherein the acquisition request is specifically sent when the computing node performs a file transfer.
8. A file management apparatus, characterized by comprising:
the acquisition module is used for acquiring a file directory of the file system;
the construction module is used for constructing a file directory label for the file directory;
the sending module is used for sending the file directory label to the computing node when an acquisition request of the computing node is acquired;
wherein the file directory label comprises at least one parameter of the file directory.
9. A file management apparatus comprising a memory for storing a computer program;
a processor for implementing the steps of the file management method according to any one of claims 1 to 7 when executing said computer program.
10. A computer-readable storage medium, characterized in that a computer program is stored thereon, which computer program, when being executed by a processor, carries out the steps of the file management method according to any one of claims 1 to 7.
CN202110290894.2A 2021-03-18 2021-03-18 File management method, device and medium Pending CN113010479A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110290894.2A CN113010479A (en) 2021-03-18 2021-03-18 File management method, device and medium


Publications (1)

Publication Number Publication Date
CN113010479A true CN113010479A (en) 2021-06-22

Family

ID=76409694

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110290894.2A Pending CN113010479A (en) 2021-03-18 2021-03-18 File management method, device and medium

Country Status (1)

Country Link
CN (1) CN113010479A (en)


Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105718496A (en) * 2014-12-18 2016-06-29 群晖科技股份有限公司 cross-platform file attribute synchronization method and computer readable storage medium
EP3352095A1 (en) * 2017-01-24 2018-07-25 Fujitsu Limited Dynamic hierarchical data system
CN108717516A (en) * 2018-05-18 2018-10-30 云易天成(北京)安全科技开发有限公司 File label method, terminal and medium
CN108776680A (en) * 2018-05-31 2018-11-09 郑州云海信息技术有限公司 A kind of KVM and its one key hanging method of file, device, equipment, medium
CN109240982A (en) * 2017-07-04 2019-01-18 上海万根网络技术有限公司 Document distribution method and system and storage medium
CN110109866A (en) * 2017-12-28 2019-08-09 中移(杭州)信息技术有限公司 A kind of management method and equipment of file system directories


Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113835856A (en) * 2021-09-17 2021-12-24 苏州浪潮智能科技有限公司 Storage statistical method, device and equipment for AI platform
CN113835856B (en) * 2021-09-17 2023-07-14 苏州浪潮智能科技有限公司 Storage statistics method, device and equipment of AI platform
CN114003576A (en) * 2021-10-25 2022-02-01 苏州浪潮智能科技有限公司 Method and device for calculating file traversal progress, computer equipment and storage medium
CN114003576B (en) * 2021-10-25 2024-01-12 苏州浪潮智能科技有限公司 Method and device for calculating file traversal progress, computer equipment and storage medium
CN114422600A (en) * 2021-12-31 2022-04-29 成都鲁易科技有限公司 File scheduling system based on cloud storage and file scheduling method based on cloud storage
CN114422600B (en) * 2021-12-31 2023-11-07 成都鲁易科技有限公司 File scheduling system based on cloud storage and file scheduling method based on cloud storage
CN116089364A (en) * 2023-04-11 2023-05-09 山东英信计算机技术有限公司 Storage file management method and device, AI platform and storage medium

Similar Documents

Publication Publication Date Title
US11275763B2 (en) Storage constrained synchronization of shared content items
CN113010479A (en) File management method, device and medium
US11740818B2 (en) Dynamic data compression
Beaver et al. Finding a needle in haystack: Facebook's photo storage
US10360235B2 (en) Storage constrained synchronization engine
US11061622B2 (en) Tiering data strategy for a distributed storage system
JP5387757B2 (en) Parallel data processing system, parallel data processing method and program
WO2019085769A1 (en) Tiered data storage and tiered query method and apparatus
GB2439578A (en) Virtual file system with links between data streams
US11645236B2 (en) Extending retention lock protection from on-premises to the cloud
US20140081901A1 (en) Sharing modeling data between plug-in applications
US11223528B2 (en) Management of cloud-based shared content using predictive cost modeling
US10248705B2 (en) Storage constrained synchronization of shared content items
CA2977696C (en) Storage constrained synchronization of shared content items
CN103501319A (en) Low-delay distributed storage system for small files
US11960442B2 (en) Storing a point in time coherently for a distributed storage system
Merceedi et al. A comprehensive survey for hadoop distributed file system
WO2017187311A1 (en) Storage constrained synchronization engine
US10719532B2 (en) Storage constrained synchronization engine
CN110362590A (en) Data managing method, device, system, electronic equipment and computer-readable medium
CN109983452A (en) System and method for continuously available Network File System (NFS) status data
CN104281486B (en) A kind of virtual machine treating method and apparatus
CN114925078A (en) Data updating method, system, electronic device and storage medium
US20240193141A1 (en) Parameter-Based Versioning For Log-Based Block Devices
Kakoulli et al. OctopusFS in action: Tiered storage management for data intensive computing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210622