CN114528258A - Asynchronous file processing method, device, server, medium, product and system - Google Patents

Info

Publication number
CN114528258A
Authority
CN
China
Prior art keywords
file
target
storage
determining
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210151881.1A
Other languages
Chinese (zh)
Other versions
CN114528258B (en
Inventor
曾志刚
孙振元
黎世勇
王雁鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202210151881.1A priority Critical patent/CN114528258B/en
Publication of CN114528258A publication Critical patent/CN114528258A/en
Application granted granted Critical
Publication of CN114528258B publication Critical patent/CN114528258B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/16File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G06F16/162Delete operations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/14Details of searching files based on file metadata
    • G06F16/148File search processing
    • G06F16/152File search processing using file content signatures, e.g. hash values
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/18File system types
    • G06F16/182Distributed file systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure provides an asynchronous file processing method, device, server, medium, product and system, and relates to the technical field of computers, in particular to the field of big data. The specific implementation scheme is as follows: receiving metadata of a target file to be deleted, which is sent by a metadata server; querying the file version number of the target file in the metadata; determining a first storage location of target data of the target file in a distributed file system according to the file version number; and deleting the target data from the distributed file system according to the first storage location. The technical scheme of the disclosure improves the accuracy of asynchronous file deletion.

Description

Asynchronous file processing method, device, server, medium, product and system
Technical Field
The present disclosure relates to the field of big data in the field of computer technologies, and in particular, to a method, an apparatus, a server, a medium, a product, and a system for asynchronously processing a file.
Background
Distributed file systems are used in scenarios such as cloud storage and remote file storage. In general, a distributed file system is made up of multiple storage nodes. A file that a user stores in the distributed file system typically includes metadata and target data. The metadata is typically stored in a metadata server, and the target data is typically stored in a plurality of storage nodes of the distributed file system. Thus, the metadata and the target data of a file are stored asynchronously. When a file is deleted, both the metadata and the data blocks stored in at least one storage node need to be deleted. In general, the file is confirmed as successfully deleted once the metadata is deleted, even if the data blocks stored in the at least one storage node have not yet been successfully deleted. If a file with the same name is created again in this situation, a deletion error may occur in which the data blocks of the newly created file with the same name are deleted.
Disclosure of Invention
The disclosure provides a file asynchronous processing method, device, server, medium, product and system.
According to a first aspect of the present disclosure, there is provided a file asynchronous processing method, including:
receiving metadata of a target file to be deleted, which is sent by a metadata server;
inquiring the file version number of the target file in the metadata;
determining a first storage position of target data of the target file in a distributed file system according to the file version number;
and deleting the target data from the distributed file system according to the first storage position.
According to a second aspect of the present disclosure, there is provided a file asynchronous processing method, including:
receiving a deletion request initiated by user equipment for a target file;
acquiring metadata of the target file based on the deletion request;
sending the metadata to a recovery server, wherein the recovery server determines the file version number of the target file according to the metadata and deletes the target data corresponding to the target file according to the file version number;
and if the target file is determined to meet the file deletion condition, deleting the metadata.
According to a third aspect of the present disclosure, there is provided a file asynchronous processing apparatus, including:
the data receiving unit is used for receiving the metadata of the target file to be deleted, which is sent by the metadata server;
the first query unit is used for querying the file version number of the target file in the metadata;
the first determining unit is used for determining a first storage position of target data of the target file in the distributed file system according to the file version number;
and the first deleting unit is used for deleting the target data from the distributed file system according to the first storage position.
According to a fourth aspect of the present disclosure, there is provided a file asynchronous processing apparatus, including:
a request receiving unit, configured to receive a deletion request initiated by a user equipment for a target file;
a data acquisition unit configured to acquire metadata of the target file based on the deletion request;
the data sending unit is used for sending the metadata to a recovery server, and the recovery server determines the file version number of the target file according to the metadata and deletes the target data corresponding to the target file according to the file version number;
and the third deleting unit is used for deleting the metadata if the target file is determined to meet the file deleting condition.
According to a fifth aspect of the present disclosure, there is provided a recycle server comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of the first aspect.
According to a sixth aspect of the present disclosure, there is provided a metadata server including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of the second aspect.
According to a seventh aspect of the present disclosure, there is provided a file asynchronous processing system, comprising: a user device, a reclamation server as described in the fifth aspect, a metadata server as described in the sixth aspect, and a distributed file system;
the metadata server is respectively in communication connection with the user equipment and the recovery server; the recovery server is connected with the distributed file system.
According to an eighth aspect of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of the first or second aspects.
According to a ninth aspect of the present disclosure, there is provided a computer program product comprising: a computer program, stored in a readable storage medium, from which at least one processor of an electronic device can read the computer program, the at least one processor executing the computer program causing a reclamation server to perform the method of the first aspect or a metadata server to perform the method of the second aspect.
The technology of the present disclosure addresses the problem of inaccurate asynchronous deletion of files with the same name. A file version number is set when the target file is created, and the version number identifies which version of a same-name file is meant. When the file is deleted, the storage location of the target file's data in the distributed file system can be determined from the file version number, so that the data is deleted accurately and mistaken deletion is avoided.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a schematic diagram of an application of a file asynchronous processing system according to an embodiment of the present disclosure;
FIG. 2 is a flow diagram for one embodiment of a method for asynchronous processing of files, provided in accordance with an embodiment of the present disclosure;
FIG. 3 is a flow diagram of yet another embodiment of a method for asynchronous processing of files provided in accordance with an embodiment of the present disclosure;
FIG. 4 is a flowchart of yet another embodiment of a method for asynchronous processing of files provided in accordance with an embodiment of the present disclosure;
FIG. 5 is a schematic diagram of a data structure of a version number provided according to an embodiment of the present disclosure;
FIG. 6 is a schematic structural diagram illustrating an embodiment of a file asynchronous processing device according to an embodiment of the present disclosure;
FIG. 7 is a schematic structural diagram illustrating a further embodiment of a file asynchronous processing device according to an embodiment of the present disclosure;
FIG. 8 is a block diagram of a recycle server used to implement the file asynchronous processing method of an embodiment of the present disclosure;
FIG. 9 is a block diagram of a metadata server used to implement the file asynchronous processing method of an embodiment of the present disclosure;
FIG. 10 is a system architecture diagram of a file asynchronous processing system provided in accordance with an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
The technical solution of the present disclosure can be applied to file deletion scenarios with asynchronous storage. A version number is set in the metadata of a file, the storage location is determined according to that version number, and the data in the file system is then deleted accurately through the storage location, which improves deletion efficiency and accuracy.
In some embodiments, a distributed file system may include multiple storage nodes. When a file is stored in the distributed file system, the file may be divided into metadata and target data. The target data is the actual data of the file, and the metadata is data recording attribute information of the file. When a file is deleted, the metadata and the at least one data block in which the target data is actually stored both need to be deleted. In general, the file is confirmed as successfully deleted once the metadata is deleted, even if the data blocks stored in the at least one storage node have not yet been successfully deleted. If a file with the same name is created again in this situation, a deletion error may occur in which the data blocks of the newly created file with the same name are deleted.
In order to solve the above technical problem, in the embodiment of the present disclosure, a file version number is set when the target file is created. The same-name files are identified through the version numbers, and then when the files are deleted, the storage positions of the target files in the distributed file system can be determined through the file version numbers, so that accurate deletion of data is realized, the deletion accuracy is improved, and mistaken deletion is avoided.
The disclosure provides a file asynchronous processing method, device, equipment, medium and product, which are applied to the field of big data in the technical field of computers so as to accurately delete files and avoid the phenomenon of mistaken deletion.
The technical solution of the present disclosure will be described in detail with reference to the accompanying drawings.
FIG. 1 is an application diagram of a file asynchronous processing system provided in an embodiment of the present disclosure. The system may include a user equipment 1, a metadata server 2, a reclamation server 3 (also called the recovery server or recycle server in this document), and a distributed file system 4. The metadata server 2 establishes communication connections with the user equipment 1 and the reclamation server 3 respectively, and the reclamation server 3 establishes a communication connection with the distributed file system 4. The distributed file system 4 may include at least one storage node 41.
The user equipment 1 may initiate a deletion request for the target file to the metadata server 2. The metadata server 2 may receive the deletion request sent by the user equipment 1 and, after acquiring the metadata of the target file, may send the metadata to the reclamation server 3. Further, the metadata server 2 may delete the metadata after transmitting it to the reclamation server 3.
The reclamation server 3 may receive the metadata transmitted by the metadata server 2 and query the file version number of the target file in the metadata. It may then determine a first storage location of the target data of the target file in the distributed file system 4 according to the file version number, and delete the corresponding target data from the distributed file system 4 according to the first storage location. The file version number may be written into the metadata when the target file is generated, so that when the metadata is stored to disk, the file version number is stored to disk as well. The file version number identifies the file precisely, so the file can be deleted accurately.
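The metadata server's side of this interaction can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the `MetadataServer` class, the dictionary layout of a metadata record, the per-path counter used to assign version numbers, and the in-memory queue standing in for the link to the reclamation server are all hypothetical.

```python
from collections import deque


class MetadataServer:
    """Toy metadata server; class and field names are illustrative assumptions."""

    def __init__(self, reclaim_queue):
        self.records = {}        # absolute path -> metadata record
        self.next_version = {}   # absolute path -> next version number to assign
        self.reclaim_queue = reclaim_queue

    def create(self, path):
        # The file version number is written into the metadata at creation time.
        version = self.next_version.get(path, 1)
        self.next_version[path] = version + 1
        self.records[path] = {"path": path, "version": version}
        return version

    def delete(self, path):
        # Hand the metadata to the reclamation server, then drop the record.
        metadata = self.records.pop(path)
        self.reclaim_queue.append(metadata)


queue = deque()                  # stands in for the link to the reclamation server
server = MetadataServer(queue)
server.create("/a/b/file")       # assigned version 1
server.delete("/a/b/file")       # metadata for version 1 is queued for reclamation
server.create("/a/b/file")       # same name, assigned version 2
```

After these calls the queue holds only the version 1 record, while the live metadata for the same path carries version 2, so the two files remain distinguishable even though the target data of version 1 has not been reclaimed yet.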
Fig. 2 is a flowchart of an embodiment of a file asynchronous processing method provided by an embodiment of the present disclosure. The method may be performed by a file asynchronous processing apparatus, which may be located in a reclamation server. The file asynchronous processing method may include the following steps:
201: Receive the metadata of the target file to be deleted, which is sent by the metadata server.
The target file may include metadata and target data. The metadata may be data that records attribute information of the target file. The target data may be the actual file content of the target file. The target file may be sent by the user equipment to the distributed file system for storage. The metadata may be stored at the metadata server. The target data may be stored in at least one storage node of the distributed file system.
The metadata itself may be deleted by the metadata server. The metadata server can receive a deletion request for the target file sent by the user equipment, and determine the metadata of the target file based on the deletion request.
202: Query the file version number of the target file in the metadata.
The file version number of the target file may be set when the target file is generated. The file version number may be used to distinguish different versions of a file with the same name. For example, a user may create a file named B and then delete it; the version number of this file B may be 1. Since file B no longer exists from the user's perspective, a file named B can be created again. The version number of the re-created file B may be 2, which distinguishes it from the original file B with version number 1.
203: Determine a first storage location of the target data of the target file in the distributed file system according to the file version number.
The first storage location may be the location of the target data of the target file in the distributed file system.
204: Delete the target data from the distributed file system according to the first storage location.
The target data may be looked up in the distributed file system based on the first storage location and then deleted.
In the embodiment of the present disclosure, the reclamation server may receive the metadata of a target file to be deleted, which is sent by the metadata server, and query the file version number of the target file in the metadata. According to the file version number, a first storage location of the target data of the target file in the distributed file system can be determined, so that the target data can be deleted from the distributed file system according to the first storage location. Because the file version number distinguishes versions of a file, the target data of different files with the same name is not deleted by mistake, and file deletion accuracy is improved.
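Steps 201 to 204 can be sketched as below. The sketch assumes a hypothetical in-memory `ToyFileSystem` in place of the distributed file system and a simplified location scheme in which a storage location is the pair (path, version); neither is taken from the disclosure, and the real location mapping is more involved.

```python
class ToyFileSystem:
    """In-memory stand-in for the distributed file system (an assumption)."""

    def __init__(self):
        self.data = {}   # storage location -> bytes

    def delete(self, location):
        # Returns True if something was stored at the location and got removed.
        return self.data.pop(location, None) is not None


def reclaim(metadata, fs):
    """Process one metadata record received from the metadata server (step 201)."""
    version = metadata["version"]           # 202: query the file version number
    location = (metadata["path"], version)  # 203: assumed location scheme
    return fs.delete(location)              # 204: delete the target data


fs = ToyFileSystem()
fs.data[("/a/b/file", 1)] = b"stale data of the deleted file"
fs.data[("/a/b/file", 2)] = b"data of the re-created file"

deleted = reclaim({"path": "/a/b/file", "version": 1}, fs)
```

Only the entry for version 1 is removed; the data of the re-created file with the same name stays in place because its location carries version 2.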
Fig. 3 is a flowchart of another embodiment of a file asynchronous processing method provided by an embodiment of the present disclosure. The method may be performed by a file asynchronous processing apparatus, which may be located in a reclamation server. The file asynchronous processing method may include the following steps:
301: Receive the metadata of the target file to be deleted, which is sent by the metadata server.
It should be noted that, some steps in the embodiments of the present disclosure are the same as some steps in the foregoing embodiments, and are not repeated herein for the sake of brevity of description.
302: Query the file version number of the target file in the metadata.
303: Determine an absolute path of the target file.
Optionally, determining the absolute path of the target file may include determining the absolute path based on the metadata; the absolute path of the target file may be saved in the metadata. Further, determining the absolute path of the target file may include receiving an absolute path of the target file sent by the metadata server, so that when the metadata is received, the absolute path of the target file corresponding to the metadata can be determined.
304: Determine the storage path of the target data in the target file according to the absolute path and the file version number.
The storage path of the target data may be determined from the absolute path and the file version number. For example, the target file corresponding to the file version number may be queried under the absolute path to obtain the storage path corresponding to the target file.
305: Determine a first storage location of the target data in the distributed file system according to the storage path.
306: Delete the target data from the distributed file system according to the first storage location.
In the embodiment of the present disclosure, when the first storage location of the target data is determined, the storage path of the target data in the target file is first determined according to the absolute path and the file version number of the target file, and the storage path is then used to locate the first storage location in the distributed file system accurately, so that the target data can be deleted from the distributed file system based on the first storage location. Using the absolute path together with the file version number associates the storage path with the file version number, so the storage path of the target data can be located accurately and the target data deleted accurately.
As an embodiment, determining a storage path of the target data in the target file according to the absolute path and the file version number may include:
determining a storage folder corresponding to the file version number according to the absolute path; the storage folder takes the file version number as the file name.
Based on the file path corresponding to the storage folder, the storage path of the target data in the target file can be determined.
For ease of understanding, when the storage folder corresponding to the file version number is determined according to the absolute path, the file name in the absolute path may be updated to a file name that includes the file version number. For example, if the absolute path of the target file is /a/b/file and the file version number is version1, the storage folder corresponding to the original file name file can be determined under that absolute path, and the folder name is updated to file_version1. The file path corresponding to the updated storage folder is then /a/b/file_version1. The storage path of the target data in the target file can be determined through this file path.
In the embodiment of the present disclosure, a storage folder corresponding to the file version number is established under the absolute path, so that once the absolute path is obtained, the storage folder containing the file version number can be looked up under it. The storage folder thus carries the file version number in its name. The storage path of the target data in the target file can then be determined based on the file path corresponding to the storage folder, which gives accurate acquisition of the storage path of the target data.
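The path derivation in the example above can be sketched with the Python standard library. The function name `storage_folder_path` is hypothetical, and treating the version number as a plain string joined with an underscore is an assumption drawn only from the /a/b/file_version1 example.

```python
import posixpath


def storage_folder_path(absolute_path, file_version):
    """Return the path of the versioned storage folder for a file.

    The "<name>_<version>" naming follows the example in the text; the exact
    naming rule used in practice is an assumption of this sketch.
    """
    parent, name = posixpath.split(absolute_path)
    return posixpath.join(parent, f"{name}_{file_version}")


folder = storage_folder_path("/a/b/file", "version1")
# folder is "/a/b/file_version1"
```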
In one possible design, determining a storage path of target data in a target file based on a file path corresponding to a storage folder may include:
dividing target data in a target file into at least one data block, and determining block identifiers corresponding to the at least one data block respectively;
according to the storage path of the storage folder, establishing a block storage path corresponding to the block identifier of each data block, to obtain the block storage path corresponding to each of the at least one data block;
and determining a block storage path corresponding to each of the at least one data block as a storage path of the target data.
Optionally, dividing the target data in the target file into at least one data block may include: determining the file length of a target file and the data volume of each data block; and dividing the target data in the target file into at least one data block according to the file length of the target file and the data volume of each data block. The data blocks are divided according to the file length and the data volume of each data block, so that the balanced division of the data blocks can be realized, and the accurate division is realized.
Alternatively, dividing the target data in the target file into at least one data block may include: randomly dividing the target data in the target file into at least one data block.
The order in which each of the at least one data block is obtained may be determined, and the block identifier of each data block may be determined according to that order.
In the embodiment of the present disclosure, the target data may be divided into at least one data block, and the block identifier corresponding to each data block is determined. A block storage path corresponding to the block identifier of each data block is established under the storage path of the storage folder, giving the block storage path of each of the at least one data block. These block storage paths together may serve as the storage path of the target data, so that the storage path of the target data is set accurately, and the efficiency and accuracy of obtaining the storage path are improved.
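A minimal sketch of the block division and block storage paths follows. It assumes fixed-size blocks, 1-based sequence numbers as block identifiers, and a `<folder>/<block id>` path layout; the function names and the block size are hypothetical and chosen only for illustration.

```python
def split_into_blocks(target_data, block_size):
    """Split bytes into fixed-size blocks; the last block may be shorter."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [target_data[offset:offset + block_size]
            for offset in range(0, len(target_data), block_size)]


def block_storage_paths(folder_path, block_count):
    """Build one block storage path per block, using 1-based block identifiers."""
    return [f"{folder_path}/{block_id}" for block_id in range(1, block_count + 1)]


blocks = split_into_blocks(b"x" * 10, block_size=3)   # block lengths: 3, 3, 3, 1
paths = block_storage_paths("/a/b/file_version1", len(blocks))
```

Because the block identifiers follow the order of the file offsets, the list of block storage paths is fully determined by the storage folder path and the number of blocks, so the reclamation side can rebuild it without reading the data.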
In one possible design, dividing target data in a target file into at least one data block and determining block identifiers corresponding to the at least one data block respectively may include:
and dividing the target data into at least one data block according to the data volume of the target data in the target file.
And determining block identifications respectively corresponding to the at least one data block according to the position of the at least one data block in the target file.
The location of the data block in the target file may refer to the file offset of the data block in the target file. The file positions corresponding to the at least one data block may be determined, the sequence corresponding to the at least one data block may be determined according to the file positions corresponding to the at least one data block, and the block identifiers corresponding to the at least one data block may be determined according to the sequence corresponding to the at least one data block.
For convenience of understanding, taking the example of dividing the target data into 4 data blocks, the sequence of the 4 data blocks may be determined to be 1, 2, 3, and 4, and the sequence number of the data blocks, 1, 2, 3, and 4, may be directly used as the block identifier of the corresponding data block.
In the embodiment of the disclosure, the target data is divided into at least one data block according to the data volume of the target data, and the block storage of the target data can be realized through the division of the data block, so that the data loss phenomenon caused by the influence of a network and the like in the data transmission or storage process due to the overlarge data volume is avoided. According to the position of the at least one data block in the target file, block identifications corresponding to the at least one data block respectively can be determined. The accurate setting of the block identification of each data block is realized, and the setting efficiency and accuracy of the data blocks are improved.
In some embodiments, determining a first storage location of the target data in the distributed file system based on the storage path may include:
acquiring a block storage path corresponding to at least one data block in the storage path;
determining a storage key word of the data block according to a block storage path corresponding to the data block;
determining a first node identification corresponding to a target storage node of a data block in a distributed file system based on a Hash calculation model;
and determining the block position of the data block according to the system identifier of the distributed file system, the first node identifier corresponding to the data block and the storage key word of the data block so as to obtain a first storage position formed by the block positions corresponding to at least one data block.
The block storage path corresponding to each of the at least one data block may be determined by combining the storage path of the target data of the target file with the block identifier of the data block. For example, assuming that the storage path of the target data is /a/b/file_version, each block identifier is appended to the storage path to form a new path. For block identifiers 1, 2, 3, and 4, the block storage paths corresponding to the four data blocks are:
/a/b/file_version/1, /a/b/file_version/2, /a/b/file_version/3, /a/b/file_version/4.
To represent the first storage location accurately, the following location mapping formula may be used:
position = relymap(fsid, relyid, key)
where relymap is the location mapping function, fsid is the system identifier of the distributed file system, relyid is the first node identifier of the data block, and key is the storage key.
In the embodiment of the disclosure, the block storage path corresponding to each of the at least one data block in the storage path is determined. For any data block, a storage key of the data block is determined according to its block storage path, and the first node identifier corresponding to the target storage node of the data block in the distributed file system is determined based on a hash calculation model, which locates the node of the data block in the distributed file system and yields an accurate first node identifier. The block position of the data block can then be determined from the system identifier of the file system, the first node identifier corresponding to the data block, and the storage key of the data block, so that the data block is located accurately. The first storage location is formed by the block positions of the at least one data block, which improves the efficiency and accuracy of locating the data.
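The position mapping can be sketched as follows. The disclosure does not say what a position looks like, so representing it as a named tuple of (fsid, relyid, key) is an assumption, as is using the block storage path itself as the storage key. The node identifier is supplied by a caller-provided function `node_for_key`, which is hypothetical; a constant placeholder is used here.

```python
from typing import NamedTuple


class BlockPosition(NamedTuple):
    fsid: str     # system identifier of the distributed file system
    relyid: int   # first node identifier of the target storage node
    key: str      # storage key of the data block


def relymap(fsid, relyid, key):
    """Combine the three inputs into one block position (structure is assumed)."""
    return BlockPosition(fsid, relyid, key)


def first_storage_location(fsid, block_paths, node_for_key):
    """Map every block storage path to a block position.

    The storage key is assumed to be the block storage path itself.
    """
    return [relymap(fsid, node_for_key(path), path) for path in block_paths]


paths = ["/a/b/file_version1/1", "/a/b/file_version1/2"]
# Placeholder node selection: every key maps to node 0 in this sketch.
location = first_storage_location("fs-01", paths, node_for_key=lambda key: 0)
```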
In one possible design, determining a first node identifier corresponding to a target storage node of a data block in a distributed file system based on a hash calculation model includes:
performing hash calculation on storage keywords of the data block to obtain a first hash value corresponding to the storage keywords;
determining the number of nodes of at least one storage node in the distributed file system;
performing complementation calculation on the number of the nodes by using the first hash value to obtain a first node remainder;
and determining a first node identifier corresponding to a target storage node of the data block in the at least one storage node according to the first node remainder.
The calculation formula for hash calculation of the storage key may be as follows:
relyid=hash(key)%relynr
wherein key is the storage keyword, relynr is the number of nodes, relyid is the node identifier, hash denotes a hash function, and % is the remainder operator.
In the embodiment of the disclosure, hash calculation may be performed on a storage keyword of a data block, a first hash value corresponding to the storage keyword is obtained, the number of nodes of at least one storage node in a file system is determined, and the number of nodes may be subjected to remainder calculation by using the first hash value, so as to obtain a first node remainder. The first node remainder may be used to determine a target storage node of the data block in the at least one storage node, so as to obtain a first node identifier corresponding to the target storage node. Through Hash calculation and remainder calculation, the first node identification of the data block corresponding to the target storage node in the at least one storage node can be accurately positioned, and confirmation of the target storage node of the data block is achieved. The storage keywords of the data blocks are used in the confirmation process, and the storage keywords are different under the same file name and different versions, so that the data blocks of different versions can be distinguished, and the high-accuracy management of the data blocks is realized.
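The formula relyid=hash(key)%relynr can be sketched as below. The disclosure does not name a specific hash function, so SHA-256 truncated to 64 bits is used here purely as an assumption; Python's built-in `hash()` is avoided because string hashes are randomized per process by default and would not give a stable node mapping. The function name `node_id` is hypothetical.

```python
import hashlib


def node_id(key: str, relynr: int) -> int:
    """Return relyid = hash(key) % relynr for a storage keyword."""
    if relynr <= 0:
        raise ValueError("relynr (number of storage nodes) must be positive")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest[:8], "big")  # first 64 bits as an integer
    return hash_value % relynr


# The same keyword always maps to the same node for a fixed node count.
relyid = node_id("/a/b/file_version/1", 5)
```

Note that with a plain remainder, changing the number of nodes remaps most keywords to different nodes.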
In some embodiments, deleting the target data from the distributed file system based on the first storage location comprises:
determining a block position corresponding to at least one data block in the first storage position;
determining a target storage node of the data block according to the block position of the data block, and obtaining the target storage node corresponding to at least one data block;
and sending a deletion request of the data block to a target storage node of the data block, wherein the deletion request instructs the target storage node to delete the data block.
A deletion request for the corresponding data block may be initiated to each target storage node. When the target storage node receives the deletion request of the data block, it can delete the corresponding data block in response to the request. The deletion request may include the block storage address of the data block. The target storage node searches for the corresponding data block according to the block storage address and deletes the data block that is found.
In the embodiment of the disclosure, when target data is deleted from the distributed file system according to the first storage location, a block location corresponding to at least one data block in the first storage location may be determined. Based on the block location of the data block, a target storage node of the data block may be determined, and a deletion request of the data block is sent to that target storage node to instruct it to delete the data block. Through the block position of the data block, the target storage node where the data block is located can be accurately confirmed, the data block on that node can be accurately deleted, and mistaken deletion is avoided.
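The deletion steps can be illustrated with an in-memory sketch. It assumes that each block position reduces to a (relyid, key) pair and that a deletion request is a direct method call; a real system would send the request over the network. `StorageNode` and `delete_target_data` are hypothetical names.

```python
class StorageNode:
    """Hypothetical in-memory stand-in for one storage node."""

    def __init__(self) -> None:
        self.blocks: dict[str, bytes] = {}

    def handle_delete(self, key: str) -> bool:
        """Delete the block stored under key; return True if it existed."""
        return self.blocks.pop(key, None) is not None


def delete_target_data(
    block_positions: list[tuple[int, str]],
    nodes: list[StorageNode],
) -> dict[str, bool]:
    """Send one deletion request per data block to its target storage node."""
    results = {}
    for relyid, key in block_positions:
        target_node = nodes[relyid]  # locate the node from the block position
        results[key] = target_node.handle_delete(key)
    return results
```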
In one possible design, after sending a deletion request of the data block to a target storage node of the data block, the method may further include:
and receiving a feedback response sent by the target storage node corresponding to the data block.
And if the target storage nodes respectively corresponding to at least one data block are determined to all send feedback responses, determining that the target file is deleted, and deleting the received metadata of the target file.
After the target storage node successfully deletes the data block, a feedback response may be sent to the reclamation server. The reclamation server may receive feedback responses respectively sent by the at least one target storage node, and these feedback responses may arrive at different times. If the reclamation server determines that all of the at least one target storage node have sent feedback responses, it can be determined that the at least one data block is deleted and the deletion of the target file is finished. The reclamation server may also delete the metadata sent by the metadata server.
In the embodiment of the present disclosure, feedback responses sent by the target storage nodes corresponding to the data blocks may be received, and if it is determined that the target storage nodes corresponding to the at least one data block have all sent feedback responses, it may be determined that deletion of the target file is completed. By detecting the feedback responses, the deletion status of the at least one data block can be accurately judged, and the efficiency and accuracy of determining when the metadata of the target file can be deleted are improved.
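One way to track the feedback responses is to keep a set of blocks still awaiting a response and delete the received metadata once the set is empty. The sketch below assumes the received metadata is held in a dictionary keyed by the absolute path of the target file; `DeletionTracker` and its members are hypothetical names.

```python
class DeletionTracker:
    """Tracks feedback responses for the data blocks of one target file."""

    def __init__(self, file_path: str, block_keys: list[str], metadata_db: dict) -> None:
        self.file_path = file_path
        self.pending = set(block_keys)  # blocks with no feedback response yet
        self.metadata_db = metadata_db  # received metadata, keyed by absolute path

    def on_feedback(self, block_key: str) -> bool:
        """Record one feedback response; return True once deletion has finished."""
        self.pending.discard(block_key)
        if self.pending:
            return False  # still waiting for at least one storage node
        # All target storage nodes have responded: delete the received metadata.
        self.metadata_db.pop(self.file_path, None)
        return True
```

Because responses may arrive in any order, the set makes the completion check independent of arrival order.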
As another embodiment, after receiving the metadata of the target file to be deleted sent by the metadata server, the method may further include:
detecting the device processing pressure of the recovery server;
and if the device processing pressure is determined not to reach the maximum processing pressure, determining that the device deletion condition is met.
At this time, deleting the target data from the distributed file system according to the first storage location may include:
and if the device deleting condition is met, deleting the target data from the distributed file system according to the first storage position.
Optionally, detecting the device processing pressure of the recovery server may include: detecting pressure data corresponding to the pressure parameters of the recovery server. If the pressure data does not reach the maximum pressure data, it is determined that the device deletion condition is met. The pressure parameters may include the number of read-write tasks (run_io_job), in which case the maximum pressure data includes the maximum number of read-write tasks (max_jobs). The pressure parameters may also include the memory usage (used_memory), in which case the maximum pressure data includes the maximum memory usage. The pressure parameters may also include the number of deletion tasks, in which case the maximum pressure data includes the maximum number of deletion tasks. The pressure parameter may be one or more of the number of read-write tasks, the memory usage, and the number of deletion tasks. When there is one pressure parameter, that parameter must be smaller than its maximum pressure data; when there are multiple pressure parameters, each must be smaller than its corresponding maximum pressure data.
In the embodiment of the present disclosure, before the target data is deleted, the device processing pressure of the recovery server may be detected, and if it is determined that the current processing pressure does not reach the maximum processing pressure, it may be determined that the device deletion condition is satisfied, and the target data may be deleted. The deletion process is effectively controlled by detecting the processing pressure of the equipment, deletion failure caused by overlarge processing pressure of the equipment is avoided, and the data deletion management efficiency is improved.
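The pressure check can be sketched as a comparison of each pressure parameter against its maximum. The names `run_io_job`, `max_jobs`, and `used_memory` come from the text above; `delete_jobs`, `max_memory`, `max_delete_jobs`, and the two dataclasses are hypothetical names chosen for illustration, and the sketch assumes all three parameters are monitored.

```python
from dataclasses import dataclass


@dataclass
class PressureData:
    run_io_job: int    # current number of read-write tasks
    used_memory: int   # current memory usage, in bytes
    delete_jobs: int   # current number of deletion tasks (hypothetical name)


@dataclass
class MaxPressureData:
    max_jobs: int         # maximum number of read-write tasks
    max_memory: int       # maximum memory usage (hypothetical name)
    max_delete_jobs: int  # maximum number of deletion tasks (hypothetical name)


def device_delete_condition_met(current: PressureData, limit: MaxPressureData) -> bool:
    """Every monitored parameter must be strictly below its maximum."""
    return (
        current.run_io_job < limit.max_jobs
        and current.used_memory < limit.max_memory
        and current.delete_jobs < limit.max_delete_jobs
    )
```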
As shown in fig. 4, which is a flowchart of another embodiment of a file asynchronous processing method provided by the embodiment of the present disclosure, the method may be configured in a file asynchronous processing device, and the file asynchronous processing device may be located in a metadata server. The file asynchronous processing method can comprise the following steps:
401: Receiving a deletion request initiated by the user equipment for the target file.
The metadata server may be a database of metadata. The metadata server and the recovery server can both store data as key-value pairs. In the metadata database, the key is the absolute path of a file and the value is the metadata information of the file; such key-value pairs may be stored in the metadata server and in the database storing metadata, respectively.
The metadata server stores metadata for all files stored in the distributed file system. The recovery server stores metadata of a target file to be deleted in the distributed file system.
402: Acquiring the metadata of the target file based on the deletion request.
Metadata corresponding to the absolute path of the target file may be obtained based on the deletion request. The absolute path of the target file may be included in the delete request. The delete request may be generated based on an absolute path of the target file.
403: Sending the metadata to a recovery server, where the recovery server determines the file version number of the target file according to the metadata and deletes the target data corresponding to the target file according to the file version number.
Sending the metadata to the recovery server means inserting the metadata into a database of the recovery server, where the metadata and the absolute path of the target file are stored in key-value form.
404: If the target file is determined to meet the file deletion condition, deleting the metadata.
Optionally, the method may further include: detecting the open number of the target file, and if the open number of the target file is determined to be zero, determining that the target file meets the file deletion condition. The open number of the target file may be read from the count bits of the file version number. The count number corresponding to the count bits of the file version number may be the open number of the target file.
In the embodiment of the present disclosure, the metadata server may receive a deletion request initiated by the user equipment for the target file. Based on the deletion request, metadata of the target file may be acquired. And sending the metadata to a recovery server, wherein the recovery server can determine the file version number of the target file according to the metadata so as to delete the target data corresponding to the target file according to the file version number. By accurately managing the target file, the target file can be accurately deleted, and the deletion accuracy of the target file is improved.
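Steps 401 to 404 can be sketched with two key-value stores, each modelled here as a dictionary keyed by the absolute path of the file. This is an illustration under stated assumptions: the metadata is a dictionary with a hypothetical `open_count` field standing in for the count bits of the file version number, and `handle_delete_request` is a hypothetical name.

```python
def handle_delete_request(absolute_path: str, metadata_db: dict, recovery_db: dict) -> bool:
    """Process one deletion request; return True if the metadata was deleted."""
    # Step 402: acquire the metadata of the target file by its absolute path.
    metadata = metadata_db.get(absolute_path)
    if metadata is None:
        return False  # nothing is stored under this path

    # Step 403: send the metadata to the recovery server, modelled here as an
    # insert into the recovery server's key-value database.
    recovery_db[absolute_path] = metadata

    # Step 404: delete the metadata only if no open handles remain.
    if metadata["open_count"] == 0:
        del metadata_db[absolute_path]
        return True
    return False
```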
As an embodiment, after obtaining the metadata of the target file, the method may further include:
inquiring the file version number of the target file in the metadata; the file version number comprises a life cycle bit and a counting bit;
determining the counting number corresponding to the counting bits;
and if the counting number is determined to be zero, determining that the target file meets the file deletion condition.
The file version number may contain life cycle bits as well as count bits.
For ease of understanding, the data structure of the file version number may be as shown in fig. 5. Referring to fig. 5, the file version number is set to 128 bits (bit being short for binary digit), where the 128 bits can be divided into a front 32 bits 501 and a rear 96 bits 502. The front 32 bits can be set as life cycle bits (life-ver, life cycle version data), and the rear 96 bits can be count bits (current-ver, the open number under the current version). If the file is deleted and then newly created again, the version data corresponding to the life cycle bits is automatically incremented by 1. The count bits record the open number of the file stored in the distributed file system, and the number corresponding to the count bits is incremented by 1 each time the file is opened through a window. After the open process of the target file is closed, the count bits are cleared.
In the embodiment of the present disclosure, after querying a file version number of a target file in metadata, the file version number may include a period bit and a count bit, and by determining a count number corresponding to the count bit, when the count number is equal to 0, it is determined that the target file satisfies a file deletion condition. The file deleting condition can be confirmed through the confirmation of the counting quantity, the accuracy of the file deleting condition can be improved, and the file deleting efficiency and accuracy can be improved.
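The 128-bit layout can be sketched with integer bit operations. The sketch assumes that the "front" 32 bits are the high-order bits of a single 128-bit integer; the function names are hypothetical.

```python
LIFE_BITS = 32    # front 32 bits: life cycle version (life-ver)
COUNT_BITS = 96   # rear 96 bits: open count under the current version (current-ver)
LIFE_MASK = (1 << LIFE_BITS) - 1
COUNT_MASK = (1 << COUNT_BITS) - 1


def pack_version(life_ver: int, count: int) -> int:
    """Pack the life cycle bits and count bits into one 128-bit integer."""
    return ((life_ver & LIFE_MASK) << COUNT_BITS) | (count & COUNT_MASK)


def unpack_version(version: int) -> tuple[int, int]:
    """Split a 128-bit file version number into (life_ver, count)."""
    return (version >> COUNT_BITS) & LIFE_MASK, version & COUNT_MASK


def on_open(version: int) -> int:
    """Opening the file adds 1 to the count bits."""
    life_ver, count = unpack_version(version)
    return pack_version(life_ver, count + 1)


def on_recreate(version: int) -> int:
    """Deleting and re-creating the file adds 1 to the life cycle bits."""
    life_ver, _ = unpack_version(version)
    return pack_version(life_ver + 1, 0)


def satisfies_delete_condition(version: int) -> bool:
    """The file deletion condition holds when the count number is zero."""
    return unpack_version(version)[1] == 0
```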
In some embodiments, obtaining metadata of the target file based on the delete request includes:
determining an absolute path of a target file;
determining a second storage position corresponding to the metadata of the target file according to the absolute path of the target file;
and reading the metadata of the target file from the second storage position.
In this embodiment, the second storage location of the metadata is obtained in the same manner as the block position of the data block in the foregoing embodiment. Reference may be made to the calculation formulas of relyid and position in the foregoing embodiment; the only difference is the input data, and details are not described here again.
In the embodiment of the disclosure, an absolute path of the target file is determined, so as to determine a second storage location corresponding to the metadata of the target file according to the absolute path, and the metadata of the target file is read from the second storage location. By using the absolute path of the target file, the metadata of the target file can be accurately read.
In one possible design, determining a second storage location corresponding to metadata of the target file according to the absolute path of the target file includes:
determining a meta keyword corresponding to the metadata of the target file according to the absolute path of the target file;
determining a second node identifier corresponding to a target storage node of the metadata in the distributed file system based on a Hash calculation model;
and determining a second storage position of the metadata according to the system identifier of the distributed file system, the second node identifier corresponding to the metadata and the metadata keyword of the metadata.
In the embodiment of the present disclosure, when the second storage location of the metadata is determined, a metadata keyword corresponding to the metadata of the target file may be determined according to an absolute path of the target file, and based on a hash calculation model, a second node identifier corresponding to a target storage node of the metadata in the distributed file system is determined, so as to determine a system identifier of the distributed file system, a second node identifier corresponding to the metadata, and a metadata keyword, obtain an accurate second storage location, and improve efficiency and accuracy of obtaining the second storage location of the metadata.
In some embodiments, according to the absolute path of the target file, determining a meta key corresponding to the metadata of the target file;
determining a second node identifier corresponding to a target storage node of the metadata in the distributed file system based on a Hash calculation model;
and determining a second storage position of the metadata according to the system identifier of the distributed file system, the second node identifier corresponding to the metadata and the metadata keyword of the metadata.
In the embodiment of the present disclosure, hash calculation may be performed on a meta keyword of metadata to obtain a second hash value corresponding to the meta keyword, the number of nodes of at least one storage node in the file system is determined, and the number of nodes may be subjected to remainder calculation by using the second hash value to obtain a second node remainder. The second node remainder may be used to determine a target storage node of the metadata in the at least one storage node, so as to obtain a second node identifier corresponding to the target storage node. Through Hash calculation and remainder calculation, the second node identification corresponding to the target storage node of the metadata in the at least one storage node can be accurately positioned, and confirmation of the target storage node of the metadata is achieved. The metadata keywords are used in the confirmation process, and the metadata keywords are different under the same file name and different versions, so that the metadata of different versions can be distinguished, and the high-accuracy management of the metadata is realized.
In practical application, the metadata server, the recovery server, and the at least one storage node corresponding to the distributed file system may all be nodes in a background server, that is, the metadata server, the recovery server, and the at least one storage node corresponding to the distributed file system may be a server cluster. For example, the background server may be a cloud server, a storage server, or the like in actual application, so as to improve the processing and storage performance of the file.
As shown in fig. 6, which is a schematic structural diagram of an embodiment of a file asynchronous processing apparatus provided in this disclosure, the file asynchronous processing apparatus may be configured to execute the file asynchronous processing method, and the file asynchronous processing apparatus may be located in a recovery server. The file asynchronous processing device 600 may include the following units:
a data receiving unit 601, configured to receive metadata of a target file to be deleted, which is sent by the metadata server;
a first querying unit 602, configured to query the file version number of the target file in the metadata;
a first determining unit 603, configured to determine a first storage location of target data of the target file in the distributed file system according to the file version number;
a first deleting unit 604, configured to delete the target data from the distributed file system according to the first storage location.
As an embodiment, the first determination unit includes:
the first determining module is used for determining an absolute path of the target file;
the second determining module is used for determining a storage path of the target data in the target file according to the absolute path and the file version number;
and the position determining module is used for determining a first storage position of the target data in the distributed file system according to the storage path.
In some embodiments, the second determining module comprises:
the file determining submodule is used for determining a storage folder corresponding to the file version number according to the absolute path; the storage folder takes the file version number as a file name;
and the path determining submodule is used for determining a storage path of the target data in the target file based on the file path corresponding to the storage folder.
In one possible design, the path determination submodule is specifically configured to:
dividing target data in a target file into at least one data block, and determining block identifiers corresponding to the at least one data block respectively;
under the storage path of the storage file, establishing a block storage path corresponding to the block identifier of the data block, and obtaining block storage paths corresponding to at least one data block respectively;
and determining a block storage path corresponding to each of the at least one data block as a storage path of the target data.
Further, optionally, the path determining submodule is specifically configured to:
dividing target data into at least one data block according to the data volume of the target data in the target file;
and determining block identifications respectively corresponding to the at least one data block according to the position of the at least one data block in the target file.
As yet another embodiment, a location determination module, comprising:
the first obtaining submodule is used for obtaining block storage paths corresponding to at least one data block in the storage paths;
the first determining submodule is used for determining storage keywords of the data block according to the block storage path corresponding to the data block;
the second determining submodule is used for determining a first node identifier corresponding to a target storage node of the data block in the distributed file system based on the Hash calculation model;
and the third determining submodule is used for determining the block position of the data block according to the system identifier of the distributed file system, the first node identifier corresponding to the data block and the storage key word of the data block so as to obtain the first storage position formed by the block positions corresponding to at least one data block.
In some embodiments, the second determining submodule is specifically configured to:
performing hash calculation on storage keywords of the data block to obtain a first hash value corresponding to the storage keywords;
determining the number of nodes of at least one storage node in the distributed file system;
performing complementation calculation on the number of the nodes by using the first hash value to obtain a first node remainder;
and determining a first node identifier corresponding to a target storage node of the data block in the at least one storage node according to the first node remainder.
In one possible design, the first deleting unit is specifically configured to:
determining a block position corresponding to at least one data block in the first storage position;
determining a target storage node of the data block according to the block position of the data block, and obtaining the target storage node corresponding to at least one data block;
and sending a deletion request of the data block to a target storage node of the data block, wherein the deletion request indicates the target storage node to delete the data block.
In certain embodiments, further comprising:
the feedback receiving unit is used for receiving a feedback response sent by a target storage node corresponding to the data block;
the end judgment unit is used for determining that the target file is deleted to be ended if the target storage nodes corresponding to the at least one data block respectively send feedback responses;
and the second deleting unit is used for deleting the received metadata of the target file.
As still another embodiment, the method further includes:
a pressure detection unit for detecting an apparatus processing pressure of the recovery server;
the pressure judging unit is used for determining that the equipment deleting condition is met if the equipment processing pressure is determined not to reach the maximum processing pressure;
a first deletion unit including:
and the first deleting module is used for deleting the target data from the distributed file system according to the first storage position if the condition of deleting the equipment is determined to be met.
As shown in fig. 7, which is a schematic structural diagram of another embodiment of a file asynchronous processing device provided in an embodiment of the present disclosure, the file asynchronous processing device may be configured to execute the file asynchronous processing method, and the file asynchronous processing device may be located in a metadata server. The file asynchronous processing device can comprise the following units:
a request receiving unit 701, configured to receive a deletion request initiated by the user equipment for a target file;
a data acquisition unit 702, configured to acquire the metadata of the target file based on the deletion request;
a data transmission unit 703, configured to send the metadata to a recovery server, where the recovery server determines the file version number of the target file according to the metadata and deletes the target data corresponding to the target file according to the file version number;
a third deleting unit 704, configured to delete the metadata if the target file meets the file deletion condition.
As an embodiment, further comprising:
the second query unit is used for querying the file version number of the target file in the metadata; the file version number comprises a life cycle bit and a counting bit;
the count determining unit is used for determining the counting number corresponding to the counting bits;
and the condition determining unit is used for determining that the target file meets the file deleting condition if the counting number is determined to be zero.
In certain embodiments, a data acquisition unit, comprises:
the path determining module is used for determining an absolute path of the target file;
the position determining module is used for determining a second storage position corresponding to the metadata of the target file according to the absolute path of the target file;
and the data reading module is used for reading the metadata of the target file from the second storage position.
In one possible design, the position determination module includes:
the key determining submodule is used for determining a meta key word corresponding to the metadata of the target file according to the absolute path of the target file;
the hash calculation submodule is used for determining a second node identifier corresponding to a target storage node of the metadata in the distributed file system based on a hash calculation model;
and the position determining submodule is used for determining a second storage position of the metadata according to the system identifier of the distributed file system, the second node identifier corresponding to the metadata and the metadata keyword of the metadata.
In some embodiments, the hash computation sub-module is configured to:
performing hash calculation on the meta-keywords of the meta-data to obtain second hash values corresponding to the meta-keywords;
determining the number of nodes of at least one storage node in the distributed file system;
performing complementation calculation on the number of the nodes by using the second hash value to obtain a second node remainder;
and determining the second node remainder as a second node identification corresponding to the target storage node of the metadata in the at least one storage node.
The steps and technical effects of the apparatus in the embodiments of the present disclosure that need to be specifically executed may refer to the corresponding method, and are not described herein again.
It should be noted that the target file in this embodiment is not specific to a specific user, and cannot reflect personal information of a specific user. It should be noted that the target file in the present embodiment is from a public data set.
In the technical scheme of the disclosure, the collection, storage, use, processing, transmission, provision, disclosure and other processing of the personal information of the related user are all in accordance with the regulations of related laws and regulations and do not violate the good customs of the public order.
The present disclosure also provides a readable storage medium and a computer program product according to an embodiment of the present disclosure.
According to an embodiment of the present disclosure, the present disclosure also provides a computer program product comprising a computer program, which is stored in a readable storage medium and can be read by at least one processor of the recovery server, wherein the at least one processor executes the computer program to cause the recovery server to execute the schemes provided by the embodiments of fig. 2 to fig. 3.
According to an embodiment of the present disclosure, the present disclosure also provides a computer program product comprising a computer program, which is stored in a readable storage medium and can be read by at least one processor of the metadata server, wherein the at least one processor executes the computer program to cause the metadata server to execute the scheme provided by the embodiment shown in fig. 4.
FIG. 8 illustrates a schematic block diagram of an example reclamation server 800 that may be used to implement embodiments of the present disclosure. The reclamation server is intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other suitable computers. The reclamation server may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 8, the device 800 includes a computing unit 801 which can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 802 or a computer program loaded from a storage unit 808 into a Random Access Memory (RAM) 803. In the RAM 803, various programs and data required for the operation of the device 800 can also be stored. The computing unit 801, the ROM 802, and the RAM 803 are connected to each other by a bus 804. An input/output (I/O) interface 805 is also connected to bus 804.
A number of components in the device 800 are connected to the I/O interface 805, including: an input unit 806, such as a keyboard, a mouse, or the like; an output unit 807 such as various types of displays, speakers, and the like; a storage unit 808, such as a magnetic disk, optical disk, or the like; and a communication unit 809 such as a network card, modem, wireless communication transceiver, etc. The communication unit 809 allows the device 800 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
Computing unit 801 may be a variety of general and/or special purpose processing components with processing and computing capabilities. Some examples of the computing unit 801 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and the like. The calculation unit 801 executes the respective methods and processes described above, such as the file asynchronous processing method. For example, in some embodiments, the file asynchronous processing method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as storage unit 808. In some embodiments, part or all of the computer program can be loaded and/or installed onto device 800 via ROM 802 and/or communications unit 809. When loaded into RAM 803 and executed by computing unit 801, a computer program may perform one or more of the steps of the file asynchronous processing method described above. Alternatively, in other embodiments, the computing unit 801 may be configured to perform the file asynchronous processing method by any other suitable means (e.g., by means of firmware).
Further, embodiments of the present disclosure also provide a metadata server, and fig. 9 shows a schematic block diagram of an example metadata server 900 that may be used to implement embodiments of the present disclosure. The metadata server 900 may include: a computing unit 901, a Read Only Memory (ROM) 902, a Random Access Memory (RAM) 903, a bus 904, an input/output (I/O) interface 905, an input unit 906, an output unit 907, a storage unit 908, and a communication unit 909. The units, buses, interfaces, and connection relationships in fig. 9 are the same as those of the recovery server shown in fig. 8 and have the same functions, so they are not described again here.
In addition, an embodiment of the present disclosure further provides a file asynchronous processing system, whose system architecture is shown in fig. 10. The file asynchronous processing system may include: a user device 1001, the recovery server 1002 shown in fig. 8, the metadata server 1003 shown in fig. 9, and a distributed file system 1004 including at least one storage node. The metadata server is communicatively connected to the user device and to the recovery server, and the recovery server is connected to the distributed file system.
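For illustration only, the connection relationships of fig. 10 can be sketched as follows. This is a minimal in-process sketch, not the disclosed implementation: the class names, the `inbox` queue standing in for the network link, and the `block_locations` field carried in the metadata are all assumptions made for the example (the disclosure derives storage positions from the file version number rather than storing them in the metadata), and the file deletion condition checked by the metadata server is omitted.

```python
import queue
from dataclasses import dataclass, field


@dataclass
class DistributedFileSystem:
    """Stand-in for the distributed file system 1004: block location -> bytes."""
    blocks: dict = field(default_factory=dict)


@dataclass
class RecoveryServer:
    """Stand-in for the recovery server 1002, the only component touching file data."""
    dfs: DistributedFileSystem
    inbox: queue.Queue = field(default_factory=queue.Queue)  # assumed transport

    def drain(self) -> int:
        """Process queued metadata and delete the referenced blocks."""
        deleted = 0
        while not self.inbox.empty():
            metadata = self.inbox.get()
            # Hypothetical field: precomputed block locations.
            for location in metadata["block_locations"]:
                if self.dfs.blocks.pop(location, None) is not None:
                    deleted += 1
        return deleted


@dataclass
class MetadataServer:
    """Stand-in for the metadata server 1003, facing the user device."""
    recovery: RecoveryServer
    metadata: dict = field(default_factory=dict)  # absolute path -> metadata

    def handle_delete(self, path: str) -> bool:
        """Hand the metadata to the recovery server and return immediately."""
        record = self.metadata.get(path)
        if record is None:
            return False
        self.recovery.inbox.put(record)  # data deletion happens later
        del self.metadata[path]          # deletion condition check omitted
        return True
```

In this sketch the user device only ever calls `handle_delete`, which returns before any file data is touched; the data is removed when the recovery server later calls `drain`.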
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special-purpose or general-purpose, and which receives data and instructions from, and transmits data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, the feedback response provided to the user can be any form of sensory feedback response (e.g., a visual feedback response, an auditory feedback response, or a tactile feedback response); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, which is a host product in a cloud computing service system and addresses the high management difficulty and weak service scalability of traditional physical hosts and Virtual Private Server (VPS) services. The server may also be a server of a distributed system, or a server incorporating a blockchain.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved, and the present disclosure is not limited herein.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims (35)

1. A file asynchronous processing method comprises the following steps:
receiving metadata of a target file to be deleted, which is sent by a metadata server;
querying the file version number of the target file in the metadata;
determining a first storage position of target data of the target file in a distributed file system according to the file version number;
and deleting the target data from the distributed file system according to the first storage position.
2. The method of claim 1, wherein said determining a first storage location of target data of the target file in a distributed file system according to the file version number comprises:
determining an absolute path of the target file;
determining a storage path of target data in the target file according to the absolute path and the file version number;
and determining a first storage position of the target data in the distributed file system according to the storage path.
3. The method of claim 2, wherein said determining a storage path for target data in the target file from the absolute path and the file version number comprises:
determining a storage folder corresponding to the file version number according to the absolute path; the storage folder takes the file version number as a file name;
and determining the storage path of the target data in the target file based on the file path corresponding to the storage folder.
4. The method of claim 3, wherein the determining a storage path for the target data in the target file based on the file path corresponding to the storage folder comprises:
dividing target data in the target file into at least one data block, and determining block identifiers corresponding to the at least one data block respectively;
establishing a block storage path corresponding to the block identifier of the data block according to the file path of the storage folder, and obtaining block storage paths corresponding to at least one data block;
and determining a block storage path corresponding to at least one data block as the storage path of the target data.
5. The method of claim 4, wherein the dividing the target data in the target file into at least one data block and determining the block identifier corresponding to each of the at least one data block comprises:
dividing the target data into at least one data block according to the data volume of the target data in the target file;
and determining block identifications corresponding to the at least one data block according to the position of the at least one data block in the target file.
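As an illustrative sketch of claims 3 to 5, the block storage paths might be derived as shown below. The fixed block size, the use of the block index as the block identifier, and the `/`-joined path layout are assumptions made for this example; the claims state only that the storage folder is named after the file version number, that the data is split according to its data volume, and that block identifiers follow block positions.

```python
import posixpath

BLOCK_SIZE = 4 * 1024 * 1024  # assumed block size; the claims do not give one


def block_storage_paths(absolute_path: str, version: int, data_size: int,
                        block_size: int = BLOCK_SIZE) -> list[str]:
    """Return one storage path per data block of the given file version."""
    if data_size < 0 or block_size <= 0:
        raise ValueError("data_size must be >= 0 and block_size must be > 0")
    # Storage folder: named after the file version number (claim 3).
    folder = posixpath.join(absolute_path, str(version))
    # Split by data volume (claim 5): ceiling division, at least one block.
    block_count = max(1, -(-data_size // block_size))
    # Block identifier: the block's position in the file (assumed to be its index).
    return [posixpath.join(folder, str(block_id)) for block_id in range(block_count)]
```

For example, a 10-byte file at `/a/b.txt` with version 3 and a 4-byte block size yields three paths, `/a/b.txt/3/0` through `/a/b.txt/3/2`.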
6. The method of claim 4, wherein said determining a first storage location of the target data in the distributed file system from the storage path comprises:
obtaining a block storage path corresponding to at least one data block in the storage paths;
determining a storage keyword of the data block according to a block storage path corresponding to the data block;
determining a first node identification corresponding to a target storage node of the data block in the distributed file system based on a Hash calculation model;
and determining the block position of the data block according to the system identifier of the distributed file system, the first node identifier corresponding to the data block and the storage keyword of the data block, so as to obtain the first storage position formed by the block positions corresponding to at least one data block.
7. The method of claim 6, wherein the determining a first node identification corresponding to a target storage node of the data chunk in the distributed file system based on a hash calculation model comprises:
performing hash calculation on the storage keyword of the data block to obtain a first hash value corresponding to the storage keyword;
determining the number of nodes of at least one storage node in the distributed file system;
performing a remainder calculation on the first hash value with respect to the number of nodes to obtain a first node remainder;
and determining a first node identifier corresponding to a target storage node of the data block in at least one storage node according to the first node remainder.
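The hash calculation model of claims 6 and 7 can be illustrated with the following sketch. The choice of SHA-256, the truncation to 64 bits, and the representation of a block position as a `(system identifier, node identifier, storage keyword)` tuple are assumptions for the example; the claims require only a hash of the storage keyword followed by a remainder with respect to the number of storage nodes.

```python
import hashlib
from typing import Sequence


def node_remainder(storage_key: str, node_count: int) -> int:
    """Hash the storage keyword and reduce it modulo the number of nodes."""
    if node_count < 1:
        raise ValueError("at least one storage node is required")
    # Assumed hash: SHA-256 truncated to 64 bits; any stable hash would do.
    digest = hashlib.sha256(storage_key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest[:8], "big")
    return hash_value % node_count


def block_location(system_id: str, node_ids: Sequence[str],
                   storage_key: str) -> tuple[str, str, str]:
    """Combine system identifier, node identifier and storage keyword."""
    node_id = node_ids[node_remainder(storage_key, len(node_ids))]
    return (system_id, node_id, storage_key)
```

Because the position is a pure function of the keyword and the node count, the recovery server can recompute it without any lookup table. A general property of modulo placement, not something stated in the claims, is that changing the node count remaps most keywords.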
8. The method of claim 6 or 7, wherein said deleting the target data from the distributed file system according to the first storage location comprises:
determining a block position corresponding to at least one data block in the first storage position;
determining a target storage node of the data block according to the block position of the data block, and obtaining at least one target storage node corresponding to the data block;
sending a deletion request of the data block to a target storage node of the data block, wherein the deletion request indicates that the target storage node deletes the data block.
9. The method of claim 8, further comprising:
receiving a feedback response sent by a target storage node corresponding to the data block;
if it is determined that the target storage nodes corresponding to at least one data block respectively all send feedback responses, determining that the target file deletion is finished;
deleting the received metadata of the target file.
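The deletion and feedback steps of claims 8 and 9 might look like the sketch below. The `send_delete` callback stands in for whatever request the recovery server sends to a storage node and is hypothetical, as is keeping the metadata when some node has not answered; the claims say only that the received metadata is deleted once every target storage node has sent a feedback response.

```python
from typing import Callable, Iterable

# (system identifier, node identifier, storage keyword), an assumed layout.
BlockLocation = tuple[str, str, str]


def delete_blocks(locations: Iterable[BlockLocation],
                  send_delete: Callable[[str, str], bool]) -> bool:
    """Send one deletion request per block; True only if every node responded."""
    # A list (not a generator) so that every request is sent even after a failure.
    acknowledged = [send_delete(node_id, key) for _system_id, node_id, key in locations]
    return all(acknowledged)


def reclaim(pending: dict, path: str, locations: Iterable[BlockLocation],
            send_delete: Callable[[str, str], bool]) -> bool:
    """Drop the received metadata for `path` once all deletions are confirmed."""
    if not delete_blocks(locations, send_delete):
        return False  # assumption: keep the metadata so deletion can be retried
    pending.pop(path, None)
    return True
```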
10. The method according to any one of claims 1-9, wherein, after the receiving of the metadata of the target file to be deleted sent by the metadata server, the method further comprises:
detecting a device processing pressure of a recovery server;
if it is determined that the device processing pressure does not reach a maximum processing pressure, determining that a device deletion condition is met;
wherein the deleting the target data from the distributed file system according to the first storage position comprises:
and if it is determined that the device deletion condition is met, deleting the target data from the distributed file system according to the first storage position.
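A minimal sketch of the pressure check in claim 10 is given below. Measuring the processing pressure as the number of in-flight deletions is an assumption for the example; the claim does not say how the pressure is measured, and the class and method names are hypothetical.

```python
class PressureGate:
    """Admit a deletion only while the measured pressure is below the maximum."""

    def __init__(self, max_in_flight: int) -> None:
        if max_in_flight < 1:
            raise ValueError("max_in_flight must be at least 1")
        self.max_in_flight = max_in_flight  # assumed maximum processing pressure
        self.in_flight = 0                  # assumed pressure measure

    def deletion_condition_met(self) -> bool:
        """True while the pressure has not reached the maximum."""
        return self.in_flight < self.max_in_flight

    def try_start(self) -> bool:
        """Reserve a slot for one deletion if the condition is met."""
        if not self.deletion_condition_met():
            return False
        self.in_flight += 1
        return True

    def finish(self) -> None:
        """Release the slot once the deletion has completed."""
        if self.in_flight > 0:
            self.in_flight -= 1
```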
11. A file asynchronous processing method comprises the following steps:
receiving a deletion request initiated by user equipment for a target file;
acquiring metadata of the target file based on the deletion request;
sending the metadata to a recovery server, wherein the recovery server determines the file version number of the target file according to the metadata and deletes the target data corresponding to the target file according to the file version number;
and if the target file is determined to meet the file deletion condition, deleting the metadata.
12. The method of claim 11, further comprising:
inquiring the file version number of the target file in the metadata; the file version number comprises a life cycle bit and a counting bit;
determining the counting quantity corresponding to the counting bits;
and if the counting number is determined to be zero, determining that the target file meets the file deletion condition.
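The check in claim 12 can be illustrated as follows. The bit layout is purely an assumption for the example (the low 32 bits hold the count and the remaining high bits hold the life cycle value); the claim states only that the file version number comprises a life cycle part and a count part and that a count of zero satisfies the file deletion condition.

```python
COUNT_BITS = 32                     # assumed width of the count field
COUNT_MASK = (1 << COUNT_BITS) - 1  # mask selecting the count bits


def pack_version(lifecycle: int, count: int) -> int:
    """Build a version number from a life cycle value and a count."""
    if lifecycle < 0 or not 0 <= count <= COUNT_MASK:
        raise ValueError("lifecycle or count out of range")
    return (lifecycle << COUNT_BITS) | count


def count_of(version: int) -> int:
    """Extract the count held in the low bits."""
    return version & COUNT_MASK


def lifecycle_of(version: int) -> int:
    """Extract the life cycle value held in the high bits."""
    return version >> COUNT_BITS


def meets_deletion_condition(version: int) -> bool:
    """The file may be deleted when its count is zero."""
    return count_of(version) == 0
```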
13. The method of claim 11 or 12, wherein the obtaining metadata of the target file based on the deletion request comprises:
determining an absolute path of the target file;
determining a second storage position corresponding to the metadata of the target file according to the absolute path of the target file;
and reading the metadata of the target file from the second storage position.
14. The method of claim 13, wherein the determining a second storage location corresponding to the metadata of the target file according to the absolute path of the target file comprises:
determining a meta keyword corresponding to the metadata of the target file according to the absolute path of the target file;
determining a second node identifier corresponding to a target storage node of the metadata in the distributed file system based on a hash calculation model;
and determining a second storage position of the metadata according to the system identifier of the distributed file system, the second node identifier corresponding to the metadata and the meta keyword of the metadata.
15. The method of claim 14, wherein the determining a second node identification corresponding to the target storage node of the metadata in the distributed file system based on a hash calculation model comprises:
performing hash calculation on the meta keyword of the metadata to obtain a second hash value corresponding to the meta keyword;
determining the number of nodes of at least one storage node in the distributed file system;
performing a remainder calculation on the second hash value with respect to the number of nodes to obtain a second node remainder;
and determining the second node remainder as a second node identification corresponding to a target storage node of the metadata in at least one storage node.
16. A file asynchronous processing device, comprising:
the data receiving unit is used for receiving the metadata of the target file to be deleted, which is sent by the metadata server;
a first query unit, configured to query a file version number of the target file in the metadata;
the first determining unit is used for determining a first storage position of target data of the target file in the distributed file system according to the file version number;
and the first deleting unit is used for deleting the target data from the distributed file system according to the first storage position.
17. The apparatus of claim 16, wherein the first determining unit comprises:
the first determining module is used for determining an absolute path of the target file;
a second determining module, configured to determine a storage path of target data in the target file according to the absolute path and the file version number;
and the position determining module is used for determining a first storage position of the target data in the distributed file system according to the storage path.
18. The apparatus of claim 17, wherein the second determining module comprises:
the file determining submodule is used for determining a storage folder corresponding to the file version number according to the absolute path; the storage folder takes the file version number as a file name;
and the path determining submodule is used for determining the storage path of the target data in the target file based on the file path corresponding to the storage folder.
19. The apparatus of claim 18, wherein the path determination submodule is specifically configured to:
dividing target data in the target file into at least one data block, and determining block identifiers corresponding to the at least one data block respectively;
establishing a block storage path corresponding to the block identifier of the data block according to the file path of the storage folder, and obtaining block storage paths corresponding to at least one data block;
and determining a block storage path corresponding to at least one data block as a storage path of the target data.
20. The apparatus of claim 19, wherein the path determination submodule is specifically configured to:
dividing the target data into at least one data block according to the data volume of the target data in the target file;
and determining block identifications corresponding to the at least one data block according to the position of the at least one data block in the target file.
21. The apparatus of claim 20, wherein the location determination module comprises:
the first obtaining submodule is used for obtaining a block storage path corresponding to at least one data block in the storage paths;
the first determining submodule is used for determining a storage keyword of the data block according to a block storage path corresponding to the data block;
the second determining submodule is used for determining a first node identifier corresponding to a target storage node of the data block in the distributed file system based on a Hash calculation model;
and the third determining submodule is used for determining the block position of the data block according to the system identifier of the distributed file system, the first node identifier corresponding to the data block and the storage keyword of the data block so as to obtain the first storage position formed by the block positions corresponding to at least one data block.
22. The apparatus of claim 21, wherein the second determining submodule is specifically configured to:
performing hash calculation on the storage keyword of the data block to obtain a first hash value corresponding to the storage keyword;
determining the number of nodes of at least one storage node in the distributed file system;
performing a remainder calculation on the first hash value with respect to the number of nodes to obtain a first node remainder;
and determining a first node identifier corresponding to a target storage node of the data block in at least one storage node according to the first node remainder.
23. The apparatus of claim 22, wherein the second determination submodule is specifically configured to:
determining a block position corresponding to at least one data block in the first storage position;
determining a target storage node of the data block according to the block position of the data block, and obtaining at least one target storage node corresponding to the data block;
sending a deletion request of the data block to a target storage node of the data block, wherein the deletion request indicates that the target storage node deletes the data block.
24. The apparatus of claim 23, further comprising:
a feedback receiving unit, configured to receive a feedback response sent by a target storage node corresponding to the data block;
an end judgment unit, configured to determine that the target file is deleted when it is determined that the target storage nodes corresponding to the at least one data block all send feedback responses;
and the second deleting unit is used for deleting the received metadata of the target file.
25. The apparatus of any of claims 16-24, further comprising:
a pressure detection unit, configured to detect a device processing pressure of the recovery server;
a pressure judging unit, configured to determine that a device deletion condition is met if it is determined that the device processing pressure does not reach a maximum processing pressure;
wherein the first deleting unit includes:
a first deleting module, configured to delete the target data from the distributed file system according to the first storage position if it is determined that the device deletion condition is met.
26. A file asynchronous processing device, comprising:
a request receiving unit, configured to receive a deletion request initiated by a user equipment for a target file;
a data acquisition unit configured to acquire metadata of the target file based on the deletion request;
the data sending unit is used for sending the metadata to a recovery server, and the recovery server determines the file version number of the target file according to the metadata and deletes the target data corresponding to the target file according to the file version number;
and the third deleting unit is used for deleting the metadata if the target file is determined to meet the file deleting condition.
27. The apparatus of claim 26, further comprising:
the second query unit is used for querying the file version number of the target file in the metadata; the file version number comprises a life cycle bit and a counting bit;
a count determining unit, configured to determine a count quantity corresponding to the count bits;
and the condition determining unit is used for determining that the target file meets the file deletion condition if the counting number is determined to be zero.
28. The apparatus of claim 27, wherein the data acquisition unit comprises:
the path determining module is used for determining an absolute path of the target file;
the position determining module is used for determining a second storage position corresponding to the metadata of the target file according to the absolute path of the target file;
and the data reading module is used for reading the metadata of the target file from the second storage position.
29. The apparatus of claim 28, wherein the location determination module comprises:
the key determining submodule is used for determining a meta keyword corresponding to the metadata of the target file according to the absolute path of the target file;
the hash calculation submodule is used for determining a second node identifier corresponding to the target storage node of the metadata in the distributed file system based on a hash calculation model;
and the position determining submodule is used for determining a second storage position of the metadata according to the system identifier of the distributed file system, the second node identifier corresponding to the metadata and the meta keyword of the metadata.
30. The apparatus of claim 29, wherein the hash computation sub-module is configured to:
performing hash calculation on the meta keyword of the metadata to obtain a second hash value corresponding to the meta keyword;
determining the number of nodes of at least one storage node in the distributed file system;
performing remainder calculation on the number of the nodes by using the second hash value to obtain a second node remainder;
and determining the second node remainder as a second node identification corresponding to a target storage node of the metadata in at least one storage node.
31. A recovery server, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-10.
32. A metadata server, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 11-15.
33. A file asynchronous processing system, comprising: a user device, a recovery server as recited in claim 31, a metadata server as recited in claim 32, and a distributed file system comprising at least one storage node;
the metadata server is respectively in communication connection with the user equipment and the recovery server; the recovery server is connected with the distributed file system.
34. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any of claims 1-10 or 11-15.
35. A computer program product comprising a computer program which, when executed by a processor, carries out the steps of the method of any one of claims 1 to 10 or 11 to 15.
CN202210151881.1A 2022-02-18 2022-02-18 Asynchronous file processing method, device, server, medium, product and system Active CN114528258B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210151881.1A CN114528258B (en) 2022-02-18 2022-02-18 Asynchronous file processing method, device, server, medium, product and system

Publications (2)

Publication Number Publication Date
CN114528258A (en) 2022-05-24
CN114528258B (en) 2022-12-27

Family

ID=81621904

Country Status (1)

Country Link
CN (1) CN114528258B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant