CN114035750A - File processing method, device, equipment, medium and product - Google Patents

File processing method, device, equipment, medium and product

Info

Publication number
CN114035750A
Authority
CN
China
Prior art keywords
written
data
writing
data block
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111404189.7A
Other languages
Chinese (zh)
Inventor
李磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Duyou Information Technology Co ltd
Original Assignee
Beijing Duyou Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Duyou Information Technology Co ltd
Priority to CN202111404189.7A
Publication of CN114035750A
Current legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0647 Migration mechanisms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/11 File system administration, e.g. details of archiving or snapshots
    • G06F16/119 Details of migration of file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0673 Single storage device
    • G06F3/0682 Tape device

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure provides a file processing method, a file processing device, equipment, a file processing medium and a file processing product, and relates to the technical field of artificial intelligence, in particular to the field of cloud computing. The specific implementation scheme is as follows: screening a plurality of files to be migrated of a target user meeting the migration condition from a disk storage medium; mapping the files to be migrated into data objects respectively to obtain a plurality of data objects corresponding to the target user; performing data aggregation on a plurality of data objects corresponding to the target user to obtain a target data block; and storing the target data block into a tape storage medium. The technical scheme of the disclosure effectively reduces the storage cost.

Description

File processing method, device, equipment, medium and product
Technical Field
The present disclosure relates to the field of cloud computing in the technical field of artificial intelligence, and in particular, to a method, an apparatus, a device, a medium, and a product for processing a file.
Background
With the rapid development of mobile internet and artificial intelligence, the data scale of the personal cloud storage industry keeps growing, and the cost of storing the data grows with it. In existing cloud storage technology, a user can upload local data to a network disk, which stores the data. The network disk may be an HDD (Hard Disk Drive) server, which is a kind of disk storage; all user data is stored on the HDD server, which provides data access with millisecond-level latency. However, the existing data storage cost is too high, and a large amount of storage resources is consumed.
Disclosure of Invention
The present disclosure provides a method, apparatus, device, medium, and product for file processing in a network disk storage scenario.
According to a first aspect of the present disclosure, there is provided a file processing method including:
screening a plurality of files to be migrated of a target user meeting the migration condition from a disk storage medium;
mapping the files to be migrated into data objects respectively to obtain a plurality of data objects corresponding to the target user;
performing data aggregation on a plurality of data objects corresponding to the target user to obtain a target data block;
and storing the target data block into a tape storage medium.
According to a second aspect of the present disclosure, there is provided a file processing apparatus including:
the file selection unit is used for screening a plurality of files to be migrated of the target user meeting the migration conditions from the magnetic disk storage medium;
the file mapping unit is used for respectively mapping the files to be migrated into data objects so as to obtain a plurality of data objects corresponding to the target user;
the data aggregation unit is used for performing data aggregation on the plurality of data objects corresponding to the target user to obtain a target data block;
and the data storage unit is used for storing the target data block into a tape storage medium.
According to a third aspect of the present disclosure, there is provided an electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the file processing method of the first aspect.
According to a fourth aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute the file processing method of the first aspect.
According to a fifth aspect of the present disclosure, there is provided a computer program product comprising: a computer program, stored in a readable storage medium, from which at least one processor of an electronic device can read the computer program, execution of the computer program by the at least one processor causing the electronic device to perform the method of the first aspect.
According to the technology of the present disclosure, the problem of excessive storage cost caused by storing files in the disk storage medium is solved: files in the disk storage medium that meet the migration condition are migrated to the tape storage medium, whose storage cost is lower. In addition, a file selection and aggregation scheme is adopted during migration so that the files are migrated quickly.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a network architecture diagram of a file processing method provided in accordance with an embodiment of the present disclosure;
FIG. 2 is a flow chart of a file processing method provided according to a first embodiment of the present disclosure;
FIG. 3 is a flow chart of a file processing method provided according to a second embodiment of the present disclosure;
FIG. 4 is a schematic diagram of a data block storage structure;
FIG. 5 is a flow chart of a file processing method provided according to a third embodiment of the present disclosure;
FIG. 6 is a flowchart of a file processing method provided in accordance with a fourth embodiment of the present disclosure;
FIG. 7 is a schematic view of a file processing apparatus provided according to a fifth embodiment of the present disclosure;
FIG. 8 is a block diagram of an electronic device for implementing a file processing method of an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
The technical scheme disclosed by the invention can be applied to a personal cloud storage scene, and is mainly used for screening cold data in a disk storage medium, then migrating and storing the cold data into a tape storage medium, and realizing the rapid migration of files by adopting a mode of splitting the files into data fragments. The technical scheme of the disclosure reduces the storage cost and simultaneously realizes the rapid migration.
With the rapid development of mobile internet and artificial intelligence technology, the scale of personal cloud storage business is increasingly huge, so that a large amount of personal storage data is generated. As the amount of storage increases, the cost of storage increases. The personal cloud storage service mainly stores user data in a Hard Disk Drive (HDD) server to provide a fast data access service for a user. However, the HDD server is actually a disk storage, and the storage cost is high.
In order to solve the problem of the high storage cost of disk storage, this scheme migrates data from the disk storage system to a lower-cost tape storage system. The cost of storage media such as magnetic tape is lower than that of magnetic disks, and the storage density of magnetic tape is increasing much faster than that of magnetic disks, so tape offers lower storage cost and larger capacity. However, if all data in the disk storage system were migrated to the tape storage system, the data would have to be read from the tape storage system whenever a user accessed it, which would result in a low data reading speed. In practical applications, more than 95% of the data in a disk storage system is not accessed for a long time and may be referred to as "cold data", while data that is frequently accessed, for example public data, may be referred to as "hot data". Therefore, in the embodiments of the present disclosure, data is stored in different storage media according to its access heat and access-latency requirements, and cold data is migrated to the tape storage system.
The present disclosure provides a file processing method, apparatus, device, medium, and product, which are applied to the field of cloud computing in the technical field of artificial intelligence, and in particular to the fields of cloud storage and cloud distribution (CDN, Content Delivery Network), so as to reduce the storage cost and the heavy consumption of storage resources.
In the embodiments of the present disclosure, a plurality of files to be migrated of a target user that meet the migration condition may be screened from a disk storage medium, and the files to be migrated are respectively mapped into data objects to obtain a plurality of data objects corresponding to the target user. The plurality of data objects are the data objects to be stored for the target user. After the data objects corresponding to the target user are aggregated into at least one target data block, the at least one target data block may be stored in the tape storage medium, so that the target data block is stored quickly and accurately. Screening for files that meet the migration condition selects the files to be written accurately, and data aggregation gathers a plurality of files of one user into a target data block, so that the at least one target data block can be written quickly into a target storage block of the tape storage medium. Data migration thus reduces the storage cost and the consumption of storage resources, and the migration itself is completed quickly.
The technical solution of the present disclosure will be described in detail with reference to the accompanying drawings.
Fig. 1 is a diagram of a network architecture of a file processing method according to an embodiment of the present disclosure, where the network architecture may include a disk storage system 1, an electronic device 2, and a tape storage system 3. The disk storage system 1 may be connected to the electronic device 2 through a local area network or a wide area network. The electronic device 2 and the tape storage system 3 are connected to each other via a local area network or a wide area network. The electronic device 2 may be, for example, a general server, a cloud server, or the like, and may also be a computer, a notebook, a supercomputer device, or the like. The specific type of the electronic device 2 in the embodiment of the present disclosure is not limited too much.
The electronic device 2 may screen the user files in the disk storage system 1 to obtain a plurality of files to be migrated of the target user that satisfy the migration condition, and then map the files to be migrated into data objects to obtain a plurality of data objects corresponding to the target user. The data objects are then aggregated into at least one target data block, and the target data block is written into a target storage block of the tape storage system 3, so that the migratable data is transferred quickly. Migrating files from the disk storage system to the tape storage system effectively reduces the storage cost, and converting files into data objects and aggregating them allows the data to be transferred quickly, which improves migration efficiency.
As shown in fig. 2, which is a flowchart of a file processing method provided in a first embodiment of the present disclosure, the first embodiment may be configured as a file processing apparatus, and the file processing apparatus may be located in an electronic device. The file processing method may include the following steps:
201: screening a plurality of files to be migrated of the target user meeting the migration condition from the disk storage medium.
The plurality of files to be migrated of the target user may be files satisfying the migration condition. The disk storage medium may include one or more of a memory, an HDD server, and an SSD (Solid State Drive) memory. Disk storage media can provide efficient data read-write services, but are costly. The target user may be a user having a storage account in a cloud storage space, and all files of the user may be files stored in the cloud storage space, which may be, for example, an HDD server.
In order to improve the data migration efficiency, a plurality of target users may be simultaneously screened to migrate a plurality of files to be migrated of each target user.
The method for screening the plurality of files to be migrated of the target user meeting the migration condition from the disk storage medium may further include: receiving a file migration request initiated by a user; and screening a plurality of files to be migrated of the target users meeting the migration conditions from the disk storage medium in response to the file migration request.
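The screening step can be illustrated with a minimal Python sketch. The passage above does not define the migration condition, so the rule used here (a file is "cold" if it has not been accessed for a configurable number of days) is an assumption, as are the names `FileRecord`, `screen_files_to_migrate`, and `cold_after_days`.

```python
from dataclasses import dataclass

DAY = 86400  # seconds per day


@dataclass(frozen=True)
class FileRecord:
    """Hypothetical catalog entry for a file held on the disk storage medium."""
    user_id: str
    path: str
    size_bytes: int
    last_access_ts: float  # seconds since the epoch


def screen_files_to_migrate(records, user_id, now_ts, cold_after_days=180):
    """Return the target user's files whose last access is older than the threshold.

    Assumed migration condition: not accessed for `cold_after_days` days.
    """
    cutoff = now_ts - cold_after_days * DAY
    return [r for r in records
            if r.user_id == user_id and r.last_access_ts < cutoff]


now = 1_000 * DAY
catalog = [
    FileRecord("alice", "/a/old.jpg", 2048, now - 400 * DAY),
    FileRecord("alice", "/a/new.doc", 1024, now - 3 * DAY),
    FileRecord("bob", "/b/old.mp3", 4096, now - 500 * DAY),
]
cold = screen_files_to_migrate(catalog, "alice", now)
print([r.path for r in cold])
```

Only files belonging to the requested user are considered, which matches the per-user screening described above; several users could be processed by calling the function once per user.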
202: mapping the plurality of files to be migrated into data objects respectively to obtain a plurality of data objects corresponding to the target user.
Optionally, each file to be migrated may have a corresponding file format. For example, an audio/video file may be in a format such as MP3 (Moving Picture Experts Group Audio Layer III), WMV (Windows Media Video), or AVI (Audio Video Interleave); an image may be in a format such as BMP (Bitmap), JPEG (Joint Photographic Experts Group), TIFF (Tagged Image File Format), or GIF (Graphics Interchange Format); and a text file may be any of the file types produced by Microsoft Office software, such as Word files and Excel files. The file formats of multiple files to be migrated of the same user may be different.
Mapping the multiple files to be migrated into data objects may mean unifying the representation of the files by using the same data representation type, and a data object may be a data unit in the object storage medium. The object storage medium may be an electronic device for executing the technical solution of the embodiments of the present disclosure. For example, a data object may be represented in binary form. This representation is only an illustration and does not limit the data format of the data objects; the format of a data object may be set according to actual requirements, for example by defining a header, a data segment, an end-of-file flag, and the like for the data object. The data object of any file to be migrated can be obtained by format conversion of that file.
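As an illustration of a data object with a header, a data segment, and an end flag, the following Python sketch wraps file bytes in a simple binary envelope. The concrete layout (the `MAGIC` marker, the header fields, and the `END_FLAG` value) is hypothetical; the disclosure leaves the format open.

```python
import struct

MAGIC = b"DOBJ"                   # hypothetical header marker
END_FLAG = b"\xff\xfeEND"         # hypothetical end-of-object flag
HEADER = struct.Struct(">4sHQ")   # magic, name length, payload length


def to_data_object(name: str, payload: bytes) -> bytes:
    """Map a file (name plus raw bytes) to a uniform binary data object."""
    name_bytes = name.encode("utf-8")
    header = HEADER.pack(MAGIC, len(name_bytes), len(payload))
    return header + name_bytes + payload + END_FLAG


def from_data_object(blob: bytes):
    """Recover (name, payload) from a data object, validating header and end flag."""
    magic, name_len, data_len = HEADER.unpack_from(blob, 0)
    if magic != MAGIC:
        raise ValueError("not a data object")
    start = HEADER.size
    name = blob[start:start + name_len].decode("utf-8")
    payload = blob[start + name_len:start + name_len + data_len]
    if blob[start + name_len + data_len:] != END_FLAG:
        raise ValueError("missing or misplaced end flag")
    return name, payload


obj = to_data_object("photo.jpg", b"\x01\x02\x03")
print(from_data_object(obj))
```

Because every file, whatever its original format, is wrapped in the same envelope, later steps can treat all data objects uniformly as byte strings of known length.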
203: performing data aggregation on a plurality of data objects corresponding to the target user to obtain a target data block.
The target data block is obtained by aggregating the plurality of data objects corresponding to the target user. A target data block may include one or more data objects, as determined by the total data amount of the data objects and the storage capacity of each data block.
The data storage amount of the target data block is set in advance, and may be set to 1GB (gigabyte) for example.
204: storing the target data block in a tape storage medium.
The tape storage medium is a hardware storage, and the specific storage medium may be a tape. The tape storage medium may provide storage services for the target data blocks. The tape storage medium may include at least one tape storage, each tape storage having a storage capacity, and one tape storage may store a plurality of data blocks. For example, when the storage capacity of a tape storage is 10TB (terabytes) and the storage capacity of a data block is 1GB, the tape storage can store 10,000 data blocks of 1GB each.
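The capacity example can be checked with a few lines of Python. The sketch assumes decimal units (1 TB = 10^12 bytes, 1 GB = 10^9 bytes); the function name `blocks_per_tape` is illustrative only.

```python
TB = 10**12  # decimal terabyte, an assumption about the units used in the example
GB = 10**9   # decimal gigabyte


def blocks_per_tape(tape_capacity_bytes: int, block_size_bytes: int) -> int:
    """Number of whole data blocks that fit on one tape storage."""
    return tape_capacity_bytes // block_size_bytes


print(blocks_per_tape(10 * TB, 1 * GB))  # 10000
```

With binary units (10 TiB and 1 GiB) the same calculation would give 10,240 blocks, so the figure in the example corresponds to the decimal interpretation.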
In practical applications, the magnetic disk storage medium and the magnetic tape storage medium may be different storage systems, such as the disk storage system and the tape storage system shown in fig. 1. Alternatively, the disk storage medium and the tape storage medium may be different storage spaces in the same storage system.
In the embodiment of the disclosure, a plurality of files to be migrated of a target user meeting a migration condition may be screened from a disk storage medium, and the files to be migrated are mapped into data objects to obtain a plurality of data objects corresponding to the target user. Data aggregation is then performed on the plurality of data objects of the target user to obtain a target data block, and the target data block is stored into a tape storage medium, so that the target data block is stored quickly and accurately. Because the files are mapped into objects and aggregated, a plurality of files to be migrated of a target user can be written quickly into the tape storage medium in the form of one target data block, which reduces the storage cost and the consumption of storage resources and completes the data migration quickly. In addition, data aggregation gives the storage spaces of the target user's files a degree of continuity, so that when the target user queries files, the related files under the target user's name can be located and queried quickly.
As an embodiment, performing data aggregation on a plurality of data objects corresponding to a target user to obtain a target data block includes:
determining a data block to be written;
determining a current object to be written in from a plurality of data objects corresponding to a target user;
writing the object to be written into the data block to be written when the writing state of the data block to be written is in a writable state;
and acquiring the target data block obtained after the writing of all the data blocks to be written is finished.
In the embodiment of the disclosure, when data aggregation is performed on a plurality of data objects of a target user, a data block to be written may be generated first, and a current object to be written is then determined from the plurality of data objects of the target user. When the data block to be written is in a writable state, the object to be written is written into the data block to be written. The plurality of data objects of the target user are thereby written sequentially, which avoids missed or repeated writes and improves the accuracy of data aggregation.
As shown in fig. 3, a flowchart of another embodiment of a file processing method according to a second embodiment of the present disclosure is provided, where the second embodiment may be configured as a file processing apparatus, and the file processing apparatus may be located in an electronic device, and the file processing method may include the following steps:
301: screening a plurality of files to be migrated of the target user meeting the migration condition from the disk storage medium.
Some steps in the embodiment of the present disclosure are the same as those in the embodiment shown in fig. 2, and are not repeated herein for the sake of brevity of description.
302: mapping the multiple files to be migrated into data objects respectively to obtain multiple data objects corresponding to the target user.
303: a data block to be written is determined.
Wherein determining the data block to be written may include: and generating a new data block according to the storage capacity of the preset data block to obtain the data block to be written. The data block to be written may be a continuous segment of data writing space, so that different data objects may be written into the data block to be written.
304: determining the current object to be written from a plurality of data objects corresponding to the target user.
305: writing the object to be written into the data block to be written when the data block to be written is in a writable state.
The write status of the data block to be written may include: a writable state and a fully written state. In addition, the writing state of the data block to be written may further include: a non-writable state.
The writable state refers to that the data block to be written can be written with data. The non-writable state means that the data block to be written is already occupied by writing and cannot be written into data by other threads. The full-write state means that the storage space of the data block to be written is completely occupied, and even if the storage space is not occupied by writing, data can not be written any more.
In the process of writing the object to be written into the data block to be written, the data block to be written is occupied and enters a non-writable state so as to ensure that the written data of the data block to be written have user relevance, namely, the data of the same user is written into the corresponding data block in batches, and the accuracy of data writing is ensured.
306: when the writing of the object to be written is finished, determining the writing state of the data block to be written.
When the object to be written is written, the writing state of the data block to be written can be updated, and the writing state can be switched from a non-writable state to a fully-written state or a writable state.
After the object to be written is written, the writing state of the data block to be written may be a full-written state or a writable state.
307: if the writing state is the writable state, returning to step 304 to continue execution until the plurality of data objects of the target user have all been written, and obtaining the target data block for which writing has finished.
When the writing state is the writable state, data objects can continue to be written, so the current object to be written is again determined from the plurality of data objects of the target user and written into the data block to be written. This continues until all of the target user's data objects have been written, at which point the target data block corresponding to the data block to be written is obtained.
308: if the writing state is the fully written state, determining the fully written data block to be a target data block, and returning to step 303 to continue execution.
If the writing state is the fully written state, the data block to be written is full and can no longer accept data. A new data block to be written may then be generated, that is, execution returns to the step of determining the data block to be written, and the remaining data objects of the target user continue to be written into the new data block to be written.
309: acquiring the target data block obtained after the writing of all the data blocks to be written is finished.
In practical application, if the data volume of the plurality of data objects is large, the data blocks to be written may be generated continuously along with the writing of the data, when each data block to be written enters a full-written state, the target data block corresponding to the data block to be written may be obtained, and the target data block obtained when the writing is finished may include one or more data blocks.
310: storing the target data block in a tape storage medium.
In the embodiment of the present disclosure, after a plurality of files to be migrated of a target user that satisfy a migration condition are screened from a disk storage medium, the files to be migrated may be respectively mapped into data objects to obtain a plurality of data objects corresponding to the target user. A data block to be written is then determined, and the object to be written is written into the data block to be written when the data block is in a writable state. When the writing of the object to be written is finished, the writing state of the data block to be written is detected. If the writing state is the writable state, objects continue to be written until all of the target user's data objects have been written, and the target data block for which writing has finished is obtained. If the writing state is the fully written state, a new data block to be written is determined, and the remaining objects are written into the new data block. Finally, the target data blocks obtained after the writing of all data blocks to be written is finished are acquired. Writing the plurality of data objects of the target user sequentially into data blocks to be written aggregates those data objects into target data blocks, and the files are transferred to the tape storage medium block by block, so that the data can be stored rapidly and storage efficiency is improved.
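The loop formed by steps 303 to 309 can be sketched in Python as follows. This is an illustrative model, not the patented implementation: the class and function names are hypothetical, and the handling of an object that does not fit into the remaining space (the current block is closed and a new one is started, with no splitting of objects) is an assumption that the text does not specify.

```python
from enum import Enum


class WriteState(Enum):
    WRITABLE = "writable"
    NON_WRITABLE = "non-writable"  # occupied by an in-progress write
    FULL = "fully written"


class DataBlock:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.objects = []
        self.used = 0
        self.state = WriteState.WRITABLE

    def write(self, obj: bytes) -> None:
        if self.state is not WriteState.WRITABLE:
            raise RuntimeError("block is not writable")
        self.state = WriteState.NON_WRITABLE       # occupied while writing (step 305)
        self.objects.append(obj)
        self.used += len(obj)
        # Step 306: update the state once the write has finished.
        self.state = (WriteState.FULL if self.used >= self.capacity
                      else WriteState.WRITABLE)


def aggregate(objects, capacity: int):
    """Aggregate one user's data objects into a list of target data blocks."""
    target_blocks = []
    block = DataBlock(capacity)                    # step 303
    for obj in objects:                            # step 304
        if len(obj) > capacity:
            raise ValueError("object larger than a data block")
        if block.used + len(obj) > capacity:       # assumption: close the block, no splitting
            block.state = WriteState.FULL
            target_blocks.append(block)
            block = DataBlock(capacity)
        block.write(obj)                           # steps 305 and 306
        if block.state is WriteState.FULL:         # step 308: start a new block
            target_blocks.append(block)
            block = DataBlock(capacity)
    if block.objects:                              # steps 307 and 309: last, partly filled block
        target_blocks.append(block)
    return target_blocks


blocks = aggregate([b"aaaa", b"bbbb", b"cccc"], capacity=8)
print([(len(b.objects), b.state.value) for b in blocks])
```

In this model the last block of a user may remain in the writable state, which is the situation the later embodiment uses to continue with the next target user's files.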
In practical applications, in order to ensure that a plurality of data objects are all written into the same data block, the data block may be write-locked by a write identifier while data objects are being written into it. As a possible implementation, in step 305, writing the object to be written into the data block to be written when the data block to be written is in a writable state may include:
generating a writing identifier for an object to be written;
when the data block to be written is in a writable state, establishing a writing association between a writing identifier and the data block to be written;
and writing the object to be written into the data block to be written according to the writing identifier based on the writing association.
In the process of writing the object to be written into the data block to be written, a write association may be established between the write identifier of the object to be written and the data block to be written, so that the writing process of the object is identified by the write identifier and the object can be accurately written into the data block to be written.
After the object to be written has been written into the data block to be written, if the data block is detected to be in the fully written state, the write identifier associated with the data block may be closed to release memory, which saves memory space on the electronic device and improves its utilization.
In the embodiment of the disclosure, when the object to be written is written into the data block to be written, a write identifier may be generated for the object to be written, so that when the data block to be written is in a writable state, a write association between the write identifier and the data block to be written is established, and based on the write association, the object to be written is written into the corresponding data block to be written according to the write identifier. By establishing the write-in association between the write-in identifier and the data block to be written in, the object to be written in can be ensured to be accurately written in the data block to be written in, the data object can be accurately written in, and the accuracy of the data is ensured.
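As a non-limiting illustration, the write association described above may be sketched in Python as follows. The names `DataBlock`, `BlockWriter`, `open`, and `write` are hypothetical and do not appear in the disclosure. The sketch assumes that a data block has a fixed capacity in bytes and that the write identifier is a random hexadecimal string; the disclosure does not specify either detail.

```python
import uuid


class DataBlock:
    """Hypothetical in-memory stand-in for a data block to be written."""

    def __init__(self, capacity: int):
        self.capacity = capacity          # assumed fixed capacity in bytes
        self.payload = bytearray()

    def is_writable(self) -> bool:
        # Writable state: free space remains. Otherwise: fully written state.
        return len(self.payload) < self.capacity


class BlockWriter:
    """Holds the write associations: write identifier -> data block."""

    def __init__(self):
        self.associations = {}

    def open(self, block: DataBlock) -> str:
        # Generate a write identifier and associate it with a writable block.
        if not block.is_writable():
            raise ValueError("data block is fully written")
        write_id = uuid.uuid4().hex
        self.associations[write_id] = block
        return write_id

    def write(self, write_id: str, data: bytes) -> int:
        # Write through the association; return the number of bytes accepted.
        block = self.associations[write_id]
        free = block.capacity - len(block.payload)
        accepted = data[:free]
        block.payload.extend(accepted)
        if not block.is_writable():
            # Block is fully written: close the identifier to release memory.
            del self.associations[write_id]
        return len(accepted)
```

In this sketch, a caller that keeps the returned identifier can write several objects into the same block for as long as the block stays writable; once the block is full, the identifier is closed and a further write with it raises `KeyError`.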
In some embodiments, if the writing of the plurality of data objects of the target user is completed and the data block to be written is still in the writable state, the files of the next target user may continue to be written. That is, after the plurality of data objects of the target user have been written and the target data block is obtained, the file processing method may further include:
and detecting the writing state of the data block to be written when the writing of the plurality of data objects is finished.
And if the writing state of the data block to be written is a writable state, returning to the step of screening a plurality of files to be migrated of the target user meeting the migration condition from the magnetic disk storage medium and continuing to execute.
In the embodiment of the disclosure, the writing state of the data block to be written is detected when the writing of the plurality of data objects is completed. If the writing state is the writable state, execution returns to the step of screening a plurality of files to be migrated of a target user meeting the migration condition from the disk storage medium, so that a target user is selected again and the file migration operation continues for that user. Re-selecting the user in this way automates both user selection and file selection, which improves file migration efficiency.
After a new target user is determined, the files of that target user can continue to be written into the data block to be written, which ensures the continuity of writing. In a possible design, if the writing state of the data block to be written is the writable state, then after execution returns to the step of screening the plurality of files to be migrated of a target user meeting the migration condition from the disk storage medium, the plurality of files to be migrated of the new target user may be obtained. In the subsequent file mapping and data object writing, generating a write identifier for the object to be written may specifically include:
acquiring a write identifier associated with a data block to be written;
and taking the writing identifier associated with the data block to be written as the newly obtained writing identifier of the current object to be written corresponding to the target user.
In the embodiment of the present disclosure, after the plurality of data objects of the previous target user have been written, the plurality of data objects of the new target user are obtained and a new object to be written is determined. The write identifier already associated with the data block to be written may be used as the write identifier of the new object to be written, so that the new object is written into the same data block through the existing association. The objects written into the data block to be written therefore have writing continuity, and the writing accuracy is ensured.
In one possible design, if one data object has been written completely and the data block to be written still has free space, the next data object of the target user may be written, that is, the next data object of the target user is taken as the object to be written. When this new data object is written into the data block to be written, the write identifier associated with the data block to be written may be used as the write identifier of the new object to be written. Therefore, if the writing state is the writable state, after execution returns to the step of determining the current object to be written from the plurality of data objects corresponding to the target user, generating a write identifier for the object to be written includes:
acquiring a write identifier associated with a data block to be written;
and taking the writing identifier associated with the data block to be written as the writing identifier of the object to be written newly obtained by the target user.
In the embodiment of the present disclosure, after one object to be written of the target user has been written, the next data object may be written, and the write identifier associated with the data block to be written is used as the write identifier of that object. With the write identifier as the write reference, the data objects of the target user are written into the data block to be written that is associated with the identifier, so that a plurality of data objects of the target user are written continuously and accurately into the corresponding data block.
In practical applications, if the data object corresponding to the file is directly stored as a whole in the data block to be written, the data is difficult to write, the write interruption is easy to occur, the data needs to be rewritten, and the write efficiency is reduced. In some embodiments, based on the write association, writing the object to be written into the data block to be written according to the write identifier may specifically include:
and splitting the object to be written into a plurality of data fragments.
And sequentially writing the plurality of data fragments into the data block to be written associated with the writing identifier.
Splitting the object to be written into a plurality of data fragments may include: and splitting the object to be written into a plurality of data fragments according to the preset fragment size.
For example, if the data size of an object to be written is 10 MB (megabytes), that is, 10485760 bytes, and the preset fragment size is 4 MB, that is, 4194304 bytes, the object to be written may be split into three data fragments: the data fragment corresponding to bytes 1-4194304, the data fragment corresponding to bytes 4194305-8388608, and the data fragment corresponding to bytes 8388609-10485760.
In the embodiment of the disclosure, when the object to be written is written into the data block to be written, the object to be written may be split into a plurality of data fragments, so that the object to be written is sequentially written into the data block to be written in a unit of data fragments, thereby ensuring the efficiency of data writing. In the process of writing in each data fragment, the plurality of data fragments can be sequentially written into the data block to be written according to the writing identifier, so that the writing action of different data fragments is limited by the writing identifier, the plurality of data fragments are sequentially written into the data block to be written associated with the writing identifier, the accurate writing of different data fragments is realized, and the consistent and accurate writing of data objects is ensured.
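For illustration only, the byte ranges produced by splitting according to a preset fragment size may be computed as in the following Python sketch. The function name `split_ranges` is hypothetical; the ranges are 1-based and inclusive, matching the numbering used in the 10 MB example.

```python
def split_ranges(object_size: int, fragment_size: int):
    """Return 1-based inclusive (start, end) byte ranges, one per fragment."""
    if object_size < 0 or fragment_size <= 0:
        raise ValueError("sizes must be non-negative / positive")
    return [
        (offset + 1, min(offset + fragment_size, object_size))
        for offset in range(0, object_size, fragment_size)
    ]


# 10 MB object, 4 MB preset fragment size
ranges = split_ranges(10 * 1024 * 1024, 4 * 1024 * 1024)
# -> [(1, 4194304), (4194305, 8388608), (8388609, 10485760)]
```

The last fragment is simply whatever remains, so it may be shorter than the preset fragment size (2 MB here).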
One object to be written can be split into a plurality of data fragments, and in order to ensure the writing accuracy of the data fragments, a split identifier can be set for each writing fragment. As a possible implementation manner, after splitting the object to be written into a plurality of data fragments, the method may further include:
and determining splitting identifications respectively corresponding to the plurality of data fragments.
The writing of the plurality of data fragments into the data block to be written in sequence according to the writing identifier may include:
and writing the plurality of data fragments into the data block to be written associated with the writing identifier in sequence according to the splitting identifiers respectively corresponding to the plurality of data fragments.
Taking the splitting of the 10 MB object above as an example, splitting identifier 1 may be set for the data fragment corresponding to bytes 1-4194304, splitting identifier 2 for the data fragment corresponding to bytes 4194305-8388608, and splitting identifier 3 for the data fragment corresponding to bytes 8388609-10485760.
The writing of the plurality of data fragments into the to-be-written data block associated with the writing identifier in sequence according to the splitting identifiers respectively corresponding to the plurality of data fragments may include: according to the splitting identifiers respectively corresponding to the plurality of data fragments, sequentially writing the data fragments into the data block to be written associated with the writing identifier, starting from the data fragment corresponding to the first splitting identifier and continuing until the data fragment corresponding to the last splitting identifier has been written.
In the embodiment of the disclosure, the splitting identifiers respectively corresponding to the plurality of data fragments are determined, so that the plurality of data fragments are sequentially written into the data block to be written in which the writing identifier is associated according to the splitting identifiers respectively corresponding to the plurality of data fragments, and accurate writing of each data fragment is realized.
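A minimal Python sketch of writing fragments in splitting-identifier order is given below for illustration. The names `tag_fragments` and `write_in_order` are hypothetical, splitting identifiers are assumed to be consecutive integers starting at 1 (as in the example above), and a `bytearray` stands in for the data block to be written.

```python
def tag_fragments(data: bytes, fragment_size: int):
    """Split data and attach a splitting identifier (1, 2, 3, ...) to each fragment."""
    return [
        (offset // fragment_size + 1, data[offset:offset + fragment_size])
        for offset in range(0, len(data), fragment_size)
    ]


def write_in_order(fragments, block: bytearray):
    """Append fragments to the block from the first to the last splitting identifier."""
    written = []
    for split_id, fragment in sorted(fragments, key=lambda item: item[0]):
        block.extend(fragment)
        written.append(split_id)
    return written


fragments = tag_fragments(b"abcdefghij", 4)   # three fragments: ids 1, 2, 3
block = bytearray()
order = write_in_order(reversed(fragments), block)  # input order does not matter
```

Because the write loop sorts by splitting identifier, the block content is the same regardless of the order in which the fragments arrive.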
In practical applications, a file with a large capacity may not fit completely in one data block. For example, a movie file may have a capacity of 3 GB, and when the capacity of one data block is set to 1 GB, the movie file needs to be stored across three data blocks. Combining data fragments with data blocks effectively solves the problem of storing such large-capacity files.
Therefore, as an embodiment, during the process of writing the object to be written to the data block to be written, the method may further include:
in the process of writing the plurality of data fragments of the object to be written into the data block to be written according to the writing identification, the writing state of the data block to be written can be detected.
When any data fragment is written into the data block to be written, and the writing state of the data block to be written is detected to be the fully written state, a new data block to be written can be determined.
And setting a new writing identifier for the object to be written so as to establish the incidence relation between the new writing identifier and the new data block to be written.
And sequentially writing the unwritten data fragments into a new data block to be written according to the new writing identification until the plurality of data fragments are written into the corresponding data block to be written, and obtaining the corresponding target data block.
To facilitate understanding of the storage structure of the data objects of the target user in the data block, referring to fig. 4, which is a schematic diagram of the storage structure of the data block, file 1 may be mapped to object 1, and object 1 may be divided into fragment 11, fragment 12, fragment 13, and fragment 14. File 2 may be mapped to object 2, and object 2 may be divided into fragments 21, 22, 23, 24, and 25. Fragments 11-14 are stored in the data block in sequence, and fragments 21-25 are stored in the data block in sequence.
In the embodiment of the disclosure, the writing state of the data block to be written is detected while the plurality of data fragments of the object to be written are written into it according to the write identifier. When any data fragment has been written and the data block to be written is detected to be in the fully written state, a new data block to be written is determined and a new write identifier is set for the object to be written. After the association between the new write identifier and the new data block to be written is established, the data fragments that have not yet been written are written into the new data block according to the new write identifier. In this way the object to be written is written completely, each fragment is written based on the write identifier of the block that receives it, and accurate writing of the data is realized.
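The switch to a new data block and a new write identifier may be sketched in Python as follows, for illustration only. The names `Block` and `write_object` are hypothetical, and write identifiers are modeled as integers drawn from a counter. The sketch makes two simplifying assumptions that the disclosure does not state: the block capacity is a multiple of the fragment size, so a fragment never straddles two blocks, and a block is fully written once its used size reaches its capacity.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Block:
    capacity: int                                   # same unit as fragment sizes
    used: int = 0
    fragments: list = field(default_factory=list)   # (object name, splitting id)

    @property
    def full(self) -> bool:
        return self.used >= self.capacity


def write_object(name, fragment_sizes, block, capacity, ids):
    """Write fragments in order; return the (write_id, block) pairs that were used."""
    write_id = next(ids)                  # identifier associated with the first block
    used = [(write_id, block)]
    for split_id, size in enumerate(fragment_sizes, start=1):
        if block.full:
            # Fully written state detected: determine a new block, set a new identifier.
            block = Block(capacity)
            write_id = next(ids)
            used.append((write_id, block))
        block.fragments.append((name, split_id))
        block.used += size
    return used


# 3 GB movie, 1 GB blocks, 256 MB fragments (sizes expressed in MB)
result = write_object("movie", [256] * 12, Block(1024), 1024, count(1))
```

With these numbers the twelve fragments land four per block, so the object occupies three blocks and uses three write identifiers, which matches the 3 GB movie example in the text.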
In one possible design, multiple data objects of a target user may be written sequentially in a write order. Determining the current object to be written from a plurality of data objects corresponding to the target user may include:
and determining the writing sequence corresponding to the plurality of data objects corresponding to the target user respectively.
And determining the next data object of the previously written data object as the current object to be written according to the writing sequence respectively corresponding to the plurality of data objects.
Optionally, the writing sequence corresponding to each of the plurality of data objects corresponding to the target user may be determined according to the storage time, the storage size, and/or the storage level of the plurality of data objects.
In the embodiment of the present disclosure, the multiple data objects of the target user are obtained, and the multiple data objects can be respectively written into the data block to be written according to the respective writing sequence of the multiple data objects, so that sequential writing of the multiple data objects is realized, phenomena such as repeated writing or missing writing are avoided, and the accuracy of writing is ensured.
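As an illustrative sketch, a writing sequence may be derived by sorting the data objects on a composite key. The disclosure names storage time, storage size, and storage level as possible criteria but does not fix their priority or direction; the ordering below (lower level first, then earlier storage time, then smaller size) and the names `DataObject`, `write_order`, and `next_to_write` are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DataObject:
    name: str
    level: int       # storage level (assumed: lower is written first)
    stored_at: int   # storage time, e.g. seconds since the epoch
    size: int        # storage size in bytes


def write_order(objects: List[DataObject]) -> List[DataObject]:
    """Assumed ordering: level, then storage time, then size."""
    return sorted(objects, key=lambda o: (o.level, o.stored_at, o.size))


def next_to_write(ordered: List[DataObject],
                  previous: Optional[DataObject]) -> Optional[DataObject]:
    """Return the object that follows the previously written one, or None when done."""
    if previous is None:
        return ordered[0] if ordered else None
    position = ordered.index(previous) + 1
    return ordered[position] if position < len(ordered) else None


objects = [
    DataObject("b", level=1, stored_at=200, size=10),
    DataObject("a", level=1, stored_at=100, size=50),
    DataObject("c", level=0, stored_at=300, size=5),
]
ordered = write_order(objects)   # names in order: c, a, b
```

Selecting each object strictly as the successor of the previously written one is what prevents an object from being written twice or skipped.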
As shown in fig. 5, which is a flowchart of a file processing method according to a third embodiment of the present disclosure, the method may be performed by a file processing apparatus, the file processing apparatus may be located in an electronic device, and the file processing method may include the following steps:
501: and determining target users meeting the user screening conditions from the disk storage medium.
502: and selecting a plurality of files to be migrated which meet the file screening condition from all the files of the target user.
The screening conditions may include a user screening condition and a file screening condition. The target users meeting the user screening condition can be screened from the disk storage medium, and then a plurality of files to be migrated meeting the file screening condition are screened from all files of any target user. The user screening condition may be that a predetermined user identity is satisfied. The file screening condition may include that the access frequency of the file is lower than a preset access frequency threshold and that the file belongs to a predetermined file type.
In a personal web application scenario, the predetermined user identity may be a private user. Files stored under the user name can be classified into private files and public files according to different file types. A private file may be a file that is stored only and not accessible to other users. The public file may be a file stored on a network disk, but accessible by other users.
503: and mapping the plurality of files to be migrated into data objects respectively to obtain a plurality of data objects corresponding to the target user.
504: and carrying out data aggregation on a plurality of data objects corresponding to the target user to obtain a target data block.
505: storing the target data block in a tape storage medium.
In the embodiment of the disclosure, when determining a plurality of files to be migrated of a target user, the target user meeting the user screening condition may be determined first; screening by user identity yields an accurate target user. Then, from all files under the target user name, a plurality of files to be migrated are selected according to the file screening condition. Only files that meet the selection conditions are migrated, rather than all files, which keeps the migration effective and helps preserve the access efficiency of the files that remain on the disk storage medium.
As a possible implementation manner, selecting a plurality of files to be migrated that satisfy the file filtering condition from all the files of the target user may include:
a plurality of candidate files belonging to a predetermined file type are selected from all files of a target user.
And determining a plurality of files to be migrated with the access frequency smaller than a preset access frequency threshold value from the plurality of candidate files based on the access frequencies respectively corresponding to the plurality of candidate files.
The lower the access frequency of a candidate file, the more suitable the candidate file is for migration, since migrating it reduces the storage cost of the file. The file types of the plurality of candidate files may be the private file type. A file of the private file type is not shared with other users and has no sharing history.
In the embodiment of the disclosure, files with a low access frequency are migrated, so that files the user rarely uses are moved to migration storage. Access to the more frequently used files is not affected by the migration, and the access efficiency of the user's files is guaranteed to a certain extent.
As a possible implementation manner, selecting a plurality of candidate files belonging to a predetermined file type from all files of a target user includes:
and acquiring all files under the target user name.
And determining a plurality of files without preset sharing history from all files under the target user name.
Determining a plurality of files without preset sharing history as a plurality of candidate files.
In the embodiment of the disclosure, when a plurality of candidate files of a predetermined file type are selected from all files of a target user, a file without sharing history can be selected from all files under a target user name, and a private file of the user is used as the plurality of candidate files, so that migration of personal files of the user is realized, the common files are ensured not to be migrated, the common files are stored in a disk storage medium, and the file reading efficiency is ensured. And the private files of the user can be migrated, so that the files with low use frequency can be migrated, and the storage cost is effectively reduced.
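The two-stage file screening described above (type screening by sharing history, then frequency screening) may be sketched in Python as follows, for illustration. The names `FileInfo`, `has_sharing_history`, `access_frequency`, and `select_files_to_migrate` are hypothetical, and the unit of the access frequency (here, accesses per month) is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FileInfo:
    name: str
    has_sharing_history: bool
    access_frequency: float    # assumed unit: accesses per month


def select_files_to_migrate(files: List[FileInfo], threshold: float) -> List[FileInfo]:
    # Type screening: keep private files, i.e. files with no sharing history.
    candidates = [f for f in files if not f.has_sharing_history]
    # Frequency screening: keep candidates accessed less often than the threshold.
    return [f for f in candidates if f.access_frequency < threshold]


files = [
    FileInfo("notes.txt", has_sharing_history=False, access_frequency=0.2),
    FileInfo("shared.pdf", has_sharing_history=True, access_frequency=0.1),
    FileInfo("daily.xlsx", has_sharing_history=False, access_frequency=30.0),
]
to_migrate = select_files_to_migrate(files, threshold=1.0)
```

Here only `notes.txt` is selected: `shared.pdf` has a sharing history and stays on the disk storage medium, and `daily.xlsx` is accessed too often to be migrated.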
In the foregoing embodiments, after the files to be migrated of the target user have been migrated to the tape storage medium, if a migrated file needs to be read, the file may be looked up according to the association between the file and its data object. As shown in fig. 6, which is a flowchart of a file processing method according to a fourth embodiment of the present disclosure, the method may be performed by a file processing apparatus, the file processing apparatus may be located in an electronic device, and the file processing method may include the following steps:
601: and determining the file access request initiated by the target user and received by the disk storage medium.
602: in response to the file access request, a target file to which the target user requests access is determined.
603: and determining a target data object corresponding to the target file to query a target data block where the target data object is located.
Alternatively, the data object may be stored in the data block in the form of a plurality of data fragments. When the target data block where the target data object is located is queried, the target data block where all data fragments of the target data object are located can be queried, so that the target data object is located.
When a data object is split into a plurality of data fragments, the data fragments of the same file are stored in determined data blocks. When the data object is accessed, the data fragments corresponding to the data object are accessed together, so that the user's access is aggregated on the data blocks where the same file is located, and the access efficiency of the file can be effectively improved.
604: and reading the target data block where the target data object is located from the tape storage medium to obtain the target data object.
605: and mapping the target data object into a target file, and sending the target file to a disk storage medium so that the disk storage medium outputs the target file for a target user.
The target data block obtained based on the file processing method provided by the above embodiments is stored in the tape storage medium.
In the embodiment of the disclosure, when it is determined that the disk storage medium has received a file access request initiated by a target user, the target data object corresponding to the target file may be determined so as to query the target data block where the target data object is located. This determines the access link from the file to the data block. The target data block where the target data object is located is then read from the tape storage medium to obtain the target data object, and after the target data object is mapped into the target file, the target file is sent to the disk storage medium so that the disk storage medium outputs the target file for the target user. After the target data object is read through the access link, it can be converted into the corresponding file format to obtain and output the target file. Because the target data block where the data object of the file is located is positioned during file access, the file can be accessed quickly and accurately.
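The access link from file to data object to data block may be sketched in Python as follows, for illustration only. The function name `read_file`, the dictionary-based indexes, and the fragment location tuple `(splitting id, block id, offset, length)` are all hypothetical; the disclosure does not specify how these associations are recorded. A dictionary of byte strings stands in for the tape storage medium.

```python
def read_file(file_name, file_to_object, object_to_fragments, tape):
    """Follow file -> data object -> fragments -> data blocks and reassemble the bytes."""
    object_id = file_to_object[file_name]        # target data object of the target file
    locations = object_to_fragments[object_id]   # where each fragment was stored
    data = bytearray()
    # Tuples sort by their first element, the splitting identifier.
    for split_id, block_id, offset, length in sorted(locations):
        block = tape[block_id]                   # read the target data block
        data.extend(block[offset:offset + length])
    return bytes(data)                           # object content, mapped back to the file


# Hypothetical indexes: one object stored as two fragments in two blocks
tape = {"blk-1": b"HELLOwor", "blk-2": b"ld!"}
file_to_object = {"a.txt": 1}
object_to_fragments = {1: [(2, "blk-2", 0, 3), (1, "blk-1", 5, 3)]}

content = read_file("a.txt", file_to_object, object_to_fragments, tape)
```

In this example the first fragment is read from offset 5 of `blk-1` and the second from the start of `blk-2`, so the reassembled content is `b"world!"`.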
As shown in fig. 7, which is a schematic diagram of a file processing apparatus according to a fifth embodiment of the disclosure, the file processing apparatus may be located in an electronic device, and the file processing apparatus 700 may include the following units:
The file selecting unit 701: used for screening a plurality of files to be migrated of a target user meeting the migration condition from the disk storage medium.
The file mapping unit 702: used for mapping the plurality of files to be migrated into data objects respectively, so as to obtain a plurality of data objects corresponding to the target user.
The data aggregation unit 703: used for carrying out data aggregation on the plurality of data objects corresponding to the target user to obtain a target data block.
Data storage unit 704: for storing the target data block to the tape storage medium.
As an embodiment, the data aggregation unit 703 may include:
the first determining module is used for determining the data block to be written.
And the object determining module is used for determining the current object to be written from a plurality of data objects corresponding to the target user.
The object writing module is used for writing the object to be written into the data block to be written when the writing state of the data block to be written is in a writable state;
and the target acquisition module is used for acquiring the target data block obtained after the writing of all the data blocks to be written is finished.
In some embodiments, the data aggregation unit 703 may further include:
the state detection module is used for detecting the writing state of the data block to be written when the writing of the object to be written is finished; the writing state includes: a writable state or a fully written state;
the first processing module is used for returning to a plurality of data objects corresponding to the target user if the writing state is the writable state, and continuing to execute the step of determining the current object to be written until the plurality of data objects of the target user are written completely, and obtaining a target data block when the writing is finished;
and the second processing module is used for determining a target data block after the data block to be written is fully written if the writing state is the fully written state, and returning to the step of determining a new data block to be written to continue executing.
In one possible design, an object writing module includes:
the identification generation submodule is used for generating a write-in identification for the object to be written;
the association establishing submodule is used for establishing the writing association between the writing identifier and the data block to be written when the data block to be written is in a writable state;
and the data writing submodule is used for writing the object to be written into the data block to be written according to the writing identification based on the writing association.
In certain embodiments, the apparatus may further include:
the first detection unit is used for detecting the writing state of a data block to be written when the writing of the plurality of data objects is finished;
the first returning unit is configured to return to the file selecting unit 701 to continue execution if the writing status of the data block to be written is a writable status.
In one possible design, the identifier generation submodule may be specifically configured to:
acquiring a write identifier associated with a data block to be written;
and taking the writing identifier associated with the data block to be written as the newly obtained writing identifier of the current object to be written corresponding to the target user.
In another possible design, the identifier generation submodule may be specifically configured to:
acquiring a write identifier associated with a data block to be written;
and taking the writing identifier associated with the data block to be written as the writing identifier of the object to be written newly obtained by the target user.
As an embodiment, the data writing submodule may specifically be configured to:
splitting an object to be written into a plurality of data fragments;
and sequentially writing the plurality of data fragments into the data block to be written associated with the writing identifier.
In some embodiments, the data writing submodule may be further configured to:
determining splitting identifications corresponding to the data fragments respectively;
and writing the plurality of data fragments into the data block to be written associated with the writing identifier in sequence according to the splitting identifiers respectively corresponding to the plurality of data fragments.
In one possible design, the apparatus may further include:
the second detection unit is used for detecting the writing state of the data block to be written in the process of writing the plurality of data fragments of the object to be written in the data block to be written in according to the writing identification;
the second determining unit is used for determining a new data block to be written when any data slice is written into the data block to be written and the writing state of the data block to be written is detected to be a full writing state;
the identification association unit is used for setting a new writing identification for the object to be written so as to establish the association relationship between the new writing identification and the new data block to be written;
and the fragment writing unit is used for sequentially writing the unwritten data fragments into a new data block to be written according to the new writing identifier until the plurality of data fragments are written into the corresponding data block to be written, so as to obtain the corresponding target data block.
In some embodiments, the object determination module may include:
the sequence determining unit is used for determining the writing sequence corresponding to the plurality of data objects corresponding to the target user respectively;
and the object determining unit is used for determining the next data object of the data object which is written before as the current object to be written according to the writing sequence respectively corresponding to the plurality of data objects.
As an embodiment, the file selecting unit 701 may include:
the user screening module is used for determining target users meeting the user screening conditions from the disk storage medium;
and the file screening module is used for selecting a plurality of files to be migrated which meet the file screening conditions from all the files of the target user.
In one possible design, the file screening module may include:
the type screening submodule is used for selecting a plurality of candidate files belonging to a preset file type from all files of a target user;
and the frequency screening submodule is used for determining a plurality of files to be migrated with access frequencies smaller than a preset access frequency threshold value from the plurality of candidate files based on the access frequencies respectively corresponding to the plurality of candidate files.
As yet another example, the type screening submodule may be operable to:
acquiring all files under a target user name;
determining a plurality of files without preset sharing history from all files under the target user name;
determining a plurality of files without preset sharing history as a plurality of candidate files.
In some embodiments, it may further include:
the request determining unit is used for determining a file access request initiated by a target user and received by the disk storage medium;
the request response unit is used for responding to the file access request and determining a target file which is requested to be accessed by a target user;
the data query unit is used for determining a target data object corresponding to the target file so as to query a target data block where the target data object is located;
a data reading unit for reading the target data block from the tape storage medium to obtain a target data object;
and the file sending unit is used for mapping the target data object into a target file and sending the target file to the disk storage medium so that the disk storage medium outputs the target file for the target user.
It should be noted that the target user in this embodiment is not specific to a specific user, and cannot reflect personal information of a specific user. In the technical scheme of the disclosure, the collection, storage, use, processing, transmission, provision, disclosure and other processing of the personal information of the related user are all in accordance with the regulations of related laws and regulations and do not violate the good customs of the public order.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.
According to an embodiment of the present disclosure, the present disclosure also provides a computer program product comprising: a computer program, stored in a readable storage medium, from which at least one processor of the electronic device can read the computer program, the at least one processor executing the computer program causing the electronic device to perform the solution provided by any of the embodiments described above.
FIG. 8 illustrates a schematic block diagram of an example electronic device 800 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in FIG. 8, the device 800 includes a computing unit 801 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 802 or a computer program loaded from a storage unit 808 into a Random Access Memory (RAM) 803. The RAM 803 can also store various programs and data required for the operation of the device 800. The computing unit 801, the ROM 802, and the RAM 803 are connected to each other by a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.
A number of components in the device 800 are connected to the I/O interface 805, including: an input unit 806, such as a keyboard, a mouse, or the like; an output unit 807 such as various types of displays, speakers, and the like; a storage unit 808, such as a magnetic disk, optical disk, or the like; and a communication unit 809 such as a network card, modem, wireless communication transceiver, etc. The communication unit 809 allows the device 800 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The computing unit 801 may be any of a variety of general-purpose and/or special-purpose processing components with processing and computing capabilities. Some examples of the computing unit 801 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and the like. The computing unit 801 executes the methods and processes described above, such as the file processing method. For example, in some embodiments, the file processing method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 808. In some embodiments, part or all of the computer program can be loaded and/or installed onto the device 800 via the ROM 802 and/or the communication unit 809. When the computer program is loaded into the RAM 803 and executed by the computing unit 801, one or more steps of the file processing method described above may be performed. Alternatively, in other embodiments, the computing unit 801 may be configured to perform the file processing method in any other suitable manner (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, and which receives data and instructions from, and transmits data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowcharts and/or block diagrams to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, also called a cloud computing server or cloud host, which is a host product in a cloud computing service system that addresses the high management difficulty and weak service scalability of traditional physical hosts and Virtual Private Server (VPS) services. The server may also be a server of a distributed system, or a server incorporating a blockchain.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved, and the present disclosure is not limited herein.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims (33)

1. A method of file processing, comprising:
screening a plurality of files to be migrated of a target user meeting the migration condition from a disk storage medium;
mapping the files to be migrated into data objects respectively to obtain a plurality of data objects corresponding to the target user;
performing data aggregation on a plurality of data objects corresponding to the target user to obtain a target data block;
and storing the target data block into a tape storage medium.
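The four steps of claim 1 can be sketched end to end in Python. This is an illustration under stated assumptions rather than the claimed implementation: a data object is modelled as an identifier plus the file bytes, aggregation is plain concatenation with an offset index, and a list stands in for the tape storage medium. The names `DataObject`, `map_file_to_object`, `aggregate`, and `migrate` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DataObject:
    object_id: str
    payload: bytes


def map_file_to_object(user, name, content):
    # Hypothetical mapping: the object id is derived from user and file name.
    return DataObject(object_id=f"{user}/{name}", payload=content)


def aggregate(objects):
    """Concatenate payloads into one block and record where each object sits."""
    block = bytearray()
    index = {}
    for obj in objects:
        index[obj.object_id] = (len(block), len(obj.payload))  # (offset, length)
        block.extend(obj.payload)
    return bytes(block), index


def migrate(user, files, tape):
    """Map already-screened files to objects, aggregate them, store the block."""
    objects = [map_file_to_object(user, n, c) for n, c in files.items()]
    block, index = aggregate(objects)
    tape.append(block)  # a list stands in for the tape storage medium
    return index


tape = []
index = migrate("alice", {"a.txt": b"AAAA", "b.txt": b"BB"}, tape)
```

Because all objects in a block belong to one user, a later read for that user touches a single contiguous region of tape, which is the motivation for aggregating per user.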
2. The method of claim 1, wherein the aggregating data of the plurality of data objects corresponding to the target user to obtain a target data block comprises:
determining a data block to be written;
determining a current object to be written in from a plurality of data objects corresponding to the target user;
writing the object to be written into the data block to be written when the writing state of the data block to be written is in a writable state;
and acquiring the target data block obtained after the writing of all data blocks to be written is finished.
3. The method of claim 2, further comprising:
detecting the writing state of the data block to be written when the writing of the object to be written is finished; the writing state includes: a writable state or a fully written state;
if the writing state is a writable state, returning to the plurality of data objects corresponding to the target user, and continuing to execute the step of determining the current object to be written until the plurality of data objects of the target user are written completely, and obtaining a target data block when the writing is finished;
and if the writing state is a full writing state, determining the target data block after the data block to be written is fully written, and returning to the step of determining the data block to be written to continue executing.
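The loop described in claims 2 and 3 can be sketched as below. The sketch assumes, for simplicity, that a block's capacity is counted in objects rather than bytes; the `Block` class and the `aggregate` function are hypothetical names.

```python
class Block:
    """A data block to be written; capacity is counted in objects (assumption)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.objects = []

    def is_writable(self):
        # "Writable" until the block holds `capacity` objects, then "fully written".
        return len(self.objects) < self.capacity


def aggregate(objects, capacity):
    target_blocks = []
    block = Block(capacity)              # determine a data block to be written
    for obj in objects:                  # determine the current object to be written
        block.objects.append(obj)        # the block is writable here by construction
        if not block.is_writable():      # detect the writing state after the write
            target_blocks.append(block)  # fully written: it becomes a target block
            block = Block(capacity)      # determine a new data block to be written
    if block.objects:
        target_blocks.append(block)      # all objects written: keep the last block
    return target_blocks


blocks = aggregate(["o1", "o2", "o3", "o4", "o5"], capacity=2)
```

With five objects and a capacity of two, the sketch yields two full blocks and one partially filled block.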
4. The method of claim 3, wherein writing the object to be written to the data block to be written while the data block to be written is in a writable state comprises:
generating a writing identifier for the object to be written;
when the data block to be written is in a writable state, establishing the writing association between the writing identification and the data block to be written;
and writing the object to be written into the data block to be written according to the writing identifier based on the writing association.
5. The method of claim 4, wherein obtaining the target data block when writing is finished, once the plurality of data objects of the target user have all been written, further comprises:
detecting the writing state of the data block to be written when the writing of the plurality of data objects is finished;
and if the writing state of the data block to be written is a writable state, returning to the step of screening a plurality of files to be migrated of the target user meeting the migration condition from the disk storage medium and continuing to execute.
6. The method according to claim 5, wherein, after returning to and continuing to execute the step of screening a plurality of files to be migrated of the target user meeting the migration condition from the disk storage medium because the writing state of the data block to be written is a writable state, generating a write identifier for the object to be written comprises:
acquiring a write identifier associated with the data block to be written;
and taking the writing identifier associated with the data block to be written as the newly obtained writing identifier of the current object to be written corresponding to the target user.
7. The method according to any one of claims 4 to 6, wherein, after returning to the plurality of data objects corresponding to the target user and continuing to execute the step of determining the current object to be written because the writing state is a writable state, generating a writing identifier for the object to be written comprises:
acquiring a write identifier associated with the data block to be written;
and taking the writing identifier associated with the data block to be written as the writing identifier of the object to be written newly obtained by the target user.
8. The method according to any one of claims 4 to 7, wherein the writing the object to be written to the data block to be written according to the writing identifier based on the writing association comprises:
splitting the object to be written into a plurality of data fragments;
and sequentially writing the plurality of data fragments into the data block to be written associated with the writing identifier.
9. The method of claim 8, wherein after the splitting the object to be written into the plurality of data slices, further comprising:
determining splitting identifications corresponding to the data fragments respectively;
the writing the plurality of data fragments into the data block to be written in sequence according to the writing identifier includes:
and sequentially writing the plurality of data fragments into the data block to be written associated with the writing identifier according to the splitting identifiers respectively corresponding to the plurality of data fragments.
10. The method of claim 8, further comprising:
detecting the writing state of the data block to be written in the process that a plurality of data fragments of the object to be written in are written in the data block to be written according to the writing identification;
when any one of the data fragments is written into the data block to be written, and the writing state of the data block to be written is detected to be a full writing state, determining a new data block to be written;
setting a new writing identifier for the object to be written to establish an incidence relation between the new writing identifier and the new data block to be written;
and sequentially writing the unwritten data fragments into the new data block to be written according to the new writing identification until the plurality of data fragments are written into the corresponding data block to be written, and obtaining the corresponding target data block.
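The fragment handling of claims 8 to 10 can be sketched as follows. The sketch assumes fixed-size fragments, a block capacity counted in fragments, and integer write identifiers that increase by one for each new block; all of these, along with the names `Block`, `split`, and `write_object`, are hypothetical. It checks for a full block just before writing the next fragment, so no empty block is opened when the last fragment happens to fill the current one.

```python
class Block:
    def __init__(self, write_id, capacity):
        self.write_id = write_id  # write identifier associated with this block
        self.capacity = capacity  # capacity in fragments (an assumption)
        self.fragments = []       # entries: (object_id, split_id, data)

    def is_full(self):
        return len(self.fragments) >= self.capacity


def split(payload, fragment_size):
    """Split an object into fragments; the list position is the split identifier."""
    return [payload[i:i + fragment_size]
            for i in range(0, len(payload), fragment_size)]


def write_object(object_id, payload, block, fragment_size):
    """Write fragments in split order, spilling into new blocks when one fills up."""
    blocks = [block]
    for split_id, data in enumerate(split(payload, fragment_size)):
        if block.is_full():
            # Fully written: open a new block with a new write identifier.
            block = Block(block.write_id + 1, block.capacity)
            blocks.append(block)
        block.fragments.append((object_id, split_id, data))
    return blocks


blocks = write_object("obj-1", b"abcdefghij", Block(write_id=1, capacity=2),
                      fragment_size=4)
reassembled = b"".join(data for b in blocks for _, _, data in b.fragments)
```

Keeping the split identifier next to each fragment lets a reader restore the original order even when the fragments of one object end up in different blocks.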
11. The method according to any one of claims 2-10, wherein the determining a current object to be written from among a plurality of data objects corresponding to the target user comprises:
determining writing sequences respectively corresponding to a plurality of data objects corresponding to the target user;
and determining the next data object of the previously written data object as the current object to be written according to the writing sequence respectively corresponding to the plurality of data objects.
12. The method according to any one of claims 1 to 11, wherein the screening of the plurality of files to be migrated of the target user satisfying the migration condition from the disk storage medium comprises:
determining target users meeting user screening conditions from the disk storage medium;
and selecting a plurality of files to be migrated which meet file screening conditions from all files of the target user.
13. The method of claim 12, wherein the selecting a plurality of files to be migrated that satisfy a file filtering condition from all files of the target user comprises:
selecting a plurality of candidate files belonging to a preset file type from all files of the target user;
and determining the plurality of files to be migrated with the access frequency smaller than a preset access frequency threshold value from the plurality of candidate files based on the access frequencies respectively corresponding to the plurality of candidate files.
14. The method of claim 13, wherein said selecting a plurality of candidate files belonging to a predetermined file type from all files of the target user comprises:
acquiring all files under the target user name;
determining a plurality of files without preset sharing history from all files under the target user name;
determining a plurality of files without preset sharing history as a plurality of candidate files.
15. The method of any of claims 1-14, further comprising:
determining a file access request initiated by the target user and received by the disk storage medium;
in response to the file access request, determining a target file which the target user requests to access;
determining a target data object corresponding to the target file to query a target data block where the target data object is located;
reading the target data block from the tape storage medium to obtain the target data object;
and mapping the target data object into the target file, and sending the target file to the disk storage medium so that the disk storage medium outputs the target file for the target user.
16. A file processing apparatus comprising:
the file selection unit is used for screening a plurality of files to be migrated of the target user meeting the migration conditions from the disk storage medium;
the file mapping unit is used for respectively mapping the files to be migrated into data objects so as to obtain a plurality of data objects corresponding to the target user;
the data aggregation unit is used for performing data aggregation on the plurality of data objects corresponding to the target user to obtain a target data block;
and the data storage unit is used for storing the target data block into a tape storage medium.
17. The apparatus of claim 16, wherein the data aggregation unit comprises:
the first determining module is used for determining a data block to be written;
the object determining module is used for determining a current object to be written in from a plurality of data objects corresponding to the target user;
the object writing module is used for writing the object to be written into the data block to be written when the writing state of the data block to be written is in a writable state;
and the target acquisition module is used for acquiring the target data block obtained after all the data blocks to be written are written.
18. The apparatus of claim 17, wherein the data aggregation unit further comprises:
the state detection module is used for detecting the writing state of the data block to be written when the writing of the object to be written is finished; the writing state includes: a writable state or a fully written state;
the first processing module is configured to, if the writing status is a writable status, return to the plurality of data objects corresponding to the target user, and continue to execute the step of determining the current object to be written until the plurality of data objects of the target user are written completely, and obtain a target data block when writing is finished;
and the second processing module is used for determining the target data block after the data block to be written is fully written if the writing state is a fully written state, and returning to the step of determining the data block to be written to continue executing.
19. The apparatus of claim 18, wherein the object write module comprises:
the identification generation submodule is used for generating a writing identification for the object to be written;
the association establishing submodule is used for establishing the writing association between the writing identification and the data block to be written when the data block to be written is in a writable state;
and the data writing submodule is used for writing the object to be written into the data block to be written according to the writing identification based on the writing association.
20. The apparatus of claim 19, further comprising:
a first detection unit, configured to detect a write status of the data block to be written when writing of the plurality of data objects is completed;
and the first returning unit is used for returning to the file selecting unit to continue executing if the writing state of the data block to be written is a writable state.
21. The apparatus according to claim 20, wherein the identifier generation submodule is specifically configured to:
acquiring a write identifier associated with the data block to be written;
and taking the writing identifier associated with the data block to be written as the newly obtained writing identifier of the current object to be written corresponding to the target user.
22. The apparatus according to any one of claims 19-21, wherein the identifier generation submodule is specifically configured to:
acquiring a write identifier associated with the data block to be written;
and taking the writing identifier associated with the data block to be written as the writing identifier of the object to be written newly obtained by the target user.
23. The apparatus of any one of claims 19-22, wherein the data write submodule is specifically configured to:
splitting the object to be written into a plurality of data fragments;
and sequentially writing the plurality of data fragments into the data block to be written associated with the writing identifier.
24. The apparatus of claim 23, wherein the data write submodule is specifically configured to:
determining splitting identifications corresponding to the data fragments respectively;
and sequentially writing the plurality of data fragments into the data block to be written associated with the writing identifier according to the splitting identifiers respectively corresponding to the plurality of data fragments.
25. The apparatus of claim 23, further comprising:
a second detecting unit, configured to detect a write status of the data block to be written in a process that the plurality of data fragments of the object to be written are written in the data block to be written according to the write identifier;
a second determining unit, configured to determine a new data block to be written when any one of the data fragments is written into the data block to be written and the writing state of the data block to be written is detected to be a full writing state;
the identification association unit is used for setting a new writing identification for the object to be written so as to establish the association relationship between the new writing identification and the new data block to be written;
and the fragment writing unit is used for sequentially writing the unwritten data fragments into the new data block to be written according to the new writing identifier until the plurality of data fragments are written into the corresponding data block to be written, so as to obtain the corresponding target data block.
26. The apparatus of any of claims 16-25, wherein the object determination module comprises:
the sequence determining unit is used for determining the writing sequence corresponding to the plurality of data objects corresponding to the target user respectively;
and the object determining unit is used for determining, according to the writing sequence respectively corresponding to the plurality of data objects, the data object that follows the previously written data object as the current object to be written.
27. The apparatus according to any one of claims 16-26, wherein the file selecting unit comprises:
the user screening module is used for determining target users meeting user screening conditions from the disk storage medium;
and the file screening module is used for selecting a plurality of files to be migrated which meet the file screening conditions from all the files of the target user.
28. The apparatus of claim 27, wherein the file filtering module comprises:
the type screening submodule is used for selecting a plurality of candidate files belonging to a preset file type from all files of the target user;
and the frequency screening submodule is used for determining the plurality of files to be migrated with the access frequency smaller than a preset access frequency threshold value from the plurality of candidate files based on the access frequencies respectively corresponding to the plurality of candidate files.
29. The apparatus of claim 28, wherein the type filtering submodule is specifically configured to:
acquiring all files under the target user name;
determining a plurality of files without preset sharing history from all files under the target user name;
determining a plurality of files without preset sharing history as a plurality of candidate files.
30. The apparatus of any of claims 16-29, further comprising:
a request determining unit, configured to determine a file access request initiated by the target user and received by the disk storage medium;
a request response unit, configured to determine, in response to the file access request, a target file to which the target user requests access;
the data query unit is used for determining a target data object corresponding to the target file so as to query a target data block where the target data object is located;
a data reading unit, configured to read the target data block from the tape storage medium, and obtain the target data object;
and the file sending unit is used for mapping the target data object into the target file and sending the target file to the disk storage medium so that the disk storage medium outputs the target file for the target user.
31. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the file processing method of any of claims 1-15.
32. A non-transitory computer-readable storage medium storing computer instructions for causing the computer to perform the file processing method according to any one of claims 1 to 15.
33. A computer program product comprising a computer program which, when executed by a processor, carries out the steps of the file processing method of any one of claims 1 to 15.
CN202111404189.7A 2021-11-24 2021-11-24 File processing method, device, equipment, medium and product Pending CN114035750A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111404189.7A CN114035750A (en) 2021-11-24 2021-11-24 File processing method, device, equipment, medium and product

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111404189.7A CN114035750A (en) 2021-11-24 2021-11-24 File processing method, device, equipment, medium and product

Publications (1)

Publication Number Publication Date
CN114035750A true CN114035750A (en) 2022-02-11

Family

ID=80138697

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111404189.7A Pending CN114035750A (en) 2021-11-24 2021-11-24 File processing method, device, equipment, medium and product

Country Status (1)

Country Link
CN (1) CN114035750A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114518848A (en) * 2022-02-15 2022-05-20 北京百度网讯科技有限公司 Hierarchical storage system, and method, apparatus, device, and medium for processing storage data
CN114625695A (en) * 2022-03-29 2022-06-14 阿里巴巴(中国)有限公司 Data processing method and device
CN114924696A (en) * 2022-07-18 2022-08-19 上海有孚数迅科技有限公司 Method, apparatus, medium, and program product for storage management

Similar Documents

Publication Publication Date Title
CN114035750A (en) File processing method, device, equipment, medium and product
CN111309732B (en) Data processing method, device, medium and computing equipment
KR20200027413A (en) Method, device and system for storing data
EP4030273A1 (en) Data storage method and device
CN108519862A (en) Storage method, device, system and the storage medium of block catenary system
US11947842B2 (en) Method for writing data in append mode, device and storage medium
CN113806300B (en) Data storage method, system, device, equipment and storage medium
EP3865992A2 (en) Distributed block storage system, method, apparatus and medium
CN112540731B (en) Data append writing method, device, equipment, medium and program product
CN115794669A (en) Method, device and related equipment for expanding memory
WO2024021470A1 (en) Cross-region data scheduling method and apparatus, device, and storage medium
CN115291806A (en) Processing method, processing device, electronic equipment and storage medium
EP4120060A1 (en) Method and apparatus of storing data,and method and apparatus of reading data
CN117082046A (en) Data uploading method, device, equipment and storage medium
US12007965B2 (en) Method, device and storage medium for deduplicating entity nodes in graph database
CN115617802A (en) Method and device for quickly generating full snapshot, electronic equipment and storage medium
CN113051244B (en) Data access method and device, and data acquisition method and device
CN111966845B (en) Picture management method, device, storage node and storage medium
CN110895520B (en) File migration method, related device and equipment
CN112783804A (en) Data access method, device and storage medium
WO2024001863A1 (en) Data processing method and related device
WO2023155703A1 (en) Workload feature extraction method and apparatus
US11977785B2 (en) Non-volatile memory device-assisted live migration of virtual machine data
CN114528258B (en) Asynchronous file processing method, device, server, medium, product and system
CN117520273A (en) Data merging method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination