CN113253945B - Disk coiling clustering method and device and electronic equipment - Google Patents

Info

Publication number
CN113253945B
Authority
CN
China
Prior art keywords
sector
operation information
data operation
data
space
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110772044.6A
Other languages
Chinese (zh)
Other versions
CN113253945A (en)
Inventor
谢蜀岷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Yiwo Tech Development Co ltd
Original Assignee
Chengdu Yiwo Tech Development Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Yiwo Tech Development Co ltd filed Critical Chengdu Yiwo Tech Development Co ltd
Priority to CN202110772044.6A priority Critical patent/CN113253945B/en
Publication of CN113253945A publication Critical patent/CN113253945A/en
Application granted granted Critical
Publication of CN113253945B publication Critical patent/CN113253945B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/0644 Management of space entities, e.g. partitions, extents, pools
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061 Improving I/O performance

Abstract

The application discloses a disk volume clustering method, a disk volume clustering device and electronic equipment, wherein the method comprises the following steps: obtaining a data operation set of each file in the disk volume, wherein the data operation set of a file comprises at least one piece of data operation information, each piece of data operation information corresponds to a physical sector count, a virtual sector count and a sector length, and the disk volume currently uses a first-value number of sectors as a cluster; marking the data operation information in the data operation set as at least a first set or a second set, at least according to the second-value number of sectors in the target cluster; wherein the physical sector count and the virtual sector count of the data operation information in the first set are both evenly divisible by the second value, and the physical sector count or the virtual sector count of the data operation information in the second set is not evenly divisible by the second value; and moving the sectors corresponding to the data operation information in the second set, so that the moved files are stored in the disk volume according to the target cluster.

Description

Disk volume clustering method and device and electronic equipment
Technical Field
The present application relates to the field of disk storage technologies, and in particular, to a disk volume clustering method and apparatus, and an electronic device.
Background
In disk space, files are stored with the cluster as the basic unit, and the size of the cluster is related to the specification of the disk. For example, a floppy disk typically has 1 sector per cluster, while the number of sectors per cluster on a hard disk depends on the total capacity of the hard disk and may be 4 sectors, 64 sectors, and so on. Users who want to optimize their use of disk space may wish to adjust the clusters of a given volume to an optimal size, that is, to change the cluster size, without data loss.
In the current clustering scheme, a copy method is usually used to copy files in a designated volume into a free space of a current volume according to a clustering format, and then meta information of the volume is reconstructed to achieve the clustering effect.
However, in such a scheme, since all data in the volume needs to be copied, clustering efficiency may be low.
Disclosure of Invention
In view of the above, the present application provides a method, an apparatus and an electronic device for disk volume clustering, so as to solve the technical problem of low clustering efficiency in the prior art, as follows:
the application provides a disk volume clustering method, which comprises the following steps:
obtaining a data operation set of each file in the disk volume; the data operation set of the file comprises at least one piece of data operation information, and the data operation information corresponds to physical sector count, virtual sector count and sector length; the disk volume takes the sector number of a first value as a current cluster;
according to the physical sector count, the virtual sector count and the sector length of each piece of data operation information, at least according to the number of sectors with a second value in a target cluster, marking the data operation information in the data operation set as a first set or a second set;
wherein the physical sector count and the virtual sector count of the data operation information corresponding to the first set are both divisible by the second value, and the physical sector count or the virtual sector count of the data operation information corresponding to the second set cannot be divisible by the second value;
and moving the sector corresponding to the data operation information corresponding to the second set, so that the moved file is stored in the disk volume according to the target cluster.
In the above method, preferably, the disk volume has a current management space and a current data space, a start sector position of the current data space is connected to an end sector position of the current management space, the current management space is used to store management data information of the disk volume, and the current data space is used to store data of the file;
according to the physical sector count, the virtual sector count and the sector length of each piece of data operation information, at least according to the number of sectors with the second value in the target cluster, marking the data operation information in the disk volume as at least a first set or a second set, and at least comprising:
determining a target management space of the disk volume according to the number of sectors of a second value in a target cluster, wherein the target management space is consistent with the initial sector position of the current management space;
under the condition that the position of an ending sector of the target management space is larger than that of the ending sector of the current management space, dividing data operation information corresponding to the current data space according to the position of the ending sector of the target management space to obtain a management set and a data set, wherein the data operation information in the management set corresponds to the target management space, and the data operation information in the data set corresponds to the target data space;
under the condition that the position of the ending sector of the target management space is less than or equal to the position of the ending sector of the current management space, dividing the data operation information corresponding to the current data space into the data set, wherein the management set is empty;
screening out, from the data operation information of the data set, first data operation information whose physical sector count and virtual sector count are each evenly divisible by the second value;
marking the first data operation information as a first set;
marking second data operation information except the first data operation information in the data set as a second set;
before moving the sector corresponding to the data operation information corresponding to the second set, the method further includes:
and moving the file corresponding to the data operation information in the management set to the free space of the disk volume.
Preferably, after marking the second data operation information in the data set except for the first data operation information as a second set, the method further includes:
obtaining third data operation information that would result if the sector corresponding to the second data operation information in the second set were moved by the target value in sectors; the target value is a target remainder obtained by dividing the physical sector count in the second data operation information by the second value, or the target value is a difference obtained by subtracting the target remainder from the second value;
under the condition that an overlapping sector area exists between a sector corresponding to the third data operation information and a sector corresponding to the data operation information in the first set, a target sector area with the sector length matched with the overlapping sector area is cut out from the sector corresponding to the second data operation information, and the target sector area corresponds to fourth data operation information;
marking the fourth data operation information as a third set;
wherein the moving the sector corresponding to the data operation information corresponding to the second set includes:
moving the sectors corresponding to the residual data operation information in the second set forwards or backwards according to the front and back sequence of the sectors and the sectors corresponding to the target value;
and moving the sector corresponding to the data operation information corresponding to the third set to the free space of the disk volume.
In the above method, preferably, before screening out, from the data operation information of the data set, first data operation information that satisfies that the physical sector count and the virtual sector count are each evenly divided by the second value, the method further includes:
judging whether the fifth data operation information corresponds to an idle sector on the last target cluster; the fifth data operation information is the last data operation information in the management set;
if the fifth data operation information corresponds to a first idle sector on the last target cluster, in a sector corresponding to sixth data operation information corresponding to the fifth data operation information, segmenting a first sector area with the sector length matched with the first idle sector, wherein the first sector area corresponds to seventh data operation information; the sixth data operation information is the next data operation information corresponding to the fifth data operation information in the data operation set where the fifth data operation information is located;
marking the seventh data operation information as the management set.
Preferably, before dividing the data operation information in the current data space according to the ending sector position of the target management space, the method further includes:
judging whether the current data space corresponds to data operation information meeting a space crossing condition; wherein the space crossing condition comprises: the starting sector position of the data operation information is smaller than the ending sector position of the target management space, and the ending sector position of the data operation information is larger than the ending sector position of the target management space;
if the data operation information meeting the space crossing condition exists, dividing the data operation information meeting the space crossing condition into: eighth data operation information and ninth data operation information; the steps are executed: dividing the data operation information corresponding to the current data space according to the ending sector position of the target management space to obtain a management set and a data set;
wherein, the ending sector position of the eighth data operation information is consistent with the ending sector position of the target management space, and the starting sector position of the ninth data operation information is consistent with the starting sector position of the target data space;
and if the data operation information meeting the space crossing condition does not exist, dividing the data operation information corresponding to the current data space according to the position of the ending sector of the target management space to obtain a management set and a data set.
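The division into a management set and a data set, including the splitting of a data run that crosses the boundary, can be illustrated with a minimal Python sketch. This is only one possible reading, not the applicant's implementation. The function name `partition_runs`, the tuple layout `(PSN, VSN, length)` and the sample sector positions are assumptions made for illustration. The sketch also assumes that sector positions are inclusive indices and takes the boundary to be the first sector of the target data space, so a run that begins exactly on the last management sector and extends past it is split as well.

```python
from typing import List, Tuple

Run = Tuple[int, int, int]  # (PSN, VSN, length), all counted in sectors


def partition_runs(
    runs: List[Run], current_mgmt_end: int, target_mgmt_end: int
) -> Tuple[List[Run], List[Run]]:
    """Split data runs into (management_set, data_set).

    current_mgmt_end / target_mgmt_end are the last sector positions of the
    current and target management spaces (hypothetical parameter names).
    """
    # If the management space does not grow, nothing in the data space is displaced.
    if target_mgmt_end <= current_mgmt_end:
        return [], list(runs)

    data_start = target_mgmt_end + 1  # first sector of the target data space
    management: List[Run] = []
    data: List[Run] = []
    for psn, vsn, length in runs:
        if psn < data_start < psn + length:
            # Space-crossing run: cut it at the boundary into two runs.
            head = data_start - psn
            management.append((psn, vsn, head))                   # ends at target_mgmt_end
            data.append((data_start, vsn + head, length - head))  # starts at data_start
        elif psn < data_start:
            management.append((psn, vsn, length))
        else:
            data.append((psn, vsn, length))
    return management, data


# Illustrative values: management space grows from sectors 0..99 to 0..103.
mgmt, data = partition_runs([(100, 0, 8), (188, 8, 4)], 99, 103)
print(mgmt)  # runs that must be moved out of the target management space
print(data)  # runs that stay in the target data space
```

Runs that end up in the management set occupy sectors the enlarged management space will claim, which is why the method moves them to free space before any other sector is moved.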
The above method, preferably, after marking the first data operation information as a first set, and before marking second data operation information in the data set except the first data operation information as a second set, further includes:
judging whether the first data operation information corresponds to an idle sector on the last target cluster;
if the first data operation information corresponds to a second idle sector on the last target cluster, selecting tenth data operation information corresponding to the first data operation information from second data operation information of the second set, wherein the tenth data operation information is next data operation information of the first data operation information in the data operation set where the tenth data operation information is located;
dividing a second sector area matched with the second idle sector in a sector corresponding to the tenth data operation information, wherein the second sector area corresponds to eleventh data operation information;
marking the eleventh data operation information as the first set;
before moving the sector corresponding to the data operation information corresponding to the second set, the method further includes:
and moving the sector corresponding to the eleventh data operation information to the second idle sector.
Preferably, the method for moving the sector corresponding to the data operation information corresponding to the second set includes:
and moving the sector corresponding to the data operation information in the second set to the free space of the disk volume.
Preferably, after at least moving the sector corresponding to the data operation information corresponding to the second set, the method further includes:
generating directory data of each file according to the data operation information of the file after the sector movement;
and generating target management data of the disk volume at least according to the directory data, wherein the target management data is stored in a target management space created by the disk volume according to the disk volume.
The application also provides a disk volume clustering device of an electronic device, which comprises:
the set obtaining unit is used for obtaining a data operation set of each file in the disk volume; the data operation set of the file comprises at least one piece of data operation information, and the data operation information corresponds to physical sector count, virtual sector count and sector length; the disk volume takes the sector number of a first value as a current cluster;
an information dividing unit, configured to mark the data operation information in the data operation set as at least a first set or a second set according to a physical sector count, a virtual sector count, and a sector length of each piece of data operation information, and at least according to a sector number of a second value in a target cluster;
wherein the physical sector count and the virtual sector count of the data operation information corresponding to the first set are both divisible by the second value, and the physical sector count or the virtual sector count of the data operation information corresponding to the second set cannot be divisible by the second value;
and the file moving unit is used for moving the sector corresponding to the data operation information corresponding to the second set so that the moved file is stored in the disk volume according to the target cluster.
The present application further provides an electronic device, including:
the disk volume is used for storing files;
a processor to: obtaining a data operation set of each file in the disk volume; the data operation set of the file comprises at least one piece of data operation information, and the data operation information corresponds to physical sector count, virtual sector count and sector length; the disk volume takes the sector number of a first value as a current cluster; according to the physical sector count, the virtual sector count and the sector length of each piece of data operation information, at least according to the number of sectors with a second value in a target cluster, marking the data operation information in the data operation set as a first set or a second set; wherein the physical sector count and the virtual sector count of the data operation information corresponding to the first set are both divisible by the second value, and the physical sector count or the virtual sector count of the data operation information corresponding to the second set cannot be divisible by the second value; and moving the sector corresponding to the data operation information corresponding to the second set, so that the moved file is stored in the disk volume according to the target cluster.
As can be seen from the above technical solutions, in the disk volume clustering method, device and electronic device disclosed in the present application, the data operation information of each file in the disk volume is obtained and then divided, with reference to the target cluster, according to its physical sector count, virtual sector count and sector length. The physical sector count and the virtual sector count of the data operation information in the first set are both evenly divisible by the number of sectors in the target cluster, whereas the physical sector count or the virtual sector count of the data operation information in the second set is not. On this basis, when file sectors in the disk volume are moved for clustering, only the sectors corresponding to the data operation information in the second set need to be moved, and the sectors corresponding to data operation information whose physical sector count and virtual sector count are both evenly divisible by the number of sectors in the target cluster can stay where they are. Clustering is thus achieved without moving all sectors in the disk volume, and clustering efficiency improves because fewer sectors are moved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings without creative efforts.
FIG. 1 is a flowchart of a disk volume clustering method according to an embodiment of the present application;
FIGS. 2-4 are partial flowcharts of a disk volume clustering method according to an embodiment of the present application;
FIGS. 5-9 are further partial flowcharts of a disk volume clustering method according to an embodiment of the present application;
FIG. 10 is another flowchart of a disk volume clustering method according to an embodiment of the present application;
FIG. 11 is a schematic structural diagram of a disk volume clustering device disclosed in the second embodiment of the present application;
FIG. 12 is a schematic structural diagram of a disk volume clustering device disclosed in the second embodiment of the present application;
FIG. 13 is a schematic structural diagram of an electronic device disclosed in the third embodiment of the present application.
Detailed Description
The inventors of the present application, when studying clustering schemes for disk volumes, found that in existing cluster-changing implementations a copy method is generally used: the used content of the disk volume is copied into the free space of the disk volume according to the new cluster format, and the meta information, i.e. the management data information, is then rebuilt to achieve the cluster-changing effect. However, this method has the following drawbacks:
First, it places a requirement on disk space usage: if the free space of the disk volume is insufficient to make room for the copied data, the operation cannot be performed, and this is sometimes discovered only while the operation is in progress, resulting in data loss.
Second, and most importantly, clustering is inefficient: since all the content needs to be copied and the copying cannot be performed in a contiguous space, the overall operation efficiency is low.
In view of this, the inventor of the present application proposes a technical solution for performing cluster changing on a disk volume, so as to reduce the amount of sectors that need to be moved in the process of cluster changing, keep files of the disk volume at the original sector positions as much as possible, and completely acquire the result state of the whole moving process before starting moving the sectors, thereby avoiding the above-mentioned several disadvantages.
It should be noted that the technical solution of the present application is applicable to any File System, such as a New Technology File System (NTFS), a File Allocation Table (FAT), and the like. The present application will be described by taking the file system of NTFS as an example.
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Fig. 1 is a flowchart illustrating an implementation of a disk volume clustering method disclosed in an embodiment of the present application, where the method may be applied to an electronic device with a disk volume, such as a mobile phone, a notebook, a computer, or a server. The technical scheme in the embodiment is mainly used for improving the efficiency of disk coiling and clustering.
Specifically, the method in this embodiment may include the following steps:
Step 101: obtaining a data operation set of each file in the disk volume.
The data operation set of a file comprises at least one piece of data operation information, and each piece of data operation information corresponds to a physical sector count, a virtual sector count and a sector length; the disk volume currently uses a first-value number of sectors as a cluster.
For example, a disk volume that needs to be clustered may also be referred to as a target volume. The target volume currently uses 4 sectors as a cluster, that is, the first value is 4, and it contains three files: file1, file2, and file3. Each file consists of a plurality of sector areas, each sector area contains one or more sectors, and each sector area corresponds to one piece of data operation information, denoted data run, which records the physical sector count, virtual sector count and sector length of that sector area. The physical sector count, abbreviated PSN (Physical Sector Number), represents the physical position of the sector in the disk volume. The virtual sector count, abbreviated VSN (Virtual Sector Number), represents the relative position of the sector within the file. The sector length is the number of sectors contained in the sector area. The data operation information of a file is expressed in units of sectors and can be recorded as a linked list. The virtual sector count of each piece of data operation information is the sum of the sector lengths of all data operation information that precedes it in the corresponding data operation set.
For example, in this embodiment an initial operation set of each file is obtained first. In each piece of initial data operation information, the first value PSN is the physical sector count and the second value Length is the sector length, as follows:
File1.data_run={(100,4),(188,4),(300,12)};
File2.data_run={(104,84),(192,12),(204,84)};
File3.data_run={(288,12),(312,20)};
Then, the initial data operation information in the initial operation set of each file is preprocessed to obtain the data operation set of each file. Specifically, a virtual sector count is added to each data run, and each data run is checked for alignment with the target cluster: the head of a data run is unaligned when VSN % 8 != 0. If the head is unaligned and the data run extends beyond the end of the target cluster it starts in, the data run is split at that cluster boundary; otherwise no split is added. The original data space is thus subdivided to obtain the data operation set of each file. In the data operation information below, the first value PSN is the physical sector count, the second value VSN is the virtual sector count, and the third value Length is the sector length:
After preprocessing:
File1={(100,0,4),(188,4,4),(300,8,12)};
File2={(104,0,84),(192,84,4),(196,88,8),(204,96,84)};
File3={(288,0,12),(312,12,4),(316,16,16)};
Here, (192, 84, 4), (196, 88, 8), (312, 12, 4) and (316, 16, 16) are the data runs produced by the additional splitting. A data run is split when VSN % 8 is not 0 and the sum of that remainder and the Length is greater than the cluster size; for (188, 4, 4) the sum is exactly 8, so no additional split is added.
After this processing, each data run either starts at a VSN that is an integer multiple of the target cluster size or merely completes the cluster begun by the previous data run.
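The preprocessing described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the applicant's implementation: the function name `preprocess` and the tuple layouts `(PSN, Length)` and `(PSN, VSN, Length)` are chosen here for convenience, and the target cluster size of 8 sectors is taken from the example.

```python
from typing import List, Tuple

Run = Tuple[int, int, int]  # (PSN, VSN, Length), all counted in sectors


def preprocess(initial_runs: List[Tuple[int, int]], target_cluster: int) -> List[Run]:
    """Attach a VSN to each (PSN, Length) run and split runs with an unaligned head."""
    runs: List[Run] = []
    vsn = 0  # sum of the lengths of all preceding runs in this file
    for psn, length in initial_runs:
        remainder = vsn % target_cluster
        # Sectors still needed to complete the target cluster this run starts in.
        head = target_cluster - remainder
        if remainder != 0 and length > head:
            # Unaligned head that spills into the next cluster: cut at the boundary.
            runs.append((psn, vsn, head))
            runs.append((psn + head, vsn + head, length - head))
        else:
            runs.append((psn, vsn, length))
        vsn += length
    return runs


file2 = preprocess([(104, 84), (192, 12), (204, 84)], target_cluster=8)
print(file2)  # [(104, 0, 84), (192, 84, 4), (196, 88, 8), (204, 96, 84)]
```

The test `length > head` is equivalent to the condition in the text that the remainder plus the Length exceeds the cluster size, which is why (188, 4, 4) is left unsplit.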
It should be noted that, in this embodiment, the file directory may not be acquired when the cluster is changed.
Step 102: marking the data operation information in the data operation set as at least a first set or a second set, according to the physical sector count, the virtual sector count and the sector length of each piece of data operation information and at least according to the second-value number of sectors in the target cluster.
The physical sector count and the virtual sector count of the data operation information in the first set are both evenly divisible by the second value, and the physical sector count or the virtual sector count of the data operation information in the second set is not evenly divisible by the second value.
It should be noted that the target cluster is the cluster size to which the disk volume is to be changed. For example, the target cluster uses 8 sectors as a cluster, and the goal of cluster changing in this embodiment is to change the disk volume from 4 sectors per cluster to 8 sectors per cluster. Of course, the second value of the target cluster may be greater than or less than the first value. In this embodiment, the second value is greater than the first value.
Specifically, in this embodiment, a first remainder is calculated by dividing the physical sector count in the data operation information by the second value, and a second remainder is calculated by dividing the virtual sector count in the data operation information by the second value. If the first remainder and the second remainder are both 0, that is, the physical sector count and the virtual sector count are both evenly divisible by the second value, the data operation information is marked as the first set. Data operation information for which the first remainder or the second remainder is not 0 may be marked as the second set.
For example, the physical sector count and the virtual sector count in each data run information corresponding to File1, File2, and File3 are divided by 8, respectively, to obtain data run information in which the physical sector count and the virtual sector count can be divided by 8: (104, 0, 84) and (288, 0, 12), the two data run information are marked as a first set, and the other data run information is marked as a second set.
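The marking in this example can be reproduced with a short Python sketch. It is an illustration only: the function name `classify` and the `(PSN, VSN, Length)` tuple layout are assumptions, and the sketch covers just the basic divisibility test, not the later refinements involving the management space or free sectors.

```python
from typing import List, Tuple

Run = Tuple[int, int, int]  # (PSN, VSN, Length), all counted in sectors


def classify(runs: List[Run], target_cluster: int) -> Tuple[List[Run], List[Run]]:
    """Return (first_set, second_set) for a target cluster of `target_cluster` sectors."""
    first_set: List[Run] = []
    second_set: List[Run] = []
    for run in runs:
        psn, vsn, _length = run
        if psn % target_cluster == 0 and vsn % target_cluster == 0:
            first_set.append(run)   # already aligned: stays in place
        else:
            second_set.append(run)  # unaligned: its sectors must be moved
    return first_set, second_set


# Data operation sets of File1, File2 and File3 after preprocessing.
all_runs = [
    (100, 0, 4), (188, 4, 4), (300, 8, 12),
    (104, 0, 84), (192, 84, 4), (196, 88, 8), (204, 96, 84),
    (288, 0, 12), (312, 12, 4), (316, 16, 16),
]
first_set, second_set = classify(all_runs, target_cluster=8)
print(first_set)  # [(104, 0, 84), (288, 0, 12)]
```

Note that a run such as (192, 84, 4) lands in the second set even though its PSN is a multiple of 8, because its VSN is not.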
Step 103: moving the sectors corresponding to the data operation information in the second set, so that the moved files are stored in the disk volume according to the target cluster.
In this embodiment, the sectors corresponding to the data operation information corresponding to the first set are kept stationary, and only the sectors corresponding to the data operation information corresponding to the second set are moved.
It should be noted that, after the sectors are moved, the data operation information of the files involved in the move may change. For example, after the move, the data operation information of a moved file records the physical sector count, virtual sector count and sector length of each sector area in accordance with the target cluster, and the file is then stored in the disk volume with the second-value number of sectors as one cluster.
As can be seen from the foregoing solution, in the disk volume clustering method disclosed in the first embodiment of the present application, the data operation information of each file in the disk volume is obtained and then divided, with reference to the target cluster, according to its physical sector count, virtual sector count and sector length. The physical sector count and the virtual sector count of the data operation information in the first set are both evenly divisible by the number of sectors in the target cluster, whereas the physical sector count or the virtual sector count of the data operation information in the second set is not. On this basis, when file sectors in the disk volume are moved for clustering, only the sectors corresponding to the data operation information in the second set need to be moved, and the sectors corresponding to data operation information whose physical sector count and virtual sector count are both evenly divisible by the number of sectors in the target cluster can stay where they are. Clustering is thus achieved without moving all sectors in the disk volume, and clustering efficiency improves because fewer sectors are moved.
In a specific implementation scheme, the disk volume has a current management space and a current data space, a start sector position of the current data space is connected to an end sector position of the current management space, the current management space is used for storing management data information of the disk volume, and the current data space is used for storing data of a file.
For example, sector positions 0 through 99 of the disk volume form the current management space, which stores management data, i.e., metadata, and the current data space, which stores the data of files, starts from sector position 100. A file records its sector positions through its data operation information.
Based on this, when a disk volume is clustered, a change in management space may be caused, that is: the sectors occupied by the management space may change, and the changed management space may occupy the sectors of the original data space. Therefore, in this embodiment, it is necessary to consider whether the target management space after disk volume clustering occupies the original data space. Based on this, in the present embodiment, when the data operation information in the disk volume is marked as at least the first set or the second set in step 102, the following may be implemented, as shown in fig. 2:
Step 201: Determine the target management space of the disk volume according to the number of sectors of the second value in the target cluster.
Wherein, the starting sector position of the target management space is consistent with the starting sector position of the current management space.
For example, after the cluster change the disk volume needs sector positions 0 through 103 as the target management space, and the target data space starts from sector position 104; as another example, after the cluster change the disk volume needs sector positions 0 through 89 as the target management space, and the target data space starts from sector position 90.
Step 202: judging whether the ending sector position of the target management space is larger than the ending sector position of the current management space, executing step 203 if the ending sector position of the target management space is larger than the ending sector position of the current management space, and executing step 204 if the ending sector position of the target management space is smaller than or equal to the ending sector position of the current management space.
For example, after the disk volume is clustered, 0 sector position to 103 sector position are required as the target management space, and since the 103 sector position is greater than the 99 sector position of the current management space, step 203 is executed; for another example, the disk volume needs 0 sector location to 89 sector location as the target management space after clustering, and since the 89 sector location is smaller than the 99 sector location of the current management space, step 204 is executed.
Step 203: the data operation information corresponding to the current data space is divided according to the ending sector position of the target management space to obtain a management set and a data set, and step 205 is executed.
The data operation information in the management set corresponds to a target management space, and the data operation information in the data set corresponds to the target data space.
For example, since sector position 103 is greater than sector position 99 of the current management space, the data run information of the files recorded in the current data space, which starts at sector position 100, is divided at sector position 103: the data run information corresponding to the sector area between sector position 100 and sector position 103 is placed in the management set, and the data run information corresponding to the sector area starting from sector position 104 is placed in the data set.
Step 204: The data operation information corresponding to the current data space is placed in the data set, the management set is empty, and step 205 is executed.
For example, since the 89 sector position is smaller than the 99 sector position of the current management space, the management set is empty, and the data operation information corresponding to the sector area starting from the 100 sector position is divided into the data sets.
Step 205: From the data operation information of the data set, screen out first data operation information whose physical sector count and virtual sector count are both divisible by the second value.
If the management set contains data operation information, the sectors corresponding to that data operation information become part of the target management space after the cluster change and are used to store management data, so these sectors must be moved. Therefore, in this embodiment, only the data set needs to be screened for data operation information whose sector areas may stay in place.
Based on this, in the present embodiment, data operation information whose physical sector count and virtual sector count are both divisible by the second value may be determined as the first data operation information. This is because, when the physical sector count and the virtual sector count are both divisible by the second value, the sector area corresponding to the data run information is already aligned to the number of sectors in the target cluster, and therefore the sector area corresponding to the first data run information need not be moved.
Step 206: the first data run information is marked as a first set.
Step 207: Mark the second data operation information, i.e., the data operation information in the data set other than the first data operation information, as a second set.
Based on this, by dividing the data operation information in the current data space of the disk volume according to the ending sector position of the target management space and marking it according to the result of that division, this scheme takes into account the case where the target management space after the cluster change is inconsistent with the current management space before the cluster change, which further improves the reliability of sector movement during the cluster change.
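Steps 202 to 204 can be sketched as follows. This is a minimal illustration only: `Run` and `split_by_management_end` are hypothetical names, sector positions are treated as inclusive, the physical sector count is assumed to be the start sector of a run, and runs that straddle the boundary are assumed to have been split beforehand.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Run:
    physical: int  # assumed start sector on the volume
    virtual: int   # assumed offset inside the file
    length: int    # sector length


def split_by_management_end(
    runs: List[Run], current_mgmt_end: int, target_mgmt_end: int
) -> Tuple[List[Run], List[Run]]:
    """Return (management_set, data_set) for runs of the current data space."""
    if target_mgmt_end <= current_mgmt_end:
        # Step 204: the management space does not grow, so nothing is displaced.
        return [], list(runs)
    management_set, data_set = [], []
    for run in runs:
        # Step 203: runs inside the enlarged management space must make room.
        # Assumes no run straddles target_mgmt_end.
        (management_set if run.physical <= target_mgmt_end else data_set).append(run)
    return management_set, data_set


runs = [Run(100, 0, 4), Run(104, 4, 8)]
grown = split_by_management_end(runs, current_mgmt_end=99, target_mgmt_end=103)
shrunk = split_by_management_end(runs, current_mgmt_end=99, target_mgmt_end=89)
```

With the sector positions used in the examples above, the run occupying sector positions 100 through 103 lands in the management set when the target management space ends at 103, and the management set is empty when it ends at 89.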
Based on the above implementation scheme, before step 103, that is, before moving the sector corresponding to the data operation information corresponding to the second set, the present embodiment may further include the following steps, as shown in fig. 3:
Step 104: Move the file corresponding to the data operation information in the management set to the free space of the disk volume.
Specifically, in this embodiment, the sector corresponding to the data operation information corresponding to the management set may be moved to the first free space of the disk volume, for example, a free space far away from the sector corresponding to the first data operation information in the first set.
Based on the above implementation scheme, when the sector corresponding to the data operation information in the second set is moved in step 103, the corresponding sector may be moved according to the physical sector count and the virtual sector count in the data operation information corresponding to the second set. Specifically:
In one implementation, in step 103, all sectors corresponding to the data operation information corresponding to the second set may be moved to the free space of the disk volume.
The free space of the disk volume may be marked in advance, before the cluster change. Specifically, according to the positions of the occupied sector areas in the disk volume, the space corresponding to the unoccupied sectors in the disk volume, that is, the free space, may be marked.
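One way to mark free space from the occupied sector areas is sketched below. The function name `free_extents`, the `(start, length)` tuple layout and the `total_sectors` parameter are assumptions made for illustration and are not taken from the original text.

```python
from typing import List, Tuple

Extent = Tuple[int, int]  # (start sector, sector length), an assumed layout


def free_extents(occupied: List[Extent], total_sectors: int) -> List[Extent]:
    """Return the unoccupied extents of a volume with total_sectors sectors."""
    free = []
    cursor = 0  # first sector not yet known to be occupied
    for start, length in sorted(occupied):
        if start > cursor:
            free.append((cursor, start - cursor))
        cursor = max(cursor, start + length)
    if cursor < total_sectors:
        free.append((cursor, total_sectors - cursor))
    return free


# Management space 0..99 plus two file runs on a 128-sector volume.
gaps = free_extents([(0, 100), (104, 8), (120, 4)], total_sectors=128)
```

Sorting the occupied areas first lets a single pass find every gap, including the tail of the volume.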
Specifically, in this embodiment, according to the target cluster, the sectors corresponding to the data run information in the second set may be moved to the free space of the disk volume in the order in which that information appears in its data run set, while the sectors corresponding to the data run information in the first set are not moved. At this point, the entire disk volume stores files with the second value of sectors as the target cluster.
In another implementation, in order to reduce the number of sectors moved and to maintain the continuity of occupied sectors, the sectors corresponding to part of the data operation information in the second set may be moved by fewer sectors than one cluster contains, and only the sectors corresponding to the other data operation information in the second set are moved to the free space of the disk volume.
Specifically, after marking the second data operation information in the data set except for the first data operation information as the second set in step 207, before moving the sector corresponding to the data operation information corresponding to the second set in step 103, the method in this embodiment may further include the following steps, as shown in fig. 4:
Step 208: Obtain third data operation information, i.e., the data operation information that would result if the sector corresponding to the second data operation information in the second set were moved by the target value number of sectors.
The target value is a target remainder obtained by dividing the physical sector count in the second data operation information by the second numerical value, or the target value is a difference obtained by subtracting the target remainder from the second numerical value.
It should be noted that, in this embodiment, moving the sector corresponding to the second data operation information in the second set by the target value means moving it forward or backward, in the sequence of sectors, by that number of sectors.
In a specific implementation, the target value depends on whether the move is forward or backward: if the sector corresponding to the second data run information in the second set is moved forward, the target value is the target remainder obtained by dividing the physical sector count in the second data run information by the second value; if the sector is moved backward, the target value is the difference obtained by subtracting the target remainder from the second value. It can be seen that the target value is smaller than the second value. That is, step 208 obtains the third data operation information that would result if the sector corresponding to the second data operation information in the second set were moved by fewer sectors than one cluster contains.
For example, suppose the first value is 4, the second value is 16, and the physical sector count in the second data run information is 28. To achieve alignment with the target cluster, if the sector corresponding to the second data run information is moved forward, the number of sectors to move is 12, i.e., the remainder when 28 is divided by 16 (the quotient being 1); if it is moved backward, the number of sectors to move is 4, i.e., the difference obtained by subtracting 12 from 16.
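The arithmetic of this example can be checked with a short Python sketch. The function name `target_values` is hypothetical, and the sketch assumes a physical sector count that is not already divisible by the second value.

```python
from typing import Tuple


def target_values(physical_sector_count: int, second_value: int) -> Tuple[int, int]:
    """Return (forward_move, backward_move) in sectors.

    forward_move is the target remainder; backward_move is the second value
    minus that remainder. Assumes the count is not already aligned.
    """
    remainder = physical_sector_count % second_value
    return remainder, second_value - remainder


forward, backward = target_values(28, 16)
# Moving forward lands on sector 16, moving backward lands on sector 32.
aligned_forward = 28 - forward
aligned_backward = 28 + backward
```

Both resulting positions are multiples of 16, which is what alignment with the target cluster requires of the physical sector count.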
Based on this, the third data operation information obtained in step 208 is the data operation information corresponding to the sector area that would result after the sector corresponding to the second data operation information is moved by the target value number of sectors.
It should be noted that the third data operation information obtained in step 208 describes a hypothetical move by the target value; the sector corresponding to the second data operation information is not actually moved in the disk volume at this point.
Step 209: judging whether an overlapping sector area exists between the sector corresponding to the third data operation information and the sector corresponding to the data operation information in the first set, if so, executing step 210, and if not, executing step 103, or executing step 103 after executing step 104.
Specifically, in step 209, it may be judged whether the physical sector count in the third data run information lies between the physical sector count of any piece of data run information in the first set and the sum of that physical sector count and the corresponding sector length. If the physical sector count in the third data run information is greater than the physical sector count of a certain piece of first data run information and less than the sum of the physical sector count and the sector length of that first data run information, the sector corresponding to the third data run information has an overlapping sector area with the sector corresponding to that first data run information. The overlapping sector area starts at the start sector position corresponding to the third data operation information and ends at the ending sector position corresponding to the third data operation information or to the first data operation information, whichever comes first in the sequence of sectors. For example, if the ending sector position corresponding to the third data operation information precedes the ending sector position corresponding to the first data operation information, the overlapping sector area runs from the start sector position corresponding to the third data operation information to the ending sector position corresponding to the third data operation information; if the ending sector position corresponding to the first data run information precedes the ending sector position corresponding to the third data run information, the overlapping sector area runs from the start sector position corresponding to the third data run information to the ending sector position corresponding to the first data run information.
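The start-position check described for step 209 can be sketched as follows. `Run` and `overlap_with_first` are hypothetical names; the sketch follows the comparison as stated in the text (strict inequalities on the start position of the third data run information) and does not attempt a general interval-intersection test.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Run:
    physical: int  # assumed start sector on the volume
    virtual: int   # assumed offset inside the file
    length: int    # sector length


def overlap_with_first(third: Run, first: Run) -> Optional[Tuple[int, int]]:
    """Return (start, length) of the overlapping sector area, or None."""
    first_end = first.physical + first.length  # one past the last sector
    third_end = third.physical + third.length
    if first.physical < third.physical < first_end:
        end = min(third_end, first_end)  # whichever run ends first
        return third.physical, end - third.physical
    return None


second = Run(physical=28, virtual=40, length=8)
# Hypothetical forward move by the target remainder 28 % 16 == 12.
third = Run(second.physical - 12, second.virtual, second.length)
first = Run(physical=0, virtual=0, length=20)
result = overlap_with_first(third, first)
```

Here the hypothetically moved run would start at sector 16, inside the first-set run that occupies sectors 0 through 19, so four sectors overlap.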
Step 210: From the sector corresponding to the second data operation information, cut out a target sector area whose sector length matches the overlapping sector area.
Wherein the target sector area corresponds to the fourth data operation information.
Based on this, once it is determined that moving the sector corresponding to the second data operation information would produce an overlapping sector area with a sector corresponding to data operation information in the first set, this embodiment cuts out, from the sector corresponding to the second data operation information, a target sector area corresponding to the overlapping sector area, so as to avoid overlapping sectors and the data loss caused by overwriting when sectors are moved. The target sector area here can be understood as follows: if the sector corresponding to the second data operation information were moved, the data in the target sector area would overwrite data in a sector area that is not moved.
Step 211: Mark the fourth data operation information as a third set; then step 103 may be executed, or step 103 may be executed after step 104 is executed.
That is, in the present embodiment, the third set is divided from the second set by splitting the sector corresponding to the second data run information in the second set, where the fourth data run information in the third set corresponds to the sector area where data coverage would exist if a move less than one cluster is performed, and the remaining second data run information in the second set corresponds to the sector area where data coverage would not occur if a move less than one cluster is performed.
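The split performed in steps 210 and 211 might look like the sketch below. The names `Run` and `split_overlap` are hypothetical. The sketch assumes that the target sector area is the leading part of the run, which matches the case where the overlapping sector area begins at the start sector position of the hypothetically moved run; the text does not state which part is cut in other cases.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Run:
    physical: int  # assumed start sector on the volume
    virtual: int   # assumed offset inside the file
    length: int    # sector length


def split_overlap(second: Run, overlap_length: int) -> Tuple[Run, Optional[Run]]:
    """Cut the leading overlap_length sectors out of a second-set run.

    Returns (fourth, remaining): fourth goes to the third set, remaining
    stays in the second set (None if the whole run overlaps).
    """
    if not 0 < overlap_length <= second.length:
        raise ValueError("overlap length must be within the run")
    fourth = Run(second.physical, second.virtual, overlap_length)
    if overlap_length == second.length:
        return fourth, None
    remaining = Run(
        second.physical + overlap_length,
        second.virtual + overlap_length,
        second.length - overlap_length,
    )
    return fourth, remaining


fourth, remaining = split_overlap(Run(physical=28, virtual=40, length=8), 4)
```

The virtual sector count of the remaining part advances by the same amount as its physical sector count, so the two parts still describe consecutive pieces of the same file.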
Based on this, when the sector corresponding to the data operation information corresponding to the second set is moved in step 103, the following method may be specifically implemented, as shown in fig. 5:
Step 301: Move the sectors corresponding to the remaining data operation information in the second set forward or backward, in the sequence of sectors, by the number of sectors given by the target value.
That is, the sectors corresponding to the remaining second data operation information in the second set are moved forward by the target remainder number of sectors, or they are moved backward by the difference number of sectors.
Step 302: Move the sector corresponding to the data operation information corresponding to the third set to the free space of the disk volume.
Specifically, in this embodiment, the sector corresponding to the data operation information corresponding to the third set, that is, the sector corresponding to the fourth data operation information, may be moved to a second free space of the disk volume, for example, a free space far away from the sector corresponding to the first data operation information in the first set. The second free space is different from the first free space mentioned above, but is likewise an unoccupied sector area in the disk volume.
As can be seen, in this embodiment, the second data operation information in the second set is split: the sector area that would overlap after the move is cut out, and the sectors corresponding to that overlapping sector area are moved to the free space of the disk volume. This avoids the data overwriting that could occur when the sectors corresponding to the second data operation information in the second set are moved by less than one cluster. Data loss caused by overwriting is thus avoided while the number of sectors moved is reduced and the continuity of occupied sectors is ensured, thereby improving the reliability of the cluster change.
Based on the above implementation, before step 205, namely, before screening out, from the data operation information of the data set, the first data operation information whose physical sector count and virtual sector count are both divisible by the second value, the method in this embodiment may further include the following steps, as shown in fig. 6:
Step 212: Judge whether the fifth data run information corresponds to a free sector on the last target cluster; if the fifth data run information corresponds to a first free sector on the last target cluster, execute step 213, and if the fifth data run information does not correspond to a free sector on the last target cluster, execute step 205.
The fifth data operation information is the last data operation information in the management set.
When sectors are actually moved, the sectors in the disk volume are moved with the number of sectors in the target cluster as the minimum moving unit. It is therefore necessary to judge whether the last data operation information in the management set, that is, the fifth data operation information, leaves a free sector in the last target cluster; if it does, sector completion is required.
Step 213: From the sector corresponding to the sixth data operation information corresponding to the fifth data operation information, cut out a first sector area whose sector length matches the first free sector.
The first sector area corresponds to seventh data operation information, and the sixth data operation information is the data operation information that follows the fifth data operation information in the data run set in which the fifth data operation information is located.
That is to say, in this embodiment, a first sector area capable of filling the first free sector is cut out from the next sector area of the file to which the fifth data operation information belongs, and the data operation information of the first sector area is recorded as the seventh data operation information.
Step 214: Mark the seventh data run information as belonging to the management set, and execute step 205.
Therefore, in this embodiment, in order to move the sectors in the disk volume with the number of sectors in the target cluster as the minimum moving unit and to ensure the continuity of occupied sectors, it is judged whether the last data operation information in the management set, that is, the fifth data operation information, leaves a free sector in the last target cluster. When the fifth data operation information corresponds to a first free sector in the last target cluster, a first sector area that completes the sector corresponding to the fifth data operation information is cut out from the sector corresponding to the next data operation information after the fifth data operation information, and the corresponding seventh data operation information is marked as belonging to the management set.
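A possible reading of steps 212 to 214 is sketched below. The names `Run` and `complete_last_cluster` are hypothetical, and the sketch rests on an interpretation that the text does not spell out: the first free sector is taken to be the unused tail of the last target cluster touched by the fifth data operation information, and the first sector area is taken from the start of the sixth data operation information.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Run:
    physical: int  # assumed start sector on the volume
    virtual: int   # assumed offset inside the file
    length: int    # sector length


def complete_last_cluster(
    fifth: Run, sixth: Run, second_value: int
) -> Optional[Tuple[Run, Optional[Run]]]:
    """Return (seventh, rest_of_sixth), or None if no completion is needed."""
    end = fifth.physical + fifth.length   # one past the last sector of fifth
    free = (-end) % second_value          # unused sectors in the last target cluster
    if free == 0:
        return None                       # fifth already ends on a cluster boundary
    take = min(free, sixth.length)
    seventh = Run(sixth.physical, sixth.virtual, take)
    if take == sixth.length:
        return seventh, None
    rest = Run(sixth.physical + take, sixth.virtual + take, sixth.length - take)
    return seventh, rest


fifth = Run(physical=100, virtual=0, length=4)   # ends at sector 103
sixth = Run(physical=200, virtual=4, length=20)  # next run of the same file
completion = complete_last_cluster(fifth, sixth, second_value=16)
```

With a 16-sector target cluster, the run ending at sector 103 leaves sectors 104 through 111 unused, so eight sectors are cut from the following run and added to the management set.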
Based on the above implementation, before step 203, that is, before dividing the data operation information in the current data space according to the ending sector position of the target management space, the present embodiment may further include the following steps, as shown in fig. 7:
Step 215: Judge whether the current data space contains data operation information that satisfies the space crossing condition; if so, execute step 216, and if not, execute step 203.
Wherein the space crossing condition includes: the start sector position of the data operation information is smaller than the end sector position of the target management space, and the end sector position of the data operation information is larger than the end sector position of the target management space.
That is to say, in this embodiment, it is determined whether there is a sector area spanning the clustered target management space and the clustered target data space, and if there is a sector area spanning the clustered target management space and the clustered target data space, the sector area needs to be segmented.
Step 216: Divide the data operation information that satisfies the space crossing condition into eighth data operation information and ninth data operation information, and execute step 203.
The ending sector position of the eighth data operation information is consistent with the ending sector position of the target management space, and the starting sector position of the ninth data operation information is consistent with the starting sector position of the target data space.
Specifically, in this embodiment, the sector area corresponding to the data operation information that meets the space crossing condition is segmented, that is, the segmentation is performed according to the ending sector position of the target management space, so that the two segmented sector areas correspond to the target management space and the target data space, and based on this, after step 203 is executed, the two segmented sector areas can be marked as a management set and a data set, respectively.
As can be seen, in this embodiment, by determining whether a sector area spans the target management space and the target data space after the cluster change, the sector area that spans the target management space and the target data space is segmented, thereby avoiding a situation that the data operation information corresponding to the sector area cannot be marked, and improving the reliability of the cluster change.
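Steps 215 and 216 can be sketched as follows. `Run` and `split_crossing` are hypothetical names, sector positions are treated as inclusive, and the target data space is assumed to start at the sector immediately after the ending sector position of the target management space.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Run:
    physical: int  # assumed start sector on the volume
    virtual: int   # assumed offset inside the file
    length: int    # sector length


def split_crossing(run: Run, target_mgmt_end: int) -> Optional[Tuple[Run, Run]]:
    """Split a run that crosses target_mgmt_end into (eighth, ninth)."""
    start = run.physical
    end = run.physical + run.length - 1  # inclusive ending sector position
    # Space crossing condition as stated in the text: start < boundary < end.
    if not (start < target_mgmt_end < end):
        return None
    head_length = target_mgmt_end - start + 1
    eighth = Run(start, run.virtual, head_length)
    ninth = Run(target_mgmt_end + 1, run.virtual + head_length, run.length - head_length)
    return eighth, ninth


# A run occupying sectors 100..109 crosses a target management space ending at 103.
parts = split_crossing(Run(100, 0, 10), target_mgmt_end=103)
```

After the split, the first part ends exactly at sector position 103 and the second part starts at sector position 104, so each part can be marked as belonging to the management set or the data set in step 203.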
Based on the above implementation, after step 206, before step 207, that is, after the first data run information is marked as the first set, and before the second data run information in the data set other than the first data run information is marked as the second set in the present embodiment, the following steps may also be included, as shown in fig. 8:
Step 217: Judge whether the first data run information corresponds to a free sector in the last target cluster; if the first data run information corresponds to a second free sector in the last target cluster, execute step 218, and if the first data run information does not correspond to a second free sector in the last target cluster, execute step 207.
Specifically, in this embodiment, after step 206, it may be sequentially determined whether each first data operation information corresponding to the first set corresponds to a free sector on the last target cluster.
Step 218: Select, from the second data operation information of the second set, tenth data operation information corresponding to the first data operation information.
The tenth data operation information is the data operation information that follows the first data operation information in the data run set in which the first data operation information is located.
Step 219: From the sector corresponding to the tenth data operation information, cut out a second sector area that matches the second free sector.
The second sector area corresponds to eleventh data operation information.
That is to say, in this embodiment, when the first data operation information corresponds to a free sector on the last target cluster, a second sector area capable of filling the second free sector is cut out from the next sector area of the file to which the first data operation information belongs, and the data operation information of the second sector area is recorded as the eleventh data operation information.
Step 220: the eleventh data run information is marked as the first set.
Based on this, after step 220, before moving the sector corresponding to the data operation information corresponding to the second set in step 103, the method in this embodiment may further include the following steps, as shown in fig. 9:
Step 105: Move the sector corresponding to the eleventh data operation information to the second free sector.
It should be noted that, in this embodiment, steps 217 to 219 may be sequentially executed, that is, according to the above scheme of completing the sectors of the last target cluster, sector completion is sequentially performed on the sectors corresponding to all the first data operation information in the first set according to the target cluster, and thus, continuity of occupied sectors may be ensured through sector completion.
Based on the above implementation, after step 103, the method in this embodiment may further include the following steps, as shown in fig. 10:
Step 106: Generate directory data of each file according to the data operation information of the file after the sector movement.
Step 107: target management data of the disk volume is generated based on at least the directory data.
Wherein the target management data is stored in a target management space created by the disk volume.
Referring to fig. 11, a schematic structural diagram of a disk volume cluster changing apparatus provided in the second embodiment of the present application is shown. The apparatus may be configured in an electronic device having a disk volume, such as a mobile phone, a notebook, a computer, or a server. The technical solution in this embodiment is mainly used to improve the efficiency of disk volume cluster changing.
Specifically, the apparatus in this embodiment may include the following functional units:
a set obtaining unit 1101, configured to obtain a data run set of each file in the disk volume; the data operation set of the file comprises at least one piece of data operation information, and the data operation information corresponds to physical sector count, virtual sector count and sector length; the disk volume takes the sector number of a first value as a current cluster;
an information dividing unit 1102, configured to mark the data operation information in the data operation set as at least a first set or a second set according to the physical sector count, the virtual sector count, and the sector length of each piece of data operation information, and at least according to the number of sectors of a second value in a target cluster;
wherein the physical sector count and the virtual sector count of the data operation information corresponding to the first set are both divisible by the second value, and the physical sector count or the virtual sector count of the data operation information corresponding to the second set cannot be divisible by the second value;
a file moving unit 1103, configured to move a sector corresponding to the data operation information corresponding to the second set, so that the moved file is stored in the disk volume according to the target cluster.
As can be seen from the foregoing solution, in the disk volume cluster changing apparatus disclosed in the second embodiment of the present application, the data operation information of each file in the disk volume is obtained and then divided into sets according to its physical sector count, virtual sector count and sector length and according to the target cluster. The physical sector count and the virtual sector count of the data operation information in the first set are both divisible by the number of sectors in the target cluster, whereas the physical sector count or the virtual sector count of the data operation information in the second set is not divisible by the number of sectors in the target cluster. On this basis, when file sectors in the disk volume are moved because of the cluster change, only the sectors corresponding to the data operation information in the second set need to be moved, and the sectors corresponding to the data operation information whose physical sector count and virtual sector count are both divisible by the number of sectors in the target cluster can stay in place. The cluster change can thus be achieved without moving all sectors in the disk volume, and reducing the number of moved sectors improves cluster changing efficiency.
In one implementation, the disk volume has a current management space and a current data space, a start sector position of the current data space is connected to an end sector position of the current management space, the current management space is used for storing management data information of the disk volume, and the current data space is used for storing data of the file;
the information dividing unit 1102 is at least configured to: determine a target management space of the disk volume according to the number of sectors of the second value in the target cluster, wherein the starting sector position of the target management space is consistent with the starting sector position of the current management space; in the case that the ending sector position of the target management space is greater than the ending sector position of the current management space, divide the data operation information corresponding to the current data space according to the ending sector position of the target management space to obtain a management set and a data set, wherein the data operation information in the management set corresponds to the target management space, and the data operation information in the data set corresponds to the target data space; in the case that the ending sector position of the target management space is less than or equal to the ending sector position of the current management space, place the data operation information corresponding to the current data space in the data set, the management set being empty; screen out, from the data operation information of the data set, first data operation information whose physical sector count and virtual sector count are both divisible by the second value; mark the first data operation information as a first set; and mark second data operation information other than the first data operation information in the data set as a second set;
before moving the sector corresponding to the data operation information corresponding to the second set, the file moving unit 1103 is further configured to: and moving the file corresponding to the data operation information in the management set to the free space of the disk volume.
In one implementation, after marking the second data operation information in the data set, other than the first data operation information, as a second set, the information dividing unit 1102 is further configured to: obtain third data operation information, which is the data operation information that results when the sectors corresponding to the second data operation information in the second set are moved by a target value of sectors; the target value is a target remainder obtained by dividing the physical sector count in the second data operation information by the second value, or the target value is the difference obtained by subtracting the target remainder from the second value; in a case where an overlapping sector area exists between the sectors corresponding to the third data operation information and the sectors corresponding to the data operation information in the first set, cut out, from the sectors corresponding to the second data operation information, a target sector area whose sector length matches the overlapping sector area, the target sector area corresponding to fourth data operation information; and mark the fourth data operation information as a third set;
the file moving unit 1103 is specifically configured to: move the sectors corresponding to the remaining data operation information in the second set forward or backward, in sector order, by the number of sectors given by the target value; and move the sectors corresponding to the data operation information in the third set to the free space of the disk volume.
In one implementation, before screening out, from the data operation information of the data set, the first data operation information whose physical sector count and virtual sector count are each divisible by the second value, the information dividing unit 1102 is further configured to: judge whether fifth data operation information leaves an idle sector in its last target cluster, the fifth data operation information being the last data operation information in the management set; if the fifth data operation information leaves a first idle sector in its last target cluster, split off, from the sectors corresponding to sixth data operation information, a first sector area whose sector length matches the first idle sector, the first sector area corresponding to seventh data operation information, wherein the sixth data operation information is the data operation information that follows the fifth data operation information in the data operation set to which the fifth data operation information belongs; and mark the seventh data operation information as belonging to the management set.
In one implementation, before dividing the data operation information corresponding to the current data space according to the end sector position of the target management space, the information dividing unit 1102 is further configured to: judge whether the current data space contains data operation information that meets a space crossing condition, wherein the space crossing condition comprises: the start sector position of the data operation information is smaller than the end sector position of the target management space, and the end sector position of the data operation information is larger than the end sector position of the target management space; if such data operation information exists, split it into eighth data operation information and ninth data operation information, and then perform the step of dividing the data operation information corresponding to the current data space according to the end sector position of the target management space to obtain a management set and a data set, wherein the end sector position of the eighth data operation information is consistent with the end sector position of the target management space, and the start sector position of the ninth data operation information is consistent with the start sector position of the target data space; and if no data operation information meets the space crossing condition, directly divide the data operation information corresponding to the current data space according to the end sector position of the target management space to obtain a management set and a data set.
In one implementation, after marking the first data operation information as a first set, and before marking second data operation information in the data set, other than the first data operation information, as a second set, the information dividing unit 1102 is further configured to: judge whether the first data operation information leaves an idle sector in its last target cluster; if the first data operation information leaves a second idle sector in its last target cluster, select, from the second data operation information of the second set, tenth data operation information corresponding to the first data operation information, wherein the tenth data operation information is the data operation information that follows the first data operation information in the data operation set to which the first data operation information belongs; split off, from the sectors corresponding to the tenth data operation information, a second sector area matching the second idle sector, the second sector area corresponding to eleventh data operation information; and mark the eleventh data operation information as belonging to the first set;
before moving the sectors corresponding to the data operation information in the second set, the file moving unit 1103 is further configured to: move the sectors corresponding to the eleventh data operation information to the second idle sector.
In one implementation, the file moving unit 1103 is specifically configured to: move the sectors corresponding to the data operation information in the second set to the free space of the disk volume.
In one implementation, the apparatus in this embodiment further includes the following units, as shown in fig. 12:
a data generating unit 1104, configured to, after the file moving unit 1103 has moved at least the sectors corresponding to the data operation information in the second set, generate directory data of each file according to the data operation information of the file after the sector movement, and generate target management data of the disk volume at least according to the directory data, wherein the target management data is stored in a target management space created in the disk volume.
It should be noted that, for the specific implementation of each unit in the present embodiment, reference may be made to the corresponding content in the foregoing, and details are not described here.
Referring to fig. 13, a schematic structural diagram of an electronic device according to a third embodiment of the present disclosure is provided. The electronic device may be any electronic device with a disk volume, such as a mobile phone, a notebook, a computer, or a server. The technical solution in this embodiment is mainly used to improve the efficiency of changing the cluster size of a disk volume.
Specifically, the electronic device in this embodiment may include the following structure:
a disk volume 1301 for storing files;
a processor 1302 for: obtaining a data operation set of each file in the disk volume, wherein the data operation set of the file comprises at least one piece of data operation information, the data operation information corresponds to a physical sector count, a virtual sector count and a sector length, and the current cluster of the disk volume consists of a first value of sectors; marking the data operation information in the data operation set as a first set or a second set according to the physical sector count, the virtual sector count and the sector length of each piece of data operation information and at least according to the number of sectors, equal to a second value, in a target cluster, wherein the physical sector count and the virtual sector count of the data operation information corresponding to the first set are both divisible by the second value, and the physical sector count or the virtual sector count of the data operation information corresponding to the second set is not divisible by the second value; and moving the sectors corresponding to the data operation information in the second set, so that the moved files are stored in the disk volume according to the target cluster.
As can be seen from the foregoing solution, the electronic device disclosed in the third embodiment of the present application obtains the data operation information of each file in the disk volume and divides it, with respect to the target cluster, according to physical sector count, virtual sector count and sector length. In the resulting first set, the physical sector count and the virtual sector count of each piece of data operation information are both divisible by the number of sectors in the target cluster; in the resulting second set, the physical sector count or the virtual sector count is not divisible by the number of sectors in the target cluster. On this basis, when file sectors in the disk volume are moved because of the cluster change, only the sectors corresponding to the data operation information in the second set need to be moved, while the sectors corresponding to data operation information whose physical sector count and virtual sector count are both divisible by the number of sectors in the target cluster can stay in place. The cluster change is thus achieved without moving all sectors in the disk volume, and its efficiency improves because fewer sectors are moved.
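The divisibility rule that separates the first set from the second set can be illustrated with a minimal Python sketch. The function name `split_sets` and the tuple layout `(PSN, VSN, length)` are assumptions made for illustration only; the sample runs are taken from the worked example given later in this description, with a target cluster of 8 sectors.

```python
from typing import List, Tuple

# A data run as (PSN, VSN, length), all counted in sectors (assumed layout).
Run = Tuple[int, int, int]

def split_sets(runs: List[Run], target_cluster: int) -> Tuple[List[Run], List[Run]]:
    """Return (first_set, second_set) for the given target cluster size."""
    first_set, second_set = [], []
    for run in runs:
        psn, vsn, _length = run
        if psn % target_cluster == 0 and vsn % target_cluster == 0:
            first_set.append(run)   # already aligned: sectors can stay in place
        else:
            second_set.append(run)  # PSN or VSN misaligned: sectors must move
    return first_set, second_set

runs = [(104, 0, 84), (196, 88, 8), (288, 0, 12), (316, 16, 16)]
first_set, second_set = split_sets(runs, target_cluster=8)
print(first_set)
print(second_set)
```

The sketch covers only the basic test; the padding, truncation and management-set handling described in the implementations above are not included.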
Taking a cluster change on a disk volume of a server as an example, the specific cluster changing process is described as follows:
First, in this embodiment, the volume in the server whose cluster size is to be changed is analyzed and partitioned:
(1) Analysis of the volume whose cluster size is to be changed: the data runs of all files are acquired (hereinafter referred to as data run), and every data run is recorded in a linked list in units of sectors; the content of the directories does not need to be acquired.
(2) The data run set of each file (i.e., the data operation set in the foregoing) is preprocessed first, as follows: 1) a virtual sector count is introduced for each data run and recorded as VSN; 2) if the remainder of the VSN divided by the target cluster size is not zero, and the sum of that remainder and the run length is greater than the target cluster size, the data run is split (which may be understood as adding one linked list entry).
(3) The whole space of the volume is divided, according to the size of the target cluster, into a management space (hereinafter referred to as meta space, used for storing meta data) and a data space (hereinafter referred to as data space); the meta space may change after the cluster size of the file system changes. If the meta space does not change after the cluster change, the meta set is empty.
If the sectors corresponding to a data run span the meta space and the data space, the data run is split at the boundary and one linked list entry is added, so that no single data run lies in both the meta space and the data space.
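The boundary split in step (3) can be sketched as follows, assuming a data run is held as a `(PSN, VSN, length)` tuple and that `boundary` is the end sector position of the target meta space. The helper name `split_at_boundary` and the run `(100, 0, 12)` are hypothetical and used only for illustration.

```python
def split_at_boundary(run, boundary):
    """Split a (PSN, VSN, length) run that straddles `boundary`.

    Returns a list with one run (no crossing) or two runs (crossing).
    """
    psn, vsn, length = run
    end = psn + length  # first sector after the run
    if psn < boundary < end:
        head = boundary - psn  # sectors that fall inside the meta space
        return [(psn, vsn, head), (boundary, vsn + head, length - head)]
    return [run]

# Hypothetical run crossing a target meta space that ends at sector 104.
print(split_at_boundary((100, 0, 12), 104))
# A run that ends exactly at the boundary is not split.
print(split_at_boundary((100, 0, 4), 104))
```

After such a split, the head piece can be assigned to the meta set and the tail piece to the data set without any run belonging to both.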
Then, a space allocation is determined:
(4) The entire data set is marked onto the whole space in units of the target cluster size.
Second, a move set for the meta data is established:
(5) The sectors corresponding to the data runs in the meta sets of all files are allocated to unoccupied positions. For example, for a data run in a meta set, new clusters are allocated in the whole space, and a move set is constructed for moving the data runs in the meta set, i.e., the management set established in the foregoing.
Next, the E set is marked, i.e., the first set in the foregoing:
(6) For a data run in a data set, if the remainders of both its PSN (physical sector count) and its VSN (virtual sector count) divided by the target cluster size are zero, the data run can be kept entirely in place. Such a data run is put into the E set and its clusters are marked directly in the destination space. The set is used only for marking the data runs of the new file; no actual movement takes place.
(7) If a data run placed in the E set does not fill its last target cluster, the following data run automatically enters the E set for the remainder (namely, the number of sectors needed to pad the occupied sectors up to the new cluster size), and this part does require data movement. This padding is marked after the E set marking of all files is completed.
In addition, the I set is marked, i.e., the second set in the foregoing:
(8) The other data runs in the data set are those whose PSN or VSN leaves a non-zero remainder when divided by the target cluster size. This part of the data needs an internal move of less than 1 cluster, and these data runs are added to the I move set. Following the padding principle above, if a data run added to the I set does not fill a whole cluster, the following data run is automatically added to the I set for the remainder (i.e., the number of sectors needed to pad the occupied sectors up to the new cluster size).
Because the E set already occupies positions in the destination space, a data run that would otherwise be added to the I set may be truncated, and the truncated part enters the next step,
namely the S set marking, i.e., the third set in the foregoing:
(9) The remaining data runs are those that should have been added to the I set but were cut off by truncation (i.e., fragments smaller than 1 cluster). These data runs need newly allocated destination space and are added to the S set within that space; it follows that a data run in the S set is always a small block of less than 1 cluster.
By constructing the E, I and S sets, the destination space is kept as contiguous as possible, and content that can stay at its original address is preserved.
Finally, a new directory is generated:
(10) After the preceding processing, the new data runs of all files have been built, so the directory description of each file can be built as well. New directory data is constructed, space is allocated for it, and that space is marked as temporarily used by the directory.
Meta data construction:
After the preceding 10 steps, the temporary space used by all file directories has been generated, so the meta data can be constructed, and the whole data layout for the changed cluster is virtually complete (see the end of this embodiment for the data runs after the move according to the new cluster size).
The cluster changing method in this embodiment is described below with a practical case:
Source: 10000 sectors // whole volume space size: 10000 sectors
Before proc: Meta: 100 sectors, 1 cluster = 4 sectors // size of meta before the cluster change: 100 sectors, cluster size 4
File1.data_run={ (100,4),(188,4),(300,12)};
File2.data_run={(104,84),(192,12),(204,84)};
File3.data_run={(288,12),(312,20)};
After proc: Meta: 104 sectors, 1 cluster = 8 sectors // size of meta after the cluster change (target cluster): 104 sectors, cluster size 8
Processing in sequence according to the cluster changing method in this embodiment gives the following results:
Preprocessing: first, it is judged whether a source data run stays within one target cluster; if so, no split is added; otherwise a split is added at the new cluster boundary (that is, a split is added when the head of the data run is not aligned). The head is misaligned when VSN % 8 != 0, in which case a split needs to be added, so the original data space is subdivided into:
data_run (PSN, VSN, Length): // data run (physical sector count PSN, virtual sector count VSN, sector length Length)
File1={(100,0,4),(188,4,4),(300,8,12)};
File2={(104,0,84),(192,84,4),(196,88,8),(204,96,84)};
File3={(288,0,12),(312,12,4),(316,16,16)};
Here, (192, 84, 4), (196, 88, 8), (312, 12, 4) and (316, 16, 16) are the data runs produced by the additional splits. The rule is that a run is split when VSN % 8 is not 0 and the sum of that remainder and the length is larger than the cluster size; for (188, 4, 4) the sum is 4 + 4 = 8, which does not exceed the cluster size, so no split is added. After this processing, each data run either starts at a VSN that is an integer multiple of the cluster size or is the complement of the previous data run, which makes the subsequent processing more convenient.
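The preprocessing step can be reproduced with a short Python sketch. It assumes each source run is given as `(PSN, length)` and that the VSN of a run is the running sum of the lengths of the preceding runs of the same file; the function name `preprocess` is hypothetical.

```python
def preprocess(runs, cluster):
    """Attach a VSN to each (PSN, length) run and split misaligned heads.

    A run is split when VSN % cluster != 0 and the remainder plus the
    run length exceeds the cluster size, as stated in the description.
    """
    out = []
    vsn = 0
    for psn, length in runs:
        rem = vsn % cluster
        if rem != 0 and rem + length > cluster:
            head = cluster - rem  # sectors that complete the previous cluster
            out.append((psn, vsn, head))
            out.append((psn + head, vsn + head, length - head))
        else:
            out.append((psn, vsn, length))
        vsn += length
    return out

file1 = preprocess([(100, 4), (188, 4), (300, 12)], 8)
file2 = preprocess([(104, 84), (192, 12), (204, 84)], 8)
file3 = preprocess([(288, 12), (312, 20)], 8)
print(file1)
print(file2)
print(file3)
```

With the source runs of File1 to File3 and a target cluster of 8 sectors, the sketch yields the same subdivided runs as listed above.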
The processing steps are as follows:
1. Determining the Meta set: the meta space grows from 100 to 104 sectors (see above, Before proc: Meta: 100 sectors; After proc: Meta: 104 sectors), so the data in sectors 100-104 needs to be moved.
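File1 = { (100, 0, 4), (188, 4, 4) }. The first data run (100, 0, 4) of File1 in the original data space falls exactly in the meta space (100-104), so it belongs to the Meta set; since it does not fill a target cluster, the following data run (188, 4, 4) is added as its complement. Other files are not within the scope of the meta space.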
2. Determining the E set. The result is as follows (+ indicates remainder accumulation, i.e., the number of sectors needed to pad the occupied sectors up to the new cluster size):
The calculation is as follows: PSN and VSN must both be divisible by 8; if not, the data run is put into the I set or the S set.
File1 = { None }. The PSN of the remaining data run (300, 8, 12) of File1 in the original data space is not divisible by 8, so this run goes into the I set or the S set and the E set of File1 is empty.
File2 = { (104, 0, 84) + (192, 84, 4) }. The data run (104, 0, 84) of File2 in the original data space is divisible and is therefore put into the E set; but since this data run does not fill its last target cluster, it must be padded to a full cluster, so the following data run (192, 84, 4) is automatically added to the E set as the complement;
File3 = { (288, 0, 12) + (312, 12, 4) }. The reasoning is the same as above.
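The padding of an E set run can be sketched in Python as below. The sketch assumes runs are `(PSN, VSN, length)` tuples and computes where the complement taken from the following run would be placed so that the last target cluster of the kept run is filled; the name `pad_from_next` and the return convention are assumptions for illustration.

```python
def pad_from_next(kept, nxt, cluster):
    """Fill the last target cluster of `kept` with sectors from `nxt`.

    Returns (moved, rest): `moved` is the complement at its destination
    (PSN directly after `kept`), `rest` is what remains of `nxt` at its
    source position. Either element may be None.
    """
    psn, vsn, length = kept
    gap = -(vsn + length) % cluster  # idle sectors in the last target cluster
    if gap == 0:
        return None, nxt  # nothing to pad
    n_psn, n_vsn, n_len = nxt
    take = min(gap, n_len)
    moved = (psn + length, n_vsn, take)
    rest = (n_psn + take, n_vsn + take, n_len - take) if n_len > take else None
    return moved, rest

print(pad_from_next((104, 0, 84), (192, 84, 4), 8))
print(pad_from_next((288, 0, 12), (312, 12, 4), 8))
```

For File2 and File3 this gives destinations (188, 84, 4) and (300, 12, 4), which agree with the complements shown in the final data runs of this example. The sketch does not check that the destination sectors are actually free.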
3. Determining the I set (the I set is moved backward in this example). The result is as follows (+ indicates remainder accumulation, i.e., the number of sectors needed to pad the occupied sectors up to the new cluster size). The data of this set is moved backward as a whole by 4 sectors; after the move it is aligned to the new cluster size 8 and needs no further splitting, and the move distance is shorter than 8 sectors.
File1 = { (300, 8, 12) }. This data run needs to be moved backward to (304, 12); the file has no content after it, so the cluster remainder (316, 4) is also attributed to File1 by default.
File2 = { (196, 88, 8), (204, 96, 80) }. The run (196, 88, 8) needs to move 4 sectors backward to (200, 8). The original data run (204, 96, 84) would move 4 sectors backward to (208, 84), but since (288, 0, 12) already exists in the E set, the destination can only accommodate 80 sectors; a split is therefore generated, and the trailing data run (284, 176, 4) is added to the S set.
File3 = { (316, 16, 16) }. This data run needs to be moved backward to (320, 16).
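The backward move of an I set run and its truncation against the E set can be sketched in Python as follows. The sketch assumes runs are `(PSN, VSN, length)` tuples, that the kept E set extents are given as `(start, length)` pairs in the destination space, and that only the tail of a moved run can collide with a kept extent; the name `shift_back` is hypothetical.

```python
def shift_back(run, cluster, kept_extents):
    """Move a run backward to the next cluster boundary.

    Returns (moved, spill): `moved` is the run at its new position,
    `spill` is the source-position remainder that no longer fits and
    would go to the S set (None if everything fits).
    """
    psn, vsn, length = run
    shift = (cluster - psn % cluster) % cluster  # second value minus remainder
    new_psn = psn + shift
    end = new_psn + length
    for k_start, _k_len in kept_extents:
        if new_psn < k_start < end:  # tail would overlap a kept extent
            fit = k_start - new_psn
            return (new_psn, vsn, fit), (psn + fit, vsn + fit, length - fit)
    return (new_psn, vsn, length), None

# E set extents of this example after padding: sectors 104-192 and 288-304.
kept = [(104, 88), (288, 16)]
print(shift_back((196, 88, 8), 8, kept))
print(shift_back((204, 96, 84), 8, kept))
```

For (204, 96, 84) the sketch keeps 80 sectors at (208, 96, 80) and spills (284, 176, 4), matching the split described for File2.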
4. Determining the S set, as described above. The result is as follows (+ indicates remainder accumulation, i.e., the number of sectors needed to pad the occupied sectors up to the new cluster size). The runs in this set are not aligned to the new cluster size 8 and cannot be placed in the I set either, because the I set positions are occupied (i.e., the positions both in front and behind are occupied).
File1={None};
File3={None};
File2 = { (284, 176, 4) }. This run should have been put into the I set, but because it cannot be accommodated next to the E set and the I set, it is put into the S set and allocated separately.
The Meta set and the S set are newly allocated; the E set data stays at its original address; all the I set data is moved backward by 4 sectors (moving all the I set data backward is a convention adopted only for this example; in practice the I set data may be moved forward or backward); and every remainder part is moved by an absolute move to the tail of the data run it complements.
Thus, the data runs after the move according to the new cluster size are as follows:
File1={(1000,0,4),(1004,4,4),(304,8,12)};
File2={(104,0,84)+(188,84,4),(200,88,8),(208,96,80),(1008,176,4)};
File3={(288,0,12)+(300,12,4),(320,16,16)}.
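The final layout can be checked against the target cluster size with a small Python sketch. It assumes that a file is stored according to the target cluster when, for every run, PSN and VSN have the same offset within a cluster, and any run that starts in the middle of a cluster directly continues the previous run. This criterion and the name `is_cluster_aligned` are assumptions for illustration, not wording from the embodiment.

```python
def is_cluster_aligned(runs, cluster):
    """Check that (PSN, VSN, length) runs map whole clusters to whole clusters."""
    prev = None
    for psn, vsn, length in runs:
        if (psn - vsn) % cluster != 0:
            return False  # physical and virtual offsets within a cluster differ
        if vsn % cluster != 0:
            # A run starting mid-cluster must continue the previous run.
            if prev is None or prev[0] + prev[2] != psn or prev[1] + prev[2] != vsn:
                return False
        prev = (psn, vsn, length)
    return True

after = {
    "File1": [(1000, 0, 4), (1004, 4, 4), (304, 8, 12)],
    "File2": [(104, 0, 84), (188, 84, 4), (200, 88, 8), (208, 96, 80), (1008, 176, 4)],
    "File3": [(288, 0, 12), (300, 12, 4), (320, 16, 16)],
}
for name, runs in after.items():
    print(name, is_cluster_aligned(runs, 8))
```

The check does not look for overlaps between different files; it only confirms the per-file alignment of the runs listed above.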
The embodiments in this description are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to each other. Since the apparatus disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is brief, and the relevant points can be found in the description of the method.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in Random Access Memory (RAM), memory, Read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A disk volume cluster changing method, the method comprising:
obtaining a data operation set of each file in the disk volume; the data operation set of the file comprises at least one piece of data operation information, and the data operation information corresponds to a physical sector count, a virtual sector count and a sector length; the current cluster of the disk volume consists of a first value of sectors;
marking the data operation information in the data operation set as a first set or a second set according to the physical sector count, the virtual sector count and the sector length of each piece of data operation information and at least according to the number of sectors, equal to a second value, in a target cluster;
wherein the physical sector count and the virtual sector count of the data operation information corresponding to the first set are both divisible by the second value, and the physical sector count or the virtual sector count of the data operation information corresponding to the second set is not divisible by the second value;
and keeping the sectors corresponding to the data operation information in the first set in place, and moving the sectors corresponding to the data operation information in the second set, so that the moved files are stored in the disk volume according to the target cluster.
2. The method of claim 1, wherein the disk volume has a current management space and a current data space, a start sector position of the current data space is connected to an end sector position of the current management space, the current management space is used for storing management data information of the disk volume, and the current data space is used for storing data of the file;
wherein marking the data operation information in the disk volume as at least a first set or a second set, according to the physical sector count, the virtual sector count and the sector length of each piece of data operation information and at least according to the number of sectors, equal to the second value, in the target cluster, at least comprises:
determining a target management space and a target data space of the disk volume according to the number of sectors of a second value in a target cluster, wherein the starting sector position of the target management space is consistent with the starting sector position of the current management space;
under the condition that the position of an ending sector of the target management space is larger than that of the ending sector of the current management space, dividing data operation information corresponding to the current data space according to the position of the ending sector of the target management space to obtain a management set and a data set, wherein the data operation information in the management set corresponds to the target management space, and the data operation information in the data set corresponds to the target data space;
in a case where the end sector position of the target management space is less than or equal to the end sector position of the current management space, assigning the data operation information corresponding to the current data space to the data set, the management set being empty;
screening out, from the data operation information of the data set, first data operation information whose physical sector count and virtual sector count are each divisible by the second value;
marking the first data operation information as a first set;
marking second data operation information except the first data operation information in the data set as a second set;
before moving the sectors corresponding to the data operation information in the second set, the method further comprises:
moving the files corresponding to the data operation information in the management set to the free space of the disk volume.
3. The method of claim 2, wherein after marking second data operation information in the data set other than the first data operation information as a second set, the method further comprises:
obtaining third data operation information, the third data operation information being the data operation information that would correspond to the sectors of second data operation information in the second set if those sectors were moved by a target value of sectors; the target value is a target remainder obtained by dividing the physical sector count in the second data operation information by the second value, or the target value is the difference obtained by subtracting the target remainder from the second value;
under the condition that an overlapping sector area exists between a sector corresponding to the third data operation information and a sector corresponding to the data operation information in the first set, a target sector area with the sector length matched with the overlapping sector area is cut out from the sector corresponding to the second data operation information, and the target sector area corresponds to fourth data operation information;
marking the fourth data operation information as a third set;
wherein moving the sectors corresponding to the data operation information in the second set comprises:
moving the sectors corresponding to the remaining data operation information in the second set forward or backward, in sector order, by the number of sectors given by the target value;
and moving the sectors corresponding to the data operation information in the third set to the free space of the disk volume.
4. The method of claim 2, wherein before screening out, from the data operation information of the data set, first data operation information whose physical sector count and virtual sector count are each divisible by the second value, the method further comprises:
judging whether fifth data operation information leaves an idle sector in its last target cluster; the fifth data operation information is the last data operation information in the management set;
if the fifth data operation information leaves a first idle sector in its last target cluster, splitting off, from the sectors corresponding to sixth data operation information, a first sector area whose sector length matches the first idle sector, wherein the first sector area corresponds to seventh data operation information; the sixth data operation information is the data operation information that follows the fifth data operation information in the data operation set to which the fifth data operation information belongs;
marking the seventh data operation information as belonging to the management set.
5. The method of claim 2, wherein before dividing the data operation information in the current data space according to the end sector position of the target management space, the method further comprises:
judging whether the current data space corresponds to data operation information meeting a space crossing condition; wherein the space crossing condition comprises: the starting sector position of the data operation information is smaller than the ending sector position of the target management space, and the ending sector position of the data operation information is larger than the ending sector position of the target management space;
if data operation information meeting the space crossing condition exists, splitting that data operation information into eighth data operation information and ninth data operation information, and then performing the step of: dividing the data operation information corresponding to the current data space according to the end sector position of the target management space to obtain a management set and a data set;
wherein, the ending sector position of the eighth data operation information is consistent with the ending sector position of the target management space, and the starting sector position of the ninth data operation information is consistent with the starting sector position of the target data space;
and if the data operation information meeting the space crossing condition does not exist, dividing the data operation information corresponding to the current data space according to the position of the ending sector of the target management space to obtain a management set and a data set.
6. The method of claim 2, wherein after marking the first data operation information as a first set, and before marking second data operation information in the data set, other than the first data operation information, as a second set, the method further comprises:
judging whether the first data operation information corresponds to an idle sector on the last target cluster;
if the first data operation information corresponds to a second idle sector on the last target cluster, selecting tenth data operation information corresponding to the first data operation information from the second data operation information of the second set, wherein the tenth data operation information is the data operation information that immediately follows the first data operation information in the data operation set to which the tenth data operation information belongs;
dividing, from the sectors corresponding to the tenth data operation information, a second sector area matching the second idle sector, wherein the second sector area corresponds to eleventh data operation information;
marking the eleventh data operation information as the first set;
before moving the sector corresponding to the data operation information corresponding to the second set, the method further includes:
and moving the sector corresponding to the eleventh data operation information to the second idle sector.
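The idle sector handling of claim 6 can be sketched as follows. This is an illustrative sketch only. The `Run` record and the function names are hypothetical, and it assumes that a first-set run starts on a target cluster boundary, so that the idle sectors of its last target cluster sit directly after its last sector.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Run:
    """One piece of data operation information (hypothetical layout)."""
    physical: int  # assumed: first physical sector of the run
    virtual: int   # assumed: first virtual sector of the run inside its file
    length: int    # number of sectors covered by the run


def idle_sectors(run: Run, target: int) -> int:
    """Sectors left unused in the last target cluster occupied by the run."""
    return (-run.length) % target


def carve_filler(first: Run, nxt: Run, target: int
                 ) -> Tuple[Optional[Run], int, Optional[Run]]:
    """Carve the head of the next run so it can fill the idle tail of `first`.

    Returns (filler, destination, rest): `filler` is the carved piece at its
    current location (the "eleventh" piece), `destination` is the physical
    sector it should be moved to, and `rest` is what remains of the next run.
    """
    destination = first.physical + first.length
    take = min(idle_sectors(first, target), nxt.length)
    if take == 0:
        return None, destination, nxt
    filler = Run(nxt.physical, nxt.virtual, take)
    rest = None
    if take < nxt.length:
        rest = Run(nxt.physical + take, nxt.virtual + take, nxt.length - take)
    return filler, destination, rest
```

With a target cluster of 8 sectors, a 12-sector first-set run leaves 4 idle sectors in its last target cluster, so the first 4 sectors of the following run are carved off and moved there.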
7. The method according to claim 1 or 2, wherein moving the sector corresponding to the data operation information corresponding to the second set comprises:
and moving the sector corresponding to the data operation information in the second set to the free space of the disk volume.
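Claim 7 states only that the second-set sectors are moved to free space of the disk volume. One possible way to pick a destination is sketched below. The free-extent list, the first-fit search, and the alignment of the destination to a target cluster boundary are all assumptions for illustration, not details taken from the claims.

```python
from typing import List, Optional, Tuple

# Hypothetical free-space record: (first free sector, number of free sectors).
Extent = Tuple[int, int]


def align_up(sector: int, target: int) -> int:
    """Round a sector number up to the next multiple of the target cluster."""
    return -(-sector // target) * target


def allocate_aligned(free: List[Extent], length: int, target: int) -> Optional[int]:
    """Reserve whole target clusters for `length` sectors using first fit.

    Returns the first sector of the reserved area, or None if nothing fits.
    The `free` list is updated in place with the leftover pieces.
    """
    need = align_up(length, target)
    for i, (start, size) in enumerate(free):
        begin = align_up(start, target)
        if begin + need <= start + size:
            leftovers: List[Extent] = []
            if begin > start:
                leftovers.append((start, begin - start))
            tail = start + size - (begin + need)
            if tail > 0:
                leftovers.append((begin + need, tail))
            free[i:i + 1] = leftovers
            return begin
    return None
```

Reserving whole target clusters keeps the moved run aligned, which is one way to satisfy the requirement that the moved file be stored according to the target cluster.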
8. The method according to claim 1 or 2, wherein at least after moving the sector corresponding to the data operation information corresponding to the second set, the method further comprises:
generating directory data of each file according to the data operation information of the file after the sector movement;
and generating target management data of the disk volume at least according to the directory data, wherein the target management data is stored in a target management space created by the disk volume according to the disk volume.
9. A disk coiling clustering device, comprising:
a set obtaining unit, configured to obtain a data operation set of each file in the disk volume; the data operation set of the file comprises at least one piece of data operation information, and the data operation information corresponds to a physical sector count, a virtual sector count and a sector length; the disk volume uses a first value as the number of sectors in a current cluster;
an information dividing unit, configured to mark the data operation information in the data operation set as at least a first set or a second set according to the physical sector count, the virtual sector count, and the sector length of each piece of data operation information, and at least according to a second value that is the number of sectors in a target cluster;
wherein the physical sector count and the virtual sector count of the data operation information corresponding to the first set are both divisible by the second value, and the physical sector count or the virtual sector count of the data operation information corresponding to the second set is not divisible by the second value;
and a file moving unit, configured to keep the sector corresponding to the data operation information corresponding to the first set still, and move the sector corresponding to the data operation information corresponding to the second set, so that the moved file is stored in the disk volume according to the target cluster.
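The marking rule applied by the information dividing unit can be sketched in a few lines of Python. The `Run` record and the `mark_sets` name are hypothetical. The sketch also assumes that the physical sector count and virtual sector count denote the starting physical and virtual sector of each piece of data operation information, which the claims do not state explicitly.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Run:
    """One piece of data operation information (hypothetical layout)."""
    physical: int  # assumed: starting physical sector
    virtual: int   # assumed: starting virtual sector within the file
    length: int    # sector length of the run


def mark_sets(runs: Iterable[Run], target_sectors: int
              ) -> Tuple[List[Run], List[Run]]:
    """Split runs into (first set, second set) by target-cluster alignment.

    A run goes to the first set, and stays in place, only when both its
    physical and virtual sector counts are divisible by the number of
    sectors in a target cluster. Every other run goes to the second set
    and has to be moved.
    """
    first_set: List[Run] = []
    second_set: List[Run] = []
    for run in runs:
        aligned = (run.physical % target_sectors == 0
                   and run.virtual % target_sectors == 0)
        (first_set if aligned else second_set).append(run)
    return first_set, second_set
```

For a target cluster of 16 sectors, a run starting at physical sector 32 and virtual sector 0 is kept in place, while a run starting at physical sector 56 or at virtual sector 24 is marked for moving.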
10. An electronic device, comprising:
a disk volume, configured to store files;
a processor, configured to:
obtain a data operation set of each file in the disk volume; the data operation set of the file comprises at least one piece of data operation information, and the data operation information corresponds to a physical sector count, a virtual sector count and a sector length; the disk volume uses a first value as the number of sectors in a current cluster;
mark the data operation information in the data operation set as a first set or a second set according to the physical sector count, the virtual sector count and the sector length of each piece of data operation information, and at least according to a second value that is the number of sectors in a target cluster; wherein the physical sector count and the virtual sector count of the data operation information corresponding to the first set are both divisible by the second value, and the physical sector count or the virtual sector count of the data operation information corresponding to the second set is not divisible by the second value;
and keep the sector corresponding to the data operation information corresponding to the first set still, and move the sector corresponding to the data operation information corresponding to the second set, so that the moved file is stored in the disk volume according to the target cluster.
CN202110772044.6A 2021-07-08 2021-07-08 Disk coiling clustering method and device and electronic equipment Active CN113253945B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110772044.6A CN113253945B (en) 2021-07-08 2021-07-08 Disk coiling clustering method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN113253945A CN113253945A (en) 2021-08-13
CN113253945B CN113253945B (en) 2021-09-28

Family

ID=77191049

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110772044.6A Active CN113253945B (en) 2021-07-08 2021-07-08 Disk coiling clustering method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN113253945B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant