US20210271405A1 - Data storage method and apparatus - Google Patents

Data storage method and apparatus

Info

Publication number
US20210271405A1
Authority
US
United States
Prior art keywords
data
storage
stored data
location
stored
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
US17/325,287
Other versions
US11550486B2 (en
Inventor
Haitao Yan
Lin Lin
Mingqian Zhang
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Assigned to HUAWEI TECHNOLOGIES CO., LTD. Assignment of assignors interest (see document for details). Assignors: LIN, LIN; YAN, Haitao; ZHANG, MINGQIAN
Publication of US20210271405A1
Application granted
Publication of US11550486B2
Current legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0604 Improving or facilitating administration, e.g. storage management
    • G06F3/0605 Improving or facilitating administration, e.g. storage management by facilitating the interaction with a user or administrator
    • G06F3/061 Improving I/O performance
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0629 Configuration or reconfiguration of storage systems
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/0644 Management of space entities, e.g. partitions, extents, pools
    • G06F3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0647 Migration mechanisms
    • G06F3/0649 Lifecycle management
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0683 Plurality of storage devices
    • G06F3/0685 Hybrid storage combining heterogeneous device types, e.g. hierarchical storage, hybrid arrays

Definitions

  • This disclosure relates to the field of storage technologies, and in particular, to a data storage method and apparatus.
  • a tiered storage technology is introduced to improve storage performance of the storage system.
  • a main idea of the tiered storage technology is to separately store different data in storage media with different performance based on indicators such as data importance and data access frequency. For example, data with relatively low access frequency is stored onto a hard disk drive (HDD) with a relatively low read/write speed in the storage system, and data with relatively high access frequency is stored onto a solid state drive (SSD) with a relatively high read/write speed in the storage system. This can improve a read/write speed of the storage system.
  • the storage system using the tiered storage technology usually scans, at a regular interval (for example, every other week), data stored in the entire storage system, to determine whether the data stored in the storage system meets a preset tiered storage policy (for example, storing data with relatively low access frequency onto the HDD or storing data with relatively high access frequency onto the SSD). If a part of data in the storage system does not meet the tiered storage policy, for example, a data block 1 is data with relatively low access frequency but is stored on the SSD, the part of data needs to be migrated to an expected storage medium, that is, the data block 1 is migrated to the HDD.
  • After scanning the data stored in the entire storage system, if it is determined that a large amount of data needs to be migrated, the storage system needs to consume a large quantity of resources (for example, input/output (I/O) resources) to migrate that data to the expected storage medium. Consequently, storage performance of the storage system deteriorates.
  • This disclosure provides a data storage method and apparatus, to improve storage performance of a storage system.
  • a data storage method is provided and applied to a storage system.
  • first information of to-be-stored data is first obtained.
  • the first information includes at least one piece of information: a type of the to-be-stored data, a name of the to-be-stored data, and a user identifier corresponding to the to-be-stored data.
  • an expected storage location of the to-be-stored data is determined based on whether the first information of the to-be-stored data meets a condition.
  • When the first information meets the condition, it is determined that the expected storage location is a first storage space whose read/write performance is higher than or equal to a threshold in the storage system; otherwise, it is determined that the expected storage location is a second storage space whose read/write performance is lower than the threshold in the storage system.
  • at least one data packet in a plurality of data packets of the to-be-stored data is stored in the expected storage location.
  • an expected storage location is determined according to a preset policy of the storage system and based on at least one piece of information: a type of the data, a name of the data, and a user identifier corresponding to the data.
  • a data packet of the data is stored in the location. In this way, the data does not need to be migrated subsequently. This can reduce an amount of data that needs to be migrated, reduce resource consumption of the storage system during data migration, and improve storage performance of the storage system.
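  • The placement decision described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the `FirstInfo` fields, the example condition (sets of "hot" data types and priority users), and the two space labels are hypothetical names chosen for this example, since the concrete conditions are not listed at this point in the text.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical labels for the two storage spaces described in the text.
FIRST_SPACE = "first_storage_space"    # read/write performance >= threshold
SECOND_SPACE = "second_storage_space"  # read/write performance < threshold


@dataclass
class FirstInfo:
    """First information; at least one field is expected to be set."""
    data_type: Optional[str] = None
    name: Optional[str] = None
    user_id: Optional[str] = None


# Assumed example policy values (not taken from the source).
HOT_TYPES = {"db", "log"}
PRIORITY_USERS = {"user-001"}


def meets_condition(info: FirstInfo) -> bool:
    """Return True if any of the assumed conditions holds."""
    return info.data_type in HOT_TYPES or info.user_id in PRIORITY_USERS


def expected_location(info: FirstInfo) -> str:
    """Choose the storage space before any data packet is written."""
    return FIRST_SPACE if meets_condition(info) else SECOND_SPACE
```

  • Because the location is chosen at write time, data placed this way would not need to be moved by a later scan, which is the saving the text describes.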
  • the condition includes at least one of the following conditions:
  • the expected storage location of the data may be determined in a plurality of different manners, to improve flexibility of the storage system.
  • the first information of the to-be-stored data is obtained to determine the expected storage location of the to-be-stored data.
  • the data packet of the to-be-stored data is stored. In this way, when the plurality of data packets of the to-be-stored data are stored, each data packet in the plurality of data packets is stored in the determined expected storage location.
  • the storage system may skip a process of scanning the data to determine data that needs to be migrated. This can improve storage performance of the storage system.
  • a part of data packets in the plurality of data packets of the to-be-stored data may be first stored in a first location.
  • the first location is different from the expected storage location, for example, may be a location preset by the storage system.
  • the first information of the to-be-stored data is obtained to determine the expected storage location of the to-be-stored data.
  • a data packet other than the part of data packets that are stored in the first location in the plurality of data packets of the to-be-stored data is stored in the expected storage location.
  • the plurality of data packets of the to-be-stored data are stored in different storage spaces.
  • the part of data packets of the data are first stored in a default location, and then the expected storage location of the data is determined during storage. This can reduce response duration of the storage system when the data is stored.
  • the storage system needs to perform data migration only on the part of data packets of the to-be-stored data. This can reduce an amount of data that needs to be migrated, and improve storage performance of the storage system.
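  • The two-phase variant above (early packets go to a default first location, later packets go to the expected location) might look like the following sketch. The function name `store_packets`, the `default_count` parameter, and the `resolve_location` callback are assumptions introduced for illustration only.

```python
from typing import Callable, Dict, List


def store_packets(
    packets: List[bytes],
    default_count: int,
    resolve_location: Callable[[], str],
    first_location: str = "default_location",
) -> Dict[str, List[bytes]]:
    """Return a mapping from location name to the packets stored there."""
    placement: Dict[str, List[bytes]] = {}
    # Phase 1: write the early packets immediately, without waiting for
    # the placement decision, to keep the response time short.
    placement.setdefault(first_location, []).extend(packets[:default_count])
    # Phase 2: determine the expected location (for example, from the
    # first information) and store the remaining packets there.
    expected = resolve_location()
    placement.setdefault(expected, []).extend(packets[default_count:])
    return placement
```

  • In this sketch only the first `default_count` packets would later need migration, rather than the whole data item.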
  • the storage system may further record a storage status of the to-be-stored data.
  • the storage status includes a first storage status in which the plurality of data packets of the to-be-stored data are stored in the expected storage location and a second storage status in which the plurality of data packets of the to-be-stored data are separately stored in the first location and the expected storage location.
  • whether to perform data migration may be determined based on the obtained storage status of the to-be-stored data. For example, if the storage status of the to-be-stored data indicates that the to-be-stored data is in the second storage status, the storage system may migrate, to the expected storage location, the part of data packets of the to-be-stored data stored in the first location. In this way, the storage system may determine, based on the storage status of the to-be-stored data, whether to perform data migration. This can reduce complexity of scanning.
  • the storage system may further adjust the storage status of the to-be-stored data from the second storage status to the first storage status. In this way, when the storage system performs scanning again, the to-be-stored data may not need to be migrated.
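  • The status-driven migration described above can be sketched as follows. The enum and class names are hypothetical, and the in-memory dictionary merely stands in for real storage locations.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class StorageStatus(Enum):
    ALL_IN_EXPECTED = 1  # first storage status
    SPLIT = 2            # second storage status


@dataclass
class StoredData:
    expected_location: str
    status: StorageStatus
    # Maps a location name to the packets currently held there.
    packets: Dict[str, List[bytes]] = field(default_factory=dict)


def migrate_if_needed(item: StoredData, first_location: str) -> int:
    """Migrate packets out of the first location; return the number moved."""
    if item.status is not StorageStatus.SPLIT:
        return 0  # already fully placed, so a scan can skip this item
    moved = item.packets.pop(first_location, [])
    already_there = item.packets.get(item.expected_location, [])
    # The early packets were written first, so keep them in front.
    item.packets[item.expected_location] = moved + already_there
    item.status = StorageStatus.ALL_IN_EXPECTED
    return len(moved)
```

  • Checking the recorded status first means a later scan only has to look at one flag per data item instead of re-evaluating the policy for every stored packet.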
  • a data storage apparatus includes a processor, configured to implement the method described in the first aspect.
  • the data storage apparatus may further include a memory, configured to store program instructions and data.
  • the memory is coupled to the processor, and the processor may invoke and execute the program instructions stored in the memory, to implement any method in the methods described in the first aspect.
  • the data storage apparatus may further include a communications interface. The communications interface is used by the data storage apparatus to communicate with another device.
  • the processor is configured to: obtain first information of to-be-stored data, where the first information includes at least one piece of information: a type of the to-be-stored data, a name of the to-be-stored data, and a user identifier corresponding to the to-be-stored data; and determine an expected storage location of the to-be-stored data based on the first information of the to-be-stored data and according to a preset policy.
  • the preset policy is: when the first information meets a condition, determining that the expected storage location is a first storage space, otherwise, determining that the expected storage location is a second storage space.
  • the first storage space is a storage space whose read/write performance is higher than or equal to a threshold in a storage system
  • the second storage space is a storage space whose read/write performance is lower than the threshold in the storage system.
  • the processor is further configured to store, in the expected storage location, at least one data packet in a plurality of data packets of the to-be-stored data received through the communications interface.
  • the condition includes at least one of the following conditions:
  • When the processor obtains the first information of the to-be-stored data, the processor is specifically configured to: before any data packet in the plurality of data packets of the to-be-stored data received through the communications interface is stored in the storage system, obtain the first information of the to-be-stored data.
  • When the processor stores, in the expected storage location, the at least one data packet in the plurality of data packets of the to-be-stored data received through the communications interface, the processor is specifically configured to store each data packet in the plurality of data packets in the expected storage location.
  • the processor is further configured to: before obtaining the first information of the to-be-stored data, store, in a first location, a part of data packets in the plurality of data packets of the to-be-stored data received through the communications interface.
  • the first location is different from the expected storage location.
  • When the processor stores, in the expected storage location, the at least one data packet in the plurality of data packets of the to-be-stored data received through the communications interface, the processor is specifically configured to store a data packet other than the part of data packets in the plurality of data packets in the expected storage location.
  • the processor is further configured to record a storage status of the to-be-stored data.
  • the storage status includes a first storage status and a second storage status.
  • the first storage status is a status in which the plurality of data packets of the to-be-stored data are stored in the expected storage location
  • the second storage status is a status in which the plurality of data packets of the to-be-stored data are separately stored in the first location and the expected storage location. Then, the storage status of the to-be-stored data may be obtained.
  • the part of data packets of the to-be-stored data may be migrated from the first location to the expected storage location, and the storage status of the to-be-stored data is adjusted from the second storage status to the first storage status.
  • a data storage apparatus may be a storage system, or may be an apparatus in a storage system.
  • the data storage apparatus may include a processing module and a communications module.
  • the modules may perform corresponding functions performed by the storage system in any one of the design examples in the first aspect. Details are as follows:
  • the processing module is configured to: obtain first information of to-be-stored data, where the first information includes at least one piece of information: a type of the to-be-stored data, a name of the to-be-stored data, and a user identifier corresponding to the to-be-stored data; and determine an expected storage location of the to-be-stored data based on the first information of the to-be-stored data and according to a preset policy, where the preset policy is: when the first information meets a condition, determining that the expected storage location is a first storage space, otherwise, determining that the expected storage location is a second storage space.
  • the first storage space is a storage space whose read/write performance is higher than or equal to a threshold in the storage system
  • the second storage space is a storage space whose read/write performance is lower than the threshold in the storage system.
  • the processing module is further configured to store, in the expected storage location, at least one data packet in a plurality of data packets of the to-be-stored data received by the communications module.
  • an embodiment of this disclosure further provides a computer-readable storage medium including instructions.
  • When the instructions are run on a computer, the computer is enabled to perform the method according to the first aspect.
  • an embodiment of this disclosure further provides a computer program product including instructions.
  • When the computer program product is run on a computer, the computer is enabled to perform the method according to the first aspect.
  • an embodiment of this disclosure provides a chip system.
  • the chip system includes a processor and may further include a memory, and is configured to implement the method according to the first aspect.
  • the chip system may include a chip, or may include a chip and another discrete component.
  • FIG. 1 is a schematic diagram of an example of a storage system according to an embodiment of this disclosure.
  • FIG. 2 is a flowchart of an example of a data storage method according to an embodiment of this disclosure.
  • FIG. 3 is a flowchart of another example of a data storage method according to an embodiment of this disclosure.
  • FIG. 4 is a flowchart of another example of a data storage method according to an embodiment of this disclosure.
  • FIG. 5 is a flowchart of another example of a data storage method according to an embodiment of this disclosure.
  • FIG. 6 is a flowchart of another example of a data storage method according to an embodiment of this disclosure.
  • FIG. 7 is a schematic structural diagram of an example of a data storage apparatus according to an embodiment of this disclosure.
  • FIG. 8 is a schematic structural diagram of another example of a data storage apparatus according to an embodiment of this disclosure.
  • “A plurality of” means two or more.
  • “A plurality of” may also be understood as “at least two”.
  • “At least one” may be understood as one or more, for example, one, two, or more.
  • “Including at least one” means including one, two, or more, and does not limit which items are included.
  • For example, “including at least one of A, B, and C” may represent the following cases: A is included; B is included; C is included; A and B are included; A and C are included; B and C are included; or A, B, and C are all included.
  • The term “and/or” describes an association relationship between associated objects and represents that three relationships may exist.
  • For example, “A and/or B” may represent the following three cases: only A exists, only B exists, and both A and B exist.
  • The character “/”, unless otherwise specified, generally indicates an “or” relationship between the associated objects.
  • ordinal terms such as “first” and “second” mentioned in this disclosure are used to distinguish between a plurality of objects, and are not intended to limit a sequence, a time sequence, a priority, or an importance degree of the plurality of objects.
  • Tiered storage is a manner to manage data.
  • data is separately stored in storage media with different performance based on indicators such as data importance, access frequency, attribute information, and a size, and the data is automatically migrated between storage media by using a tiered storage technology.
  • a storage system using the tiered storage technology generally includes a plurality of storage media with different performance, for example, a serial advanced technology attachment (SATA) hard disk, a small computer system interface (SCSI) hard disk, a serial attached SCSI interface (SAS) hard disk, a fiber channel (FC) interface hard disk, and an SSD.
  • A relationship between performance of the hard disks, from lowest to highest, is as follows: SATA hard disk < SCSI hard disk < SAS hard disk < FC hard disk < SSD.
  • A person skilled in the art may select, based on an actual use requirement, storage media with different performance to constitute different storage systems, for example, a level-3 storage system including three storage media with different performance, or a level-5 storage system including five storage media with different performance.
  • A preset tiered storage policy is stored in the storage system, for example, a tiered storage policy of storing important data in a storage medium with good performance and storing unimportant data in a storage medium with poor performance, or a tiered storage policy of storing data with high access frequency in a storage medium with good performance and storing data with low access frequency in a storage medium with poor performance.
  • A person skilled in the art may set the tiered storage policy based on a use requirement.
  • the storage system using the tiered storage technology after receiving a data packet of to-be-stored data, stores the data packet in a preset storage medium.
  • the preset storage medium may be a storage medium with good performance in the storage system, for example, a SAS hard disk or an SSD. Then, at a regular interval, data stored in the entire storage system is scanned to determine a part of data packets that do not meet a preset tiered storage policy, and the part of data packets are migrated to an expected storage medium.
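  • For contrast, the conventional scan-and-migrate approach described above can be sketched roughly as follows. The `Block` structure, the access-count threshold, and the two medium labels are assumptions made for this illustration; assigning `block.medium` stands in for an I/O-heavy copy between media.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    name: str
    medium: str        # "SSD" or "HDD"
    access_count: int  # accesses observed since the last scan


def expected_medium(block: Block, hot_threshold: int) -> str:
    """Assumed policy: frequently accessed data belongs on the SSD."""
    return "SSD" if block.access_count >= hot_threshold else "HDD"


def scan_and_migrate(blocks: List[Block], hot_threshold: int) -> List[str]:
    """Scan every block and move misplaced ones; return the names moved."""
    migrated = []
    for block in blocks:
        target = expected_medium(block, hot_threshold)
        if block.medium != target:
            block.medium = target  # placeholder for the actual data copy
            migrated.append(block.name)
    return migrated
```

  • Every block is visited on each scan, and every misplaced block costs a copy, which is the overhead the disclosure aims to reduce.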
  • the technical solutions in the embodiments of this disclosure are provided.
  • At least one of the following pieces of information is first obtained: a type of the data, a name of the data, and a user identifier corresponding to the data. Then, it is determined, based on the obtained information and according to a tiered storage policy preset by a storage system, whether the data needs to be stored in a storage space with good performance or a storage space with poor performance. Then, a data packet of the data is stored in the determined storage space. In this way, the data is already stored in the storage space corresponding to the tiered storage policy at the time of storage, and the data does not need to be migrated subsequently. This can reduce an amount of data that needs to be migrated, reduce resource consumption of the storage system during data migration, and improve storage performance of the storage system.
  • the technical solutions in the embodiments of this disclosure are applied to a storage system using the tiered storage technology.
  • the storage system may be a file storage system, a block storage system, an object storage system, or a combination of the foregoing storage systems. This is not limited in the embodiments of this disclosure.
  • FIG. 1 shows an example of a storage system according to an embodiment of this disclosure.
  • the storage system includes a management unit 101 and two storage media with different performance.
  • The two storage media are a first storage medium 102 and a second storage medium 103.
  • the management unit 101 is configured to manage an operation request, for example, a request of processing a write operation to write data into the storage medium, or a request of processing a read operation to obtain data from the storage medium.
  • the management unit 101 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA), or the like.
  • the first storage medium 102 and the second storage medium 103 are configured to store data.
  • Read/write performance of the first storage medium 102 is better than read/write performance of the second storage medium 103.
  • The first storage medium 102 may be an SSD or an FC hard disk, and the second storage medium 103 may be a SATA hard disk, a SAS hard disk, or a SCSI hard disk.
  • the first storage medium 102 and the second storage medium 103 may be other storage media. This is not limited herein.
  • A storage space constituted by at least one first storage medium 102 is referred to as a first storage space 1021, and a storage space constituted by at least one second storage medium 103 is referred to as a second storage space 1031.
  • The first storage space 1021 is constituted by storage spaces corresponding to a plurality of first storage media 102, and the plurality of first storage media 102 may form a storage array in a coupling manner and provide a service for a user together.
  • The storage system using the tiered storage technology is not limited to the architecture shown in FIG. 1.
  • the storage system may include three or more storage media with different performance, for example, four storage media with different performance, or the storage system may further include another apparatus, for example, a management apparatus.
  • the storage system described in this embodiment of this disclosure is intended to describe the technical solutions in example embodiments of this disclosure more clearly, and does not constitute a limitation to the technical solutions provided in the embodiments of this disclosure. A person of ordinary skill in the art may learn that, with evolution of a storage technology and a storage system architecture, the technical solutions provided in the embodiments of this disclosure are also applicable to a similar technical problem.
  • FIG. 2 is a flowchart of the method.
  • the data storage method may be performed by the storage system shown in FIG. 1 .
  • the method may be performed by a communications apparatus.
  • the communications apparatus may be a server in the storage system or a communications apparatus that can support the storage system in implementing a function required for the method.
  • the communications apparatus may be another communications apparatus, for example, a chip system.
  • An implementation of the communications apparatus is not limited herein.
  • the following uses an example in which the method is performed by a management unit in the storage system.
  • S21: Another electronic device sends a plurality of data packets of to-be-stored data to the storage system, and the management unit in the storage system obtains the plurality of data packets.
  • the storage system usually collaborates with another electronic device to complete a data read/write process.
  • The other electronic device may be a server, a client, or the like.
  • A user may write data into the storage system by performing a data write operation on the other electronic device, or read data from the storage system by performing a data read operation on the other electronic device.
  • For example, the other electronic device is a server.
  • When the user needs to store data in the storage system, the user may perform a data write operation, for example, write data A.
  • After detecting the data write operation, the server sends a data write request to the storage system.
  • the data write request may include a plurality of data packets of the data A.
  • the management unit in the storage system may obtain the plurality of data packets of the to-be-stored data A from the data write request.
  • S 22 The management unit in the storage system obtains first information of the to-be-stored data.
  • the first information includes at least one of the following pieces of information: a type of the to-be-stored data, a name of the to-be-stored data, or a user identifier corresponding to the to-be-stored data.
  • manners in which the management unit in the storage system obtains the first information include but are not limited to the following two manners:
  • In a first manner, the data write request sent by the server to the storage system may include the first information, for example, description information of the data A.
  • The description information may include a type of the data A (for example, an RAR type, a JPEG type, or an AVI type), a size of the data A (for example, 1 M or 30 kB), a name of the data A (for example, a photo 1 or JPG 2 ), a user identifier (for example, an IP address of the user or a user number), or the like. This is not limited herein.
  • the management unit in the storage system may obtain the description information of the data A from the data write request.
  • the description information of the data A may be understood as the first information of the to-be-stored data (the data A).
  • In a second manner, the management unit in the storage system may determine a type of the data A based on headers of the plurality of data packets of the data A and a correspondence between a header character and a data type.
  • Table 1 shows an example of the correspondence between a header character and a data type.
  • the header character is represented by a hexadecimal number. As shown in Table 1, when the header character is “52617221”, it indicates that the data type is an RAR type. When the header character is “504B0304”, it indicates that the data type is a ZIP type. In this case, the type of the data A is the first information.
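  • As an illustration only, the header-based lookup described above can be sketched in Python as follows. The table holds just the two example entries named in the text. The names MAGIC_TO_TYPE and detect_type, and the assumption that the header character occupies the first bytes of the first data packet, are hypothetical and are not taken from this disclosure.

```python
from typing import Optional

# Hypothetical lookup table: header character (hexadecimal) -> data type.
# Only the two example entries given in the text are included.
MAGIC_TO_TYPE = {
    bytes.fromhex("52617221"): "RAR",
    bytes.fromhex("504B0304"): "ZIP",
}


def detect_type(first_packet: bytes) -> Optional[str]:
    """Return the data type whose header character prefixes the packet, or None."""
    for magic, data_type in MAGIC_TO_TYPE.items():
        if first_packet.startswith(magic):
            return data_type
    return None  # unknown header: the type must be obtained in another manner


rar_packet = bytes.fromhex("52617221") + b"payload"
detected = detect_type(rar_packet)
```

  • In this sketch, a packet whose header does not appear in the table yields None, which leaves room for obtaining the first information in another manner.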
  • the first information is preset by the storage system or a person skilled in the art.
  • the first information may be the type of the to-be-stored data, the name of the to-be-stored data, and the user identifier corresponding to the to-be-stored data.
  • the first information may include only the type of the to-be-stored data. This is not limited herein.
  • the management unit in the storage system may obtain the first information of the to-be-stored data in another manner. This is not limited herein.
  • S 23 The management unit in the storage system determines an expected storage location of the to-be-stored data based on the first information of the to-be-stored data and according to a preset policy associated with the storage system.
  • the preset policy is: when the first information meets a condition, determining that the expected storage location of the to-be-stored data is a first storage space whose read/write performance is higher than or equal to a threshold; when the first information does not meet the condition, determining that the expected storage location of the to-be-stored data is a second storage space whose read/write performance is lower than the threshold.
  • a specific value of the threshold is not limited in this embodiment of this disclosure.
  • the statement that the read/write performance of the first storage space is higher than or equal to the threshold and the read/write performance of the second storage space is lower than the threshold may be understood to mean that the read/write performance of the first storage space is higher than that of the second storage space.
  • the first storage space may be understood as a storage space constituted by at least one first storage medium, and the second storage space may be understood as a storage space constituted by at least one second storage medium.
  • the following uses an example in which the first storage space is an SSD and the second storage space is an HDD.
  • the preset policy may include but is not limited to the following three cases:
  • In a first case, the preset policy is: when the type of the to-be-stored data is the same as a preset type, determining that the expected storage location of the to-be-stored data is the SSD; when the type of the to-be-stored data is different from the preset type, determining that the expected storage location of the to-be-stored data is the HDD.
  • For example, the preset type may be the RAR type.
  • After obtaining the first information of the data A, the management unit in the storage system obtains the type of the data A from the first information. For example, when the type of the data A is the JPEG type, the storage system determines that the type of the data A is not the RAR type, and then determines that the expected storage location of the data A is the HDD.
  • the preset type may be specified in association with the preset policy associated with the storage system.
  • In a second case, the preset policy is: when the name of the to-be-stored data is the same as a preset name, determining that the expected storage location of the to-be-stored data is the SSD; when the name of the to-be-stored data is different from the preset name, determining that the expected storage location of the to-be-stored data is the HDD.
  • For example, the preset name is a name starting with photo.
  • After obtaining the first information of the data A, the management unit in the storage system obtains the name of the data A from the first information. For example, when the name of the data A is JPEG 2, the storage system determines that the name of the data A does not start with photo, and then determines that the expected storage location of the data A is the HDD.
  • the preset name may be specified in association with the preset policy associated with the storage system.
  • In a third case, the preset policy is: when the user identifier corresponding to the to-be-stored data is the same as a preset user identifier, determining that the expected storage location of the to-be-stored data is the SSD; when the user identifier corresponding to the to-be-stored data is different from the preset user identifier, determining that the expected storage location of the to-be-stored data is the HDD.
  • the following uses an example in which the user identifier is a user number, and the preset user identifier is ID 1000 .
  • After obtaining the first information of the data A, the management unit in the storage system obtains the user number corresponding to the data A from the first information. For example, when the user number corresponding to the data A is ID 100 , the management unit in the storage system determines that the user number corresponding to the data A is different from the preset user identifier, and then determines that the expected storage location of the data A is the HDD.
  • the preset user identifier may be specified in association with the preset policy associated with the storage system.
  • the preset policy may be a combination of at least two of the foregoing three cases.
  • the preset policy may be a combination of the foregoing first case and second case, or the preset policy may be a combination of the foregoing three cases, or the like.
  • a process in which the management unit in the storage system determines the expected storage location of the to-be-stored data is similar to the process in any one of the foregoing three cases.
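  • The three cases and their combination can be sketched as follows. This is a minimal illustration rather than the implementation of this disclosure: the FirstInfo structure, the function names, and in particular the rule used to combine cases (any case matching, or all cases matching) are assumptions, because the text does not state how combined cases are evaluated. The preset values (RAR, photo, ID 1000) are the examples used in the text.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional

SSD, HDD = "SSD", "HDD"


@dataclass
class FirstInfo:
    """Hypothetical container for the first information of the to-be-stored data."""
    data_type: Optional[str] = None
    name: Optional[str] = None
    user_id: Optional[str] = None


def type_case(info: FirstInfo) -> bool:
    return info.data_type == "RAR"          # preset type from the example


def name_case(info: FirstInfo) -> bool:
    return info.name is not None and info.name.startswith("photo")


def user_case(info: FirstInfo) -> bool:
    return info.user_id == "ID 1000"        # preset user identifier from the example


def expected_location(
    info: FirstInfo,
    cases: Iterable[Callable[[FirstInfo], bool]],
    combine: Callable[[Iterable[bool]], bool] = any,  # assumption: any match suffices
) -> str:
    """Return SSD when the first information meets the condition, else HDD."""
    return SSD if combine(case(info) for case in cases) else HDD


data_a = FirstInfo(data_type="JPEG", name="JPEG 2", user_id="ID 100")
location_a = expected_location(data_a, [type_case, name_case, user_case])
```

  • Passing combine=all instead of the default would require every selected case to match before the SSD is chosen. Which of the two readings applies is a design decision that the text leaves open.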
  • Because the storage system determines the expected storage location of the to-be-stored data based on related information of the to-be-stored data, this can resolve a problem that data having a relatively high requirement for read/write performance of a storage space is stored by default in a storage space with relatively low read/write performance, so that the requirement of the data cannot be met. In addition, this can resolve a problem that data having a relatively low requirement for read/write performance of a storage space is stored in a storage space with relatively high read/write performance, so that a storage resource with relatively high read/write performance is wasted.
  • S 24 The management unit in the storage system stores each data packet in the plurality of data packets of the to-be-stored data in the expected storage location.
  • After determining the expected storage location of the to-be-stored data, the management unit in the storage system stores each received data packet of the to-be-stored data in the expected storage location. For example, after it is determined that the expected storage location of the data A is the HDD, the plurality of data packets of the data A are all stored onto the HDD. Because the to-be-stored data is stored in the expected storage location, when the management unit in the storage system scans all the stored data to determine whether the data needs to be migrated, the data is already located in the target location and does not need to be migrated. This can reduce an amount of data that needs to be migrated, and reduce resource consumption of the storage system during data migration.
  • S 25 The management unit in the storage system records a storage status of the to-be-stored data as a first storage status.
  • the storage status includes a first storage status (which may also be referred to as a migration-free status) in which the plurality of data packets of the to-be-stored data are stored in the expected storage location of the to-be-stored data, and a second storage status (which may also be referred to as a migration-required status) in which a subset of the plurality of data packets of the to-be-stored data are stored in the expected storage location and another subset of the plurality of data packets are stored in a first location different from the expected storage location. If the expected storage location is the first storage space with relatively high read/write performance, the first location is the second storage space with relatively low read/write performance. If the expected storage location is the second storage space with relatively low read/write performance, the first location is the first storage space with relatively high read/write performance.
  • the management unit in the storage system may record the storage status of the data A as the first storage status.
  • the storage system may include a dedicated space, and the dedicated space is specially used to record a storage status of data.
  • the dedicated space may be reserved in the first storage space or the second storage space. If the storage system stores 10 pieces of data, the dedicated space may store 10 bits, and a storage status of each piece of data corresponds to one bit. When a value of the bit is 0, it indicates that a storage status of data corresponding to the bit is the migration-free status.
  • When a value of the bit is 1, it indicates that a storage status of data corresponding to the bit is the migration-required status. If the data A is the first data stored in the storage system, the first bit in the dedicated space corresponds to the storage status of the data A, and the management unit in the storage system sets a value of the first bit in the dedicated space to 0. Alternatively, a storage status may be recorded in another manner. This is not limited herein.
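  • The one-bit-per-piece record kept in the dedicated space can be sketched as a small bitmap. The class name StatusBitmap, its methods, and the packing of bits into bytes are hypothetical. The text only states that each piece of data corresponds to one bit and that 0 indicates the migration-free status.

```python
MIGRATION_FREE = 0       # first storage status
MIGRATION_REQUIRED = 1   # second storage status


class StatusBitmap:
    """Hypothetical dedicated space holding one status bit per piece of data."""

    def __init__(self, capacity: int) -> None:
        self._capacity = capacity
        self._bits = bytearray((capacity + 7) // 8)  # all bits start at 0

    def _check(self, index: int) -> None:
        if not 0 <= index < self._capacity:
            raise IndexError(index)

    def set(self, index: int, status: int) -> None:
        self._check(index)
        byte, offset = divmod(index, 8)
        if status == MIGRATION_REQUIRED:
            self._bits[byte] |= 1 << offset
        else:
            self._bits[byte] &= ~(1 << offset) & 0xFF

    def get(self, index: int) -> int:
        self._check(index)
        byte, offset = divmod(index, 8)
        return (self._bits[byte] >> offset) & 1


statuses = StatusBitmap(10)           # the storage system stores 10 pieces of data
statuses.set(0, MIGRATION_FREE)       # data A is the first data, index 0
needs_migration = statuses.get(0) == MIGRATION_REQUIRED
```

  • Reading a single bit is enough to decide whether a piece of data needs to be migrated, so the stored data itself does not have to be inspected.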
  • an execution sequence of S 24 and S 25 is not limited. S 24 may be performed before S 25 , S 25 may be performed before S 24 , or S 24 and S 25 may be performed at the same time.
  • In FIG. 2 , an example in which S 24 is performed before S 25 is used.
  • S 26 The management unit in the storage system obtains the storage status of the to-be-stored data.
  • the storage system using the tiered storage technology scans data stored in the storage system at a regular interval to determine whether the data needs to be migrated.
  • the management unit in the storage system records the storage status of each piece of data, for example, the storage status of each piece of data is stored in the dedicated space used to record the storage status.
  • the management unit in the storage system may determine, based on the storage status, recorded in the dedicated space, of the data, whether the data needs to be migrated. This can reduce scanning time and reduce resource consumption of scanning.
  • the data A is used as an example.
  • Because the data A is the first data stored in the storage system, the management unit in the storage system obtains the value of the first bit in the dedicated space used to record the storage status, where the value of the first bit is 0.
  • S 27 The management unit in the storage system determines that the to-be-stored data does not need to be migrated.
  • the management unit in the storage system determines that the value of the first bit that is corresponding to the data A and that is in the dedicated space used to record the storage status is 0, so that the management unit determines, based on a correspondence between a value of a bit and a storage status, that the data A does not need to be migrated.
  • the storage system does not need to migrate the data. This can reduce an amount of data that needs to be migrated, reduce resource consumption during data migration, and improve storage performance of the storage system.
  • the storage system may skip a process of scanning the data to determine data that needs to be migrated. This can improve storage performance of the storage system.
  • FIG. 3 is a flowchart of another example of a data storage method according to this disclosure.
  • the data storage method may be performed by the storage system shown in FIG. 1 .
  • the method may be performed by a communications apparatus.
  • the communications apparatus may be a server in the storage system or a communications apparatus that can support the storage system in implementing a function required for the method.
  • the communications apparatus may alternatively be another communications apparatus, for example, a chip system.
  • An implementation of the communications apparatus is not limited herein.
  • the following uses an example in which the method is performed by a management unit in the storage system.
  • S 31 Another electronic device sends a plurality of data packets of to-be-stored data to the storage system, and the management unit in the storage system obtains the plurality of data packets.
  • the another electronic device may be a server, a client, or the like.
  • the following provides description by using an example in which the another electronic device is a server.
  • S 31 is similar to S 21 .
  • S 32 The management unit in the storage system stores a part of data packets in the plurality of data packets of the to-be-stored data in a first location.
  • the first location is a default location of the storage system.
  • the management unit in the storage system stores at least one part of data packets of each piece of received data in the first location.
  • the storage system shown in FIG. 1 includes a first storage space and a second storage space.
  • the first location may be the first storage space, or may be the second storage space.
  • a person skilled in the art may perform setting based on an actual use requirement. This is not limited herein.
  • the following uses an example in which the first storage space is an SSD, the second storage space is an HDD, and the first location is the HDD.
  • For example, the to-be-stored data is data A, and the data A includes 10 data packets.
  • After obtaining the 10 data packets of the data A based on a data write request of the server, the management unit in the storage system writes the 10 data packets into a storage space of the storage system in sequence.
  • the management unit in the storage system may write the 10 data packets one by one in a unit of a data packet.
  • Alternatively, the management unit in the storage system may split each data packet into a plurality of data blocks, and then write the 10 data packets one by one in a unit of a data block.
  • For example, the management unit in the storage system splits each data packet into three data blocks, writes three data blocks corresponding to the first data packet onto the HDD, and then writes three data blocks corresponding to the second data packet onto the HDD.
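  • Writing in a unit of a data block can be sketched as follows. The function names, the use of a Python list to stand in for the HDD, and the choice of near-equal block sizes are assumptions made only for illustration. The text states only that each data packet is split into three data blocks and that the blocks are written packet by packet.

```python
from typing import List


def split_into_blocks(packet: bytes, blocks_per_packet: int = 3) -> List[bytes]:
    """Split a packet into at most blocks_per_packet blocks of near-equal size."""
    if not packet:
        return []
    block_size = -(-len(packet) // blocks_per_packet)  # ceiling division
    return [packet[i:i + block_size] for i in range(0, len(packet), block_size)]


def write_packets(packets: List[bytes], hdd: List[bytes]) -> None:
    """Write the packets in sequence, one block at a time, to the first location."""
    for packet in packets:
        for block in split_into_blocks(packet):
            hdd.append(block)


hdd: List[bytes] = []   # hypothetical stand-in for the HDD
write_packets([b"abcdefghi", b"123456789"], hdd)
```

  • All blocks of the first data packet land on the HDD before any block of the second data packet, which matches the order described above.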
  • S 33 The management unit in the storage system obtains first information of the to-be-stored data.
  • After writing the part of data packets of the to-be-stored data onto the HDD, the management unit in the storage system obtains the first information of the to-be-stored data. Content of the first information and a manner of obtaining the first information are the same as those in S 22 .
  • S 34 The management unit in the storage system determines an expected storage location of the to-be-stored data based on the first information of the to-be-stored data and according to a preset policy.
  • S 34 is similar to S 23 .
  • the expected storage location determined by the management unit in the storage system may be the same as the first location, or may be different from the first location. For example, if the management unit in the storage system determines that the expected storage location is the HDD, the expected storage location is the same as the first location. If the management unit in the storage system determines that the expected storage location of the to-be-stored data is the SSD, the expected storage location is different from the first location. In this embodiment of this disclosure, the following provides description by using an example in which the expected storage location is the same as the first location.
  • S 35 The management unit in the storage system stores each data packet in the plurality of data packets of the to-be-stored data in the first location.
  • S 36 The management unit in the storage system records a storage status of the to-be-stored data as a first storage status.
  • S 37 The management unit in the storage system obtains the storage status of the to-be-stored data.
  • S 38 The management unit in the storage system determines that the to-be-stored data does not need to be migrated.
  • S 35 to S 38 are similar to S 24 to S 27 .
  • S 36 to S 38 are optional steps. In other words, S 36 to S 38 are not mandatory.
  • After obtaining the to-be-stored data, the storage system first stores the part of data packets of the data in the default location, and then determines the expected storage location of the data during storage. This can reduce response duration of the storage system when the data is stored.
  • FIG. 3 describes an example in which the expected storage location of the data is determined during storage and the expected storage location is the same as the first location.
  • the following provides description by using an example in which the expected storage location of the data is determined during storage and the expected storage location is different from the first location.
  • FIG. 4 is a flowchart of another example of a data storage method according to this disclosure.
  • the data storage method may be performed by the storage system shown in FIG. 1 .
  • the method may be performed by a communications apparatus.
  • the communications apparatus may be a server in the storage system or a communications apparatus that can support the storage system in implementing a function required for the method.
  • the communications apparatus may alternatively be another communications apparatus, for example, a chip system.
  • An implementation of the communications apparatus is not limited herein.
  • the following uses an example in which the method is performed by a management unit in the storage system.
  • S 41 Another electronic device sends a plurality of data packets of to-be-stored data to the storage system, and the management unit in the storage system obtains the plurality of data packets.
  • S 42 The management unit in the storage system stores a part of data packets in the plurality of data packets of the to-be-stored data in a first location.
  • S 43 The management unit in the storage system obtains first information of the to-be-stored data.
  • S 44 The management unit in the storage system determines an expected storage location of the to-be-stored data based on the first information of the to-be-stored data and according to a preset policy.
  • S 41 to S 44 are similar to S 31 to S 34 .
  • the following provides description by using an example in which the expected storage location is different from the first location.
  • S 45 The management unit in the storage system stores a remaining data packet other than the part of data packets in the plurality of data packets of the to-be-stored data in the expected storage location.
  • For example, the management unit in the storage system stores the first two data packets of data A in the first location, that is, onto an HDD.
  • After determining that an expected storage location of the data A is an SSD and is therefore different from the first location, the management unit in the storage system stores the remaining 8 data packets of the data A onto the SSD.
  • S 46 The management unit in the storage system records a storage status of the to-be-stored data as a second storage status.
  • the management unit in the storage system records a storage status of the data A as the second storage status in which data needs to be migrated.
  • S 47 The management unit in the storage system obtains the storage status of the to-be-stored data.
  • S 47 is similar to S 26 .
  • S 48 The management unit in the storage system determines that the to-be-stored data needs to be migrated, and migrates the part of data packets of the to-be-stored data from the first location to the expected storage location.
  • When the management unit in the storage system learns that the storage status of the to-be-stored data (the data A) is the second storage status, the management unit in the storage system determines that the to-be-stored data needs to be migrated, and migrates the part of data packets of the data A that are stored in the first location to the expected storage location of the data A.
  • S 49 The management unit in the storage system adjusts the storage status of the to-be-stored data from the second storage status to a first storage status.
  • S 46 to S 49 are optional steps. In other words, S 46 to S 49 are not mandatory.
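  • The procedure of FIG. 4 can be sketched end to end as follows. This is a simplified illustration under stated assumptions: the two storage spaces are modeled as Python lists that hold only the packets of one piece of data, decide_location is a hypothetical zero-argument callable standing in for S 43 and S 44 , and the number of data packets written before the decision (two, as in the example of the data A) is a parameter.

```python
from typing import Callable, Dict, List, Tuple

SSD, HDD = "SSD", "HDD"
FIRST_STATUS, SECOND_STATUS = 0, 1   # migration-free, migration-required


def store(
    packets: List[str],
    first_location: str,
    decide_location: Callable[[], str],
    packets_before_decision: int = 2,
) -> Tuple[Dict[str, List[str]], str, int]:
    tiers: Dict[str, List[str]] = {SSD: [], HDD: []}
    # S 42: store a part of the data packets in the default first location.
    tiers[first_location].extend(packets[:packets_before_decision])
    # S 43 and S 44: obtain the first information and apply the preset policy.
    expected = decide_location()
    # S 45: store the remaining data packets in the expected storage location.
    tiers[expected].extend(packets[packets_before_decision:])
    # S 46: record the storage status.
    status = FIRST_STATUS if expected == first_location else SECOND_STATUS
    return tiers, expected, status


def migrate(tiers: Dict[str, List[str]], first_location: str,
            expected: str, status: int) -> int:
    # S 47 and S 48: only data in the second storage status is migrated.
    if status == SECOND_STATUS:
        moved, tiers[first_location] = tiers[first_location], []
        tiers[expected] = moved + tiers[expected]
        status = FIRST_STATUS        # S 49: adjust the storage status
    return status


packets = [f"p{i}" for i in range(10)]          # data A includes 10 data packets
tiers, expected, status = store(packets, HDD, lambda: SSD)
status_after_store = status
status = migrate(tiers, HDD, expected, status)
```

  • Only the two data packets written before the decision are moved during migration. The other eight are already on the SSD.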
  • the foregoing describes an overall procedure of the data storage method in the storage system in the embodiments of this disclosure.
  • the following describes the foregoing technical solutions by using a specific storage system (for example, a file storage system) as an example.
  • Data stored in the file storage system is classified into two types: data and metadata.
  • the data may be understood as actual data in a file.
  • For example, if the file is a picture, the actual data of the file is information such as a person, an animal, and an environment that is included in the picture.
  • the metadata is data used to describe attribute information of a file, for example, access permission of the file, an owner of the file, and a storage location of the file. If a user needs to perform an operation on a file in the file storage system, the user first needs to obtain metadata of the file, and then can locate a location of the file based on the metadata and obtain actual data in the file.
  • metadata may be managed in two manners: centralized management and distributed management.
  • the centralized management means that a storage space is specified in the file system and is dedicated to storing metadata of all files.
  • the metadata of all files is stored on an SSD, and this can facilitate management of files in the file storage system.
  • the distributed management means that metadata is stored in any storage space in the file system. For example, metadata may be stored together with each file. If a file is stored on the HDD, metadata of the file is also stored on the HDD. If a file is stored on the SSD, metadata of the file is also stored on the SSD. In this way, responsibilities of metadata management are distributed to different storage spaces, to resolve a problem that the entire file storage system cannot be used when a storage space for storing metadata is faulty in the centralized management manner.
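  • The difference between the two management manners can be reduced to where the metadata of a file is placed. The following sketch uses hypothetical names (metadata_location, the mode strings, and the dedicated_space parameter) and assumes, as in the example above, that the dedicated space in the centralized manner is the SSD.

```python
def metadata_location(file_location: str, mode: str,
                      dedicated_space: str = "SSD") -> str:
    """Return the storage space that holds the metadata of a file."""
    if mode == "centralized":
        return dedicated_space      # one specified space for metadata of all files
    if mode == "distributed":
        return file_location        # metadata is stored together with the file
    raise ValueError(f"unknown management manner: {mode}")


centralized = metadata_location("HDD", "centralized")
distributed = metadata_location("HDD", "distributed")
```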
  • FIG. 5 is a flowchart of the method.
  • the data storage method may be performed by the storage system shown in FIG. 1 .
  • the method may be performed by a communications apparatus.
  • the communications apparatus may be a server in the storage system or a communications apparatus that can support the storage system in implementing a function required for the method.
  • the communications apparatus may alternatively be another communications apparatus, for example, a chip system.
  • An implementation of the communications apparatus is not limited herein.
  • the following uses an example in which the method is performed by a management unit in the storage system and the storage system is a file storage system.
  • S 51 Another electronic device sends a plurality of data packets of a to-be-stored file to the file storage system, and the management unit in the file storage system obtains the plurality of data packets.
  • S 51 is similar to S 21 .
  • S 52 The management unit in the file storage system obtains first information of the to-be-stored file.
  • S 53 The management unit in the file storage system determines an expected storage location of the to-be-stored file based on the first information of the to-be-stored file and according to a preset policy.
  • S 52 and S 53 are similar to S 22 and S 23 .
  • S 54 The management unit in the file storage system creates metadata corresponding to the to-be-stored file.
  • After determining the expected storage location of the to-be-stored file, the management unit in the file storage system creates, based on the expected storage location of the file, the metadata corresponding to the to-be-stored file.
  • the metadata includes two types of information. The first type of information is an expected storage location of a file, and the location is denoted as Store tier.
  • the expected storage location of the file may be a first storage space, or may be a second storage space.
  • the following uses an example in which the first storage space is an SSD and the second storage space is an HDD.
  • the second type of information is a storage status, and the status is denoted as Status.
  • For description of the storage status, refer to corresponding content in S 25 .
  • a storage status is set to a first storage status by default.
  • two fields may be added to metadata in the conventional technology to respectively indicate the first type of information and the second type of information.
  • a length of each field is one bit.
  • For example, if the metadata in the conventional technology includes 10 bits, the metadata in this embodiment of this disclosure may include 12 bits, and the eleventh bit is used to indicate an expected storage location of a file. When a value of the bit is 0, it indicates that the expected storage location of the file is the HDD. When a value of the bit is 1, it indicates that the expected storage location of the file is the SSD.
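  • the twelfth bit is used to indicate a storage status of a file. When a value of the bit is 0, it indicates that the storage status of the file is the first storage status (namely, a migration-free status). When a value of the bit is 1, it indicates that the storage status of the file is the second storage status (namely, a migration-required status).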
  • After the management unit in the file storage system determines the expected storage location of the to-be-stored file, for example, determines that the expected storage location of the file is the HDD, the management unit in the file storage system sets the eleventh bit of the metadata to 0 when creating metadata of the file, and sets the storage status of the file to the first storage status. In other words, the twelfth bit of the metadata is set to 0.
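  • The two added fields can be sketched by modeling the metadata as a list of 12 bits, in which the first 10 bits stand for the metadata in the conventional technology. The list representation, the function names, and the placeholder content of the first 10 bits are assumptions. The text defines only the meaning of the eleventh and twelfth bits.

```python
from typing import List

HDD_BIT, SSD_BIT = 0, 1              # eleventh bit: Store tier
FIRST_STATUS_BIT = 0                 # twelfth bit: Status, first storage status


def create_metadata(conventional_bits: List[int], expected_location: str,
                    status_bit: int = FIRST_STATUS_BIT) -> List[int]:
    """Append the Store tier bit and the Status bit to 10 conventional bits."""
    if len(conventional_bits) != 10:
        raise ValueError("this sketch assumes 10 bits of conventional metadata")
    tier_bit = SSD_BIT if expected_location == "SSD" else HDD_BIT
    return list(conventional_bits) + [tier_bit, status_bit]


def store_tier(metadata: List[int]) -> str:
    return "SSD" if metadata[10] == SSD_BIT else "HDD"   # eleventh bit, index 10


def storage_status(metadata: List[int]) -> int:
    return metadata[11]                                   # twelfth bit, index 11


metadata = create_metadata([0] * 10, "HDD")
```

  • For a file whose expected storage location is the HDD, both added bits are 0, which corresponds to the example above.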
  • S 55 The management unit in the file storage system stores the plurality of data packets of the file in a location indicated by the metadata.
  • After creating the metadata corresponding to the file, the management unit in the file storage system stores the data packets of the file in an expected storage location indicated by the metadata.
  • the file storage system actively determines the expected storage location of the file according to the specified policy, so that the data packets of the file are directly stored in the expected storage location, and a process of scanning the file periodically and performing data migration can be skipped. This can improve performance of the file storage system.
  • FIG. 6 is a flowchart of another example of a data storage method according to this disclosure.
  • the data storage method may be performed by the storage system shown in FIG. 1 .
  • the method may be performed by a communications apparatus.
  • the communications apparatus may be a server in the storage system or a communications apparatus that can support the storage system in implementing a function required for the method.
  • the communications apparatus may alternatively be another communications apparatus, for example, a chip system.
  • An implementation of the communications apparatus is not limited herein.
  • the following uses an example in which the method is performed by a management unit in the storage system and the storage system is a file storage system.
  • S 61 Another electronic device sends a plurality of data packets of a to-be-stored file to the file storage system, and the management unit in the file storage system obtains the plurality of data packets.
  • S 61 is similar to S 21 .
  • S 62 The management unit in the file storage system creates metadata corresponding to the to-be-stored file.
  • the management unit in the file storage system presets an expected storage location of a file.
  • in other words, when metadata is created, the expected storage location indicated by the metadata is the same preset location for every file.
  • the preset location may be a first storage space, or may be a second storage space.
  • The following uses an example in which the first storage space is an SSD, the second storage space is an HDD, and the preset location is the HDD. Description of the storage status is the same as that in S 52 .
  • Because the preset location is the HDD, the management unit in the file storage system sets a value of a bit that is in the metadata and that is used to indicate the expected storage location of the file to 0.
  • S 63 The management unit in the file storage system stores a part of data packets of the to-be-stored file in a location indicated by the metadata.
  • After creating the metadata corresponding to the file, the management unit in the file storage system stores, in sequence, the data packets of the file in an expected storage location indicated by the metadata, namely, the HDD.
  • S 64 The management unit in the file storage system obtains first information of the to-be-stored file.
  • S 65 The management unit in the file storage system determines an expected storage location of the to-be-stored file based on the first information of the to-be-stored file and according to a preset policy.
  • S 64 and S 65 are the same as S 52 and S 53 , and details are not described herein again.
  • the management unit in the file storage system updates, by using the determined expected storage location of the file, a value of the field that is in the metadata and that is used to indicate the expected storage location of the file, and updates the storage status of the file in the metadata. For example, when the management unit in the file storage system determines that the expected storage location of the file is the SSD, and the expected storage location indicated by the metadata is the HDD, the expected storage location indicated by the metadata needs to be changed from the HDD to the SSD. To be specific, a value of a bit that is in the metadata and that is used to indicate the expected storage location of the file is reset to 1, and the storage status of the file is updated to the second storage status. In other words, a value of a bit that is in the metadata and that is used to indicate the storage status of the file is reset to 1.
  • the management unit in the file storage system stores a remaining data packet other than the part of data packets of the to-be-stored file in the location indicated by the metadata.
  • a storage location of the file in the file storage system is also changed. For example, the expected storage location of the file indicated by the metadata is changed to the SSD, and the remaining data packet of the file is stored onto the SSD.
  • the management unit in the file storage system scans the metadata, and determines that data migration needs to be performed on the file whose storage status is the second storage status.
  • the management unit in the file storage system migrates the file whose storage status is the second storage status to the expected storage location of the file indicated by the metadata of the file.
  • the file storage system determines an expected storage location of a file during file storage. This can ensure that an operation delay of creating metadata of the file is not increased.
  • a part of data packets of the file have been stored in the expected storage location during file storage. Therefore, this can reduce an amount of data that needs to be migrated, and improve storage performance of the storage system.
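  • The metadata handling described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the bit positions, the constant names, and the helper functions are assumptions introduced here. Only the bit values follow the text (location bit 0 for the HDD and 1 for the SSD, status bit 1 for the second storage status).

```python
# Assumed bit layout inside a per-file metadata word (hypothetical positions).
LOCATION_BIT = 0b01  # 0 = HDD, 1 = SSD (expected storage location)
STATUS_BIT = 0b10    # 0 = first storage status, 1 = second storage status


def new_metadata() -> int:
    """Create metadata with the preset location (HDD) and the first storage status."""
    return 0


def update_expected_location(meta: int, expect_ssd: bool) -> int:
    """Record a newly determined expected location.

    If the location changes, packets already written sit in the old location,
    so the storage status is switched to the second storage status.
    """
    currently_ssd = bool(meta & LOCATION_BIT)
    if expect_ssd == currently_ssd:
        return meta
    meta ^= LOCATION_BIT  # flip the expected-location bit
    meta |= STATUS_BIT    # mark the file as needing migration
    return meta


def needs_migration(meta: int) -> bool:
    """A later scan only has to test the status bit."""
    return bool(meta & STATUS_BIT)


meta = update_expected_location(new_metadata(), expect_ssd=True)
print(bin(meta), needs_migration(meta))
```

  • Keeping both flags in the metadata means the scan reads metadata only and never has to inspect the stored data itself.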
  • the storage system may include a hardware structure and/or a software module, and implement the functions in a form of the hardware structure, the software module, or a combination of the hardware structure and the software module. Whether a function in the foregoing functions is performed in a form of the hardware structure, the software module, or both the hardware structure and the software module depends on a specific application and a design constraint condition of the technical solutions.
  • FIG. 7 is a schematic structural diagram of a data storage apparatus 700 .
  • the data storage apparatus 700 may be applied to a storage system or an apparatus in a storage system, and can implement a function of the storage system in the method provided in the embodiments of this disclosure.
  • the data storage apparatus 700 may alternatively be an apparatus that can support the storage system in implementing a function of the storage system in the method provided in the embodiments of this disclosure.
  • the data storage apparatus 700 may be a hardware structure, a software module, or a combination of a hardware structure and a software module.
  • the data storage apparatus 700 may be implemented by a chip system. In this embodiment of this disclosure, the chip system may include a chip, or may include a chip and another discrete component.
  • the data storage apparatus 700 may include a communications module 701 and a processing module 702 .
  • the communications module 701 may be configured to perform step S 21 in the embodiment shown in FIG. 2 , and/or configured to perform step S 31 in the embodiment shown in FIG. 3 , and/or configured to perform step S 41 in the embodiment shown in FIG. 4 , and/or configured to perform step S 51 in the embodiment shown in FIG. 5 , and/or configured to perform step S 61 in the embodiment shown in FIG. 6 , and/or configured to support another process of the technology described in this specification.
  • the communications module 701 is used by the data storage apparatus 700 to communicate with another module, and may be a circuit, a component, an interface, a bus, a software module, a transceiver, or any other apparatus that can implement communication.
  • the processing module 702 may be configured to perform step S 22 to step S 26 in the embodiment shown in FIG. 2 , and/or configured to perform step S 32 to step S 38 in the embodiment shown in FIG. 3 , and/or configured to perform step S 42 to step S 49 in the embodiment shown in FIG. 4 , and/or configured to perform step S 52 to step S 55 in the embodiment shown in FIG. 5 , and/or configured to perform step S 62 to step S 69 in the embodiment shown in FIG. 6 , and/or configured to support another process of the technology described in this specification.
  • Division into modules in the embodiments of this disclosure is an example, is only logical function division, and may be other division in an actual implementation.
  • function modules in the embodiments of this disclosure may be integrated into one processor, or may exist alone physically, or two or more modules are integrated into one module.
  • the integrated module may be implemented in a form of hardware, or may be implemented in a form of a software functional module.
  • an embodiment of this disclosure provides a data storage apparatus 800 .
  • the data storage apparatus 800 may be the storage system in the embodiments shown in FIG. 2 to FIG. 6 , or an apparatus in the storage system, and can implement a function of the storage system in the embodiments of this disclosure shown in FIG. 2 to FIG. 6 .
  • the data storage apparatus 800 may alternatively be an apparatus that can support the storage system in implementing a function of the storage system in the method provided in the embodiments of this disclosure shown in FIG. 2 to FIG. 6 .
  • the data storage apparatus 800 may be a chip system. In this embodiment of this disclosure, the chip system may include a chip, or may include a chip and another discrete component.
  • the data storage apparatus 800 includes at least one processor 820 , configured to implement or support the data storage apparatus 800 in implementing a function of the management unit in the storage system in the embodiments of this disclosure shown in FIG. 2 to FIG. 6 .
  • the processor 820 may obtain first information of to-be-stored data, and determine an expected storage location of the to-be-stored data based on the first information and according to a preset policy. For details, refer to detailed descriptions in the method example.
  • the data storage apparatus 800 may further include at least one memory 830 , configured to store program instructions and/or data.
  • the memory 830 is coupled to the processor 820 . Coupling in this embodiment of this disclosure is an indirect coupling or a communication connection between apparatuses, units, or modules, may be in an electrical, a mechanical, or another form, and is used for information exchange between the apparatuses, the units, or the modules.
  • the processor 820 may operate with the memory 830 .
  • the processor 820 may execute the program instructions stored in the memory 830 . At least one of the at least one memory may be included in the processor. When executing the program instructions in the memory 830 , the processor 820 can implement the method shown in FIG. 2 to FIG. 6 .
  • the data storage apparatus 800 may further include a communications interface 810 , configured to communicate with another device through a transmission medium, so that the data storage apparatus 800 can communicate with the other device through the communications interface 810 .
  • the other device may be a server.
  • the processor 820 may send and receive data through the communications interface 810 .
  • a specific connection medium between the communications interface 810 , the processor 820 , and the memory 830 is not limited.
  • the memory 830 , the processor 820 , and the communications interface 810 are connected through a bus 840 in FIG. 8 , and the bus is represented by a thick line in FIG. 8 .
  • a connection manner between other components is schematically described, and is not limited thereto.
  • the bus may be classified into an address bus, a data bus, a control bus, or the like. For ease of representation, only one thick line is used to represent the bus in FIG. 8 , but this does not mean that there is only one bus or only one type of bus.
  • the processor 820 may be a general-purpose processor, a digital signal processor, an ASIC, an FPGA or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or perform the methods, steps, and logical block diagrams disclosed in the embodiments of this disclosure.
  • the general-purpose processor may be a microprocessor or any conventional processor or the like. The steps of the method disclosed with reference to the embodiments of this disclosure may be directly performed by a hardware processor, or may be performed by using a combination of hardware in the processor and a software module.
  • the memory 830 may be a non-volatile memory, such as a hard disk drive (HDD) or a solid-state drive (SSD), or may be a volatile memory, such as a random access memory (RAM).
  • the memory is any other medium that can be used to include or store expected program code in a form of an instruction or a data structure and that can be accessed by a computer. However, this is not limited thereto.
  • the memory in the embodiments of this disclosure may alternatively be a circuit or any other apparatus that can implement a storage function, and is configured to store program instructions and/or data.
  • An embodiment of this disclosure further provides a computer-readable storage medium including instructions.
  • When the instructions are run on a computer, the computer is enabled to perform the method implemented by the storage array in the embodiments shown in FIG. 2 to FIG. 6 .
  • An embodiment of this disclosure further provides a computer program product including instructions.
  • When the instructions are run on a computer, the computer is enabled to perform the method implemented by the storage array in the embodiments shown in FIG. 2 to FIG. 6 .
  • the chip system includes a processor and may further include a memory, and is configured to implement a function of the storage system in the foregoing method.
  • the chip system may include a chip, or may include a chip and another discrete component.
  • All or some of the foregoing methods in the embodiments of this disclosure may be implemented by using software, hardware, firmware, or any combination thereof.
  • When software is used to implement the embodiments, all or some of the embodiments may be implemented in a form of a computer program product.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, a network device, a user device, or another programmable apparatus.
  • the computer instruction may be stored in a computer-readable storage medium or may be transmitted from one computer-readable storage medium to another computer-readable storage medium.
  • the computer instruction may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired (for example, through a coaxial cable, an optical fiber, or a digital subscriber line (DSL)) or wireless (for example, infrared, radio, or microwave) manner.
  • the computer-readable storage medium may be any usable medium accessible by a computer, or a data storage device, such as a server or a data center, integrating one or more usable media.
  • the usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a digital video disc (DVD)), a semiconductor medium (for example, an SSD), or the like.

Abstract

A data storage method and apparatus are provided. First information of to-be-stored data is first obtained. The first information includes at least one piece of information: a type of the to-be-stored data, a name of the to-be-stored data, and a user identifier corresponding to the to-be-stored data. An expected storage location of the to-be-stored data is determined based on whether the first information of the to-be-stored data meets a condition. At least one data packet in a plurality of data packets of the to-be-stored data is stored in the expected storage location.

Description

    CROSS-REFERENCE TO RELATED APPLICATIONS
  • This application is a continuation of International Application No. PCT/CN2019/115215, filed on Nov. 4, 2019, which claims priority to Chinese Patent Application No. 201811394013.6, filed on Nov. 21, 2018. The disclosures of the aforementioned applications are hereby incorporated by reference in their entireties.
  • TECHNICAL FIELD
  • This disclosure relates to the field of storage technologies, and in particular, to a data storage method and apparatus.
  • BACKGROUND
  • With rapid increase of internet users and diversified development of services, more data (for example, user data and service configuration data) needs to be stored by using a storage system, for service analysis and service guidance. A tiered storage technology is introduced to improve storage performance of the storage system. A main idea of the tiered storage technology is to separately store different data in storage media with different performance based on indicators such as data importance and data access frequency. For example, data with relatively low access frequency is stored onto a hard disk drive (HDD) with a relatively low read/write speed in the storage system, and data with relatively high access frequency is stored onto a solid state drive (SSD) with a relatively high read/write speed in the storage system. This can improve a read/write speed of the storage system.
  • In the conventional technology, the storage system using the tiered storage technology usually scans, at a regular interval (for example, every other week), data stored in the entire storage system, to determine whether the data stored in the storage system meets a preset tiered storage policy (for example, storing data with relatively low access frequency onto the HDD or storing data with relatively high access frequency onto the SSD). If a part of data in the storage system does not meet the tiered storage policy, for example, a data block 1 is data with relatively low access frequency but is stored on the SSD, the part of data needs to be migrated to an expected storage medium, that is, the data block 1 is migrated to the HDD.
  • After scanning the data stored in the entire storage system, if it is determined that a large amount of data needs to be migrated, the storage system needs to consume a large quantity of resources (for example, input/output (I/O) resources) to migrate, to the expected storage medium, the data that needs to be migrated. Consequently, storage performance of the storage system deteriorates.
  • SUMMARY
  • This disclosure provides a data storage method and apparatus, to improve storage performance of a storage system.
  • According to a first aspect, a data storage method is provided and applied to a storage system. In the method, when data is to be stored, first information of to-be-stored data is first obtained. The first information includes at least one piece of information: a type of the to-be-stored data, a name of the to-be-stored data, and a user identifier corresponding to the to-be-stored data. Then, an expected storage location of the to-be-stored data is determined based on whether the first information of the to-be-stored data meets a condition. For example, when the obtained at least one piece of information of the to-be-stored data meets the condition, it is determined that the expected storage location is a first storage space whose read/write performance is higher than or equal to a threshold in the storage system; otherwise, it is determined that the expected storage location is a second storage space whose read/write performance is lower than the threshold in the storage system. Thereafter, at least one data packet in a plurality of data packets of the to-be-stored data is stored in the expected storage location.
  • In the foregoing technical solution, when data is to be stored, an expected storage location is determined according to a preset policy of the storage system and based on at least one piece of information: a type of the data, a name of the data, and a user identifier corresponding to the data. A data packet of the data is stored in the location. In this way, the data does not need to be migrated subsequently. This can reduce an amount of data that needs to be migrated, reduce resource consumption of the storage system during data migration, and improve storage performance of the storage system.
  • In an example embodiment, the condition includes at least one of the following conditions:
      • the type of the to-be-stored data is the same as a preset type;
      • the name of the to-be-stored data is the same as a preset name; and
      • the user identifier corresponding to the to-be-stored data is the same as a preset user identifier.
  • In the foregoing technical solution, the expected storage location of the data may be determined in a plurality of different manners, to improve flexibility of the storage system.
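  • The condition check and the resulting choice of storage space can be sketched as follows. This is a minimal sketch under stated assumptions: the `FirstInfo` class, the preset values, and the function names are hypothetical and do not come from the disclosure, and the sketch treats the condition as met when any one of the three comparisons succeeds.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FirstInfo:
    """First information of the to-be-stored data; any field may be absent."""
    data_type: Optional[str] = None
    name: Optional[str] = None
    user_id: Optional[str] = None


# Hypothetical preset values; the disclosure does not specify any.
PRESET_TYPES = {"database"}
PRESET_NAMES = {"index.db"}
PRESET_USER_IDS = {"vip-user"}


def meets_condition(info: FirstInfo) -> bool:
    """True if at least one available piece of information equals a preset value."""
    return (
        info.data_type in PRESET_TYPES
        or info.name in PRESET_NAMES
        or info.user_id in PRESET_USER_IDS
    )


def expected_storage_location(info: FirstInfo) -> str:
    """Preset policy: first storage space if the condition is met, else second."""
    return "first_storage_space" if meets_condition(info) else "second_storage_space"


print(expected_storage_location(FirstInfo(data_type="database")))
print(expected_storage_location(FirstInfo(name="notes.txt", user_id="guest")))
```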
  • In an example embodiment, before any data packet in the plurality of data packets of the to-be-stored data is stored in the storage system, the first information of the to-be-stored data is obtained to determine the expected storage location of the to-be-stored data. To be specific, after the expected storage location of the to-be-stored data is determined, the data packet of the to-be-stored data is stored. In this way, when the plurality of data packets of the to-be-stored data are stored, each data packet in the plurality of data packets is stored in the determined expected storage location.
  • In the foregoing technical solution, because each data packet of the to-be-stored data has been stored in the expected storage location, none of the to-be-stored data needs to be migrated. Therefore, the storage system may skip the process of scanning the data to determine which data needs to be migrated. This can improve storage performance of the storage system.
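  • The ordering in this embodiment, in which the location is decided first and every packet is then written there, can be sketched as follows. The function names, the dictionary-based storage spaces, and the example policy are assumptions made for illustration only.

```python
from typing import Callable, Dict, Iterable, List


def store_all_in_expected_location(
    packets: Iterable[bytes],
    first_info: Dict[str, str],
    policy: Callable[[Dict[str, str]], str],
    spaces: Dict[str, List[bytes]],
) -> str:
    """Decide the expected location before any packet is written, then write all packets there."""
    location = policy(first_info)  # decided up front, from the first information only
    for packet in packets:
        spaces[location].append(packet)
    return location


def example_policy(first_info: Dict[str, str]) -> str:
    """Hypothetical preset policy: data of type 'database' goes to the faster space."""
    return "ssd" if first_info.get("type") == "database" else "hdd"


spaces: Dict[str, List[bytes]] = {"ssd": [], "hdd": []}
chosen = store_all_in_expected_location([b"p1", b"p2"], {"type": "database"}, example_policy, spaces)
print(chosen, spaces)
```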
  • In an example embodiment, a part of data packets in the plurality of data packets of the to-be-stored data may be first stored in a first location. The first location is different from the expected storage location, for example, may be a location preset by the storage system. Then, the first information of the to-be-stored data is obtained to determine the expected storage location of the to-be-stored data. Thereafter, a data packet other than the part of data packets that are stored in the first location in the plurality of data packets of the to-be-stored data is stored in the expected storage location. In this case, the plurality of data packets of the to-be-stored data are stored in different storage spaces.
  • In the foregoing technical solution, after the to-be-stored data is obtained, the part of data packets of the data are first stored in a default location, and then the expected storage location of the data is determined during storage. This can reduce response duration of the storage system when the data is stored. In addition, because a part of the data packets of the to-be-stored data has already been stored in the expected storage location when the to-be-stored data is stored, the storage system needs to perform data migration only on the part of data packets stored in the default location. This can reduce an amount of data that needs to be migrated, and improve storage performance of the storage system.
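  • The deferred decision described above can be sketched as follows. This is an illustrative sketch only: the function name, the `early_count` parameter (how many packets are written before the decision is available), and the list-based storage spaces are assumptions, not details from the disclosure.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def store_with_deferred_decision(
    packets: Sequence[bytes],
    first_info: Dict[str, str],
    policy: Callable[[Dict[str, str]], str],
    spaces: Dict[str, List[bytes]],
    first_location: str,
    early_count: int,
) -> Tuple[str, bool]:
    """Write the first packets to a default location, then the rest to the expected one.

    Returns the expected location and whether the early packets still need migration.
    """
    early, rest = list(packets[:early_count]), list(packets[early_count:])
    spaces[first_location].extend(early)  # stored before the decision is made
    expected = policy(first_info)         # decision made while storage is in progress
    spaces[expected].extend(rest)         # remaining packets go straight to the target
    needs_migration = bool(early) and expected != first_location
    return expected, needs_migration


spaces: Dict[str, List[bytes]] = {"hdd": [], "ssd": []}
result = store_with_deferred_decision(
    [b"p1", b"p2", b"p3"],
    {"user_id": "vip"},
    lambda info: "ssd" if info.get("user_id") == "vip" else "hdd",
    spaces,
    first_location="hdd",
    early_count=1,
)
print(result, spaces)
```

  • In this sketch only the single early packet would later have to be migrated, which is the reduction in migrated data that the paragraph above describes.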
  • In an example embodiment, the storage system may further record a storage status of the to-be-stored data. The storage status includes a first storage status in which the plurality of data packets of the to-be-stored data are stored in the expected storage location and a second storage status in which the plurality of data packets of the to-be-stored data are separately stored in the first location and the expected storage location. Then, whether to perform data migration may be determined based on the obtained storage status of the to-be-stored data. For example, if the storage status of the to-be-stored data indicates that the to-be-stored data is in the second storage status, the storage system may migrate, to the expected storage location, the part of data packets of the to-be-stored data stored in the first location. In this way, the storage system may determine, based on the storage status of the to-be-stored data, whether to perform data migration. This can reduce complexity of scanning.
  • Further, after completing data migration of the to-be-stored data, the storage system may further adjust the storage status of the to-be-stored data from the second storage status to the first storage status. In this way, when the storage system performs scanning again, the to-be-stored data may not need to be migrated.
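  • A status-driven migration pass of this kind can be sketched as follows. The `Record` class, the `StorageStatus` enum, and the nested-dictionary layout of the storage spaces are hypothetical names introduced for this sketch. The early packets are placed ahead of the later ones, on the assumption that the original packet order should be preserved.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class StorageStatus(Enum):
    FIRST = 1   # all packets are in the expected storage location
    SECOND = 2  # packets are split between the first location and the expected location


@dataclass
class Record:
    expected_location: str
    first_location: str
    status: StorageStatus


def migrate_pending(
    records: Dict[str, Record],
    spaces: Dict[str, Dict[str, List[bytes]]],
) -> List[str]:
    """Migrate only data whose recorded status is SECOND, then reset it to FIRST."""
    migrated = []
    for data_id, rec in records.items():
        if rec.status is not StorageStatus.SECOND:
            continue  # nothing to do; the stored data itself is never inspected
        moved = spaces[rec.first_location].pop(data_id, [])
        target = spaces[rec.expected_location]
        # Early packets were written first, so they are placed before the rest.
        target[data_id] = moved + target.get(data_id, [])
        rec.status = StorageStatus.FIRST
        migrated.append(data_id)
    return migrated


records = {
    "f1": Record("ssd", "hdd", StorageStatus.SECOND),
    "f2": Record("hdd", "hdd", StorageStatus.FIRST),
}
spaces = {"hdd": {"f1": [b"a"], "f2": [b"x"]}, "ssd": {"f1": [b"b"]}}
print(migrate_pending(records, spaces))
```

  • Because the status is reset after migration, a second pass over the same records finds nothing to migrate.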
  • According to a second aspect, a data storage apparatus is provided. The data storage apparatus includes a processor, configured to implement the method described in the first aspect. The data storage apparatus may further include a memory, configured to store program instructions and data. The memory is coupled to the processor, and the processor may invoke and execute the program instructions stored in the memory, to implement any method in the methods described in the first aspect. The data storage apparatus may further include a communications interface. The communications interface is used by the data storage apparatus to communicate with another device.
  • In an example embodiment, the processor is configured to: obtain first information of to-be-stored data, where the first information includes at least one piece of information: a type of the to-be-stored data, a name of the to-be-stored data, and a user identifier corresponding to the to-be-stored data; and determine an expected storage location of the to-be-stored data based on the first information of the to-be-stored data and according to a preset policy. The preset policy is: when the first information meets a condition, determining that the expected storage location is a first storage space, otherwise, determining that the expected storage location is a second storage space. The first storage space is a storage space whose read/write performance is higher than or equal to a threshold in a storage system, and the second storage space is a storage space whose read/write performance is lower than the threshold in the storage system.
  • The processor is further configured to store, in the expected storage location, at least one data packet in a plurality of data packets of the to-be-stored data received through the communications interface.
  • In an example embodiment, the condition includes at least one of the following conditions:
      • the type of the to-be-stored data is the same as a preset type;
      • the name of the to-be-stored data is the same as a preset name; and
      • the user identifier corresponding to the to-be-stored data is the same as a preset user identifier.
  • In an example embodiment, when the processor obtains the first information of the to-be-stored data, the processor is specifically configured to: before any data packet in the plurality of data packets of the to-be-stored data received through the communications interface is stored in the storage system, obtain the first information of the to-be-stored data.
  • When the processor stores, in the expected storage location, the at least one data packet in the plurality of data packets of the to-be-stored data received through the communications interface, the processor is specifically configured to store each data packet in the plurality of data packets in the expected storage location.
  • In an example embodiment, the processor is further configured to: before obtaining the first information of the to-be-stored data, store, in a first location, a part of data packets in the plurality of data packets of the to-be-stored data received through the communications interface. The first location is different from the expected storage location.
  • When the processor stores, in the expected storage location, the at least one data packet in the plurality of data packets of the to-be-stored data received through the communications interface, the processor is specifically configured to store a data packet other than the part of data packets in the plurality of data packets in the expected storage location.
  • In an example embodiment, the processor is further configured to record a storage status of the to-be-stored data. The storage status includes a first storage status and a second storage status. The first storage status is a status in which the plurality of data packets of the to-be-stored data are stored in the expected storage location, and the second storage status is a status in which the plurality of data packets of the to-be-stored data are separately stored in the first location and the expected storage location. Then, the storage status of the to-be-stored data may be obtained. If the storage status of the to-be-stored data indicates that the to-be-stored data is in the second storage status, the part of data packets of the to-be-stored data may be migrated from the first location to the expected storage location, and the storage status of the to-be-stored data is adjusted from the second storage status to the first storage status.
  • According to a third aspect, a data storage apparatus is provided. The data storage apparatus may be a storage system, or may be an apparatus in a storage system. The data storage apparatus may include a processing module and a communications module. The modules may perform corresponding functions performed by the storage system in any one of the design examples in the first aspect. Details are as follows:
  • The processing module is configured to: obtain first information of to-be-stored data, where the first information includes at least one piece of information: a type of the to-be-stored data, a name of the to-be-stored data, and a user identifier corresponding to the to-be-stored data; and determine an expected storage location of the to-be-stored data based on the first information of the to-be-stored data and according to a preset policy, where the preset policy is: when the first information meets a condition, determining that the expected storage location is a first storage space, otherwise, determining that the expected storage location is a second storage space. The first storage space is a storage space whose read/write performance is higher than or equal to a threshold in the storage system, and the second storage space is a storage space whose read/write performance is lower than the threshold in the storage system. The processing module is further configured to store, in the expected storage location, at least one data packet in a plurality of data packets of the to-be-stored data received by the communications module.
  • According to a fourth aspect, an embodiment of this disclosure further provides a computer-readable storage medium including instructions. When the instructions are run on a computer, the computer is enabled to perform the method according to the first aspect.
  • According to a fifth aspect, an embodiment of this disclosure further provides a computer program product including instructions. When the computer program product is run on a computer, the computer is enabled to perform the method according to the first aspect.
  • According to a sixth aspect, an embodiment of this disclosure provides a chip system. The chip system includes a processor and may further include a memory, and is configured to implement the method according to the first aspect. The chip system may include a chip, or may include a chip and another discrete component.
  • For beneficial effects of the second aspect to the sixth aspect and the implementations of the second aspect to the sixth aspect, refer to descriptions of beneficial effects of the method in the first aspect and the implementations of the first aspect.
  • BRIEF DESCRIPTION OF DRAWINGS
  • FIG. 1 is a schematic diagram of an example of a storage system according to an embodiment of this disclosure;
  • FIG. 2 is a flowchart of an example of a data storage method according to an embodiment of this disclosure;
  • FIG. 3 is a flowchart of another example of a data storage method according to an embodiment of this disclosure;
  • FIG. 4 is a flowchart of another example of a data storage method according to an embodiment of this disclosure;
  • FIG. 5 is a flowchart of another example of a data storage method according to an embodiment of this disclosure;
  • FIG. 6 is a flowchart of another example of a data storage method according to an embodiment of this disclosure;
  • FIG. 7 is a schematic structural diagram of an example of a data storage apparatus according to an embodiment of this disclosure; and
  • FIG. 8 is a schematic structural diagram of another example of a data storage apparatus according to an embodiment of this disclosure.
  • DESCRIPTION OF EMBODIMENTS
  • To make objectives, technical solutions, and advantages of embodiments of this disclosure clearer, the following describes the technical solutions in the embodiments of this disclosure in detail with reference to the accompanying drawings in this specification and specific implementations.
  • In the descriptions of this disclosure, “a plurality of” means two or more. Alternatively, “a plurality of” may be understood as “at least two”. “At least one” may be understood as one or more, for example, one, two, or more. Including at least one means including one, two, or more, and does not limit what are included. For example, including at least one of A, B, and C may represent the following cases: A is included, B is included, C is included, A and B are included, A and C are included, B and C are included, and A and B and C are included. The term “and/or” describes an association relationship between associated objects and represents that three relationships may exist. For example, A and/or B may represent the following three cases: Only A exists, only B exists, and both A and B exist. In addition, the character “/”, unless otherwise specified, generally indicates an “or” relationship between the associated objects.
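  • The seven cases listed for "at least one of A, B, and C" are the non-empty subsets of the three items, which the following short sketch enumerates. The function name is chosen here for illustration.

```python
from itertools import combinations
from typing import List, Sequence, Set


def at_least_one_of(items: Sequence[str]) -> List[Set[str]]:
    """Enumerate every non-empty subset of the given items."""
    return [
        set(combo)
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]


cases = at_least_one_of(["A", "B", "C"])
print(len(cases))  # 7
```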
  • Unless otherwise stated, ordinal terms such as “first” and “second” mentioned in this disclosure are used to distinguish between a plurality of objects, and are not intended to limit a sequence, a time sequence, a priority, or an importance degree of the plurality of objects.
  • The foregoing describes some concepts in this disclosure. The following describes a technical background of this disclosure.
  • In a big data era, an amount of data stored in a storage system explodes. Due to a limited storage space in the storage system, a large amount of stored data needs to be managed in order to ensure storage performance of the storage system. Tiered storage is a manner to manage data. To be specific, data is separately stored in storage media with different performance based on indicators such as data importance, access frequency, attribute information, and a size, and the data is automatically migrated between storage media by using a tiered storage technology.
  • A storage system using the tiered storage technology generally includes a plurality of storage media with different performance, for example, a serial advanced technology attachment (SATA) hard disk, a small computer system interface (SCSI) hard disk, a serial attached SCSI interface (SAS) hard disk, a fiber channel (FC) interface hard disk, and an SSD. A relationship between performance of the hard disks is as follows: SATA hard disk<SCSI hard disk<SAS hard disk<FC hard disk<SSD. A person skilled in the art may select, based on an actual use requirement, storage media with different performance to constitute different storage systems, for example, a level-3 storage system including three storage media with different performance, or a level-5 storage system including five storage media with different performance. In addition, in the storage system, a preset tiered storage policy is stored, for example, a tiered storage policy of storing important data in a storage medium with good performance and storing unimportant data in a storage medium with poor performance, or a tiered storage policy of storing data with high access frequency in a storage medium with good performance and storing data with low access frequency in a storage medium with poor performance. A person skilled in the art may perform setting based on a use requirement.
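  • The performance ordering and a simple importance-based policy can be sketched as follows. The numeric ranks only encode the stated ordering (SATA < SCSI < SAS < FC < SSD), and the `pick_medium` helper is a hypothetical illustration of one possible tiered storage policy, not one defined by the disclosure.

```python
from enum import IntEnum
from typing import Iterable


class Medium(IntEnum):
    """Storage media ranked by relative performance; larger means faster."""
    SATA = 1
    SCSI = 2
    SAS = 3
    FC = 4
    SSD = 5


def pick_medium(available: Iterable[Medium], important: bool) -> Medium:
    """Place important data on the fastest available medium, other data on the slowest."""
    media = list(available)
    return max(media) if important else min(media)


# A level-3 storage system built from three media with different performance.
level3 = [Medium.SATA, Medium.SAS, Medium.SSD]
print(pick_medium(level3, important=True).name)   # SSD
print(pick_medium(level3, important=False).name)  # SATA
```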
  • In the conventional technology, after receiving a data packet of to-be-stored data, the storage system using the tiered storage technology stores the data packet in a preset storage medium. The preset storage medium may be a storage medium with good performance in the storage system, for example, a SAS hard disk or an SSD. Then, at a regular interval, data stored in the entire storage system is scanned to determine a part of data packets that do not meet a preset tiered storage policy, and the part of data packets are migrated to an expected storage medium.
  • Because data migration consumes resources of the storage system, when a large quantity of data packets need to be migrated in the storage system, storage performance of the storage system deteriorates.
  • In view of this, the technical solutions in the embodiments of this disclosure are provided. In the embodiments of this disclosure, when data is to be stored, at least one of the following pieces of information is first obtained: a type of the data, a name of the data, or a user identifier corresponding to the data. Then, it is determined, based on the obtained at least one piece of information and according to a tiered storage policy preset by a storage system, whether the data needs to be stored in a storage space with good performance or a storage space with poor performance. Then, a data packet of the data is stored in the determined storage space. In this way, the data is already located in the storage space corresponding to the tiered storage policy at the time of storage, and the data does not need to be migrated subsequently. This can reduce an amount of data that needs to be migrated, reduce resource consumption of the storage system during data migration, and improve storage performance of the storage system.
  • The technical solutions in the embodiments of this disclosure are applied to a storage system using the tiered storage technology. The storage system may be a file storage system, a block storage system, an object storage system, or a combination of the foregoing storage systems. This is not limited in the embodiments of this disclosure.
  • FIG. 1 shows an example of a storage system according to an embodiment of this disclosure. As shown in FIG. 1, the storage system includes a management unit 101 and two storage media with different performance. The two storage media are a first storage medium 102 and a second storage medium 103. The management unit 101 is configured to manage an operation request, for example, a request of processing a write operation to write data into the storage medium, or a request of processing a read operation to obtain data from the storage medium. The management unit 101 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA), or the like. The first storage medium 102 and the second storage medium 103 are configured to store data. Read/write performance of the first storage medium 102 is better than read/write performance of the second storage medium 103. For example, the first storage medium 102 may be an SSD or an FC hard disk, and the second storage medium 103 may be a SATA hard disk, a SAS hard disk, or a SCSI hard disk. Alternatively, the first storage medium 102 and the second storage medium 103 may be other storage media. This is not limited herein. A storage space constituted by at least one first storage medium 102 is referred to as a first storage space 1021, and a storage space constituted by at least one second storage medium 103 is referred to as a second storage space 1031. It should be noted that, when the first storage space 1021 is constituted by a quantity of storage spaces corresponding to a plurality of first storage media 102, the plurality of first storage media 102 may form a storage array in a coupling manner, and provide a service for a user together. The same applies to the second storage space 1031.
  • It should be noted that the storage system using the tiered storage technology is not limited to an architecture shown in FIG. 1. For example, the storage system may include three or more storage media with different performance, for example, four storage media with different performance, or the storage system may further include another apparatus, for example, a management apparatus. The storage system described in this embodiment of this disclosure is intended to describe the technical solutions in example embodiments of this disclosure more clearly, and does not constitute a limitation to the technical solutions provided in the embodiments of this disclosure. A person of ordinary skill in the art may learn that, with evolution of a storage technology and a storage system architecture, the technical solutions provided in the embodiments of this disclosure are also applicable to a similar technical problem.
  • The following describes the technical solutions provided in the embodiments of this disclosure with reference to the accompanying drawings.
  • An embodiment of this disclosure provides a data storage method. FIG. 2 is a flowchart of the method.
  • The following provides description by using an example in which the method is applied to the storage system shown in FIG. 1. In other words, in the following, the data storage method may be performed by the storage system shown in FIG. 1. In addition, the method may be performed by a communications apparatus. The communications apparatus may be a server in the storage system or a communications apparatus that can support the storage system in implementing a function required for the method. Alternatively, the communications apparatus may be another communications apparatus, for example, a chip system. An implementation of the communications apparatus is not limited herein. For ease of description, the following uses an example in which the method is performed by a management unit in the storage system.
  • S21: Another electronic device sends a plurality of data packets of to-be-stored data to the storage system, and the management unit in the storage system obtains the plurality of data packets.
  • During actual usage, the storage system usually collaborates with another electronic device to complete a data read/write process. The another electronic device may be a server, a client, or the like. In this way, a user may write data into the storage system by performing a data write operation on the another electronic device, or read data from the storage system by performing a data read operation on the another electronic device. For ease of description, the following provides description by using an example in which the another electronic device is a server.
  • For example, when the user needs to store data in the storage system, the user may perform a data write operation, for example, write data A. After detecting the data write operation, the server sends a data write request to the storage system. The data write request may include a plurality of data packets of the data A. In this way, after receiving the data write request, the management unit in the storage system may obtain the plurality of data packets of the to-be-stored data A from the data write request.
  • S22: The management unit in the storage system obtains first information of the to-be-stored data.
  • In this embodiment of this disclosure, the first information includes at least one of the following pieces of information: a type of the to-be-stored data, a name of the to-be-stored data, or a user identifier corresponding to the to-be-stored data.
  • In this embodiment of this disclosure, manners in which the management unit in the storage system obtains the first information include but are not limited to the following two manners:
  • A First Obtaining Manner:
  • The data write request sent by the server to the storage system may include the first information, for example, description information of the data A. The description information may include a type of the data A (for example, an RAR type, a JPEG type, or an avi type), a size of the data A (for example, 1 M or 30 kB), a name of the data A (for example, a photo 1 or JPG 2), a user identifier (for example, an IP address of the user or a user number), or the like. This is not limited herein. After receiving the data write request, the management unit in the storage system may obtain the description information of the data A from the data write request. The description information of the data A may be understood as the first information of the to-be-stored data (the data A).
  • A Second Obtaining Manner:
  • When the storage system obtains the plurality of data packets of the data A from the data write request of the server, the management unit in the storage system may determine a type of the data A based on headers of the plurality of data packets of the data A and a correspondence between a header character and a data type. Specifically, Table 1 shows an example of the correspondence between a header character and a data type. The header character is represented by a hexadecimal number. As shown in Table 1, when the header character is “52617221”, it indicates that the data type is an RAR type. When the header character is “504B0304”, it indicates that the data type is a ZIP type. In this case, the type of the data A is the first information. It should be noted that content of the first information is preset by the storage system or a person skilled in the art. For example, the first information may include the type of the to-be-stored data, the name of the to-be-stored data, and the user identifier corresponding to the to-be-stored data. Alternatively, the first information may include only the type of the to-be-stored data. This is not limited herein.
  • TABLE 1
    Header character (hexadecimal)    Data type
    52617221                          RAR type
    FFD8FF                            JPEG type
    504B0304                          ZIP type
    47494638                          GIF type
    . . .                             . . .
  • Alternatively, the management unit in the storage system may obtain the first information of the to-be-stored data in another manner. This is not limited herein.
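  • As an illustrative, non-limiting sketch, the second obtaining manner may be expressed as a prefix lookup against the correspondence in Table 1. The names HEADER_SIGNATURES and detect_type are hypothetical and are introduced here only for illustration. It is also an assumption of this sketch that the header character occupies the leading bytes of the first data packet.

```python
from typing import Optional

# Correspondence from Table 1: header character (hexadecimal) -> data type.
HEADER_SIGNATURES = {
    "52617221": "RAR",
    "FFD8FF": "JPEG",
    "504B0304": "ZIP",
    "47494638": "GIF",
}


def detect_type(first_packet: bytes) -> Optional[str]:
    """Return the data type whose header character prefixes the packet, else None."""
    # Assumption: the header character sits in the leading bytes of the packet.
    hex_prefix = first_packet[:8].hex().upper()
    for signature, data_type in HEADER_SIGNATURES.items():
        if hex_prefix.startswith(signature):
            return data_type
    return None
```

  • When no header character in the correspondence matches, the sketch returns None, in which case the first information would have to be obtained in another manner.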
  • S23: The management unit in the storage system determines an expected storage location of the to-be-stored data based on the first information of the to-be-stored data and according to a preset policy associated with the storage system.
  • In this embodiment of this disclosure, the preset policy is: when the first information meets a condition, determining that the expected storage location of the to-be-stored data is a first storage space whose read/write performance is higher than or equal to a threshold; when the first information does not meet the condition, determining that the expected storage location of the to-be-stored data is a second storage space whose read/write performance is lower than the threshold. A specific value of the threshold is not limited in this embodiment of this disclosure. The first storage space whose read/write performance is higher than or equal to the threshold and the second storage space whose read/write performance is lower than the threshold may be understood as that the read/write performance of the first storage space is higher than that of the second storage space. Specifically, in the storage system shown in FIG. 1, the first storage space may be understood as a storage space constituted by at least one first storage medium, and the second storage space may be understood as a storage space constituted by at least one second storage medium. For ease of description, the following uses an example in which the first storage space is an SSD and the second storage space is an HDD.
  • The following describes the preset policy. The preset policy may include but is not limited to the following three cases:
  • In a first case, the preset policy is: if the type of the to-be-stored data is the same as a preset type, determining that the expected storage location of the to-be-stored data is the SSD; if the type of the to-be-stored data is different from the preset type, determining that the expected storage location of the to-be-stored data is the HDD.
  • For example, the preset type may be the RAR type. After obtaining the first information of the data A, the management unit in the storage system obtains the type of the data A from the first information. For example, when the type of the data A is the JPEG type, the storage system determines that the type of the data A is not the RAR type, and then determines that the expected storage location of the data A is the HDD. The preset type may be specified in association with the preset policy associated with the storage system.
  • In a second case, the preset policy is: if the name of the to-be-stored data matches a preset name, determining that the expected storage location of the to-be-stored data is the SSD; if the name of the to-be-stored data does not match the preset name, determining that the expected storage location of the to-be-stored data is the HDD.
  • For example, the preset name is any name starting with photo. After obtaining the first information of the data A, the management unit in the storage system obtains the name of the data A from the first information. For example, when the name of the data A is JPEG 2, the storage system determines that the name of the data A does not start with photo, and then determines that the expected storage location of the data A is the HDD. The preset name may be specified in association with the preset policy associated with the storage system.
  • In a third case, the preset policy is: if the user identifier corresponding to the to-be-stored data is the same as a preset user identifier, determining that the expected storage location of the to-be-stored data is the SSD; if the user identifier corresponding to the to-be-stored data is different from the preset user identifier, determining that the expected storage location of the to-be-stored data is the HDD.
  • For example, the following uses an example in which the user identifier is a user number, and the preset user identifier is ID1000. After obtaining the first information of the data A, the management unit in the storage system obtains the user number corresponding to the data A from the first information. For example, when the user number corresponding to the data A is ID100, the management unit in the storage system determines that the user number corresponding to the data A is different from the preset user identifier, and then determines that the expected storage location of the data A is the HDD. The preset user identifier may be specified in association with the preset policy associated with the storage system.
  • Certainly, the preset policy may be a combination of at least two of the foregoing three cases. For example, the preset policy may be a combination of the foregoing first case and second case, or the preset policy may be a combination of the foregoing three cases, or the like. In this case, a process in which the management unit in the storage system determines the expected storage location of the to-be-stored data is similar to the process in any one of the foregoing three cases.
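  • As an illustrative, non-limiting sketch, the three cases and their combination may be expressed as follows. The names FirstInfo, PresetPolicy, and expected_location are hypothetical. The sketch assumes that the preset name is matched as a prefix (as in the photo example) and that, when several cases are combined, meeting any one configured condition selects the SSD. The manner of combining the cases is not specified above, so this rule is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

SSD = "SSD"  # first storage space (higher read/write performance)
HDD = "HDD"  # second storage space (lower read/write performance)


@dataclass
class FirstInfo:
    data_type: Optional[str] = None
    name: Optional[str] = None
    user_id: Optional[str] = None


@dataclass
class PresetPolicy:
    preset_type: Optional[str] = None
    preset_name_prefix: Optional[str] = None
    preset_user_id: Optional[str] = None

    def expected_location(self, info: FirstInfo) -> str:
        checks = []
        if self.preset_type is not None:          # first case
            checks.append(info.data_type == self.preset_type)
        if self.preset_name_prefix is not None:   # second case
            checks.append(info.name is not None
                          and info.name.startswith(self.preset_name_prefix))
        if self.preset_user_id is not None:       # third case
            checks.append(info.user_id == self.preset_user_id)
        # Assumption: any satisfied condition selects the SSD.
        return SSD if any(checks) else HDD
```

  • For example, PresetPolicy(preset_type="RAR") applied to data of the JPEG type yields the HDD, which agrees with the example of the data A in the first case.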
  • Because the storage system may determine the expected storage location of the to-be-stored data based on related information of the to-be-stored data, this can resolve a problem that data having a relatively high requirement for read/write performance of a storage space is stored by default in a storage space with relatively low read/write performance and a data requirement cannot be met. In addition, this can resolve a problem that data having a relatively low requirement for read/write performance of a storage space is stored in a storage space with relatively high read/write performance and a storage resource with relatively high read/write performance is wasted.
  • S24: The management unit in the storage system stores each data packet in the plurality of data packets of the to-be-stored data in the expected storage location.
  • After determining the expected storage location of the to-be-stored data, the management unit in the storage system stores each received data packet of the to-be-stored data in the expected storage location. For example, after it is determined that the expected storage location of the data A is the HDD, the plurality of data packets of the data A are all stored onto the HDD. Because the to-be-stored data is stored in the expected storage location, when the management unit in the storage system scans all the stored data to determine whether the data needs to be migrated, the data has already been located in the target location, and the data does not need to be migrated. This can reduce an amount of data that needs to be migrated, and reduce resource consumption of the storage system during data migration.
  • S25: The management unit in the storage system records a storage status of the to-be-stored data as a first storage status.
  • In this embodiment of this disclosure, the storage status includes a first storage status (which may also be referred to as a migration-free status) in which the plurality of data packets of the to-be-stored data are stored in the expected storage location of the to-be-stored data, and a second storage status (which may also be referred to as a migration-required status) in which a subset of the plurality of data packets of the to-be-stored data are stored in the expected storage location and another subset of the plurality of data packets are stored in a first location different from the expected storage location. If the expected storage location is the first storage space with relatively high read/write performance, the first location is the second storage space with relatively low read/write performance. If the expected storage location is the second storage space with relatively low read/write performance, the first location is the first storage space with relatively high read/write performance.
  • For example, in S24, all data packets of the data A are stored in the expected storage location. Therefore, the management unit in the storage system may record the storage status of the data A as the first storage status. Specifically, the storage system may include a dedicated space, and the dedicated space is specially used to record a storage status of data. For example, the dedicated space may be reserved in the first storage space or the second storage space. If the storage system stores 10 pieces of data, the dedicated space may store 10 bits, and a storage status of each piece of data corresponds to one bit. When a value of the bit is 0, it indicates that a storage status of data corresponding to the bit is the migration-free status. When a value of the bit is 1, it indicates that a storage status of data corresponding to the bit is the migration-required status. If the data A is the first data stored in the storage system, the first bit in the dedicated space corresponds to the storage status of the data A, and the management unit in the storage system sets a value of the first bit in the dedicated space to 0. Alternatively, a storage status may be recorded in another manner. This is not limited herein.
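  • As an illustrative, non-limiting sketch, the dedicated space may be modeled as a bitmap in which one bit records the storage status of one piece of data. The class name StatusBitmap and its methods are hypothetical, and the packing of eight status bits into each byte is an assumption of this sketch.

```python
MIGRATION_FREE = 0      # first storage status
MIGRATION_REQUIRED = 1  # second storage status


class StatusBitmap:
    """One bit per piece of data, packed eight bits to a byte (an assumption)."""

    def __init__(self, capacity: int) -> None:
        self._capacity = capacity
        self._bits = bytearray((capacity + 7) // 8)  # all bits start at 0

    def _check(self, index: int) -> None:
        if not 0 <= index < self._capacity:
            raise IndexError(index)

    def set(self, index: int, status: int) -> None:
        self._check(index)
        byte, offset = divmod(index, 8)
        if status == MIGRATION_REQUIRED:
            self._bits[byte] |= 1 << offset
        else:
            self._bits[byte] &= ~(1 << offset) & 0xFF

    def get(self, index: int) -> int:
        self._check(index)
        byte, offset = divmod(index, 8)
        return (self._bits[byte] >> offset) & 1


# The data A is the first data stored, so it corresponds to the first bit.
bitmap = StatusBitmap(capacity=10)
bitmap.set(0, MIGRATION_FREE)
```

  • With a capacity of 10 pieces of data, the sketch occupies two bytes, of which 10 bits are used.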
  • It should be noted that, in this embodiment of this disclosure, an execution sequence of S24 and S25 is not limited. In other words, S24 may be performed before S25, S25 may be performed before S24, or S24 and S25 are performed at the same time. In FIG. 2, an example in which S24 is performed before S25 is used.
  • S26: The management unit in the storage system obtains the storage status of the to-be-stored data.
  • The storage system using the tiered storage technology scans data stored in the storage system at a regular interval to determine whether the data needs to be migrated. In this embodiment of this disclosure, the management unit in the storage system records the storage status of each piece of data. For example, the storage status of each piece of data is stored in the dedicated space used to record the storage status. In this way, the management unit in the storage system may determine, based on the storage status of the data that is recorded in the dedicated space, whether the data needs to be migrated. This can reduce scanning time and reduce resource consumption of scanning. The data A is used as an example. Because the data A is the first data stored in the storage system, the management unit in the storage system obtains the value of the first bit in the dedicated space used to record the storage status, where the value of the first bit is 0.
  • S27: The management unit in the storage system determines that the to-be-stored data does not need to be migrated.
  • The management unit in the storage system determines that the value of the first bit that is corresponding to the data A and that is in the dedicated space used to record the storage status is 0, so that the management unit determines, based on a correspondence between a value of a bit and a storage status, that the data A does not need to be migrated.
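  • As an illustrative, non-limiting sketch, S26 and S27 may be expressed as a pass over the recorded status bits rather than over the stored data itself. The function name data_to_migrate is hypothetical, and the status bits are represented as a plain list for brevity.

```python
from typing import List

MIGRATION_FREE = 0      # first storage status
MIGRATION_REQUIRED = 1  # second storage status


def data_to_migrate(status_bits: List[int]) -> List[int]:
    """Return the indexes of data whose recorded status is migration-required."""
    return [index for index, bit in enumerate(status_bits)
            if bit == MIGRATION_REQUIRED]
```

  • If the first bit, which corresponds to the data A, is 0, the index 0 does not appear in the result, and the data A is not migrated.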
  • In the foregoing technical solution, because the to-be-stored data has been stored in the expected storage location during storage, the storage system does not need to migrate the data. This can reduce an amount of data that needs to be migrated, reduce resource consumption during data migration, and improve storage performance of the storage system.
  • Further, because the to-be-stored data has been stored in the expected storage location during storage, none of the data stored in the storage system needs to be migrated. In other words, S25 to S27 are optional steps and are not mandatory. Therefore, the storage system may skip a process of scanning the data to determine data that needs to be migrated. This can improve storage performance of the storage system.
  • In the embodiment shown in FIG. 2, before storing data, the storage system first determines an expected storage location of the data, and then stores the data in the expected storage location. Because it takes time for the storage system to determine the expected storage location of the data, a response speed of the storage system during data storage may be affected. To avoid this, another data storage method is provided. FIG. 3 is a flowchart of another example of a data storage method according to this disclosure.
  • The following provides description by using an example in which the method is applied to the storage system shown in FIG. 1. In other words, in the following, the data storage method may be performed by the storage system shown in FIG. 1. In addition, the method may be performed by a communications apparatus. The communications apparatus may be a server in the storage system or a communications apparatus that can support the storage system in implementing a function required for the method. Certainly, the communications apparatus may alternatively be another communications apparatus, for example, a chip system. An implementation of the communications apparatus is not limited herein. For ease of description, the following uses an example in which the method is performed by a management unit in the storage system.
  • S31: Another electronic device sends a plurality of data packets of to-be-stored data to the storage system, and the management unit in the storage system obtains the plurality of data packets.
  • In this embodiment of this disclosure, the another electronic device may be a server, a client, or the like. The following provides description by using an example in which the another electronic device is a server. S31 is similar to S21.
  • S32: The management unit in the storage system stores a part of data packets in the plurality of data packets of the to-be-stored data in a first location.
  • In this embodiment of this disclosure, the first location is a default location of the storage system. In other words, the management unit in the storage system stores at least one part of data packets of each piece of received data in the first location. The storage system shown in FIG. 1 includes a first storage space and a second storage space. The first location may be the first storage space, or may be the second storage space. A person skilled in the art may perform setting based on an actual use requirement. This is not limited herein. For ease of description, the following uses an example in which the first storage space is an SSD, the second storage space is an HDD, and the first location is the HDD.
  • For example, the to-be-stored data is data A, and the data A includes 10 data packets. After obtaining the 10 data packets of the data A based on a data write request of the server, the management unit in the storage system writes the 10 data packets into a storage space of the storage system in sequence. The management unit in the storage system may write the 10 data packets one by one in a unit of a data packet. In other words, after the first data packet is written onto the HDD, the second data packet is written onto the HDD. Alternatively, the management unit in the storage system may split a data packet into a plurality of data blocks, and then write the 10 data packets one by one in a unit of a data block. For example, the management unit in the storage system splits each data packet into three data blocks, writes three data blocks corresponding to the first data packet onto the HDD, and then writes three data blocks corresponding to the second data packet onto the HDD.
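  • As an illustrative, non-limiting sketch, the two write granularities described above (in a unit of a data packet, or in a unit of a data block) may be expressed as follows. The function names are hypothetical, the storage medium is modeled as a list to which written units are appended, and splitting a data packet into blocks of nearly equal size is an assumption of this sketch.

```python
from typing import List


def split_into_blocks(packet: bytes, blocks_per_packet: int = 3) -> List[bytes]:
    """Split one data packet into at most blocks_per_packet contiguous blocks."""
    if not packet:
        return []
    block_size = -(-len(packet) // blocks_per_packet)  # ceiling division
    return [packet[i:i + block_size] for i in range(0, len(packet), block_size)]


def write_in_sequence(packets: List[bytes], medium: List[bytes],
                      blocks_per_packet: int = 1) -> None:
    """Append every packet to the medium in order, block by block.

    With blocks_per_packet=1, each packet is written as a single unit.
    """
    for packet in packets:
        medium.extend(split_into_blocks(packet, blocks_per_packet))
```

  • In both granularities, all units of the first data packet are written before any unit of the second data packet, so the order of the data is preserved on the medium.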
  • S33: The management unit in the storage system obtains first information of the to-be-stored data.
  • After writing the part of data packets of the to-be-stored data onto the HDD, the management unit in the storage system obtains the first information of the to-be-stored data. Content of the first information and a manner of obtaining the first information are the same as those in S22.
  • S34: The management unit in the storage system determines an expected storage location of the to-be-stored data based on the first information of the to-be-stored data and according to a preset policy.
  • S34 is similar to S23.
  • It should be noted that the expected storage location determined by the management unit in the storage system may be the same as the first location, or may be different from the first location. For example, if the management unit in the storage system determines that the expected storage location is the HDD, the expected storage location is the same as the first location. If the management unit in the storage system determines that the expected storage location of the to-be-stored data is the SSD, the expected storage location is different from the first location. In this embodiment of this disclosure, the following provides description by using an example in which the expected storage location is the same as the first location.
  • S35: When the expected storage location is the same as the first location, the management unit in the storage system stores each data packet in the plurality of data packets of the to-be-stored data in the first location.
  • S36: The management unit in the storage system records a storage status of the to-be-stored data as a first storage status.
  • S37: The management unit in the storage system obtains the storage status of the to-be-stored data.
  • S38: The management unit in the storage system determines that the to-be-stored data does not need to be migrated.
  • S35 to S38 are similar to S24 to S27. S36 to S38 are optional steps. In other words, S36 to S38 are not mandatory.
  • In the foregoing technical solution, after obtaining the to-be-stored data, the storage system first stores the part of data packets of the data in the default location, and then determines the expected storage location of the data during storage. This can reduce response duration of the storage system when the data is stored.
  • The embodiment shown in FIG. 3 describes an example in which the expected storage location of the data is determined during storage and the expected storage location is the same as the first location. The following provides description by using an example in which the expected storage location of the data is determined during storage and the expected storage location is different from the first location. FIG. 4 is a flowchart of another example of a data storage method according to this disclosure.
  • The following provides description by using an example in which the method is applied to the storage system shown in FIG. 1. In other words, in the following, the data storage method may be performed by the storage system shown in FIG. 1. In addition, the method may be performed by a communications apparatus. The communications apparatus may be a server in the storage system or a communications apparatus that can support the storage system in implementing a function required for the method. Certainly, the communications apparatus may alternatively be another communications apparatus, for example, a chip system. An implementation of the communications apparatus is not limited herein. For ease of description, the following uses an example in which the method is performed by a management unit in the storage system.
  • S41: Another electronic device sends a plurality of data packets of to-be-stored data to the storage system, and the management unit in the storage system obtains the plurality of data packets.
  • S42: The management unit in the storage system stores a part of data packets in the plurality of data packets of the to-be-stored data in a first location.
  • S43: The management unit in the storage system obtains first information of the to-be-stored data.
  • S44: The management unit in the storage system determines an expected storage location of the to-be-stored data based on the first information of the to-be-stored data and according to a preset policy.
  • S41 to S44 are similar to S31 to S34.
  • In this embodiment of this disclosure, the following provides description by using an example in which the expected storage location is different from the first location.
  • S45: When the expected storage location is different from the first location, the management unit in the storage system stores a remaining data packet other than the part of data packets in the plurality of data packets of the to-be-stored data in the expected storage location.
  • For example, in S42, the management unit in the storage system stores the first two data packets of data A in the first location, that is, on an HDD. When the management unit in the storage system determines that an expected storage location of the data A is an SSD and is different from the first location, the management unit in the storage system stores the remaining eight data packets of the data A on the SSD.
  • S46: The management unit in the storage system records a storage status of the to-be-stored data as a second storage status.
  • Because the part of data packets of the data A are stored on the HDD, and the other part of data packets are stored on the SSD, the management unit in the storage system records a storage status of the data A as the second storage status in which data needs to be migrated.
  • S47: The management unit in the storage system obtains the storage status of the to-be-stored data.
  • S47 is similar to S26.
  • S48: The management unit in the storage system determines that the to-be-stored data needs to be migrated, and migrates the part of data packets of the to-be-stored data from the first location to the expected storage location.
  • In this embodiment of this disclosure, if the management unit in the storage system learns that the storage status of the to-be-stored data (the data A) is the second storage status, the management unit in the storage system determines that the to-be-stored data needs to be migrated, and migrates the part of data packets of the data A that are stored in the first location to the expected storage location of the data A.
  • S49: The management unit in the storage system adjusts the storage status of the to-be-stored data from the second storage status to a first storage status.
  • After the storage system completes data migration, all data packets of the data are stored in the expected storage location. Therefore, the data does not need to be migrated, and the management unit in the storage system updates the storage status of the data to the first storage status.
  • It should be noted that S46 to S49 are optional steps.
  • In the foregoing technical solution, because the part of data packets of the data have been stored in the expected storage location when the data is stored, only the other part of data packets of the data need to be migrated during data migration. Therefore, this can reduce an amount of data that needs to be migrated, reduce storage resource consumption of data migration, and improve storage performance of the storage system.
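  • The reduction in migrated data can be illustrated with the data A example (10 data packets, the first two stored on the HDD before the expected storage location is known). The following is a minimal sketch, not part of the disclosed method: the function name `packets_to_migrate` is hypothetical, and the baseline it compares against (all data packets written to the first location before the expected storage location is determined) is an assumption made only for comparison.

```python
def packets_to_migrate(total_packets, early_packets, first_location, expected_location):
    """Return (with_early_decision, baseline) counts of packets that must be migrated.

    with_early_decision: only the packets written to first_location before the
        expected storage location was determined need to move.
    baseline (assumed for comparison): every packet was written to
        first_location, so every packet needs to move.
    """
    if not 0 <= early_packets <= total_packets:
        raise ValueError("early_packets must be between 0 and total_packets")
    if first_location == expected_location:
        # Nothing is misplaced, so no migration is needed in either case.
        return 0, 0
    return early_packets, total_packets


with_early_decision, baseline = packets_to_migrate(10, 2, "HDD", "SSD")
print(with_early_decision, baseline)  # 2 10
```

  • Under these assumptions, only 2 of the 10 data packets of the data A are migrated.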
  • The foregoing describes an overall procedure of the data storage method in the storage system in the embodiments of this disclosure. The following describes the foregoing technical solutions by using a specific storage system (for example, a file storage system) as an example.
  • First, a data storage principle of the file storage system is described.
  • Data stored in the file storage system is classified into two types: data and metadata. The data may be understood as actual data in a file. For example, the file is a picture, and the actual data of the file is information such as a person, an animal, and an environment that is included in the picture. The metadata is data used to describe attribute information of a file, for example, access permission of the file, an owner of the file, and a storage location of the file. If a user needs to perform an operation on a file in the file storage system, the user first needs to obtain metadata of the file, and then can locate a location of the file based on the metadata and obtain actual data in the file. In the file storage system, metadata may be managed in two manners: centralized management and distributed management. The centralized management means that a storage space is specified in the file system and is dedicated to storing metadata of all files. For example, the metadata of all files is stored on an SSD, and this can facilitate management of files in the file storage system. It should be noted that, in this case, because different files may be stored in different locations, for example, some files are stored on an HDD, and some files are stored on the SSD, a file and metadata of the file are stored in different storage media. The distributed management means that metadata is stored in any storage space in the file system. For example, metadata may be stored together with each file. If a file is stored on the HDD, metadata of the file is also stored on the HDD. If a file is stored on the SSD, metadata of the file is also stored on the SSD. In this way, responsibilities of metadata management are distributed to different storage spaces, to resolve a problem that the entire file storage system cannot be used when a storage space for storing metadata is faulty in the centralized management manner.
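  • The difference between the two metadata management manners can be sketched as a small lookup. This is an illustration only: the function name `metadata_location`, the mode strings, and the choice of the SSD as the dedicated metadata space are assumptions made for the example.

```python
def metadata_location(file_location, mode, dedicated_space="SSD"):
    """Return the storage space that holds a file's metadata.

    mode="centralized": all metadata is kept in one dedicated storage space.
    mode="distributed": metadata is kept in the same space as the file itself.
    """
    if mode == "centralized":
        return dedicated_space
    if mode == "distributed":
        return file_location
    raise ValueError(f"unknown metadata management mode: {mode!r}")


# For a file stored on the HDD, the metadata placement depends on the manner.
print(metadata_location("HDD", "centralized"))  # SSD
print(metadata_location("HDD", "distributed"))  # HDD
```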
  • According to the foregoing principle, the following describes an execution process of a data storage method in the file storage system according to an embodiment of this disclosure. FIG. 5 is a flowchart of the method.
  • The following provides description by using an example in which the method is applied to the storage system shown in FIG. 1. In other words, in the following, the data storage method may be performed by the storage system shown in FIG. 1. In addition, the method may be performed by a communications apparatus. The communications apparatus may be a server in the storage system or a communications apparatus that can support the storage system in implementing a function required for the method. Certainly, the communications apparatus may alternatively be another communications apparatus, for example, a chip system. An implementation of the communications apparatus is not limited herein. For ease of description, the following uses an example in which the method is performed by a management unit in the storage system and the storage system is a file storage system.
  • S51: Another electronic device sends a plurality of data packets of a to-be-stored file to the file storage system, and the management unit in the file storage system obtains the plurality of data packets.
  • S51 is similar to S21.
  • S52: The management unit in the file storage system obtains first information of the to-be-stored file.
  • S53: The management unit in the file storage system determines an expected storage location of the to-be-stored file based on the first information of the to-be-stored file and according to a preset policy.
  • S52 and S53 are similar to S22 and S23.
  • S54: The management unit in the file storage system creates metadata corresponding to the to-be-stored file.
  • After determining the expected storage location of the to-be-stored file, the management unit in the file storage system creates, based on the expected storage location of the file, the metadata corresponding to the to-be-stored file.
  • In this embodiment of this disclosure, the following two types of information are added to metadata in the conventional technology.
  • The first type of information is an expected storage location of a file, and the location is denoted as Store tier. The expected storage location of the file may be a first storage space, or may be a second storage space. For ease of description, the following uses an example in which the first storage space is an SSD and the second storage space is an HDD.
  • The second type of information is a storage status, and the status is denoted as Status. For description of the storage status, refer to corresponding content in S25. In this embodiment of this disclosure, when the metadata is created, a storage status is set to a first storage status by default.
  • For example, two fields may be added to metadata in the conventional technology to respectively indicate the first type of information and the second type of information. For example, a length of each field is one bit. If the metadata in the conventional technology includes 10 bits, the metadata in this embodiment of this disclosure may include 12 bits, and the eleventh bit is used to indicate an expected storage location of a file. When a value of the bit is 0, it indicates that the expected storage location of the file is the HDD. When a value of the bit is 1, it indicates that the expected storage location of the file is the SSD. The twelfth bit is used to indicate a storage status of a file. When a value of the bit is 0, it indicates that the storage status of the file is the first storage status (namely, a migration-free status). When a value of the bit is 1, it indicates that the storage status of the file is a second storage status (namely, a migration-required status). After the management unit in the file storage system determines the expected storage location of the to-be-stored file, for example, determines that the expected storage location of the file is the HDD, when creating metadata of the file, the management unit in the file storage system sets the eleventh bit of the metadata to 0, and sets the storage status of the file to the first storage status. In other words, the twelfth bit of the metadata is set to 0.
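  • The two one-bit fields described above can be sketched as follows. This is a minimal sketch under stated assumptions: the metadata is modeled as an integer, bit positions are counted from 1 starting at the least significant bit (the disclosure does not specify the bit ordering), and all function names are hypothetical.

```python
STORE_TIER_BIT = 11  # eleventh bit: 0 = HDD, 1 = SSD
STATUS_BIT = 12      # twelfth bit: 0 = migration-free, 1 = migration-required


def _mask(position):
    # Assumption: positions are 1-indexed from the least significant bit.
    return 1 << (position - 1)


def set_bit(metadata, position, value):
    """Return metadata with the bit at `position` set to 0 or 1."""
    if value:
        return metadata | _mask(position)
    return metadata & ~_mask(position)


def get_bit(metadata, position):
    return (metadata >> (position - 1)) & 1


def create_metadata(conventional_bits, expected_location):
    """Extend 10-bit conventional metadata with Store tier and Status."""
    if not 0 <= conventional_bits < (1 << 10):
        raise ValueError("conventional metadata must fit in 10 bits")
    metadata = set_bit(conventional_bits, STORE_TIER_BIT, expected_location == "SSD")
    # The storage status defaults to the first (migration-free) status.
    return set_bit(metadata, STATUS_BIT, 0)


def describe(metadata):
    tier = "SSD" if get_bit(metadata, STORE_TIER_BIT) else "HDD"
    status = "migration-required" if get_bit(metadata, STATUS_BIT) else "migration-free"
    return tier, status


md = create_metadata(0b1010101010, "HDD")
print(describe(md))  # ('HDD', 'migration-free')
```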
  • S55: The management unit in the file storage system stores the plurality of data packets of the file in a location indicated by the metadata.
  • After creating the metadata corresponding to the file, the management unit in the file storage system stores the data packets of the file in an expected storage location indicated by the metadata.
  • In this way, when creating the metadata of the file, the file storage system actively determines the expected storage location of the file according to the specified policy, so that the data packets of the file are directly stored in the expected storage location, and a process of scanning the file periodically and performing data migration can be skipped. This can improve performance of the file storage system.
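  • The procedure of S52 to S55 can be sketched as follows. This is an illustrative sketch, not the disclosed implementation: the preset type, name, and user identifier values are hypothetical, the storage spaces and the metadata store are modeled as in-memory dictionaries, and the policy shown (the SSD is chosen when any one of the preset conditions is met) is only one possible preset policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FirstInfo:
    file_type: str
    name: str
    user_id: str


# Hypothetical preset policy values, chosen only for this example.
PRESET_TYPES = {"db"}
PRESET_NAMES = {"index.dat"}
PRESET_USERS = {"vip-user"}


def expected_location(info):
    """S53: pick the faster space (SSD) when any preset condition is met."""
    meets_condition = (
        info.file_type in PRESET_TYPES
        or info.name in PRESET_NAMES
        or info.user_id in PRESET_USERS
    )
    return "SSD" if meets_condition else "HDD"


def store_file(info, packets, spaces, metadata):
    location = expected_location(info)  # S52 and S53
    # S54: the metadata records Store tier and the default first storage status.
    metadata[info.name] = {"store_tier": location, "status": "migration-free"}
    # S55: every data packet goes directly to the location the metadata indicates.
    spaces[location][info.name] = list(packets)
    return location


spaces = {"SSD": {}, "HDD": {}}
metadata = {}
print(store_file(FirstInfo("db", "orders.db", "u1"), [b"p0", b"p1"], spaces, metadata))  # SSD
print(store_file(FirstInfo("log", "trace.log", "u1"), [b"p0"], spaces, metadata))        # HDD
```

  • Because every data packet is written to the expected storage location at the outset, no file in this sketch ever enters the migration-required status.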
  • To avoid slowing down metadata creation, the expected storage location may instead be determined after the metadata is created. FIG. 6 is a flowchart of another example of a data storage method according to this disclosure.
  • The following provides description by using an example in which the method is applied to the storage system shown in FIG. 1. In other words, in the following, the data storage method may be performed by the storage system shown in FIG. 1. In addition, the method may be performed by a communications apparatus. The communications apparatus may be a server in the storage system or a communications apparatus that can support the storage system in implementing a function required for the method. Alternatively, the communications apparatus may be another communications apparatus, for example, a chip system. An implementation of the communications apparatus is not limited herein. For ease of description, the following uses an example in which the method is performed by a management unit in the storage system and the storage system is a file storage system.
  • S61: Another electronic device sends a plurality of data packets of a to-be-stored file to the file storage system, and the management unit in the file storage system obtains the plurality of data packets.
  • S61 is similar to S21.
  • S62: The management unit in the file storage system creates metadata corresponding to the to-be-stored file.
  • In this embodiment of this disclosure, the following two types of information are added to metadata in the conventional technology: an expected storage location of a file and a storage status. In order not to affect a speed of creating metadata, when creating the metadata, the management unit in the file storage system presets an expected storage location of a file. To be specific, for any to-be-stored file, when metadata of the file is created, the expected storage location indicated by the metadata is the same preset location. For example, the preset location may be a first storage space, or may be a second storage space. For ease of description, the following uses an example in which the first storage space is an SSD, the second storage space is an HDD, and the preset location is the HDD. Description of the storage status is the same as that in S54.
  • For example, two fields are added to the metadata to indicate the expected storage location and the storage status of the file. Meanings of a length and a value of each field are the same as those in S54, and details are not described herein again. For example, the preset location is the HDD, and the management unit in the file storage system sets a value of a bit that is in the metadata and that is used to indicate the expected storage location of the file to 0.
  • S63: The management unit in the file storage system stores a part of data packets of the to-be-stored file in a location indicated by the metadata.
  • After creating the metadata corresponding to the file, the management unit in the file storage system stores, in sequence, the data packets of the file in an expected storage location indicated by the metadata, namely, the HDD.
  • S64: The management unit in the file storage system obtains first information of the to-be-stored file.
  • S65: The management unit in the file storage system determines an expected storage location of the to-be-stored file based on the first information of the to-be-stored file and according to a preset policy.
  • S64 and S65 are the same as S52 and S53, and details are not described herein again.
  • S66: When the expected storage location of the file is different from the location indicated by the metadata, the management unit in the file storage system updates the metadata of the file.
  • When determining that the expected storage location of the file is different from the location indicated by the metadata, the management unit in the file storage system updates, by using the determined expected storage location of the file, a value of the field that is in the metadata and that is used to indicate the expected storage location of the file, and updates the storage status of the file in the metadata. For example, when the management unit in the file storage system determines that the expected storage location of the file is the SSD, and the expected storage location indicated by the metadata is the HDD, the expected storage location indicated by the metadata needs to be changed from the HDD to the SSD. To be specific, a value of a bit that is in the metadata and that is used to indicate the expected storage location of the file is set to 1, and the storage status of the file is updated to the second storage status. In other words, a value of a bit that is in the metadata and that is used to indicate the storage status of the file is set to 1.
  • S67: The management unit in the file storage system stores a remaining data packet other than the part of data packets of the to-be-stored file in the location indicated by the metadata.
  • Because the expected storage location of the file indicated by the metadata is changed, a storage location of the file in the file storage system is also changed. For example, the expected storage location of the file indicated by the metadata is changed to the SSD, and the remaining data packet of the file is stored onto the SSD.
  • S68: The management unit in the file storage system scans the metadata, and determines that data migration needs to be performed on the file whose storage status is the second storage status.
  • S69: The management unit in the file storage system migrates the file whose storage status is the second storage status to the expected storage location of the file indicated by the metadata of the file.
  • In the foregoing technical solution, the file storage system determines an expected storage location of a file during file storage. This can ensure that an operation delay of creating metadata of the file is not increased. In addition, in the foregoing technical solution, a part of data packets of the file have been stored in the expected storage location during file storage. Therefore, this can reduce an amount of data that needs to be migrated, and improve storage performance of the storage system.
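  • The procedure of S62 to S69 can be sketched as follows. This is a minimal sketch under stated assumptions: the storage spaces and the metadata catalog are in-memory dictionaries, the function names are hypothetical, and the expected storage location passed to `update_metadata` stands in for the result of S64 and S65.

```python
PRESET_LOCATION = "HDD"  # preset location written into every new metadata record


def create_metadata():
    # S62: no policy is evaluated here, so metadata creation is not delayed.
    return {"store_tier": PRESET_LOCATION, "status": "migration-free"}


def store_packets(spaces, meta, name, packets):
    # S63 and S67: packets go wherever the metadata currently points.
    spaces[meta["store_tier"]].setdefault(name, []).extend(packets)


def update_metadata(meta, expected_location):
    # S66: update only when the determined location differs from the metadata.
    if expected_location != meta["store_tier"]:
        meta["store_tier"] = expected_location
        meta["status"] = "migration-required"


def scan_and_migrate(spaces, catalog):
    # S68 and S69: scan the metadata and migrate files marked migration-required.
    migrated = []
    for name, meta in catalog.items():
        if meta["status"] != "migration-required":
            continue
        target = meta["store_tier"]
        for location, files in spaces.items():
            if location != target and name in files:
                # The early packets precede the later ones, so prepend them.
                spaces[target][name] = files.pop(name) + spaces[target].get(name, [])
        meta["status"] = "migration-free"
        migrated.append(name)
    return migrated


spaces = {"HDD": {}, "SSD": {}}
catalog = {"file_a": create_metadata()}
packets = [f"p{i}" for i in range(10)]

store_packets(spaces, catalog["file_a"], "file_a", packets[:2])  # S63: lands on the HDD
update_metadata(catalog["file_a"], "SSD")                        # S64 to S66
store_packets(spaces, catalog["file_a"], "file_a", packets[2:])  # S67: lands on the SSD
print(scan_and_migrate(spaces, catalog))                         # ['file_a']
print(len(spaces["SSD"]["file_a"]))                              # 10
```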
  • The foregoing embodiments of this disclosure describe the method provided in the embodiments of this disclosure from a perspective of interaction between the storage system and the server. To implement functions in the method provided in the embodiments of this disclosure, the storage system may include a hardware structure and/or a software module, and implement the functions in a form of the hardware structure, the software module, or a combination of the hardware structure and the software module. Whether a function in the foregoing functions is performed in a form of the hardware structure, the software module, or both the hardware structure and the software module depends on a specific application and a design constraint condition of the technical solutions.
  • FIG. 7 is a schematic structural diagram of a data storage apparatus 700. The data storage apparatus 700 may be applied to a storage system or an apparatus in a storage system, and can implement a function of the storage system in the method provided in the embodiments of this disclosure. The data storage apparatus 700 may alternatively be an apparatus that can support the storage system in implementing a function of the storage system in the method provided in the embodiments of this disclosure. The data storage apparatus 700 may be a hardware structure, a software module, or a combination of a hardware structure and a software module. The data storage apparatus 700 may be implemented by a chip system. In this embodiment of this disclosure, the chip system may include a chip, or may include a chip and another discrete component.
  • The data storage apparatus 700 may include a communications module 701 and a processing module 702.
  • The communications module 701 may be configured to perform step S21 in the embodiment shown in FIG. 2, and/or configured to perform step S31 in the embodiment shown in FIG. 3, and/or configured to perform step S41 in the embodiment shown in FIG. 4, and/or configured to perform step S51 in the embodiment shown in FIG. 5, and/or configured to perform step S61 in the embodiment shown in FIG. 6, and/or configured to support another process of the technology described in this specification. The communications module 701 is used by the data storage apparatus 700 to communicate with another module, and may be a circuit, a component, an interface, a bus, a software module, a transceiver, or any other apparatus that can implement communication.
  • The processing module 702 may be configured to perform step S22 to step S26 in the embodiment shown in FIG. 2, and/or configured to perform step S32 to step S38 in the embodiment shown in FIG. 3, and/or configured to perform step S42 to step S49 in the embodiment shown in FIG. 4, and/or configured to perform step S52 to step S55 in the embodiment shown in FIG. 5, and/or configured to perform step S62 to step S69 in the embodiment shown in FIG. 6, and/or configured to support another process of the technology described in this specification.
  • All related content of the steps in the foregoing method embodiments may be cited in function descriptions of corresponding function modules.
  • Division into modules in the embodiments of this disclosure is an example, is only logical function division, and may be other division in an actual implementation. In addition, function modules in the embodiments of this disclosure may be integrated into one processor, or may exist alone physically, or two or more modules are integrated into one module. The integrated module may be implemented in a form of hardware, or may be implemented in a form of a software functional module.
  • As shown in FIG. 8, an embodiment of this disclosure provides a data storage apparatus 800. The data storage apparatus 800 may be the storage system in the embodiments shown in FIG. 2 to FIG. 6, or an apparatus in the storage system, and can implement a function of the storage system in the embodiments of this disclosure shown in FIG. 2 to FIG. 6. The data storage apparatus 800 may alternatively be an apparatus that can support the storage system in implementing a function of the storage system in the method provided in the embodiments of this disclosure shown in FIG. 2 to FIG. 6. The data storage apparatus 800 may be a chip system. In this embodiment of this disclosure, the chip system may include a chip, or may include a chip and another discrete component.
  • The data storage apparatus 800 includes at least one processor 820, configured to implement or support the data storage apparatus 800 in implementing a function of the management unit in the storage system in the embodiments of this disclosure shown in FIG. 2 to FIG. 6. For example, the processor 820 may obtain first information of to-be-stored data, and determine an expected storage location of the to-be-stored data based on the first information and according to a preset policy. For details, refer to detailed descriptions in the method example.
  • The data storage apparatus 800 may further include at least one memory 830, configured to store program instructions and/or data. The memory 830 is coupled to the processor 820. Coupling in this embodiment of this disclosure is an indirect coupling or a communication connection between apparatuses, units, or modules, may be in an electrical, a mechanical, or another form, and is used for information exchange between the apparatuses, the units, or the modules. The processor 820 may operate with the memory 830. The processor 820 may execute the program instructions stored in the memory 830. At least one of the at least one memory may be included in the processor. When executing the program instructions in the memory 830, the processor 820 can implement the method shown in FIG. 2 to FIG. 6.
  • The data storage apparatus 800 may further include a communications interface 810, configured to communicate with another device through a transmission medium, so that the data storage apparatus 800 can communicate with the other device. For example, the other device may be a server. The processor 820 may send and receive data through the communications interface 810.
  • In this embodiment of this disclosure, a specific connection medium between the communications interface 810, the processor 820, and the memory 830 is not limited. In this embodiment of this disclosure, the memory 830, the processor 820, and the communications interface 810 are connected through a bus 840 in FIG. 8, and the bus is represented by a thick line in FIG. 8. A connection manner between other components is schematically described, and is not limited thereto. The bus may be classified into an address bus, a data bus, a control bus, or the like. For ease of representation, only one thick line is used to represent the bus in FIG. 8, but this does not mean that there is only one bus or only one type of bus.
  • In the embodiments of this disclosure, the processor 820 may be a general-purpose processor, a digital signal processor, an ASIC, an FPGA or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or perform the methods, steps, and logical block diagrams disclosed in the embodiments of this disclosure. The general-purpose processor may be a microprocessor or any conventional processor or the like. The steps of the method disclosed with reference to the embodiments of this disclosure may be directly performed by a hardware processor, or may be performed by using a combination of hardware in the processor and a software module.
  • In this embodiment of this disclosure, the memory 830 may be a non-volatile memory, such as a hard disk drive (HDD) or a solid-state drive (SSD), or may be a volatile memory, such as a random access memory (RAM). Alternatively, the memory may be any other medium that can be used to include or store expected program code in a form of an instruction or a data structure and that can be accessed by a computer. However, this is not limited thereto. The memory in the embodiments of this disclosure may alternatively be a circuit or any other apparatus that can implement a storage function, and is configured to store program instructions and/or data.
  • An embodiment of this disclosure further provides a computer-readable storage medium including instructions. When the instructions are run on a computer, the computer is enabled to perform the method implemented by the storage system in the embodiments shown in FIG. 2 to FIG. 6.
  • An embodiment of this disclosure further provides a computer program product including instructions. When the instructions are run on a computer, the computer is enabled to perform the method implemented by the storage system in the embodiments shown in FIG. 2 to FIG. 6.
  • An embodiment of this disclosure provides a chip system. The chip system includes a processor and may further include a memory, and is configured to implement a function of the storage system in the foregoing method. The chip system may include a chip, or may include a chip and another discrete component.
  • All or some of the foregoing methods in the embodiments of this disclosure may be implemented by using software, hardware, firmware, or any combination thereof. When the software is used to implement the embodiments, all or some of the embodiments may be implemented in a form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on the computer, the procedure or functions according to the embodiments of this disclosure are all or partially generated. The computer may be a general-purpose computer, a special-purpose computer, a computer network, a network device, a user device, or another programmable apparatus. The computer instruction may be stored in a computer-readable storage medium or may be transmitted from one computer-readable storage medium to another computer-readable storage medium. For example, the computer instruction may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired (for example, through a coaxial cable, an optical fiber, or a digital subscriber line (DSL)) or wireless (for example, infrared, radio, or microwave) manner. The computer-readable storage medium may be any usable medium accessible by a computer, or a data storage device, such as a server or a data center, integrating one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a digital video disc (DVD)), a semiconductor medium (for example, an SSD), or the like.
  • It is clear that a person skilled in the art can make various modifications and variations to this disclosure without departing from the scope of this disclosure. This disclosure is intended to cover these modifications and variations of this disclosure provided that they fall within the scope of protection defined by the following claims and their equivalent technologies.

Claims (15)

What is claimed is:
1. A data storage method, comprising:
obtaining first information of to-be-stored data, wherein the first information comprises at least one piece of information, and wherein the information comprises one or more of a type of the to-be-stored data, a name of the to-be-stored data, and a user identifier corresponding to the to-be-stored data;
determining an expected storage location of the to-be-stored data based on the first information, wherein the determining comprises determining that the expected storage location is a first storage space when the first information meets a condition and determining that the expected storage location is a second storage space when the first information does not meet the condition, read/write performance of the first storage space being higher than read/write performance of the second storage space; and
storing at least one data packet in a plurality of data packets of the to-be-stored data in the expected storage location.
2. The method according to claim 1, wherein the condition comprises at least one of the following conditions:
the type of the to-be-stored data is the same as a preset type;
the name of the to-be-stored data is the same as a preset name; and
the user identifier corresponding to the to-be-stored data is the same as a preset user identifier.
3. The method according to claim 1, the storing at least one data packet in a plurality of data packets of the to-be-stored data in the expected storage location comprises storing each data packet in the plurality of data packets in the expected storage location.
4. The method according to claim 1, the method further comprises:
storing a first packet in the plurality of data packets of the to-be-stored data in a first location, wherein the first location is different from the expected storage location; and
storing a second data packet in the plurality of data packets in the expected storage location.
5. The method according to claim 4, wherein the method further comprises migrating the first data packet from the first location to the expected storage location.
6. A data storage apparatus, comprising:
a communications interface;
a processor coupled to the communications interface and configured to:
obtain first information of to-be-stored data, wherein the first information comprises at least one piece of information, and wherein the information comprises one or more of a type of the to-be-stored data, a name of the to-be-stored data, and a user identifier corresponding to the to-be-stored data;
determine an expected storage location of the to-be-stored data based on the first information, wherein the determining comprises determining that the expected storage location is a first storage space when the first information meets a condition and determining that the expected storage location is a second storage space when the first information does not meet the condition, read/write performance of the first storage space being higher than read/write performance of the second storage space; and
store, in the expected storage location, at least one data packet in a plurality of data packets of the to-be-stored data received through the communications interface.
7. The apparatus according to claim 6, wherein the condition comprises at least one of the following conditions:
the type of the to-be-stored data is the same as a preset type;
the name of the to-be-stored data is the same as a preset name; and
the user identifier corresponding to the to-be-stored data is the same as a preset user identifier.
8. The apparatus according to claim 6, wherein the processor is configured to store each data packet in the plurality of data packets in the expected storage location.
9. The apparatus according to claim 6, wherein the processor is further configured to:
store a first packet in the plurality of data packets of the to-be-stored data in a first location, wherein the first location is different from the expected storage location; and
store a second data packet in the plurality of data packets in the expected storage location.
10. The apparatus according to claim 9, wherein the processor is further configured to migrate the part of data packets of the to-be-stored data from the first location to the expected storage location.
11. A non-volatile computer-readable storage medium, wherein the medium stores instructions, and when the instructions are run on a computer, the computer is enabled to perform steps of:
obtaining first information of to-be-stored data, wherein the first information comprises at least one piece of information, and wherein the information comprises one or more of a type of the to-be-stored data, a name of the to-be-stored data, and a user identifier corresponding to the to-be-stored data;
determining an expected storage location of the to-be-stored data based on the first information, wherein the determining comprises determining that the expected storage location is a first storage space when the first information meets a condition and determining that the expected storage location is a second storage space when the first information does not meet the condition, read/write performance of the first storage space being higher than read/write performance of the second storage space; and
storing at least one data packet in a plurality of data packets of the to-be-stored data in the expected storage location.
12. The non-volatile computer-readable storage medium according to claim 11, wherein the condition comprises at least one of the following conditions:
the type of the to-be-stored data is the same as a preset type;
the name of the to-be-stored data is the same as a preset name; and
the user identifier corresponding to the to-be-stored data is the same as a preset user identifier.
13. The non-volatile computer-readable storage medium according to claim 11, wherein the storing at least one data packet in a plurality of data packets of the to-be-stored data in the expected storage location comprises storing each data packet in the plurality of data packets in the expected storage location.
14. The non-volatile computer-readable storage medium according to claim 11, wherein the computer is further enabled to perform steps of:
storing a first data packet in the plurality of data packets of the to-be-stored data in a first location, wherein the first location is different from the expected storage location; and
storing a second data packet in the plurality of data packets in the expected storage location.
15. The non-volatile computer-readable storage medium according to claim 14, wherein the computer is further enabled to perform a step of migrating the first data packet from the first location to the expected storage location.
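The flow recited in claims 11 to 15 can be illustrated with a minimal Python sketch. It is not the patented implementation: the names FirstInfo, Condition, Storage, FIRST_SPACE, and SECOND_SPACE are hypothetical, the storage spaces are modeled as in-memory lists, and the sketch assumes that matching any one preset value from claim 12 is enough to meet the condition.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set

# Hypothetical labels for the two storage spaces named in claim 11.
FIRST_SPACE = "first_storage_space"    # higher read/write performance
SECOND_SPACE = "second_storage_space"  # lower read/write performance


@dataclass(frozen=True)
class FirstInfo:
    """First information: any of type, name, and user identifier."""
    data_type: Optional[str] = None
    name: Optional[str] = None
    user_id: Optional[str] = None


@dataclass
class Condition:
    """Preset values from claim 12 (assumption: any single match suffices)."""
    preset_types: Set[str] = field(default_factory=set)
    preset_names: Set[str] = field(default_factory=set)
    preset_user_ids: Set[str] = field(default_factory=set)

    def is_met(self, info: FirstInfo) -> bool:
        return (
            info.data_type in self.preset_types
            or info.name in self.preset_names
            or info.user_id in self.preset_user_ids
        )


def expected_storage_location(info: FirstInfo, condition: Condition) -> str:
    """Claim 11: first space if the condition is met, otherwise second space."""
    return FIRST_SPACE if condition.is_met(info) else SECOND_SPACE


class Storage:
    """Toy stand-in for the storage spaces; each space is a list of packets."""

    def __init__(self) -> None:
        self.spaces: Dict[str, List[bytes]] = {FIRST_SPACE: [], SECOND_SPACE: []}

    def store(self, location: str, packet: bytes) -> None:
        self.spaces[location].append(packet)

    def migrate(self, packet: bytes, source: str, target: str) -> None:
        """Claim 15: move a packet from its first location to the target."""
        self.spaces[source].remove(packet)
        self.spaces[target].append(packet)


# Usage: the type matches a preset type, so the expected location is FIRST_SPACE.
condition = Condition(preset_types={"database_log"})
info = FirstInfo(data_type="database_log", user_id="user-7")
target = expected_storage_location(info, condition)

storage = Storage()
storage.store(SECOND_SPACE, b"packet-1")  # claim 14: first location differs from target
storage.store(target, b"packet-2")        # claim 14: second packet goes to the target
storage.migrate(b"packet-1", SECOND_SPACE, target)  # claim 15: migrate the first packet
```

In this sketch the first data packet is written before it reaches the expected storage location, which mirrors the case in claims 14 and 15 where only some packets land there directly and the rest are migrated afterwards.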
US17/325,287 2018-11-21 2021-05-20 Data storage method and apparatus Active US11550486B2 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN201811394013.6 2018-11-21
CN201811394013.6A CN111208934B (en) 2018-11-21 2018-11-21 Data storage method and device
PCT/CN2019/115215 WO2020103679A1 (en) 2018-11-21 2019-11-04 Data storage method and apparatus

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/115215 Continuation WO2020103679A1 (en) 2018-11-21 2019-11-04 Data storage method and apparatus

Publications (2)

Publication Number Publication Date
US20210271405A1 (en) 2021-09-02
US11550486B2 (en) 2023-01-10

Family

ID=70773673

Family Applications (1)

Application Number Title Priority Date Filing Date
US17/325,287 Active US11550486B2 (en) 2018-11-21 2021-05-20 Data storage method and apparatus

Country Status (4)

Country Link
US (1) US11550486B2 (en)
EP (1) EP3869313A4 (en)
CN (1) CN111208934B (en)
WO (1) WO2020103679A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113835630A (en) * 2021-09-15 2021-12-24 联泰集群(北京)科技有限责任公司 Data storage method, device, data server, storage medium and system

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112416858A (en) * 2020-11-09 2021-02-26 深圳市珍爱捷云信息技术有限公司 Document storage method and device, electronic equipment and computer readable storage medium
CN112506435B (en) * 2020-12-12 2024-04-02 南京地铁建设有限责任公司 Data hierarchical storage method and system applied to escalator
CN113986116A (en) * 2021-09-07 2022-01-28 广东珠江智联信息科技股份有限公司 Distributed storage system and data management method based on distributed storage system
CN115840543B (en) * 2023-02-28 2023-05-16 浪潮电子信息产业股份有限公司 Data hierarchical storage method, device, equipment and storage medium

Family Cites Families (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2714745A1 (en) * 2008-02-12 2009-08-20 Netapp, Inc. Hybrid media storage system architecture
CN101281542B (en) * 2008-05-09 2011-04-20 成都市华为赛门铁克科技有限公司 Method and device for storing file
US8566549B1 (en) * 2008-12-31 2013-10-22 Emc Corporation Synchronizing performance requirements across multiple storage platforms
CN102541475B (en) * 2012-03-12 2015-02-04 华为数字技术(成都)有限公司 Data storage method and data storage device
CN103457963B (en) * 2012-05-28 2017-06-27 联想(北京)有限公司 The method and distributed memory system of storage file
GB2505185A (en) * 2012-08-21 2014-02-26 Ibm Creating a backup image of a first memory space in a second memory space.
GB2506623A (en) * 2012-10-04 2014-04-09 Ibm Managing user files in a tiered storage system
US20140281301A1 (en) * 2013-03-15 2014-09-18 Silicon Graphics International Corp. Elastic hierarchical data storage backend
CN104598495A (en) * 2013-10-31 2015-05-06 南京中兴新软件有限责任公司 Hierarchical storage method and system based on distributed file system
JP6005116B2 (en) * 2014-09-30 2016-10-12 インターナショナル・ビジネス・マシーンズ・コーポレーションInternational Business Machines Corporation Automated migration of files recalled by specific applications
CN104283960B (en) * 2014-10-15 2018-11-20 福建亿榕信息技术有限公司 Realize the virtualization integration of heterogeneous network storage and the system of differentiated control
CN106294421B (en) * 2015-05-25 2020-02-04 阿里巴巴集团控股有限公司 Data writing and reading method and device
CN106326487B (en) * 2016-09-05 2019-12-27 天脉聚源(北京)科技有限公司 Data storage method and device
CN106484330A (en) * 2016-09-27 2017-03-08 郑州云海信息技术有限公司 A kind of hybrid magnetic disc individual-layer data optimization method and device
CN107729182B (en) * 2017-10-11 2020-12-04 苏州乐麟无线信息科技有限公司 Data storage and access method and device
CN108268217B (en) * 2018-01-10 2021-04-30 北京航天云路有限公司 Hierarchical storage method based on time sequence data cold and hot classification
CN108846064B (en) * 2018-06-06 2021-07-23 南京群顶科技有限公司 Method for realizing dynamic chained storage cluster based on ceph

Also Published As

Publication number Publication date
US11550486B2 (en) 2023-01-10
CN111208934B (en) 2021-07-09
WO2020103679A1 (en) 2020-05-28
CN111208934A (en) 2020-05-29
EP3869313A1 (en) 2021-08-25
EP3869313A4 (en) 2021-12-22

Similar Documents

Publication Publication Date Title
US11550486B2 (en) Data storage method and apparatus
US10346081B2 (en) Handling data block migration to efficiently utilize higher performance tiers in a multi-tier storage environment
US9830101B2 (en) Managing data storage in a set of storage systems using usage counters
US9613040B2 (en) File system snapshot data management in a multi-tier storage environment
US11397538B2 (en) Data migration method and apparatus
US10671285B2 (en) Tier based data file management
US7010657B2 (en) Avoiding deadlock between storage assignments by devices in a network
US10552089B2 (en) Data processing for managing local and distributed storage systems by scheduling information corresponding to data write requests
US20220164316A1 (en) Deduplication method and apparatus
US8578119B2 (en) File system quota and reservation
EP4036732A1 (en) Verification data calculation method and device
US11226778B2 (en) Method, apparatus and computer program product for managing metadata migration
CN109144403B (en) Method and equipment for switching cloud disk modes
WO2021018052A1 (en) Garbage collection method and apparatus
EP4087212A1 (en) Method and apparatus for cloning file system
CN106933497B (en) Management scheduling device, system and method based on SAS
WO2021063242A1 (en) Metadata transmission method of storage system, and storage system
CN116431566B (en) Data migration method, device, electronic equipment and medium
WO2023116438A1 (en) Data access method and apparatus, and device
US9270530B1 (en) Managing imaging of multiple computing devices

Legal Events

Date Code Title Description
FEPP Fee payment procedure

Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

AS Assignment

Owner name: HUAWEI TECHNOLOGIES CO., LTD., CHINA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:YAN, HAITAO;LIN, LIN;ZHANG, MINGQIAN;REEL/FRAME:056698/0947

Effective date: 20210628

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS

STPP Information on status: patent application and granting procedure in general

Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED

STCF Information on status: patent grant

Free format text: PATENTED CASE