WO2022257615A1 - Information processing method and apparatus, and storage medium - Google Patents

Publication number
WO2022257615A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
access frequency
access
type
storage
Prior art date
Application number
PCT/CN2022/088113
Other languages
French (fr)
Chinese (zh)
Inventor
郑思城
吴怡燃
王英杰
Original Assignee
北京沃东天骏信息技术有限公司
北京京东世纪贸易有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京沃东天骏信息技术有限公司, 北京京东世纪贸易有限公司
Publication of WO2022257615A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0647 Migration mechanisms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0673 Single storage device
    • G06F3/068 Hybrid storage device

Definitions

  • The present application relates to the technical field of information processing, and in particular to an information processing method and device, and a storage medium.
  • In the existing approach, there are a primary cluster with high storage performance, such as a solid state disk (Solid State Disk, SSD) cluster, and a standby cluster with low storage performance, such as a hard disk drive (Hard Disk Drive, HDD) cluster.
  • Data is first stored in the primary cluster and, once it is manually determined that the heat of the data has dropped, is manually copied from the primary cluster to the standby cluster for storage.
  • Because both the heat of the data and its storage location are determined manually, the intelligence of data storage is reduced.
  • The embodiments of the present application are intended to provide an information processing method and device, and a storage medium, which can improve the intelligence of storing the stored data.
  • An embodiment of the present application provides an information processing method, including:
  • each type of data in the at least one type of data corresponds to a storage category
  • Based on at least one storage category corresponding to the at least one type of data, determining at least one storage system corresponding to the at least one type of data, and transferring the at least one type of data to the at least one storage system respectively; wherein each type of data corresponds to one storage system.
  • An embodiment of the present application provides an information processing device, and the device includes:
  • the detecting part is configured to detect the data access frequency corresponding to the stored data in the internal memory when it is detected that the remaining storage capacity in the internal memory is less than or equal to the preset capacity lower limit threshold;
  • the dividing part is configured to divide the stored data into at least one type of data based on the data access frequency; each type of data in the at least one type of data corresponds to a storage category;
  • the determining part is configured to determine at least one storage system corresponding to the at least one type of data based on at least one storage class corresponding to the at least one type of data;
  • the dumping part is configured to dump the at least one type of data to the at least one storage system respectively; wherein, each type of data corresponds to a storage system.
  • An embodiment of the present application provides an information processing device, and the device includes:
  • a memory, a processor, and a communication bus; the memory communicates with the processor through the communication bus, and the memory stores an information processing program executable by the processor; when the information processing program is executed, the processor performs the information processing method described above.
  • An embodiment of the present application provides a storage medium on which a computer program is stored, applied to an information processing device; when the computer program is executed by a processor, the information processing method described above is implemented.
  • An embodiment of the present application provides an information processing method and device, and a storage medium.
  • The information processing method includes: detecting the data access frequency corresponding to the stored data in the memory; dividing the stored data into at least one type of data based on the data access frequency, where each type of data in the at least one type of data corresponds to a storage category; and, based on at least one storage category corresponding to the at least one type of data, determining at least one storage system corresponding to the at least one type of data and transferring the at least one type of data to the at least one storage system respectively, where each type of data corresponds to one storage system.
  • Because the information processing device detects the data access frequency corresponding to the stored data in the memory, it can divide the stored data into at least one type of data based on the data access frequency, determine at least one storage system corresponding to the at least one type of data based on the corresponding at least one storage category, and transfer the at least one type of data to the at least one storage system, where each type of data corresponds to one storage system. The heat of the stored data and the corresponding storage location no longer need to be determined manually, and the stored data no longer needs to be dumped manually, which improves the intelligence of storing the stored data.
  • FIG. 1 is a flow chart of an information processing method provided by an embodiment of the present application.
  • FIG. 2 is a first exemplary architecture diagram of an information processing device provided by an embodiment of the present application.
  • FIG. 3 is a second exemplary architecture diagram of an information processing device provided by an embodiment of the present application.
  • FIG. 4 is a third exemplary architecture diagram of an information processing device provided by an embodiment of the present application.
  • FIG. 5 is a fourth exemplary architecture diagram of an information processing device provided by an embodiment of the present application.
  • FIG. 6 is a first schematic structural diagram of an information processing device provided by an embodiment of the present application.
  • FIG. 7 is a second schematic structural diagram of an information processing device provided by an embodiment of the present application.
  • FIG. 1 is a flow chart of an information processing method provided in the embodiment of the present application. As shown in FIG. 1, the information processing method may include:
  • An information processing method provided in an embodiment of the present application is applicable to a scenario of dumping stored data.
  • The information processing device may be implemented in various forms.
  • For example, the information processing devices described in this application may include devices such as mobile phones, cameras, tablet computers, notebook computers, palmtop computers, personal digital assistants (Personal Digital Assistant, PDA), portable media players (Portable Media Player, PMP), navigation devices, wearable devices, smart bracelets, and pedometers, as well as devices such as digital TVs, desktop computers, and servers.
  • the stored data may be log data, order data of the product or other data information of the product, which can be determined according to the actual situation, which is not limited in the embodiment of the present application.
  • the amount of stored data may be determined according to actual conditions, which is not limited in the embodiment of the present application.
  • The quantity of stored data corresponds one-to-one to the quantity of data access frequencies, that is, one piece of stored data corresponds to one data access frequency.
  • The number of data access frequencies can be one, two, or more; the specific number can be determined according to the actual situation, which is not limited in the embodiment of the present application.
  • When the information processing device receives the stored data, it first stores the stored data in the internal memory; then, when the remaining storage capacity of the internal memory is less than the preset capacity lower limit threshold, it detects the data access frequency corresponding to the stored data in the internal memory.
  • The information processing device can detect the data access frequency corresponding to the stored data in the internal memory when the remaining storage capacity of the internal memory is less than the preset capacity lower limit threshold; the information processing device can also detect the data access frequency corresponding to the stored data in the internal memory once every preset time period; the specific condition under which the information processing device detects the data access frequency can be determined according to the actual situation, which is not limited in this embodiment of the present application.
  • The preset capacity lower limit threshold can be a threshold configured in the information processing device, a threshold obtained by the information processing device before it receives the stored data, or a threshold acquired by the information processing device in other ways; specifically, it may be determined according to the actual situation, which is not limited in this embodiment of the present application.
  • the preset capacity lower limit threshold may be 0, that is, the memory is full.
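  • The trigger condition described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the class `MemoryBuffer` and its fields are hypothetical names, and the comparison uses "less than or equal to" so that a threshold of 0 corresponds to a full memory.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryBuffer:
    """Hypothetical in-memory staging area for newly received stored data."""
    capacity_bytes: int
    lower_limit_bytes: int = 0  # preset capacity lower limit threshold (0 = memory full)
    items: dict = field(default_factory=dict)

    def used_bytes(self) -> int:
        return sum(len(payload) for payload in self.items.values())

    def remaining_bytes(self) -> int:
        return self.capacity_bytes - self.used_bytes()

    def write(self, key: str, payload: bytes) -> bool:
        """Store the payload and report whether access-frequency detection should start.

        The sketch does not guard against writing more than the capacity.
        """
        self.items[key] = payload
        return self.remaining_bytes() <= self.lower_limit_bytes


buf = MemoryBuffer(capacity_bytes=8)
first = buf.write("a", b"1234")   # 4 bytes remain, so detection is not triggered
second = buf.write("b", b"5678")  # 0 bytes remain, so detection is triggered
```

  • A periodic trigger (every preset time period) could call the same detection step from a timer instead of from `write`.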
  • The preset time period may be a time period configured in the information processing device, a time period obtained by the information processing device before it receives the stored data, or a time period acquired by the information processing device in other ways; the specific time period may be determined according to the actual situation, which is not limited in this embodiment of the present application.
  • The data access frequency may be an access frequency range that identifies the category of the stored data. For example, if the data category of the stored data is the hot data category, the data access frequency may be a frequency band greater than or equal to the preset access frequency threshold; if the data category of the stored data is the cold data category, the data access frequency may be a frequency band less than the preset access frequency threshold.
  • The preset access frequency threshold can be a frequency threshold configured in the information processing device or a frequency threshold obtained by the information processing device in other ways, which can be determined according to the actual situation and is not limited in the embodiment of the present application.
  • Before detecting the data access frequency corresponding to the stored data in the internal memory upon detecting that the remaining storage capacity in the internal memory is less than or equal to the preset capacity lower limit threshold, the information processing device also receives the data identifier corresponding to the stored data. Correspondingly, the at least one type of data includes the first data and/or the second data, and the process of detecting the data access frequency includes: the information processing device filters out the hot data identifier from the data identifiers and uses the hot data identifier to identify the first access frequency corresponding to the first data; and/or the information processing device filters out the cold data identifier from the data identifiers and uses the cold data identifier to identify the second access frequency corresponding to the second data; the information processing device then uses the first access frequency and/or the second access frequency as the data access frequency.
  • The data identifier includes a hot data identifier and a cold data identifier: the hot data identifier identifies the first data as belonging to the hot data category, and the cold data identifier identifies the second data as belonging to the cold data category.
  • The first access frequency may be a part of the data access frequency or the whole of it, and the same holds for the second access frequency. If the first access frequency and the second access frequency are each a part of the data access frequency, then together they constitute the data access frequency.
  • The first access frequency may be a frequency band greater than or equal to the preset access frequency threshold; the second access frequency may be a frequency band less than the preset access frequency threshold.
  • When the information processing device determines that the data identifier of the first data in the stored data is a hot data identifier, it uses the hot data identifier as the data identifier of the first data; when it determines that the data identifier of the second data in the stored data is a cold data identifier, it uses the cold data identifier as the data identifier of the second data.
  • The first access frequency may be the total number of times the first data is accessed within a period of time, or the number of times the first data is accessed per second.
  • The second access frequency may be the total number of times the second data is accessed within a period of time, or the number of times the second data is accessed per second.
  • The period of time can be 3 months, 6 months, 1 year, etc.; the specific period of time can be determined according to the actual situation, which is not limited in the embodiment of the present application.
  • If the first access frequency is the total number of accesses of the first data within a period of time, it may be the total number of accesses of the first data within 6 months (for example, 1000 accesses). If the first access frequency is the number of times the first data is accessed per second, it may be 1000 qps (one thousand requests per second), or 1000 qps sustained for 5 minutes.
  • If the second access frequency is the total number of accesses of the second data within a period of time, it may be the total number of accesses of the second data within 6 months (for example, 1000 accesses). If the second access frequency is the number of times the second data is accessed per second, it may be 1000 qps (one thousand requests per second), or 1000 qps sustained for 5 minutes.
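  • The two ways of expressing an access frequency (a total count within a period, or a per-second rate) can be illustrated with a short sketch. The function names and the sample access log are hypothetical; the text does not specify how accesses are recorded, so a sorted list of access timestamps in seconds is assumed here.

```python
from bisect import bisect_left


def total_accesses(timestamps, start, end):
    """Count accesses with start <= t < end; timestamps must be sorted (in seconds)."""
    return bisect_left(timestamps, end) - bisect_left(timestamps, start)


def accesses_per_second(timestamps, start, end):
    """Average number of accesses per second over the window [start, end)."""
    return total_accesses(timestamps, start, end) / (end - start)


# Hypothetical access log for one piece of stored data.
access_log = [0.1, 0.4, 0.9, 1.2, 1.8, 2.5, 3.0, 3.3]
window_total = total_accesses(access_log, 0.0, 2.0)      # total count in a 2-second window
window_rate = accesses_per_second(access_log, 0.0, 2.0)  # the same window as a rate
```

  • Whichever form is chosen, the preset access frequency threshold has to be expressed in the same unit, otherwise the comparison against it is meaningless.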
  • the information processing device may directly receive the data identifier transmitted by the user; the information processing device may also create a hot and cold table according to the instruction transmitted by the user, and determine the data identifier corresponding to the stored data from the hot and cold table.
  • the way for the information processing device to create the hot and cold tables can be as follows:
  • the information processing device may divide the stored data into at least one type of data based on the data access frequency.
  • At least one type of data includes first data of a hot data type and/or second data of a cold data type.
  • The information processing device can be a distributed non-relational (NoSQL) system in which the HDFS address of the distributed file system (Hadoop Distributed File System, HDFS) and the CFS address of the cloud file system (Cloud File System Service, CFS) are stored. The distributed NoSQL system can load the HDFS system through the HDFS address and load the CFS system through the CFS address, that is, the distributed NoSQL system is set with the HDFS system and the CFS system.
  • When the distributed NoSQL system starts, it automatically loads the HDFS system and the CFS system.
  • A management node (HMaster) is set in the distributed NoSQL system, which is responsible for table management (addition, deletion, modification, query), region management, file system initialization, and so on.
  • The distributed NoSQL system is also equipped with an organizing module (Compact), which is responsible for monitoring the access frequency of files and formulating rules; when the life cycle of the data meets the rules set by the business, the data flows between storage systems (periodically checking and analyzing files, counting file access frequency, checking whether the rules are met, and automatically migrating data).
  • The HDFS system and the CFS system are used to store different types of stored data respectively: the HDFS system uses all-SSD disks, and the CFS system uses cloud disks.
  • The information processing device is provided with a file system interface (FileSystemInterface) for routing and forwarding; using the file system interface, the information processing device can select the storage system (HDFS system or CFS system) that matches the data identifier of the stored data, and thereby transfer the stored data to the HDFS system or the CFS system.
  • The storage category includes a hot data storage category and a cold data storage category; the storage system corresponding to the hot data storage category is a high-frequency access storage system, and the storage system corresponding to the cold data storage category may be a low-frequency access storage system.
  • The high-frequency access storage system may be an HDFS system, and the low-frequency access storage system may be a CFS system.
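  • The routing role of the file system interface can be sketched as a lookup from storage category to storage system. Everything below is an assumption made for illustration: `FileSystemRouter`, `InMemoryStore`, and `StorageCategory` are hypothetical names, and the stores are plain in-memory stand-ins rather than real HDFS or CFS clients.

```python
from enum import Enum


class StorageCategory(Enum):
    HOT = "hot"    # hot data storage category
    COLD = "cold"  # cold data storage category


class InMemoryStore:
    """Stand-in for a storage system client; not a real HDFS or CFS client."""

    def __init__(self, name):
        self.name = name
        self.blobs = {}

    def put(self, key, payload):
        self.blobs[key] = payload


class FileSystemRouter:
    """Hypothetical file system interface: one storage system per storage category."""

    def __init__(self, hot_store, cold_store):
        self._routes = {StorageCategory.HOT: hot_store, StorageCategory.COLD: cold_store}

    def select(self, category):
        return self._routes[category]

    def dump(self, key, payload, category):
        store = self.select(category)
        store.put(key, payload)
        return store.name


hdfs = InMemoryStore("hdfs")  # plays the high-frequency access storage system
cfs = InMemoryStore("cfs")    # plays the low-frequency access storage system
router = FileSystemRouter(hot_store=hdfs, cold_store=cfs)
hot_target = router.dump("order-1", b"recent order", StorageCategory.HOT)
cold_target = router.dump("log-2019", b"old log", StorageCategory.COLD)
```

  • Keeping the mapping in one place means a further storage category would only need another entry in the routing table.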
  • Each type of data in the at least one type of data corresponds to a storage category, including: the first data of the hot data category corresponds to the hot data storage category, and the second data of the cold data category corresponds to the cold data storage category.
  • The information processing device receives the data identifier corresponding to the stored data; it filters out the hot data identifier from the data identifiers and uses the hot data identifier to identify the first access frequency corresponding to the first data; and/or it filters out the cold data identifier from the data identifiers and uses the cold data identifier to identify the second access frequency corresponding to the second data; it uses the first access frequency and/or the second access frequency as the data access frequency, and then divides the stored data into at least one type of data based on the data access frequency.
  • The information processing device can filter out the first access frequency corresponding to the hot data identifier from the data access frequency, and determine the first data corresponding to the first access frequency from the stored data; and/or the information processing device filters out the second access frequency corresponding to the cold data identifier from the data access frequency, and determines the second data corresponding to the second access frequency from the stored data.
  • the storage tag of the data is identified in the hot and cold table, and the information processing device may also determine the storage system of the data from the storage tag.
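  • Division by data identifier can be sketched with a small lookup table. The layout of `hot_cold_table` below is purely hypothetical (the text does not give the table's schema); it only assumes that each entry carries a hot or cold data identifier and a storage tag.

```python
HOT, COLD = "HOT", "COLD"

# Hypothetical hot and cold table: data key -> (data identifier, storage tag).
hot_cold_table = {
    "orders_2022": (HOT, "HDFS"),
    "orders_2019": (COLD, "CFS"),
    "click_log": (HOT, "HDFS"),
}


def split_by_identifier(stored_keys, table):
    """Return (first_data, second_data): keys marked hot and keys marked cold."""
    first_data, second_data = [], []
    for key in stored_keys:
        identifier, _storage_tag = table[key]
        (first_data if identifier == HOT else second_data).append(key)
    return first_data, second_data


def storage_system_for(key, table):
    """Read the storage system directly from the storage tag."""
    return table[key][1]


first_data, second_data = split_by_identifier(
    ["orders_2022", "orders_2019", "click_log"], hot_cold_table
)
```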
  • When dividing the stored data into at least one type of data based on the data access frequency, the information processing device can also filter out, from the data access frequencies, the first access frequency whose access frequency is greater than or equal to the preset access frequency threshold, and determine the first data corresponding to the first access frequency in the stored data; and/or the information processing device filters out, from the data access frequencies, the second access frequency whose access frequency is less than the preset access frequency threshold, and determines the second data corresponding to the second access frequency in the stored data.
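  • Division by threshold reduces to a single comparison per piece of stored data. The sketch below assumes a mapping from a data key to its measured access frequency; the function name and sample values are hypothetical.

```python
def divide_by_threshold(access_frequency, threshold):
    """Split keys into (hot_keys, cold_keys) using the preset access frequency threshold.

    access_frequency maps a data key to its measured access frequency.
    """
    hot_keys = sorted(k for k, f in access_frequency.items() if f >= threshold)
    cold_keys = sorted(k for k, f in access_frequency.items() if f < threshold)
    return hot_keys, cold_keys


frequencies = {"a": 1500, "b": 1000, "c": 12}
hot_keys, cold_keys = divide_by_threshold(frequencies, threshold=1000)
```

  • A frequency exactly equal to the threshold falls on the hot side, matching the "greater than or equal to" wording.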
  • S103: Based on at least one storage category corresponding to at least one type of data, determine at least one storage system corresponding to the at least one type of data, and transfer the at least one type of data to the at least one storage system respectively, where each type of data corresponds to one storage system.
  • After the information processing device divides the stored data into at least one type of data based on the data access frequency, it can determine the at least one storage system corresponding to the at least one type of data based on the corresponding at least one storage category, and transfer the at least one type of data to the at least one storage system respectively.
  • The at least one type of data is in one-to-one correspondence with the at least one storage category, that is, one type of data corresponds to one storage category.
  • the first data of the hot data category corresponds to the hot data storage category
  • the second data of the cold data category corresponds to the cold data storage category.
  • At least one storage system includes a high-frequency access storage system and/or a low-frequency access storage system.
  • the HDFS system is used to store the first data identified by the hot data; the CFS is used to store the second data identified by the cold data.
  • After the information processing device transfers the at least one type of data to the at least one storage system, the first data is stored in the high-frequency access storage system; when the storage time of the first data reaches the preset storage duration, the information processing device detects the third access frequency of the first data; when the third access frequency is less than the preset access frequency threshold, the information processing device determines that the identifier of the first data is a cold data identifier, and transfers the first data from the high-frequency access storage system to the low-frequency access storage system corresponding to the cold data identifier.
  • The third access frequency may be the total number of accesses of the stored data within the preset storage duration, or the number of accesses of the stored data per second.
  • If the third access frequency is the total number of accesses of the stored data within the preset storage duration, it may be the total number of accesses of the stored data within 6 months (for example, 1000 accesses). If the third access frequency is the number of accesses of the stored data per second, it may be 1000 qps (one thousand requests per second), or 1000 qps sustained for 5 minutes.
  • The preset storage duration may be a duration configured in the information processing device, a duration obtained by the information processing device before it compares the storage duration of the first data with the preset storage duration, or a duration obtained by the information processing device in other ways, which may be determined according to actual conditions and is not limited in this embodiment of the present application.
  • The process of the information processing device transferring the first data from the high-frequency access storage system to the low-frequency access storage system corresponding to the cold data identifier includes: the information processing device first compresses the first data to obtain the compressed first data, and then transfers the compressed first data to the low-frequency access storage system.
  • The information processing device can use the lossless compression algorithm (LZ4) compression format to compress the stored data to obtain the compressed first data; it can also use other data compression methods to obtain the compressed first data, which may be specifically determined according to actual conditions and is not limited in this embodiment of the present application.
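  • The compress-then-transfer step can be sketched as below. Note the substitution: LZ4 is not part of the Python standard library, so `zlib` is used here as a stand-in lossless codec, which the text permits as one of the "other data compression methods". The dictionaries acting as storage systems and the function names are hypothetical.

```python
import zlib


def transfer_compressed(key, source, target):
    """Compress source[key], store the result in target, and remove it from source.

    zlib stands in for LZ4 here; any lossless codec fits the described flow.
    Returns the sizes before and after compression.
    """
    raw = source[key]
    target[key] = zlib.compress(raw)
    del source[key]
    return len(raw), len(target[key])


def read_back(key, store):
    """Decompress a previously transferred object."""
    return zlib.decompress(store[key])


# Dictionaries play the high-frequency and low-frequency access storage systems.
high_freq_store = {"log-1": b"GET /item/42 200\n" * 200}
low_freq_store = {}
raw_size, packed_size = transfer_compressed("log-1", high_freq_store, low_freq_store)
```

  • Because the codec is lossless, `read_back` returns exactly the bytes that were stored before the transfer.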
  • The information processing device further includes a data flow (compact) component, which is responsible for monitoring the access frequency of the stored data and formulating rules; when the life cycle of the stored data meets the rules set by the business, data flow is realized (periodically checking and analyzing files, counting file access frequency, checking whether the rules are met, and automatically migrating data).
  • When the information processing device determines that the access frequency of the first data stored in the HDFS system is less than the preset access frequency threshold (or the identifier of the first data becomes a cold data identifier), the information processing device can use the data transfer component to transfer the first data to the CFS system. The first data no longer needs to be transferred manually, which automates the transfer of the first data and improves the intelligence of the first data transfer.
  • After the information processing device transfers the at least one type of data to the at least one storage system, the second data is stored in the low-frequency access storage system; when the storage time of the second data reaches the preset storage duration, the information processing device detects the fourth access frequency of the second data; when the fourth access frequency is greater than or equal to the preset access frequency threshold, the information processing device determines that the identifier of the second data is a hot data identifier, and transfers the second data from the low-frequency access storage system to the high-frequency access storage system corresponding to the hot data identifier.
  • The fourth access frequency may be the total number of accesses of the second data within the preset storage duration, or the number of accesses of the second data per second.
  • If the fourth access frequency is the total number of accesses of the second data within the preset storage duration, it may be the total number of accesses of the second data within 6 months (for example, 1000 accesses). If the fourth access frequency is the number of times the second data is accessed per second, it may be 1000 qps (one thousand requests per second), or 1000 qps sustained for 5 minutes.
  • The process of the information processing device transferring the second data from the low-frequency access storage system to the high-frequency access storage system corresponding to the hot data identifier includes: the information processing device first compresses the second data to obtain the compressed second data, and then transfers the compressed second data to the high-frequency access storage system corresponding to the hot data identifier.
  • The information processing device can use the data compression component (Zstandard, ZSTD) compression format to compress the second data to obtain the compressed second data; it can also use other data compression methods to obtain the compressed second data, which may be specifically determined according to actual conditions and is not limited in this embodiment of the present application.
  • When the information processing device determines that the access frequency of the second data stored in the CFS system is greater than or equal to the preset access frequency threshold (or the identifier of the second data becomes a hot data identifier), the information processing device can use the data transfer component to transfer the second data to the HDFS system. The second data no longer needs to be transferred manually, which automates the transfer of the second data and improves the intelligence of the second data transfer.
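  • The two migration directions (hot to cold and cold to hot) can be viewed as one periodic re-evaluation pass. The sketch below is an illustration under stated assumptions, not the patented component: `Record`, `retier`, and the sample values are hypothetical, and only data whose storage time has reached the preset storage duration is re-evaluated.

```python
from dataclasses import dataclass

HOT, COLD = "hot", "cold"


@dataclass
class Record:
    """Hypothetical bookkeeping entry for one piece of stored data."""
    key: str
    tier: str                # current data identifier: HOT or COLD
    stored_seconds: float    # how long the data has been stored
    access_frequency: float  # measured over the preset storage duration


def retier(records, threshold, preset_storage_duration):
    """Return (key, old_tier, new_tier) moves and update each record's tier."""
    moves = []
    for rec in records:
        if rec.stored_seconds < preset_storage_duration:
            continue  # too recent to be re-evaluated
        new_tier = HOT if rec.access_frequency >= threshold else COLD
        if new_tier != rec.tier:
            moves.append((rec.key, rec.tier, new_tier))
            rec.tier = new_tier
    return moves


records = [
    Record("a", HOT, 7200, 3),      # cooled down: should move to the cold tier
    Record("b", COLD, 7200, 2000),  # heated up: should move to the hot tier
    Record("c", HOT, 60, 0),        # stored for less than the preset duration
    Record("d", COLD, 7200, 5),     # still cold: stays where it is
]
moves = retier(records, threshold=1000, preset_storage_duration=3600)
```

  • Each returned move would then be handed to the transfer step, which compresses the data and writes it to the target storage system.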
  • the information processing device may be a NoSQL system, and the NoSQL system includes a Regionserver.
  • specifically, the Regionserver includes a memory, a file system interface and a data flow component (data flow);
  • the NoSQL system also includes management nodes.
  • the information processing device may first use the management node to receive the data identifier corresponding to the stored data (create a table). Then, when the information processing device receives the stored data written by the client, it writes the stored data directly into the memory. When the information processing device detects that the remaining storage capacity in the memory is less than the preset capacity lower limit threshold, it detects the data access frequency corresponding to the stored data in the memory.
  • the information processing device filters out, from the data access frequency, the first access frequency corresponding to the hot data identifier, and determines the first data corresponding to the first access frequency from the stored data; the information processing device then selects a storage medium based on at least one storage category corresponding to at least one type of data, and determines, for the first data,
  • the corresponding storage system (the type is HD
  • the information processing device filters out, from the data access frequency, the second access frequency corresponding to the cold data identifier, and determines the second data corresponding to the second access frequency from the stored data; the information processing device selects a storage medium based on at least one storage category (type is CFS) corresponding to at least one type of data, and determines that the storage system corresponding to the second data is a low-frequency access storage system (CFS client); the information processing device transfers the second data to the low-frequency access storage system through the file system interface.
  • when the information processing device has stored the first data in the high-frequency access storage system and the storage duration of the first data satisfies the preset storage duration, the information processing device periodically detects the third access frequency of the first data (periodically checking and analyzing files, checking the file access frequency, and checking whether the rules for automated data migration are satisfied); when the third access frequency is less than the preset access frequency threshold, the information processing device determines that the identifier of the first data is a cold data identifier, and transfers the first data, through the data transfer component, from the high-frequency access storage system to the low-frequency access storage system corresponding to the cold data identifier.
  • when the information processing device has stored the second data in the low-frequency access storage system and the storage duration of the second data satisfies the preset storage duration, the information processing device periodically detects the fourth access frequency of the second data (periodically checking and analyzing files, checking the file access frequency, and checking whether the rules for automated data migration are satisfied); when the fourth access frequency is greater than or equal to the preset access frequency threshold, the information processing device determines that the identifier of the second data is a hot data identifier, and transfers the second data, through the data transfer component, from the low-frequency access storage system to the high-frequency access storage system corresponding to the hot data identifier.
  • the HDFS system is a system built from multiple (for example, three) SSDs.
  • the way for the information processing device to create the hot and cold tables may be:
  • the cluster management node includes metadata management, region management and file system initialization (initializing the HDFS system and the CFS system); when the information processing device starts, the cluster management node initializes the HDFS system and the CFS system.
  • when the Regionserver receives the stored data written by the client, the Regionserver writes the stored data into the memory.
  • when the Regionserver detects that the remaining storage capacity in the memory is less than the preset capacity lower limit threshold and determines that the identifier of the first data is a hot data identifier, the Regionserver transfers the first data to the HDFS system; when the Regionserver detects that the remaining storage capacity in the memory is less than the preset capacity lower limit threshold and determines that the identifier of the second data is a cold data identifier, the Regionserver transfers the second data to the CFS system.
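The Regionserver behaviour just described (write to memory first, then move hot data to HDFS and cold data to CFS once the remaining memory falls below the lower limit) can be sketched as below. This is only an illustration under stated assumptions: the class `RegionServerSketch`, counting capacity in entries rather than bytes, the `"hot"`/`"cold"` tag strings, and the dictionaries standing in for the HDFS and CFS systems are all hypothetical.

```python
class RegionServerSketch:
    """Hypothetical in-memory model of the flush rule; not a real Regionserver."""

    def __init__(self, capacity, lower_limit):
        self.capacity = capacity        # total number of entries memory can hold
        self.lower_limit = lower_limit  # preset capacity lower limit threshold
        self.memory = {}                # key -> (value, tag)
        self.hdfs = {}                  # stand-in for the HDFS (hot) system
        self.cfs = {}                   # stand-in for the CFS (cold) system

    def remaining(self):
        """Remaining storage capacity in memory, counted in entries."""
        return self.capacity - len(self.memory)

    def write(self, key, value, tag):
        """Write client data into memory; flush if remaining capacity is too low."""
        self.memory[key] = (value, tag)
        if self.remaining() < self.lower_limit:
            self.flush()

    def flush(self):
        """Move hot-tagged entries to HDFS and cold-tagged entries to CFS."""
        for key, (value, tag) in self.memory.items():
            target = self.hdfs if tag == "hot" else self.cfs
            target[key] = value
        self.memory.clear()
```

Clearing the memory after the flush is an assumption of this sketch; the text only states that the data is transferred.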
  • when the information processing device receives the stored data written by the client through the cluster management node, the information processing device writes the stored data directly into the memory; when the information processing device detects that the remaining storage capacity in the memory is less than the preset capacity lower limit threshold, it detects the data access frequency corresponding to the stored data in the memory and filters out, from the data access frequency, the first access frequency corresponding to the hot data identifier.
  • the information processing device determines, based on at least one storage category (type is HDFS) corresponding to at least one type of data, that the storage system corresponding to the first data is a high-frequency access storage system (HDFS system); the information processing device transfers the first data to the high-frequency access storage system through a file system interface.
  • the information processing device filters out, from the data access frequency, the second access frequency corresponding to the cold data identifier, and determines the second data corresponding to the second access frequency from the stored data; the information processing device determines, based on at least one storage category (type is CFS) corresponding to at least one type of data, that the storage system corresponding to the second data is a low-frequency access storage system (CFS system); the information processing device transfers the second data to the low-frequency access storage system through the file system interface.
  • the information processing device determines that the identifier of the first data is a cold data identifier; the information processing device uses the data transfer component (data transfer) to transfer the first data from the high-frequency access storage system to the low-frequency access storage system corresponding to the cold data identifier.
  • when the information processing device has stored the second data in the low-frequency access storage system and the storage duration of the second data satisfies the preset storage duration, the information processing device detects a fourth access frequency of the second data; if the fourth access frequency is greater than or equal to the preset access frequency threshold, the information processing device determines that the identifier of the second data is a hot data identifier, and uses the data flow component to transfer the second data from the low-frequency access storage system to the high-frequency access storage system corresponding to the hot data identifier.
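The periodic re-check in both directions (hot data cooling down, cold data heating up) reduces to one small decision rule, sketched here. The function name `retier`, the `"hot"`/`"cold"` labels, and the reading of "the storage duration satisfies the preset storage duration" as "stored for at least that long" are assumptions of this sketch rather than statements from the text.

```python
def retier(location, access_frequency, stored_for, threshold, min_duration):
    """Return the tier ("hot" or "cold") the data should occupy after a periodic check.

    location          -- current tier of the data
    access_frequency  -- third/fourth access frequency measured by the check
    stored_for        -- how long the data has been in its current tier
    threshold         -- preset access frequency threshold
    min_duration      -- preset storage duration (assumed to be a minimum)
    """
    if stored_for < min_duration:
        return location  # not yet eligible for migration
    if location == "hot" and access_frequency < threshold:
        return "cold"    # demote: identifier becomes a cold data identifier
    if location == "cold" and access_frequency >= threshold:
        return "hot"     # promote: identifier becomes a hot data identifier
    return location
```

Using `<` for demotion and `>=` for promotion mirrors the comparisons in the text, so a frequency exactly equal to the threshold always counts as hot.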
  • the information processing device stores the stored data written by the client into the memory in real time, and when the information processing device determines that the identifier of the first data in the stored data is a hot data identifier, the information processing device transfers (flushes) the first data to the HDFS system (hot layer).
  • the storage location corresponding to the cold data identifier is the CFS system (cold layer); the information processing device compresses the first data using the LZ4 compression method (data compression) and transfers the compressed first data to the cold layer.
  • after that, when the information processing device determines that the identifier of the first data in the cold layer has changed to a hot data identifier, the information processing device determines that the storage location corresponding to the hot data identifier is the hot layer, compresses the compressed first data using the ZSTD compression method (data compression) to obtain first compressed data, and transfers the first compressed data to the hot layer.
  • the hot layer is the local file system in the file system layer, including HDD, SSD, NVM and AEP storage-class memory;
  • the cold layer is the storage in the file system layer, specifically including cloud storage.
  • the information processing device detects the data access frequency corresponding to the stored data in the memory, divides the stored data into at least one type of data based on the data access frequency, determines at least one storage system corresponding to the at least one type of data based on the at least one storage category corresponding to that data, and transfers the at least one type of data to the at least one storage system respectively, where each type of data corresponds to one storage system. The popularity of the stored data and its storage location no longer need to be determined manually, and the stored data no longer needs to be transferred manually, which improves the intelligence of storing the stored data.
  • FIG. 6 is a schematic diagram of the composition and structure of an information processing device provided in the embodiment of the present application.
  • Information processing device 1 may include:
  • the detection part 11 is configured to detect the data access frequency corresponding to the stored data in the internal memory when it is detected that the remaining storage capacity in the internal memory is less than or equal to the preset capacity lower limit threshold;
  • the division part 12 is configured to divide the stored data into at least one type of data based on the data access frequency; each type of data in the at least one type of data corresponds to a storage category;
  • the determining part 13 is configured to determine at least one storage system corresponding to the at least one type of data based on at least one storage class corresponding to the at least one type of data;
  • the dumping part 14 is configured to dump the at least one type of data to the at least one storage system respectively; wherein, each type of data corresponds to a storage system.
  • the device further includes a receiving part
  • the receiving part is configured to receive a data identifier corresponding to the stored data; the data identifier includes a hot data identifier and a cold data identifier;
  • the at least one type of data includes first data and/or second data; the device further includes a screening part and an identification part;
  • the screening part is configured to filter out the hot data identifier from the data identifiers; and/or, filter out the cold data identifier from the data identifiers;
  • the identification part is configured to use the hot data identifier to identify the first access frequency corresponding to the first data; and/or use the cold data identifier to identify the second access frequency corresponding to the second data;
  • the first access frequency and/or the second access frequency are used as the data access frequency.
  • the screening part is configured to filter out, from the data access frequency, the first access frequency corresponding to the hot data identifier; and/or, filter out, from the data access frequency, the second access frequency corresponding to the cold data identifier;
  • the determining part 13 is configured to determine, from the stored data, the first data corresponding to the first access frequency; and/or, determine, from the stored data, the second data corresponding to the second access frequency.
  • the determining part 13 is configured to filter out, from the data access frequencies, the first access frequency that is greater than or equal to a preset access frequency threshold, and determine the first data corresponding to the first access frequency; and/or, filter out, from the data access frequencies, the second access frequency that is less than the preset access frequency threshold, and determine, in the stored data, the second data corresponding to the second access frequency.
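The threshold rule applied by the determining part can be written as a two-way partition. This is a hedged sketch: the function name `partition_by_frequency` and the representation of the stored data as a mapping from a data key to its access frequency are assumptions introduced for illustration.

```python
def partition_by_frequency(frequencies, threshold):
    """Split data keys into first (hot) and second (cold) data by access frequency.

    frequencies -- mapping of data key -> access frequency
    threshold   -- preset access frequency threshold
    """
    first_data = {k: f for k, f in frequencies.items() if f >= threshold}
    second_data = {k: f for k, f in frequencies.items() if f < threshold}
    return first_data, second_data
```

Because the two conditions are complementary, every key lands in exactly one of the two groups, which matches the statement that each type of data corresponds to one storage category.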
  • the at least one storage system includes a high-frequency access storage system and/or a low-frequency access storage system;
  • the detection part 11 is configured to detect a third access frequency of the first data when the first data is stored in the high-frequency access storage system and the storage duration of the first data satisfies a preset storage duration;
  • the determining part 13 is configured to determine that the identifier of the first data is a cold data identifier when the third access frequency is less than a preset access frequency threshold;
  • the dumping part 14 is configured to dump the first data from the high-frequency access storage system to the low-frequency access storage system corresponding to the cold data identifier.
  • the detection part 11 is configured to detect a fourth access frequency of the second data;
  • the determining part 13 is configured to determine that the identifier of the second data is a hot data identifier when the fourth access frequency is greater than or equal to a preset access frequency threshold;
  • the dumping part 14 is configured to dump the second data from the low-frequency access storage system to the high-frequency access storage system corresponding to the hot data identifier.
  • the above-mentioned detection part 11, division part 12, determining part 13 and dumping part 14 can be realized by the processor 15 on the information processing device 1, specifically a CPU (Central Processing Unit), an MPU (Microprocessor Unit), a DSP (Digital Signal Processor) or a field programmable gate array (FPGA, Field Programmable Gate Array), etc.;
  • the memory 16 implements.
  • the embodiment of the present invention also provides an information processing device 1. As shown in FIG.
  • the processor 15 communicates, and the memory 16 stores a program executable by the processor 15. When the program is executed, the processor 15 executes the information processing method as described above.
  • the above-mentioned memory 16 can be a volatile memory, such as a random access memory (Random-Access Memory, RAM); or a non-volatile memory, such as a read-only memory (Read-Only Memory, ROM), flash memory, hard disk (Hard Disk Drive, HDD) or solid-state drive (Solid-State Drive, SSD); or a combination of the above types of memory, and provides instructions and data to the processor 15.
  • An embodiment of the present invention provides a computer-readable storage medium, on which there is a computer program, and when the program is executed by the processor 15, the information processing method as described above is realized.
  • the embodiments of the present application may be provided as methods, systems, or computer program products. Accordingly, the present application may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage and optical storage, etc.) having computer-usable program code embodied therein.
  • These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing apparatus to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising instruction means, and the instruction means
  • realizes the functions specified in one or more procedures of the flowchart and/or one or more blocks of the block diagram.
  • An embodiment of the present application provides an information processing method and device, and a storage medium.
  • the information processing method includes: when it is detected that the remaining storage capacity in the memory is less than or equal to the preset capacity lower limit threshold, detecting the data access frequency corresponding to the stored data in the memory; dividing the stored data into at least one type of data based on the data access frequency, where each type of data in the at least one type of data corresponds to a storage category; determining, based on at least one storage category corresponding to the at least one type of data, at least one storage system corresponding to the at least one type of data, and transferring the at least one type of data to the at least one storage system respectively, where each type of data corresponds to one storage system.
  • with this scheme, the information processing device detects the data access frequency corresponding to the stored data in the memory, so that it can divide the stored data into at least one type of data based on the data access frequency, determine at least one storage system corresponding to the at least one type of data based on the corresponding storage categories, and transfer each type of data to its storage system. The popularity of the stored data and its storage location no longer need to be determined manually, and the stored data no longer needs to be transferred manually, which improves the intelligence of storing the stored data.

Abstract

Embodiments of the present application disclose an information processing method and apparatus, and a storage medium. The method comprises: when it is detected that the remaining storage capacity in a memory is less than or equal to a preset capacity lower limit threshold, detecting a data access frequency corresponding to storage data in the memory (S101); dividing the storage data into at least one type of data on the basis of the data access frequency, each of the at least one type of data corresponding to one storage category (S102); and determining, on the basis of at least one storage category corresponding to the at least one type of data, at least one storage system corresponding to the at least one type of data, and respectively transferring the at least one type of data to the at least one storage system, wherein each type of data corresponds to one storage system (S103).

Description

An information processing method and device, and storage medium
Cross-Reference to Related Applications
This application claims priority to Chinese patent application No. 202110642322.6, entitled "An information processing method and device, storage medium", filed with the China Patent Office on June 9, 2021, the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the technical field of information processing, and in particular to an information processing method and device, and a storage medium.
Background
With the development of Internet technology, the volume of data on the network keeps growing and the amount of data that needs to be stored gradually increases, which has given rise to many data storage problems.
In the prior art, two clusters are used: a primary cluster with higher storage performance (for example, solid state disks (Solid State Disk, SSD)) and a standby cluster with lower storage performance (for example, hard disk drives (Hard Disk Drive, HDD)). Data is first stored in the primary cluster; once it is manually determined that the popularity of the data has dropped, the data is manually copied from the primary cluster to the standby cluster for storage. Because both determining the popularity of the data and moving its storage location are done manually, data storage is less intelligent.
Summary
To solve the above technical problem, the embodiments of the present application aim to provide an information processing method and device, and a storage medium, which can improve the intelligence of storing stored data.
The technical solution of the present application is implemented as follows:
An embodiment of the present application provides an information processing method, including:
when it is detected that the remaining storage capacity in a memory is less than or equal to a preset capacity lower limit threshold, detecting the data access frequency corresponding to the stored data in the memory;
dividing the stored data into at least one type of data based on the data access frequency, where each type of data in the at least one type of data corresponds to one storage category;
determining, based on at least one storage category corresponding to the at least one type of data, at least one storage system corresponding to the at least one type of data, and transferring the at least one type of data to the at least one storage system respectively, where each type of data corresponds to one storage system.
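The three steps of the method can be strung together in a short end-to-end sketch. Everything named here is an assumption made for illustration: the function `process`, the `STORAGE_SYSTEM_FOR_CATEGORY` table (which borrows the HDFS and CFS names used elsewhere in the document), the two-category split by a single threshold, and the dictionaries standing in for storage systems.

```python
# Hypothetical mapping from storage category to storage system name.
STORAGE_SYSTEM_FOR_CATEGORY = {"hot": "HDFS", "cold": "CFS"}


def process(memory, remaining, lower_limit, threshold, systems):
    """Run detection, division and transfer once; return True if data was moved.

    memory  -- mapping of key -> (value, access_frequency)
    systems -- mapping of storage system name -> dict used as that system
    """
    # Step 1: only act when remaining capacity is at or below the lower limit.
    if remaining > lower_limit:
        return False

    # Step 2: divide the stored data into categories by access frequency.
    classes = {"hot": {}, "cold": {}}
    for key, (value, frequency) in memory.items():
        category = "hot" if frequency >= threshold else "cold"
        classes[category][key] = value

    # Step 3: resolve each category to its storage system and transfer the data.
    for category, items in classes.items():
        systems[STORAGE_SYSTEM_FOR_CATEGORY[category]].update(items)
    memory.clear()  # assumption: memory is emptied once the data has been moved
    return True
```

Keeping the category-to-system mapping in a table means a further category could be added without touching the transfer loop, which fits the "at least one type of data" wording.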
An embodiment of the present application provides an information processing device, the device including:
a detection part, configured to detect the data access frequency corresponding to the stored data in a memory when it is detected that the remaining storage capacity in the memory is less than or equal to a preset capacity lower limit threshold;
a division part, configured to divide the stored data into at least one type of data based on the data access frequency, where each type of data in the at least one type of data corresponds to one storage category;
a determining part, configured to determine at least one storage system corresponding to the at least one type of data based on at least one storage category corresponding to the at least one type of data;
a dumping part, configured to transfer the at least one type of data to the at least one storage system respectively, where each type of data corresponds to one storage system.
An embodiment of the present application provides an information processing device, the device including:
a memory, a processor and a communication bus, where the memory communicates with the processor through the communication bus and stores an information processing program executable by the processor; when the information processing program is executed, the processor performs the information processing method described above.
An embodiment of the present application provides a storage medium on which a computer program is stored, applied to an information processing device; when the computer program is executed by a processor, the information processing method described above is implemented.
The embodiments of the present application provide an information processing method and device, and a storage medium. The information processing method includes: when it is detected that the remaining storage capacity in the memory is less than or equal to the preset capacity lower limit threshold, detecting the data access frequency corresponding to the stored data in the memory; dividing the stored data into at least one type of data based on the data access frequency, where each type of data corresponds to one storage category; determining, based on at least one storage category corresponding to the at least one type of data, at least one storage system corresponding to the at least one type of data, and transferring the at least one type of data to the at least one storage system respectively, where each type of data corresponds to one storage system.
With this scheme, the information processing device detects the data access frequency corresponding to the stored data in the memory, divides the stored data into at least one type of data based on that frequency, determines the storage system for each type of data from its storage category, and transfers each type of data to its storage system. The popularity of the stored data and its storage location no longer need to be determined manually, and the stored data no longer needs to be transferred manually, which improves the intelligence of storing the stored data.
Description of the Drawings
FIG. 1 is a flowchart of an information processing method provided by an embodiment of the present application;
FIG. 2 is a first exemplary architecture diagram of an information processing device provided by an embodiment of the present application;
FIG. 3 is a second exemplary architecture diagram of an information processing device provided by an embodiment of the present application;
FIG. 4 is a third exemplary architecture diagram of an information processing device provided by an embodiment of the present application;
FIG. 5 is a fourth exemplary architecture diagram of an information processing device provided by an embodiment of the present application;
FIG. 6 is a first schematic diagram of the composition and structure of an information processing device provided by an embodiment of the present application;
FIG. 7 is a second schematic diagram of the composition and structure of an information processing device provided by an embodiment of the present application.
Detailed Description
To provide a more detailed understanding of the features and technical content of the embodiments of the present application, their implementation is described in detail below with reference to the accompanying drawings. The drawings are for reference and illustration only and are not intended to limit the embodiments of the present application.
Embodiment 1
An embodiment of the present application provides an information processing method. FIG. 1 is a flowchart of an information processing method provided by an embodiment of the present application. As shown in FIG. 1, the information processing method may include:
S101. When it is detected that the remaining storage capacity in the memory is less than or equal to a preset capacity lower limit threshold, detect the data access frequency corresponding to the stored data in the memory.
The information processing method provided by the embodiments of the present application is applicable to scenarios in which stored data is transferred.
In the embodiments of the present application, the information processing device may be implemented in various forms. For example, the information processing device described in this application may include devices such as mobile phones, cameras, tablet computers, notebook computers, palmtop computers, personal digital assistants (Personal Digital Assistant, PDA), portable media players (Portable Media Player, PMP), navigation devices, wearable devices, smart bracelets and pedometers, as well as devices such as digital TVs, desktop computers and servers.
In the embodiments of the present application, the stored data may be log data, order data of a product, or other data about a product. The specific data may be determined according to the actual situation and is not limited in the embodiments of the present application.
In the embodiments of the present application, the number of pieces of stored data may be determined according to the actual situation and is not limited in the embodiments of the present application.
In the embodiments of the present application, pieces of stored data and data access frequencies are in one-to-one correspondence, that is, each piece of stored data corresponds to one data access frequency. There may be one, two, or more data access frequencies; the specific number may be determined according to the actual situation and is not limited in the embodiments of the present application.
In the embodiments of the present application, when the information processing apparatus receives stored data, it first stores the data in memory. Then, when the remaining storage capacity of the memory is less than a preset capacity lower-limit threshold, the apparatus detects the data access frequency corresponding to the stored data in the memory.
It should be noted that the information processing apparatus may detect the data access frequency corresponding to the stored data in the memory when the remaining storage capacity of the memory is less than the preset capacity lower-limit threshold; it may also perform this detection once every preset time period. The specific condition under which the apparatus detects the data access frequency may be determined according to the actual situation and is not limited in the embodiments of the present application.
It should also be noted that the preset capacity lower-limit threshold may be a threshold configured in the information processing apparatus, a threshold that the apparatus obtains before it receives the stored data, or a threshold that the apparatus obtains in some other way; the specific choice may be determined according to the actual situation and is not limited in the embodiments of the present application.
Exemplarily, the preset capacity lower-limit threshold may be 0, meaning that the memory is full.
It should also be noted that the preset time period may be a time period configured in the information processing apparatus, a time period that the apparatus obtains before it receives the stored data, or a time period that the apparatus obtains in some other way; the specific choice may be determined according to the actual situation and is not limited in the embodiments of the present application.
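The two trigger conditions above (a low-memory condition and a periodic check) can be sketched as follows. This is only an illustrative sketch: the function name `should_detect` and its parameters are hypothetical, and the comparison uses "less than or equal to", following the wording used later in this section, so that a threshold of 0 fires exactly when the memory is full.

```python
import time
from typing import Optional


def should_detect(remaining_bytes: int,
                  lower_limit_bytes: int,
                  last_check: float,
                  interval_s: Optional[float] = None,
                  now: Optional[float] = None) -> bool:
    """Return True when access-frequency detection should run.

    All names here are hypothetical; the source does not define an API.
    """
    # Trigger 1: remaining memory has reached the preset lower-limit threshold.
    if remaining_bytes <= lower_limit_bytes:
        return True
    # Trigger 2 (optional): the preset time period has elapsed since the last check.
    if interval_s is not None:
        current = time.monotonic() if now is None else now
        return current - last_check >= interval_s
    return False
```

Passing `now` explicitly keeps the function deterministic for testing; in a running system the monotonic clock would normally be used.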
In the embodiments of the present application, the data access frequency may be an access frequency range that identifies the category of the stored data. Exemplarily, if the data category of the stored data is the hot data category, the data access frequency may be a frequency band greater than or equal to a preset access frequency threshold; if the data category of the stored data is the cold data category, the data access frequency may be a frequency band less than the preset access frequency threshold.
It should be noted that the preset access frequency threshold may be a frequency threshold configured in the information processing apparatus or a frequency threshold that the apparatus obtains in some other way; the specific choice may be determined according to the actual situation and is not limited in the embodiments of the present application.
In the embodiments of the present application, before the information processing apparatus detects the data access frequency corresponding to the stored data in the memory (which it does when the remaining storage capacity of the memory is less than or equal to the preset capacity lower-limit threshold), the apparatus also receives the data identifiers corresponding to the stored data. Correspondingly, the at least one type of data includes first data and/or second data, and the process by which the apparatus detects the data access frequency corresponding to the stored data in the memory includes: the apparatus filters out the hot data identifier from the data identifiers and uses the hot data identifier to identify the first access frequency corresponding to the first data; and/or the apparatus filters out the cold data identifier from the data identifiers and uses the cold data identifier to identify the second access frequency corresponding to the second data; the apparatus then takes the first access frequency and/or the second access frequency as the data access frequency.
In the embodiments of the present application, the data identifiers include a hot data identifier and a cold data identifier. The hot data identifier indicates that the first data belongs to the hot data category, and the cold data identifier indicates that the second data belongs to the cold data category.
In the embodiments of the present application, the first access frequency may be a part of the data access frequency, or it may be the entire data access frequency. Likewise, the second access frequency may be a part of the data access frequency, or it may be the entire data access frequency. If the first access frequency and the second access frequency are each a part of the data access frequency, the two together make up the data access frequency.
It should be noted that the first access frequency may be a frequency band greater than or equal to the preset access frequency threshold, and the second access frequency may be a frequency band less than the preset access frequency threshold.
It should be noted that when the information processing apparatus determines that the data identifier of the first data in the stored data is a hot data identifier, it takes that hot data identifier as the data identifier of the first data; when it determines that the data identifier of the second data in the stored data is a cold data identifier, it takes that cold data identifier as the data identifier of the second data.
In the embodiments of the present application, the first access frequency may be the total number of accesses to the first data within a period of time, or the number of accesses to the first data per second. The second access frequency may be the total number of accesses to the second data within a period of time, or the number of accesses to the second data per second.
It should be noted that the period of time may be 3 months, 6 months, 1 year, and so on; the specific period may be determined according to the actual situation and is not limited in the embodiments of the present application.
Exemplarily, if the first access frequency is the total number of accesses to the first data within a period of time, it may be the total number of accesses to the first data within 6 months (for example, 1000 accesses). If the first access frequency is the number of accesses to the first data per second, it may be 1000 qps (one thousand requests per second), or 1000 qps sustained for 5 minutes. Likewise, if the second access frequency is the total number of accesses to the second data within a period of time, it may be the total number of accesses to the second data within 6 months (for example, 1000 accesses). If the second access frequency is the number of accesses to the second data per second, it may be 1000 qps (one thousand requests per second), or 1000 qps sustained for 5 minutes.
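The two ways of measuring an access frequency described above (a total count over a period, or a per-second rate) could be computed from a list of access timestamps roughly as follows. The function names and the half-open window convention are assumptions made for illustration; the source does not specify how accesses are recorded.

```python
from typing import Sequence


def total_accesses(timestamps: Sequence[float], start: float, end: float) -> int:
    """Count accesses whose timestamp t (in seconds) satisfies start <= t < end."""
    return sum(1 for t in timestamps if start <= t < end)


def accesses_per_second(timestamps: Sequence[float], start: float, end: float) -> float:
    """Average number of accesses per second over the window [start, end)."""
    if end <= start:
        raise ValueError("window must have positive length")
    return total_accesses(timestamps, start, end) / (end - start)
```

Either value can then be compared against the preset access frequency threshold, provided the threshold is expressed in the same unit.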
In the embodiments of the present application, the information processing apparatus may directly receive the data identifiers transmitted by a user; it may also create hot and cold tables according to an instruction transmitted by the user and determine the data identifier corresponding to the stored data from those tables.
It should be noted that if the information processing apparatus creates the hot and cold tables according to an instruction transmitted by the user, it may do so as follows:
create 'cold_table', {NAME => 'f1', DATA_STORAGE => 'CFS'};
create 'hot_table', {NAME => 'f1'} or create 'hot_table', {NAME => 'f1', DATA_STORAGE => 'HDFS'}.
S102. Based on the data access frequency, divide the stored data into at least one type of data; each type of data in the at least one type of data corresponds to one storage category.
In the embodiments of the present application, after the information processing apparatus detects the data access frequency corresponding to the stored data in the memory, it can divide the stored data into at least one type of data based on the data access frequency.
In the embodiments of the present application, the at least one type of data includes first data of the hot data category and/or second data of the cold data category.
In the embodiments of the present application, the information processing apparatus may be a distributed non-relational (NoSQL) system. The distributed NoSQL system stores the HDFS address of a Hadoop Distributed File System (HDFS) and the CFS address of a cloud file system (Cloud File Service, CFS). It can load the HDFS system through the HDFS address and the CFS system through the CFS address; that is, both the HDFS system and the CFS system are set up in the distributed NoSQL system. When the distributed NoSQL system starts, it loads the HDFS system and the CFS system automatically.
In the embodiments of the present application, the distributed NoSQL system includes a management node (HMaster), which is responsible for table management (create, delete, update, and query), region management, file system initialization, and so on. The distributed NoSQL system also includes a compaction module (Compact), which monitors file access frequency and formulates rules; once the life cycle of the data satisfies the rules set by the business, the data can be moved (periodically checking and analyzing files, collecting file access frequency statistics, checking whether the rules are satisfied, and automated data migration).
It should be noted that the HDFS system and the CFS system are used to store different categories of stored data. The HDFS system uses all-SSD disks, and the CFS system uses cloud disks.
It should be noted that the information processing apparatus includes a file system interface (FileSystemInterface) used for routing and forwarding. The apparatus can use the file system interface to select the storage system (the HDFS system or the CFS system) that matches the identifier of the stored data, and then transfer the stored data to the HDFS system or the CFS system.
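The routing role of the file system interface can be illustrated with a small sketch. Everything below is hypothetical: `InMemoryStore` merely stands in for a storage system client and is not a real HDFS or CFS API, and the choice of HDFS as the default label is an assumption based on the hot table being creatable without an explicit `DATA_STORAGE` attribute.

```python
from typing import Dict


class InMemoryStore:
    """Hypothetical stand-in for a storage client; not a real HDFS or CFS API."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.files: Dict[str, bytes] = {}

    def write(self, key: str, payload: bytes) -> None:
        self.files[key] = payload


class FileSystemInterface:
    """Routes each write to the backend named by the data's storage label."""

    def __init__(self, backends: Dict[str, InMemoryStore]) -> None:
        # Normalise labels so 'cfs' and 'CFS' select the same backend (assumption).
        self._backends = {label.upper(): store for label, store in backends.items()}

    def route(self, storage_label: str) -> InMemoryStore:
        try:
            return self._backends[storage_label.upper()]
        except KeyError:
            raise ValueError(f"no storage system for label {storage_label!r}") from None

    def dump(self, key: str, payload: bytes, storage_label: str = "HDFS") -> str:
        """Write payload to the matching backend and return that backend's name."""
        backend = self.route(storage_label)
        backend.write(key, payload)
        return backend.name
```

Keeping the routing decision in one place means the rest of the system only needs to know a data item's storage label, not which client to call.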
In the embodiments of the present application, the storage categories include a hot data storage category and a cold data storage category. The storage system corresponding to the hot data storage category is a high-frequency access storage system, and the storage system corresponding to the cold data storage category may be a low-frequency access storage system. It should be noted that the high-frequency access storage system may be the HDFS system, and the low-frequency access storage system may be the CFS system.
In the embodiments of the present application, the statement that each type of data in the at least one type of data corresponds to one storage category means that the first data of the hot data category corresponds to the hot data storage category, and the second data of the cold data category corresponds to the cold data storage category.
In the embodiments of the present application, suppose the information processing apparatus has received the data identifiers corresponding to the stored data, has filtered out the hot data identifier and used it to identify the first access frequency corresponding to the first data, and/or has filtered out the cold data identifier and used it to identify the second access frequency corresponding to the second data, and has taken the first access frequency and/or the second access frequency as the data access frequency. In that case, the process of dividing the stored data into at least one type of data based on the data access frequency may be as follows: the apparatus filters out, from the data access frequency, the first access frequency corresponding to the hot data identifier and determines, from the stored data, the first data corresponding to the first access frequency; and/or the apparatus filters out, from the data access frequency, the second access frequency corresponding to the cold data identifier and determines, from the stored data, the second data corresponding to the second access frequency.
In the embodiments of the present application, the hot and cold tables record the storage label of the data, and the information processing apparatus may also determine the storage system for the data from the storage label.
Exemplarily, the storage label may be DATA_STORAGE => 'CFS' or DATA_STORAGE => 'HDFS'.
In the embodiments of the present application, the process of dividing the stored data into at least one type of data based on the data access frequency may alternatively be as follows: the information processing apparatus filters out, from the data access frequency, a first access frequency that is greater than or equal to the preset access frequency threshold and determines, in the stored data, the first data corresponding to the first access frequency; and/or the apparatus filters out, from the data access frequency, a second access frequency that is less than the preset access frequency threshold and determines, in the stored data, the second data corresponding to the second access frequency.
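The threshold-based division just described can be sketched as a simple partition. This is an illustration only; the function name and the representation of stored data as a mapping from a key to its access frequency are assumptions.

```python
from typing import Dict, List, Tuple


def partition_by_frequency(frequencies: Dict[str, float],
                           threshold: float) -> Tuple[List[str], List[str]]:
    """Split data keys into (first_data, second_data).

    first_data: keys whose access frequency is >= threshold (hot category).
    second_data: keys whose access frequency is < threshold (cold category).
    """
    first_data = [key for key, freq in frequencies.items() if freq >= threshold]
    second_data = [key for key, freq in frequencies.items() if freq < threshold]
    return first_data, second_data
```

Note that an item whose frequency equals the threshold is classified as hot, matching the "greater than or equal to" wording above.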
S103. Based on at least one storage category corresponding to the at least one type of data, determine at least one storage system corresponding to the at least one type of data, and transfer the at least one type of data to the at least one storage system respectively, where each type of data corresponds to one storage system.
In the embodiments of the present application, after the information processing apparatus divides the stored data into at least one type of data based on the data access frequency, it can determine, based on the at least one storage category corresponding to the at least one type of data, the at least one storage system corresponding to the at least one type of data, and transfer the at least one type of data to the at least one storage system respectively.
In the embodiments of the present application, the at least one type of data and the at least one storage category are in one-to-one correspondence, that is, one type of data corresponds to one storage category. For example, the first data of the hot data category corresponds to the hot data storage category, and the second data of the cold data category corresponds to the cold data storage category.
Exemplarily, the correspondence between the at least one type of data and the at least one storage category may be expressed as DATA_STORAGE => 'CFS' or DATA_STORAGE => 'HDFS'.
In the embodiments of the present application, the at least one storage system includes a high-frequency access storage system and/or a low-frequency access storage system.
It should be noted that the HDFS system is used to store the first data identified by the hot data identifier, and the CFS system is used to store the second data identified by the cold data identifier.
In the embodiments of the present application, after the information processing apparatus transfers the at least one type of data to the at least one storage system respectively, and in the case where the first data has been stored in the high-frequency access storage system and its storage duration satisfies a preset storage duration, the apparatus detects a third access frequency of the first data. If the third access frequency is less than the preset access frequency threshold, the apparatus determines that the identifier of the first data is the cold data identifier and transfers the first data from the high-frequency access storage system to the low-frequency access storage system corresponding to the cold data identifier.
In the embodiments of the present application, the third access frequency may be the total number of accesses to the stored data within the preset storage duration, or the number of accesses to the stored data per second.
Exemplarily, if the third access frequency is the total number of accesses to the stored data within the preset storage duration, it may be the total number of accesses to the stored data within 6 months (for example, 1000 accesses). If the third access frequency is the number of accesses to the stored data per second, it may be 1000 qps (one thousand requests per second), or 1000 qps sustained for 5 minutes.
In the embodiments of the present application, the preset storage duration may be a duration configured in the information processing apparatus, a duration that the apparatus obtains before it compares the storage duration of the first data with the preset storage duration, or a duration that the apparatus obtains in some other way; the specific choice may be determined according to the actual situation and is not limited in the embodiments of the present application.
In the embodiments of the present application, the process by which the information processing apparatus transfers the first data from the high-frequency access storage system to the low-frequency access storage system corresponding to the cold data identifier includes: the apparatus first compresses the first data to obtain compressed first data, and then transfers the compressed first data to the low-frequency access storage system.
In the embodiments of the present application, the information processing apparatus may compress the first data using the LZ4 lossless compression format to obtain the compressed first data; it may also compress the first data in other ways. The specific compression method may be determined according to the actual situation and is not limited in the embodiments of the present application.
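The compress-then-transfer step can be illustrated with a short sketch. Because LZ4 is not part of the Python standard library, the sketch substitutes `zlib` (DEFLATE) purely as a stand-in lossless compressor; this is a swapped-in technique, not the format named in the source, and the function names are hypothetical.

```python
import zlib


def compress_for_transfer(payload: bytes) -> bytes:
    """Losslessly compress data before it is moved to the other storage system.

    zlib is used here only as a stand-in; the source names LZ4.
    """
    return zlib.compress(payload)


def restore(blob: bytes) -> bytes:
    """Recover the original bytes from a blob produced by compress_for_transfer."""
    return zlib.decompress(blob)
```

Whatever codec is chosen, the key property is that it is lossless, so the data read back from the destination storage system is byte-for-byte identical to the original.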
In the embodiments of the present application, the information processing apparatus further includes a data flow (compact) component. The data flow component serves as the compaction module responsible for monitoring the access frequency of the stored data and formulating rules; once the life cycle of the stored data satisfies the rules set by the business, the data can be moved (periodically checking and analyzing files, collecting file access frequency statistics, checking whether the rules are satisfied, and automated data migration).
It can be understood that when the information processing apparatus determines that the access frequency of the first data stored in the HDFS system is less than the preset access frequency threshold (or that the identifier of the first data has become the cold data identifier), the apparatus can use the data flow component to transfer the first data to the CFS system. The first data no longer needs to be transferred manually, which automates the movement of the first data and makes its transfer more intelligent.
In the embodiments of the present application, after the information processing apparatus transfers the at least one type of data to the at least one storage system respectively, and in the case where the second data has been stored in the low-frequency access storage system and its storage duration satisfies the preset storage duration, the apparatus detects a fourth access frequency of the second data. If the fourth access frequency is greater than or equal to the preset access frequency threshold, the apparatus determines that the identifier of the second data is the hot data identifier and transfers the second data from the low-frequency access storage system to the high-frequency access storage system corresponding to the hot data identifier.
In the embodiments of the present application, the fourth access frequency may be the total number of accesses to the second data within the preset storage duration, or the number of accesses to the second data per second.
Exemplarily, if the fourth access frequency is the total number of accesses to the second data within the preset storage duration, it may be the total number of accesses to the second data within 6 months (for example, 1000 accesses). If the fourth access frequency is the number of accesses to the second data per second, it may be 1000 qps (one thousand requests per second), or 1000 qps sustained for 5 minutes.
In the embodiments of the present application, the process by which the information processing apparatus transfers the second data from the low-frequency access storage system to the high-frequency access storage system corresponding to the hot data identifier includes: the apparatus first compresses the second data to obtain compressed second data, and then transfers the compressed second data to the high-frequency access storage system corresponding to the hot data identifier.
In the embodiments of the present application, the information processing apparatus may compress the second data using the Zstandard (ZSTD) compression format to obtain the compressed second data; it may also compress the second data in other ways. The specific compression method may be determined according to the actual situation and is not limited in the embodiments of the present application.
It can be understood that when the information processing apparatus determines that the access frequency of the second data stored in the CFS system is greater than or equal to the preset access frequency threshold (or that the identifier of the second data has become the hot data identifier), the apparatus can use the data flow component to transfer the second data to the HDFS system. The second data no longer needs to be transferred manually, which automates the movement of the second data and makes its transfer more intelligent.
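The two re-evaluation rules (hot data cooling down, and cold data warming up) can be sketched as a single decision function. The function name, the tier labels, and the reading of "the storage duration satisfies the preset storage duration" as "greater than or equal to" are assumptions made for this illustration.

```python
def next_tier(current_tier: str,
              stored_seconds: float,
              preset_storage_seconds: float,
              access_frequency: float,
              threshold: float) -> str:
    """Return the storage system ('HDFS' or 'CFS') the data should live in next."""
    # Data that has not yet been stored for the preset duration is left in place.
    if stored_seconds < preset_storage_seconds:
        return current_tier
    # Frequency at or above the threshold marks the data as hot (HDFS);
    # frequency below the threshold marks it as cold (CFS).
    return "HDFS" if access_frequency >= threshold else "CFS"
```

A migration is needed only when the returned tier differs from `current_tier`; a periodic job in the data flow component could call such a function for each item and move the items whose tier changed.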
Exemplarily, as shown in FIG. 2, the information processing device may be a NoSQL database that includes a Regionserver and a management node; the Regionserver includes a memory, a file system interface, and a data flow component (data flow). The information processing device may first use the management node to receive the data identifier corresponding to the stored data (create table). Then, when it receives stored data written by a client, the information processing device writes the stored data directly into the memory. When it detects that the remaining storage capacity of the memory is less than the preset capacity lower-limit threshold, the information processing device detects the data access frequency corresponding to the stored data in the memory.
The information processing device screens out, from the data access frequency, the first access frequency corresponding to the hot data identifier, and determines, from the stored data, the first data corresponding to the first access frequency. Based on the at least one storage category corresponding to the at least one type of data (select storage medium), it determines that the storage system corresponding to the first data (type HDFS) is the high-frequency access storage system (HDFS client), and transfers the first data to that system through the file system interface. In the same way, it screens out the second access frequency corresponding to the cold data identifier, determines the second data corresponding to the second access frequency, determines that the storage system corresponding to the second data (type CFS) is the low-frequency access storage system (CFS client), and transfers the second data to that system through the file system interface.
Afterwards, once the first data has been stored in the high-frequency access storage system and its storage duration satisfies the preset storage duration, the information processing device periodically detects the third access frequency of the first data (periodic check: analyze files, check file access frequency, check whether the rules are satisfied, automated data migration). If the third access frequency is less than the preset access frequency threshold, the information processing device determines that the identifier of the first data is the cold data identifier and, through the data flow component, transfers the first data from the high-frequency access storage system to the low-frequency access storage system corresponding to the cold data identifier. Similarly, once the second data has been stored in the low-frequency access storage system and its storage duration satisfies the preset storage duration, the information processing device periodically detects the fourth access frequency of the second data. If the fourth access frequency is greater than or equal to the preset access frequency threshold, the information processing device determines that the identifier of the second data is the hot data identifier and, through the data flow component, transfers the second data from the low-frequency access storage system to the high-frequency access storage system corresponding to the hot data identifier. The HDFS system is a system built from multiple (for example, three) SSDs.
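As a minimal, non-limiting sketch of the write path of FIG. 2, the following Python fragment writes records into an in-memory store and, once the remaining capacity falls below the lower-limit threshold, routes each record to a storage system according to its hot or cold data identifier. The names `Record`, `MemStore`, and `STORAGE_FOR_LABEL`, and the byte-count model of capacity, are assumptions made for illustration and do not appear in the application.

```python
from dataclasses import dataclass, field

HOT, COLD = "hot", "cold"

# Hypothetical mapping from data identifier to storage system type.
STORAGE_FOR_LABEL = {HOT: "HDFS", COLD: "CFS"}


@dataclass
class Record:
    key: str
    size: int
    label: str  # HOT or COLD, received when the table is created


@dataclass
class MemStore:
    capacity: int
    lower_limit: int  # preset lower-limit threshold on remaining capacity
    records: list = field(default_factory=list)

    def remaining(self) -> int:
        return self.capacity - sum(r.size for r in self.records)

    def write(self, record: Record) -> dict:
        """Write into memory; flush when remaining capacity is below the limit."""
        self.records.append(record)
        # FIG. 2 uses "less than"; the claims use "less than or equal to".
        if self.remaining() < self.lower_limit:
            return self.flush()
        return {}

    def flush(self) -> dict:
        """Group record keys by the storage system their identifier maps to."""
        routed: dict = {}
        for r in self.records:
            routed.setdefault(STORAGE_FOR_LABEL[r.label], []).append(r.key)
        self.records.clear()
        return routed
```

In this sketch the flush returns the routing decision instead of performing I/O; a real system would hand each group to the corresponding file system client.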
Exemplarily, the information processing device may create the hot and cold tables as follows:
create 'cold:table', {NAME => 'f1', METADATA => {'DATA_STORAGE' => 'cfs'}}
create 'chsTable', {NAME => 'f', COLD_BOUNDARY => '86400'}
Exemplarily, as shown in FIG. 3, the cluster management node (management node) provides metadata management, region management, and file system initialization (initializing the HDFS system and the CFS system). When the information processing device starts, the cluster management node initializes the HDFS system and the CFS system. When the Regionserver receives stored data written by a client, it writes the stored data into the memory. When the Regionserver detects that the remaining storage capacity of the memory is less than the preset capacity lower-limit threshold and determines that the identifier of the first data is the hot data identifier, it transfers the first data to the HDFS system; when the Regionserver detects that the remaining storage capacity of the memory is less than the preset capacity lower-limit threshold and determines that the identifier of the second data is the cold data identifier, it transfers the second data to the CFS system.
Exemplarily, as shown in FIG. 4, when the information processing device receives, through the cluster management node, stored data written by a client, it writes the stored data directly into the memory. When it detects that the remaining storage capacity of the memory is less than the preset capacity lower-limit threshold, the information processing device detects the data access frequency corresponding to the stored data in the memory. It screens out, from the data access frequency, the first access frequency corresponding to the hot data identifier, and determines, from the stored data, the first data corresponding to the first access frequency. Based on the at least one storage category corresponding to the at least one type of data (type HDFS), it determines that the storage system corresponding to the first data is the high-frequency access storage system (HDFS system), and transfers the first data to the high-frequency access storage system through the file system interface.
The information processing device likewise screens out, from the data access frequency, the second access frequency corresponding to the cold data identifier, and determines, from the stored data, the second data corresponding to the second access frequency. Based on the at least one storage category corresponding to the at least one type of data (type CFS), it determines that the storage system corresponding to the second data is the low-frequency access storage system (CFS system), and transfers the second data to the low-frequency access storage system through the file system interface. Afterwards, once the first data has been stored in the high-frequency access storage system and its storage duration satisfies the preset storage duration, the information processing device detects the third access frequency of the first data. If the third access frequency is less than the preset access frequency threshold, the information processing device determines that the identifier of the first data is the cold data identifier, and uses the data flow component (data flow) to transfer the first data from the high-frequency access storage system to the low-frequency access storage system corresponding to the cold data identifier.
Once the second data has been stored in the low-frequency access storage system and its storage duration satisfies the preset storage duration, the information processing device detects the fourth access frequency of the second data. If the fourth access frequency is greater than or equal to the preset access frequency threshold, the information processing device determines that the identifier of the second data is the hot data identifier, and uses the data flow component to transfer the second data from the low-frequency access storage system to the high-frequency access storage system corresponding to the hot data identifier.
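The periodic re-evaluation described for FIG. 4 can be sketched as follows. This is an illustrative sketch only: the `StoredFile` type, the tier names `"high"` and `"low"`, and the reading of "satisfies the preset storage duration" as "has been stored at least that long" are assumptions, not details given in the application.

```python
from dataclasses import dataclass


@dataclass
class StoredFile:
    name: str
    tier: str                # "high" (high-frequency) or "low" (low-frequency)
    stored_seconds: int      # time spent in the current tier
    access_frequency: float  # frequency measured by the periodic check


def target_tier(f: StoredFile, min_stored_seconds: int, threshold: float) -> str:
    """Return the tier the file should occupy after one periodic check."""
    if f.stored_seconds < min_stored_seconds:
        return f.tier  # not stored long enough to be re-evaluated
    if f.tier == "high" and f.access_frequency < threshold:
        return "low"   # identifier becomes the cold data identifier
    if f.tier == "low" and f.access_frequency >= threshold:
        return "high"  # identifier becomes the hot data identifier
    return f.tier


def run_cycle(files: list, min_stored_seconds: int, threshold: float) -> list:
    """Apply one check to every file and return the migrations performed."""
    moved = []
    for f in files:
        target = target_tier(f, min_stored_seconds, threshold)
        if target != f.tier:
            moved.append((f.name, f.tier, target))
            f.tier = target
            f.stored_seconds = 0  # the storage duration restarts in the new tier
    return moved
```

Resetting `stored_seconds` after a move is a design choice of this sketch; it keeps a file from bouncing between tiers on consecutive checks.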
Exemplarily, as shown in FIG. 5, the information processing device stores the data written by the client in real time into the memory. When the information processing device determines that the identifier of the first data in the stored data is the hot data identifier, it transfers (flushes) the first data to the HDFS system (hot layer). When the information processing device determines that the identifier of the first data in the hot layer has changed to the cold data identifier, it determines that the storage location corresponding to the cold data identifier is the CFS system (cold layer), compresses the first data using LZ4 compression (data compression), and transfers the compressed first data to the cold layer. Afterwards, when the information processing device determines that the identifier of the first data in the cold layer has changed to the hot data identifier, it determines that the storage location corresponding to the hot data identifier is the hot layer, compresses the compressed first data using ZSTD compression (data compression) to obtain first compressed data, and transfers the first compressed data to the hot layer. The hot layer is the local file system in the file system layer, including HDD, SSD, NVM, and AEP storage-class memory; the cold layer is the storage in the file system layer, specifically including cloud storage or other cloud storage.
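The direction-dependent compression of FIG. 5 can be illustrated with the sketch below. Because the Python standard library does not provide LZ4 or ZSTD, `zlib` (at a fast level) and `lzma` are used here purely as stand-ins for them; the `CODECS` table and the `migrate` and `restore` helpers are hypothetical names introduced for this example.

```python
import lzma
import zlib

# Stand-in codecs (assumption): zlib level 1 plays the role of LZ4 on the
# hot-to-cold move, and lzma plays the role of ZSTD on the cold-to-hot move.
CODECS = {
    "to_cold": (lambda data: zlib.compress(data, 1), zlib.decompress),
    "to_hot": (lzma.compress, lzma.decompress),
}


def migrate(payload: bytes, direction: str) -> bytes:
    """Compress the payload with the codec chosen for this migration direction."""
    compress, _ = CODECS[direction]
    return compress(payload)


def restore(payload: bytes, directions: list) -> bytes:
    """Undo the compressions in reverse order to recover the original bytes."""
    for direction in reversed(directions):
        _, decompress = CODECS[direction]
        payload = decompress(payload)
    return payload
```

As in the description, the cold-to-hot step compresses data that is already compressed, so a reader of the data has to undo both layers in reverse order.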
It can be understood that, by detecting the data access frequency corresponding to the stored data in the memory, the information processing device can divide the stored data into at least one type of data based on the data access frequency, determine at least one storage system corresponding to the at least one type of data based on the at least one storage category corresponding to the at least one type of data, and transfer the at least one type of data to the at least one storage system respectively, where each type of data corresponds to one storage system. It is therefore no longer necessary to determine manually how hot the stored data is or where it should be stored, or to transfer the stored data manually, which makes the storage of the stored data more intelligent.
Embodiment 2
Based on the same inventive concept as Embodiment 1, an embodiment of the present application provides an information processing device 1 corresponding to the information processing method. FIG. 6 is a first schematic structural diagram of an information processing device provided by an embodiment of the present application. The information processing device 1 may include:
a detection part 11, configured to detect the data access frequency corresponding to the stored data in the memory when it is detected that the remaining storage capacity of the memory is less than or equal to the preset capacity lower-limit threshold;
a division part 12, configured to divide the stored data into at least one type of data based on the data access frequency, each type of data in the at least one type of data corresponding to one storage category;
a determination part 13, configured to determine at least one storage system corresponding to the at least one type of data based on the at least one storage category corresponding to the at least one type of data;
a transfer part 14, configured to transfer the at least one type of data to the at least one storage system respectively, where each type of data corresponds to one storage system.
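The four parts listed above can be read as a simple pipeline. The sketch below is one hypothetical arrangement, assuming a single access frequency threshold and the two storage categories named in `SYSTEM_BY_CATEGORY`; the function names mirror the parts but are not taken from the application.

```python
from typing import Optional

# Hypothetical storage categories and the storage systems they map to.
SYSTEM_BY_CATEGORY = {"high_frequency": "HDFS", "low_frequency": "CFS"}


def detect(frequencies: dict, remaining: int, lower_limit: int) -> Optional[dict]:
    """Detection part: report access frequencies once memory is nearly full."""
    if remaining <= lower_limit:
        return dict(frequencies)
    return None


def divide(frequencies: dict, threshold: float) -> dict:
    """Division part: group keys into classes, one storage category per class."""
    classes: dict = {}
    for key, freq in frequencies.items():
        category = "high_frequency" if freq >= threshold else "low_frequency"
        classes.setdefault(category, []).append(key)
    return classes


def determine(classes: dict) -> dict:
    """Determination part: pick one storage system for each class."""
    return {category: SYSTEM_BY_CATEGORY[category] for category in classes}


def transfer(classes: dict, targets: dict) -> dict:
    """Transfer part: hand each class of keys to its storage system."""
    return {targets[category]: keys for category, keys in classes.items()}


def process(frequencies: dict, remaining: int, lower_limit: int, threshold: float) -> dict:
    """Run detection, division, determination, and transfer in sequence."""
    detected = detect(frequencies, remaining, lower_limit)
    if detected is None:
        return {}
    classes = divide(detected, threshold)
    return transfer(classes, determine(classes))
```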
In some embodiments of the present application, the device further includes a receiving part;
the receiving part is configured to receive the data identifier corresponding to the stored data; the data identifier includes a hot data identifier and a cold data identifier;
correspondingly, the at least one type of data includes first data and/or second data; the device further includes a screening part and an identification part;
the screening part is configured to screen out the hot data identifier from the data identifier, and/or screen out the cold data identifier from the data identifier;
the identification part is configured to use the hot data identifier to identify the first access frequency corresponding to the first data, and/or use the cold data identifier to identify the second access frequency corresponding to the second data, and to take the first access frequency and/or the second access frequency as the data access frequency.
In some embodiments of the present application, the screening part is configured to screen out, from the data access frequency, the first access frequency corresponding to the hot data identifier, and/or screen out, from the data access frequency, the second access frequency corresponding to the cold data identifier;
the determination part 13 is configured to determine, from the stored data, the first data corresponding to the first access frequency, and/or determine, from the stored data, the second data corresponding to the second access frequency.
In some embodiments of the present application, the determination part 13 is configured to screen out, from the data access frequency, a first access frequency that is greater than or equal to the preset access frequency threshold, and determine, in the stored data, the first data corresponding to the first access frequency; and/or to screen out, from the data access frequency, a second access frequency that is less than the preset access frequency threshold, and determine, in the stored data, the second data corresponding to the second access frequency.
In some embodiments of the present application, the at least one storage system includes a high-frequency access storage system and/or a low-frequency access storage system;
the detection part 11 is configured to detect the third access frequency of the first data when the first data has been stored in the high-frequency access storage system and the storage duration of the first data satisfies the preset storage duration;
the determination part 13 is configured to determine that the identifier of the first data is the cold data identifier when the third access frequency is less than the preset access frequency threshold;
the transfer part 14 is configured to transfer the first data from the high-frequency access storage system to the low-frequency access storage system corresponding to the cold data identifier.
In some embodiments of the present application, the detection part 11 is configured to detect the fourth access frequency of the second data when the second data has been stored in the low-frequency access storage system and the storage duration of the second data satisfies the preset storage duration;
the determination part 13 is configured to determine that the identifier of the second data is the hot data identifier when the fourth access frequency is greater than or equal to the preset access frequency threshold;
the transfer part 14 is configured to transfer the second data from the low-frequency access storage system to the high-frequency access storage system corresponding to the hot data identifier.
It should be noted that, in practical applications, the detection part 11, the division part 12, the determination part 13, and the transfer part 14 may be implemented by the processor 15 of the information processing device 1, specifically by a CPU (Central Processing Unit), an MPU (Microprocessor Unit), a DSP (Digital Signal Processor), or an FPGA (Field Programmable Gate Array); the data storage described above may be implemented by the memory 16 of the information processing device 1.
An embodiment of the present invention further provides an information processing device 1. As shown in FIG. 7, the information processing device 1 includes a processor 15, a memory 16, and a communication bus 17. The memory 16 communicates with the processor 15 through the communication bus 17 and stores a program executable by the processor 15; when the program is executed, the processor 15 performs the information processing method described above.
In practical applications, the memory 16 may be a volatile memory, such as a random-access memory (RAM); or a non-volatile memory, such as a read-only memory (ROM), a flash memory, a hard disk drive (HDD), or a solid-state drive (SSD); or a combination of the above kinds of memory, and it provides instructions and data to the processor 15.
An embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored; when the program is executed by the processor 15, the information processing method described above is implemented.
It can be understood that, by detecting the data access frequency corresponding to the stored data in the memory, the information processing device can divide the stored data into at least one type of data based on the data access frequency, determine at least one storage system corresponding to the at least one type of data based on the at least one storage category corresponding to the at least one type of data, and transfer the at least one type of data to the at least one storage system respectively, where each type of data corresponds to one storage system. It is therefore no longer necessary to determine manually how hot the stored data is or where it should be stored, or to transfer the stored data manually, which makes the storage of the stored data more intelligent.
Those skilled in the art should understand that the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage and optical storage) that contain computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
The above descriptions are merely preferred embodiments of the present application and are not intended to limit the protection scope of the present application.
Industrial Applicability
The embodiments of the present application provide an information processing method and device, and a storage medium. The information processing method includes: when it is detected that the remaining storage capacity of the memory is less than or equal to the preset capacity lower-limit threshold, detecting the data access frequency corresponding to the stored data in the memory; dividing the stored data into at least one type of data based on the data access frequency, each type of data in the at least one type of data corresponding to one storage category; determining at least one storage system corresponding to the at least one type of data based on the at least one storage category corresponding to the at least one type of data, and transferring the at least one type of data to the at least one storage system respectively, where each type of data corresponds to one storage system.
With the above implementation, by detecting the data access frequency corresponding to the stored data in the memory, the information processing device can divide the stored data into at least one type of data based on the data access frequency, determine at least one storage system corresponding to the at least one type of data based on the at least one storage category corresponding to the at least one type of data, and transfer the at least one type of data to the at least one storage system respectively, where each type of data corresponds to one storage system. It is therefore no longer necessary to determine manually how hot the stored data is or where it should be stored, or to transfer the stored data manually, which makes the storage of the stored data more intelligent.

Claims (10)

  1. An information processing method, the method comprising:
    when it is detected that the remaining storage capacity of a memory is less than or equal to a preset capacity lower-limit threshold, detecting a data access frequency corresponding to stored data in the memory;
    dividing the stored data into at least one type of data based on the data access frequency, each type of data in the at least one type of data corresponding to one storage category;
    determining at least one storage system corresponding to the at least one type of data based on at least one storage category corresponding to the at least one type of data, and transferring the at least one type of data to the at least one storage system respectively, wherein each type of data corresponds to one storage system.
  2. The method according to claim 1, wherein before the detecting, when it is detected that the remaining storage capacity of the memory is less than or equal to the preset capacity lower-limit threshold, of the data access frequency corresponding to the stored data in the memory, the method further comprises:
    receiving a data identifier corresponding to the stored data, the data identifier comprising a hot data identifier and a cold data identifier;
    correspondingly, the at least one type of data comprises first data and/or second data, and the detecting of the data access frequency corresponding to the stored data in the memory comprises:
    screening out the hot data identifier from the data identifier, and using the hot data identifier to identify a first access frequency corresponding to the first data;
    and/or, screening out the cold data identifier from the data identifier, and using the cold data identifier to identify a second access frequency corresponding to the second data;
    taking the first access frequency and/or the second access frequency as the data access frequency.
  3. The method according to claim 2, wherein the dividing of the stored data into at least one type of data based on the data access frequency comprises:
    screening out, from the data access frequency, the first access frequency corresponding to the hot data identifier, and determining, from the stored data, the first data corresponding to the first access frequency;
    and/or, screening out, from the data access frequency, the second access frequency corresponding to the cold data identifier, and determining, from the stored data, the second data corresponding to the second access frequency.
  4. The method according to claim 1, wherein the dividing of the stored data into at least one type of data based on the data access frequency comprises:
    screening out, from the data access frequency, a first access frequency that is greater than or equal to a preset access frequency threshold, and determining, in the stored data, first data corresponding to the first access frequency;
    and/or, screening out, from the data access frequency, a second access frequency that is less than the preset access frequency threshold, and determining, in the stored data, second data corresponding to the second access frequency.
  5. The method according to claim 1, wherein the at least one storage system comprises a high-frequency access storage system and/or a low-frequency access storage system, and after the transferring of the at least one type of data to the at least one storage system respectively, the method further comprises:
    when first data has been stored in the high-frequency access storage system and the storage duration of the first data satisfies a preset storage duration, detecting a third access frequency of the first data;
    when the third access frequency is less than a preset access frequency threshold, determining that the identifier of the first data is a cold data identifier;
    transferring the first data from the high-frequency access storage system to the low-frequency access storage system corresponding to the cold data identifier.
  6. The method according to claim 1, wherein after the transferring of the at least one type of data to the at least one storage system respectively, the method further comprises:
    when second data has been stored in a low-frequency access storage system and the storage duration of the second data satisfies a preset storage duration, detecting a fourth access frequency of the second data;
    when the fourth access frequency is greater than or equal to a preset access frequency threshold, determining that the identifier of the second data is a hot data identifier;
    transferring the second data from the low-frequency access storage system to a high-frequency access storage system corresponding to the hot data identifier.
  7. An information processing device, the device comprising:
    a detection part, configured to detect a data access frequency corresponding to stored data in a memory when it is detected that the remaining storage capacity of the memory is less than or equal to a preset capacity lower-limit threshold;
    a division part, configured to divide the stored data into at least one type of data based on the data access frequency, each type of data in the at least one type of data corresponding to one storage category;
    a determination part, configured to determine at least one storage system corresponding to the at least one type of data based on at least one storage category corresponding to the at least one type of data;
    a transfer part, configured to transfer the at least one type of data to the at least one storage system respectively, wherein each type of data corresponds to one storage system.
  8. The apparatus according to claim 7, wherein the apparatus further comprises a receiving part;
    the receiving part is configured to receive data identifiers corresponding to the stored data, the data identifiers comprising a hot data identifier and a cold data identifier;
    correspondingly, the at least one type of data comprises first data and/or second data, and the apparatus further comprises a screening part and an identification part;
    the screening part is configured to screen out the hot data identifier from the data identifiers, and/or to screen out the cold data identifier from the data identifiers; and
    the identification part is configured to use the hot data identifier to identify a first access frequency corresponding to the first data, and/or to use the cold data identifier to identify a second access frequency corresponding to the second data, and to take the first access frequency and/or the second access frequency as the data access frequency.
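One possible reading of claim 8 is that the received hot/cold identifiers themselves serve as a coarse access-frequency signal, so that no counters need to be consulted. The Python sketch below illustrates that reading only. The function name, the string tags, and the "first"/"second" frequency labels are hypothetical, and ignoring unknown tags is an assumption.

```python
HOT, COLD = "hot", "cold"


def frequencies_from_identifiers(identifiers):
    """Screen hot and cold identifiers and derive a per-key frequency class.

    identifiers maps a data key to its received tag. Keys tagged HOT are
    treated as first data (first access frequency); keys tagged COLD are
    treated as second data (second access frequency). Other tags are skipped.
    """
    # Screening part: pick out the hot and the cold identifiers.
    hot_keys = [key for key, tag in identifiers.items() if tag == HOT]
    cold_keys = [key for key, tag in identifiers.items() if tag == COLD]
    # Identification part: the identifier stands in for the access frequency.
    frequency = {key: "first" for key in hot_keys}
    frequency.update({key: "second" for key in cold_keys})
    return frequency
```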
  9. An information processing apparatus, the apparatus comprising:
    a memory, a processor, and a communication bus, wherein the memory communicates with the processor through the communication bus, the memory stores an information processing program executable by the processor, and when the information processing program is executed, the processor performs the method according to any one of claims 1 to 6.
  10. A storage medium storing a computer program, applied to an information processing apparatus, wherein the computer program, when executed by a processor, implements the method according to any one of claims 1 to 6.
PCT/CN2022/088113 2021-06-09 2022-04-21 Information processing method and apparatus, and storage medium WO2022257615A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110642322.6A CN113391764A (en) 2021-06-09 2021-06-09 Information processing method and device and storage medium
CN202110642322.6 2021-06-09

Publications (1)

Publication Number Publication Date
WO2022257615A1 (en) 2022-12-15

Family

ID=77620025

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/088113 WO2022257615A1 (en) 2021-06-09 2022-04-21 Information processing method and apparatus, and storage medium

Country Status (2)

Country Link
CN (1) CN113391764A (en)
WO (1) WO2022257615A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115794045A (en) * 2023-02-07 2023-03-14 山东信息职业技术学院 Software development application data processing method based on big data
CN116303233A (en) * 2022-12-19 2023-06-23 广州市玄武无线科技股份有限公司 Data storage management method, device, equipment and computer storage medium

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113391764A (en) * 2021-06-09 2021-09-14 北京沃东天骏信息技术有限公司 Information processing method and device and storage medium
CN115686382B (en) * 2022-12-30 2023-03-21 南京鲸鲨数据科技有限公司 Data storage and reading method

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050268062A1 (en) * 2002-01-21 2005-12-01 Hitachi, Ltd. Hierarchical storage apparatus and control apparatus thereof
CN103631538A (en) * 2013-12-05 2014-03-12 华为技术有限公司 Cold and hot data identification threshold value calculation method, device and system
CN107908791A (en) * 2017-12-12 2018-04-13 郑州云海信息技术有限公司 Data cache method, device, equipment and storage medium in distributed memory system
CN109491618A (en) * 2018-11-20 2019-03-19 上海科技大学 Data management system, method, terminal and medium based on mixing storage
US20190095109A1 (en) * 2017-09-28 2019-03-28 International Business Machines Corporation Data storage system performance management
CN110019081A (en) * 2017-07-20 2019-07-16 中兴通讯股份有限公司 Data persistence processing method, device, system and readable storage medium storing program for executing
CN112905113A (en) * 2021-02-08 2021-06-04 中国工商银行股份有限公司 Data access processing method and device
CN113391764A (en) * 2021-06-09 2021-09-14 北京沃东天骏信息技术有限公司 Information processing method and device and storage medium

Also Published As

Publication number Publication date
CN113391764A (en) 2021-09-14

Similar Documents

Publication Publication Date Title
WO2022257615A1 (en) Information processing method and apparatus, and storage medium
US10761758B2 (en) Data aware deduplication object storage (DADOS)
US10228851B2 (en) Cluster storage using subsegmenting for efficient storage
US8112463B2 (en) File management method and storage system
US9792344B2 (en) Asynchronous namespace maintenance
US8166012B2 (en) Cluster storage using subsegmenting
US11086519B2 (en) System and method for granular deduplication
US9330108B2 (en) Multi-site heat map management
US7536426B2 (en) Hybrid object placement in a distributed storage system
US9313270B2 (en) Adaptive asynchronous data replication in a data storage system
US10516732B2 (en) Disconnected ingest in a distributed storage system
US9658774B2 (en) Storage system and storage control method
US9177034B2 (en) Searchable data in an object storage system
US9952772B2 (en) Compression-based detection of inefficiency in local storage
US9898485B2 (en) Dynamic context-based data protection and distribution
US10540329B2 (en) Dynamic data protection and distribution responsive to external information sources
EP3264254A1 (en) System and method for a simulation of a block storage system on an object storage system
US20170336995A1 (en) Compression-based detection of inefficiency in external services
CN117873405A (en) Data storage method, device, computer equipment and storage medium

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 22819218

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE