CN107665224B - Method, system and device for scanning HDFS cold data - Google Patents

Method, system and device for scanning HDFS cold data

Publication number
CN107665224B
CN107665224B CN201610620101.8A CN201610620101A CN107665224B CN 107665224 B CN107665224 B CN 107665224B CN 201610620101 A CN201610620101 A CN 201610620101A CN 107665224 B CN107665224 B CN 107665224B
Authority
CN
China
Prior art keywords
metadata
time
information
real
cold data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610620101.8A
Other languages
Chinese (zh)
Other versions
CN107665224A (en)
Inventor
王永光
王哲涵
唐尚文
张瑜标
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd and Beijing Jingdong Shangke Information Technology Co Ltd
Priority to CN201610620101.8A
Publication of CN107665224A
Application granted
Publication of CN107665224B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/11 File system administration, e.g. details of archiving or snapshots
    • G06F16/122 File system administration, e.g. details of archiving or snapshots using management policies
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G06F16/1824 Distributed file systems implemented using Network-attached Storage [NAS] architecture
    • G06F16/183 Provision of network file services by network file servers, e.g. by using NFS, CIFS

Abstract

The invention discloses a method, a system and a device for scanning HDFS cold data, wherein the method comprises the following steps: deriving metadata information from the metadata nodes as basic data; streaming metadata information in metadata nodes in real time, incrementally acquiring new metadata information in real time, and combining the new metadata information and the basic data into real-time metadata information to be scanned; and scanning the real-time metadata information to be scanned according to a preset rule so as to obtain cold data in real time. The system comprises a basic data acquisition module, a real-time data streaming module, a metadata storage module and a real-time calculation module. When the metadata is acquired, the metadata is streamed, and the metadata information is incrementally exported in real time, so that the pressure on the server when the metadata is exported is reduced; in addition, the method and the device scan the metadata in real time, and the timeliness of cold data discovery is greatly improved.

Description

Method, system and device for scanning HDFS cold data
Technical Field
The invention relates to the technical field of big data processing, and in particular to a method, a system and a device for scanning cold data in the Hadoop Distributed File System (HDFS).
Background
In the distributed file storage system HDFS, the number of files is huge. In general, cold data (i.e., less frequently used data) can account for over 70% of the total number of files. The existence of a large amount of cold data causes storage and access pressure of the storage system.
FIG. 1 is a schematic structural block diagram of a prior-art system for scanning HDFS cold data. In the HDFS metadata server architecture, NameNode (A) is the key node that provides services externally; it is also referred to as the metadata node, and one of its main functions is to manage the metadata information of files. The metadata information includes the directory structure and attribute information of files (folders), the mapping between files and their locations, and the like, such as file name, number of replicas, block data, and node data. To speed up metadata access, NameNode (A) generally keeps file metadata in memory, and also persists it to hard disk to form a metadata image file. In addition, modification operations on the metadata are recorded in an operation log (EditLog), which is generally stored in a JournalNode. NameNode (S) is a backup node of NameNode (A) that ensures the safety of the metadata. NameNode (S) reads the EditLog from the JournalNode and modifies its metadata according to the EditLog, so that the metadata in NameNode (S) stays consistent with the metadata in NameNode (A).
In the prior art, exporting metadata in order to scan for HDFS cold data places excessive pressure on the metadata server, and the cold data obtained by scanning the metadata has poor timeliness.
Disclosure of Invention
The technical problem to be solved by the present invention is to provide a method, a system and a device for scanning HDFS cold data that address the defects of the prior art, namely the excessive pressure on the metadata server when metadata is exported and the poor timeliness of the cold data obtained by scanning the metadata.
In order to solve the above technical problem, the present invention provides a method for scanning HDFS cold data, wherein the method comprises the following steps:
deriving metadata information from the metadata nodes as basic data;
streaming metadata information in metadata nodes in real time, incrementally acquiring new metadata information in real time, and combining the new metadata information and the basic data into real-time metadata information to be scanned;
and scanning the real-time metadata information to be scanned according to a preset rule so as to obtain cold data in real time.
Preferably, the step of streaming the metadata information in real time and incrementally acquiring new metadata information in real time includes:
incrementally acquiring an operation log in real time through an operation log node;
and restoring the metadata mirror image which is the same as the metadata in the metadata node by playing back the operation log.
Preferably, the step of deriving the metadata information from the metadata node comprises:
and analyzing the metadata mirror image in the metadata node, acquiring the latest metadata information, and exporting the latest metadata information.
Wherein the metadata information and the new metadata information as the basic data include last operation time information.
Preferably, the step of scanning the real-time metadata information to be scanned according to a predetermined rule so as to obtain cold data in real time includes:
and scanning the last operation time information of the metadata information to be scanned in real time according to the set cold data time period information, wherein when the last operation time information of the metadata to be scanned in real time is positioned in the set cold data time period, the metadata to be scanned in real time is cold data.
Preferably, after the obtaining of the new metadata information in real-time increments, the method further includes:
sending the new metadata information to a message queue,
adding new metadata information in the message queue to the base data.
The invention also provides a system for scanning HDFS cold data, which comprises:
the basic data acquisition module is used for deriving metadata information from the metadata nodes as basic data;
the real-time data streaming module is used for acquiring new metadata information in a real-time incremental manner;
the metadata storage module is used for merging and storing the basic data and the new metadata information acquired in real time and providing the metadata information to be scanned in real time; and
and the real-time computing module is used for scanning the real-time metadata information to be scanned according to a preset rule so as to obtain cold data in real time.
Preferably, the real-time data streaming module includes:
the operation log acquiring unit is used for acquiring operation logs from operation log nodes in real time in an incremental manner; and
and the restoring unit is used for playing back the acquired operation log to obtain a metadata mirror image which is the same as the metadata in the metadata node.
Preferably, the real-time computing module comprises:
the reading unit is used for reading the real-time metadata information to be scanned;
the comparison unit is used for comparing the last operation time information of the real-time metadata information to be scanned with preset cold data time period information; and
and the judging unit is used for determining the real-time metadata to be scanned as cold data when the last operation time information of the real-time metadata information to be scanned is within the preset cold data time period according to the comparison result.
Preferably, the real-time computing module further comprises:
and the parameter configuration unit is used for configuring cold data time period information and providing comparison basis for the comparison unit.
Preferably, the system further comprises a message platform comprising a message queue; and the real-time data streaming module sends the acquired new metadata information to a message queue in the message platform, and the message queue sends the new metadata information to the metadata storage module, or the metadata storage module reads the new metadata information from the message queue.
The invention also provides a device for scanning the cold data of the HDFS, which comprises a first memory and a first processor, wherein the first memory is used for storing data and instructions, and the first processor is configured as follows according to the instructions:
analyzing a metadata mirror image in a metadata node, acquiring latest metadata information, and exporting the latest metadata information;
and streaming the metadata information in the metadata node in real time, and incrementally acquiring new metadata information in real time.
Preferably, in the above apparatus for scanning HDFS cold data, when the first processor is configured to stream metadata information in real time and incrementally acquire new metadata information in real time, the specific configuration includes:
incrementally acquiring an operation log in real time through an operation log node;
and restoring the metadata mirror image which is the same as the metadata in the metadata node by playing back the operation log.
The invention also provides another device for scanning the cold data of the HDFS, which comprises a second memory and a second processor, wherein the second memory is used for storing data and instructions, and the second processor is configured as follows according to the instructions:
receiving metadata information derived from metadata nodes as basic data;
receiving new metadata information streamed from the metadata node in real time, and merging the new metadata information with the basic data into the metadata information to be scanned in real time;
and scanning the real-time metadata information to be scanned according to a preset rule so as to obtain cold data in real time.
Preferably, in the above apparatus for scanning HDFS cold data, when the second processor is configured to scan the real-time metadata information to be scanned according to a predetermined rule, so as to obtain the cold data in real time, the specific configuration is as follows:
and scanning the last operation time information of the metadata information to be scanned in real time according to the set cold data time period information, wherein when the last operation time information of the metadata to be scanned in real time is positioned in the set cold data time period, the metadata to be scanned in real time is cold data.
According to an aspect of the present invention, there is provided a computer readable storage medium storing computer instructions which, when executed by a processor, implement the above-described method of scanning HDFS cold data.
When the metadata is acquired, the metadata is streamed and the metadata information is exported in real time in an incremental manner, so that the pressure on the server when the metadata is exported is reduced; in addition, the invention accesses the metadata information to the real-time computing system, thereby greatly improving the timeliness of cold data discovery.
Drawings
The above and other objects, features and advantages of the present invention will become more apparent by describing embodiments of the present invention with reference to the following drawings, in which:
FIG. 1 is a block diagram of a prior art schematic structure for scanning HDFS cold data;
FIG. 2 is a block diagram illustrating the schematic structure of an embodiment of the present invention for scanning HDFS cold data;
FIG. 3 is a schematic flow chart of the method for scanning HDFS cold data according to the present invention;
FIG. 4 is a schematic diagram of the architecture of the scanning HDFS cold data system of the present invention;
FIG. 5 is a schematic diagram of the structure of an embodiment of a system for scanning HDFS cold data according to the present invention;
FIG. 6 is a schematic diagram of a first exemplary embodiment of an apparatus for scanning HDFS cold data according to the present invention; and
FIG. 7 is a schematic diagram of a second exemplary embodiment of a device for scanning HDFS cold data.
Detailed Description
The present invention will be described below based on examples, but the present invention is not limited to only these examples. In the following detailed description of the present invention, certain specific details are set forth. It will be apparent to one skilled in the art that the present invention may be practiced without these specific details. Well-known methods, procedures, and processes have not been described in detail so as not to obscure the present invention. The figures are not necessarily drawn to scale.
The flowcharts and block diagrams in the figures illustrate the possible architectures, functions, and operations of the systems, methods, and apparatuses according to the embodiments of the present invention. Each block may represent a module, a program segment, or merely a code segment, which is an executable instruction for implementing a specified logical function. It should also be noted that the executable instructions that implement the specified logical functions may be recombined to create new modules and program segments. The blocks of the drawings, and the order of the blocks, are thus provided to better illustrate the processes and steps of the embodiments and should not be taken as limiting the invention itself.
FIG. 2 is a block diagram illustrating the schematic structure of an embodiment of the present invention for scanning HDFS cold data. In this embodiment, the HDFS metadata server architecture is the same as that in the prior art and includes a metadata node NameNode (A), a backup metadata node NameNode (S), and an operation log node JournalNode. The metadata node NameNode (A) stores the operation log EditLog in the operation log node JournalNode; the backup metadata node NameNode (S) reads the EditLog from the JournalNode and modifies its metadata according to the EditLog, so that the metadata in NameNode (S) stays consistent with the metadata in NameNode (A). New metadata information is acquired incrementally in real time through the operation log node JournalNode and sent to a message queue in a message platform, from which it is delivered to a real-time computing system that computes cold data in real time.
In the invention, the process of continuously acquiring new metadata information in real time through the operation log node JournalNode is called streaming of data.
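The incremental idea behind streaming can be illustrated with a minimal sketch. It assumes that each operation log entry carries a monotonically increasing transaction ID and that the consumer remembers the last ID it has seen. The names `EditOp`, `fetch_since`, and the in-memory `journal` list are hypothetical stand-ins for the operation log node; they are not HDFS APIs.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass(frozen=True)
class EditOp:
    txid: int   # monotonically increasing transaction ID (assumed)
    op: str     # hypothetical operation name, e.g. "create" or "delete"
    path: str
    time: int   # operation time, seconds since the epoch


def fetch_since(journal: List[EditOp], last_txid: int) -> Iterator[EditOp]:
    """Yield only the entries logged after last_txid (the increment)."""
    for entry in journal:
        if entry.txid > last_txid:
            yield entry


journal = [
    EditOp(1, "create", "/a", 100),
    EditOp(2, "create", "/b", 110),
]
cursor = 0
first_batch = list(fetch_since(journal, cursor))
cursor = first_batch[-1].txid  # remember how far we have read

# A later operation arrives; only that one is fetched on the next poll.
journal.append(EditOp(3, "delete", "/a", 120))
second_batch = list(fetch_since(journal, cursor))
```

Because the cursor advances after every poll, each entry is transferred once, which is the property the description relies on to reduce export pressure.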
Referring to fig. 3, fig. 3 is a schematic flow chart of the method for scanning HDFS cold data according to the present invention, and the method for scanning HDFS cold data provided by the present invention is described as follows:
Step S1: metadata information is derived (exported) from the metadata node as basic data. Specifically, the metadata image in the metadata node is parsed, the latest metadata information is obtained and exported, and the metadata is stored to provide basic data for the real-time computing system. This process runs only once; that is, the basic data is obtained a single time through this step.
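As a rough illustration of the one-time base export, the sketch below loads a text dump of the metadata image into a lookup table. The tab-separated layout with `path`, `mtime`, and `atime` columns is a hypothetical format chosen for the example, and treating the later of the two timestamps as the "last operation time" is likewise an assumption; the patent does not specify either.

```python
import csv
import io
from typing import Dict

# Hypothetical export: path, last modification time, last access time
# (both in seconds since the epoch), tab-separated, with a header row.
EXPORT = (
    "path\tmtime\tatime\n"
    "/logs/a.log\t1000\t1500\n"
    "/logs/b.log\t2000\t1800\n"
)


def load_base_data(text: str) -> Dict[str, int]:
    """Map each path to its last operation time (later of mtime and atime)."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {
        row["path"]: max(int(row["mtime"]), int(row["atime"]))
        for row in reader
    }


base = load_base_data(EXPORT)
```

The resulting dictionary plays the role of the basic data that later increments are merged into.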
Step S2, streaming the metadata information in the metadata node in real time, incrementally acquiring new metadata information in real time, and merging the new metadata information with the basic data to be the metadata information to be scanned in real time. Specifically, an operation log is incrementally acquired in real time through an operation log node; and restoring the metadata mirror image which is the same as the metadata in the metadata node by playing back the operation log. Specifically, the playback refers to reading the operation logs one by one, and acquiring the operation time and corresponding operations, i.e., new metadata information.
In this step, the streaming of the metadata information is performed in real time; that is, after all the metadata has been obtained once in step S1, there is no need to obtain all the metadata information again, and only the changed metadata information needs to be obtained. The operation log stored in the operation log node JournalNode records the operations performed on the metadata, such as which operation was performed and when. Because the operation log is obtained incrementally in real time, the changed metadata can be identified, and a metadata image identical to the metadata in the metadata node can be restored through playback, so that new metadata information is obtained in real time.
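Playback can be sketched as applying log entries, in order, on top of the base data. This is a simplified model under stated assumptions: entries are `(operation, path, time)` tuples, the operation names are hypothetical, and every operation other than a delete simply refreshes the path's last operation time.

```python
from typing import Dict, Iterable, Tuple

# (operation, path, operation time); the operation names are hypothetical.
Op = Tuple[str, str, int]


def replay(base: Dict[str, int], ops: Iterable[Op]) -> Dict[str, int]:
    """Apply log entries in order and return the merged, up-to-date view."""
    merged = dict(base)  # leave the one-time base export untouched
    for op, path, when in ops:
        if op == "delete":
            merged.pop(path, None)
        else:  # "create", "write", ... all refresh the timestamp
            merged[path] = when
    return merged


base = {"/a": 100, "/b": 200}
ops = [("write", "/a", 300), ("create", "/c", 310), ("delete", "/b", 320)]
merged = replay(base, ops)
```

Copying the base before applying the entries keeps the original export available, so the replay can be repeated from a known starting point if needed.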
Step S3: the real-time metadata information to be scanned is scanned according to a preset rule, thereby obtaining cold data in real time. The metadata information serving as the basic data and the new metadata information both include last operation time information. The last operation time information of the real-time metadata information to be scanned is therefore scanned according to the set cold data time period information and checked against the set cold data time period; if it falls within that period, the metadata is cold data, and the cold data is then retrieved. Step S3 is performed by the real-time computing system in FIG. 2, and the restored new metadata information is sent to the real-time computing system through a JN interface (i.e., a JournalNode interface). As an embodiment, as shown in FIG. 2, the new metadata information may first be sent to a message queue in a message platform; the message platform then sends the new metadata information to the real-time computing system, or the real-time computing system actively queries the message queue in the message platform and acquires the new metadata information from it.
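The scanning rule of step S3 can be sketched as a filter over the merged metadata. The text does not define the shape of the "cold data time period", so this example assumes a closed interval `[start, end]` of timestamps; a 90-day idle threshold is used purely for illustration, and `scan_cold` is a hypothetical name.

```python
from typing import Dict, List


def scan_cold(metadata: Dict[str, int], start: int, end: int) -> List[str]:
    """Return paths whose last operation time lies in [start, end]."""
    return sorted(p for p, t in metadata.items() if start <= t <= end)


now = 1_700_000_000                    # fixed "current time" for the example
threshold_seconds = 90 * 24 * 3600     # hypothetical: untouched for 90 days
metadata = {
    "/hot/today": now - 60,
    "/cold/old": now - 100 * 24 * 3600,
    "/edge": now - threshold_seconds,  # exactly on the boundary
}
cold = scan_cold(metadata, start=0, end=now - threshold_seconds)
```

With a closed interval, a file whose last operation time sits exactly on the boundary is reported as cold; an implementation could equally choose a half-open interval.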
According to the principle and method for scanning the cold data of the HDFS, the present invention provides a system for scanning the cold data of the HDFS, and the structural principle of the system is shown in fig. 4. The method specifically comprises the following steps: the system comprises a basic data acquisition module 1, a real-time data streaming module 2, a metadata storage module 3 and a real-time calculation module 4. Wherein, the basic data acquisition module 1 derives metadata information from a metadata node namenode (a) as basic data; the real-time data streaming module 2 is used for acquiring new metadata information in real-time increment; the metadata storage module 3 is used for merging and storing the basic data and the new metadata information acquired in real time and providing real-time metadata information to be scanned; and the real-time computing module 4 scans the real-time metadata information to be scanned according to a preset rule so as to obtain cold data in real time.
Specifically, as shown in fig. 5, the real-time data streaming module 2 includes an operation log obtaining unit 21 and a restoring unit 22, where the operation log obtaining unit 21 is configured to obtain an operation log from an operation log node incrementally in real time; the restoring unit 22 obtains a metadata mirror image that is the same as the metadata in the metadata node by playing back the obtained operation log.
The real-time computation module 4 comprises a reading unit 41, a comparison unit 42 and a judgment unit 43. The reading unit 41 is configured to read real-time metadata information to be scanned from the metadata storage module 3; the comparison unit 42 is configured to compare the last operation time information of the real-time metadata information to be scanned with predetermined cold data time period information; the determining unit 43 determines the real-time metadata to be scanned as cold data when the last operation time information of the real-time metadata information to be scanned is within the predetermined cold data time period according to the comparison result. In order to configure the cold data time period information, a configuration unit 44 may be further included, and the cold data time period information is configured through the configuration unit 44 to provide a basis for the comparison unit to compare.
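One possible decomposition of the real-time computation module into its units is sketched below. The class and function names are hypothetical, the metadata store is modeled as a plain dictionary from path to last operation time, and the cold data time period is assumed to be a closed interval.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List, Tuple


@dataclass
class ColdPeriodConfig:  # plays the role of configuration unit 44
    start: int
    end: int


def read_unit(store: Dict[str, int]) -> Iterator[Tuple[str, int]]:
    """Reading unit 41: iterate over (path, last operation time) records."""
    yield from store.items()


def comparison_unit(last_op_time: int, cfg: ColdPeriodConfig) -> bool:
    """Comparison unit 42: is the timestamp inside the configured period?"""
    return cfg.start <= last_op_time <= cfg.end


def judging_unit(store: Dict[str, int], cfg: ColdPeriodConfig) -> List[str]:
    """Judging unit 43: collect the records the comparison marks as cold."""
    return [path for path, t in read_unit(store) if comparison_unit(t, cfg)]


store = {"/x": 50, "/y": 500}
cfg = ColdPeriodConfig(start=0, end=100)
cold_paths = judging_unit(store, cfg)

# Reconfiguring the period changes the result without touching the store.
cfg.end = 1000
cold_after_reconfig = judging_unit(store, cfg)
```

Keeping the period in a separate configuration object mirrors the description's point that the configuration unit only supplies the basis for comparison.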
Additionally, in one embodiment, a messaging platform may also be included, the messaging platform including a message queue. The real-time data streaming module 2 sends the acquired new metadata information to a message queue in the message platform, and the message queue sends the new metadata information to the metadata storage module 3, or the metadata storage module 3 reads the new metadata information from the message queue.
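The pull variant of this embodiment, in which the metadata storage module reads from the queue, can be sketched with Python's standard `queue.Queue` standing in for the message platform. The helper names `publish` and `drain_into` are hypothetical, and a production message platform would add persistence and delivery guarantees that this in-process queue does not have.

```python
import queue
from typing import Dict, Iterable, Tuple

Update = Tuple[str, int]  # (path, last operation time)


def publish(q: "queue.Queue[Update]", updates: Iterable[Update]) -> None:
    """Streaming side: push each new metadata record onto the queue."""
    for update in updates:
        q.put(update)


def drain_into(q: "queue.Queue[Update]", store: Dict[str, int]) -> int:
    """Storage side (pull model): read until the queue is empty."""
    applied = 0
    while True:
        try:
            path, when = q.get_nowait()
        except queue.Empty:
            return applied
        store[path] = when  # merge the new record into the stored metadata
        applied += 1


q: "queue.Queue[Update]" = queue.Queue()
store = {"/a": 100}
publish(q, [("/a", 300), ("/b", 310)])
applied = drain_into(q, store)
```

The queue decouples the two modules: the streaming side never waits for storage, and the storage side can merge updates at its own pace.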
The present invention also provides a first apparatus for scanning HDFS cold data, as shown in fig. 6, including a first memory 100 and a first processor 101, where the first memory 100 is used to store data and instructions, and the first processor 101 is configured as follows according to the instructions:
analyzing a metadata mirror image in a metadata node, acquiring latest metadata information, and exporting the latest metadata information;
and streaming the metadata information in the metadata node in real time, and incrementally acquiring new metadata information in real time. Specifically, an operation log is incrementally acquired in real time through an operation log node; and restoring the metadata mirror image which is the same as the metadata in the metadata node by playing back the operation log.
The first device is located in the HDFS metadata server and is used to obtain the metadata information for scanning.
The present invention further provides another apparatus for scanning HDFS cold data, as shown in fig. 7, including a second memory 200 and a second processor 201, where the second memory 200 is used to store data and instructions, and the second processor 201 is configured as follows according to the instructions:
receiving metadata information derived from metadata nodes as basic data;
receiving new metadata information streamed from the metadata node in real time, and merging the new metadata information with the basic data into the metadata information to be scanned in real time;
and scanning the real-time metadata information to be scanned according to a preset rule so as to obtain cold data in real time. The concrete configuration is as follows:
and scanning the last operation time information of the metadata information to be scanned in real time according to the set cold data time period information, wherein when the last operation time information of the metadata to be scanned in real time is positioned in the set cold data time period, the metadata to be scanned in real time is cold data.
The second device corresponds to the real-time computing system shown in FIG. 2 and is used to complete the scanning of the metadata, so as to obtain the cold data in real time.
When the metadata is acquired, the metadata is streamed and the metadata information is exported in real time in an incremental manner, so that the pressure on the server when the metadata is exported is reduced; in addition, the invention accesses the metadata information to the real-time computing system, thereby greatly improving the timeliness of cold data discovery.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (13)

1. A method of scanning HDFS cold data, comprising the steps of:
deriving metadata information from the metadata nodes as basic data;
streaming metadata information in metadata nodes in real time, incrementally acquiring new metadata information in real time, and combining the new metadata information and the basic data into real-time metadata information to be scanned;
and scanning the last operation time information of the metadata information to be scanned in real time according to the set cold data time period information, wherein when the last operation time information of the metadata to be scanned in real time is positioned in the set cold data time period, the metadata to be scanned in real time is cold data.
2. The method of scanning HDFS cold data according to claim 1, wherein said streaming metadata information in real-time, incrementally acquiring new metadata information in real-time, comprises:
incrementally acquiring an operation log in real time through an operation log node;
and restoring the metadata mirror image which is the same as the metadata in the metadata node by playing back the operation log.
3. The method of scanning HDFS cold data according to claim 1 or 2, wherein the step of deriving metadata information from metadata nodes comprises:
and analyzing the metadata mirror image in the metadata node, acquiring the latest metadata information, and exporting the latest metadata information.
4. The method of scanning HDFS cold data according to claim 3, wherein the metadata information and new metadata information as base data include last operation time information.
5. The method of scanning HDFS cold data according to claim 1, further comprising, after incrementally acquiring new metadata information in real-time:
sending the new metadata information to a message queue;
adding new metadata information in the message queue to the base data.
6. A system for scanning HDFS cold data, comprising:
the basic data acquisition module is used for deriving metadata information from the metadata nodes as basic data;
the real-time data streaming module is used for acquiring new metadata information in a real-time incremental manner;
the metadata storage module is used for merging and storing the basic data and the new metadata information acquired in real time and providing the metadata information to be scanned in real time; and
and the real-time computing module is used for scanning the last operation time information of the metadata information to be scanned in real time according to the set cold data time period information, and when the last operation time information of the metadata to be scanned in real time is positioned in the set cold data time period, the metadata to be scanned in real time is cold data.
7. The system for scanning HDFS cold data of claim 6, wherein the real-time data streaming module comprises:
the operation log acquiring unit is used for acquiring operation logs from operation log nodes in real time in an incremental manner; and
and the restoring unit is used for playing back the acquired operation log to obtain a metadata mirror image which is the same as the metadata in the metadata node.
8. The system for scanning HDFS cold data according to claim 7, wherein the real-time computation module comprises:
the reading unit is used for reading the real-time metadata information to be scanned;
the comparison unit is used for comparing the last operation time information of the real-time metadata information to be scanned with preset cold data time period information; and
and the judging unit is used for determining the real-time metadata to be scanned as cold data when the last operation time information of the real-time metadata information to be scanned is within the preset cold data time period according to the comparison result.
9. The system for scanning HDFS cold data of claim 8, wherein the real-time computation module further comprises:
and the parameter configuration unit is used for configuring cold data time period information and providing comparison basis for the comparison unit.
10. The system for scanning HDFS cold data of claim 6, further comprising a message platform, the message platform including a message queue; and the real-time data streaming module sends the acquired new metadata information to a message queue in the message platform, and the message queue sends the new metadata information to the metadata storage module, or the metadata storage module reads the new metadata information from the message queue.
11. An apparatus for scanning HDFS cold data, comprising a first memory for storing data and instructions, and a first processor configured according to the instructions to:
deriving metadata information from the metadata nodes as basic data;
streaming metadata information in metadata nodes in real time, incrementally acquiring new metadata information in real time, and combining the new metadata information and the basic data into real-time metadata information to be scanned;
and scanning the last operation time information of the metadata information to be scanned in real time according to the set cold data time period information, wherein when the last operation time information of the metadata to be scanned in real time is positioned in the set cold data time period, the metadata to be scanned in real time is cold data.
12. The apparatus for scanning HDFS cold data according to claim 11, wherein the first processor, when configured to stream metadata information in a metadata node in real-time and incrementally obtain new metadata information in real-time, is specifically configured to:
incrementally acquire an operation log in real time through an operation log node;
and restore, by replaying the operation log, a metadata image identical to the metadata in the metadata node.
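The replay step of claim 12 can be illustrated with a small sketch. The record format below, a tuple of transaction id, operation name, and arguments, is a simplified, hypothetical one and is not the actual HDFS operation log format; `replay` is likewise an assumed name. Tracking the last applied transaction id is what makes repeated, incremental replay safe.

```python
from typing import Dict, Iterable, Tuple


def replay(image: Dict[str, int], edits: Iterable[Tuple], last_txid: int = 0) -> int:
    """Apply log records newer than last_txid to the in-memory metadata image.

    image maps path -> last operation time and is updated in place.
    Returns the id of the last transaction applied.
    """
    for txid, op, *args in edits:
        if txid <= last_txid:
            continue  # already applied in an earlier pass
        if op in ("create", "touch"):
            path, timestamp = args
            image[path] = timestamp
        elif op == "rename":
            src, dst = args
            if src in image:
                image[dst] = image.pop(src)  # keep the existing timestamp
        elif op == "delete":
            (path,) = args
            image.pop(path, None)
        else:
            raise ValueError(f"unknown operation: {op}")
        last_txid = txid
    return last_txid
```

Because the image is rebuilt outside the metadata node, scanning it for cold data places no query load on the node that serves live traffic.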
13. A computer readable storage medium storing computer instructions which, when executed by a processor, implement a method of scanning HDFS cold data as claimed in any one of claims 1 to 5.
CN201610620101.8A 2016-07-29 2016-07-29 Method, system and device for scanning HDFS cold data Active CN107665224B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610620101.8A CN107665224B (en) 2016-07-29 2016-07-29 Method, system and device for scanning HDFS cold data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610620101.8A CN107665224B (en) 2016-07-29 2016-07-29 Method, system and device for scanning HDFS cold data

Publications (2)

Publication Number Publication Date
CN107665224A CN107665224A (en) 2018-02-06
CN107665224B CN107665224B (en) 2021-04-30

Family

ID=61122020

Family Applications (1)

Application Number Status Publication Priority Date Filing Date Title
CN201610620101.8A Active CN107665224B (en) 2016-07-29 2016-07-29 Method, system and device for scanning HDFS cold data

Country Status (1)

Country Link
CN (1) CN107665224B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109918911B (en) * 2019-03-18 2020-11-03 北京升鑫网络科技有限公司 Method and equipment for scanning mirror image installation package information
CN113760855A (en) * 2021-09-10 2021-12-07 北京金山云网络技术有限公司 Data storage method and device, electronic equipment and storage medium
CN113760854A (en) * 2021-09-10 2021-12-07 北京金山云网络技术有限公司 Method for identifying data in HDFS memory and related equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101101563A (en) * 2007-07-23 2008-01-09 清华大学 Migration management based on massive data classified memory system
CN102449975A (en) * 2009-04-09 2012-05-09 诺基亚公司 Systems, methods, and apparatuses for media file streaming
CN103064902A (en) * 2012-12-18 2013-04-24 厦门市美亚柏科信息股份有限公司 Method and device for storing and reading data in hadoop distributed file system (HDFS)
CN104572357A (en) * 2014-12-30 2015-04-29 清华大学 Backup and recovery method for HDFS (Hadoop distributed filesystem)
CN105051696A (en) * 2013-01-10 2015-11-11 网络流逻辑公司 An improved streaming method and system for processing network metadata

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6981005B1 (en) * 2000-08-24 2005-12-27 Microsoft Corporation Partial migration of an object to another storage location in a computer system

Also Published As

Publication number Publication date
CN107665224A (en) 2018-02-06

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant