CN108984345B - Big data backup method based on virtual shared directory - Google Patents

Big data backup method based on virtual shared directory


Publication number
CN108984345B
Authority
CN
China
Prior art keywords
data
big data
backup
medium
nfs
Prior art date
Legal status
Active
Application number
CN201810776448.0A
Other languages
Chinese (zh)
Other versions
CN108984345A (en)
Inventor
匙凯
于富东
胡建华
杨林
崔明阳
Current Assignee
Jilin Jlu Communication Design Institute Co ltd
Original Assignee
Jilin Jlu Communication Design Institute Co ltd
Priority date
Filing date
Publication date
Application filed by Jilin Jlu Communication Design Institute Co ltd
Priority to CN201810776448.0A
Publication of CN108984345A
Application granted
Publication of CN108984345B
Legal status: Active


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process
    • G06F11/1464 Management of the backup or restore process for networked environments
    • G06F11/1461 Backup scheduling policy
    • G06F11/1469 Backup restoration techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A big data backup method based on a virtual shared directory. Local storage on a media server exposes a file sharing protocol interface, and a virtual shared directory is established. The interface is provided to a big data platform A that is to be backed up; when platform A needs a backup, it mounts the partition locally and obtains sharing rights to the virtual directory. After the backup finishes, the partition is disconnected and returned to the media server, which then provides the shared directory service to another storage server.

Description

Big data backup method based on virtual shared directory
Technical Field
The invention belongs to the technical field of data backup, and particularly relates to a big data backup method for improving big data backup efficiency.
Background
In the big data era the value of data is ever more critical, and the safety of the data running on big data platforms must be guaranteed. A faster and more universal backup technology is therefore needed to back up data from various big data platforms while ensuring backup efficiency and compatibility.
At present, data backup for a big data platform generally follows an architecture made up of the following parts: backup agents, media servers and storage media.
The details of the specific implementation can be roughly divided into the following two types:
(1) Client agent -> HTTP -> Media server -> iSCSI -> Storage medium
The backup agent is installed on the big data host to be backed up. It collects the backup data and transmits it to the media server over HTTP. The media server is usually deployed independently; it collects data from each backup agent, performs deduplication and compression, and then transmits the data to a storage medium (such as a disk) through an iSCSI interface for storage.
(2) Client agent -> HTTP -> Media server -> HTTP -> Storage medium
The backup agent is installed on the big data host to be backed up. It collects the backup data and transmits it to the media server over HTTP. The media server is deployed independently; it collects data from each backup agent, performs deduplication and compression, and then transmits the data to a storage medium (such as object storage) through an HTTP interface for storage.
In prior art (1), a dedicated collection client is required for each kind of backup object. The agent must first transfer data from the real data source (such as a Hadoop NameNode) to a temporary directory on the host, then cut the data in that directory into blocks (for example, one 64K data block at a time) and transmit each block to the media server over HTTP. After the media server receives the data and completes a series of deduplication and compression steps, it transmits the data to a dedicated storage medium (such as a disk) over an FC network using the iSCSI protocol. Across the whole process the data passes through four key time-consuming steps (local temporary storage by the agent, local block cutting, network transmission to the media server, and network transmission from the media server to the storage medium), so backup efficiency is difficult to guarantee, and the large number of links increases the operational risk of the system.
Technology (2) differs from technology (1) only in the back-end storage protocol: after the data reaches the media server, it is not transmitted to the storage medium over iSCSI but is cut into blocks again and transmitted to object storage over HTTP. The overall inefficiency and risk are therefore not effectively avoided. In addition, a corresponding client agent still has to be developed for each type of big data platform, so neither the complexity nor the compatibility of the backup system is improved. Therefore, there is a need in the art for a new solution to this problem.
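The block-cutting step described for prior art (1) can be illustrated with a minimal Python sketch. It only shows how a byte stream is split into 64K blocks before each block would be sent on; the function name `iter_blocks` and the constant `BLOCK_SIZE` are illustrative assumptions, not names from the patent, and the HTTP transfer itself is left out.

```python
import io
from typing import BinaryIO, Iterator

# 64K is the example block size given in the text; the name is hypothetical.
BLOCK_SIZE = 64 * 1024


def iter_blocks(stream: BinaryIO, block_size: int = BLOCK_SIZE) -> Iterator[bytes]:
    """Yield successive blocks read from the stream.

    Each block holds at most block_size bytes; the final block is
    usually shorter than the others.
    """
    while True:
        block = stream.read(block_size)
        if not block:  # an empty read means the stream is exhausted
            break
        yield block


if __name__ == "__main__":
    # 150 KiB of zero bytes stands in for a file in the agent's temporary directory.
    staged = io.BytesIO(bytes(150 * 1024))
    print([len(b) for b in iter_blocks(staged)])
```

Every block produced this way is one more unit that has to cross the network twice in the prior-art pipeline, which is the overhead the text criticizes.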
Disclosure of Invention
The technical problem to be solved by the invention is to provide a big data backup method based on a virtual shared directory that improves the compatibility of a data backup system across heterogeneous big data platforms, simplifies the backup process of the big data platform backup system, and improves backup efficiency.
A big data backup method based on a virtual shared directory, characterized in that the method comprises the following steps:
step one, establishing a virtual shared data storage backup system comprising a big data platform, a backup medium layer, a medium service layer and a storage medium;
step two, the big data platform sends a backup request to the system; the backup medium layer remotely mounts the NFS (Network File System) agent on the big data platform, providing the big data platform with a virtual shared directory based on the NFS protocol, and the data is temporarily stored in an internal directory of the NFS agent;
step three, after temporary storage in the NFS agent provided by the backup medium layer is complete, the virtual shared link is disconnected, and the data of the big data platform now belongs to the backup medium layer;
step four, after the backup medium layer processes the data, the NFS agent is passed to the storage medium, and the data of the big data platform is retained on the storage medium;
step five, the big data platform initiates a data recovery request; the backup medium layer locates the corresponding data on the storage medium, establishes a shared virtual directory through the NFS agent, and sends the shared virtual directory to the medium service layer;
step six, the medium service layer mounts the NFS agent on the big data platform again, and the big data platform obtains file-level access to the data;
step seven, the big data platform restores the data to the production environment and carries out the restore operation, completing the big data backup based on the virtual shared directory.
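The seven steps can be walked through with a small simulation. This is only a sketch under stated assumptions: real NFS mounting is replaced by handing out a local directory path, temporary directories stand in for the big data platform, the backup medium layer and the storage medium, and the class and method names (`BackupMediaLayer`, `mount_share`, `persist`, and so on) are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


class BackupMediaLayer:
    """Toy model of the backup medium layer; mounting is simulated with local paths."""

    def __init__(self, staging_root: Path, storage_root: Path) -> None:
        self.staging_root = staging_root  # stands in for the NFS agent's internal directory
        self.storage_root = storage_root  # stands in for the storage medium
        self.mounted = set()              # platforms that currently hold the share

    def mount_share(self, platform: str) -> Path:
        """Step two: give the platform a shared directory to write into."""
        share = self.staging_root / platform
        share.mkdir(parents=True, exist_ok=True)
        self.mounted.add(platform)
        return share

    def unmount_share(self, platform: str) -> None:
        """Step three: disconnect the share; the data now belongs to this layer."""
        self.mounted.discard(platform)

    def persist(self, platform: str) -> Path:
        """Step four: keep the staged data on the storage medium."""
        if platform in self.mounted:
            raise RuntimeError("share must be disconnected before persisting")
        target = self.storage_root / platform
        shutil.copytree(self.staging_root / platform, target, dirs_exist_ok=True)
        return target

    def restore_share(self, platform: str) -> Path:
        """Steps five and six: expose the stored data to the platform again."""
        share = self.staging_root / (platform + "-restore")
        shutil.copytree(self.storage_root / platform, share, dirs_exist_ok=True)
        self.mounted.add(platform)
        return share


def run_demo() -> bool:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        production = root / "production"
        production.mkdir()
        (production / "part-0000").write_text("rows")

        layer = BackupMediaLayer(root / "staging", root / "storage")  # step one
        share = layer.mount_share("platform_a")                       # step two
        shutil.copytree(production, share, dirs_exist_ok=True)        # plain file copy
        layer.unmount_share("platform_a")                             # step three
        layer.persist("platform_a")                                   # step four

        shutil.rmtree(production)                                     # simulate data loss
        restored = layer.restore_share("platform_a")                  # steps five and six
        shutil.copytree(restored, production)                         # step seven
        return (production / "part-0000").read_text() == "rows"


if __name__ == "__main__":
    print(run_demo())
```

The point of the sketch is that the platform only ever performs ordinary file copies against a directory; no platform-specific agent code appears anywhere in the flow.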
The storage medium is a physical terminal device that actually stores the data; it can be partitioned automatically and can hold backup data for more than one big data platform at the same time.
The backup medium layer adapts the data receiving layer corresponding to the NFS agent to the storage medium, and is used for temporary storage and processing of data.
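How a single storage medium might hold backups for several platforms can be sketched as a simple partition allocator. The patent does not specify a partitioning scheme, so the fixed-size allocation, the `StorageMedium` class and the partition naming below are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class StorageMedium:
    """Hypothetical model of a storage medium that partitions itself per platform."""

    capacity_gb: int
    partitions: Dict[str, int] = field(default_factory=dict)  # platform -> size in GB

    def free_gb(self) -> int:
        """Capacity not yet assigned to any platform."""
        return self.capacity_gb - sum(self.partitions.values())

    def allocate(self, platform: str, size_gb: int) -> str:
        """Create a new partition for a platform and return its label."""
        if platform in self.partitions:
            raise ValueError(f"{platform} already has a partition")
        if size_gb > self.free_gb():
            raise ValueError("not enough free capacity")
        self.partitions[platform] = size_gb
        return f"partition-{len(self.partitions)}"


if __name__ == "__main__":
    medium = StorageMedium(capacity_gb=100)
    print(medium.allocate("platform_a", 40))
    print(medium.allocate("platform_b", 50))
    print(medium.free_gb())
```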
Through the above design, the invention provides the following beneficial effects: the big data backup method based on a virtual shared directory improves the compatibility of a data backup system across heterogeneous big data platforms, simplifies the backup process of the big data platform backup system, and improves backup efficiency.
The invention also provides the following further beneficial effects: the virtual shared directory is created by two remote mounts, which removes the complexity caused by the repeated processing and transmission in existing backup software and improves the efficiency of backup and recovery.
The remote mounting technique of the invention is based on the NFS protocol. Because this general-purpose file protocol can be adapted to many kinds of big data platform, the compatibility of big data platform backup is improved without the per-platform client software that traditional backup products require.
Drawings
The invention is further described with reference to the following figures and detailed description:
Fig. 1 is a schematic block diagram of the process of the big data backup method based on a virtual shared directory according to the present invention.
Detailed Description
A big data backup method based on a virtual shared directory, characterized in that the method comprises the following steps:
step one, establishing a virtual shared data storage backup system comprising a big data platform, a backup medium layer, a medium service layer and a storage medium;
step two, the big data platform sends a backup request to the system; the backup medium layer remotely mounts the NFS (Network File System) agent on the big data platform, providing the big data platform with a virtual shared directory based on the NFS protocol, and the data is temporarily stored in an internal directory of the NFS agent;
step three, after temporary storage in the NFS agent provided by the backup medium layer is complete, the virtual shared link is disconnected, and the data of the big data platform now belongs to the backup medium layer;
step four, after the backup medium layer processes the data, the NFS agent is passed to the storage medium, and the data of the big data platform is retained on the storage medium; the virtual shared directory is provided to the storage medium by remote mounting, so that the backup data is persisted to disk on the storage medium, that is, the shared directory is kept at the storage medium as storage; when another big data platform needs to be backed up at this point, a new partition is created on the storage medium to hold the new backup data;
step five, the big data platform initiates a data recovery request; the backup medium layer locates the corresponding data on the storage medium, establishes a shared virtual directory through the NFS agent, and sends the shared virtual directory to the medium service layer;
step six, the medium service layer mounts the NFS agent on the big data platform again, and the big data platform obtains file-level access to the data;
step seven, the big data platform restores the data to the production environment and carries out the restore operation, completing the big data backup based on the virtual shared directory.
The invention exposes a file sharing protocol interface from the local storage on the media server and establishes a virtual shared directory. When the interface is provided to a big data platform A that needs to be backed up, platform A mounts the partition locally when a backup is required and thereby obtains sharing rights to the virtual directory. After the backup finishes, the partition is disconnected and returned to the media server, which then provides the shared directory service to another storage server. Backup of big data files is thus achieved simply by copying files.
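On a Linux host, sharing a directory over NFS and mounting it usually comes down to one line in `/etc/exports` on the server and one `mount` command on the client. The sketch below only builds those strings, so it can be read or tested without root access. The paths, host names and export options shown are assumptions chosen for illustration; the patent does not name any of them.

```python
from typing import List, Sequence


def exports_line(export_dir: str, client_host: str,
                 options: Sequence[str] = ("rw", "sync", "no_subtree_check")) -> str:
    """Build one /etc/exports entry that shares export_dir with a single client."""
    return f"{export_dir} {client_host}({','.join(options)})"


def mount_command(server: str, export_dir: str, mount_point: str) -> List[str]:
    """Build the client-side command that mounts the shared directory."""
    return ["mount", "-t", "nfs", f"{server}:{export_dir}", mount_point]


def umount_command(mount_point: str) -> List[str]:
    """Build the command that disconnects the share once the copy has finished."""
    return ["umount", mount_point]


if __name__ == "__main__":
    # Hypothetical names: a media server sharing a partition with platform A.
    print(exports_line("/srv/backup/platform_a", "bigdata-a.example"))
    print(" ".join(mount_command("media.example", "/srv/backup/platform_a", "/mnt/backup")))
    print(" ".join(umount_command("/mnt/backup")))
```

Between the mount and the unmount, the backup itself would be an ordinary recursive file copy into the mount point, which is what the text means by realizing backup through file copying.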
The recovery process is the reverse of the backup process, except that the order of the two data shares is different.
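Because recovery is also a plain file copy in the opposite direction, it can be sketched in a few lines. The checksum comparison at the end is an addition for illustration only; the patent does not describe any verification step, and the function names are hypothetical.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path
from typing import Dict


def tree_digests(root: Path) -> Dict[str, str]:
    """Map each file's path relative to root to the SHA-256 of its content."""
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def restore_and_verify(backup_dir: Path, production_dir: Path) -> bool:
    """Copy the backup into production, then check both trees hold the same files."""
    shutil.copytree(backup_dir, production_dir, dirs_exist_ok=True)
    return tree_digests(backup_dir) == tree_digests(production_dir)


def demo() -> bool:
    with tempfile.TemporaryDirectory() as tmp:
        backup = Path(tmp) / "mounted_share"  # stands in for the re-mounted shared directory
        (backup / "table").mkdir(parents=True)
        (backup / "table" / "part-0000").write_bytes(b"rows")
        return restore_and_verify(backup, Path(tmp) / "production")


if __name__ == "__main__":
    print(demo())
```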

Claims (3)

1. A big data backup method based on a virtual shared directory, characterized by comprising the following steps:
step one, establishing a virtual shared data storage backup system comprising a big data platform, a backup medium layer, a medium service layer and a storage medium;
step two, the big data platform sends a backup request to the system; the backup medium layer remotely mounts the NFS (Network File System) agent on the big data platform, providing the big data platform with a virtual shared directory based on the NFS protocol, and the data is temporarily stored in an internal directory of the NFS agent;
step three, after temporary storage in the NFS agent provided by the backup medium layer is complete, the virtual shared link is disconnected, and the data of the big data platform now belongs to the backup medium layer;
step four, after the backup medium layer processes the data, the NFS agent is passed to the storage medium, and the data of the big data platform is retained on the storage medium;
step five, the big data platform initiates a data recovery request; the backup medium layer locates the corresponding data on the storage medium, establishes a shared virtual directory through the NFS agent, and sends the shared virtual directory to the medium service layer;
step six, the medium service layer mounts the NFS agent on the big data platform again, and the big data platform obtains file-level access to the data;
step seven, the big data platform restores the data to the production environment and carries out the restore operation, completing the big data backup based on the virtual shared directory.
2. The big data backup method based on a virtual shared directory as claimed in claim 1, wherein: the storage medium is a disk that actually stores the data, can be partitioned automatically, and is used to store backup data for more than one big data platform at the same time.
3. The big data backup method based on a virtual shared directory as claimed in claim 1, wherein: the backup medium layer adapts the data receiving layer corresponding to the NFS agent to the storage medium, and is used for temporary storage and processing of data.
CN201810776448.0A 2018-07-11 2018-07-11 Big data backup method based on virtual shared directory Active CN108984345B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810776448.0A CN108984345B (en) 2018-07-11 2018-07-11 Big data backup method based on virtual shared directory


Publications (2)

Publication Number Publication Date
CN108984345A CN108984345A (en) 2018-12-11
CN108984345B true CN108984345B (en) 2020-06-23

Family

ID=64548399

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810776448.0A Active CN108984345B (en) 2018-07-11 2018-07-11 Big data backup method based on virtual shared directory

Country Status (1)

Country Link
CN (1) CN108984345B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111399984A (en) * 2020-03-19 2020-07-10 上海英方软件股份有限公司 File recovery method and system based on virtual machine backup data

Citations (8)

Publication number Priority date Publication date Assignee Title
CN1554055A (en) * 2001-07-23 2004-12-08 �Ƚ�΢װ�ù�˾ High-availability cluster virtual server system
CN102375955A (en) * 2010-08-17 2012-03-14 伊姆西公司 System and method for locking files in combined naming space in network file system
US8429140B1 (en) * 2010-11-03 2013-04-23 Netapp. Inc. System and method for representing application objects in standardized form for policy management
US8655851B2 (en) * 2011-04-08 2014-02-18 Symantec Corporation Method and system for performing a clean file lock recovery during a network filesystem server migration or failover
CN103761168A (en) * 2014-01-26 2014-04-30 上海爱数软件有限公司 Method for mounting backup virtual machine based on nfs volume
CN104461776A (en) * 2014-11-26 2015-03-25 上海爱数软件有限公司 Application disaster tolerance method based on CDP and iSCSI virtual disk technology
CN105224256A (en) * 2015-10-13 2016-01-06 浪潮(北京)电子信息产业有限公司 A kind of storage system
CN105740052A (en) * 2016-01-28 2016-07-06 浪潮(北京)电子信息产业有限公司 Method, device and system for online migration of virtual machines of non-shared memories

Family Cites Families (5)

Publication number Priority date Publication date Assignee Title
US7103638B1 (en) * 2002-09-04 2006-09-05 Veritas Operating Corporation Mechanism to re-export NFS client mount points from nodes in a cluster
US8694469B2 (en) * 2009-12-28 2014-04-08 Riverbed Technology, Inc. Cloud synthetic backups
US10108687B2 (en) * 2015-01-21 2018-10-23 Commvault Systems, Inc. Database protection using block-level mapping
CN105468476B (en) * 2015-11-18 2019-03-08 盛趣信息技术(上海)有限公司 Data disaster recovery and backup systems based on HDFS
CN106250270B (en) * 2016-07-28 2019-05-21 广东奥飞数据科技股份有限公司 A kind of data back up method under cloud computing platform


Non-Patent Citations (2)

Title
NetBackup Disk Based Data Protection Options;Alex Davies;《eval.symantec.com/enterprise/white_papers》;20090430;全文 *
基于虚拟化技术的三级存储方案研究与实现;韩雪;《万方数据知识服务平台》;20150730;全文 *

Also Published As

Publication number Publication date
CN108984345A (en) 2018-12-11


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant