WO2018019310A1 - Data backup method, recovery method, and device in a big data system, and computer storage medium - Google Patents

Data backup method, recovery method, and device in a big data system, and computer storage medium

Info

Publication number
WO2018019310A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
backup
recovery
application
storage system
Prior art date
Application number
PCT/CN2017/098606
Other languages
English (en)
French (fr)
Inventor
谢东
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2018019310A1 (zh)

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/07: Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14: Error detection or correction of the data by redundancy in operation
    • G06F11/1402: Saving, restoring, recovering or retrying
    • G06F11/1446: Point-in-time backing up or restoration of persistent data
    • G06F11/1448: Management of the data involved in backup or backup restore

Definitions

  • The present disclosure relates to the field of communications technologies, and in particular to a data backup method, a recovery method, and a device in a big data system, and a computer storage medium.
  • In a communication system, the database is the center of data storage and management. After the various data are collected, they are sorted, cleaned, verified, and normalized, and then continuously imported into the database. During the daily operation of the system, some periodic data needs to be preserved for a long time. For example, a user bill serves as an important document with important uses, and monthly statistical report data serves as a reference for decision making and needs to be saved. Therefore, the backup and recovery of such data is important basic work.
  • the database system is divided into a relational database and a non-relational database.
  • Relational databases have a rigorous mathematical theoretical foundation, and database vendors usually provide a complete backup and recovery solution.
  • the database system maintains the internal clock and automatically generates the system change number. This number has global uniqueness, is assigned sequentially, and automatically grows with the database.
  • Data backup: first obtain the current system change number. Based on this number, the system saves the data of the entire database or table to the backup file as a snapshot. Changes that occur in the system during the backup are called incremental data and are not written to the backup file.
  • Data recovery: open the backup file, read its contents, and write them to the database. Content that is not in the database is created; content already in the database is overwritten.
  • the backup data has a low value density
  • the inventor found through in-depth research: in the big data system, since the database carries the application system, it is more practical to consider the backup and recovery scheme in combination with the database and the application system as a whole.
  • the data of key individuals in certain key business segments is very important and needs to be backed up.
  • the total amount of the call will change, and the user often needs to make a summary of the current bill as the proof of consumption. In the future, the balance of the account will change with new consumption. Therefore, users need to do a data backup every time they pay.
  • Embodiments of the present invention provide a data backup method, a recovery method, and a device in a big data system, which solve the problem that the existing backup method has low value density of backup data, long time in the backup recovery process, and high storage cost.
  • Embodiments of the present invention provide a data backup method in a big data system, including:
  • the embodiment of the invention further provides a data recovery method in a big data system, comprising:
  • the embodiment of the invention further provides a data backup device in a big data system, comprising:
  • a receiving module configured to receive a backup instruction, where the backup instruction includes at least identifier information of the backup object
  • a first obtaining module configured to acquire, by using the identifier information, configuration information of the backup object, where the configuration information is used to determine that backup data of the backup object is from an application or an external data storage system;
  • a second obtaining module configured to acquire the backup data from the application or the external data storage system by using the configuration information
  • a saving module configured to save the backup data.
  • the embodiment of the invention further provides a data recovery device in a big data system, comprising:
  • a first receiving module configured to receive a recovery instruction, where the recovery instruction includes at least identifier information of the recovery object
  • a third acquiring module configured to acquire backup data of the recovery object by using the identifier information
  • a recovery module for restoring the backup data to an application or to an external data storage system.
  • The embodiment of the present invention further provides a computer storage medium, where the computer storage medium stores one or more programs executable by a computer, and when the one or more programs are executed by the computer, the computer performs the data backup method in a big data system and the data recovery method in a big data system described above.
  • In the embodiments, a backup instruction is received, where the backup instruction includes at least the identifier information of the backup object; configuration information of the backup object is obtained by using the identifier information, where the configuration information is used to determine whether the backup data of the backup object comes from an application or an external data storage system; the backup data is obtained from the application or the external data storage system by using the configuration information; and the backup data is saved. In this way, the data of important backup objects at important moments can be backed up, which gives the backup data a higher value density, saves storage space, and shortens backup time.
  • FIG. 1 is a flowchart of a data backup method in a big data system according to an embodiment of the present invention
  • FIG. 2 is a schematic diagram of a data backup and recovery device in a big data system according to an embodiment of the present invention
  • FIG. 3 is a flowchart of a data recovery method in a big data system according to an embodiment of the present invention
  • FIG. 4 is a flowchart of another method for backing up data in a big data system according to an embodiment of the present invention.
  • FIG. 5 is a flowchart of another method for data recovery in a big data system according to an embodiment of the present invention.
  • FIG. 6 is a structural diagram of a data backup device in a big data system according to an embodiment of the present invention.
  • FIG. 7 is a structural diagram of a data recovery apparatus in a big data system according to an embodiment of the present invention.
  • FIG. 8 is a flowchart of a data query method in a big data system according to an embodiment of the present invention.
  • an embodiment of the present invention provides a data backup method in a big data system, including the following steps:
  • Step S101 Receive a backup instruction, where the backup instruction includes at least identifier information of the backup object.
  • Step S102 Obtain configuration information of the backup object by using the identifier information, where the configuration information is used to determine that the backup data of the backup object is from an application or an external data storage system.
  • Step S103 Acquire the backup data from the application or the external data storage system by using the configuration information.
  • Step S104 save the backup data.
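The four steps S101 to S104 can be sketched as a single function. This is a minimal illustration, not the patented implementation: the names `BackupConfig`, `run_backup`, `configs`, `fetchers`, and `store`, and the use of in-memory dictionaries in place of real modules, are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# All names here are hypothetical; the source does not define a concrete API.


@dataclass
class BackupConfig:
    object_id: int
    source: str  # "application" or "external"


def run_backup(
    object_id: int,
    configs: Dict[int, BackupConfig],
    fetchers: Dict[str, Callable[[int], List[dict]]],
    store: Dict[int, List[List[dict]]],
) -> List[dict]:
    # S101: the backup instruction carries at least the backup object's ID.
    # S102: use the ID to look up the object's configuration information.
    config = configs[object_id]
    # S103: fetch the backup data from whichever source the configuration names.
    data = fetchers[config.source](object_id)
    # S104: save the backup data.
    store.setdefault(object_id, []).append(data)
    return data


configs = {99: BackupConfig(object_id=99, source="application")}
fetchers = {
    "application": lambda oid: [{"object_id": oid, "balance": 120}],
    "external": lambda oid: [{"object_id": oid, "balance": 100}],
}
store: Dict[int, List[List[dict]]] = {}
result = run_backup(99, configs, fetchers, store)
```

Keeping the source choice in configuration rather than in the caller mirrors the text: the instruction only needs to name the backup object.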
  • In step S101, the command receiving unit of the command processing module first receives a backup instruction for a backup object. The backup instruction includes at least the index number (ID, Identification) of the backup object.
  • the backup instruction includes:
  • a backup command issued by the application, a backup command automatically generated by a scheduled task, or a backup command automatically generated after a trigger condition is reached.
  • the backup command may be issued by the application, or it may be generated automatically by the scheduled task, or it may be automatically generated by the system after the trigger condition is reached.
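The three possible origins of a backup command can be modeled as a small enumeration. The sketch below is only an illustration: the names `Origin`, `BackupInstruction`, and `instruction_for_event` are hypothetical, and using a "payment" event as the trigger condition is an assumption loosely based on the billing example given earlier in the text.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Origin(Enum):
    APPLICATION = "application"
    SCHEDULED_TASK = "scheduled_task"
    TRIGGER_CONDITION = "trigger_condition"


@dataclass(frozen=True)
class BackupInstruction:
    object_id: int  # the instruction carries at least the backup object's ID
    origin: Origin


def instruction_for_event(object_id: int, event: str) -> Optional[BackupInstruction]:
    # Hypothetical trigger condition: generate a backup command on payment.
    if event == "payment":
        return BackupInstruction(object_id, Origin.TRIGGER_CONDITION)
    return None


triggered = instruction_for_event(99, "payment")
skipped = instruction_for_event(99, "login")
```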
  • The backup directory unit of the data dictionary module queries the directory information of the backup object by using the ID of the backup object. For example, if the ID of a backup object is 99, the backup directory unit queries the directory information of the backup object with ID 99. The directory information includes the name of the backup object, its creation time, and which field of which table the data related to the backup object comes from.
  • The data source unit of the data dictionary module determines whether the data related to the backup object comes from the application or from an external data storage system.
  • The backup recovery policy unit of the data dictionary module determines whether the backup data of the backup object is obtained from the application or from an external data storage system.
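One way to picture the three kinds of configuration information held by the data dictionary module is as a record per backup object. The sketch below is an assumption-laden illustration: the class names, the field names, the table and field values, and the example in which the data source and the acquisition policy differ are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class DirectoryInfo:
    name: str
    created_at: str
    # (table, field) pairs describing where the related data lives.
    fields: Tuple[Tuple[str, str], ...]


@dataclass(frozen=True)
class ConfigInfo:
    directory: DirectoryInfo
    data_source: str  # where the related data comes from
    policy: str       # where the backup data should be acquired from


# Hypothetical data dictionary keyed by backup object ID.
DATA_DICTIONARY: Dict[int, ConfigInfo] = {
    99: ConfigInfo(
        directory=DirectoryInfo(
            name="user_bill",
            created_at="2017-01-01",
            fields=(("bills", "amount"), ("accounts", "balance")),
        ),
        data_source="external",
        policy="application",
    )
}


def lookup_config(object_id: int) -> ConfigInfo:
    # Corresponds to querying the data dictionary with the backup object's ID.
    return DATA_DICTIONARY[object_id]


cfg = lookup_config(99)
```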
  • backup data of the backup object is acquired from the application or the external data storage system.
  • If the application provides backup data, the data extraction unit of the data processing module extracts the backup data from the application. In this case, the backup data of the backup object is the data currently held in the application's memory; the application packages the data and returns it for backup, which avoids accessing the data storage system again and is more efficient. If the application does not provide backup data, the data extraction unit of the data processing module establishes a connection with the external data storage system and queries the corresponding tables to obtain the backup data.
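The choice between the two sources can be sketched as follows. This is a minimal example under stated assumptions: an in-memory SQLite database stands in for the external data storage system, a dictionary stands in for the application's memory, and the `bills` table and the function name `extract_backup_data` are hypothetical.

```python
import sqlite3
from typing import Dict, List


def extract_backup_data(
    object_id: int,
    app_memory: Dict[int, List[dict]],
    conn: sqlite3.Connection,
) -> List[dict]:
    # Preferred path: the application already holds the data in memory
    # and hands it back, so the storage system is not queried again.
    packaged = app_memory.get(object_id)
    if packaged is not None:
        return list(packaged)
    # Fallback: connect to the external store and query the table.
    rows = conn.execute(
        "SELECT object_id, amount FROM bills WHERE object_id = ? ORDER BY rowid",
        (object_id,),
    ).fetchall()
    return [{"object_id": r[0], "amount": r[1]} for r in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bills (object_id INTEGER, amount INTEGER)")
conn.executemany("INSERT INTO bills VALUES (?, ?)", [(99, 30), (7, 55)])
app_memory = {99: [{"object_id": 99, "amount": 30}]}

from_app = extract_backup_data(99, app_memory, conn)
from_store = extract_backup_data(7, app_memory, conn)
```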
  • the configuration information includes:
  • directory information, data source information, and policy information, where the directory information is used to determine the data related to the backup object, the data source information is used to determine whether the data related to the backup object comes from the application or the external data storage system, and the policy information is used to determine whether to acquire the backup data from the application or the external data storage system;
  • the backup data includes:
  • the backup data is a data set corresponding to the backup object at a current time
  • the obtaining the backup data from the application or the external data storage system by using the configuration information includes:
  • the directory information contains the name of the backup object, the creation time, and which field of the table the data associated with the backup object comes from.
  • The data related to the backup object that is included in the directory information is only descriptive data, not the data being backed up; the data obtained from the application or the external data storage system is the data that needs to be backed up.
  • the backup data is a collection of all the data related to the backup object at the current time, and only the data to be backed up is backed up. It is possible to back up data of important backup objects at important moments without having to back up the entire data.
  • The backup data of the backup object may also be obtained directly from the application or the external data storage system without receiving a backup instruction; the backup data is then normalized, and the normalized backup data is saved to the data storage unit of the data storage module.
  • the saving the backup data includes:
  • the backup data is normalized, and the normalized backup data is saved.
  • The backup data of the same backup object may come from multiple tables. The backup data is normalized according to the format defined by the data dictionary and then packaged into a data set, which makes the processing program general-purpose.
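Normalization of multi-table data into one data set might look like the sketch below. The record layout (`table`, `row`, `field`, `value`) and the envelope keys `object_id` and `captured_at` are assumptions chosen for the example; the source only states that the format is defined by the data dictionary.

```python
from typing import Dict, List


def normalize(object_id: int, captured_at: str, tables: Dict[str, List[dict]]) -> dict:
    # Flatten rows from every table into records with one uniform shape,
    # so that downstream code does not depend on any table's schema.
    records = []
    for table in sorted(tables):
        for row_index, row in enumerate(tables[table]):
            for field, value in sorted(row.items()):
                records.append(
                    {"table": table, "row": row_index, "field": field, "value": value}
                )
    # Package the records into a single data set for the backup object.
    return {"object_id": object_id, "captured_at": captured_at, "records": records}


tables = {"bills": [{"amount": 30}], "accounts": [{"balance": 120}]}
data_set = normalize(99, "2017-08-23T00:00:00", tables)
```

Recording the originating table and row index with each value keeps enough information to split the data set back into tables during recovery.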
  • The data saving unit of the data processing module establishes a connection with the data storage unit of the data storage module and saves the backup data to that data storage unit.
  • This embodiment proposes a data backup method in a big data system. The minimum unit of backup is the data related to one backup object at a given moment, so the amount of data is small. The data of important backup objects at important moments can therefore be backed up, which gives the backup data a higher value density and saves storage space and backup time.
  • an embodiment of the present invention provides a data recovery method in a big data system, including the following steps:
  • Step S301 Receive a recovery instruction, where the recovery instruction includes at least identification information of the recovery object.
  • Step S302 Acquire backup data of the recovery object by using the identifier information.
  • Step S303 Restore the backup data to an application or restore to an external data storage system.
  • In step S301, the command receiving unit of the command processing module receives a recovery instruction for a recovery object; the instruction may be issued by the application.
  • the identifier information includes:
  • the number of the recovery object and the recovery time.
  • The recovery instruction contains the number of the recovery object and the recovery time. The recovery time specifies the moment to restore to; for example, the current data may be restored to the data as of a moment last year.
  • In step S302, the data extraction unit of the data processing module establishes a connection with the data storage unit of the data storage module, finds the backup data by using the query conditions, and extracts the backup data from the data storage unit.
  • the backup data includes:
  • the backup data is a data set corresponding to the recovery object at a specified time
  • the recovering the backup data into an application includes:
  • ignoring means not changing the current data, that is, retaining the current data
  • appending means appending the backup data based on retaining the current data
  • replacing means deleting the current data and then writing the backup data into the application.
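The three restore behaviors (ignore, append, replace) can be sketched as one function over lists of records. The names `RestoreMode` and `restore_to_application` are hypothetical, and representing the application's current data as a list of dictionaries is an assumption made for the example.

```python
from enum import Enum
from typing import List


class RestoreMode(Enum):
    IGNORE = "ignore"
    APPEND = "append"
    REPLACE = "replace"


def restore_to_application(
    current: List[dict], backup: List[dict], mode: RestoreMode
) -> List[dict]:
    if mode is RestoreMode.IGNORE:
        # Keep the current data unchanged.
        return list(current)
    if mode is RestoreMode.APPEND:
        # Keep the current data and add the backup data after it.
        return list(current) + list(backup)
    if mode is RestoreMode.REPLACE:
        # Drop the current data and write the backup data instead.
        return list(backup)
    raise ValueError(f"unsupported restore mode: {mode!r}")


current = [{"item": "a"}]
backup = [{"item": "b"}]
ignored = restore_to_application(current, backup, RestoreMode.IGNORE)
appended = restore_to_application(current, backup, RestoreMode.APPEND)
replaced = restore_to_application(current, backup, RestoreMode.REPLACE)
```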
  • the recovering to the external data storage system includes:
  • the backup data is split, and the split data is respectively written into a corresponding table, and the table is stored in the external data storage system.
  • When the backup data is restored to the external data storage system, the backup data may come from different tables, so it is first split and then written into the corresponding tables, which are stored in the external data storage system.
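Splitting a data set and writing each part to its table can be sketched as follows. An in-memory SQLite database stands in for the external data storage system; the `KNOWN_TABLES` allowlist, the table and column names, and the record layout are all assumptions for illustration, with the allowlist playing the role the text assigns to the data dictionary.

```python
import sqlite3
from collections import defaultdict
from typing import Dict, List

# Hypothetical allowlist of tables and their columns.
KNOWN_TABLES = {
    "bills": ("object_id", "amount"),
    "accounts": ("object_id", "balance"),
}


def restore_to_storage(conn: sqlite3.Connection, backup: List[dict]) -> Dict[str, int]:
    # Split the data set by originating table.
    by_table = defaultdict(list)
    for record in backup:
        by_table[record["table"]].append(record["row"])
    # Write each group of rows into its corresponding table.
    written = {}
    for table, rows in by_table.items():
        columns = KNOWN_TABLES[table]  # raises KeyError for unknown tables
        placeholders = ", ".join("?" for _ in columns)
        sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
        conn.executemany(sql, [tuple(row[c] for c in columns) for row in rows])
        written[table] = len(rows)
    conn.commit()
    return written


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bills (object_id INTEGER, amount INTEGER)")
conn.execute("CREATE TABLE accounts (object_id INTEGER, balance INTEGER)")
backup = [
    {"table": "bills", "row": {"object_id": 99, "amount": 30}},
    {"table": "accounts", "row": {"object_id": 99, "balance": 120}},
]
written = restore_to_storage(conn, backup)
```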
  • the minimum unit of backup is the related data of the backup object at a certain moment, and the data amount is small, so the data recovery can be completed quickly and the efficiency is higher.
  • an embodiment of the present invention provides a flow of a data backup method in a big data system, including the following steps:
  • Step S401 Receive a backup instruction for a backup object.
  • Step S402 parsing the backup object.
  • Step S403 determining whether the application provides backup data.
  • Step S404 If the application provides backup data, the backup data is obtained from the application; if the application does not provide the backup data, the backup data is obtained from the external data storage system.
  • Step S405 normalizing the backup data.
  • Step S406 saving backup data.
  • step S401 the command receiving unit of the command processing module first receives a backup instruction for a backup object, and the backup instruction includes at least the ID of the backup object.
  • the backup command may be issued by the application, or it may be generated automatically by the scheduled task, or it may be automatically generated by the system after the trigger condition is reached.
  • The backup directory unit of the data dictionary module queries the directory information of the backup object by using the ID of the backup object. For example, if the ID of a backup object is 99, the backup directory unit queries the directory information of the backup object with ID 99. The directory information includes the name of the backup object, its creation time, and which field of which table the data related to the backup object comes from.
  • The data source unit of the data dictionary module determines whether the data related to the backup object comes from the application or from an external data storage system.
  • The backup recovery policy unit of the data dictionary module determines whether the backup data of the backup object is obtained from the application or from an external data storage system.
  • step S403 the backup recovery policy unit of the data dictionary module determines from where the backup data of the backup object is obtained, whether it is acquired from the application or from the external data storage system.
  • In step S404, if the application provides backup data, the data extraction unit of the data processing module extracts the backup data from the application.
  • In this case, the backup data of the backup object is the data currently held in the application's memory; the application packages the data and returns it for backup, which avoids accessing the data storage system again and is more efficient. If the application does not provide backup data, the data extraction unit of the data processing module establishes a connection with the external data storage system and queries the corresponding tables to obtain the backup data.
  • the backup data of the same backup object may come from a plurality of tables, and the backup data is normalized according to the format defined by the data dictionary, and then packaged into a data set to make the processing program have universality.
  • In step S406, the data saving unit of the data processing module establishes a connection with the data storage unit of the data storage module and saves the backup data to that data storage unit.
  • the backup data is a collection of all data related to the backup object at the current time, and only the data to be backed up is backed up. It is possible to back up data of important backup objects at important moments without having to back up the entire data.
  • This embodiment proposes a data backup method in a big data system. The minimum unit of backup is the data related to one backup object at a given moment, so the amount of data is small. The data of important backup objects at important moments can therefore be backed up, which gives the backup data a higher value density and saves storage space and backup time.
  • an embodiment of the present invention provides a flow of a data recovery method in a big data system, including the following steps:
  • Step S501 Receive a recovery instruction for a certain recovery object.
  • Step S502 extracting backup data.
  • Step S503 restoring the backup data to the application or restoring the backup data to the external data storage system.
  • In step S501, the command receiving unit of the command processing module receives a recovery instruction for a recovery object; the instruction may be issued by the application.
  • The recovery instruction contains the number of the recovery object and the recovery time. The recovery time specifies the moment to restore to; for example, the current data may be restored to the data as of a moment last year.
  • In step S502, the data extraction unit of the data processing module establishes a connection with the data storage unit of the data storage module, finds the backup data by using the query conditions, and extracts the backup data from the data storage unit.
  • step S503 if the backup data is restored to the application, there are three cases: ignore, append, and replace.
  • ignoring means not changing the current data, that is, retaining the current data
  • appending means appending the backup data based on retaining the current data
  • replacing means deleting the current data and then writing the backup data into the application.
  • When the backup data is restored to the external data storage system, the backup data may come from different tables, so it is first split and then written into the corresponding tables, which are stored in the external data storage system.
  • the minimum unit of backup is the related data of the backup object at a certain moment, and the data amount is small, so the data recovery can be completed quickly and the efficiency is higher.
  • This device is the data backup and recovery device in the big data system shown in FIG. 2.
  • the embodiment of the present invention provides a structure of a data backup device in a big data system, including the following modules:
  • the receiving module 601 is configured to receive a backup instruction, where the backup instruction includes at least identifier information of the backup object;
  • the first obtaining module 602 is configured to acquire, by using the identifier information, configuration information of the backup object, where the configuration information is used to determine that the backup data of the backup object is from an application or an external data storage system;
  • a second obtaining module 603, configured to acquire the backup data from the application or the external data storage system by using the configuration information
  • the saving module 604 is configured to save the backup data.
  • the backup instruction includes:
  • a backup command issued by the application, a backup command automatically generated by a scheduled task, or a backup command automatically generated after a trigger condition is reached.
  • the configuration information includes:
  • directory information, data source information, and policy information, where the directory information is used to determine the data related to the backup object, the data source information is used to determine whether the data related to the backup object comes from the application or the external data storage system, and the policy information is used to determine whether to acquire the backup data from the application or the external data storage system;
  • the backup data includes:
  • the backup data is a data set corresponding to the backup object at a current time
  • the second obtaining module 603 is configured to use the directory information to determine the data related to the backup object, use the data source information to determine whether the data related to the backup object comes from the application or the external data storage system, and use the policy information to determine whether to obtain the backup data from the application or the external data storage system.
  • the saving module 604 is configured to perform normalization processing on the backup data, and save the normalized backup data.
  • The data backup device in the big data system may be the data backup device of the embodiments shown in FIG. 1 and FIG. 4. Any implementation of the data backup device in those embodiments can be realized by the data backup device in the big data system of this embodiment, and details are not described herein again.
  • This embodiment proposes a data backup device in a big data system, on which the data backup method in the big data system can be implemented. The minimum unit of backup is the data related to one backup object at a given moment, so the amount of data is small. The data of important backup objects at important moments can therefore be backed up, which gives the backup data a higher value density, saves storage space, and shortens backup time.
  • the embodiment of the present invention provides a structure of a data recovery apparatus in a big data system, including the following modules:
  • the first receiving module 701 is configured to receive a recovery instruction, where the recovery instruction includes at least identifier information of the recovery object;
  • the third obtaining module 702 is configured to obtain backup data of the recovery object by using the identifier information
  • the recovery module 703 is configured to restore the backup data to an application or to the external data storage system.
  • the identifier information includes:
  • the number of the recovery object and the recovery time.
  • the backup data includes:
  • the backup data is a data set corresponding to the recovery object at a specified time
  • the recovery module 703 is configured to replace the current data of the recovery object in the application with the backup data, append the backup data while retaining the current data of the recovery object in the application, or retain the current data of the recovery object in the application.
  • the recovery module 703 is configured to split the backup data, and then write the split data into a corresponding table, where the table is stored in the external data storage system.
  • The data recovery device in the above big data system may be the data recovery device of the embodiments shown in FIG. 3 and FIG. 5. Any implementation of the data recovery device in those embodiments can be realized by the data recovery device in the big data system of this embodiment, and details are not described herein again.
  • a data recovery device in a big data system is proposed, and a data recovery method in a big data system can be implemented on the data recovery device.
  • the smallest unit of backup is the related data of the backup object at a certain moment, and the amount of data is small, so the data recovery can be completed quickly and the efficiency is higher.
  • the embodiment of the present invention provides a flow of a data query method in a big data system, including the following steps:
  • Step S801 Receive a query instruction for a backup object.
  • Step S802 extracting backup data.
  • step S803 the backup data is fed back.
  • In step S801, the command receiving unit of the command processing module receives a query instruction for a backup object, where the query instruction includes the backup object ID and a time range for the backup data.
  • In step S802, the data extraction unit of the data processing module establishes a connection with the data storage unit of the data storage module, finds the backup data by using the query conditions, and extracts the backup data from the data storage unit.
  • step S803 the extracted backup data is returned to the client.
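The query flow S801 to S803 amounts to filtering stored backups by object ID and time range. The sketch below assumes backups are kept as `(object_id, timestamp, data)` tuples in a list; that layout and the name `query_backups` are hypothetical, and treating both ends of the time range as inclusive is also an assumption.

```python
from datetime import datetime
from typing import List, Tuple

BackupEntry = Tuple[int, datetime, dict]


def query_backups(
    store: List[BackupEntry], object_id: int, start: datetime, end: datetime
) -> List[dict]:
    # S801: the query instruction carries the object ID and a time range.
    # S802: find matching backups (both ends of the range are inclusive).
    matches = [
        (ts, data)
        for oid, ts, data in store
        if oid == object_id and start <= ts <= end
    ]
    matches.sort(key=lambda m: m[0])
    # S803: return the extracted backup data to the caller, oldest first.
    return [data for _, data in matches]


store = [
    (99, datetime(2017, 1, 31), {"balance": 100}),
    (99, datetime(2017, 2, 28), {"balance": 80}),
    (7, datetime(2017, 2, 28), {"balance": 10}),
]
found = query_backups(store, 99, datetime(2017, 2, 1), datetime(2017, 3, 1))
```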
  • This embodiment proposes a data query method in a big data system. The minimum unit of backup is the data related to one backup object at a given moment, so the amount of data is small, and all backup data related to the backup object at that moment is queried together. Data queries can therefore be completed quickly and efficiently.
  • the storage medium is, for example, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Techniques For Improving Reliability Of Storages (AREA)

Abstract

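A data backup method, a recovery method, and apparatuses in a big data system. The backup method includes: receiving a backup instruction (S101), the backup instruction including at least identifier information of a backup object; obtaining configuration information of the backup object by using the identifier information (S102), the configuration information being used to determine whether the backup data of the backup object comes from an application or an external data storage system; obtaining the backup data from the application or the external data storage system by using the configuration information (S103); and saving the backup data (S104). The method can increase the value density of backup data, save storage space, and shorten backup time.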

Description

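Data backup method, recovery method and apparatus in a big data system, and computer storage medium
Technical Field
The present disclosure relates to the field of communications technologies, and in particular to a data backup method, a recovery method and apparatuses in a big data system, and a computer storage medium.
Background
In a communication system, the database is the center of data storage and management. After data of all kinds is collected, it is classified, cleaned, verified, and normalized, and then flows continuously into the database. During day-to-day operation, some periodic data must be kept for a long time. For example, user bills serve as important vouchers, and the statistical report data for the current month must be kept as a basis for decision-making. Backing up and recovering such data is therefore an important basic task.
Database systems are divided into relational and non-relational databases. Relational databases rest on a rigorous mathematical foundation, and database vendors usually provide complete backup and recovery solutions. For example, the backup and recovery technology of the Oracle database has the following characteristics:
1. The database system maintains an internal clock and automatically generates system change numbers. These numbers are globally unique, are allocated sequentially, and increase automatically as the database runs.
2. Data backup: the current system change number is obtained first. Based on this number, the system saves the data of the entire database or of a table to a backup file as a snapshot. Changes that occur in the system during the backup are called incremental data and are not written to the backup file.
3. Data recovery: the backup file is opened, and its content is read and written to the database. Content that does not exist in the database is created; content that already exists is overwritten.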
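The three characteristics above can be illustrated with a toy model. The sketch below is not Oracle's implementation and is not part of the original disclosure; the class `ToyDatabase` and the function `restore` are hypothetical names chosen only to show how a change number fixes the content of a snapshot and how recovery creates or overwrites entries.

```python
import itertools


class ToyDatabase:
    """Toy model of change numbers; it does not reproduce any real engine."""

    def __init__(self):
        self._numbers = itertools.count(1)  # sequential, unique change numbers
        self.current_scn = 0
        self.log = []  # (scn, key, value) in commit order

    def write(self, key, value):
        self.current_scn = next(self._numbers)
        self.log.append((self.current_scn, key, value))

    def snapshot(self, as_of_scn):
        """Rebuild the state as of `as_of_scn`; later changes are left out."""
        state = {}
        for scn, key, value in self.log:
            if scn <= as_of_scn:
                state[key] = value
        return state


def restore(target, backup):
    """Missing keys are created, existing keys are overwritten."""
    target.update(backup)
    return target


db = ToyDatabase()
db.write("a", 1)
db.write("b", 2)
backup_scn = db.current_scn      # change number taken when the backup starts
db.write("a", 10)                # incremental change made during the backup
backup = db.snapshot(backup_scn)
```

The write made after `backup_scn` was taken does not appear in `backup`, which mirrors the statement that incremental data is not written to the backup file.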
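Data with a complex structure and a large volume is collectively called big data. Traditional relational database systems find it increasingly difficult to process such data, so non-relational databases emerged for big data processing. Most non-relational databases are open source projects that still lack a sound mathematical foundation and have no unified industry standard. Because of the characteristics of big data, these non-relational databases do not provide an effective database backup solution.
In the course of implementing the present invention, the inventor found that at least the following problems need to be solved for data backup and recovery in a big data system:
1. Large volume of backup data
In a big data system, the overall data volume is usually very large and keeps growing. Backing up all of the data incurs high time and storage costs, so a full backup can hardly meet practical needs. What is needed is a way of partitioning the data so that each backup does not cover the whole data set yet still serves the purpose of a backup and guarantees that the data is valid.
2. Time-consuming backup and recovery
There is currently no effective backup and recovery method for big data systems. Can the method used by relational databases be borrowed? Because non-relational databases often do not satisfy data consistency requirements, the relational approach cannot be used directly; a backup and recovery method suited to this kind of data is needed. If the relational approach were used directly, each backup or recovery would take a long time, the time window for normal use would shrink, and system availability would be low.
3. Low value density of backup data
Although the overall data volume in a big data system is usually very large, different parts of the data differ in value. Some objects are important and worth backing up; some moments are important and worth backing up. If all data at all moments is backed up without distinction, the value density of the backup data is low. The data of important objects at critical moments matters most, has the highest value density, and most needs to be backed up. Conversely, the data of unimportant objects at unimportant moments need not be backed up.
In view of this, the inventor found through in-depth study that, because the database in a big data system carries an application system, a backup and recovery scheme that considers the database and the application system together is more practical. Specifically, in a communication system, the data of key individuals at certain critical business stages is very important and needs to be backed up. For example, in a billing system, when a user tops up an account, the total balance changes, and the user often needs a summary of the current bill as proof of payment. Later, as new charges are incurred, the account balance will change. A data backup is therefore needed each time the user makes a payment. Borrowing from relational database backup, one can first find all tables related to the current funds of the account, then extract the data related to the account from those tables and back it up. Compared with backing up the whole database or whole tables, this method takes less time, yields a smaller backup result set, and is flexible to operate. To recover the data, the backup result set of the account at a given moment is first retrieved and then imported into the corresponding tables. Effective data backup and recovery in a big data system is thus achieved.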
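The per-account extraction described above can be sketched as follows. This is an illustrative Python example, not part of the original disclosure: the tables `balance` and `payment`, their columns, and the function `backup_account` are assumptions, and an in-memory SQLite database stands in for the data store.

```python
import sqlite3

# Hypothetical schema: two tables that both hold rows for an account.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE balance (account_id INTEGER, amount REAL);
    CREATE TABLE payment (account_id INTEGER, paid REAL);
    INSERT INTO balance VALUES (99, 120.0), (7, 15.5);
    INSERT INTO payment VALUES (99, 50.0), (99, 70.0), (7, 15.5);
""")


def backup_account(conn, account_id, tables):
    """Collect only the rows that belong to one account, table by table."""
    result = {}
    for table in tables:  # table names are assumed to come from trusted configuration
        cur = conn.execute(
            f"SELECT * FROM {table} WHERE account_id = ? ORDER BY rowid",
            (account_id,),
        )
        result[table] = cur.fetchall()
    return result


snapshot = backup_account(conn, 99, ["balance", "payment"])
```

Only the three rows of account 99 end up in `snapshot`; the rows of other accounts are never read into the backup, which is why the result set stays small.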
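Summary
Embodiments of the present invention provide a data backup method, a recovery method, and apparatuses in a big data system, which address the low value density of backup data, the time-consuming backup and recovery process, and the high storage cost of existing backup methods.
An embodiment of the present invention provides a data backup method in a big data system, including:
receiving a backup instruction, the backup instruction including at least identifier information of a backup object;
obtaining configuration information of the backup object by using the identifier information, the configuration information being used to determine whether the backup data of the backup object comes from an application or an external data storage system;
obtaining the backup data from the application or the external data storage system by using the configuration information; and
saving the backup data.
An embodiment of the present invention further provides a data recovery method in a big data system, including:
receiving a recovery instruction, the recovery instruction including at least identifier information of a recovery object;
obtaining backup data of the recovery object by using the identifier information; and
recovering the backup data into an application or into an external data storage system.
An embodiment of the present invention further provides a data backup apparatus in a big data system, including:
a receiving module, configured to receive a backup instruction, the backup instruction including at least identifier information of a backup object;
a first obtaining module, configured to obtain configuration information of the backup object by using the identifier information, the configuration information being used to determine whether the backup data of the backup object comes from an application or an external data storage system;
a second obtaining module, configured to obtain the backup data from the application or the external data storage system by using the configuration information; and
a saving module, configured to save the backup data.
An embodiment of the present invention further provides a data recovery apparatus in a big data system, including:
a first receiving module, configured to receive a recovery instruction, the recovery instruction including at least identifier information of a recovery object;
a third obtaining module, configured to obtain backup data of the recovery object by using the identifier information; and
a recovery module, configured to recover the backup data into an application or into an external data storage system.
An embodiment of the present invention further provides a computer storage medium storing one or more computer-executable programs which, when executed by the computer, cause the computer to perform the data backup method and the data recovery method in a big data system provided above.
One of the above technical solutions has the following advantages or beneficial effects:
In the embodiments of the present invention, a backup instruction is received, the backup instruction including at least identifier information of a backup object; configuration information of the backup object is obtained by using the identifier information, the configuration information being used to determine whether the backup data of the backup object comes from an application or an external data storage system; the backup data is obtained from the application or the external data storage system by using the configuration information; and the backup data is saved. The data of an important backup object at an important moment can thus be backed up, which gives the backup data a higher value density, saves storage space, and shortens backup time.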
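Brief Description of the Drawings
FIG. 1 is a flowchart of a data backup method in a big data system according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a data backup and recovery apparatus in a big data system according to an embodiment of the present invention;
FIG. 3 is a flowchart of a data recovery method in a big data system according to an embodiment of the present invention;
FIG. 4 is a flowchart of another data backup method in a big data system according to an embodiment of the present invention;
FIG. 5 is a flowchart of another data recovery method in a big data system according to an embodiment of the present invention;
FIG. 6 is a structural diagram of a data backup apparatus in a big data system according to an embodiment of the present invention;
FIG. 7 is a structural diagram of a data recovery apparatus in a big data system according to an embodiment of the present invention;
FIG. 8 is a flowchart of a data query method in a big data system according to an embodiment of the present invention.
Detailed Description
To make the technical problems to be solved, the technical solutions, and the advantages of the present invention clearer, a detailed description is given below with reference to the accompanying drawings and specific embodiments.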
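As shown in FIG. 1, an embodiment of the present invention provides a data backup method in a big data system, including the following steps:
Step S101: receive a backup instruction, the backup instruction including at least identifier information of a backup object.
Step S102: obtain configuration information of the backup object by using the identifier information, the configuration information being used to determine whether the backup data of the backup object comes from an application or an external data storage system.
Step S103: obtain the backup data from the application or the external data storage system by using the configuration information.
Step S104: save the backup data.
In step S101, as shown in FIG. 2, the command receiving unit of the command processing module first receives a backup instruction for a backup object; the backup instruction contains at least the index number (ID, Identification) of the backup object.
Optionally, the backup instruction includes:
a backup instruction issued by an application, a backup instruction automatically generated by a scheduled task, or a backup instruction automatically generated after a trigger condition is reached.
That is, the backup instruction may be issued by an application, generated automatically by a scheduled task, or generated automatically by the system once a trigger condition is reached.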
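The three origins of a backup instruction can be modelled as follows. This is a minimal Python sketch, not part of the original disclosure; `Origin`, `BackupInstruction`, `nightly_job`, `on_balance_change`, and the threshold of 100.0 are all hypothetical and serve only to show that every instruction carries the ID of the backup object regardless of who produced it.

```python
from dataclasses import dataclass
from enum import Enum


class Origin(Enum):
    APPLICATION = "application"
    SCHEDULED_TASK = "scheduled_task"
    TRIGGER = "trigger"


@dataclass(frozen=True)
class BackupInstruction:
    object_id: int  # ID of the backup object (always present)
    origin: Origin  # who produced the instruction


def nightly_job(object_ids):
    """Hypothetical scheduled task: one instruction per configured object."""
    return [BackupInstruction(i, Origin.SCHEDULED_TASK) for i in object_ids]


def on_balance_change(object_id, old, new, threshold=100.0):
    """Hypothetical trigger: fire when the balance moves by at least `threshold`."""
    if abs(new - old) >= threshold:
        return BackupInstruction(object_id, Origin.TRIGGER)
    return None


queue = [BackupInstruction(99, Origin.APPLICATION)]  # issued by an application
queue += nightly_job([99, 100])
triggered = on_balance_change(99, old=20.0, new=150.0)
if triggered is not None:
    queue.append(triggered)
```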
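In step S102, the ID of the backup object is used to query the backup directory unit of the data dictionary module for the directory information of the backup object. For example, if the ID of a backup object is 99, the backup directory unit of the data dictionary module is queried for the directory information of the backup object with ID 99. The directory information includes the name of the backup object, its creation time, and which field of which table the data related to the backup object comes from. The data source unit of the data dictionary module determines where the data related to the backup object originates: the application or the external data storage system. The backup and recovery policy unit of the data dictionary module decides where the backup data of the backup object is obtained from: the application or the external data storage system.
In step S103, the backup data of the backup object is obtained from the application or the external data storage system, using the directory information, data source, and policy looked up in the data dictionary module as described for step S102. If the application provides the backup data, the data extraction unit of the data processing module extracts the backup data from the application. In this case, when the application issues the backup instruction, the backup data of the backup object is the data currently in the application's memory; the application packs this data and sends it out for backup, which avoids another access to the data storage system and is more efficient. If the application does not provide the backup data, the data extraction unit of the data processing module establishes a connection with the external data storage system and queries the corresponding tables to obtain the backup data.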
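Optionally, the configuration information includes:
directory information, data source information, and policy information, where the directory information is used to determine the data related to the backup object, the data source information is used to determine whether the data related to the backup object comes from the application or the external data storage system, and the policy information is used to decide whether the backup data is obtained from the application or the external data storage system;
the backup data includes:
the backup data is the data set corresponding to the backup object at the current time;
obtaining the backup data from the application or the external data storage system by using the configuration information includes:
using the directory information to determine the data related to the backup object, using the data source information to determine whether the data related to the backup object comes from the application or the external data storage system, and using the policy information to decide whether the backup data is obtained from the application or the external data storage system.
The directory information contains the name of the backup object, its creation time, and which field of which table the data related to the backup object comes from. The data source unit of the data dictionary module determines where the data related to the backup object originates: the application or the external data storage system. The backup and recovery policy unit of the data dictionary module decides where the backup data of the backup object is obtained from: the application or the external data storage system.
For example, if the ID of a backup object is 99, the backup directory unit of the data dictionary module is queried for the directory information of the backup object with ID 99, and the data source unit and the backup and recovery policy unit make their determinations for that object as described above.
Note that the data related to the backup object contained in the directory information is only descriptive data, not the data being backed up; the data obtained from the application or the external data storage system is what actually needs to be backed up.
The backup data is the set of all data related to the backup object at the current time; only the data that is meant to be backed up is backed up. The data of important backup objects at important moments can be backed up without backing up the entire data set.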
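One possible way to picture the role of the three kinds of configuration information is sketched below in Python. It is not part of the original disclosure: the structure `ObjectConfig`, the dictionary `DATA_DICTIONARY`, and the sample data are assumptions. The directory information is modelled as descriptive (table, field) pairs, and the policy field selects where the real backup data is read from.

```python
from dataclasses import dataclass


@dataclass
class ObjectConfig:
    name: str          # directory information: name of the backup object
    fields: list       # directory information: (table, field) pairs, descriptive only
    source: str        # data source information: "application" or "external"
    fetch_from: str    # policy information: where the backup data is read from


# Hypothetical data dictionary keyed by the ID of the backup object.
DATA_DICTIONARY = {
    99: ObjectConfig("account-99", [("balance", "amount"), ("payment", "paid")],
                     source="application", fetch_from="application"),
    100: ObjectConfig("account-100", [("balance", "amount"), ("payment", "paid")],
                      source="external", fetch_from="external"),
}


def fetch_backup_data(object_id, app_memory, external_tables):
    cfg = DATA_DICTIONARY[object_id]
    if cfg.fetch_from == "application":
        # The application hands over what it already holds in memory.
        return dict(app_memory[object_id])
    # Otherwise read each field named in the directory information.
    return {
        f"{table}.{field}": external_tables[table][object_id][field]
        for table, field in cfg.fields
    }


app_memory = {99: {"balance.amount": 120.0, "payment.paid": 50.0}}
external_tables = {
    "balance": {100: {"amount": 8.0}},
    "payment": {100: {"paid": 2.0}},
}
from_app = fetch_backup_data(99, app_memory, external_tables)
from_store = fetch_backup_data(100, app_memory, external_tables)
```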
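Alternatively, the backup data of the backup object may be obtained directly from the application or the external data storage system without receiving a backup instruction; the backup data is then normalized, and the normalized backup data is saved to the data storage unit of the data storage module.
Optionally, saving the backup data includes:
normalizing the backup data and saving the normalized backup data.
The backup data of a single backup object may come from multiple tables. After being normalized according to the format defined by the data dictionary, this backup data is packed into one data set, which makes the processing program generic. The data saving unit of the data processing module saves the backup data: it establishes a connection with the data storage unit of the data storage module and saves the backup data to that unit.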
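The source does not specify the normalized format. As an illustration only, the Python sketch below assumes a format in which every row is tagged with its table of origin and the whole set is serialized as JSON; the function `normalize` and the field names `object_id`, `taken_at`, and `records` are hypothetical.

```python
import json


def normalize(object_id, taken_at, rows_by_table):
    """Pack rows from several tables into one self-describing data set."""
    records = []
    for table in sorted(rows_by_table):  # stable order, independent of input order
        for row in rows_by_table[table]:
            records.append({"table": table, "row": dict(row)})
    return {"object_id": object_id, "taken_at": taken_at, "records": records}


dataset = normalize(99, "2017-08-23T10:00:00", {
    "payment": [{"account_id": 99, "paid": 50.0}, {"account_id": 99, "paid": 70.0}],
    "balance": [{"account_id": 99, "amount": 120.0}],
})
packed = json.dumps(dataset, sort_keys=True)  # one blob per object and moment
```

Because each record names its own table, a single generic routine can save, query, or later split the data set without knowing the tables in advance.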
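This embodiment provides a data backup method in a big data system. The smallest unit of backup is the data related to a backup object at a given moment, so the data volume is small. The data of important backup objects at important moments can therefore be backed up, which gives the backup data a higher value density, saves storage space, and shortens backup time.
As shown in FIG. 3, an embodiment of the present invention provides a data recovery method in a big data system, including the following steps:
Step S301: receive a recovery instruction, the recovery instruction including at least identifier information of a recovery object.
Step S302: obtain the backup data of the recovery object by using the identifier information.
Step S303: recover the backup data into an application or into an external data storage system.
In step S301, as shown in FIG. 2, the command receiving unit of the command processing module receives a recovery instruction for a recovery object; the recovery instruction may be issued by an application.
Optionally, the identifier information includes:
the number of the recovery object and the recovery time.
The recovery instruction contains the number of the recovery object and the recovery time. As an example of a recovery time, the current data may be restored to the data as of this moment on the same day last year.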
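The source does not say how a recovery time is matched to a stored backup. The Python sketch below assumes one plausible rule, namely that the most recent backup taken at or before the requested time is used; this rule, the `STORE` layout, and the function `find_backup` are assumptions and not part of the original disclosure.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical store: object number -> list of (taken_at, data), sorted by time.
STORE = {
    99: [
        (datetime(2016, 8, 23, 9, 0), {"balance": 80.0}),
        (datetime(2017, 2, 1, 9, 0), {"balance": 95.0}),
        (datetime(2017, 8, 23, 9, 0), {"balance": 120.0}),
    ]
}


def find_backup(object_no, recovery_time):
    """Return the latest backup taken at or before `recovery_time`, or None."""
    backups = STORE.get(object_no, [])
    times = [taken_at for taken_at, _ in backups]
    index = bisect_right(times, recovery_time)
    return backups[index - 1][1] if index else None


last_year = find_backup(99, datetime(2016, 8, 23, 12, 0))
```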
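In step S302, the data extraction unit of the data processing module establishes a connection with the data storage unit of the data storage module, locates the backup data by means of the query conditions, and extracts the backup data from the data storage unit of the data storage module.
Optionally, the backup data includes:
the backup data is the data set corresponding to the recovery object at the specified time;
recovering the backup data into an application includes:
replacing the current data of the recovery object in the application with the backup data, appending the backup data to the current data of the recovery object in the application, or keeping the current data of the recovery object in the application.
If the backup data is recovered into the application, there are three cases: ignore, append, and replace. Ignore means the current data is not changed, that is, the current data is kept; append means the backup data is added while the current data is kept; replace means the current data is deleted and the backup data is then written into the application.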
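The three cases can be written as a single function. This is a minimal Python sketch, not part of the original disclosure; the function name `restore_to_application` and the representation of the data as lists of records are assumptions.

```python
def restore_to_application(current, backup, mode):
    """Apply one of the three restore modes and return the resulting data."""
    if mode == "ignore":
        return list(current)                  # keep the current data unchanged
    if mode == "append":
        return list(current) + list(backup)   # keep current data, add the backup
    if mode == "replace":
        return list(backup)                   # drop current data, write the backup
    raise ValueError(f"unknown restore mode: {mode!r}")


current = [{"bill": "2017-08"}]
backup = [{"bill": "2016-08"}]
appended = restore_to_application(current, backup, "append")
```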
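Optionally, recovering into the external data storage system includes:
splitting the backup data and then writing the split data into the corresponding tables, the tables being stored in the external data storage system.
If the backup data is recovered into the external data storage system, the backup data may have come from different tables, so it is first split and then imported into the corresponding tables, all of which are stored in the external data storage system.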
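Splitting and re-importing can be sketched as below. This Python example is not part of the original disclosure: it assumes that each backup record carries the name of its table (a hypothetical format), uses an in-memory SQLite database in place of the external data storage system, and simply inserts rows without handling rows that already exist.

```python
import sqlite3
from collections import defaultdict


def split_by_table(records):
    """Group the records of one backup data set by their table of origin."""
    groups = defaultdict(list)
    for record in records:
        groups[record["table"]].append(record["row"])
    return dict(groups)


def restore_to_tables(conn, records):
    """Write each group of rows back into its own table."""
    for table, rows in split_by_table(records).items():
        for row in rows:  # table and column names are assumed to be trusted
            columns = ", ".join(row)
            marks = ", ".join("?" for _ in row)
            conn.execute(
                f"INSERT INTO {table} ({columns}) VALUES ({marks})",
                tuple(row.values()),
            )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE balance (account_id INTEGER, amount REAL);
    CREATE TABLE payment (account_id INTEGER, paid REAL);
""")
records = [
    {"table": "balance", "row": {"account_id": 99, "amount": 120.0}},
    {"table": "payment", "row": {"account_id": 99, "paid": 50.0}},
    {"table": "payment", "row": {"account_id": 99, "paid": 70.0}},
]
restore_to_tables(conn, records)
```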
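This embodiment provides a data recovery method in a big data system. The smallest unit of backup is the data related to a backup object at a given moment, so the data volume is small; data recovery can therefore be completed quickly and more efficiently.
As shown in FIG. 4, an embodiment of the present invention provides the flow of a data backup method in a big data system, including the following steps:
Step S401: receive a backup instruction for a backup object.
Step S402: parse the backup object.
Step S403: determine whether the application provides the backup data.
Step S404: if the application provides the backup data, obtain the backup data from the application; if the application does not provide the backup data, obtain the backup data from the external data storage system.
Step S405: normalize the backup data.
Step S406: save the backup data.
In step S401, the command receiving unit of the command processing module first receives a backup instruction for a backup object; the backup instruction contains at least the ID of the backup object. The backup instruction may be issued by an application, generated automatically by a scheduled task, or generated automatically by the system once a trigger condition is reached.
In step S402, the ID of the backup object is used to query the backup directory unit of the data dictionary module for the directory information of the backup object. For example, if the ID of a backup object is 99, the backup directory unit of the data dictionary module is queried for the directory information of the backup object with ID 99. The directory information includes the name of the backup object, its creation time, and which field of which table the data related to the backup object comes from. The data source unit of the data dictionary module determines where the data related to the backup object originates: the application or the external data storage system. The backup and recovery policy unit of the data dictionary module decides where the backup data of the backup object is obtained from: the application or the external data storage system.
In step S403, the backup and recovery policy unit of the data dictionary module decides where the backup data of the backup object is obtained from: the application or the external data storage system.
In step S404, if the application provides the backup data, the data extraction unit of the data processing module extracts the backup data from the application. In this case, when the application issues the backup instruction, the backup data of the backup object is the data currently in the application's memory; the application packs this data and sends it out for backup, which avoids another access to the data storage system and is more efficient. If the application does not provide the backup data, the data extraction unit of the data processing module establishes a connection with the external data storage system and queries the corresponding tables to obtain the backup data.
In step S405, the backup data of a single backup object may come from multiple tables. After being normalized according to the format defined by the data dictionary, this backup data is packed into one data set, which makes the processing program generic.
In step S406, the data saving unit of the data processing module saves the backup data: it establishes a connection with the data storage unit of the data storage module and saves the backup data to that unit.
Note that the backup data is the set of all data related to the backup object at the current time; only the data that is meant to be backed up is backed up. The data of important backup objects at important moments can be backed up without backing up the entire data set.
This embodiment provides a data backup method in a big data system. The smallest unit of backup is the data related to a backup object at a given moment, so the data volume is small. The data of important backup objects at important moments can therefore be backed up, which gives the backup data a higher value density, saves storage space, and shortens backup time.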
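Steps S401 to S406 can be strung together in a few lines. The Python sketch below is not part of the original disclosure; the function `run_backup`, the dictionary layout, and the in-memory stand-ins for the application, the external data storage system, and the data storage unit are assumptions made to show the control flow only.

```python
def run_backup(instruction, dictionary, app_payload, external, storage):
    object_id = instruction["object_id"]                    # S401: instruction carries the ID
    cfg = dictionary[object_id]                             # S402: parse the backup object
    if cfg["app_provides"] and app_payload is not None:     # S403: does the application provide data?
        raw = app_payload                                   # S404: take it from the application
    else:
        raw = {t: external[t].get(object_id, [])            # S404: query the external tables
               for t in cfg["tables"]}
    dataset = {                                             # S405: normalize into one data set
        "object_id": object_id,
        "taken_at": instruction["time"],
        "tables": {t: list(raw.get(t, [])) for t in cfg["tables"]},
    }
    storage.append(dataset)                                 # S406: save
    return dataset


dictionary = {
    99: {"tables": ["balance", "payment"], "app_provides": True},
    100: {"tables": ["balance"], "app_provides": False},
}
external = {"balance": {100: [{"amount": 8.0}]}, "payment": {}}
storage = []

run_backup({"object_id": 99, "time": "t1"}, dictionary,
           {"balance": [{"amount": 120.0}], "payment": [{"paid": 50.0}]},
           external, storage)
run_backup({"object_id": 100, "time": "t2"}, dictionary, None, external, storage)
```

The first call never touches `external`, which corresponds to the efficiency gain described for the case where the application packs its in-memory data itself.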
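As shown in FIG. 5, an embodiment of the present invention provides the flow of a data recovery method in a big data system, including the following steps:
Step S501: receive a recovery instruction for a recovery object.
Step S502: extract the backup data.
Step S503: recover the backup data into the application or into the external data storage system.
In step S501, the command receiving unit of the command processing module receives a recovery instruction for a recovery object; the recovery instruction may be issued by an application. The recovery instruction contains the number of the recovery object and the recovery time. As an example of a recovery time, the current data may be restored to the data as of this moment on the same day last year.
In step S502, the data extraction unit of the data processing module establishes a connection with the data storage unit of the data storage module, locates the backup data by means of the query conditions, and extracts the backup data from the data storage unit of the data storage module.
In step S503, if the backup data is recovered into the application, there are three cases: ignore, append, and replace. Ignore means the current data is not changed, that is, the current data is kept; append means the backup data is added while the current data is kept; replace means the current data is deleted and the backup data is then written into the application. If the backup data is recovered into the external data storage system, the backup data may have come from different tables, so it is first split and then imported into the corresponding tables, all of which are stored in the external data storage system.
This embodiment provides a data recovery method in a big data system. The smallest unit of backup is the data related to a backup object at a given moment, so the data volume is small; data recovery can therefore be completed quickly and more efficiently.
Note that the backup and the recovery of backup data described above can be implemented in the same apparatus, namely the data backup and recovery apparatus in a big data system shown in FIG. 2.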
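As shown in FIG. 6, an embodiment of the present invention provides the structure of a data backup apparatus in a big data system, including the following modules:
a receiving module 601, configured to receive a backup instruction, the backup instruction including at least identifier information of a backup object;
a first obtaining module 602, configured to obtain configuration information of the backup object by using the identifier information, the configuration information being used to determine whether the backup data of the backup object comes from an application or an external data storage system;
a second obtaining module 603, configured to obtain the backup data from the application or the external data storage system by using the configuration information; and
a saving module 604, configured to save the backup data.
Optionally, the backup instruction includes:
a backup instruction issued by an application, a backup instruction automatically generated by a scheduled task, or a backup instruction automatically generated after a trigger condition is reached.
Optionally, the configuration information includes:
directory information, data source information, and policy information, where the directory information is used to determine the data related to the backup object, the data source information is used to determine whether the data related to the backup object comes from the application or the external data storage system, and the policy information is used to decide whether the backup data is obtained from the application or the external data storage system;
the backup data includes:
the backup data is the data set corresponding to the backup object at the current time;
the second obtaining module 603 is configured to use the directory information to determine the data related to the backup object, use the data source information to determine whether the data related to the backup object comes from the application or the external data storage system, and use the policy information to decide whether the backup data is obtained from the application or the external data storage system.
Optionally, the saving module 604 is configured to normalize the backup data and save the normalized backup data.
In this embodiment, the data backup apparatus described above may be the data backup apparatus of the embodiments shown in FIG. 1 and FIG. 4, and any implementation of the data backup apparatus in those embodiments can be realized by the data backup apparatus of this embodiment; details are not repeated here.
This embodiment provides a data backup apparatus in a big data system, on which the data backup method in a big data system can be implemented. The smallest unit of backup is the data related to a backup object at a given moment, so the data volume is small. The data of important backup objects at important moments can therefore be backed up, which gives the backup data a higher value density, saves storage space, and shortens backup time.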
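As shown in FIG. 7, an embodiment of the present invention provides the structure of a data recovery apparatus in a big data system, including the following modules:
a first receiving module 701, configured to receive a recovery instruction, the recovery instruction including at least identifier information of a recovery object;
a third obtaining module 702, configured to obtain backup data of the recovery object by using the identifier information; and
a recovery module 703, configured to recover the backup data into an application or into an external data storage system.
Optionally, the identifier information includes:
the number of the recovery object and the recovery time.
Optionally, the backup data includes:
the backup data is the data set corresponding to the recovery object at the specified time;
the recovery module 703 is configured to replace the current data of the recovery object in the application with the backup data, append the backup data to the current data of the recovery object in the application, or keep the current data of the recovery object in the application.
Optionally, the recovery module 703 is configured to split the backup data and then write the split data into the corresponding tables, the tables being stored in the external data storage system.
In this embodiment, the data recovery apparatus described above may be the data recovery apparatus of the embodiments shown in FIG. 3 and FIG. 5, and any implementation of the data recovery apparatus in those embodiments can be realized by the data recovery apparatus of this embodiment; details are not repeated here.
This embodiment provides a data recovery apparatus in a big data system, on which the data recovery method in a big data system can be implemented. The smallest unit of backup is the data related to a backup object at a given moment, so the data volume is small; data recovery can therefore be completed quickly and more efficiently.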
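As shown in FIG. 8, an embodiment of the present invention provides the flow of a data query method in a big data system, including the following steps:
Step S801: receive a query instruction for a backup object.
Step S802: extract the backup data.
Step S803: return the backup data.
In step S801, the command receiving unit of the command processing module receives a query instruction for a backup object; the query instruction includes the ID of the backup object and the time range of the backup data.
In step S802, the data extraction unit of the data processing module establishes a connection with the data storage unit of the data storage module, locates the backup data by means of the query conditions, and extracts the backup data from the data storage unit of the data storage module.
In step S803, the extracted backup data is returned to the client.
This embodiment provides a data query method in a big data system. The smallest unit of backup is the data related to a backup object at a given moment, so the data volume is small, and all backup data related to the backup object at a given moment is retrieved together; data queries can therefore be completed quickly and more efficiently.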
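A query by object ID and time range can be sketched as a simple filter. This Python example is not part of the original disclosure; the list `BACKUPS`, the function `query_backups`, and the choice of an inclusive time range are assumptions.

```python
from datetime import datetime

# Hypothetical contents of the data storage unit: one entry per object and moment.
BACKUPS = [
    {"object_id": 99, "taken_at": datetime(2016, 8, 23), "data": {"balance": 80.0}},
    {"object_id": 99, "taken_at": datetime(2017, 2, 1), "data": {"balance": 95.0}},
    {"object_id": 100, "taken_at": datetime(2017, 3, 1), "data": {"balance": 8.0}},
    {"object_id": 99, "taken_at": datetime(2017, 8, 23), "data": {"balance": 120.0}},
]


def query_backups(object_id, start, end):
    """Return every backup of `object_id` with start <= taken_at <= end, oldest first."""
    hits = [
        b for b in BACKUPS
        if b["object_id"] == object_id and start <= b["taken_at"] <= end
    ]
    return sorted(hits, key=lambda b: b["taken_at"])


in_2017 = query_backups(99, datetime(2017, 1, 1), datetime(2017, 12, 31))
```

Each hit already contains all data of the object at that moment, so no further joins across tables are needed to answer the query.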
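A person of ordinary skill in the art will understand that all or some of the steps of the methods in the above embodiments can be carried out by hardware under the control of program instructions. The program can be stored in a computer-readable medium and, when executed, includes the following steps:
receiving a backup instruction, the backup instruction including at least identifier information of a backup object;
obtaining configuration information of the backup object by using the identifier information, the configuration information being used to determine whether the backup data of the backup object comes from an application or an external data storage system;
obtaining the backup data from the application or the external data storage system by using the configuration information; and
saving the backup data.
Optionally, the backup instruction includes:
a backup instruction issued by an application, a backup instruction automatically generated by a scheduled task, or a backup instruction automatically generated after a trigger condition is reached.
Optionally, the configuration information includes:
directory information, data source information, and policy information, where the directory information is used to determine the data related to the backup object, the data source information is used to determine whether the data related to the backup object comes from the application or the external data storage system, and the policy information is used to decide whether the backup data is obtained from the application or the external data storage system;
the backup data includes:
the backup data is the data set corresponding to the backup object at the current time;
obtaining the backup data from the application or the external data storage system by using the configuration information includes:
using the directory information to determine the data related to the backup object, using the data source information to determine whether the data related to the backup object comes from the application or the external data storage system, and using the policy information to decide whether the backup data is obtained from the application or the external data storage system.
Optionally, saving the backup data includes:
normalizing the backup data and saving the normalized backup data.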
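When executed, the program further includes the following steps:
receiving a recovery instruction, the recovery instruction including at least identifier information of a recovery object;
obtaining backup data of the recovery object by using the identifier information; and
recovering the backup data into an application or into an external data storage system.
Optionally, the identifier information includes:
the number of the recovery object and the recovery time.
Optionally, the backup data includes:
the backup data is the data set corresponding to the recovery object at the specified time;
recovering the backup data into an application includes:
replacing the current data of the recovery object in the application with the backup data, appending the backup data to the current data of the recovery object in the application, or keeping the current data of the recovery object in the application.
Optionally, recovering into the external data storage system includes:
splitting the backup data and then writing the split data into the corresponding tables, the tables being stored in the external data storage system.
The storage medium is, for example, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The above are preferred embodiments of the present invention. It should be noted that a person of ordinary skill in the art can make improvements and refinements without departing from the principles of the present invention, and such improvements and refinements shall also be regarded as falling within the protection scope of the present invention.
Industrial Applicability
The technical solutions provided by the embodiments of the present invention can be applied in the field of communications technologies. In the embodiments of the present invention, a backup instruction is received, the backup instruction including at least identifier information of a backup object; configuration information of the backup object is obtained by using the identifier information, the configuration information being used to determine whether the backup data of the backup object comes from an application or an external data storage system; the backup data is obtained from the application or the external data storage system by using the configuration information; and the backup data is saved. The data of an important backup object at an important moment can thus be backed up, which gives the backup data a higher value density, saves storage space, and shortens backup time.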

Claims (11)

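  1. A data backup method in a big data system, characterized by including:
    receiving a backup instruction, the backup instruction including at least identifier information of a backup object;
    obtaining configuration information of the backup object by using the identifier information, the configuration information being used to determine whether the backup data of the backup object comes from an application or an external data storage system;
    obtaining the backup data from the application or the external data storage system by using the configuration information; and
    saving the backup data.
  2. The method according to claim 1, characterized in that the backup instruction includes:
    a backup instruction issued by an application, a backup instruction automatically generated by a scheduled task, or a backup instruction automatically generated after a trigger condition is reached.
  3. The method according to claim 2, characterized in that the configuration information includes:
    directory information, data source information, and policy information, where the directory information is used to determine the data related to the backup object, the data source information is used to determine whether the data related to the backup object comes from the application or the external data storage system, and the policy information is used to decide whether the backup data is obtained from the application or the external data storage system;
    the backup data includes:
    the backup data is the data set corresponding to the backup object at the current time;
    obtaining the backup data from the application or the external data storage system by using the configuration information includes:
    using the directory information to determine the data related to the backup object, using the data source information to determine whether the data related to the backup object comes from the application or the external data storage system, and using the policy information to decide whether the backup data is obtained from the application or the external data storage system.
  4. The method according to claim 3, characterized in that saving the backup data includes:
    normalizing the backup data and saving the normalized backup data.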
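  5. A data recovery method in a big data system, characterized by including:
    receiving a recovery instruction, the recovery instruction including at least identifier information of a recovery object;
    obtaining backup data of the recovery object by using the identifier information; and
    recovering the backup data into an application or into an external data storage system.
  6. The method according to claim 5, characterized in that the identifier information includes:
    the number of the recovery object and the recovery time.
  7. The method according to claim 6, characterized in that the backup data includes:
    the backup data is the data set corresponding to the recovery object at the specified time;
    recovering the backup data into an application includes:
    replacing the current data of the recovery object in the application with the backup data, appending the backup data to the current data of the recovery object in the application, or keeping the current data of the recovery object in the application.
  8. The method according to claim 6, wherein recovering into the external data storage system includes:
    splitting the backup data and then writing the split data into the corresponding tables, the tables being stored in the external data storage system.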
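  9. A data backup apparatus in a big data system, including:
    a receiving module, configured to receive a backup instruction, the backup instruction including at least identifier information of a backup object;
    a first obtaining module, configured to obtain configuration information of the backup object by using the identifier information, the configuration information being used to determine whether the backup data of the backup object comes from an application or an external data storage system;
    a second obtaining module, configured to obtain the backup data from the application or the external data storage system by using the configuration information; and
    a saving module, configured to save the backup data.
  10. A data recovery apparatus in a big data system, including:
    a first receiving module, configured to receive a recovery instruction, the recovery instruction including at least identifier information of a recovery object;
    a third obtaining module, configured to obtain backup data of the recovery object by using the identifier information; and
    a recovery module, configured to recover the backup data into an application or into an external data storage system.
  11. A computer storage medium storing one or more computer-executable programs which, when executed by the computer, cause the computer to perform the method according to any one of claims 1 to 8.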
PCT/CN2017/098606 2016-07-27 2017-08-23 Data backup method, recovery method and apparatus in a big data system, and computer storage medium WO2018019310A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610600428.9A CN107665153A (zh) 2016-07-27 2016-07-27 Data backup method, recovery method and apparatus in a big data system
CN201610600428.9 2016-07-27

Publications (1)

Publication Number Publication Date
WO2018019310A1 (zh) 2018-02-01

Family

ID=61015523

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/098606 WO2018019310A1 (zh) 2016-07-27 2017-08-23 Data backup method, recovery method and apparatus in a big data system, and computer storage medium

Country Status (2)

Country Link
CN (1) CN107665153A (zh)
WO (1) WO2018019310A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
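CN110727548B (zh) * 2019-09-29 2022-03-04 上海英方软件股份有限公司 Continuous data protection method and apparatus based on database DML synchronization
CN111831485B (zh) * 2020-07-21 2023-01-13 平安科技(深圳)有限公司 Data recovery method and apparatus, electronic device, and medium
CN113535470A (zh) * 2021-06-23 2021-10-22 浙江中控技术股份有限公司 Configuration backup method and apparatus, electronic device, and storage medium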

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
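CN103577440A (zh) * 2012-07-27 2014-02-12 阿里巴巴集团控股有限公司 Data processing method and apparatus in a non-relational database
CN104765651A (zh) * 2014-01-06 2015-07-08 中国移动通信集团福建有限公司 Data processing method and apparatus
CN105183389A (zh) * 2015-09-15 2015-12-23 北京金山安全软件有限公司 Hierarchical data management method and apparatus, and electronic device
CN105302675A (zh) * 2015-11-25 2016-02-03 上海爱数信息技术股份有限公司 Data backup method and apparatus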

Also Published As

Publication number Publication date
CN107665153A (zh) 2018-02-06

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17833613

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17833613

Country of ref document: EP

Kind code of ref document: A1