CN114116675A

CN114116675A - Data archiving method and device

Info

Publication number: CN114116675A
Application number: CN202111477886.5A
Authority: CN
Inventors: 孟欣
Original assignee: Beijing Jingdong Zhenshi Information Technology Co Ltd
Current assignee: Beijing Jingdong Zhenshi Information Technology Co Ltd
Priority date: 2021-12-06
Filing date: 2021-12-06
Publication date: 2022-03-01

Abstract

The invention discloses a data archiving method and device, and relates to the technical field of computers. One embodiment of the method comprises: determining first data in a first database according to primary key information, wherein the first data is data to be archived; uploading the first data to a second database to generate second data, wherein the second data is the archived data of the first data in the second database; and saving an archive record of the first data in an archive library, and deleting the first data from the first database. This embodiment does not require a large amount of disk space and avoids compressing data and deleting redundant fields, so the original data structure is not damaged, the data structure does not need to be processed when the data is pulled back, and the cost and difficulty of pulling data back are reduced.

Description

Data archiving method and device

Technical Field

The invention relates to the technical field of computers, in particular to a data archiving method and device.

Background

The current data archiving scheme archives data on low-configuration, large-capacity disk storage, compresses the archived data at the same time, and deletes unnecessary indexes and redundant fields from the data, so as to save database space.

In the process of implementing the invention, the inventor finds that at least the following problems exist in the prior art:

Data pull-back is difficult: a large amount of disk space is needed, the table structure is compressed, and unnecessary indexes are cleaned up. Cleaning up redundant fields damages the original data structure, so the data structure must be processed before the data can be reused, which makes data pull-back costly and difficult.

Disclosure of Invention

In view of this, embodiments of the present invention provide a data archiving method and apparatus, which do not require a large amount of disk space, and avoid data compression and deletion of redundant fields, so that the original data structure is not damaged, and the data structure does not need to be processed during data pull-back, thereby reducing the cost and difficulty of data pull-back.

To achieve the above object, according to an aspect of an embodiment of the present invention, a data archiving method is provided.

A data archiving method, comprising: determining first data in a first database according to primary key information, wherein the first data is data to be archived; uploading the first data to a second database to generate second data, wherein the second data is the archived data of the first data in the second database; and saving an archive record of the first data in an archive library, and deleting the first data from the first database.

Optionally, determining the first data in the first database according to the primary key information includes: determining target primary key information according to a time limit and an archive type configured in an archiving task, wherein the target primary key information comprises the creation time and the business code of business data, the creation time is within the time limit, and the business code is determined according to the archive type; and determining the business data corresponding to the target primary key information in the first database as the first data.

Optionally, uploading the first data to the second database to generate the second data includes: judging, according to each archive record in the archive library, whether an available archive address corresponding to the target primary key information currently exists; if so, uploading the first data to the second database according to that available archive address to generate the second data; and if not, uploading the first data to the second database to generate the second data, and obtaining the archive address of the first data from the storage address of the second data.

Optionally, the archive record includes primary key information, a hash code of the primary key information, an archive address, and a status of the archive address; and judging, according to each archive record in the archive library, whether an available archive address corresponding to the target primary key information currently exists comprises: querying the hash code of the target primary key information in each archive record of the archive library; if the hash code is found, judging whether the status of the corresponding archive address indicates that the address is available, and if so, an available archive address corresponding to the target primary key information currently exists; if the hash code is not found, or the status of the archive address corresponding to the found hash code indicates that the address is unavailable, no available archive address corresponding to the target primary key information currently exists.

Optionally, uploading the first data to the second database according to the available archive address corresponding to the target primary key information includes: acquiring historical archived data from the available archive address, comparing the last piece of business data in the historical archived data with the first data, and incrementally archiving the first data to the second database if the comparison shows a difference.

Optionally, when the size of the file stored in the archive address exceeds a preset threshold, the status of the archive address indicates that the archive address is unavailable.

Optionally, the archive record of the first data comprises a unique identifier of the first data; and before deleting the first data from the first database, the method includes: obtaining the unique identifier of the first data from the archive library; acquiring the corresponding second data from the second database according to the unique identifier of the first data; and comparing the first data with the second data and determining that the two are identical.

Optionally, the second database is a cloud storage database.

Optionally, before uploading the first data to the second database to generate the second data, the method includes: converting the unique identifier of the first data and the first data into a JSON-format character string through serialization.

According to another aspect of the embodiments of the present invention, a data archiving apparatus is provided.

A data archiving apparatus, comprising: a first data determining module, configured to determine first data in a first database according to primary key information, wherein the first data is data to be archived; a second data generation module, configured to upload the first data to a second database to generate second data, wherein the second data is the archived data of the first data in the second database; and an archive record storage module, configured to save an archive record of the first data in an archive library and delete the first data from the first database.

Optionally, the first data determining module is further configured to: determine target primary key information according to a time limit and an archive type configured in an archiving task, wherein the target primary key information comprises the creation time and the business code of business data, the creation time is within the time limit, and the business code is determined according to the archive type; and determine the business data corresponding to the target primary key information in the first database as the first data.

Optionally, the second data generation module is further configured to: judge, according to each archive record in the archive library, whether an available archive address corresponding to the target primary key information currently exists; if so, upload the first data to the second database according to that available archive address to generate the second data; and if not, upload the first data to the second database to generate the second data, and obtain the archive address of the first data from the storage address of the second data.

Optionally, the archive record includes primary key information, a hash code of the primary key information, an archive address, and a status of the archive address; and the second data generation module is further configured to: query the hash code of the target primary key information in each archive record of the archive library; if the hash code is found, judge whether the status of the corresponding archive address indicates that the address is available, and if so, an available archive address corresponding to the target primary key information currently exists; if the hash code is not found, or the status of the archive address corresponding to the found hash code indicates that the address is unavailable, no available archive address corresponding to the target primary key information currently exists.

Optionally, the second data generation module is further configured to: acquire historical archived data from the available archive address corresponding to the target primary key information, compare the last piece of business data in the historical archived data with the first data, and incrementally archive the first data to the second database if the comparison shows a difference.

Optionally, the archive record of the first data comprises a unique identifier of the first data; and the apparatus further comprises a comparison module, configured to: obtain the unique identifier of the first data from the archive library; acquire the corresponding second data from the second database according to the unique identifier of the first data; and compare the first data with the second data and determine that the two are identical.

Optionally, the second database is a cloud storage database.

Optionally, the apparatus further comprises a first data conversion module, configured to: convert the unique identifier of the first data and the first data into a JSON-format character string through serialization.

According to yet another aspect of an embodiment of the present invention, an electronic device is provided.

An electronic device, comprising: one or more processors; a memory for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the data archiving method provided by embodiments of the present invention.

According to yet another aspect of an embodiment of the present invention, a computer-readable medium is provided.

A computer-readable medium, on which a computer program is stored, which, when executed by a processor, implements the data archiving method provided by embodiments of the present invention.

One embodiment of the above invention has the following advantages or benefits: first data in a first database is determined according to primary key information, wherein the first data is data to be archived; the first data is uploaded to a second database to generate second data, wherein the second data is the archived data of the first data in the second database; and an archive record of the first data is saved in the archive library, and the first data is deleted from the first database. Data can thus be archived on a cloud storage database without a large amount of disk space, and compressing data and deleting redundant fields are avoided, so the original data structure is not damaged, the data structure does not need to be processed when the data is pulled back, and the cost and difficulty of pulling data back are reduced.

Further effects of the above-mentioned non-conventional alternatives will be described below in connection with the embodiments.

Drawings

The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:

FIG. 1 is a schematic diagram of the main steps of a data archiving method according to one embodiment of the present invention;

FIG. 2 is a first schematic flow chart of data archiving according to one embodiment of the present invention;

FIG. 3 is a second schematic flow chart of data archiving according to an embodiment of the present invention;

FIG. 4 is a process flow diagram for handling data that failed to archive, according to one embodiment of the present invention;

FIG. 5 is a schematic diagram of the main modules of a data archive device according to one embodiment of the present invention;

FIG. 6 is an exemplary system architecture diagram in which embodiments of the present invention may be employed;

FIG. 7 is a schematic block diagram of a computer system suitable for use in implementing a terminal device or server of an embodiment of the invention.

Detailed Description

Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

FIG. 1 is a schematic diagram of the main steps of a data archiving method according to one embodiment of the present invention.

As shown in FIG. 1, the data archiving method according to an embodiment of the present invention mainly includes the following steps S101 to S103.

Step S101: determine first data in the first database according to primary key information, wherein the first data is data to be archived.

Determining the first data in the first database according to the primary key information may include: determining target primary key information according to a time limit and an archive type configured in an archiving task, wherein the target primary key information comprises the creation time and the business code of business data, the creation time is within the time limit, and the business code is determined according to the archive type; and determining the business data corresponding to the target primary key information in the first database as the first data.
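As a minimal sketch of step S101, the following Python groups business data by its primary key information (business code plus year-month of the creation time). The `BusinessRow` fields, the archive type labels, and the `select_first_data` helper are hypothetical names chosen for illustration; the time limit is simplified to a single cutoff date.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical mapping from archive type to the kind of business code used in
# the primary key information; the labels are illustrative only.
CODE_KIND_BY_ARCHIVE_TYPE = {
    "preprocessing": "business_line",
    "standard_single": "merchant",
    "charging_result": "merchant",
}


@dataclass(frozen=True)
class BusinessRow:
    row_id: int
    merchant: str
    business_line: str
    created: date
    payload: str


def target_key(row, archive_type):
    """Return the primary key information (business code, year-month) of a row."""
    kind = CODE_KIND_BY_ARCHIVE_TYPE[archive_type]
    return (getattr(row, kind), row.created.strftime("%Y-%m"))


def select_first_data(rows, archive_type, cutoff):
    """Group rows created before the cutoff by their primary key information."""
    groups = {}
    for row in rows:
        if row.created < cutoff:
            groups.setdefault(target_key(row, archive_type), []).append(row)
    return groups


rows = [
    BusinessRow(1, "m1", "lineA", date(2020, 3, 14), "a"),
    BusinessRow(2, "m1", "lineA", date(2020, 3, 20), "b"),
    BusinessRow(3, "m2", "lineB", date(2020, 6, 1), "c"),
]
first_data = select_first_data(rows, "standard_single", date(2020, 5, 1))
```

Each key of `first_data` is one piece of target primary key information, and its value is the business data to be archived under that key.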

Step S102: upload the first data to a second database to generate second data, wherein the second data is the archived data of the first data in the second database.

Uploading the first data to the second database to generate the second data may include: judging, according to each archive record in the archive library, whether an available archive address corresponding to the target primary key information currently exists; if so, uploading the first data to the second database according to that available archive address to generate the second data; and if not, uploading the first data to the second database to generate the second data, and obtaining the archive address of the first data from the storage address of the second data.

Judging, according to each archive record in the archive library, whether an available archive address corresponding to the target primary key information currently exists may include: querying the hash code of the target primary key information in each archive record of the archive library; if the hash code is found, judging whether the status of the corresponding archive address indicates that the address is available, and if so, an available archive address corresponding to the target primary key information currently exists; if the hash code is not found, or the status of the archive address corresponding to the found hash code indicates that the address is unavailable, no available archive address corresponding to the target primary key information currently exists.
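The lookup described above can be sketched as follows. The choice of SHA-256 as the hash function, the `ArchiveRecord` class, the example key strings, and the `jfs://` addresses are assumptions made for illustration; the status values 1 (available) and 2 (unavailable) follow the archive record description in this embodiment.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional

AVAILABLE, UNAVAILABLE = 1, 2  # status of an archive address


@dataclass
class ArchiveRecord:
    hash_key: str    # hash code of the primary key information
    source_key: str  # primary key information
    url: str         # archive address
    status: int


def hash_code(source_key: str) -> str:
    # SHA-256 is an assumption; the source does not name a hash function.
    return hashlib.sha256(source_key.encode("utf-8")).hexdigest()


def find_available_address(records, source_key) -> Optional[str]:
    """Return an available archive address for the key, or None if there is none."""
    wanted = hash_code(source_key)
    for record in records:
        if record.hash_key == wanted and record.status == AVAILABLE:
            return record.url
    return None


records = [
    ArchiveRecord(hash_code("m1+2020-03"), "m1+2020-03", "jfs://archive/a", UNAVAILABLE),
    ArchiveRecord(hash_code("m1+2020-03"), "m1+2020-03", "jfs://archive/b", AVAILABLE),
]
address = find_available_address(records, "m1+2020-03")
missing = find_available_address(records, "m2+2020-03")
```

A `None` result covers both cases in which no available address exists: the hash code is not found, or it is found only with an unavailable status.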

Uploading the first data to the second database according to the available archive address corresponding to the target primary key information may include: acquiring historical archived data from the available archive address, comparing the last piece of business data in the historical archived data with the first data, and incrementally archiving the first data to the second database if the comparison shows a difference. Incremental archiving means that data which has already been archived is not archived again, and only unarchived data is archived.

When the size of the file stored in the archive address exceeds a preset threshold value, the state of the archive address indicates that the archive address is unavailable.

Before uploading the first data to the second database to generate the second data, the method may include: converting the unique identifier of the first data and the first data into a JSON-format character string through serialization. The unique identifier of the data is the data ID.
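A minimal sketch of this serialization step, using Python's standard `json` module, is shown below. The row fields (`merchant`, `amount`) and the helper name `serialize_first_data` are hypothetical; the point is that each row is stored whole under its data ID, so no field is dropped and the structure can be restored directly on pull-back.

```python
import json


def serialize_first_data(rows):
    """Serialize {data ID: row data} into a JSON-format character string."""
    mapping = {str(row_id): data for row_id, data in rows}
    return json.dumps(mapping, ensure_ascii=False, sort_keys=True)


payload = serialize_first_data([
    (1001, {"merchant": "m1", "amount": 12.5}),
    (1002, {"merchant": "m1", "amount": 7.0}),
])

# Pulling the data back only requires deserializing the string.
restored = json.loads(payload)
```

JSON object keys are strings, so the data ID is converted with `str` before serialization and would need converting back if a numeric ID is required.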

The first database is the original database where the data to be archived is located, the second database is the database in which the data is archived, and the second database can be a cloud storage database.

Step S103: save the archive record of the first data in the archive library, and delete the first data from the first database.

The archive record may include primary key information, a hash code of the primary key information, an archive address, and a status of the archive address, and may also include a unique identifier of the corresponding data.

The archive record of the first data may include the target primary key information and its hash code, the archive address of the first data and its status, and may also include the unique identifier of the first data. Before deleting the first data from the first database, the method may include: obtaining the unique identifier of the first data from the archive library; acquiring the corresponding second data from the second database according to the unique identifier of the first data; and comparing the first data with the second data and determining that the two are identical.

FIG. 2 is a first schematic flow chart of data archiving according to an embodiment of the present invention, and FIG. 3 is a second schematic flow chart of data archiving according to an embodiment of the present invention.

As shown in FIG. 2 and FIG. 3, the primary key information is determined according to the archive type of the business data. The archive types can be divided into pre-processing archive data, standard single archive data, and charging result archive data. The business code corresponding to pre-processing archive data is a business line code, and the business code corresponding to standard single archive data and charging result archive data is a merchant code. The primary key information of pre-processing archive data therefore comprises the business line code and the creation time of the business data (that is, "business line + year-month" in FIG. 2), and the primary key information of standard single archive data and charging result archive data comprises the merchant code and the creation time of the business data (that is, "merchant + year-month" in FIG. 2).

A time limit and an archive type are configured in the archiving task, and a scanning condition is determined from them: the creation time of the business data must fall within the time limit, and the business code is the one determined by the archive type. In other words, the business data in the first database is scanned according to the target primary key information to determine the first data that satisfies the scanning condition, where the target primary key information comprises the creation time and the business code of the business data. For example, if the archive type is standard single archive data, the business code is a merchant code; assuming the time limit is within one year before May 1, 2020, the creation time of the business data must satisfy date < '2020-05-01', so the scanning condition is merchant/business line + date < '2020-05-01'. The first data is determined by querying the business data whose primary key information (here, merchant + year-month) satisfies the scanning condition, that is, the business data corresponding to the target primary key information.

After the first data is determined, the unique identifier of the first data and the first data are converted into a JSON-format character string through serialization, and the primary key information of the first data is converted into a hash code.
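The scanning condition in this example can be sketched as a parameterized query. The sketch below uses Python's standard `sqlite3` module with an in-memory database as a stand-in for the first database; the table name `business_data` and its columns are hypothetical, and only the upper bound date < '2020-05-01' from the example is applied.

```python
import sqlite3

# In-memory stand-in for the first database; the schema is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE business_data ("
    "id INTEGER PRIMARY KEY, merchant TEXT, created TEXT, payload TEXT)"
)
conn.executemany(
    "INSERT INTO business_data VALUES (?, ?, ?, ?)",
    [
        (1, "m1", "2020-03-14", "a"),
        (2, "m1", "2020-04-02", "b"),
        (3, "m1", "2020-05-09", "c"),
        (4, "m2", "2020-03-30", "d"),
    ],
)


def scan_first_data(conn, merchant, cutoff):
    """Return (id, payload) rows of one merchant created before the cutoff date."""
    cur = conn.execute(
        "SELECT id, payload FROM business_data "
        "WHERE merchant = ? AND created < ? ORDER BY id",
        (merchant, cutoff),
    )
    return cur.fetchall()


first_data = scan_first_data(conn, "m1", "2020-05-01")
```

Dates are stored as ISO 8601 strings here, so the string comparison in the query orders them chronologically.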

Whether an available archive address corresponding to the target primary key information currently exists is judged according to each archive record in the archive library, and the first data is uploaded to the second database (that is, the cloud storage database, "JFS (cloud storage service)" in FIG. 3) to generate the second data. Specifically, the hash code of the target primary key information is queried in each archive record of the archive library. If the hash code is found and the status of the corresponding archive address indicates that the address is available, an available archive address corresponding to the target primary key information currently exists, and the first data is uploaded to the second database according to that address to generate the second data. If the hash code is not found, or the status of the archive address corresponding to the found hash code indicates that the address is unavailable, no available archive address currently exists; the first data is then uploaded to the second database to generate the second data, and the archive address of the first data is obtained from the storage address of the second data.

The archive records in the archive library are shown in Table 1. Taking the archive record of the first data as an example, ID is the unique identifier of the first data, hashKey is the hash code of the primary key information of the first data, sourceKey is the primary key information of the first data, URL is the storage address (that is, the archive address) of the corresponding second data in the second database, and status is the status of the archive address: status 1 indicates that the archive address is available, and status 2 indicates that it is unavailable.

As shown in FIG. 3, it is determined whether the upload is a newly added upload. If so, no archive address is currently available: the data is uploaded to a new storage address in the second database, that address is used as the archive address, a new archive record is added, and its status (that is, the status of the archive address) is set to 1, indicating that the archive address is available. If not, an available archive address currently exists: the first data is uploaded to that address, and the archive record is then updated with status 1 (archive address available). The data placed by put(id, data) is serialized into a JSON-format character string, the corresponding data is uploaded to the database for archiving, an archive record is added or updated, and the data is uploaded to JFS (the cloud storage service) to obtain the archive address URL.
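The add-or-update branch can be sketched as follows. `FakeStore` is an in-memory stand-in for the cloud storage service, not a real JFS client, and the `mem://` addresses, the `archive` helper, and the choice to append new content to the existing file are assumptions made for illustration.

```python
from dataclasses import dataclass

AVAILABLE = 1  # status value indicating that the archive address is available


@dataclass
class ArchiveRecord:
    hash_key: str
    url: str
    status: int


class FakeStore:
    """In-memory stand-in for the cloud storage service; not a real client."""

    def __init__(self):
        self.files = {}

    def upload(self, content, url=None):
        if url is None:  # newly added upload: allocate a new storage address
            url = f"mem://archive/{len(self.files) + 1}"
        self.files[url] = self.files.get(url, "") + content
        return url


def archive(store, records, hash_key, content):
    """Upload content, then add or update the archive record for hash_key."""
    record = next(
        (r for r in records if r.hash_key == hash_key and r.status == AVAILABLE),
        None,
    )
    if record is None:
        # No available archive address: the new storage address becomes it.
        url = store.upload(content)
        record = ArchiveRecord(hash_key, url, AVAILABLE)
        records.append(record)
    else:
        # An available archive address exists: upload to it and keep status 1.
        store.upload(content, record.url)
        record.status = AVAILABLE
    return record


store, records = FakeStore(), []
first = archive(store, records, "h1", "a")
second = archive(store, records, "h1", "b")
```

The second call reuses the record created by the first, so the archive library holds a single record for the hash code.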

TABLE 1 archive records in an archive repository

ID	hashKey	sourceKey	URL	status
					1

In one embodiment, the unique identifier of the business data is generated according to an incremental rule. When the first data is uploaded to the second database, historical archived data is obtained from the available archive address corresponding to the target primary key information, and the last piece of business data in the historical archived data is compared with the first data; if they differ, the first data is incrementally archived to the second database.
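One possible reading of this incremental upload is sketched below. It assumes that both the historical archived data and the first data are lists of (ID, payload) tuples sorted by ID, and that "different" means the last archived piece is not the last piece of the first data; the `incremental_archive` helper is a hypothetical name.

```python
def incremental_archive(history, first_data):
    """Return the archived data after appending only rows not yet archived.

    Both arguments are lists of (row_id, payload) tuples sorted by row_id;
    row IDs are assumed to increase monotonically.
    """
    if not history:
        return list(first_data)
    last_archived = history[-1]
    if first_data and first_data[-1] == last_archived:
        return list(history)  # comparison is the same: nothing new to archive
    # Comparison differs: archive only rows newer than the last archived one.
    new_rows = [row for row in first_data if row[0] > last_archived[0]]
    return list(history) + new_rows


history = [(1, "a"), (2, "b")]
merged = incremental_archive(history, [(2, "b"), (3, "c"), (4, "d")])
unchanged = incremental_archive(history, [(1, "a"), (2, "b")])
```

Because IDs only increase, filtering on the last archived ID is enough to skip data that has already been archived.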

In one embodiment, a threshold is set for the size of the file stored at the archive address; when the file size exceeds the preset threshold, the status of the archive address indicates that the address is unavailable. For example, as shown in FIG. 3, it is determined whether the uploaded file (that is, the archived data, such as the second data) is larger than 100M (the preset threshold). If so, the archive record is updated as complete with status 2, that is, the status of the URL (archive address) indicates that it is unavailable; if not, the flow returns to the step of determining whether the upload is newly added (described above and not repeated here).
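The size check can be sketched in a few lines. Interpreting "100M" as 100 × 1024 × 1024 bytes is an assumption, as is the helper name `status_after_upload`.

```python
# "100M" from the example; binary megabytes are an assumption.
THRESHOLD_BYTES = 100 * 1024 * 1024
AVAILABLE, UNAVAILABLE = 1, 2


def status_after_upload(file_size_bytes, threshold=THRESHOLD_BYTES):
    """Return status 2 once the archived file exceeds the threshold, else 1."""
    return UNAVAILABLE if file_size_bytes > threshold else AVAILABLE


at_limit = status_after_upload(THRESHOLD_BYTES)
over_limit = status_after_upload(THRESHOLD_BYTES + 1)
```

A file exactly at the threshold stays available, since the embodiment marks the address unavailable only when the size exceeds the threshold.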

In one embodiment, the first data is uploaded to the second database through a Map collection, where Map is a collection type used in programming languages to store key-value data.

In one embodiment, the unique identifier of the first data is obtained from the archive record of the target primary key information in the archive library; the corresponding second data is acquired from the second database according to the unique identifier; and the first data is compared with the second data. After the two are determined to be identical, the first data is deleted from the first database to release its storage space. This corresponds to the step in FIG. 2 of pulling back and checking the data after archiving: if the check passes (yes), the original data (that is, the first data) is deleted; if the check fails (no), it is retried, and if the retry also fails, the original data is not deleted. After the original data is deleted, disk fragments of the original database (that is, the first database) can be cleaned up periodically. Specifically, the first data that has been archived, as recorded in the archive library, is pulled, its unique identifier is obtained, and the first data is compared with the corresponding second data in the second database. If they are identical, the first data is deleted from the original database; if they differ, the first data is archived again. Comparing the first data with the second data verifies the accuracy of the archiving, and the original data is cleaned up based on the unique identifier to release space.
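The check-then-delete step can be sketched with plain dictionaries standing in for the two databases, keyed by unique identifier. The `verify_and_delete` helper and the `max_retries` parameter are hypothetical; the source states only that a failed check is retried and that the original data is kept if the retry fails.

```python
def verify_and_delete(first_db, second_db, record_ids, max_retries=1):
    """Delete rows from first_db whose archived copy matches; return failed IDs."""
    failed = []
    for row_id in record_ids:
        if row_id not in first_db:
            continue
        original = first_db[row_id]
        for _ in range(max_retries + 1):
            if second_db.get(row_id) == original:
                del first_db[row_id]  # check passed: release the space
                break
            second_db[row_id] = original  # check failed: archive again
        else:
            failed.append(row_id)  # retries exhausted: keep the original data
    return failed


first_db = {1: "a", 2: "b"}
second_db = {1: "a", 2: "corrupted"}
failed = verify_and_delete(first_db, second_db, [1, 2])
```

Nothing is deleted from the first database until a comparison has actually succeeded, so a failed archive never costs the original data.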

FIG. 4 is a process flow diagram for handling data that failed to archive, according to one embodiment of the present invention.

As shown in FIG. 4, in one embodiment, the data is scanned according to business line/merchant + year-month; that is, the data to be archived (the first data) in the first database is determined according to the primary key information and then archived. When archiving fails, the unique identifier of the data that failed to archive can be recorded, and that data is treated as new data to be archived and archived again. Data that is archived successfully on retry is recorded in the archive library, and data that fails again continues to be treated as new data to be archived so that archiving is retried. Archiving the new data corresponds to the "update/new addition" branch in FIG. 4: the data is archived either as a new addition or as an update. Archiving as a new addition is the newly added upload case described above, in which no archive address is currently available and a new storage address in the second database is used as the archive address. Archiving as an update is the case described above in which the upload is not newly added: an available archive address currently exists, and the first data is uploaded to it. In addition, the ID of the data that failed to archive can also be uploaded to JFS. The "success" branch in FIG. 4, which follows the archiving of the failed data, indicates that the data was archived successfully.
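The retry loop for failed data can be sketched as follows. The `archive_with_retry` helper, the `max_rounds` bound, and the use of `OSError` to signal a failed upload are assumptions; the source does not state how many times archiving is retried or how failure is reported.

```python
def archive_with_retry(rows, upload, max_rounds=3):
    """Archive rows; rows that fail become the new batch for the next round.

    Returns (archived IDs in order of success, IDs that still failed).
    """
    pending = dict(rows)
    archived = []
    for _ in range(max_rounds):
        failed = {}
        for row_id, data in pending.items():
            try:
                upload(row_id, data)
                archived.append(row_id)
            except OSError:
                failed[row_id] = data  # record the ID of the failed data
        if not failed:
            return archived, []
        pending = failed  # failed data is the new data to be archived
    return archived, sorted(pending)


attempts = {}


def flaky_upload(row_id, data):
    """Hypothetical uploader that fails once for row 2, then succeeds."""
    attempts[row_id] = attempts.get(row_id, 0) + 1
    if row_id == 2 and attempts[row_id] == 1:
        raise OSError("simulated upload failure")


archived, still_failed = archive_with_retry({1: "a", 2: "b"}, flaky_upload)
```

Only the failed rows are retried in each round, so rows that were archived successfully are not uploaded a second time.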

FIG. 5 is a schematic diagram of the main modules of a data archive device according to one embodiment of the present invention.

As shown in FIG. 5, the data archiving apparatus 500 according to an embodiment of the present invention mainly includes: a first data determining module 501, a second data generating module 502, and an archive record storage module 503.

The first data determining module 501 is configured to determine first data in a first database according to the primary key information, where the first data is data to be archived.

The second data generating module 502 is configured to upload the first data to a second database to generate second data, where the second data is archived data of the first data in the second database.

The archive record storage module 503 is configured to save the archive record of the first data in the archive library, and delete the first data from the first database.

In one embodiment, the first data determining module is specifically configured to: determine target primary key information according to a time limit and an archive type configured in an archiving task, wherein the target primary key information comprises the creation time and the business code of business data, the creation time is within the time limit, and the business code corresponds to the archive type; and determine the business data corresponding to the target primary key information in the first database as the first data.

In one embodiment, the second data generation module is specifically configured to: determine, according to each archive record in the archive library, whether an available archive address corresponding to the target primary key information currently exists; if such an address exists, upload the first data to the second database at the available archive address corresponding to the target primary key information to generate the second data; and if no such address exists, upload the first data to the second database to generate the second data, and obtain the archive address of the first data from the storage address of the second data.

In one embodiment, the archive record may include the primary key information, a hash code of the primary key information, an archive address, and a status of the archive address; the second data generation module is specifically configured to: query the hash code of the target primary key information in each archive record of the archive library; if the hash code of the target primary key information is found, determine whether the status of the archive address corresponding to that hash code indicates that the archive address is available, and if so, an available archive address corresponding to the target primary key information currently exists; if the hash code of the target primary key information is not found, or the status of the archive address corresponding to the found hash code indicates that the archive address is unavailable, no available archive address corresponding to the target primary key information currently exists.
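The lookup can be sketched as below. The embodiment does not name a hash function or a record layout, so the use of SHA-256, the dictionary keyed by hash code, and the field names `status` and `archive_address` are all assumptions of this sketch.

```python
import hashlib
from typing import Dict, Optional


def key_hash(primary_key: str) -> str:
    """Hash code of the primary key information (SHA-256 is an assumption)."""
    return hashlib.sha256(primary_key.encode("utf-8")).hexdigest()


def find_available_address(
    archive_library: Dict[str, dict], primary_key: str
) -> Optional[str]:
    """Return the available archive address for primary_key, or None."""
    record = archive_library.get(key_hash(primary_key))
    if record is None:
        # Hash code not found: no archive address exists yet.
        return None
    if record["status"] != "available":
        # Address exists but its status marks it as unavailable.
        return None
    return record["archive_address"]
```

A return value of `None` corresponds to the new-addition upload case, in which the storage address of the newly generated second data becomes the archive address.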

In one embodiment, the second data generation module is specifically configured to: acquire historical archived data from the available archive address corresponding to the target primary key information, compare the last piece of business data in the historical archived data with the first data, and incrementally archive the first data to the second database if the two differ.
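One possible reading of this incremental step is sketched here. It assumes that the first data is a single record and that "incrementally archiving" means appending it to the history held at the archive address; both points are interpretations, and `incremental_archive` is a hypothetical name.

```python
from typing import List


def incremental_archive(history: List[dict], record: dict) -> bool:
    """Append record to history unless it equals the last archived record.

    Returns True if the record was appended, False if it was skipped.
    """
    if history and history[-1] == record:
        # Same as the last piece of historical data: nothing new to archive.
        return False
    history.append(record)
    return True
```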

In one embodiment, the status of the archive address indicates that the archive address is unavailable when the size of the file stored at the archive address exceeds a preset threshold.
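The threshold rule can be written as a one-line check. The value of `MAX_FILE_BYTES` and the status strings are hypothetical; the embodiment only states that a preset threshold exists.

```python
MAX_FILE_BYTES = 64 * 1024 * 1024  # hypothetical preset threshold (64 MiB)


def address_status(file_size_bytes: int, threshold: int = MAX_FILE_BYTES) -> str:
    """Mark the archive address unavailable once its file exceeds the threshold."""
    return "unavailable" if file_size_bytes > threshold else "available"
```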

In one embodiment, the archive record of the first data may include the target primary key information and its corresponding hash code, as well as the archive address of the first data and its corresponding status; the apparatus may further include a comparison module configured to: acquire the unique identifier of the first data from the target primary key information recorded in the archive library; acquire the corresponding second data from the second database according to the unique identifier of the first data; and compare the first data with the second data and determine that the two are the same.
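The comparison that precedes deletion can be sketched as follows, assuming for illustration that both databases are dictionaries keyed by the unique identifier; `verify_then_delete` is a hypothetical name.

```python
from typing import Dict


def verify_then_delete(
    first_db: Dict[str, dict], second_db: Dict[str, dict], unique_id: str
) -> bool:
    """Delete the source row only if the archived copy is identical to it."""
    first = first_db.get(unique_id)
    second = second_db.get(unique_id)
    if first is None or first != second:
        # Missing or mismatching archive copy: keep the source row.
        return False
    del first_db[unique_id]
    return True
```

Refusing to delete on a mismatch keeps the first database as the source of truth until the archived copy has been confirmed.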

In one embodiment, the second database may be a cloud storage database.

In one embodiment, the apparatus may further include a first data conversion module configured to: convert, through serialization, the unique identifier of the first data and the first data into a JSON-format character string.
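A minimal sketch of this conversion, using the Python standard library. The envelope keys `id` and `data` and the function name are assumptions; the embodiment does not define the JSON layout.

```python
import json


def to_archive_string(unique_id: str, first_data: dict) -> str:
    """Serialize the unique identifier together with the row as a JSON string."""
    return json.dumps(
        {"id": unique_id, "data": first_data},
        ensure_ascii=False,  # keep non-ASCII business text readable
        sort_keys=True,      # stable key order makes later comparison easier
    )
```

Because every original field is kept as-is in the string, the row can be restored with `json.loads` without any structural processing, which matches the stated aim of low-cost data pull-back.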

In addition, the specific implementation contents of the data archiving device in the embodiment of the present invention have been described in detail in the above data archiving method, and therefore, the repeated contents will not be described again.

Fig. 6 illustrates an exemplary system architecture 600 to which the data archiving method or data archiving apparatus of embodiments of the present invention may be applied.

As shown in fig. 6, the system architecture 600 may include terminal devices 601, 602, 603, a network 604, and a server 605. The network 604 serves as a medium providing communication links between the terminal devices 601, 602, 603 and the server 605. The network 604 may include various types of connections, such as wired or wireless communication links, or fiber optic cables.

A user may use the terminal devices 601, 602, 603 to interact with the server 605 via the network 604 to receive or send messages or the like. The terminal devices 601, 602, 603 may have installed thereon various communication client applications, such as shopping applications, web browser applications, search applications, instant messaging tools, mailbox clients, social platform software, etc. (by way of example only).

The terminal devices 601, 602, 603 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop computers, desktop computers, and the like.

The server 605 may be a server providing various services, such as a background management server (by way of example only) providing support for shopping websites browsed by users using the terminal devices 601, 602, 603. The background management server may analyze and otherwise process received data such as a product information query request, and feed back a processing result (for example, target push information or product information, by way of example only) to the terminal device.

It should be noted that the data archiving method provided by the embodiment of the present invention is generally executed by the server 605, and accordingly, the data archiving device is generally disposed in the server 605.

It should be understood that the number of terminal devices, networks, and servers in fig. 6 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.

Referring now to FIG. 7, a block diagram of a computer system 700 suitable for use with a terminal device or server implementing an embodiment of the invention is shown. The terminal device or the server shown in fig. 7 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.

As shown in fig. 7, the computer system 700 includes a Central Processing Unit (CPU) 701, which can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 702 or a program loaded from a storage section 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data necessary for the operation of the system 700 are also stored. The CPU 701, the ROM 702, and the RAM 703 are connected to each other via a bus 704. An input/output (I/O) interface 705 is also connected to the bus 704.

The following components are connected to the I/O interface 705: an input section 706 including a keyboard, a mouse, and the like; an output section 707 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 708 including a hard disk and the like; and a communication section 709 including a network interface card such as a LAN card, a modem, or the like. The communication section 709 performs communication processing via a network such as the internet. A drive 710 is also connected to the I/O interface 705 as needed. A removable medium 711 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 710 as necessary, so that a computer program read out therefrom is installed into the storage section 708 as necessary.

In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program can be downloaded and installed from a network through the communication section 709, and/or installed from the removable medium 711. The computer program performs the above-described functions defined in the system of the present invention when executed by the Central Processing Unit (CPU) 701.

It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The modules described in the embodiments of the present invention may be implemented by software or hardware. The described modules may also be provided in a processor, which may be described as: a processor comprises a first data determining module, a second data generating module and an archive record keeping module. The names of these modules do not constitute a limitation to the module itself in some cases, and for example, the first data determination module may also be described as "a module for determining first data in the first database in accordance with the primary key information".

As another aspect, the present invention also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments, or may exist separately without being incorporated into the apparatus. The computer-readable medium carries one or more programs which, when executed by a device, cause the device to perform: determining first data in a first database according to the primary key information, wherein the first data is data to be archived; uploading the first data to a second database to generate second data, wherein the second data is archived data of the first data in the second database; and saving the archive record of the first data in the archive library and deleting the first data in the first database.

According to the technical scheme of the embodiment of the invention, first data in a first database is determined according to the primary key information, wherein the first data is data to be archived; uploading the first data to a second database to generate second data, wherein the second data is archived data of the first data in the second database; the archive record of the first data is saved in the archive library, and the first data in the first database is deleted. The data can be archived based on the cloud storage database without a large amount of disk space, so that data compression and deletion of redundant fields are avoided, and the cost and difficulty of data pull-back are reduced.

The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims

1. A method for archiving data, comprising:

determining first data in a first database according to the primary key information, wherein the first data are data to be archived;

uploading the first data to a second database to generate second data, wherein the second data is archived data of the first data in the second database;

and saving the archive record of the first data in an archive library, and deleting the first data in the first database.

2. The method of claim 1, wherein determining the first data in the first database according to the primary key information comprises:

determining target primary key information according to a time limit and an archiving type configured in an archiving task, wherein the target primary key information comprises creation time and business codes of business data, the creation time is within the range of the time limit, and the business codes are determined according to the archiving type;

and determining the service data corresponding to the target primary key information in the first database as the first data.

3. The method of claim 2, wherein uploading the first data into a second database to generate second data comprises:

determining, according to each archive record in the archive library, whether an available archive address corresponding to the target primary key information currently exists;

if such an address exists, uploading the first data to the second database according to the available archive address corresponding to the target primary key information to generate the second data;

and if no such address exists, uploading the first data to the second database to generate the second data, and obtaining the archive address of the first data according to the storage address of the second data.

4. The method of claim 3, wherein the archive record comprises primary key information, a hash of the primary key information, an archive address, a status of the archive address;

the determining, according to each archive record in the archive library, whether an available archive address corresponding to the target primary key information currently exists comprises:

querying the hash code of the target primary key information in each archive record of the archive library; if the hash code of the target primary key information is found, determining whether the status of the archive address corresponding to the hash code of the target primary key information indicates that the archive address is available, and if so, an available archive address corresponding to the target primary key information currently exists;

if the hash code of the target primary key information is not found, or the status of the archive address corresponding to the found hash code of the target primary key information indicates that the archive address is unavailable, no available archive address corresponding to the target primary key information currently exists.

5. The method of claim 3 or 4, wherein the uploading the first data to the second database according to an available archival address corresponding to the target primary key information comprises:

and acquiring historical archived data from an available archived address corresponding to the target primary key information, comparing the last piece of business data of the historical archived data with the first data, and incrementally archiving the first data to the second database under the condition that the comparison result is different.

6. The method of claim 4, wherein the status of the archive address indicates that the archive address is unavailable when the size of the file stored by the archive address exceeds a predetermined threshold.

7. The method of any of claims 2 to 4, wherein the archived record of the first data includes a unique identification of the first data;

before the deleting the first data in the first database, the method includes:

obtaining the unique identification of the first data from the archive;

acquiring corresponding second data from the second database according to the unique identifier of the first data;

and comparing the first data with the second data, and determining that the comparison results are the same.

8. The method of claim 1, wherein the second database is a cloud storage database.

9. A data archiving apparatus, comprising:

the first data determining module is used for determining first data in a first database according to the primary key information, wherein the first data are data to be archived;

the second data generation module is used for uploading the first data to a second database to generate second data, and the second data is archived data of the first data in the second database;

and the archive record storage module is used for storing the archive record of the first data in an archive library and deleting the first data in the first database.

10. An electronic device, comprising:

one or more processors;

a storage device for storing one or more programs,

wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-8.

11. A computer-readable medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1-8.