CN112099991A - Method, device, system and storage medium for data backup and source data access - Google Patents

Publication number
CN112099991A
Authority
CN
China
Prior art keywords
source
data
standby
station
source station
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010922994.8A
Other languages
Chinese (zh)
Inventor
张健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Kingsoft Cloud Network Technology Co Ltd
Original Assignee
Beijing Kingsoft Cloud Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Kingsoft Cloud Network Technology Co Ltd
Priority to CN202010922994.8A
Publication of CN112099991A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/71 Indexing; Data structures therefor; Storage structures

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to a method, a device, a system and a storage medium for data backup and source data access. The method comprises: reading source data stored in a main source station; and backing up the read source data to at least two first standby source stations, wherein the at least two first standby source stations respectively use different public cloud object storage spaces, each first standby source station backs up part of the source data in the main source station, and, when a first standby source station does not contain the requested data, its back-to-source destination address is another standby source station or the main source station. The method and the device address the problems of existing backup modes for massive source data, in which maintaining a highly available architecture under highly concurrent access occupies large amounts of storage and bandwidth resources, is costly, and is difficult to operate and maintain.

Description

Method, device, system and storage medium for data backup and source data access
Technical Field
The present application relates to the field of computer network technologies, and in particular, to a method, an apparatus, a system, and a storage medium for data backup and source data access.
Background
China has about 800 million Internet users. In the Internet era every person is a creator of information, and the world generates hundreds of petabytes of data every hour. In the Internet industry in particular, video content sharing, live streaming and video-on-demand services are developing rapidly, and users read massive amounts of data, which poses a great challenge to the service capacity of a source station. To cope with highly concurrent access from hundreds of millions of customers, a reasonable load balancing strategy and a highly available architecture are indispensable, so that the peaks and troughs common in such services can be handled and cost is reduced as far as possible while customer experience is guaranteed.
Current solutions to this problem are roughly as follows:
In the first approach, a video-on-demand enterprise builds its own source station to store data and prepares enough resources for periodic bursts of traffic by scaling vertically, that is, by purchasing more servers and bandwidth. To keep the source station highly available under high concurrency, several stations are self-built to enable active-standby switching, and the source station is fully backed up. This scheme clearly wastes resources and is costly to build. A Content Delivery Network (CDN) must also be built to achieve dynamic load balancing, which challenges operation and maintenance, and user experience cannot be guaranteed. In addition, a self-built highly available architecture of multiple source stations requires an investment that is too high for the long-term development of the enterprise, and it responds to changes in peak traffic with little flexibility or reliability.
In the second approach, the video-on-demand enterprise builds a full main source station itself, or builds the main source station itself and also stores the full source data in public cloud object storage. CDN edge node caches are used to reduce the pressure of highly concurrent access on the underlying source station and to improve customers' access speed. However, the full backup approach requires paying heavily for storage capacity and bandwidth to maintain a highly available architecture.
Disclosure of Invention
The application provides a method, a device, a system and a storage medium for data backup and source data access, which address the problems of existing backup modes for massive source data, in which maintaining a highly available architecture under highly concurrent access occupies large amounts of storage and bandwidth resources, is costly, and is difficult to operate and maintain.
In a first aspect, an embodiment of the present application provides a data backup method, including:
reading source data stored in a main source station;
backing up the read source data to at least two first standby source stations, wherein the at least two first standby source stations respectively adopt different public cloud object storage spaces;
wherein each first standby source station backs up part of the source data in the main source station; and, when a first standby source station does not contain the requested data, its back-to-source destination address is another standby source station or the main source station.
Optionally, after reading the source data stored in the master source station, the method further includes:
backing up the read source data to at least one second standby source station, wherein the second standby source station stores the source data of which the data amount is not less than the preset proportion of the data amount in the main source station;
the source return destination address of the second standby source station is the main source station, and the source return destination address of the first standby source station is the second standby source station.
Optionally, a sum of data amounts of the source data stored in each of the second standby source stations is equal to a total amount of source data in the main source station.
Optionally, the method further comprises:
and determining the source returning priority of each first standby source station, and configuring the source returning rule of the first standby source station according to the source returning priority.
Optionally, configuring a back-source rule of the first standby source station according to the back-source priority includes:
configuring the back-to-source rule of the first standby source station as follows: the back-to-source destination address is the address of a higher-priority first standby source station or the address of the main source station.
Optionally, the types of source data stored in different first standby stations are different, and/or the types of source data stored in different second standby stations are different.
Optionally, the type of the source data is divided according to the access heat of the source data.
Optionally, the correspondence between the address of each first standby source station and the type of the source data is stored in a central node of the content distribution network.
In a second aspect, an embodiment of the present application provides a method for accessing source data, where the source data is backed up by using the data backup method in the first aspect, and the method is applied to any one of the first backup source stations, and includes:
acquiring a data access request;
determining whether the source data requested by the data access request has been backed up locally;
if so, returning the source data;
otherwise, acquiring the source data from another standby source station or the main source station according to the back-to-source destination address, and returning the source data.
Optionally, the obtaining the source data from other standby source stations according to the source return destination address includes:
acquiring the source data from a second standby source station according to the back-to-source destination address, wherein, when the second standby source station has not backed up the source data, the second standby source station acquires the source data from the main source station and returns the source data;
or,
acquiring the source data from a higher-priority first standby source station according to the back-to-source destination address, wherein, when the higher-priority first standby source station has not backed up the source data, it acquires the source data from the main source station and returns the source data.
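The access flow of the second aspect can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the `Station` class, its `handle` method and the in-memory dictionaries are hypothetical stand-ins for object storage spaces and their back-to-source configuration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Station:
    """Hypothetical model of a source station; not an API from the application."""
    name: str
    objects: Dict[str, bytes] = field(default_factory=dict)
    back_source: Optional["Station"] = None  # None models the main source station

    def handle(self, key: str) -> Optional[bytes]:
        # Return the data directly if this station has backed it up.
        if key in self.objects:
            return self.objects[key]
        # Otherwise go back to the configured destination, if there is one.
        if self.back_source is None:
            return None
        return self.back_source.handle(key)


main = Station("main", {"k1": b"v1", "k2": b"v2", "k3": b"v3"})
second_standby = Station("second-standby", {"k1": b"v1", "k2": b"v2"}, back_source=main)
first_standby = Station("first-standby", {"k1": b"v1"}, back_source=second_standby)

local_hit = first_standby.handle("k1")   # found in the first standby station
via_second = first_standby.handle("k2")  # found in the second standby station
via_main = first_standby.handle("k3")    # found only in the main source station
```

The sketch only returns the data; whether an intermediate station also stores what it fetched depends on the back-to-source feature of the object storage in use.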
In a third aspect, an embodiment of the present application provides a system for accessing source data, where the source data is backed up by using the data backup method of the first aspect, and the system includes at least two first backup source stations and a primary source station;
any one of the first standby source stations is used for acquiring a data access request and determining whether the source data requested by the data access request has been backed up locally; if so, returning the source data; otherwise, acquiring the source data from another standby source station or the main source station according to the back-to-source destination address, and returning the source data;
the master source station is configured to provide source data for the first standby source station or the other standby source stations.
Optionally, the system further comprises: at least one second standby source station, wherein source data with the data volume not less than the preset proportion in the main source station is stored in the second standby source station;
the first standby source station is used for acquiring the source data from the second standby source station according to the source return destination address and returning the source data;
and the second standby source station is used for acquiring the source data from the main source station and returning the source data to the first standby source station when the source data requested by the first standby source station is not included.
Optionally, the system further comprises a central node and an edge node of the content distribution network;
the edge node is used for acquiring the data access request submitted by a client and returning source data to the client when caching the source data requested by the data access request; when the source data requested by the data access request is not cached, forwarding the data access request to the central node;
the central node is configured to obtain the data access request forwarded by the edge node, obtain an address of the first standby source station corresponding to a type of source data requested by the data access request, and forward the data access request according to the address of the first standby source station.
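The division of work between the edge node and the central node might look like the sketch below. It is an assumption-laden illustration: the class names, the `fetch` callback, the example addresses and the choice to cache the result on the edge node after a miss are all hypothetical and are not specified by the application.

```python
from typing import Callable, Dict, List, Optional, Tuple


class CenterNode:
    """Hypothetical central node: maps a data type to a first standby station address."""

    def __init__(self, type_of_key: Dict[str, str], address_of_type: Dict[str, str],
                 fetch: Callable[[str, str], Optional[bytes]]):
        self.type_of_key = type_of_key
        self.address_of_type = address_of_type
        self.fetch = fetch  # stands in for forwarding the request over the network

    def forward(self, key: str) -> Optional[bytes]:
        address = self.address_of_type[self.type_of_key[key]]
        return self.fetch(address, key)


class EdgeNode:
    """Hypothetical edge node: serves from cache, otherwise asks the central node."""

    def __init__(self, center: CenterNode):
        self.center = center
        self.cache: Dict[str, bytes] = {}

    def get(self, key: str) -> Optional[bytes]:
        if key in self.cache:
            return self.cache[key]
        data = self.center.forward(key)
        if data is not None:
            self.cache[key] = data  # assumption: the edge caches what it fetched
        return data


stations = {"hot-bucket.example": {"a.mp4": b"A"}, "warm-bucket.example": {"b.mp4": b"B"}}
forwarded: List[Tuple[str, str]] = []


def fetch(address: str, key: str) -> Optional[bytes]:
    forwarded.append((address, key))
    return stations[address].get(key)


center = CenterNode({"a.mp4": "hot", "b.mp4": "warm"},
                    {"hot": "hot-bucket.example", "warm": "warm-bucket.example"}, fetch)
edge = EdgeNode(center)
first = edge.get("a.mp4")   # miss on the edge, forwarded to the hot-data station
second = edge.get("a.mp4")  # served from the edge cache
other = edge.get("b.mp4")   # forwarded to the warm-data station
```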
In a fourth aspect, an embodiment of the present application provides a data backup apparatus, including:
the reading module is used for reading the source data stored in the main source station;
the backup module is used for backing up the read source data to at least two first standby source stations, and the at least two first standby source stations respectively adopt different public cloud object storage spaces;
wherein each first standby source station backs up part of the source data in the main source station; and, when a first standby source station does not contain the requested data, its back-to-source destination address is another standby source station or the main source station.
In a fifth aspect, an embodiment of the present application provides a data backup system, including: the system comprises a main source station and at least two first standby source stations, wherein the at least two first standby source stations respectively adopt different public cloud object storage spaces;
the master source station is used for storing source data;
the first standby source station is used for backing up part of the source data read from the main source station, and, when the first standby source station does not contain the requested data, its back-to-source destination address is another standby source station or the main source station.
In a sixth aspect, an embodiment of the present application provides an electronic device, including: a processor and a memory;
the memory for storing a computer program;
the processor is configured to execute the program stored in the memory, and implement the data backup method according to the first aspect, or implement the method for accessing source data according to the second aspect.
In a seventh aspect, an embodiment of the present application provides a computer-readable storage medium, which stores a computer program, and when the computer program is executed by a processor, the computer program implements the data backup method according to the first aspect, or implements the method for accessing source data according to the second aspect.
Compared with the prior art, the technical solution provided by the embodiments of the application has the following advantages. A main source station storing source data is set up, and at least two different public cloud object storage spaces are used to back up the source data in the main source station, yielding at least two first standby source stations, each of which backs up only part of the source data in the main source station. This avoids the large storage and bandwidth consumption and the high cost of fully backing up the main source station's source data in a public cloud. In addition, because the first standby source stations are built on different public cloud object storage spaces, a highly available architecture under high concurrency can be achieved.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without inventive exercise.
FIG. 1 is a functional diagram of a mirror image back source in an embodiment of the present application;
FIG. 2 is a diagram illustrating a specific process of data backup in an embodiment of the present application;
FIG. 3 is a diagram of a source data storage architecture in an embodiment of the present application;
FIG. 4 is a flowchart illustrating a method for accessing source data according to an embodiment of the present application;
FIG. 5 is a diagram illustrating an architecture of a system for accessing source data according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a data backup device according to an embodiment of the present application;
FIG. 7 is a block diagram of a data backup system according to an embodiment of the present disclosure;
FIG. 8 is a schematic structural diagram of an electronic device in an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Object storage is Internet-oriented distributed storage. It supports reading, writing and managing files at any time and from any place over the HTTP/HTTPS protocols, supports a standard Representational State Transfer application programming interface (REST API), and provides the client with unlimited storage space through a flat storage architecture. It is a highly reliable, highly available, low-cost storage mode that can be expanded without limit, is suitable for storing massive unstructured data, and supports elastic scaling of resources. Public clouds mainly adopt the object storage mode.
Most public clouds provide a mirror back-to-source function, whose main use case is seamlessly migrating data to cloud object storage. Mirror back-to-source means that, once a back-to-source rule is configured, when the data requested by a user does not exist in the object storage, the requested data is acquired from the configured back-to-source destination address according to the rule. At least two source station domain names are generally configured in the back-to-source rule as back-to-source destination addresses.
For example, as shown in the functional diagram of mirror back-to-source in FIG. 1, suppose mirror back-to-source is configured on a bucket. When a user sends an access request to the object storage and the requested data does not exist there, the data is pulled from the customer's source station, stored in the object storage, and returned to the user at the same time. That is, when a user performs a GET operation on a file that does not exist in the bucket, the Object Storage Service (OSS) requests the file from the back-to-source destination address (domain name); after obtaining the file, the OSS returns it to the user and stores it in the bucket.
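The mirror back-to-source behaviour described above can be illustrated with a small in-memory model. This is only a sketch: `MirrorBucket`, `origin_fetch` and the request counter are hypothetical names, and no real object storage API is being reproduced.

```python
from typing import Callable, Dict, Optional


class MirrorBucket:
    """Toy bucket with a single mirror back-to-source rule (hypothetical model)."""

    def __init__(self, origin_fetch: Callable[[str], Optional[bytes]]):
        self.objects: Dict[str, bytes] = {}
        self.origin_fetch = origin_fetch  # stands in for the back-to-source address
        self.origin_requests = 0

    def get(self, key: str) -> Optional[bytes]:
        if key in self.objects:
            return self.objects[key]
        # Miss: request the file from the back-to-source destination.
        self.origin_requests += 1
        data = self.origin_fetch(key)
        if data is not None:
            self.objects[key] = data  # keep a copy so later reads are local
        return data


origin = {"video/a.mp4": b"AAAA"}
bucket = MirrorBucket(origin.get)
first = bucket.get("video/a.mp4")   # triggers one back-to-source request
second = bucket.get("video/a.mp4")  # served from the bucket itself
```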
In the embodiment of the application, in order to enable a backup framework of mass data to ensure high availability of mass source data, save occupied storage capacity and bandwidth as much as possible and reduce cost, a data backup method is provided.
The data backup method provided in the first embodiment of the present application may be embedded in any electronic device in the form of a software program, for example, embedded in a master source station of a data operator.
As shown in fig. 2, in the embodiment of the present application, a specific process of data backup includes:
step 201, reading the source data stored in the master source station.
In a specific embodiment, the main source station may be constructed by the data operator on a public cloud object storage space, or may be constructed by the data operator on its own equipment (self-built for short). The main source station is also referred to as the primary (first-level) source station.
The advantage of a self-built main source station is as follows. Data resources are a valuable asset of the data operator. If the operator relies entirely on public cloud storage, migrating the data to a self-built source station in the future is time-consuming and risky. For data operators of medium size or larger, building the primary source station themselves saves time cost and offers higher security.
Step 202, backing up the read source data to at least two first standby source stations, wherein the at least two first standby source stations respectively adopt different public cloud object storage spaces.
Each public cloud object storage space, that is, each first standby source station, backs up part of the source data in the main source station. The back-to-source rule of a first standby source station defines the back-to-source destination address used when the station does not contain the requested data; this address is another standby source station or the main source station. For example, the back-to-source rule of a first standby source station may be to send a back-to-source request to the main source station when the first standby source station does not contain the requested data.
By adopting a plurality of different public cloud object storage spaces to backup source data, the multi-cloud backup effect is achieved, the problem that backup source data cannot be used due to the fact that a certain public cloud service is out of order can be effectively avoided, and high availability of backup source data is achieved. Moreover, the characteristic that the mirror image back source supports the setting of a plurality of source stations is utilized, the framework of the main and standby source stations and even a plurality of backup source stations can be realized, and the high availability of the source data is realized.
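Steps 201 and 202 can be pictured with the following sketch, in which each first standby source station receives only a subset of the main source station's data. The function `back_up_subset` and the dictionaries standing in for two different public cloud object storage spaces are illustrative assumptions; how the subsets are chosen is left open here.

```python
from typing import Dict, Iterable, List


def back_up_subset(main: Dict[str, bytes], standby: Dict[str, bytes],
                   keys: Iterable[str]) -> List[str]:
    """Copy the selected objects from the main station into one standby station."""
    copied = []
    for key in keys:
        if key in main:                # step 201: read from the main source station
            standby[key] = main[key]   # step 202: write to the standby station
            copied.append(key)
    return copied


main_station = {"a": b"1", "b": b"2", "c": b"3", "d": b"4"}
standby_cloud_x: Dict[str, bytes] = {}  # first standby station on one public cloud
standby_cloud_y: Dict[str, bytes] = {}  # first standby station on a different public cloud

copied_x = back_up_subset(main_station, standby_cloud_x, ["a", "b"])
copied_y = back_up_subset(main_station, standby_cloud_y, ["c", "not-in-main"])
```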
In a specific embodiment, in addition to the main source station and the plurality of first standby source stations, at least one second standby source station may be deployed to backup the source data read from the main source station to the at least one second standby source station, where the second standby source station stores source data not less than a preset proportion of data amount in the main source station. And the source returning destination address of the second standby source station is the main source station, and the source returning destination address of the first standby source station is the second standby source station. In this embodiment, at least one second standby source station is arranged to store source data, so that high availability of the source data under high concurrency can be further ensured, and data security can be further ensured. In addition, a non-full backup mode is adopted, so that the problems of high storage space occupation and high bandwidth caused by full backup are solved, and high construction cost is avoided.
In an exemplary embodiment, a sum of the data amounts of the source data stored in each of the second standby source stations is equal to a total amount of the source data in the main source station. In this embodiment, the source data in the master source station is dispersedly deployed in different second backup source stations, so that the pressure of the master source station is effectively relieved, load balancing is guaranteed to be achieved under high concurrency, and the bandwidth requirement of the public cloud can be dynamically expanded in a traffic peak period.
For example, two second standby source stations are deployed, wherein one second standby source station stores not less than 60% of source data in the main source station, the other second standby source station stores not less than 40% of source data in the main source station, and the sum of the data stored in the two second standby source stations is equal to the total amount of the source data in the main source station.
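A rough sketch of such a 60/40 division is shown below. It assumes, for simplicity, that the share is measured by object count rather than by data volume and that the two stations hold disjoint sets; `split_by_percent` is a hypothetical helper, and the example uses ten objects so that the split is exact.

```python
from typing import List, Sequence, Tuple


def split_by_percent(keys: Sequence[str], first_percent: int) -> Tuple[List[str], List[str]]:
    """Split keys into two disjoint groups; the first gets at least first_percent of them."""
    ordered = sorted(keys)
    cut = (len(ordered) * first_percent + 99) // 100  # ceiling with integer arithmetic
    return ordered[:cut], ordered[cut:]


all_keys = [f"obj-{i:02d}" for i in range(10)]
station_a, station_b = split_by_percent(all_keys, 60)
```

Together the two groups cover every object in the main source station, which matches the condition that the stored amounts sum to the total.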
The main source station is a primary source station, the at least one second standby source station is a secondary source station, and the at least two first standby source stations constructed by utilizing the public cloud object storage space are tertiary source stations.
The second standby source station can also be implemented directly in a public cloud. In the data operator's initial build-out period, the total number of second and first standby source stations can be kept at 2 to 3. The second standby source station and the first standby source stations may be deployed in different public clouds.
In a specific embodiment, when a plurality of first standby source stations are configured, the source returning priority of each first standby source station is determined, and the source returning rule of the first standby source stations is configured according to the source returning priority. Specifically, the source returning rule of the first standby source station is configured as follows: the back source destination address is the address of the first standby source station with high priority or the address of the main source station. By setting the source return priority, a hierarchical data backup architecture can be realized, and the high availability of source data is improved.
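One possible way to derive back-to-source rules from priorities is sketched below. The convention that a larger number means a higher priority, and the choice to point each station at the next-higher-priority station (with the highest one pointing at the main source station), are assumptions made for illustration; the addresses are placeholders.

```python
from typing import Dict


def build_back_source_rules(priorities: Dict[str, int], main_address: str) -> Dict[str, str]:
    """Map each first standby station address to its back-to-source destination."""
    ordered = sorted(priorities, key=lambda addr: priorities[addr], reverse=True)
    rules: Dict[str, str] = {}
    for index, address in enumerate(ordered):
        # The highest-priority station goes back to the main source station;
        # every other station goes back to the station ranked just above it.
        rules[address] = main_address if index == 0 else ordered[index - 1]
    return rules


rules = build_back_source_rules(
    {"standby-a.example": 2, "standby-b.example": 1},
    main_address="main.example",
)
```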
In one embodiment, the types of source data stored in different first standby source stations are different. The type of the source data is divided according to, for example, the access heat of the source data. For example, where two first standby source stations are provided, one stores hot data and the other stores warm data. It should be noted that the division into warm data and hot data is only an example; the source data may also be divided into types in other ways. Storing different types of source data in different first standby source stations achieves load balancing under high concurrency and effectively relieves pressure on the main source station.
In one embodiment, where multiple second standby stations are constructed, different ones of the second standby stations store source data of different types. The type of the source data is divided according to the access heat of the source data and the like. By storing different types of source data to different second standby source stations for storage, high-concurrency load balancing can be achieved, and pressure of the main source station is effectively relieved.
In a specific embodiment, assume that service distribution buckets are created in two different cloud object storage services, that is, two first standby source stations, together with a second standby source station built in a public cloud and a self-built main source station, and that the back-to-source rule is: go back first to the second standby source station and then to the self-built main source station. No less than 90% of the source data is backed up in the second standby source station to improve availability; if cost is a concern, the second standby source station may be omitted. Hot data, accounting for about 30% of the total source data, is stored in one service distribution bucket, and warm data, accounting for about 50% of the total source data, is stored in the other. This ensures that the self-built main source station achieves load balancing under high concurrency and effectively relieves the pressure on any single source station. During traffic peaks, public cloud bandwidth is expanded dynamically, which avoids high construction cost; during traffic troughs, cloud consumption falls as access decreases. With the elastic scaling and pay-as-you-go model of the public cloud, the data operator's operation and maintenance cost can be effectively reduced.
In a specific embodiment, the correspondence between the address of each first standby source station and the type of the source data is stored in a central node of a Content Delivery Network (CDN). The scheme can thus be combined with a CDN: when hot data is cached on CDN edge nodes, end users can access it from a nearby node, which reduces latency and relieves pressure on the source stations. The CDN's data preheating and automatic refresh functions for object storage can also be used, which effectively improves the CDN's hit rate and data accuracy and improves the end-user experience.
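A division of source data by access heat could be sketched as follows. The ranking by access count, the use of object counts instead of data volume for the 30% and 50% shares, and the name `classify_by_heat` are all assumptions for illustration.

```python
from typing import Dict, List


def classify_by_heat(access_counts: Dict[str, int], hot_percent: int = 30,
                     warm_percent: int = 50) -> Dict[str, List[str]]:
    """Rank objects by access count and cut the ranking into hot, warm and other."""
    ranked = sorted(access_counts, key=lambda key: access_counts[key], reverse=True)
    hot_end = len(ranked) * hot_percent // 100
    warm_end = len(ranked) * (hot_percent + warm_percent) // 100
    return {
        "hot": ranked[:hot_end],           # goes to one service distribution bucket
        "warm": ranked[hot_end:warm_end],  # goes to the other bucket
        "other": ranked[warm_end:],        # left to the upstream source stations
    }


access_counts = {f"obj-{i}": 1000 - 100 * i for i in range(10)}
tiers = classify_by_heat(access_counts)
```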
The CDN's data preheating function means that nodes in the CDN actively download source data from the main source station or a standby source station and cache it on edge nodes. This differs from the process in which the CDN fetches source data from a source station in response to an access request forwarded by an edge node. Instead, the service party determines, according to the service scenario, the hotspot data for a period of time, that is, data likely to be accessed by users, and downloads it in advance to the CDN center node and edge nodes for storage. This is called data preheating. It effectively reduces the CDN's back-to-source pressure on source stations at every level and lets clients obtain the required resources more quickly.
Automatic refresh means that when data in a standby source station of a cloud manufacturer is updated, the CDN is triggered to refresh its node caches automatically: the updated data is pulled from the standby source station and cached. This ensures that the source data stored in the CDN central node and edge nodes is correct, so that the source data a client obtains from a CDN node is correct, which improves data accuracy.
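The automatic refresh behaviour can be sketched in the same simplified style. The class `StandbySourceStation` and the choice to refresh only those caches that already hold the updated key are assumptions of this sketch; the embodiment only states that an update triggers a cache refresh.

```python
from typing import Dict, List


class StandbySourceStation:
    """In-memory stand-in for a public cloud object storage bucket."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}
        self.cdn_caches: List[Dict[str, bytes]] = []  # caches notified on update

    def put(self, key: str, data: bytes) -> None:
        """Store or update an object, then refresh stale CDN copies."""
        self.objects[key] = data
        # Refresh only caches that already hold the key, so that an update
        # does not turn into an unrequested preheat.
        for cache in self.cdn_caches:
            if key in cache:
                cache[key] = data


station = StandbySourceStation()
center = {"logo.png": b"v1"}   # the central node cached the old version
edge: Dict[str, bytes] = {}    # this edge node never cached the object
station.cdn_caches = [center, edge]
station.put("logo.png", b"v2")
```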
The source data storage architecture shown in fig. 3 includes a self-built first-level source station (i.e., a main source station for full storage), a public cloud manufacturer 1-object storage (standby source station), a public cloud manufacturer 2-object storage (service distribution bucket), and a public cloud manufacturer 3-object storage (service distribution bucket).
The terminal sends an access request of source data to the CDN edge node, the edge node forwards the access request to the CDN center node, the CDN center node obtains a corresponding destination address according to an identifier of the source data carried in the access request, and the access request is forwarded to a public cloud manufacturer 1-object storage, a public cloud manufacturer 2-object storage or a public cloud manufacturer 3-object storage according to the destination address.
Assuming that the destination address is public cloud manufacturer 1-object storage, the CDN central node forwards the access request to public cloud manufacturer 1-object storage. If the source data does not exist in public cloud manufacturer 1-object storage, the latter obtains the source data from the main source station through mirror back-to-source 1, stores the source data locally, and at the same time returns the source data to the end user.
Assuming that the destination address is public cloud manufacturer 2-object storage, the CDN central node forwards the access request to public cloud manufacturer 2-object storage. If the source data does not exist in public cloud manufacturer 2-object storage, the source data is obtained from public cloud manufacturer 1-object storage through mirror back-to-source 1; if the source data does not exist in public cloud manufacturer 1-object storage either, the source data is obtained from the main source station through mirror back-to-source 2. Public cloud manufacturer 2-object storage then stores the source data locally and returns it to the end user.
When the destination address is public cloud manufacturer 3-object storage, the source data acquisition process is the same as that described for public cloud manufacturer 2-object storage.
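The mirror back-to-source chain of fig. 3 can be illustrated with the following simplified in-memory sketch. The class `Store` and the object key are hypothetical; the sketch also assumes that every tier on the path keeps a local copy of data fetched from upstream, which the description states explicitly only for the tier that received the request.

```python
from typing import Dict, Optional


class Store:
    """In-memory stand-in for one storage tier in fig. 3."""

    def __init__(self, name: str, upstream: Optional["Store"] = None) -> None:
        self.name = name
        self.upstream = upstream  # back-to-source target; None for the main source station
        self.objects: Dict[str, bytes] = {}

    def get(self, key: str) -> bytes:
        if key in self.objects:
            return self.objects[key]
        if self.upstream is None:
            raise KeyError(key)  # the main source station has no further upstream
        data = self.upstream.get(key)  # mirror back-to-source
        self.objects[key] = data       # keep a local copy for later requests
        return data


main = Store("self-built main source station")
cloud1 = Store("public cloud manufacturer 1-object storage", upstream=main)
cloud2 = Store("public cloud manufacturer 2-object storage", upstream=cloud1)
cloud3 = Store("public cloud manufacturer 3-object storage", upstream=cloud1)

main.objects["video/1.ts"] = b"payload"
data = cloud2.get("video/1.ts")  # misses in cloud2 and cloud1, served by main
```

After the call, a second request for the same key to `cloud2` is answered locally without touching `cloud1` or the main source station.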
In this embodiment of the application, a main source station is set up to store the source data, and the source data in the main source station is backed up to different public cloud object storage spaces, yielding at least two first standby source stations, each of which backs up part of the source data in the main source station. This avoids the large consumption of storage and bandwidth resources, and the resulting high cost, of backing up the full source data of the main source station in a public cloud. Compared with an entirely self-built source data storage architecture, it also reduces operation and maintenance difficulty while lowering cost. In addition, because the first standby source stations use different public cloud object storage spaces, a highly available architecture under high concurrency can be achieved.
The method uses multiple public cloud object storage spaces as the standby source station architecture and takes advantage of the flexibility of object storage, which is ready to use immediately. This avoids the time cost of self-construction and allows an enterprise to quickly meet the demands of expanding or shrinking its own business. Initially, 2 to 3 different cloud manufacturers may be selected to provide the object storage service.
The second embodiment of the present application further provides a method for accessing source data, where the source data is backed up by using the data backup method of the first embodiment, and the method for accessing source data may be applied to any one of the first standby source stations. As shown in fig. 4, the specific process of accessing the source data is as follows:
Step 401, a data access request is obtained.
In a specific embodiment, a corresponding relationship between an address of a first standby source station and a type of source data is stored in a central node of the CDN, and after acquiring a data access request of a user, the central node of the CDN acquires an address of the first standby source station corresponding to the type of source data requested by the data access request, and forwards the data access request according to the address.
Step 402, determining whether the source data requested by the data access request has been backed up locally; if yes, executing step 403, otherwise, executing step 404.
Step 403, returning the source data.
Step 404, acquiring the source data from another standby source station or the main source station according to the back-to-source destination address, and returning the source data.
Specifically, when the configured back-to-source destination address is the address of the second standby source station, the first standby source station, in the course of steps 401 to 404, acquires the source data from the second standby source station according to the back-to-source destination address; when the second standby source station has not backed up the source data, the second standby source station acquires the source data from the main source station and returns it;
or,
when the configured back-to-source destination address is the address of a first standby source station with higher priority, the first standby source station, in the course of steps 401 to 404, acquires the source data from the higher-priority first standby source station according to the back-to-source destination address. When the higher-priority first standby source station has not backed up the source data, it acquires the source data from the main source station and returns it; in this case, the back-to-source destination address configured for the higher-priority first standby source station is the address of the main source station.
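The priority-based alternative can be illustrated by a small sketch that derives a back-to-source rule for each first standby source station from a priority ordering. The function name, the example addresses, and the choice that every lower-priority station returns to the single highest-priority station are assumptions of this sketch; the description covers the two-station case and does not fix the behaviour for longer orderings.

```python
from typing import Dict, List

MAIN_SOURCE = "https://origin.example.com"  # hypothetical main source station address


def configure_back_to_source(
    stations_by_priority: List[str],
    main_source: str = MAIN_SOURCE,
) -> Dict[str, str]:
    """Map each first standby source station address to its back-to-source address.

    stations_by_priority lists addresses from highest to lowest priority.
    """
    rules: Dict[str, str] = {}
    for index, station in enumerate(stations_by_priority):
        if index == 0:
            # The highest-priority station returns to the main source station.
            rules[station] = main_source
        else:
            # Assumption: every other station returns to the highest-priority one.
            rules[station] = stations_by_priority[0]
    return rules


rules = configure_back_to_source(
    ["https://bucket-a.example.com", "https://bucket-b.example.com"]
)
```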
In this embodiment, source data backed up by the data backup method of the first embodiment is accessed as follows: when the first standby source station stores the requested source data, it returns the source data directly; when it does not, it obtains the source data from another standby source station or the main source station based on its configured back-to-source destination address. This achieves high availability of the source data under high concurrency.
In a third embodiment of the present application, a system for accessing source data is further provided, where the source data is backed up by using the data backup method provided in the first embodiment, as shown in fig. 5, the system includes at least two first standby source stations 501 and a main source station 502;
any first standby source station 501 is configured to obtain a data access request and determine whether the source data requested by the data access request has been backed up; if so, to return the source data; otherwise, to obtain the source data from another standby source station or the main source station according to the back-to-source destination address, and return the source data;
the main source station 502 is configured to provide source data to the first standby source station or other standby source stations.
In one embodiment, the system further comprises at least one second standby source station 503, wherein the second standby source station stores source data amounting to no less than a preset proportion of the data in the main source station. The first standby source station 501 is configured to obtain source data from the second standby source station 503 according to the back-to-source destination address, and return the source data. The second standby source station 503 is configured to, when it does not contain the source data requested by the first standby source station 501, obtain the source data from the main source station 502 and return the source data to the first standby source station 501.
In a specific embodiment, the system further includes a central node 504 and an edge node 505 of the CDN.
The edge node 505 is configured to obtain a data access request submitted by a client, and when source data requested by the data access request is cached, return the source data to the client; when the source data requested by the data access request is not cached, the data access request is forwarded to the central node 504.
The central node 504 is configured to obtain the data access request forwarded by the edge node 505, obtain an address of a first standby source station corresponding to the type of source data requested by the data access request, and forward the data access request to the first standby source station 501 according to the address of the first standby source station.
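The cooperation of the edge node 505 and the central node 504 might be sketched as follows. The lookup tables, addresses, and the `fake_fetch` stand-in for a network request are all hypothetical, and the sketch assumes that the edge node caches whatever it receives on a miss, which is customary for a CDN but is not spelled out in this embodiment.

```python
from typing import Callable, Dict

# Hypothetical tables held by the CDN central node.
TYPE_OF_OBJECT: Dict[str, str] = {
    "news/today.html": "hot",
    "archive/2019.html": "warm",
}
STATION_FOR_TYPE: Dict[str, str] = {
    "hot": "https://hot-bucket.example.com",
    "warm": "https://warm-bucket.example.com",
}


def central_node_route(key: str) -> str:
    """Return the address of the first standby source station serving this key."""
    return STATION_FOR_TYPE[TYPE_OF_OBJECT[key]]


def edge_node_handle(
    key: str,
    edge_cache: Dict[str, bytes],
    fetch: Callable[[str, str], bytes],
) -> bytes:
    """Serve from the edge cache, or forward through the central node on a miss."""
    if key in edge_cache:
        return edge_cache[key]
    address = central_node_route(key)
    data = fetch(address, key)
    edge_cache[key] = data  # assumption: the edge node caches the response
    return data


requests_sent = []


def fake_fetch(address: str, key: str) -> bytes:
    """Stand-in for a request to a first standby source station."""
    requests_sent.append((address, key))
    return b"content of " + key.encode()


cache: Dict[str, bytes] = {}
first = edge_node_handle("news/today.html", cache, fake_fetch)
second = edge_node_handle("news/today.html", cache, fake_fetch)
```

Only the first call reaches a standby source station; the second is answered from the edge cache.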
In this embodiment, source data backed up by the data backup method of the first embodiment is accessed as follows: when the first standby source station stores the requested source data, it returns the source data directly; when it does not, it obtains the source data from another standby source station or the main source station based on its configured back-to-source destination address. This achieves high availability of the source data under high concurrency.
In combination with the CDN, part of the data is cached in CDN edge nodes, so that end users can access hotspot data from a nearby node, which reduces latency and at the same time relieves the pressure on the source stations.
Based on the same concept, a fourth embodiment of the present application provides a data backup apparatus. For the specific implementation of the apparatus, reference may be made to the description of the method embodiment; details already described are not repeated here. As shown in fig. 6, the apparatus mainly includes:
a reading module 601, configured to read source data stored in a main source station;
a backup module 602, configured to back up the read source data to at least two first standby source stations, where the at least two first standby source stations respectively use different public cloud object storage spaces;
wherein each first standby source station backs up part of the source data in the main source station, and the back-to-source destination address used by a first standby source station when it does not contain the requested data is another standby source station or the main source station.
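The work of the reading module and the backup module can be illustrated by the following sketch, in which dictionaries stand in for the main source station and for two first standby source stations. The function `back_up`, the type labels, and the rule that objects of an unassigned type remain only in the main source station are assumptions made for illustration.

```python
from typing import Dict


def back_up(
    main_objects: Dict[str, bytes],
    type_of: Dict[str, str],
    stations_by_type: Dict[str, Dict[str, bytes]],
) -> int:
    """Copy each object to the first standby source station that serves its type."""
    copied = 0
    for key, data in main_objects.items():  # reading module: read from the main source station
        station = stations_by_type.get(type_of.get(key, ""))
        if station is not None:  # backup module: write to the matching standby station
            station[key] = data
            copied += 1
    return copied


main_objects = {"a": b"1", "b": b"2", "c": b"3"}
type_of = {"a": "hot", "b": "warm", "c": "cold"}
hot_station: Dict[str, bytes] = {}
warm_station: Dict[str, bytes] = {}
copied = back_up(main_objects, type_of, {"hot": hot_station, "warm": warm_station})
```

Each standby station ends up holding only part of the source data, so a request for object "c" to either station would have to go back to source.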
Based on the same concept, a fifth embodiment of the present application further provides a data backup system. As shown in fig. 7, the system mainly includes a main source station 701 and at least two first standby source stations 702, wherein the at least two first standby source stations 702 respectively use different public cloud object storage spaces.
Specifically, the main source station 701 is configured to store source data.
The first standby source station 702 is configured to back up part of the source data read from the main source station 701, and the back-to-source destination address used by the first standby source station when it does not contain the requested data is another standby source station or the main source station.
Based on the same concept, a sixth embodiment of the present application further provides an electronic device, as shown in fig. 8, the electronic device mainly includes: a processor 801 and a memory 802, wherein the memory 802 stores programs executable by the processor 801, and the processor 801 executes the programs stored in the memory 802 to realize the following steps:
reading source data stored in a main source station; backing up the read source data to at least two first standby source stations, wherein the at least two first standby source stations respectively use different public cloud object storage spaces; wherein each first standby source station backs up part of the source data in the main source station, and the back-to-source destination address used by a first standby source station when it does not contain the requested data is another standby source station or the main source station.
Alternatively,
acquiring a data access request; determining whether the source data requested by the data access request has been backed up; if yes, returning the source data; otherwise, acquiring the source data from another standby source station or the main source station according to the back-to-source destination address, and returning the source data.
The processor 801 and the memory 802 in the electronic device may be connected through a communication bus, which may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in FIG. 8, but this is not intended to represent only one bus or type of bus.
The memory 802 may include a Random Access Memory (RAM) or a non-volatile memory, such as at least one disk memory. Alternatively, the memory may be at least one storage device located remotely from the processor 801.
The Processor 801 may be a general-purpose Processor, including a Central Processing Unit (CPU), a Network Processor (NP), etc., and may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic devices, discrete gates or transistor logic devices, and discrete hardware components.
In yet another embodiment of the present application, there is also provided a computer-readable storage medium having stored therein a computer program which, when run on a computer, causes the computer to execute the data backup method or the method of accessing source data described in the above embodiments.
The above embodiments may be implemented wholly or partially by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in accordance with the embodiments of the application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a network of computers, or another programmable device. The computer instructions may be stored on a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center by wire (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wirelessly (e.g., infrared, microwave, etc.). The computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The available media may be magnetic media (e.g., floppy disks, hard disks, tapes, etc.), optical media (e.g., DVDs), or semiconductor media (e.g., solid state drives), among others.
It is noted that, in this document, relational terms such as "first" and "second," and the like, may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The foregoing are merely exemplary embodiments of the present invention, which enable those skilled in the art to understand or practice the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (16)

1. A method for data backup, comprising:
reading source data stored in a main source station;
backing up the read source data to at least two first standby source stations, wherein the at least two first standby source stations respectively adopt different public cloud object storage spaces;
wherein each first standby source station backs up part of the source data in the main source station; and the source returning destination address of the first standby source station is other standby source stations or the main source station when the first standby source station does not contain the request data.
2. The data backup method of claim 1, wherein after reading the source data stored in the primary source station, the method further comprises:
backing up the read source data to at least one second standby source station, wherein the second standby source station stores the source data of which the data amount is not less than the preset proportion of the data amount in the main source station;
the source return destination address of the second standby source station is the main source station, and the source return destination address of the first standby source station is the second standby source station.
3. The data backup method of claim 2, wherein the sum of the data amounts of the source data stored in each of the second backup source stations is equal to the total amount of the source data in the primary source station.
4. The data backup method of claim 1, wherein the method further comprises:
and determining the source returning priority of each first standby source station, and configuring the source returning rule of the first standby source station according to the source returning priority.
5. The data backup method of claim 4, wherein the configuring the back-source rule of the first standby station according to the back-source priority comprises:
configuring a source returning rule of the first standby source station as follows: the back source destination address is the address of the first standby source station with high priority or the address of the main source station.
6. The data backup method according to claim 2, wherein the types of source data stored in different first backup stations are different, and/or the types of source data stored in different second backup stations are different.
7. The data backup method according to claim 6, wherein the type of the source data is divided by the access heat of the source data.
8. The data backup method according to claim 7, wherein the correspondence between the address of the first backup source station and the type of the source data is saved to a central node of a content distribution network.
9. A method for accessing source data, wherein the source data is backed up by using the data backup method of any one of claims 1 to 8, and the method is applied to any one of the first source-standby stations, and comprises:
acquiring a data access request;
judging whether the source data requested by the data access request has been backed up;
if yes, returning the source data;
otherwise, acquiring the source data from other standby source stations or the main source station according to the back-to-source destination address, and returning the source data.
10. The method of claim 9, wherein the obtaining the source data from other alternate source stations according to the back-to-source destination address comprises:
acquiring the source data from a second standby source station according to the return source destination address; when the second standby source station does not back up the source data, the second standby source station acquires the source data from the main source station and returns the source data;
or,
acquiring the source data from the first standby source station with high priority according to the back-to-source destination address, wherein the first standby source station with high priority acquires the source data from the main source station and returns the source data when it has not backed up the source data.
11. A system for accessing source data, wherein the source data is backed up using the data backup method of any one of claims 1 to 8, the system comprising at least two first backup source stations and a primary source station;
any one of the first standby source stations is used for acquiring a data access request and judging whether the source data requested by the data access request has been backed up; if so, returning the source data; otherwise, acquiring the source data from other standby source stations or the main source station according to the back-to-source destination address, and returning the source data;
the master source station is configured to provide source data for the first standby source station or the other standby source stations.
12. The system for accessing source data of claim 11, further comprising: at least one second standby source station, wherein source data with the data volume not less than the preset proportion in the main source station is stored in the second standby source station;
the first standby source station is used for acquiring the source data from the second standby source station according to the source return destination address and returning the source data;
and the second standby source station is used for acquiring the source data from the main source station and returning the source data to the first standby source station when the source data requested by the first standby source station is not included.
13. The system for accessing source data of claim 12, further comprising a central node and an edge node of a content distribution network;
the edge node is used for acquiring the data access request submitted by a client and returning source data to the client when caching the source data requested by the data access request; when the source data requested by the data access request is not cached, forwarding the data access request to the central node;
the central node is configured to obtain the data access request forwarded by the edge node, obtain an address of the first standby source station corresponding to a type of source data requested by the data access request, and forward the data access request according to the address of the first standby source station.
14. A data backup apparatus, comprising:
the reading module is used for reading the source data stored in the main source station;
the backup module is used for backing up the read source data to at least two first standby source stations, and the at least two first standby source stations respectively adopt different public cloud object storage spaces;
wherein each first standby source station backs up part of the source data in the main source station; and the source returning destination address of the first standby source station is other standby source stations or the main source station when the first standby source station does not contain the request data.
15. A data backup system, comprising: the system comprises a main source station and at least two first standby source stations, wherein the at least two first standby source stations respectively adopt different public cloud object storage spaces;
the master source station is used for storing source data;
the first backup source station is used for backing up part of source data read from the main source station, and a return source destination address of the first backup source station when the first backup source station does not contain the request data is other backup source stations or the main source station.
16. An electronic device, comprising: a processor and a memory;
the memory for storing a computer program;
the processor is configured to execute the program stored in the memory to implement the data backup method according to any one of claims 1 to 8, or to implement the method for accessing source data according to any one of claims 9 to 10.
CN202010922994.8A 2020-09-04 2020-09-04 Method, device, system and storage medium for data backup and source data access Pending CN112099991A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010922994.8A CN112099991A (en) 2020-09-04 2020-09-04 Method, device, system and storage medium for data backup and source data access

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010922994.8A CN112099991A (en) 2020-09-04 2020-09-04 Method, device, system and storage medium for data backup and source data access

Publications (1)

Publication Number Publication Date
CN112099991A true CN112099991A (en) 2020-12-18

Family

ID=73757351

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010922994.8A Pending CN112099991A (en) 2020-09-04 2020-09-04 Method, device, system and storage medium for data backup and source data access

Country Status (1)

Country Link
CN (1) CN112099991A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116170517A (en) * 2023-04-25 2023-05-26 中国人民解放军军事科学院系统工程研究院 Priority-based water flow cloud edge cooperative data unloading method
CN116170517B (en) * 2023-04-25 2023-06-27 中国人民解放军军事科学院系统工程研究院 Priority-based water flow cloud edge cooperative data unloading method
CN117294582A (en) * 2023-11-22 2023-12-26 畅捷通信息技术股份有限公司 High availability method, system and storage medium of multi-cloud content distribution network

Similar Documents

Publication Publication Date Title
US11734125B2 (en) Tiered cloud storage for different availability and performance requirements
US10826799B2 (en) Apparatus for providing cloud service based on cloud service brokerage and method thereof
WO2020177533A1 (en) Electronic invoice identifier allocation method, and electronic ticket generating method, device and system
US9740435B2 (en) Methods for managing content stored in cloud-based storages
US10528527B2 (en) File management in thin provisioning storage environments
TWI614703B (en) Information recommendation method and information recommendation device
CN108023953B (en) High-availability implementation method and device for FTP service
US11064041B2 (en) Apparatus for providing cloud service using cloud service brokerage based on multiple clouds and method thereof
CN110166523B (en) Content updating method, device, equipment and computer readable storage medium
CN103533006A (en) United cloud disk client, server, system and united cloud disk service method
Mansouri et al. Dynamic replication and migration of data objects with hot-spot and cold-spot statuses across storage data centers
US20120246206A1 (en) File server system and storage control method
US11593496B2 (en) Decentralized data protection system for multi-cloud computing environment
CN110825704B (en) Data reading method, data writing method and server
CN111212134A (en) Request message processing method and device, edge computing system and electronic equipment
US20170153909A1 (en) Methods and Devices for Acquiring Data Using Virtual Machine and Host Machine
CN112099991A (en) Method, device, system and storage medium for data backup and source data access
US11221993B2 (en) Limited deduplication scope for distributed file systems
US20120005274A1 (en) System and method for offering cloud computing service
CN109947373A (en) Data processing method and device
US10880376B1 (en) Downloading chunks of an object from a storage service while chunks of the object are being uploaded
CN108200151A (en) ISCSI Target load-balancing methods and device in a kind of distributed memory system
WO2023010948A1 (en) Cloud desktop data migration method, service node, management node, server, electronic device, and computer-readable storage medium
CN111831221A (en) Distributed storage method and system based on cloud storage
CN114979114B (en) Cloud application processing method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination