CN113300875A - Return source data verification method, server, system and storage medium

Info

Publication number
CN113300875A
CN113300875A CN202110185013.0A CN202110185013A CN113300875A CN 113300875 A CN113300875 A CN 113300875A CN 202110185013 A CN202110185013 A CN 202110185013A CN 113300875 A CN113300875 A CN 113300875A
Authority
CN
China
Prior art keywords
file
fragment
log record
source
log
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110185013.0A
Other languages
Chinese (zh)
Inventor
王俊奕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN202110185013.0A
Publication of CN113300875A

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06Management of faults, events, alarms or notifications
    • H04L41/069Management of faults, events, alarms or notifications using logs of notifications; Post-processing of notifications
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/12Applying verification of the received information

Abstract

The embodiments of the application provide a back-to-source data verification method, server, system, and storage medium. In the back-to-source data verification system, the log record that a CDN node uploads to the log server contains check information that the CDN node calculated over a file fragment fetched back from the source. The log server can then check the content consistency of the file fragments fetched by different CDN nodes, using the check information that those nodes calculated for the same file fragment. This verification relies on check values calculated by the CDN nodes that request the back-to-source data and does not depend on the source station, so the whole process is transparent to the source station, requires no changes on the source station side, and makes back-to-source data verification considerably more flexible.

Description

Return source data verification method, server, system and storage medium
Technical Field
The present application relates to the field of internet technologies, and in particular, to a method, a server, a system, and a storage medium for verifying back-to-source data.
Background
In a Content Delivery Network (CDN) download acceleration scenario, when a user accesses a URL (Uniform Resource Locator) and a CDN node has not cached the corresponding content, or its cached copy has expired, the CDN node can obtain the content from the source station by going back to the source.
However, in some scenarios the back-to-source connection is hijacked or the source station behaves abnormally, and the CDN node pulls dirty data from the source. Once the dirty data is distributed to clients, the clients may operate abnormally.
In the prior art, a CDN node may check the consistency of the content it pulled back from the source against the Content-MD5 HTTP (Hypertext Transfer Protocol) header of the source station's response. However, this verification method depends heavily on the source station and cannot implement back-to-source data verification flexibly. A new solution is therefore needed.
Disclosure of Invention
Various aspects of the present application provide a back-to-source data verification method, server, system, and storage medium, so as to reduce the dependency of back-to-source data consistency checks on the source station and make back-to-source data verification more flexible.
The embodiment of the application provides a back source data verification system, which comprises: the system comprises a plurality of CDN nodes, a source station and a log server; a first CDN node of the plurality of CDN nodes to: sending a fragment back-to-source request for the file to the source station; receiving the file fragments issued by the source station according to the fragment return request; calculating the verification information of the file fragment to generate a first log record containing the verification information of the file fragment, and sending the first log record to a log server; the log server is configured to: inquiring a second log record of the file fragment in the saved log records; the second log record comprises verification information calculated by the second CDN node on the file fragments from the source; and performing consistency check on the file fragments according to the check information in the first log record and the check information in the second log record.
The embodiment of the present application further provides a method for verifying source-backed data, including: acquiring a first log record sent by a first CDN node, wherein the first log record comprises verification information calculated by the first CDN node on a file fragment from a source; inquiring a second log record of the file fragment in the saved log records; the second log record comprises verification information calculated by the second CDN node on the file fragments from the source; and performing consistency check on the file fragments according to the check information in the first log record and the check information in the second log record.
The embodiment of the present application further provides a back-to-source data verification method, applicable to a CDN node, including: sending a fragment back-to-source request for a file to a source station; receiving the file fragment issued by the source station in response to the fragment back-to-source request; calculating check information of the file fragment to generate a log record containing the check information; and sending the log record of the file fragment to a log server, so that the log server performs a consistency check on the file fragment according to the check information in the log record and the check information that other CDN nodes generated from the same file fragment fetched back from the source.
An embodiment of the present application further provides a CDN server, including: a memory, a processor, and a communication component; the memory is to store one or more computer instructions; the processor is to execute the one or more computer instructions to: and executing the steps in the back source data verification method provided by the embodiment of the application.
The embodiment of the present application further provides a computer-readable storage medium storing a computer program, and the computer program, when executed by a processor, can implement the steps in the source return data verification method provided in the embodiment of the present application.
In the back-to-source data verification system provided by the embodiment of the application, the log server can obtain the log records that different CDN nodes upload for the same file fragment, and the log record uploaded by each CDN node includes check information that the CDN node calculated over the file fragment fetched back from the source. The log server can then check the content consistency of the file fragments fetched by different CDN nodes, using the check information that those nodes calculated for the same file fragment. This verification relies on check values calculated by the CDN nodes that request the back-to-source data and does not depend on the source station, so the whole process is transparent to the source station, requires no changes on the source station side, and makes back-to-source data verification considerably more flexible.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
Fig. 1 is a schematic structural diagram of a back source data verification system according to an exemplary embodiment of the present application;
Fig. 2 is a schematic flowchart of a back source data verification method according to an exemplary embodiment of the present application;
Fig. 3 is a flowchart illustrating a back source data verification method according to an exemplary embodiment of the present application;
Fig. 4 is a schematic structural diagram of a CDN server according to an exemplary embodiment of the present application;
Fig. 5 is a schematic structural diagram of a log server according to an exemplary embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be described in detail and completely with reference to the following specific embodiments of the present application and the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
A Content Delivery Network (CDN) is composed of a data center and cache servers (CDN nodes). The cache servers are distributed in a region or a network where the user access is relatively concentrated, when the user accesses network resources, the data center can point the access of the user to the cache server closest to the user by using a global load technology, and the access delay of the user is greatly reduced.
In a CDN download acceleration scenario, when a user accesses a URL and a CDN node has not cached the corresponding content, or its cached copy has expired, the CDN node can obtain the content from the source station by going back to the source.
However, in some scenarios the CDN node pulls dirty data (a dirty read) from the source because the back-to-source connection is hijacked or the source station behaves abnormally. For example, the source station may respond to the CDN node with erroneous data because of a storage fault or a similar exception. Alternatively, the source station may comprise several clusters whose data are not synchronized, so that CDN nodes obtain different file contents from different clusters for the same URL. Dirty data refers to data that falls outside a given range or is meaningless for actual needs, data in an illegal format, or data with irregular encoding and ambiguous logic. Once dirty data is distributed to clients, the clients may operate abnormally, which degrades the user experience.
In order to ensure the accuracy of the back-source data, in the prior art, the CDN node side may perform consistency check on the pulled back-source Content based on the Content-MD5 HTTP header corresponding to the source station response file.
In this approach, the source station must return a Content-MD5 header so that the CDN node can check the content consistency of the back-to-source data. When the source station receives a back-to-source request from a CDN node, it calculates the MD5 value of the file and places it in an HTTP response header. After downloading the complete file from the source station, the CDN node calculates the MD5 value of the file and compares it with the MD5 value in the HTTP response header. If the two values match, the back-to-source file is determined to have content consistency; otherwise, it is determined not to have content consistency.
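The prior-art check described above can be sketched as follows. This is a minimal illustration, not the patent's implementation; it assumes the header carries the base64 encoding of the raw 16-byte MD5 digest, which is the conventional Content-MD5 encoding, and the function names are hypothetical.

```python
import base64
import hashlib


def content_md5(body: bytes) -> str:
    """Return a Content-MD5 style value: base64 of the raw 16-byte MD5 digest."""
    return base64.b64encode(hashlib.md5(body).digest()).decode("ascii")


def verify_content_md5(body: bytes, header_value: str) -> bool:
    """Compare the locally computed value with the value sent by the source station."""
    return content_md5(body) == header_value.strip()
```

The check only works when the source station sends the header at all, which is the dependency the present application sets out to remove.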
In practice, however, on the one hand, many source stations are unable to return a Content-MD5 HTTP header; on the other hand, calculating the MD5 value consumes computing resources on the source station and requires development work. As a result, some CDN download acceleration scenarios carry a higher risk of content consistency problems, and back-to-source data verification cannot be implemented flexibly.
In view of the above technical problems, in some embodiments of the present application, a solution is provided, and the technical solutions provided by the embodiments of the present application are described in detail below with reference to the accompanying drawings.
Fig. 1 is a schematic structural diagram of a back-source data verification system according to an exemplary embodiment of the present application, and as shown in fig. 1, the back-source data verification system 100 includes: a source station 10, a plurality of CDN nodes 20, and a log server 30.
In the back-to-source data verification system 100, the number of CDN nodes 20 may be two or more, which is not limited in this embodiment.
The content distribution network is a tree structure and mainly comprises three levels: a center level, a middle level, and an edge level. The central node in the central hierarchy is a root node in the tree structure, the CDN nodes in the middle hierarchy are child nodes of the CDN nodes in the central hierarchy, and the CDN nodes in the edge hierarchy are child nodes of the CDN nodes in the middle hierarchy.
The central nodes in the central hierarchy are mainly used for realizing the functions of load balancing, data content distribution, access scheduling and the like. The CDN node of the middle hierarchy is used for connecting the CDN node in the edge hierarchy and the central node of the central hierarchy so as to relieve the data access pressure of the central node in the central hierarchy. CDN nodes in the edge level are used for synchronizing the content of the source station, directly responding to the user request and quickly distributing the content to the user.
In this embodiment, the CDN node 20 that initiates the back-to-source request to the source station 10 may be implemented as any CDN node in the intermediate level.
Generally, when a client requests a CDN node at an edge level to obtain a certain file, if the CDN node at the edge level does not cache the file, the CDN node at the edge level may request a CDN node at an intermediate level connected to the CDN node at the edge level to obtain the file. If the intermediate-level CDN node does not store the file, it may request the source station to source the file back.
In a CDN download acceleration scenario, the CDN may enable the fragment back-to-source function. Fragment back-to-source is a back-to-source mode in which the CDN node asks the source station's server for only the part of the data content within a specified range. It helps accelerate the distribution of large files, reduces back-to-source traffic, and improves resource response speed.
Any CDN node of the plurality of CDN nodes is configured to send a fragment back-to-source request for the file to the source station 10.
When receiving a back-to-source request sent by the CDN node, the source station 10 may issue a file fragment to the CDN node. Generally, the back-source request of the CDN node includes a range of the file fragment (range) requested to be obtained, and the source station 10 may issue the corresponding file fragment according to the range of the file fragment sent by the CDN node.
For example, the range of the file fragment specified by the back-to-source request of the CDN node is 0 to 100, and the range of the file fragment responded by the source station 10 is 0 to 100.
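As an illustration of such a request, the sketch below builds the range header for a fragment. It assumes the fragment range is expressed as an HTTP byte range, which the text does not state explicitly; the function names are hypothetical.

```python
from typing import Dict


def range_header(first: int, last: int) -> Dict[str, str]:
    """Build the request header for the fragment [first, last], both ends inclusive."""
    if first < 0 or last < first:
        raise ValueError("invalid fragment range")
    return {"Range": f"bytes={first}-{last}"}


def fragment_length(first: int, last: int) -> int:
    """Number of bytes covered by an inclusive byte range."""
    return last - first + 1
```

Under this assumption, a range of 0 to 100 covers 101 bytes, because HTTP byte ranges include both end positions.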
After receiving the file fragments issued by the source station 10, the CDN nodes may generate check information of the file fragments, generate log records of the file fragments according to the check information of the file fragments, and send the log records to the log server 30.
As shown in Fig. 1, the log server 30 may establish communication connections with the CDN nodes 20 and receive the log records that each CDN node uploads for the file fragments it fetched back from the source.
Each time a log record of a certain file fragment uploaded by a CDN node is received, the log server 30 may query whether other log records of the file fragment have been saved in the saved log records. And if the other log records of the file fragment are not saved, saving the received log records. If other log records of the file fragment are stored, the content consistency of the file fragment is judged according to the verification information in the received log records and the verification information in the other log records of the file fragment.
Next, for convenience of description, the log record of the file fragment currently received by the log server 30 is described as a first log record, and the CDN node uploading the first log record is a first CDN node. The first log record includes check information generated by the first CDN node according to the file fragment returned to the source.
The log server 30 may query the saved log records for the existence of other log records for the file fragment. If not, saving the first log record into the saved log record. If other log records (marked as second log records) are inquired, consistency check can be performed on the file fragment according to the check information of the file fragment in the first log record and the check information of the file fragment in the second log record. And the second log record is uploaded by a second CDN node, and the verification information of the file fragment in the second log record is generated by the second CDN node according to the file fragment returned to the source. The second CDN node is different from the first CDN node.
After the log server 30 performs the consistency check on the back-to-source data, if the data is determined to have content consistency, the back-to-source operation can be considered free of exceptions. The intermediate-level CDN node that fetched the data can then send it to the edge-level CDN node, so that the edge-level CDN node synchronizes the content of the source station, responds directly to user requests, and distributes the content to users quickly.
In this embodiment, the CDN node may be deployed as a server device, which may include, but is not limited to, a conventional server, a cloud host, a virtual center, and other devices, and this embodiment is not limited to this. The server device mainly includes a processor, a hard disk, a memory, a system bus, and the like, and is similar to a general computer architecture, and is not described in detail.
In the back-to-source data verification system 100, the CDN node 20 and the source station 10, as well as the CDN node 20 and the log server 30, may communicate by wired or wireless means. Wireless communication includes short-range modes such as Bluetooth, ZigBee, infrared, and WiFi (Wireless Fidelity), long-range modes such as LoRa, and modes based on a mobile network. When the connection is made over a mobile network, the network standard may be any one of 2G (GSM), 2.5G (GPRS), 3G (WCDMA, TD-SCDMA, CDMA2000, UMTS), 4G (LTE), 4G+ (LTE+), 5G, WiMax, and the like.
In this embodiment, the log server can obtain the log records that different CDN nodes upload for the same file fragment, and the log record uploaded by each CDN node includes check information that the CDN node calculated over the file fragment fetched back from the source. The log server can then check the content consistency of the file fragments fetched by different CDN nodes, using the check information that those nodes calculated for the same file fragment. This verification relies on check values calculated by the CDN nodes that request the back-to-source data and does not depend on the source station, so the whole process is transparent to the source station, requires no changes on the source station side, and makes back-to-source data verification considerably more flexible.
In some optional embodiments, the CDN node may calculate a check value of the file fragment when receiving the file fragment returned to the source, and generate a log record according to the check information of the file fragment.
Optionally, the check value may be implemented as: at least one of an information digest value, a parity code value, and a gray code value of the file fragment.
Among them, a parity check is a method for checking whether a code was transmitted correctly. The check is based on whether the number of "1" bits in a transmitted group of binary code is odd or even. If the number is required to be odd, the scheme is called odd parity; if it is required to be even, it is called even parity. The plurality of CDN nodes may agree in advance on which scheme to use. Usually a dedicated parity bit is set so that the number of "1" bits in the group of code data is odd or even. During the consistency check, the number of "1" bits is checked against the agreed parity; details are not repeated here.
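A parity bit over a whole file fragment could be computed as in the sketch below. This is an illustrative assumption about how the check might be applied to a fragment; the function name and the `odd` parameter are hypothetical.

```python
def parity_bit(data: bytes, odd: bool = False) -> int:
    """Return the parity bit that makes the total number of 1 bits even (or odd)."""
    ones = sum(bin(byte).count("1") for byte in data)
    even_bit = ones % 2  # 1 when the data alone has an odd number of 1 bits
    return even_bit ^ 1 if odd else even_bit
```

A single parity bit detects only an odd number of flipped bits, so two fragments with different content can share the same parity; it is a much weaker check than a message digest.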
Among them, Gray code is also called cyclic binary code or reflective binary code. In digital systems, various data needs to be converted into binary code for processing. Gray code is an unweighted code that encodes binary data. The plurality of CDN nodes can be agreed in advance to encode the file fragments downloaded from the source back by adopting Gray codes so as to realize consistency check based on the Gray codes.
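For reference, the reflected binary Gray code of an integer can be computed as below. The text does not specify how a Gray-code value is derived from a file fragment, so this sketch only shows the encoding itself; the function names are hypothetical.

```python
def to_gray(n: int) -> int:
    """Convert a non-negative integer to its reflected binary Gray code."""
    return n ^ (n >> 1)


def from_gray(g: int) -> int:
    """Invert to_gray by XOR-ing together all right shifts of the Gray code."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n
```

Consecutive integers map to code words that differ in exactly one bit, which is the defining property of the code.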
The message digest value is calculated with a Message Digest (MD) algorithm. In some embodiments, the MD5 value of a file fragment may be calculated with the fifth version of the message digest algorithm (MD5). MD5 is a cryptographic hash function that generates a 128-bit (16-byte) hash value, which can be used to confirm that a message was transmitted consistently and completely. The plurality of CDN nodes may agree in advance to compute MD5 over the file fragments they download from the source, so that each obtains the MD5 value of its file fragment.
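A minimal sketch of the MD5 calculation on the CDN node side, using Python's standard `hashlib` module. The function names are hypothetical; the second variant assumes the fragment arrives as a sequence of chunks rather than as one buffer.

```python
import hashlib
from typing import Iterable


def fragment_md5(fragment: bytes) -> str:
    """Return the MD5 of a file fragment as 32 hexadecimal characters (128 bits)."""
    return hashlib.md5(fragment).hexdigest()


def fragment_md5_stream(chunks: Iterable[bytes]) -> str:
    """Same digest, computed incrementally over chunks as they are received."""
    digest = hashlib.md5()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()
```

Both variants yield the same value for the same bytes, so nodes that receive identical fragments report identical check values regardless of how the data was chunked in transit.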
After the check value of the file fragment is obtained, the identifier of the file to which the file fragment belongs, the fragment range of the file fragment, and the check value that the CDN node calculated over the file fragment may be combined into a triplet, and the triplet may be used as the back-to-source log record. That is, for any file fragment, the log record includes: the identifier of the file to which the file fragment belongs, the fragment range of the file fragment, and the check value that the CDN node calculated over the file fragment.
Based on the above, when the log server 30 receives the first log record of the file fragment sent by the first CDN node, it may use the identifier of the file to which the file fragment belongs and the fragment range of the file fragment in that log record as a query key, and look up, in the saved log records, a log record with the same file identifier and fragment range as the second log record of the file fragment.
The identifier of the file to which the file fragment belongs may be a name, a path, a unique identification code, or a URL of the file, and the embodiment is not limited.
In some embodiments, when the file's identity is implemented as a URL and the check value is implemented as an MD5 value, a log record of a file fragment will contain the following triplets: (URL of file, range of file fragment, MD5 value of file fragment).
That is, when receiving the first log record, the log server 30 may use the URL of the file in the first log record and the range of the file fragment as query keys to query whether or not another log record of the file fragment exists in the saved log records.
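The triplet and its query key can be sketched as a small record type. This is an illustrative layout, assuming the URL serves as the file identifier and MD5 as the check value; the type and function names are hypothetical.

```python
import hashlib
from typing import NamedTuple, Tuple


class LogRecord(NamedTuple):
    url: str                         # identifier of the file the fragment belongs to
    fragment_range: Tuple[int, int]  # range of the file fragment
    md5: str                         # check value calculated by the CDN node


def make_record(url: str, fragment_range: Tuple[int, int], fragment: bytes) -> LogRecord:
    """Build the log record for a fragment fetched back from the source."""
    return LogRecord(url, fragment_range, hashlib.md5(fragment).hexdigest())


def query_key(record: LogRecord) -> Tuple[str, Tuple[int, int]]:
    """The key used to look up other log records of the same file fragment."""
    return (record.url, record.fragment_range)
```

Because the check value is left out of the key, records from two nodes that fetched different bytes for the same fragment still land under the same key, which is what makes the comparison possible.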
If the log server 30 finds a second log record, it can perform a consistency check on the file fragment according to the check information of the file fragment in the first log record and the check information of the file fragment in the second log record.
Optionally, if a plurality of other log records of the file fragment are queried in the saved log records, all the other log records may be used as second log records for comparing the verification information. Alternatively, one log record closest to the time of the first log record may be selected from the plurality of other log records as the second log record for comparing the verification information according to the time stamp sequence of the plurality of other log records, which is not limited in this embodiment.
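The two selection strategies can be sketched as follows. The record layout, including the `node_id` and `timestamp` fields, is an assumption made for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class StoredRecord:
    node_id: str      # hypothetical: the CDN node that uploaded the record
    timestamp: float  # hypothetical: upload time in seconds
    check_value: str


def select_all(candidates: List[StoredRecord]) -> List[StoredRecord]:
    """Strategy 1: compare against every other saved record of the fragment."""
    return list(candidates)


def select_nearest(first_timestamp: float,
                   candidates: List[StoredRecord]) -> List[StoredRecord]:
    """Strategy 2: compare only against the record closest in time."""
    if not candidates:
        return []
    nearest = min(candidates, key=lambda r: abs(r.timestamp - first_timestamp))
    return [nearest]
```

Comparing against all records costs more per upload but can reveal which value the majority of nodes agree on; comparing against the nearest record keeps the work per upload constant.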
After querying the second log record, the log server 30 may obtain the first check value of the file fragment from the first log record, and obtain the second check value of the file fragment from the second log record. Next, it is determined whether the first check value and the second check value are consistent.
If the first check value is consistent with the second check value, the log server 30 may determine that the file fragment fetched by the first CDN node and the file fragment fetched by the second CDN node have content consistency, that is, no exception occurred in the back-to-source operation for the file fragment. If the first check value and the second check value are not consistent, the log server 30 may determine that the content of the file fragment fetched by the first CDN node is not consistent with the content of the file fragment fetched by the second CDN node, that is, the back-to-source operation for the file fragment is abnormal.
Continuing with the example where the check value is implemented as an MD5 value, the log server 30 may obtain an MD5 value for the file fragment from the first log record and an MD5 value for the file fragment from the second log record. If the values of MD5 are consistent, log server 30 may determine that there is no exception to the back-to-source operation for the file fragment. If the values of MD5 do not match, log server 30 may determine that the back-to-source operation for the file slice is abnormal.
Optionally, after the log server 30 determines that the file fragment returned to the source is abnormal, an alarm message may be sent to the CDN node where the file fragment is cached, so as to prompt that the resource associated with the file fragment is abnormal.
Optionally, after the log server 30 determines that an exception occurs in the file fragment returned to the source, a refresh operation on the file to which the file fragment belongs may be triggered. In such an embodiment, the log server 30 may send a content refresh request to the content management node to trigger the content management node to delete the cache of the file in the content distribution network to which the file fragment belongs.
The back-source data verification system 100 provided in the foregoing and following embodiments of the present application may be used to detect a sporadic dirty data problem in a CDN file downloading scenario, and does not need a source station to calculate and respond to an MD5 verification header of a file.
Generally, for the same resource file, there may be multiple CDN nodes in the CDN system returning to the source station to download the resource file. According to historical data statistics of the CDN download acceleration scenario, it is known that the percentage of requests for obtaining dirty data by the CDN back to the source is low, for example, in some cases, the percentage of requests for obtaining dirty data by the CDN back to the source may be lower than 0.01%. When multiple CDN nodes return to the source to download the same resource file, most of the return-to-source data obtained by the CDN nodes is normal, and a few CDN nodes may obtain dirty data.
Based on the technical scheme provided by the embodiment of the application, each CDN node may perform MD5 calculation on the file fragment downloaded from the back source to obtain verification information of the file fragment. Next, a log record is generated according to the triplets of the file URL, the file fragment range, and the MD5 value of the file fragment. Assuming that there are n CDN nodes returning the source of the same URL, the n CDN nodes will generate n log records. After each log record is generated, the log records can be collected and uploaded to a log server in real time.
The log server can perform real-time stream computation on the log records uploaded by the CDN nodes. If the newly uploaded log record is the earliest log record generated for a given file fragment, it is stored in the log server directly, without comparison. If it is not the earliest log record generated for the file fragment, the file URL and the range of the file fragment are used as query keys to look up the previous log record of the file fragment. The previous log record may be the log record whose timestamp is closest to that of the newly uploaded log record. The MD5 value in the newly uploaded log record is then compared with the MD5 value in the previous log record.
If the comparison is consistent, the consistency of the file fragment contents acquired by each CDN node can be determined; if the comparison is inconsistent, it can be determined that the contents of the file fragments obtained by each CDN node are inconsistent, and at this time, the log server may send an alarm message and call a refresh interface of content management to perform full-network refresh, so as to clear all caches of the resource on the CDN system.
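The streaming comparison described above can be sketched as a small in-memory class. This is a simplified illustration rather than the patent's implementation: the `alarm` and `refresh` callbacks stand in for the alarm channel and the content management refresh interface, and all names are hypothetical.

```python
from typing import Callable, Dict, Tuple

FragmentKey = Tuple[str, Tuple[int, int]]  # (file URL, fragment range)


class LogServer:
    def __init__(self,
                 alarm: Callable[[str, Tuple[int, int]], None],
                 refresh: Callable[[str], None]) -> None:
        self._previous: Dict[FragmentKey, str] = {}  # last MD5 seen per fragment
        self._alarm = alarm
        self._refresh = refresh

    def on_record(self, url: str, fragment_range: Tuple[int, int], md5: str) -> bool:
        """Process one uploaded log record; return False when a mismatch is found."""
        key = (url, fragment_range)
        previous = self._previous.get(key)
        self._previous[key] = md5  # the new record becomes the previous one
        if previous is None or previous == md5:
            return True  # earliest record for this fragment, or values agree
        self._alarm(url, fragment_range)  # report the inconsistent fragment
        self._refresh(url)                # clear cached copies of the whole file
        return False
```

Two disagreeing records do not reveal which node holds the dirty copy, which is one reason to refresh the file everywhere instead of on a single node.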
Based on cross validation of the MD5 value calculated by the CDN node and detection of the dirty data problem, the content consistency verification requirement under the CDN download acceleration scene under most conditions can be met, and the back-source dirty data problem detection and repair capability is provided for CDN download acceleration products. Meanwhile, the scheme does not need a source station to respond to the MD5 HTTP header, has high universality, is suitable for various CDN downloading acceleration scenes, and is not repeated one by one.
Fig. 2 is a flowchart illustrating a back-source data checking method according to an exemplary embodiment of the present application, where the method, when executed on the CDN node side, may include the steps shown in fig. 2:
Step 201, sending a fragment back-to-source request for the file to the source station.
Step 202, receiving the file fragment sent by the source station in response to the fragment back-to-source request.
Step 203, calculating the verification information of the file fragment to generate a log record containing the verification information of the file fragment.
Step 204, sending the log record of the file fragment to a log server, so that the log server performs a consistency check on the file fragment according to the check information in the log record and the check information generated by other CDN nodes for the file fragment they obtained through back-to-source.
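Steps 201 to 204 can be sketched on the CDN node side as below. The transport is injected through the hypothetical `fetch` and `upload` callables so that the sketch stays self-contained; expressing the fragment request as an HTTP `Range` header is an assumption for illustration, as are the function name `back_to_source` and the record field names.

```python
import hashlib


def back_to_source(file_url, start, end, fetch, upload):
    """Run steps 201-204 for one fragment and return the fragment bytes."""
    # Step 201: fragment back-to-source request (assumed to be an HTTP Range request).
    headers = {"Range": f"bytes={start}-{end}"}
    # Step 202: receive the fragment issued by the source station.
    fragment = fetch(file_url, headers)
    # Step 203: calculate the check value and build the log record.
    record = {
        "url": file_url,
        "range": f"{start}-{end}",
        "md5": hashlib.md5(fragment).hexdigest(),
    }
    # Step 204: send the log record to the log server.
    upload(record)
    return fragment


# Stand-ins for the source station and the log server (hypothetical).
ORIGIN = {"http://origin.example/file.bin": b"0123456789"}


def fake_fetch(url, headers):
    first, last = headers["Range"][len("bytes="):].split("-")
    return ORIGIN[url][int(first):int(last) + 1]   # Range ends are inclusive


uploaded = []
data = back_to_source("http://origin.example/file.bin", 2, 5, fake_fetch, uploaded.append)
```

Because the node computes the check value itself, nothing in this flow requires the source station to supply a checksum.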
In some exemplary embodiments, one way of calculating the verification information of the file fragment to generate a log record containing that information comprises: calculating a check value of the file fragment; and generating the log record of the file fragment from the identifier of the file to which the file fragment belongs, the fragment range of the file fragment, and the check value of the file fragment.
In some exemplary embodiments, the check value of the file fragment includes at least one of a message digest value, a parity code value, and a Gray code value of the file fragment.
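Two of the check value types named above can be sketched as follows. The function names are hypothetical, and the single even-parity bit shown here is only one possible reading of "parity code value"; the text does not say how a Gray code value of a fragment would be derived, so it is left out of the sketch.

```python
import hashlib


def digest_value(fragment: bytes, algorithm: str = "md5") -> str:
    """Message digest of the fragment as a hex string (MD5 by default)."""
    return hashlib.new(algorithm, fragment).hexdigest()


def parity_value(fragment: bytes) -> int:
    """Single parity bit: 1 if the fragment contains an odd number of 1 bits."""
    ones = sum(bin(byte).count("1") for byte in fragment)
    return ones % 2


fragment = b"example fragment"
check_values = {
    "md5": digest_value(fragment),
    "sha256": digest_value(fragment, "sha256"),
    "parity": parity_value(fragment),
}
```

A single parity bit only detects changes that flip an odd number of bits, so a message digest is the stronger choice when the goal is to detect arbitrary content differences between nodes.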
In this embodiment, the log record uploaded by the CDN node to the log server includes the check information that the CDN node calculated for the file fragment obtained through back-to-source. The log server can therefore check the content consistency of the file fragments fetched by different CDN nodes, using the check information that those nodes calculated for the same file fragment. This verification approach relies on the check value that the requesting CDN node calculates for the file fragment it obtains, not on the source station. The whole verification process is therefore transparent to the source station, requires no modification on the source station side, and greatly improves the flexibility of back-to-source data verification.
Fig. 3 is a flowchart illustrating a method for checking back source data according to another exemplary embodiment of the present application, where the method, when executed on the log server side, may include the steps shown in fig. 3:
Step 301, obtaining a first log record sent by a first CDN node, where the first log record includes check information calculated by the first CDN node for a file fragment obtained through back-to-source.
Step 302, querying the saved log records for a second log record of the file fragment; the second log record includes check information calculated by a second CDN node for the file fragment obtained through back-to-source.
Step 303, performing a consistency check on the file fragment according to the check information in the first log record and the check information in the second log record.
In some exemplary embodiments, the first log record comprises: the identifier of the file to which the file fragment belongs, the fragment range of the file fragment, and a first check value calculated by the first CDN node for the file fragment; the second log record comprises: the identifier of the file to which the file fragment belongs, the fragment range of the file fragment, and a second check value calculated by the second CDN node for the file fragment.
In some exemplary embodiments, one way of querying the saved log records for a second log record of the file fragment comprises: using the file identifier and fragment range associated with each saved log record, finding among the saved log records the log record that matches the identifier of the file to which the file fragment belongs and the fragment range of the file fragment, and taking it as the second log record.
In some exemplary embodiments, one way of performing a consistency check on the file fragment according to the check information in the first log record and the check information in the second log record comprises: obtaining the first check value of the file fragment from the first log record, and obtaining the second check value of the file fragment from the second log record; if the first check value is consistent with the second check value, determining that the back-to-source operation of the file fragment is not abnormal; and if the first check value is inconsistent with the second check value, determining that the back-to-source operation of the file fragment is abnormal.
The check value of the file fragment includes at least one of a message digest value, a parity code value, and a Gray code value of the file fragment.
In some exemplary embodiments, the method further comprises: if the back-to-source operation of the file fragment is determined to be abnormal, sending a content refresh request to a content management node to trigger the content management node to delete the cache of the file to which the file fragment belongs from the content distribution network.
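The comparison and the follow-up refresh can be sketched as a single function. Note that the check is made per file fragment while the refresh targets the whole file to which the fragment belongs. The function name `react_to_check`, the `check` field, the shape of the refresh payload, and the `send_refresh` callable (standing in for the call to the content management node) are all assumptions made for illustration.

```python
def react_to_check(first_record, second_record, send_refresh):
    """Return True if the back-to-source operation is judged abnormal."""
    same_fragment = (
        first_record["url"] == second_record["url"]
        and first_record["range"] == second_record["range"]
    )
    if not same_fragment:
        raise ValueError("records describe different file fragments")
    if first_record["check"] == second_record["check"]:
        return False                      # consistent: nothing to do
    # Abnormal: ask content management to purge the cache of the whole file.
    send_refresh({"action": "refresh", "file": first_record["url"]})
    return True


sent = []
abnormal = react_to_check(
    {"url": "http://example.com/a.bin", "range": "0-9", "check": "aa"},
    {"url": "http://example.com/a.bin", "range": "0-9", "check": "bb"},
    sent.append,
)
```

After the refresh, later requests go back to the source again, which gives the nodes a chance to converge on the same content.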
In this embodiment, the log server may obtain log records uploaded by different CDN nodes for the same file fragment, where the log record uploaded by each CDN node includes the check information that the node calculated for the file fragment it obtained through back-to-source. The log server can therefore check the content consistency of the file fragments fetched by different CDN nodes, using the check information that those nodes calculated for the same file fragment. This verification approach relies on the check value that the requesting CDN node calculates for the file fragment it obtains, not on the source station. The whole verification process is therefore transparent to the source station, requires no modification on the source station side, and greatly improves the flexibility of back-to-source data verification.
It should be noted that the execution subjects of the steps of the methods provided in the above embodiments may be the same device, or different devices may be used as the execution subjects of the methods. For example, the execution subjects of step 201 to step 204 may be device a; for another example, the execution subject of steps 201 and 202 may be device a, and the execution subject of step 203 may be device B; and so on.
In addition, in some of the flows described in the above embodiments and the drawings, a plurality of operations are included in a specific order, but it should be clearly understood that the operations may be executed out of the order presented herein or in parallel, and the sequence numbers of the operations, such as 201, 202, etc., are merely used for distinguishing different operations, and the sequence numbers do not represent any execution order per se. Additionally, the flows may include more or fewer operations, and the operations may be performed sequentially or in parallel. It should be noted that, the descriptions of "first", "second", etc. in this document are used for distinguishing different messages, devices, modules, etc., and do not represent a sequential order, nor limit the types of "first" and "second" to be different.
Fig. 4 is a schematic structural diagram of a log server provided in an exemplary embodiment of the present application, where the log server is suitable for the back source data verification system provided in the foregoing embodiment. As shown in fig. 4, the log server includes: memory 401, processor 402, and communications component 403.
The memory 401 is used to store a computer program and may be configured to store various other data to support operations on the log server. Examples of such data include instructions for any application or method operating on the log server, contact data, phonebook data, messages, pictures, videos, and so forth.
A processor 402, coupled to the memory 401, for executing the computer program in the memory 401 to: obtain, through the communication component 403, a first log record sent by a first CDN node, where the first log record includes check information calculated by the first CDN node for a file fragment obtained through back-to-source; query the saved log records for a second log record of the file fragment, where the second log record includes check information calculated by a second CDN node for the file fragment obtained through back-to-source; and perform a consistency check on the file fragment according to the check information in the first log record and the check information in the second log record.
In some exemplary embodiments, the first log record comprises: the identifier of the file to which the file fragment belongs, the fragment range of the file fragment, and a first check value calculated by the first CDN node for the file fragment; the second log record comprises: the identifier of the file to which the file fragment belongs, the fragment range of the file fragment, and a second check value calculated by the second CDN node for the file fragment.
In some exemplary embodiments, when querying the saved log records for the second log record of the file fragment, the processor 402 is specifically configured to: using the file identifier and fragment range associated with each saved log record, find among the saved log records the log record that matches the identifier of the file to which the file fragment belongs and the fragment range of the file fragment, and take it as the second log record.
In some exemplary embodiments, when performing a consistency check on the file fragment according to the check information in the first log record and the check information in the second log record, the processor 402 is specifically configured to: obtain the first check value of the file fragment from the first log record, and obtain the second check value of the file fragment from the second log record; if the first check value is consistent with the second check value, determine that the back-to-source operation of the file fragment is not abnormal; and if the first check value is inconsistent with the second check value, determine that the back-to-source operation of the file fragment is abnormal.
The check value of the file fragment includes at least one of a message digest value, a parity code value, and a Gray code value of the file fragment.
In some exemplary embodiments, the processor 402 is further configured to: if the back-to-source operation of the file fragment is determined to be abnormal, send a content refresh request to a content management node to trigger the content management node to delete the cache of the file to which the file fragment belongs from the content distribution network.
Further, as shown in fig. 4, the log server further includes: a power supply component 404, and the like. Only some of the components are shown schematically in fig. 4, and it is not meant that the log server includes only the components shown in fig. 4.
In this embodiment, the log record uploaded by the CDN node to the log server includes the check information that the CDN node calculated for the file fragment obtained through back-to-source. The log server can therefore check the content consistency of the file fragments fetched by different CDN nodes, using the check information that those nodes calculated for the same file fragment. This verification approach relies on the check value that the requesting CDN node calculates for the file fragment it obtains, not on the source station. The whole verification process is therefore transparent to the source station, requires no modification on the source station side, and greatly improves the flexibility of back-to-source data verification.
Accordingly, the present application further provides a computer readable storage medium storing a computer program, where the computer program is capable of implementing the steps that can be executed by the log server in the foregoing method embodiments when executed.
Fig. 5 is a schematic structural diagram of a CDN server provided in an exemplary embodiment of the present application, where the CDN server is suitable for the back source data verification system provided in the foregoing embodiment. As shown in fig. 5, the CDN server includes: memory 501, processor 502, and communication component 503.
The memory 501 is used to store a computer program and may be configured to store various other data to support operations on the CDN server. Examples of such data include instructions for any application or method operating on the CDN server, contact data, phonebook data, messages, pictures, videos, and the like.
A processor 502, coupled to the memory 501, for executing the computer program in the memory 501 to: send a fragment back-to-source request for a file to the source station through the communication component 503; receive the file fragment issued by the source station in response to the fragment back-to-source request; calculate the verification information of the file fragment to generate a log record containing the verification information of the file fragment; and send the log record of the file fragment to a log server, so that the log server performs a consistency check on the file fragment according to the check information in the log record and the check information generated by other CDN nodes for the file fragment they obtained through back-to-source.
In some exemplary embodiments, when calculating the verification information of the file fragment to generate the log record containing the verification information of the file fragment, the processor 502 is specifically configured to: calculate a check value of the file fragment; and generate the log record of the file fragment from the identifier of the file to which the file fragment belongs, the fragment range of the file fragment, and the check value of the file fragment.
In some exemplary embodiments, the check value of the file fragment includes at least one of a message digest value, a parity code value, and a Gray code value of the file fragment.
Further, as shown in fig. 5, the CDN server further includes: a power supply component 504, and the like. Only some of the components are schematically shown in fig. 5, and it is not meant that the CDN server comprises only the components shown in fig. 5.
In this embodiment, the log server may obtain log records uploaded by different CDN nodes for the same file fragment, where the log record uploaded by each CDN node includes the check information that the node calculated for the file fragment it obtained through back-to-source. The log server can therefore check the content consistency of the file fragments fetched by different CDN nodes, using the check information that those nodes calculated for the same file fragment. This verification approach relies on the check value that the requesting CDN node calculates for the file fragment it obtains, not on the source station. The whole verification process is therefore transparent to the source station, requires no modification on the source station side, and greatly improves the flexibility of back-to-source data verification.
Accordingly, the present application further provides a computer readable storage medium storing a computer program, where the computer program is capable of implementing the steps that can be executed by the CDN server in the foregoing method embodiments when executed.
The memories of fig. 4 and 5 may be implemented by any type of volatile or non-volatile memory device or combination thereof, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk.
The communication components of fig. 4 and 5 described above are configured to facilitate wired or wireless communication between the device in which the communication component is located and other devices. The device in which the communication component is located may access a wireless network based on a communication standard, such as WiFi, 2G, 3G, 4G, or 5G, or a combination thereof. In an exemplary embodiment, the communication component receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component may be implemented based on Near Field Communication (NFC) technology, Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
The power supply components of fig. 4 and 5 described above provide power to the various components of the device in which the power supply components are located. The power components may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the device in which the power component is located.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include both permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media, such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (12)

1. A system for verifying data returned from a source, comprising: the system comprises a plurality of CDN nodes, a source station and a log server;
a first CDN node of the plurality of CDN nodes is configured to: send a fragment back-to-source request for a file to the source station; receive the file fragment issued by the source station in response to the fragment back-to-source request; calculate the verification information of the file fragment to generate a first log record containing the verification information of the file fragment, and send the first log record to the log server;
the log server is configured to: query the saved log records for a second log record of the file fragment, the second log record comprising verification information calculated by a second CDN node for the file fragment obtained through back-to-source; and perform a consistency check on the file fragment according to the check information in the first log record and the check information in the second log record.
2. A method for verifying back-source data is characterized by comprising the following steps:
acquiring a first log record sent by a first CDN node, wherein the first log record comprises verification information calculated by the first CDN node for a file fragment obtained through back-to-source;
querying the saved log records for a second log record of the file fragment; the second log record comprises verification information calculated by a second CDN node for the file fragment obtained through back-to-source;
and performing a consistency check on the file fragment according to the check information in the first log record and the check information in the second log record.
3. The method of claim 2, wherein the first log record comprises: the identifier of the file to which the file fragment belongs, the fragment range of the file fragment, and a first check value calculated by the first CDN node for the file fragment;
the second log record comprises: the identifier of the file to which the file fragment belongs, the fragment range of the file fragment, and a second check value calculated by the second CDN node for the file fragment.
4. The method of claim 3, wherein querying the saved log records for a second log record of the file fragment comprises:
using the file identifier and fragment range associated with each saved log record, finding among the saved log records the log record that matches the identifier of the file to which the file fragment belongs and the fragment range of the file fragment, and taking it as the second log record.
5. The method of claim 3, wherein performing a consistency check on the file fragment according to the check information in the first log record and the check information in the second log record comprises:
obtaining the first check value of the file fragment from the first log record, and obtaining the second check value of the file fragment from the second log record;
if the first check value is consistent with the second check value, determining that the back-to-source operation of the file fragment is not abnormal;
and if the first check value is inconsistent with the second check value, determining that the back-to-source operation of the file fragment is abnormal.
6. The method of claim 5, further comprising:
if the back-to-source operation of the file fragment is determined to be abnormal, sending a content refresh request to a content management node to trigger the content management node to delete the cache of the file to which the file fragment belongs from the content distribution network.
7. A method for verifying back-source data is characterized by comprising the following steps:
sending a fragment back-to-source request for the file to a source station;
receiving the file fragment issued by the source station in response to the fragment back-to-source request;
calculating the verification information of the file fragment to generate a log record containing the verification information of the file fragment;
and sending the log record of the file fragment to a log server, so that the log server performs a consistency check on the file fragment according to the check information in the log record and the check information generated by other CDN nodes for the file fragment they obtained through back-to-source.
8. The method of claim 7, wherein computing the verification information of the file fragment to generate a log record containing the verification information of the file fragment comprises:
calculating a check value of the file fragment;
and generating a log record of the file fragment according to the identifier of the file to which the file fragment belongs, the fragment range of the fragmented file and the check value of the file fragment.
9. The method of claim 8, wherein the check value of the file fragment comprises: at least one of a message digest value, a parity code value, and a Gray code value of the file fragment.
10. A log server, comprising: a memory, a processor, and a communication component;
the memory is to store one or more computer instructions;
the processor is to execute the one or more computer instructions to: performing the steps of the method of any one of claims 2-6.
11. A CDN server, comprising: a memory, a processor, and a communication component;
the memory is to store one or more computer instructions;
the processor is to execute the one or more computer instructions to: performing the steps of the method of any one of claims 7-9.
12. A computer-readable storage medium storing a computer program, wherein the computer program is capable of performing the method of any one of claims 2-6 or the method of any one of claims 7-9 when executed by a processor.
CN202110185013.0A 2021-02-10 2021-02-10 Return source data verification method, server, system and storage medium Pending CN113300875A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110185013.0A CN113300875A (en) 2021-02-10 2021-02-10 Return source data verification method, server, system and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110185013.0A CN113300875A (en) 2021-02-10 2021-02-10 Return source data verification method, server, system and storage medium

Publications (1)

Publication Number Publication Date
CN113300875A true CN113300875A (en) 2021-08-24

Family

ID=77318924

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110185013.0A Pending CN113300875A (en) 2021-02-10 2021-02-10 Return source data verification method, server, system and storage medium

Country Status (1)

Country Link
CN (1) CN113300875A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114449044A (en) * 2021-12-27 2022-05-06 天翼云科技有限公司 CDN cache verification method and device and electronic equipment
CN116527555A (en) * 2023-06-20 2023-08-01 中国标准化研究院 Cross-platform data intercommunication consistency test method

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7631009B1 (en) * 2007-01-02 2009-12-08 Emc Corporation Redundancy check of transaction records in a file system log of a file server
US20100293137A1 (en) * 2009-05-14 2010-11-18 Boris Zuckerman Method and system for journaling data updates in a distributed file system
CN103379139A (en) * 2012-04-17 2013-10-30 百度在线网络技术(北京)有限公司 A verification method and a verification system for distributed cache content, and apparatuses
WO2016107197A1 (en) * 2014-12-31 2016-07-07 中兴通讯股份有限公司 Network program recording method, device and system, and recorded-program playing method and device
WO2017071566A1 (en) * 2015-10-26 2017-05-04 中兴通讯股份有限公司 Network video playback method and system, and user terminal and home streaming service node
CN107087038A (en) * 2017-06-29 2017-08-22 珠海市魅族科技有限公司 A kind of method of data syn-chronization, synchronizer, device and storage medium
CN108683668A (en) * 2018-05-18 2018-10-19 腾讯科技(深圳)有限公司 Resource checksum method, apparatus, storage medium and equipment in content distributing network
CN110809191A (en) * 2019-10-08 2020-02-18 烽火通信科技股份有限公司 Video tamper-proofing method and system based on index verification and real-time package conversion
CN110889143A (en) * 2018-09-07 2020-03-17 阿里巴巴集团控股有限公司 File verification method and device
CN111031110A (en) * 2019-11-29 2020-04-17 山东英信计算机技术有限公司 File uploading method and device, electronic equipment and storage medium
WO2020211731A1 (en) * 2019-04-19 2020-10-22 华为技术有限公司 Video playing method and related device
CN111880826A (en) * 2020-07-28 2020-11-03 平安科技(深圳)有限公司 Cloud service application upgrading method and device, electronic equipment and storage medium

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7631009B1 (en) * 2007-01-02 2009-12-08 Emc Corporation Redundancy check of transaction records in a file system log of a file server
US20100293137A1 (en) * 2009-05-14 2010-11-18 Boris Zuckerman Method and system for journaling data updates in a distributed file system
CN103379139A (en) * 2012-04-17 2013-10-30 百度在线网络技术(北京)有限公司 A verification method and a verification system for distributed cache content, and apparatuses
WO2016107197A1 (en) * 2014-12-31 2016-07-07 中兴通讯股份有限公司 Network program recording method, device and system, and recorded-program playing method and device
WO2017071566A1 (en) * 2015-10-26 2017-05-04 中兴通讯股份有限公司 Network video playback method and system, and user terminal and home streaming service node
CN107087038A (en) * 2017-06-29 2017-08-22 珠海市魅族科技有限公司 A data synchronization method, synchronizer, device and storage medium
CN108683668A (en) * 2018-05-18 2018-10-19 腾讯科技(深圳)有限公司 Resource checksum method, apparatus, storage medium and equipment in content distributing network
CN110889143A (en) * 2018-09-07 2020-03-17 阿里巴巴集团控股有限公司 File verification method and device
WO2020211731A1 (en) * 2019-04-19 2020-10-22 华为技术有限公司 Video playing method and related device
CN110809191A (en) * 2019-10-08 2020-02-18 烽火通信科技股份有限公司 Video tamper-proofing method and system based on index verification and real-time package conversion
CN111031110A (en) * 2019-11-29 2020-04-17 山东英信计算机技术有限公司 File uploading method and device, electronic equipment and storage medium
CN111880826A (en) * 2020-07-28 2020-11-03 平安科技(深圳)有限公司 Cloud service application upgrading method and device, electronic equipment and storage medium

Non-Patent Citations (2)

Title
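宋毅;薛振宇;滕林;靖海;王哲;汪雄才;高晓鹏: "Research and application of data integration technology for an integrated distribution network planning and design platform", 电网技术 (Power System Technology), no. 07 *
文勇军;黄浩;樊志良;唐立军: "Design of a REST security interface for a distributed log system", 网络安全技术与应用 (Network Security Technology & Application), no. 04 *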

Cited By (4)

Publication number Priority date Publication date Assignee Title
CN114449044A (en) * 2021-12-27 2022-05-06 天翼云科技有限公司 CDN cache verification method and device and electronic equipment
CN114449044B (en) * 2021-12-27 2023-10-10 天翼云科技有限公司 CDN cache verification method and device and electronic equipment
CN116527555A (en) * 2023-06-20 2023-08-01 中国标准化研究院 Cross-platform data intercommunication consistency test method
CN116527555B (en) * 2023-06-20 2023-09-12 中国标准化研究院 Cross-platform data intercommunication consistency test method

Similar Documents

Publication Publication Date Title
CN106100902B (en) Cloud index monitoring method and device
CN102067557B (en) Method and system of using a local hosted cache and cryptographic hash functions to reduce network traffic
US10778760B2 (en) Stream-based data deduplication with peer node prediction
CN102882974B (en) Method for saving website access resource by website identification version number
CN109542613A Method, device and storage medium for distributing service scheduling in a CDN node
US20230040213A1 (en) Cache management in content delivery systems
US20110173290A1 (en) Rotating encryption in data forwarding storage
CN113300875A (en) Return source data verification method, server, system and storage medium
US20140359066A1 (en) System, method and device for offline downloading resource and computer storage medium
CN105279258B (en) File storage method and system with balanced distribution
CN113411404A (en) File downloading method, device, server and storage medium
CN106790334A Page data transmission method and system
CN105791366A (en) Large file HTTP-Range downloading method, cache server and system
EP3579526B1 (en) Resource file feedback method and apparatus
US11089100B2 (en) Link-server caching
CN115150204B (en) Data transmission system
CN107395772B (en) Management method and management system for repeated data
US10015012B2 (en) Precalculating hashes to support data distribution
CN113411364B (en) Resource acquisition method and device and server
CN112861031B (en) URL refreshing method, device and equipment in CDN and CDN node
CN109688204B (en) File downloading method, node and terminal based on NDN (named data networking)
Zhang et al. SimpleSync: A parallel delta synchronization method based on Flink
WO2022252357A1 (en) Consensus processing method and apparatus for blockchain network, device, system, and medium
CN113691614B (en) Information processing method and device
CN116796099A (en) Short link generation method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination