Disclosure of Invention
On this basis, a static webpage updating method, a device, computer equipment and a storage medium are provided to solve the problem that existing webpage caching methods must verify the validity of the webpage with the server on every request, which wastes a great deal of bandwidth.
A static webpage updating method comprises the following steps:
obtaining a cache file of a static webpage, and performing hash value query on the cache file of the static webpage;
if the cache file of the static webpage contains a hash value, determining a cache time threshold value according to a time node of which the hash value changes;
if the cache file of the static webpage does not contain the hash value, acquiring a service scene parameter corresponding to the cache file of the static webpage, and determining a cache time threshold value according to the service scene parameter;
and determining a time node for updating the cache file of the static webpage according to the cache time threshold.
In one possible embodiment, the obtaining the cache file of the static webpage and performing hash value query on the cache file of the static webpage include:
obtaining a cache file of the static webpage, and extracting an extension field in the cache file of the static webpage;
comparing the extension name field with a preset extension name classification table to obtain the type attribute of the cache file of the static webpage;
and determining whether the cache file of the static webpage contains a hash value or not according to the type attribute.
In one possible embodiment, if the cache file of the static webpage contains a hash value, determining a cache time threshold according to a time node at which the hash value changes includes:
dividing the detail information of the cache file into a plurality of sub-information segments, and respectively calculating the hash value of each sub-information segment;
merging the hash values of the sub-information segments to obtain the hash value of the cache file of the static webpage;
and acquiring each time node with the changed hash value of the cache file of the static webpage, and obtaining the cache time threshold value according to the time interval of each time node.
In one possible embodiment, the obtaining a service scenario parameter corresponding to the cache file of the static webpage if the cache file of the static webpage does not include the hash value, and determining the cache time threshold according to the service scenario parameter includes:
obtaining scene characteristic words of a service scene in a cache file of the static webpage, and obtaining a service rule corresponding to the service scene according to the scene characteristic words;
acquiring a service information tree corresponding to the service rule, wherein a root node or a child node in the service information tree comprises at least one service scene parameter;
traversing each node on the service information tree to obtain service scene parameters corresponding to the cache files of the static webpage;
and extracting time-related parameters in the service scene parameters, and taking the minimum value in the time-related parameters as the cache time threshold value.
In one possible embodiment, the determining, according to the cache time threshold, a time node for updating the cache file of the static webpage includes:
when the time of the cache file of the static webpage stored locally exceeds the cache time threshold, sending a new file resource acquisition instruction to a server side;
receiving feedback information of the server, and if the feedback information contains new file resources, taking a time node corresponding to the cache time threshold value as a time node for updating the cache file of the static webpage;
and if not, taking the time node of the resource updating instruction sent by the server side as the time node for updating the cache file of the static webpage.
In one possible embodiment, the determining, according to the cache time threshold, a time node for updating the cache file of the static webpage includes:
when the storage time of the cache file of the static webpage exceeds the cache time threshold, sending an instruction for verifying the entity value Etag of the requested variable and the last modified time last-modified to a server side;
receiving feedback information of the server side on the entity value Etag of the requested variable and the numerical condition of the last modified time last-modified;
if one of the entity value Etag or the last modified time last-modified in the feedback information is changed in value, taking a time node corresponding to the cache time threshold value as a time node for updating the cache file of the static webpage;
otherwise, continuing to use the cache file of the static webpage until the entity value Etag or the last-modified time changes.
In a possible embodiment, the merging the hash values of the sub information segments to obtain the hash value of the cache file of the static webpage includes:
acquiring the byte length of each sub information segment, and correcting the hash value of each sub information segment by taking the byte length as a coefficient;
and adding the corrected hash values of the sub information segments to obtain the hash value of the cache file of the static webpage.
A static webpage updating device comprises the following modules:
the hash value acquisition module is used for acquiring the cache file of the static webpage and inquiring the hash value of the cache file of the static webpage;
the hash value time processing module is configured to determine a cache time threshold value according to a time node at which the hash value changes if the hash value is included in the cache file of the static webpage; if the cache file of the static webpage does not contain the hash value, acquiring a service scene parameter corresponding to the cache file of the static webpage, and determining a cache time threshold value according to the service scene parameter;
and the updating node determining module is set to determine the updating time node of the cache file of the static webpage according to the cache time threshold.
A computer device comprising a memory and a processor, the memory having stored therein computer-readable instructions that, when executed by the processor, cause the processor to perform the steps of the static web page update method described above.
A storage medium having stored thereon computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the above static web page update method.
Compared with the existing mechanism, the present application classifies the cache files of the static webpage according to whether they contain a hash value and sets a cache time threshold suited to each class, thereby avoiding the waste of bandwidth.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
Fig. 1 is an overall flowchart of a static webpage updating method in an embodiment of the present application, where the static webpage updating method includes the following steps:
S1, obtaining a cache file of the static webpage, and performing hash value query on the cache file of the static webpage;
Specifically, cache files fall mainly into two types. The first type is a cache file without a hash value, typified by html files; the second type is a cache file with a hash value, typified by files in formats such as js, css and img. Applying a different updating mechanism to each type when it needs to be updated effectively saves bandwidth. In particular, max-age values of different lengths are set when the message header is configured, so that the update time of each cache file can be obtained in a targeted manner.
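As a minimal sketch of the header configuration described above, the following Python function selects a max-age for each of the two file types. The two durations and the function name are hypothetical; the description only states that the two types receive max-age values of different lengths.

```python
# Hypothetical durations in seconds; the description gives no concrete values.
HASHED_MAX_AGE = 365 * 24 * 3600  # long-lived: js/css/img files carrying a hash value
UNHASHED_MAX_AGE = 300            # short-lived: html files without a hash value


def cache_control_header(has_hash: bool) -> dict:
    """Return a Cache-Control response header for one cache file type."""
    max_age = HASHED_MAX_AGE if has_hash else UNHASHED_MAX_AGE
    return {"Cache-Control": f"max-age={max_age}"}
```

Keeping the two durations as named constants makes it easy to replace them with the cache time thresholds derived in steps S2 and S3.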
In this step, to query the hash value of a cache file, all values in the cache file message may first be extracted and each value fed into a hash function; whether the cache file contains a hash value is then determined from the result of the hash operation. That is, a hash threshold may be set in advance: if a value extracted from a cache file is smaller than the hash threshold after being processed by the hash function, the cache file contains a hash value; otherwise, it does not.
S2, if the cache file of the static webpage contains a hash value, determining a cache time threshold value according to the time node at which the hash value changes;
Specifically, a preset hash value extraction task is obtained. The task contains a plurality of time nodes for hash value extraction; when such a time node arrives, a hash value is extracted from the cache file information, and if the hash value extracted at one time node differs from the hash value extracted at the previous time node, the cache file needs to be updated. The preset hash value extraction task is generated from statistics on historical data. The hash value of the cache file changes because Webpack has repackaged the files; that is, when the webpage information accessed by the user needs to be updated, Webpack receives an instruction sent by the user or the server to repackage the cached content.
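The comparison of successive extractions can be sketched as follows. This is an illustration only: the snapshot list stands in for the preset extraction task, MD5 is used as the hash function, and the names `content_hash` and `change_nodes` are hypothetical.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Hash the cache file content extracted at one time node."""
    return hashlib.md5(data).hexdigest()


def change_nodes(snapshots):
    """Return the time nodes at which the hash differs from the previous extraction.

    snapshots: list of (time_node, content_bytes) pairs in chronological order.
    """
    changed = []
    previous = None
    for time_node, data in snapshots:
        current = content_hash(data)
        if previous is not None and current != previous:
            changed.append(time_node)
        previous = current
    return changed
```

The first extraction has nothing to be compared with, so it is never reported as a change.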
S3, if the cache file of the static webpage does not contain the hash value, acquiring a service scene parameter corresponding to the cache file of the static webpage, and determining a cache time threshold value according to the service scene parameter;
Specifically, a service scenario is a process or sub-process of a certain service, including a target, participants, an operation flow and an information transfer process. The service scenario may be a high concurrency test scenario, a stability test scenario, or the like. Each service scenario corresponds to different service scenario parameters; for example, the core parameters of a high concurrency test scenario are the number of concurrent requests and the response time.
When the service scenario parameters are obtained, the service rule corresponding to the service scenario may be obtained first, and the service scenario parameters corresponding to the cache file may then be obtained according to the feature information in the service rule. Taking a high concurrency scenario as an example, during the Spring Festival a large number of passengers rush to buy train tickets home, and the service scenario is a high concurrency scenario of train ticket ordering, in which the feature information includes the ticket release time, the number of tickets released, and the like. The corresponding service scenario parameters can be obtained from this feature information. For example, if the ticket release time is 13:00, the cache time threshold corresponding to the service scenario can be obtained simply by running a high concurrency test on the maximum retention time of the corresponding cache file at the 13:00 parameter.
And S4, determining the time node of the update of the cache file of the static webpage according to the cache time threshold.
Specifically, when the storage time of the cache file in the local storage container is greater than the cache time threshold, the cache file needs to be updated; otherwise, the webpage information obtained by the user is not the latest information, which inconveniences the user. Taking ticket grabbing during the Spring Festival as an example, if the local cache time threshold is 5 s, the local cache needs to be updated once the storage time exceeds 5 s; otherwise, tickets may already be available on the server while the local end still shows no tickets, so that the user cannot buy a train ticket.
In this step, the time node for updating the cache file may be modified according to the bandwidth or speed of the actual network; that is, when the bandwidth increases, the original time node for updating the cache file is advanced, which increases the frequency of updating the local cache file.
In this embodiment, whether the cache file contains the hash value is classified and analyzed, and a cache time threshold meeting the requirement is set, so that the problem of bandwidth waste is avoided.
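The check in step S4 reduces to comparing the storage time with the cache time threshold. A minimal sketch follows; the function and parameter names are hypothetical, and times are plain seconds.

```python
def needs_update(stored_at: float, now: float, threshold: float) -> bool:
    """Return True when the cache file has been stored longer than the threshold."""
    return (now - stored_at) > threshold
```

With the 5 s threshold of the train ticket example, a file stored at second 100 is still usable at second 104 and must be refreshed at second 106.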
Fig. 2 is a schematic diagram illustrating a hash value obtaining process in a static webpage updating method according to an embodiment of the present application, where as shown in the drawing, the S1 obtains a cache file of a static webpage, and performs a hash value query on the cache file of the static webpage, where the hash value query includes:
S11, obtaining the cache file of the static webpage, and extracting the extension name field in the cache file of the static webpage;
Specifically, the cached file information contains file content information and file name information. The "." symbol in the cached file information is queried, all character segments after the "." symbol are extracted, and these character segments are compared with the contents of an extension name table stored in a database to obtain the character segments that match. If two or more character segments match, the character segment before the "." symbol is obtained, and the degree of association between the character segment before the "." symbol and the character segment after it is calculated; if the degree of association is less than 10%, the character segment after the "." symbol is an extension, otherwise it is not. The length of the character segment before or after the "." symbol is delimited by special symbols, namely punctuation separators such as "," and the like.
S12, comparing the extension field with a preset extension classification table to obtain the type attribute of the cache file of the static webpage;
In the preset extension classification table, the extensions are divided into two classes: extensions with a hash value and extensions without a hash value. If the extension field falls in the class of extensions with a hash value, the cache file is a file with a hash value; otherwise, the cache file is a file without a hash value.
S13, determining whether the cache file of the static webpage contains a hash value according to the type attribute.
In this embodiment, the cache file is correctly classified by analyzing the extension of the cache file, so that when the cache file is updated, a correct file updating manner can be obtained according to whether the cache file has a hash value.
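Steps S11 to S13 can be sketched as a table lookup on the file extension. The table entries below are hypothetical examples of the two classes, and treating an unknown extension as "no hash value" is an assumption of this sketch; the association-degree check for ambiguous names is omitted.

```python
import os

# Hypothetical extension classification table: True means "file with a hash value".
EXTENSION_TABLE = {
    ".js": True,
    ".css": True,
    ".png": True,
    ".jpg": True,
    ".html": False,
    ".htm": False,
}


def extract_extension(filename: str) -> str:
    """Return the character segment after the last '.' (with the dot), lowercased."""
    return os.path.splitext(filename)[1].lower()


def has_hash_value(filename: str) -> bool:
    """Classify a cache file by looking its extension up in the table.

    Unknown extensions are treated as having no hash value (an assumption).
    """
    return EXTENSION_TABLE.get(extract_extension(filename), False)
```

A file name such as `app.3f2a1c.js` contains two dots; `os.path.splitext` splits at the last one, so the extension is `.js`.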
Fig. 3 is a schematic view of a processing procedure when a hash value exists in a static webpage updating method in an embodiment of the present application, as shown in the drawing, in step S2, if a cache file of the static webpage contains a hash value, determining a cache time threshold according to a time node at which the hash value changes includes:
S21, dividing the detail information of the cache file into a plurality of sub information segments, and respectively calculating the hash value of each sub information segment;
Specifically, when the cache file information is divided into a plurality of sub information segments, the sub information segments may all contain the same number of bytes or may contain different numbers of bytes. The hash value of each sub information segment may be calculated with the MD5 algorithm or with the SHA-12 algorithm. If both algorithms are used at the same time, their results are compared: if the difference between the hash values obtained by the two algorithms is within a preset error threshold, the hash value obtained by the MD5 algorithm is used as the hash value of the sub information segment; otherwise, the sub information segment is divided again so that the number of bytes it contains changes. The preset error threshold is typically 1%.
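A minimal sketch of step S21 with equal-length segments and MD5 is given below. The segment size and the function names are hypothetical, and the cross-check against a second algorithm is omitted.

```python
import hashlib


def split_segments(data: bytes, size: int) -> list:
    """Divide the cache file content into segments of at most `size` bytes."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [data[i:i + size] for i in range(0, len(data), size)]


def segment_hashes(data: bytes, size: int) -> list:
    """Compute the MD5 digest of every sub information segment."""
    return [hashlib.md5(segment).hexdigest() for segment in split_segments(data, size)]
```

The last segment may be shorter than the others, which corresponds to the case in which sub information segments contain different numbers of bytes.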
S22, merging the hash values of the sub information segments to obtain the hash value of the cache file of the static webpage;
Specifically, the accuracy of the hash values needs to be checked before the hash values of the sub information segments are combined. That is, a hash value curve is established by taking the position of each sub information segment in the cache file as the abscissa and its hash value as the ordinate and connecting the resulting points. The hash value curve is then divided into a plurality of curve segments, and if the curvature radius of a curve segment is greater than a preset curvature radius threshold, the abnormal hash value corresponding to that curve segment is deleted, leaving the hash values to be combined. The hash values of the sub information segments remaining after the abnormal hash values are removed are then added to obtain the hash value of the cache file information.
S23, obtaining each time node with the changed hash value of the cache file of the static webpage, and obtaining the cache time threshold value according to the time interval of each time node.
Specifically, when the cache file needs to be updated, Webpack packs a new file and sends it to the local end, at which point the hash value of the cache file changes. When the hash value of the cache file is monitored, the number of arguments of the hash function may be monitored; that is, the arguments of the hash function are sent to the cache file at preset intervals, and if the arguments have changed, the hash value has changed as well.
In this embodiment, the hash value of the cache file information is calculated block by block, so that the time threshold at which the cache file needs to be updated is obtained accurately and an optimal retention time for the cache file is obtained.
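Step S23 derives the cache time threshold from the intervals between the time nodes at which the hash value changed. The description does not say how the intervals are combined; the sketch below assumes the shortest interval is used, so that the cache never outlives the fastest observed update cycle. The function name is hypothetical.

```python
def cache_time_threshold(change_times):
    """Return the shortest interval between consecutive hash changes.

    Returns None when fewer than two change nodes have been observed.
    Taking the minimum is an assumption of this sketch; a mean or
    median would be an equally plausible reading of the text.
    """
    if len(change_times) < 2:
        return None
    ordered = sorted(change_times)
    intervals = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return min(intervals)
```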
Fig. 4 is a schematic view of a processing procedure of a static webpage update method without a hash value in an embodiment of the present application, as shown in the drawing, in S3, if the cache file of the static webpage does not include a hash value, obtaining a service scenario parameter corresponding to the cache file of the static webpage, and determining a cache time threshold according to the service scenario parameter includes:
S31, obtaining scene feature words of a service scene in the cache file of the static webpage, and obtaining a service rule corresponding to the service scene according to the scene feature words;
Specifically, the cache file information includes corresponding service scene information. For example, if the cache file holds information on whether trains are on time or delayed during the Spring Festival travel season, the service scene feature words in the cache file are "train" and "on-time/delay status". In this service scene, the corresponding service rule information is: update every 15 minutes. That is, each service scene feature word corresponds to a service rule in the database, and the service rule information corresponding to a service scene can be obtained by searching the database for the service rule that includes the service scene feature word.
S32, acquiring a service information tree corresponding to the service rule, wherein a root node or a child node in the service information tree comprises at least one service scene parameter;
Specifically, the service information is classified in the service information tree. Again taking the service scene of on-time and delayed trains as an example, the root node of the service information tree is "train time", the second-level child node is "on-time/delay status", and the leaf node is "Spring Festival travel season". In this example, the characteristic node is the leaf node "Spring Festival travel season", because during the Spring Festival travel season trains are more likely to be delayed than at ordinary times owing to the addition of temporary passenger trains.
S33, traversing each node on the service information tree to obtain service scene parameters corresponding to the cache file of the static webpage;
specifically, after traversing the service information tree, word vector conversion is performed on the characteristic information in each characteristic node, the characteristic information is converted into a multidimensional word vector, and then dimension reduction processing is performed on the multidimensional word vector to obtain a two-dimensional word vector. And solving the characteristic value of the two-dimensional word vector to obtain a service scene parameter corresponding to each characteristic node, and then summarizing the parameters of each characteristic node to obtain the service scene parameter.
S34, extracting time-related parameters from the service scene parameters, and taking the minimum value of the time-related parameters as the cache time threshold.
Specifically, the node information of each characteristic node is obtained; for example, if the node information of node A is the train time, the parameter corresponding to that node is a time-related parameter. In this step, the information of each characteristic node is compared with a preset time-related vocabulary: if the information appears in the preset time-related vocabulary, it is a time-related parameter; otherwise, it is not. The preset time-related vocabulary is obtained from statistics on historical data.
In this embodiment, the tree model is used to analyze the service scene parameters, so as to obtain the cache time threshold, thereby facilitating the timely update of the cache file and reasonably arranging the system tasks.
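Steps S32 to S34 can be sketched as a tree traversal followed by a minimum over the time-related parameters. The node class, the parameter names and the vocabulary are all hypothetical, and the word-vector conversion of step S33 is replaced here by parameters stored directly on each node.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of a service information tree (hypothetical structure)."""
    name: str
    params: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def collect_params(root: Node) -> list:
    """Traverse every node and gather its (name, value) parameter pairs."""
    collected = []
    stack = [root]
    while stack:
        node = stack.pop()
        collected.extend(node.params.items())
        stack.extend(node.children)
    return collected


def threshold_from_tree(root: Node, time_vocabulary: set):
    """Return the smallest time-related parameter, or None if there is none."""
    times = [value for key, value in collect_params(root) if key in time_vocabulary]
    return min(times) if times else None
```

Filtering by the vocabulary before taking the minimum keeps non-time parameters, such as counts, from being mistaken for a cache duration.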
In an embodiment, the step S4 of determining, according to the cache time threshold, a time node for updating the cache file of the static web page includes:
when the time of the cache file of the static webpage stored locally exceeds the cache time threshold, sending a new file resource acquisition instruction to a server side;
specifically, the latest file information of the web page is stored in the server side, and the cache file needs to be updated to meet the browsing requirement of the user after the storage time exceeds the cache time threshold.
Receiving feedback information of the server, and if the feedback information contains new file resources, taking a time node corresponding to the cache time threshold value as a time node for updating the cache file of the static webpage;
and if not, taking the time node of the resource updating instruction sent by the server side as the time node for updating the cache file of the static webpage.
Specifically, the generation time marked in the information sent by the server for the web page file is compared with the generation time of the cache file. If the two times are consistent, the web page on the server has not been updated, and the storage time of the cache file can be prolonged; if the two times are inconsistent, the web page has been updated on the server, and the cache file needs to be updated. During the period in which the storage time of the cache file is prolonged, if the web page is updated, the server sends a resource updating instruction; the time node of the resource updating instruction is recorded and taken as the update time node of the cache file. Meanwhile, the event is recorded in a database as a log, and the time interval between two resource updating time nodes is taken as the cache time threshold the next time a cache file of the same class is used.
In this embodiment, the cache time threshold is revised by the server, so that bandwidth resources are better utilized, and bandwidth waste is not caused.
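The decision in this embodiment can be sketched as below. The function and parameter names are hypothetical, times are plain seconds, and returning None while no resource updating instruction has arrived is an assumption of this sketch.

```python
def update_time_node(stored_at, threshold, feedback_has_new_resource, push_time=None):
    """Choose the time node at which the cache file is updated.

    If the server feedback contains a new file resource, the node is the
    moment the cache time threshold expires. Otherwise the cache file is
    kept until the server sends a resource updating instruction at
    `push_time` (None while no instruction has been received).
    """
    if feedback_has_new_resource:
        return stored_at + threshold
    return push_time
```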
In an embodiment, the step S4 of determining, according to the cache time threshold, a time node for updating the cache file of the static web page includes:
when the storage time of the cache file of the static webpage exceeds the cache time threshold, sending an instruction for verifying the entity value Etag of the requested variable and the last modified time last-modified to a server side;
The HTTP protocol specification defines ETag as the "entity value of the requested variable". Put another way, an ETag is a token that can be associated with a Web resource. A typical Web resource is a Web page, but it may also be a JSON or XML document. The server alone is responsible for determining what the token is and what it means, and transmits it to the client in an HTTP response header. The format returned by the server is as follows:
ETag: "50b1c1d4f775c61:df3"
The query update format of the client is as follows:
If-None-Match: W/"50b1c1d4f775c61:df3"
If the ETag has not changed, status 304 is returned and no content is returned, which is the same as for Last-Modified. Testing the ETag is mainly useful for resumable (breakpoint) downloads.
When the browser requests a certain URL for the first time, the server side returns status 200, the content is the resource requested by the client, and a Last-Modified attribute marks the time at which the file was last modified on the server side.
The Last-Modified format is similar to this:
Last-Modified: Fri, 12 May 2006 18:53:33 GMT
When the client requests the URL for the second time, the browser, as specified by the HTTP protocol, sends an If-Modified-Since header to the server, asking whether the file has been modified since that time:
If-Modified-Since: Fri, 12 May 2006 18:53:33 GMT
If the resource on the server side has not changed, the HTTP 304 (Not Modified) status code is automatically returned with empty content, which reduces the amount of data transmitted. When the server side code changes or the server is restarted, the resource is sent again, and the response is similar to that of the first request. In this way, resources are not repeatedly sent to the client, and the client still obtains the latest resources when the server side changes.
Receiving feedback information of the server side on the entity value Etag of the requested variable and the numerical condition of the last modified time last-modified;
if one of the entity value Etag or the last modified time last-modified in the feedback information is changed in value, taking a time node corresponding to the cache time threshold value as a time node for updating the cache file of the static webpage;
otherwise, continuing to use the cache file of the static webpage until the entity value Etag or the last-modified time changes.
In this embodiment, the ETag has a higher priority than Last-Modified. Verifying with Last-Modified alone raises several problems: a file may be rewritten periodically without its content changing, in which case the client should not have to fetch it again; and some servers cannot obtain the exact last modification time of a file. Therefore, the two must be enabled together to keep the verification result accurate.
In this embodiment, the update time node of the cache file is judged by using the Etag and the Last-Modified, so that the storage time of the cache file is accurately obtained, and the bandwidth utilization rate is further improved.
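The verification exchange in this embodiment can be sketched without any network code. The helper names are hypothetical; the header names `If-None-Match`, `If-Modified-Since`, `ETag` and `Last-Modified` and the status codes 200 and 304 are standard HTTP.

```python
def conditional_headers(etag=None, last_modified=None) -> dict:
    """Build the validation headers sent once the cache time threshold is exceeded."""
    headers = {}
    if etag is not None:
        headers["If-None-Match"] = etag
    if last_modified is not None:
        headers["If-Modified-Since"] = last_modified
    return headers


def validators_changed(cached: dict, server: dict) -> bool:
    """Return True when either the ETag or the Last-Modified value differs."""
    return (cached.get("ETag") != server.get("ETag")
            or cached.get("Last-Modified") != server.get("Last-Modified"))


def choose_body(status: int, cached_body: bytes, response_body: bytes) -> bytes:
    """Keep the cached file on 304 and replace it on 200."""
    if status == 304:
        return cached_body
    if status == 200:
        return response_body
    raise ValueError(f"unexpected status code: {status}")
```

Sending both validators lets the server compare the ETag first and fall back to the modification time, which matches the priority described above.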
In an embodiment, the S22, merging the hash values of the sub information segments to obtain the hash value of the cache file of the static web page, includes:
acquiring the byte length of each sub information segment, and correcting the hash value of each sub information segment by taking the byte length as a coefficient;
Specifically, suppose sub information segment A contains 512 bytes and sub information segment B contains 256 bytes, and that before correction the hash value of A is 30 and the hash value of B is 50. After correction, the hash value of A is still 30, and the hash value of B is 50 × 0.5 = 25.
And adding the corrected hash values of the sub information segments to obtain the hash value of the cache file of the static webpage.
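The correction and summation can be sketched as follows. Consistent with the example above, the coefficient of each segment is assumed to be its byte length divided by the length of the longest segment; the description does not state this rule explicitly, and the function name is hypothetical. Hash values are treated as plain numbers, as in the example.

```python
def merge_hashes(segments):
    """Weight each segment hash by its relative byte length and sum the results.

    segments: list of (byte_length, hash_value) pairs.
    The coefficient is byte_length / longest byte_length (an assumption
    that reproduces the 512-byte and 256-byte example).
    """
    if not segments:
        raise ValueError("at least one segment is required")
    longest = max(length for length, _ in segments)
    return sum(value * length / longest for length, value in segments)
```

For the two segments of the example, A contributes 30 and B contributes 25, so the merged hash value of the cache file is 55.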
In one embodiment, a static webpage updating apparatus is provided, as shown in fig. 5, including the following modules:
the hash value acquisition module is used for acquiring the cache file of the static webpage and inquiring the hash value of the cache file of the static webpage;
the hash value time processing module is configured to determine a cache time threshold value according to a time node at which the hash value changes if the hash value is included in the cache file of the static webpage; if the cache file of the static webpage does not contain the hash value, acquiring a service scene parameter corresponding to the cache file of the static webpage, and determining a cache time threshold value according to the service scene parameter;
and the updating node determining module is set to determine the updating time node of the cache file of the static webpage according to the cache time threshold.
In one embodiment, a computer device is provided, the computer device includes a memory and a processor, the memory stores computer readable instructions, and the computer readable instructions, when executed by the processor, cause the processor to execute the steps of the static webpage updating method in the above embodiments.
In one embodiment, a storage medium storing computer-readable instructions is provided, which when executed by one or more processors, cause the one or more processors to perform the steps of the static web page updating method in the above embodiments. Wherein the storage medium may be a non-volatile storage medium.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by associated hardware instructed by a program, which may be stored in a computer-readable storage medium, and the storage medium may include: a Read Only Memory (ROM), a Random Access Memory (RAM), a magnetic or optical disk, or the like.
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-described embodiments merely illustrate some embodiments of the present application and are described in relative detail, but they are not to be construed as limiting the scope of the present application. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.