CN108268370B - Website quality analysis method, device and system based on Referer and template library matching - Google Patents

Website quality analysis method, device and system based on Referer and template library matching

Info

Publication number
CN108268370B
Authority
CN
China
Prior art keywords
website
acquiring
session records
websites
template library
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201611260470.7A
Other languages
Chinese (zh)
Other versions
CN108268370A (en)
Inventor
郭天晨
程路
王易风
陈建平
潘梁
范东东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Group Zhejiang Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Group Zhejiang Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, China Mobile Group Zhejiang Co Ltd
Priority to CN201611260470.7A
Publication of CN108268370A
Application granted
Publication of CN108268370B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/3604 Software analysis for verifying properties of programs
    • G06F11/3612 Software analysis for verifying properties of programs by runtime analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Information Transfer Between Computers (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The embodiment of the invention discloses a website quality analysis method, device and system based on Referer and template library matching. The method comprises the following steps: acquiring and parsing the full traffic in a link, and matching the access requests obtained after parsing with the response data to obtain all session records of all users and all websites; associating the session records that carry a Referer field with the corresponding websites by using a Referer matching method; matching the session records that do not carry a Referer field against a static resource template library by using a template library matching method, so that all session records that do not carry a Referer field are associated with the corresponding websites; and, after all session records have been associated with websites, scoring each website so that the user can analyze website quality. By using the Referer matching method together with the template library matching method, the embodiment of the invention can associate all session records with websites, thereby improving the completeness and accuracy of each website's session records. By scoring each website, the embodiment of the invention also enables the user to analyze website quality.

Description

Website quality analysis method, device and system based on Referer and template library matching
Technical Field
The embodiment of the invention relates to the technical field of network management of data services, and in particular to a website quality analysis method, device and system based on Referer and template library matching.
Background
Currently, the mainstream technologies for website quality analysis are active dial testing analysis and passive monitoring analysis. Active dial testing analysis records relevant data of each website by means of terminal-simulated dial tests and then organizes the data into the required indexes; the subjective experience of users can then be inferred from the index values of each website. Passive monitoring analysis analyzes website data by parsing the data messages at a network outlet.
However, in the process of implementing the embodiment of the present invention, the inventors found that, in active dial testing analysis, some perception indexes are not universal and are strongly influenced by the dial testing content and the dial testing environment, so these indexes cannot accurately reflect the quality of a website; the analysis result can only indicate at a macroscopic level whether a website is good or bad, and cannot provide reference data for fault location.
In passive monitoring analysis, the message data is scattered and does not correspond directly to a particular website, so only the Uniform Resource Locator (URL) of the message data can be analyzed, and a message is then associated with the corresponding website through the Referer. (The HTTP Referer is part of the request header: when a browser sends a request to a web server, it generally carries the Referer to tell the server which page the request was linked from, and the server can use this information for processing.) Therefore, passive monitoring analysis is only suitable for analyzing the relevant indexes (such as time delay and success rate) of each element or each server IP, and cannot perceive the website as a whole.
Disclosure of Invention
One purpose of the embodiments of the present invention is to solve the problem that the prior art cannot perceive a website as a whole, because active dial testing analysis cannot locate faults and passive monitoring analysis only analyzes individual elements.
In a first aspect, an embodiment of the present invention provides a website quality analysis method based on Referer and template library matching, where the method includes:
acquiring and parsing the full traffic in the link, and matching the access request obtained after parsing with the response data to obtain all session records of all users and all websites;
using a Referer matching method to associate the session records carrying the Referer fields to corresponding websites;
matching the session records not carrying the Referer fields with a static resource template library by using a template library matching method, so that all the session records not carrying the Referer fields are associated to corresponding websites;
and after all the session records are associated to the websites, scoring each website so as to allow the user to analyze the website quality.
Optionally, before the step of matching the session records not carrying the Referer field with the static resource template library by using the template library matching method, so that all the session records not carrying the Referer field are associated with the corresponding websites, the method further includes:
periodically sending an access request to each website by using a simulated browser;
acquiring response data of the website by using a network packet capture method;
and forming all session records of each website according to the access request and the corresponding response data, and storing the session records into the static resource template library of the website.
Optionally, the step of scoring each website after all session records are associated to the website includes:
acquiring all KQI index values of each website;
calculating the score of each website according to each KQI index value and the preset weight value thereof;
wherein the KQI index is the end-to-end time delay of web browsing, the end-to-end rate of web browsing, the end-to-end success rate of web browsing or the end-to-end integrity rate of web browsing.
Optionally, the step of scoring each website based on all session records further comprises:
acquiring any one KQI index of all websites, and acquiring the website corresponding to the maximum or minimum KQI index value;
acquiring the KQI index value of each HOST corresponding to the website, and acquiring the HOST corresponding to the maximum or minimum KQI index value;
acquiring the KPI index value of each URL corresponding to the HOST, and acquiring the URL corresponding to the maximum or minimum KPI index value;
acquiring the KPI index value of each server IP corresponding to the URL, and acquiring the server IP corresponding to the maximum or minimum KPI index value;
wherein the KPI index is response time delay, response success rate or retransmission packet loss;
each website includes a plurality of HOSTs, each HOST corresponds to a plurality of URLs, and each URL corresponds to a plurality of server IPs.
Optionally, the step of acquiring any one KQI index of all websites and acquiring the website corresponding to the maximum or minimum KQI index value may be replaced with the following step:
directly selecting, from all websites, the website to be analyzed and any one of its KQI indexes.
In a second aspect, an embodiment of the present invention further provides a website quality analysis apparatus based on Referer and template library matching, where the apparatus includes:
the session record acquisition module is used for acquiring and parsing the full traffic in the link, and matching the access request obtained after parsing with the response data to obtain all session records of all users and all websites;
the Referer session record association module is used for associating the session records carrying the Referer field to corresponding websites by using a Referer matching method;
the template library session record association module is used for matching the session records not carrying the Referer field with the static resource template library by using a template library matching method, so as to associate all the session records not carrying the Referer field with the corresponding websites;
and the scoring module is used for scoring each website after all the session records are associated to the website so as to enable a user to analyze the quality of the website.
Optionally, the apparatus further comprises a static resource template library module for performing the following steps:
periodically sending an access request to each website by using a simulated browser;
acquiring response data of the website by using a network packet capture method;
and forming all session records of each website according to the access request and the corresponding response data, and storing the session records into the static resource template library of the website.
Optionally, the scoring module is configured to perform the following steps:
acquiring all KQI index values of each website;
calculating the score of each website according to each KQI index value and the preset weight value thereof;
wherein the KQI index is the end-to-end time delay of web browsing, the end-to-end rate of web browsing, the end-to-end success rate of web browsing or the end-to-end integrity rate of web browsing.
Optionally, the scoring module is further configured to perform the following steps:
acquiring any one KQI index of all websites, and acquiring the website corresponding to the maximum or minimum KQI index value;
acquiring the KQI index value of each HOST corresponding to the website, and acquiring the HOST corresponding to the maximum or minimum KQI index value;
acquiring the KPI index value of each URL corresponding to the HOST, and acquiring the URL corresponding to the maximum or minimum KPI index value;
acquiring the KPI index value of each server IP corresponding to the URL, and acquiring the server IP corresponding to the maximum or minimum KPI index value;
wherein the KPI index is response time delay, response success rate or retransmission packet loss;
each website includes a plurality of HOSTs, each HOST corresponds to a plurality of URLs, and each URL corresponds to a plurality of server IPs.
In a third aspect, an embodiment of the present invention further provides a website quality analysis system based on Referer and template library matching, where the system includes: a deep packet inspection (DPI) device and the website quality analysis device according to the second aspect; the DPI device is in communication connection with the website quality analysis device;
the DPI device is connected into a link in a serial or mirror mode, and is used for acquiring the full traffic of the link and sending the full traffic to the website quality analysis device;
the website quality analysis device is used for parsing the full traffic of the link, associating the session records with websites, and scoring the websites.
According to the technical scheme, the access requests of all users and the response data of all websites in the link can be obtained by acquiring the full traffic in the link, so that all session records of the link are obtained; then, the Referer matching method is used to associate the session records carrying the Referer field with the corresponding websites; the template library matching method is used to match the session records not carrying the Referer field with the static resource template library, so that all session records not carrying the Referer field are associated with the corresponding websites; and finally, each website is scored so that the user can analyze website quality. Compared with the prior art, the embodiment of the invention can associate all session records with websites by using the Referer matching method and the template library matching method, thereby improving the completeness and accuracy of each website's session records. In addition, by scoring each website, the embodiment of the invention enables the user to analyze website quality.
Drawings
The features and advantages of the present invention will be more clearly understood by reference to the accompanying drawings, which are illustrative and not to be construed as limiting the invention in any way, and in which:
FIG. 1 is a schematic flowchart of a website quality analysis method based on Referer and template library matching according to an embodiment of the present invention;
FIG. 2 is a schematic diagram illustrating a comparison of scoring results of multiple websites according to the method shown in FIG. 1;
FIG. 3 is a diagram illustrating a comparison of scoring results of end-to-end delay indicators for web browsing of multiple websites according to the method shown in FIG. 1;
FIG. 4 is a schematic structural diagram of a website quality analysis apparatus based on Referer and template library matching according to an embodiment of the present invention;
FIG. 5 is a block diagram of a website quality analysis apparatus based on Referer and template library matching according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
Example one
The embodiment of the invention provides a website quality analysis method based on Referer and template library matching. As shown in fig. 1, the method comprises the following steps:
S1, acquiring and parsing the full traffic in the link, and matching the access request obtained after parsing with the response data to obtain all session records of all users and all websites;
S2, associating the session records carrying the Referer field to corresponding websites by using a Referer matching method;
S3, matching the session records not carrying the Referer field with a static resource template library by using a template library matching method, so that all the session records not carrying the Referer field are associated to corresponding websites;
and S4, after all the session records are associated to the websites, scoring each website so that the user can analyze the website quality.
In practical application, Deep Packet Inspection (DPI) equipment, such as telecom-level DPI equipment in the prior art, is adopted in the embodiment of the present invention, and is connected to a link in a serial connection or mirror image manner, so as to collect the full traffic of the link. The full traffic refers to traffic data of all users in a certain area. Therefore, the DPI device is generally installed at a key network location, such as a group backbone network outlet, a three-party interconnection outlet, a provincial network outlet, an Internet Data Center (IDC) and a mobile core network outlet. Of course, the skilled person can select the mounting position of the DPI device and the parameters of the DPI device according to the specific scenario, and the present invention is not limited thereto.
In practical application, the Referer field in the embodiment of the invention refers to the HTTP Referer carried in the access request that the browser sends to a website. The HTTP Referer is part of the request header: when the browser sends a request to the web server, it typically carries the Referer to tell the server which page the request was linked from, so that the server can obtain some information for processing. Since the Referer field is a mature, well-known technique, it is not described in detail here.
It should be noted that the website quality analysis method provided by the embodiment of the present invention may be implemented on a server or on a separate computing device. The present invention is described below by taking a server as an example.
In step S1, the DPI device acquires the full traffic on the link in real time and sends the acquired full traffic to the server. The server parses the access requests from browsers and the response data from each website out of the full traffic, matches each access request with its response data to form a session record, and finally obtains all session records of all users and all websites.
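The request/response pairing of step S1 can be sketched as follows. This is only an illustration under stated assumptions, not the implementation of the embodiment: the `HttpMessage` and `SessionRecord` types, their fields, and the pairing rule (each response is paired with the oldest unanswered request on the same connection) are hypothetical and are not given in the source.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class HttpMessage:
    """Hypothetical parsed HTTP message produced by the traffic-parsing stage."""
    conn: tuple                    # (client_ip, client_port, server_ip, server_port)
    is_request: bool
    timestamp: float               # seconds
    url: str = ""                  # set on requests
    referer: Optional[str] = None  # set on requests when the header is present
    status: int = 0                # set on responses


@dataclass
class SessionRecord:
    url: str
    referer: Optional[str]
    server_ip: str
    status: int
    delay_ms: float


def build_session_records(messages):
    """Pair each response with the oldest unanswered request on its connection."""
    pending = defaultdict(deque)
    records = []
    for msg in sorted(messages, key=lambda m: m.timestamp):
        if msg.is_request:
            pending[msg.conn].append(msg)
        elif pending[msg.conn]:
            req = pending[msg.conn].popleft()
            records.append(SessionRecord(
                url=req.url,
                referer=req.referer,
                server_ip=msg.conn[2],
                status=msg.status,
                delay_ms=(msg.timestamp - req.timestamp) * 1000.0,
            ))
        # A response with no pending request is dropped in this sketch.
    return records
```

Keeping the Referer and the server IP on each record is a design choice of this sketch, so that the later association and drill-down steps can work from the session records alone.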
In step S2, the server obtains the Referer field in each session record and associates the session record with the corresponding website according to the Referer field, until all the session records carrying the Referer field are associated with websites.
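A minimal sketch of Referer matching is given below. The `SITE_DOMAINS` table and the suffix-matching rule are assumptions made for illustration; the source does not specify how a Referer value is mapped to a website.

```python
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical table mapping site domains to website names (illustrative only).
SITE_DOMAINS = {
    "sina.com.cn": "Sina",
    "taobao.com": "Taobao",
}


def site_from_referer(referer: Optional[str]) -> Optional[str]:
    """Return the website a Referer points to, or None if it cannot be decided."""
    if not referer:
        return None
    host = urlsplit(referer).hostname
    if host is None:
        return None
    for domain, site in SITE_DOMAINS.items():
        # Match the domain itself or any of its subdomains.
        if host == domain or host.endswith("." + domain):
            return site
    return None
```

For example, `site_from_referer("http://news.sina.com.cn/index.html")` yields `"Sina"`, while a missing Referer or one pointing to an unknown host yields `None`, leaving the record for template library matching.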
In practical application, a small number of session records do not carry a Referer field, or carry an incorrect one, because of cross-site calls, CDN delivery, policy configuration errors and the like; the server cannot determine from the Referer which website these session records belong to. That is, not all session records can be associated with the corresponding websites by the Referer matching method alone, and website quality analysis based on such incomplete session records cannot be accurate.
Therefore, in step S3, the server matches the session records not carrying the Referer field with the static resource template library by using a template library matching method. When a session record exists in the static resource template library of a certain website, the session record belongs to that website, and the server associates the session record with that website. In this way, the server can associate all the session records that do not carry the Referer field with websites by using the template library matching method.
In practical applications, the embodiment of the present invention includes, before step S3, a step of acquiring a static resource template library:
S31, periodically sending an access request to each website by using the simulated browser;
S32, acquiring the response data of the website by using a network packet capture method;
and S33, forming all session records of each website according to the access request and the corresponding response data, and storing the session records into the static resource template library of the website.
The server accesses a website by using the simulated browser, sends an access request to the website, and then acquires the response data of the website by using a network packet capture method (a prior-art technique that is not described further here), so that all session records of the website, such as those for pictures, advertisements and content, can be acquired. The server then stores all the session records of the website into the static resource template library. For example, if the server needs a static resource template library for Sina, the server may periodically (for example, once every 5 minutes) access the Sina website and then acquire all session records of the Sina website to form the static resource template library of the Sina website.
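Steps S31 to S33 can be sketched as follows. The simulated browser and the packet capture depend on the environment, so they are represented here by a hypothetical `capture_site` hook supplied by the caller; the URL normalization is likewise an assumption of this sketch, not something stated in the source.

```python
from typing import Callable, Dict, Iterable, Set


def normalize(url: str) -> str:
    """Drop the query string and fragment so that varying parameters do not defeat matching."""
    return url.split("#", 1)[0].split("?", 1)[0]


def refresh_template_library(
    library: Dict[str, Set[str]],
    sites: Iterable[str],
    capture_site: Callable[[str], Iterable[str]],
) -> None:
    """Rebuild each website's static resource template from one simulated visit.

    capture_site(site) is a hypothetical hook that drives a simulated browser
    and returns the URLs of all session records captured during that visit.
    """
    for site in sites:
        library[site] = {normalize(url) for url in capture_site(site)}
```

A scheduler would call `refresh_template_library` periodically (for example, once every 5 minutes, as in the Sina example above) so that the templates follow changes in each website's resources.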
When the server performs association by using the Referer matching method, if a certain session record does not have a Referer field, the session record is matched against the static resource template library. If the session record exists in the static resource template library of the Sina website, the session record is taken as an element of the Sina website, that is, the session record is associated with the Sina website.
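The template library lookup described above can be sketched as follows, assuming the library is a mapping from website name to a set of resource URLs. Matching on the URL without its query string, and treating a URL found in more than one template as unmatched, are assumptions of this sketch.

```python
from typing import Dict, Optional, Set


def site_from_template(url: str, library: Dict[str, Set[str]]) -> Optional[str]:
    """Return the website whose static resource template contains this URL.

    Returns None when no template, or more than one template, contains the URL,
    because an ambiguous match cannot be attributed to a single website.
    """
    key = url.split("#", 1)[0].split("?", 1)[0]
    matches = [site for site, urls in library.items() if key in urls]
    return matches[0] if len(matches) == 1 else None
```

Session records for which this function returns `None` remain unassociated, which indicates that the template library may need to be updated.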
It can be seen that, in the embodiment of the present invention, all session records can be associated to the website by using the steps S2 and S3.
It should be noted that step S3 and step S2 may be interchanged. In this case, step S3 matches all the session records, whether or not they carry the Referer field, and associates them with the corresponding websites. Step S3 then has four possible results: (1) all the session records are associated; (2) some session records carrying the Referer field are not associated; (3) some session records not carrying the Referer field are not associated; and (4) some session records carrying the Referer field and some not carrying it are not associated.
With respect to the result (1), step S2 need not be executed at this time.
With respect to the result (2), step S2 is executed at this time until all session records are associated with the corresponding website.
With respect to the result (3), the static resource template library needs to be updated, and step S3 is then executed again. The corresponding scheme above is then carried out according to the new association result.
For the result (4), step S2 is executed first, and then step S3 is executed after the static resource template library is updated until all the session records are associated with the corresponding website.
The person skilled in the art can add various schemes to the above-mentioned schemes according to specific scenarios, and the schemes also fall into the scope of protection of the present application.
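The interchangeable ordering of steps S2 and S3 can be sketched with a list of matchers that are tried in order. The `associate` function and the representation of a session record as a dictionary are hypothetical; each matcher is any callable that returns a website name or `None`.

```python
from typing import Callable, Iterable, List, Optional, Tuple

Matcher = Callable[[dict], Optional[str]]


def associate(records: Iterable[dict], matchers: List[Matcher]) -> Tuple[dict, list]:
    """Try each matcher in order; return (website -> records, unassociated records)."""
    by_site: dict = {}
    leftover = []
    for rec in records:
        # The inner generator is lazy, so later matchers run only if earlier ones fail.
        site = next((s for s in (m(rec) for m in matchers) if s is not None), None)
        if site is None:
            leftover.append(rec)
        else:
            by_site.setdefault(site, []).append(rec)
    return by_site, leftover
```

Passing `[referer_matcher, template_matcher]` corresponds to executing S2 before S3, and reversing the list corresponds to executing S3 first. A non-empty `leftover` list corresponds to the results in which some session records are not associated, for example because the static resource template library needs to be updated.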
In practical applications, a user uses a browser to access the main page SP of a website, or clicks a specific link to enter the website. However, from the perspective of the DPI device, each click by the user in the browser generates tens or even hundreds of session records for the website. In order to evaluate the quality of a website, the website needs to be analyzed comprehensively and accurately on the basis of both subjective experience and objective data. Therefore, in step S4 of the embodiment of the present invention, the server scores each website according to the session records and the corresponding indexes.
In the embodiment of the invention, when the server evaluates the subjective experience of a website, a key quality indicator (KQI) method can be adopted. In the KQI method, a number of perception-based indexes are formulated from the subjective experience of visiting the website (such as how fast the website opens and whether the displayed content is complete). In the embodiment of the invention, 4 KQI indexes are selected for evaluation: the end-to-end time delay of web browsing, the end-to-end rate of web browsing, the end-to-end success rate of web browsing and the end-to-end integrity rate of web browsing. The end-to-end time delay and the end-to-end rate of web browsing reflect how fast the website opens, and the end-to-end success rate and the end-to-end integrity rate of web browsing reflect whether the website content is displayed successfully and completely. In practical application, the embodiment of the present invention further sets a weighting, i.e., a preset weight value, for each of the 4 KQI indexes, so as to adapt to evaluation occasions that attach more importance to a particular KQI index and thereby improve the evaluation. In practical applications, those skilled in the art may choose the KQI indexes and their number according to the specific scenario, and the present invention is not limited thereto.
The server obtains the 4 KQI index values of each website and the preset weight value of each KQI index, from which the score of each website can be calculated. In practical application, the score of each website can range from 0 to 100, which facilitates website quality analysis. Fig. 2 is a schematic diagram illustrating a comparison of scoring results of multiple websites according to an embodiment of the present invention. Referring to fig. 2, the server scores several common websites, including NetEase (163), Sina, Baidu, Tencent, Sohu and Taobao; Taobao has the highest score, 87.32 points. In this way, the user can gain a subjective impression of the quality of each website from the scores.
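One possible way to combine the KQI values into a 0 to 100 score is a weighted average, sketched below. The source does not give the weight values or say how each KQI value is converted to a common scale, so the `WEIGHTS` table is hypothetical and each KQI is assumed to have already been converted to a sub-score between 0 and 100.

```python
from typing import Dict

# Hypothetical preset weights; the source does not give concrete values.
WEIGHTS = {
    "delay": 0.3,
    "rate": 0.2,
    "success_rate": 0.3,
    "integrity_rate": 0.2,
}


def website_score(sub_scores: Dict[str, float], weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted average of per-KQI sub-scores, each assumed to lie in 0-100."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[k] * sub_scores[k] for k in weights) / total_weight
```

Dividing by the sum of the weights keeps the result within 0 to 100 even if the preset weights do not sum to 1, which makes it easy to raise the weight of one KQI index for a particular evaluation occasion.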
The scoring result obtained by the KQI method makes it easy for a user to understand the quality of a website, but a poorly performing website cannot be optimized on the basis of the scoring result alone. Therefore, in the embodiment of the invention, the KQI method and a key performance indicator (KPI) method are used together to score each website, that is, a drill-down analysis is carried out on each website.
In practice, the main page SP of each website is composed of a plurality of hosts (HOSTs), each HOST corresponds to a plurality of Uniform Resource Locators (URLs), and each URL corresponds to a plurality of server IPs. Following this vertical hierarchy, the embodiment of the invention selects indexes in multiple dimensions and analyzes the quality of each website level by level, down to the quality of the server IPs, thereby providing a decision basis for locating and optimizing website quality problems.
In the embodiment of the invention, the KQI method is adopted when the quality of the main page SP and the HOSTs of a website is analyzed, whereas the KPI method is adopted when the quality is analyzed at the level of the URL and the server IP. Optionally, the KPI index may be response delay, response success rate, or retransmission packet loss. In practical applications, those skilled in the art may choose the KPI indexes and their number according to the specific scenario, and the present invention is not limited thereto.
The scoring, i.e. drill-down analysis, of each website in the embodiment of the invention comprises the following steps:
S41, acquiring any one KQI index of all websites, and acquiring the website corresponding to the maximum or minimum KQI index value;
S42, acquiring the KQI index value of each HOST corresponding to the website, and acquiring the HOST corresponding to the maximum or minimum KQI index value;
S43, acquiring the KPI index value of each URL corresponding to the HOST, and acquiring the URL corresponding to the maximum or minimum KPI index value;
S44, acquiring the KPI index value of each server IP corresponding to the URL, and acquiring the server IP corresponding to the maximum or minimum KPI index value.
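Steps S41 to S44 can be sketched as a chain of worst-value selections. The four input tables are hypothetical, pre-aggregated mappings; how the KQI and KPI values are aggregated from the session records is not shown here.

```python
from typing import Callable, Dict, Tuple


def worst(values: Dict[str, float], higher_is_worse: bool = True) -> Tuple[str, float]:
    """Return the (key, value) pair with the worst index value."""
    pick: Callable = max if higher_is_worse else min
    key = pick(values, key=values.get)
    return key, values[key]


def drill_down(site_kqi, host_kqi, url_kpi, ip_kpi, higher_is_worse=True):
    """Follow the worst value down the website -> HOST -> URL -> server IP levels.

    Hypothetical pre-aggregated inputs:
      site_kqi: {site: value}
      host_kqi: {site: {host: value}}
      url_kpi:  {host: {url: value}}
      ip_kpi:   {url: {server_ip: value}}
    """
    site, _ = worst(site_kqi, higher_is_worse)           # S41
    host, _ = worst(host_kqi[site], higher_is_worse)     # S42
    url, _ = worst(url_kpi[host], higher_is_worse)       # S43
    ip, value = worst(ip_kpi[url], higher_is_worse)      # S44
    return site, host, url, ip, value
```

The `higher_is_worse` flag reflects the "maximum or minimum" wording of the steps: for delay-type indexes the maximum is the worst value, whereas for success-rate-type indexes the minimum is. This sketch applies one direction to all four levels for simplicity.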
First, the server obtains a KQI index and a preset weight value of each website, see fig. 1. Alternatively, a website to be analyzed and any one of its KQI indexes are selected directly; in the embodiment of the present invention, the KQI index selected for analysis is the end-to-end time delay of web browsing. Referring to fig. 2, among the above websites, the end-to-end delay of web browsing (KQI index value) of Taobao is 10595.96 ms, which is the smallest among the websites and therefore the best performance.
TABLE 1 End-to-end delay scoring results for web browsing of multiple websites
[Table 1 is an image in the original publication: Figure BDA0001199626140000111]
The server of the embodiment of the invention also scores a plurality of websites on a single KQI index by using the above method. Taking the end-to-end time delay of web browsing as an example, Table 1 shows the scoring results of a plurality of websites.
Referring to Table 1, the website with the worst end-to-end delay of web browsing is the Eastmoney (eastmoney) website, with a delay value of 36326.49 ms.
Referring to Table 2, the end-to-end delay values of web browsing of the HOSTs of the Eastmoney website are obtained by following the vertical hierarchy of the website.
TABLE 2 End-to-end delay scoring results for web browsing of multiple HOSTs of Eastmoney
[Table 2 is an image in the original publication: Figure BDA0001199626140000121]
By comparison, among the 17 HOSTs of the Eastmoney website, the HOST named "hqguba1.eastmoney.com" has the largest end-to-end delay, 58255.66 ms.
The server then continues to analyze the URLs and server IPs by using the KPI method. See Table 3.
TABLE 3 Server IP response delay for Eastmoney
[Table 3 is an image in the original publication: Figure BDA0001199626140000122]
As shown in Table 3, the response time of the server with IP 140.207.213.99 is 302.72 seconds, whereas the normal range is generally about 0.3 to 1 second. Therefore, the large time delay of the Eastmoney website is caused by the large response delay of this server during the measured time period. The website maintainer can then tune and optimize the server corresponding to this IP, thereby improving the quality of the website.
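The comparison against the normal range can be sketched as a simple threshold check. The function name and the choice of 1 second (the upper end of the range quoted above) as the default limit are assumptions of this sketch.

```python
NORMAL_MAX_SECONDS = 1.0  # upper end of the "about 0.3 to 1 second" range quoted above


def slow_servers(ip_delay_seconds, limit=NORMAL_MAX_SECONDS):
    """Return the server IPs whose response delay exceeds the limit, slowest first."""
    slow = {ip: d for ip, d in ip_delay_seconds.items() if d > limit}
    return sorted(slow, key=slow.get, reverse=True)
```

With the value reported above, `slow_servers({"140.207.213.99": 302.72})` returns `["140.207.213.99"]`, which is the server that the maintainer would examine first.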
As can be seen from the above, in the embodiment of the present invention, all session records are associated with websites by using the Referer matching method and the template library matching method, which improves the accuracy and completeness of the session records associated with each website and provides a basis for website quality analysis. Subjective experience scoring and/or objective data scoring is then carried out on the websites by using the KQI method and the KPI method, so that a user can clearly understand the quality of each website and can further locate the root cause of a website's quality problem (in the embodiment of the invention, poor server performance), so that the located server can be adjusted and the quality of the website optimized.
Example two
Fig. 4 shows a website quality analysis apparatus based on Referer and template library matching according to an embodiment of the present invention, and as shown in fig. 4, the apparatus includes:
a session record obtaining module M1, configured to obtain and analyze the total traffic in the link, and match the access request obtained after the analysis with the response data to obtain all session records of all users and all websites;
a Referer session record association module M2, configured to associate, by using a Referer matching method, the session records carrying a Referer field with the corresponding websites;
a template library session record association module M3, configured to match, by using a template library matching method, the session records that do not carry a Referer field against a static resource template library, so that all the session records that do not carry a Referer field are associated with the corresponding websites;
and the scoring module M4 is used for scoring each website after all the session records are associated to the website so as to allow the user to analyze the quality of the website.
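The request/response matching performed by the session record obtaining module M1 can be sketched as follows. The description does not state the pairing key, so the (user, connection id) key, the first-in-first-out pairing, and all field names below are assumptions made for illustration only.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Request:
    user: str
    conn_id: int              # hypothetical connection identifier
    host: str
    url: str
    referer: Optional[str]
    ts_ms: float


@dataclass
class Response:
    user: str
    conn_id: int
    status: int
    ts_ms: float


@dataclass
class SessionRecord:
    user: str
    host: str
    url: str
    referer: Optional[str]
    status: int
    delay_ms: float


def build_session_records(events):
    """Pair each response with the oldest unanswered request on the same connection."""
    pending = {}   # (user, conn_id) -> queue of requests still waiting for a response
    records = []
    for ev in sorted(events, key=lambda e: e.ts_ms):
        key = (ev.user, ev.conn_id)
        if isinstance(ev, Request):
            pending.setdefault(key, deque()).append(ev)
        elif pending.get(key):
            req = pending[key].popleft()
            records.append(SessionRecord(req.user, req.host, req.url,
                                         req.referer, ev.status,
                                         ev.ts_ms - req.ts_ms))
        # Responses with no pending request are dropped in this sketch.
    return records
```

Each resulting record keeps the Referer value (or `None`), which is what the later association modules branch on.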
It should be noted that the session record obtaining module M1 actually obtains the total traffic on the link as collected by the DPI device and then parses it; access requests and response data are extracted from the parsed data and matched, thereby obtaining all session records of all users and all websites. The Referer session record association module M2 associates the session records carrying Referer fields with the corresponding websites, and the template library session record association module M3 matches the session records not carrying Referer fields against the static resource template library, so that all such session records are also associated with the corresponding websites. Finally, the scoring module M4 scores the websites by using the KQI method and/or the KPI method, so that the quality of the websites can be analyzed or the located server IP can be optimized to improve website quality.
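The division of work between modules M2 and M3 can be illustrated with a short sketch. It assumes that a website is identified by looking up the Referer's host name in a hypothetical `host_to_site` table and that the template library is a plain mapping from resource URL to website; neither data structure is specified in the description.

```python
from typing import Optional
from urllib.parse import urlsplit


def site_from_referer(referer: str, host_to_site: dict) -> Optional[str]:
    """Map a Referer URL to a website via its host name (assumed lookup table)."""
    host = urlsplit(referer).hostname
    return host_to_site.get(host) if host else None


def associate(records, host_to_site, template_library):
    """Associate session records with websites.

    records: iterable of dicts with a 'url' key and an optional 'referer' key.
    Records carrying a Referer use Referer matching (M2); the rest are looked
    up in the static resource template library (M3).
    """
    by_site = {}
    unmatched = []
    for rec in records:
        if rec.get("referer"):
            site = site_from_referer(rec["referer"], host_to_site)
        else:
            site = template_library.get(rec["url"])
        if site is None:
            unmatched.append(rec)
        else:
            by_site.setdefault(site, []).append(rec)
    return by_site, unmatched
```

Records that neither method can place are returned separately, which makes it easy to measure how complete the association is.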
In practical applications, since the template library session record association module M3 needs a static resource template library, the website quality analysis apparatus provided in the embodiment of the present invention further includes a static resource template library module M5 (not shown in the figure). The static resource template library module M5 uses a simulated browser to send an access request to each website at regular intervals, and then uses a network packet capture method to obtain the response data of the websites; finally, all session records of each website are formed from the access requests and the corresponding response data and stored in the static resource template library of that website. The static resource template library is thus updated regularly and can compensate for the session records that the Referer session record association module M2 cannot associate, which helps ensure the accuracy and completeness of associating session records that do not carry Referer fields with websites.
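The periodic refresh of the static resource template library can be sketched as below. The simulated browser and packet capture are abstracted behind a caller-supplied `fetch_resources` function, which is hypothetical, as are the class name and the one-hour default interval; the description only says the library is updated at regular intervals.

```python
import time


class StaticResourceTemplateLibrary:
    """Index of resource URL -> website, rebuilt per website at a fixed interval."""

    def __init__(self, fetch_resources, refresh_seconds=3600.0, clock=time.monotonic):
        # fetch_resources(site) stands in for "simulated browser + packet capture"
        # and must return the resource URLs observed when loading that site.
        self._fetch = fetch_resources
        self._refresh = refresh_seconds
        self._clock = clock
        self._url_to_site = {}
        self._last_refresh = {}

    def refresh_if_due(self, sites):
        now = self._clock()
        for site in sites:
            last = self._last_refresh.get(site)
            if last is not None and now - last < self._refresh:
                continue  # this site's entries are still fresh
            # Drop stale entries of this site, then re-index its resources.
            self._url_to_site = {u: s for u, s in self._url_to_site.items() if s != site}
            for url in self._fetch(site):
                self._url_to_site[url] = site
            self._last_refresh[site] = now

    def lookup(self, url):
        """Return the website a resource URL belongs to, or None if unknown."""
        return self._url_to_site.get(url)
```

Injecting the clock and the fetch function keeps the sketch testable without any network access.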
In practical applications, the scoring module M4 obtains all KQI index values of each website (for example, the web browsing end-to-end delay, the web browsing end-to-end speed, the web browsing end-to-end success rate, or the web browsing end-to-end integrity rate) and then calculates the score of each website according to each KQI index value and its preset weight. The score represents the user's subjective perception of the website, i.e., how good or poor the perceived quality of the website is.
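One plausible reading of the weighted scoring is a weighted average, sketched below. This passage does not give the exact formula, so the sketch assumes that each KQI index value has already been converted to a sub-score on a common 0 to 100 scale; the key names and the example weights are hypothetical.

```python
def website_score(kqi_scores, weights):
    """Weighted average of per-KQI sub-scores.

    kqi_scores: KQI name -> sub-score on a common scale (assumed 0 to 100).
    weights:    KQI name -> preset weight (non-negative, not all zero).
    """
    missing = set(weights) - set(kqi_scores)
    if missing:
        raise ValueError("missing KQI sub-scores: %s" % sorted(missing))
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(kqi_scores[name] * w for name, w in weights.items()) / total_weight


# Hypothetical sub-scores and weights for one website.
scores = {"delay": 60, "speed": 80, "success_rate": 100, "integrity_rate": 100}
weights = {"delay": 4, "speed": 2, "success_rate": 2, "integrity_rate": 2}
total = website_score(scores, weights)
```

Dividing by the total weight means the preset weights do not need to sum to one.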
In practical applications, the scoring module M4 further obtains any one KQI index of all websites and identifies the website with the maximum or minimum value of that KQI index; it then obtains the KQI index values of the HOSTs of that website and identifies the HOST with the maximum or minimum KQI index value; next, it obtains the KPI index values (response delay, response success rate, or retransmission packet loss) of all URLs of that HOST and identifies the URL with the maximum or minimum KPI index value; finally, it obtains the KPI index values of all server IPs corresponding to that URL and identifies the server IP with the maximum or minimum KPI index value. The scoring module M4 can therefore locate the specific server IP that causes the change in the website's score, making it convenient for website maintenance personnel to tune that server and thereby improve the quality of the website.
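The four-level drill-down (website, HOST, URL, server IP) can be sketched as repeated selection of the extreme value at each level. The nested-dictionary layout and the function names are assumptions; the Eastmoney delay, HOST name, and server IP in the usage example come from the worked example in this document, while the URL path and all other values are hypothetical.

```python
def worst(values, largest_is_worst=True):
    """Return the key whose metric value is the maximum (or minimum)."""
    pick = max if largest_is_worst else min
    return pick(values, key=values.get)


def locate_server_ip(site_kqi, host_kqi, url_kpi, ip_kpi, largest_is_worst=True):
    """Drill down website -> HOST -> URL -> server IP.

    site_kqi: site -> KQI value
    host_kqi: site -> {host -> KQI value}
    url_kpi:  host -> {url -> KPI value}
    ip_kpi:   (host, url) -> {server IP -> KPI value}
    """
    site = worst(site_kqi, largest_is_worst)
    host = worst(host_kqi[site], largest_is_worst)
    url = worst(url_kpi[host], largest_is_worst)
    ip = worst(ip_kpi[(host, url)], largest_is_worst)
    return site, host, url, ip


site_kqi = {"eastmoney": 36326.49, "site-b": 1200.0}
host_kqi = {"eastmoney": {"hqguba1.eastmoney.com": 58255.66,
                          "www.eastmoney.com": 900.0}}
url_kpi = {"hqguba1.eastmoney.com": {"/hypothetical/a": 250.0,
                                     "/hypothetical/b": 3.0}}
ip_kpi = {("hqguba1.eastmoney.com", "/hypothetical/a"):
          {"140.207.213.99": 302.72, "192.0.2.10": 0.45}}
located = locate_server_ip(site_kqi, host_kqi, url_kpi, ip_kpi)
```

For indexes where a smaller value is worse, such as a success rate, pass `largest_is_worst=False`.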
As for the apparatus embodiment, since it is basically similar to the method embodiment, the description is relatively simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
Example three
The embodiment of the invention also provides a website quality analysis system based on the matching of Referer and the template base, which comprises: deep packet inspection equipment DPI and the website quality analysis device of the second embodiment; the DPI is in communication connection with the website quality analysis device;
the DPI is connected into a link in a serial or mirror mode and used for acquiring the full flow of the link and sending the full flow to the website quality analysis device;
the website quality analysis device is used for acquiring the full traffic of the link, parsing it, associating the session records with websites, and scoring the websites.
For the system embodiment, since it is basically similar to the method and apparatus embodiments, the description is simple, and for the relevant points, refer to the partial description of the method embodiment.
Example four
Fig. 5 is a block diagram showing a website quality analysis apparatus based on Referer and template library matching according to a fourth embodiment of the present application. Referring to fig. 5, the website quality analysis apparatus includes: a processor 501, a memory 502, a communication interface 503, and a bus 504;
wherein,
the processor 501, the memory 502 and the communication interface 503 complete mutual communication through the bus 504;
the communication interface 503 is used for information transmission between communication devices of the website quality analysis apparatus;
the processor 501 is configured to call program instructions in the memory 502 to perform the methods provided by the above-mentioned method embodiments, for example, including: acquiring and analyzing the total flow in the link, and matching the access request obtained after analysis with the response data to obtain all session records of all users and all websites; using a Referer matching method to associate the session records carrying the Referer fields to corresponding websites; matching the session records not carrying the Referer fields with a static resource template library by using a template library matching method, so that all the session records not carrying the Referer fields are associated to corresponding websites; and after all the session records are associated to the websites, scoring each website so as to allow the user to analyze the website quality.
Example five
An embodiment of the present invention discloses a computer program product, which includes a computer program stored on a non-transitory computer readable storage medium, the computer program including program instructions, when the program instructions are executed by a computer, the computer can execute the methods provided by the above method embodiments, for example, the method includes: acquiring and analyzing the total flow in the link, and matching the access request obtained after analysis with the response data to obtain all session records of all users and all websites; using a Referer matching method to associate the session records carrying the Referer fields to corresponding websites; matching the session records not carrying the Referer fields with a static resource template library by using a template library matching method, so that all the session records not carrying the Referer fields are associated to corresponding websites; and after all the session records are associated to the websites, scoring each website so as to allow the user to analyze the website quality.
Example six
Embodiments of the present invention provide a non-transitory computer-readable storage medium, which stores computer instructions, where the computer instructions cause the computer to perform the methods provided by the above method embodiments, for example, the methods include: acquiring and analyzing the total flow in the link, and matching the access request obtained after analysis with the response data to obtain all session records of all users and all websites; using a Referer matching method to associate the session records carrying the Referer fields to corresponding websites; matching the session records not carrying the Referer fields with a static resource template library by using a template library matching method, so that all the session records not carrying the Referer fields are associated to corresponding websites; and after all the session records are associated to the websites, scoring each website so as to allow the user to analyze the website quality.
Various component embodiments of the invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. In the device, the PC remotely controls the equipment or the device through the Internet, and accurately controls each operation step of the equipment or the device. The present invention may also be embodied as apparatus or device programs (e.g., computer programs and computer program products) for performing a portion or all of the methods described herein. The program for realizing the invention can be stored on a computer readable medium, and the file or document generated by the program has statistics, generates a data report and a cpk report, and the like, and can carry out batch test and statistics on the power amplifier.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. does not indicate any ordering. These words may be interpreted as names.
Although the embodiments of the present invention have been described in conjunction with the accompanying drawings, those skilled in the art may make various modifications and variations without departing from the spirit and scope of the invention, and such modifications and variations fall within the scope defined by the appended claims.

Claims (10)

1. A website quality analysis method based on Referer and template library matching is characterized by comprising the following steps:
acquiring and analyzing the total flow in the link, and matching the access request obtained after analysis with the response data to obtain all session records of all users and all websites;
using a Referer matching method to associate the session records carrying the Referer fields to corresponding websites;
matching the session records not carrying the Referer fields with a static resource template library by using a template library matching method, so that all the session records not carrying the Referer fields are associated to corresponding websites;
and after all the session records are associated to the websites, scoring each website so as to allow the user to analyze the website quality.
2. The website quality analysis method according to claim 1, wherein the step of matching the session records not carrying the Referer field with the static resource template library by using the template library matching method, so that all the session records not carrying the Referer field are associated with the corresponding websites, comprises:
sending an access request to each website at regular time by using a simulation browser;
acquiring response data of the website by using a network packet capturing method;
and forming all session records of each website according to the access request and corresponding response data and storing the session records into a static resource template library of the website.
3. The website quality analysis method according to claim 1, wherein the step of scoring each website after all session records are associated to the website comprises:
acquiring all KQI index values of each website;
calculating the score of each website according to each KQI index value and the preset weight value thereof;
the KQI index value is end-to-end time delay of web browsing, end-to-end speed of web browsing, end-to-end success rate of web browsing or end-to-end integrity rate of web browsing.
4. The website quality analysis method according to any one of claims 1 to 3, wherein the step of scoring each website after all the session records are associated to the website further comprises:
acquiring any one KQI index of all websites, and acquiring the website corresponding to the maximum or minimum of any one KQI index of all websites;
acquiring KQI index values of the HOSTs corresponding to the website, and acquiring the HOSTs corresponding to the maximum or minimum KQI index values;
acquiring KPI index values of all URLs corresponding to the HOST, and acquiring URLs corresponding to the KPI index values to be the maximum or the minimum;
acquiring KPI index values of all server IPs corresponding to the URL, and acquiring the server IP corresponding to the maximum or minimum KPI index value;
the KPI is response time delay, response success rate or retransmission packet loss;
each website includes a plurality of HOST, each HOST corresponding to a plurality of URLs, each URL corresponding to a plurality of server IPs.
5. The website quality analysis method according to claim 4, wherein the step of obtaining any one of the KQI indicators of all websites and obtaining the website corresponding to the largest or smallest KQI indicator of all websites can be replaced with the following steps:
and directly selecting the website to be analyzed and any KQI index thereof from all websites.
6. A website quality analysis device based on Referer and template library matching is characterized in that the device comprises:
the session record acquisition module is used for acquiring the total flow in the link for analysis, and matching the access request obtained after analysis with the response data to obtain all session records of all users and all websites;
the Referer session record association module is used for associating the session records carrying the Referer fields to corresponding websites by using a Referer matching method;
the template library session record association module is used for matching the session records not carrying the Referer fields with the static resource template library by using a template library matching method so as to associate all the session records not carrying the Referer fields with the corresponding websites;
and the scoring module is used for scoring each website after all the session records are associated to the website so as to enable a user to analyze the quality of the website.
7. The website quality analysis device of claim 6, further comprising a static resource template library module for performing the steps of:
sending an access request to each website at regular time by using a simulation browser;
acquiring response data of the website by using a network packet capturing method;
and forming all session records of each website according to the access request and corresponding response data and storing the session records into a static resource template library of the website.
8. The website quality analysis device according to claim 6, wherein the scoring module is configured to perform the following steps:
acquiring all KQI index values of each website;
calculating the score of each website according to each KQI index value and the preset weight value thereof;
the KQI index value is end-to-end time delay of web browsing, end-to-end speed of web browsing, end-to-end success rate of web browsing or end-to-end integrity rate of web browsing.
9. The website quality analysis device according to claim 6, wherein the scoring module is further configured to perform the following steps:
acquiring any one KQI index of all websites, and acquiring the website corresponding to the maximum or minimum of any one KQI index of all websites;
acquiring KQI index values of the HOSTs corresponding to the website, and acquiring the HOSTs corresponding to the maximum or minimum KQI index values;
acquiring KPI index values of all URLs corresponding to the HOST, and acquiring URLs corresponding to the KPI index values to be the maximum or the minimum;
acquiring KPI index values of all server IPs corresponding to the URL, and acquiring the server IP corresponding to the maximum or minimum KPI index value;
the KPI is response time delay, response success rate or retransmission packet loss;
each website includes a plurality of HOST, each HOST including a plurality of URLs, each URL including a plurality of server IPs.
10. A website quality analysis system based on Referer and template library matching, the system comprising: a deep packet inspection Device (DPI) and a website quality analysis device according to any one of claims 6 to 9; the DPI is in communication connection with the website quality analysis device;
the DPI is connected into a link in a serial or mirror mode and used for acquiring the full flow of the link and sending the full flow to the website quality analysis device;
the website quality analysis device is used for acquiring the full traffic of the link, parsing it, associating the session records with websites, and scoring the websites.
CN201611260470.7A 2016-12-30 2016-12-30 Website quality analysis method, device and system based on Referer and template library matching Active CN108268370B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611260470.7A CN108268370B (en) 2016-12-30 2016-12-30 Website quality analysis method, device and system based on Referer and template library matching

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611260470.7A CN108268370B (en) 2016-12-30 2016-12-30 Website quality analysis method, device and system based on Referer and template library matching

Publications (2)

Publication Number Publication Date
CN108268370A CN108268370A (en) 2018-07-10
CN108268370B true CN108268370B (en) 2021-06-15

Family

ID=62753752

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611260470.7A Active CN108268370B (en) 2016-12-30 2016-12-30 Website quality analysis method, device and system based on Referer and template library matching

Country Status (1)

Country Link
CN (1) CN108268370B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113448835B (en) * 2020-09-25 2023-10-20 北京新氧科技有限公司 Static resource testing method and device, electronic equipment and storage medium
CN114328184B (en) * 2021-12-01 2024-05-17 重庆长安汽车股份有限公司 Big data cloud testing method based on vehicle-mounted Ethernet architecture

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102361484A (en) * 2011-07-05 2012-02-22 上海交通大学 Passive network performance measuring system and page identification method thereof
CN104301161A (en) * 2013-07-17 2015-01-21 华为技术有限公司 Computing method, computing device and communication system for business quality index
CN104410516A (en) * 2014-11-24 2015-03-11 中国联合网络通信集团有限公司 User-service awareness assessment method and device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160103861A1 (en) * 2014-10-10 2016-04-14 OnPage.org GmbH Method and system for establishing a performance index of websites

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
戴明珠;云南电信移动互联网业务感知提升研究;《中国优秀硕士学位论文全文数据库 经济与管理科学辑》;20160615(第06期);第J155-48页 *

Also Published As

Publication number Publication date
CN108268370A (en) 2018-07-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant