JP6280092B2

JP6280092B2 - Estimation apparatus and estimation method

Info

Publication number: JP6280092B2
Application number: JP2015219747A
Authority: JP
Inventors: 川口　銀河; 銀河川口; 里衣田行; 史弥小林
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2015-11-09
Filing date: 2015-11-09
Publication date: 2018-02-14
Anticipated expiration: 2035-11-09
Also published as: JP2017091172A

Description

The present invention relates to an estimation apparatus and an estimation method.

As use of the Internet has become widespread, whether a web browser can be used comfortably greatly affects whether the user can use the Internet comfortably.

For this reason, it is important for quality management to grasp and manage the time from when a user operates the browser to view a web page until the browser has loaded and displayed the data (hereinafter referred to as the "web page display time").

To grasp the web page display time, browsers are provided with mechanisms such as Navigation Timing, and methods for obtaining the display time directly have become widespread (Non-Patent Document 1, Non-Patent Document 2).

Honda et al., "Web quality degradation isolation using Navigation Timing API", Communication Quality Study Group, 2014.
"Navigation Timing", [online], [retrieved October 26, 2015], Internet (URL: http://www.w3.org/TR/navigation-timing/)
H. Drucker, et al., "Support Vector Regression Machines", Advances in Neural Information Processing Systems 9, NIPS 1996.

However, the Navigation Timing technique is assumed to be used by content providers (web servers) and users (client terminals), and it is difficult to apply to data observed on a network managed by a carrier responsible for data transfer.

The present invention has been made in view of the above points, and an object thereof is to make it possible to estimate the comfort of a web browser from observation data on a network.

To solve the above problem, an estimation apparatus includes: an estimation model generation unit that causes a predetermined estimation model to learn, for each of a plurality of trials of displaying a certain web page in a web browser, the relative occurrence time within the trial of each of a predetermined number of GET requests taken in order of earliest occurrence, together with the required time, measured for each trial, from when display of the certain web page is instructed until at least the transfer to the web browser of the data necessary for displaying the certain web page is completed; and an estimation unit that applies the predetermined estimation model to the relative occurrence times of the GET requests for the certain web page observed on a network, thereby estimating the required time for those GET requests.

The comfort of a web browser can be estimated from observation data on a network.

A diagram illustrating an example hardware configuration of the estimation apparatus according to an embodiment of the present invention.
A diagram illustrating an example functional configuration of the estimation apparatus according to the embodiment of the present invention.
A flowchart for explaining an example of the processing procedure executed by the estimation apparatus.
A diagram illustrating the HTTP-GET occurrence history that constitutes the learning data.
A diagram illustrating an example of the history of transfer completion times and display completion times that constitutes the learning data.
A diagram illustrating an example of the result of extracting the learning-threshold number of records for each trial.
A diagram illustrating an example of the result of converting the occurrence time of each HTTP-GET into a relative time.
A diagram illustrating an example of the vector of transfer completion times or display completion times for the trials.

Hereinafter, an embodiment of the present invention will be described with reference to the drawings. First, the process of browsing and displaying a web page will be described. Greatly simplified, the process is as follows.
(1) A page acquisition request is generated, triggered by a user click or the like.
(2) The browser requests the html data of the page indicated by the target URL with an HTTP (HyperText Transfer Protocol) GET request, and the html data is transferred to the browser.
(3) The html data references sub-contents (image data, required scripts, etc.); the browser parses these in turn and acquires each of them with an HTTP GET request.
(4) The browser processes and displays the sub-contents as they are acquired.

In ordinary web browsing, when the traffic is not encrypted with https, the transmission status and transmission time of individual HTTP GET requests (hereinafter written as "HTTP-GET") can be grasped by reading the packets exchanged on the network (packet capture). Therefore, if the display processing of (4) is set aside and at least the time required for (1) through (3) can be known (approximately the time from when display of the web page is instructed until the transfer to the browser of the data necessary for display is completed), the display completion time can be estimated in a simplified manner. Hereinafter, this "time required from when display of the web page is instructed until the transfer to the browser of the data necessary for display is completed" is simply referred to as the "transfer completion time". The time required for (1) through (4) is referred to as the "display completion time".

However, identifying the "transfer completion time" by observing HTTP-GETs involves the following two problems.
(a) Depending on how the web page is built, HTTP-GETs continue to occur even after display of the web page is complete, for example because of automatic updates.
(b) When the same web page is displayed repeatedly on the same terminal, the number of HTTP-GETs that occur before display is complete is not necessarily constant.

First, because of (a), the "timing at which HTTP-GETs stop occurring" for a web page cannot be determined merely by observing the occurrence of GET requests on the network; in the first place, completion of the transfer of the display data does not correspond to completion of the HTTP-GET transfers.

In addition, because of (b), it is also difficult to determine in advance a fixed threshold on the number of HTTP-GETs for each web page, observe the HTTP-GET sequence, and judge the HTTP-GET transfers to be complete at the moment the GET count reaches that threshold.

Therefore, in the present embodiment, the transfer completion time or the display completion time is estimated from the HTTP-GET occurrence history at a stage before the transfer is complete (before the completion of (3) above), that is, at a stage partway through (3).

FIG. 1 is a diagram illustrating an example hardware configuration of the estimation apparatus according to the embodiment of the present invention. The estimation apparatus 10 in FIG. 1 includes a drive device 100, an auxiliary storage device 102, a memory device 103, a CPU 104, an interface device 105, and the like, which are connected to one another by a bus B.

A program that implements the processing in the estimation apparatus 10 is provided on a recording medium 101 such as a CD-ROM. When the recording medium 101 storing the program is set in the drive device 100, the program is installed from the recording medium 101 into the auxiliary storage device 102 via the drive device 100. However, the program does not necessarily have to be installed from the recording medium 101; it may instead be downloaded from another computer via a network. The auxiliary storage device 102 stores the installed program as well as necessary files, data, and the like.

When an instruction to start the program is given, the memory device 103 reads the program from the auxiliary storage device 102 and stores it. The CPU 104 executes the functions of the estimation apparatus 10 in accordance with the program stored in the memory device 103. The interface device 105 is used as an interface for connecting to a network.

Note that the estimation apparatus 10 may be composed of a plurality of computers, each having the configuration shown in FIG. 1.

FIG. 2 is a diagram illustrating an example functional configuration of the estimation apparatus according to the embodiment of the present invention. In FIG. 2, the estimation apparatus 10 includes a learning data acquisition unit 11, a threshold determination unit 12, an estimation model generation unit 13, an estimation unit 14, and the like. Each of these units is realized by processing that one or more programs installed in the estimation apparatus 10 cause the CPU 104 to execute. The estimation apparatus 10 also uses a learning data storage unit 15. The learning data storage unit 15 can be realized using, for example, the auxiliary storage device 102 or a storage device that can be connected to the estimation apparatus 10 via a network.

The learning data acquisition unit 11 acquires learning data for estimating the transfer completion time or the display completion time of a web page to be evaluated (hereinafter referred to as the "evaluation target page") identified by a preset URL. If the estimation target is the transfer completion time, learning data on the transfer completion time is acquired; if the estimation target is the display completion time, learning data on the display completion time is acquired. Which of the two is the estimation target is set in advance. The learning data is acquired by performing, in parallel, the acquisition of network capture data and the detection of timings such as the completion of data transfer for the web page in the web browser or the completion of display of the web page. For display instructions and displays of the same evaluation target page that are repeated a plurality of times, the learning data acquisition unit 11 acquires learning data indicating the page transfer completion time or the page display completion time, and the occurrence history of the transferred HTTP-GETs (HTTP GET requests). The acquired learning data is stored in the learning data storage unit 15.

Based on the learning data, the threshold determination unit 12 determines the number of HTTP-GETs (the threshold) used for estimating the transfer completion time or the display completion time.

The estimation model generation unit 13 causes a statistical estimation model to learn the learning data that falls within the threshold determined by the threshold determination unit 12.

The estimation unit 14 applies the estimation model that has learned the learning data to observation data observed on the network (an occurrence history of GET requests), and estimates whichever of the transfer completion time and the display completion time has been set as the estimation target.

Hereinafter, the processing procedure executed by the estimation apparatus 10 will be described. FIG. 3 is a flowchart for explaining an example of the processing procedure executed by the estimation apparatus.

In step S101, the learning data acquisition unit 11 causes a browser in the estimation apparatus 10 to display the evaluation target page whose URL has been set in advance, and acquires the HTTP-GET occurrence history at that time together with the actual value of the transfer completion time or the display completion time of the evaluation target page as measured by the browser. The learning data acquisition unit 11 stores the acquired data (learning data) in the learning data storage unit 15. The HTTP-GET occurrence history covers the period from the start of the transfer to the transfer completion time. That is, HTTP-GETs that occur after the transfer completion time are not included in the occurrence history.

The HTTP-GET occurrence history may be acquired, for example, by capturing packet data with a packet capture tool such as tcpdump in the estimation apparatus 10 and then converting the packet data into the number and times of HTTP-GET occurrences with a capture analysis tool such as tshark. For example, if the capture data file is named dump.pcap, the number and times of HTTP-GET occurrences can be obtained by executing the following command.
% tshark -r dump.pcap -Y http.request.uri -T fields -e frame.time_epoch -e ......
Alternatively, a proxy may be set up in the estimation apparatus 10, and a log of HTTP-GET time information may be acquired from the proxy log recorded when the browser displays the web page.
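As a rough illustration of turning such tool output into an occurrence history, the following sketch parses text lines into time-ordered records. It assumes each line carries the epoch timestamp as the first tab-separated field (tab is the default separator of tshark's `-T fields` output); the remaining fields are treated as an opaque string because the field list in the command above is elided. The function name `parse_get_history` and the sample lines are hypothetical.

```python
from typing import List, Tuple


def parse_get_history(lines: List[str]) -> List[Tuple[float, str]]:
    """Parse 'epoch<TAB>rest' lines into (time, request) records sorted by time."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        # Assumption: the first tab-separated field is frame.time_epoch.
        epoch, _, rest = line.partition("\t")
        records.append((float(epoch), rest))
    records.sort(key=lambda r: r[0])  # earliest GET first
    return records


# Hypothetical sample lines, not real capture output.
sample = ["2345.999\t/style.css\n", "2345.333\t/index.html\n", "\n"]
history = parse_get_history(sample)
```

Sorting by time up front keeps the later steps simple, since they only need the earliest records of each trial.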

Meanwhile, the transfer completion time and the display completion time of the evaluation target page in the browser can be measured and quantified using, for example, the Navigation Timing API (Application Program Interface).

One execution of step S101 is referred to as a "trial". Step S101 is repeated for a preset number of trials (for example, 101 times). As a result, learning data composed of the information shown in FIGS. 4 and 5 is acquired.

FIG. 4 is a diagram illustrating the HTTP-GET occurrence history that constitutes the learning data. In FIG. 4, the HTTP-GET occurrence history includes a trial number, a GET request, a time, and the like for each HTTP-GET detected in each trial.

The trial number is a value indicating in which trial the GET request was detected. The GET request is a character string indicating the content of the GET request. The time is the time at which the GET request occurred (was detected). Because the number of GET requests can differ from trial to trial, the number of rows per trial in FIG. 4 can also differ.

FIG. 5 is a diagram illustrating an example of the history of transfer completion times and display completion times that constitutes the learning data. FIG. 5 shows the transfer completion time and the display completion time of the evaluation target page in each trial. When the estimation target is the transfer completion time, only the transfer completion time may be acquired; when it is the display completion time, both the transfer completion time and the display completion time may be acquired.

In FIGS. 4 and 5, records with the same trial number are learning data for the same trial. For example, the HTTP-GET occurrence history for trial 2 in FIG. 5 (the trial whose trial number is 2) can be identified by referring to the records whose trial number is 2 in FIG. 4.

The trials of displaying the evaluation target page may be executed on a terminal other than the estimation apparatus 10. In that case, the learning data only needs to be input in step S101.

Subsequently, the threshold determination unit 12 determines the threshold by applying a preset threshold determination method and parameters to the learning data (step S102).

First, the threshold determination unit 12 counts the number of records for each trial in the HTTP-GET occurrence history (FIG. 4) and sorts the counts in ascending order. In the example of FIG. 4, the count for trial 1 is 4 and the count for trial 2 is 5.

Subsequently, the threshold determination unit 12 determines the threshold based on the preset threshold determination method and parameters. In the present embodiment, with the "lower percentile of the distribution" method, the value at a predetermined quantile, counted from the small end of the sorted counts, is determined as the threshold.

For example, if the threshold determination method is the "lower 5th percentile of the distribution", the value at the 5% position from the minimum of the sorted result (that is, the sixth value from the minimum in the case of 101 trials) is determined as the threshold.

The reason is that, if the minimum value were used and the samples included, for example, a trial that ended abnormally, the minimum would become extremely small and an adequate amount of data for estimation might not be obtained. A fixed quantile is therefore extracted instead.

The threshold determined as described above is referred to as the "learning threshold N".
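The threshold determination of step S102 can be sketched as follows. This is a minimal sketch under stated assumptions: the occurrence history is given as (trial number, time) pairs, and the quantile position is computed as the ceiling of (number of trials × percentile / 100). That rank rule is an assumption chosen because it reproduces the example in the text (the sixth value for 101 trials at the 5th percentile); the source does not give a general formula. The name `learning_threshold` is hypothetical.

```python
import math
from collections import Counter
from typing import Iterable, Tuple


def learning_threshold(records: Iterable[Tuple[int, float]],
                       percentile: float = 5.0) -> int:
    """Return the GET count at the given lower percentile of per-trial counts."""
    # Count the records (HTTP-GETs) per trial and sort the counts ascending.
    counts = sorted(Counter(trial for trial, _ in records).values())
    # Assumed rank rule: 1-based position ceil(trials * percentile / 100).
    rank = max(1, math.ceil(len(counts) * percentile / 100.0))
    return counts[rank - 1]


# Illustrative data: 101 trials, where trial t produced t + 3 GET requests.
records = [(t, float(i)) for t in range(1, 102) for i in range(t + 3)]
N = learning_threshold(records)  # sixth-smallest count among 101 trials
```

Picking a low quantile rather than the minimum keeps one abnormally short trial from shrinking N for every other trial.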

Subsequently, for each trial in the HTTP-GET occurrence history (FIG. 4), the estimation model generation unit 13 extracts as many records as the learning threshold N, starting from the smallest time value (that is, the records for the N GET requests that occurred earliest) (step S103). No records are extracted for trials whose number of records is less than the learning threshold N.

FIG. 6 is a diagram illustrating an example of the result of extracting the learning-threshold number of records for each trial. FIG. 6 shows, for each trial from which records were extracted, the trial number and the occurrence times of the N HTTP-GETs. FIG. 6 corresponds to an example in which M trials contained at least N records.

Subsequently, for each trial from which records were extracted, the estimation model generation unit 13 converts the occurrence time of each HTTP-GET into a value relative to the occurrence time of the first HTTP-GET of that trial (a relative time) (step S104). That is, for each GET-X time (X = 1 to N) of each trial shown in FIG. 6, the difference from the GET-1 time (the relative occurrence time within the trial) is calculated.

FIG. 7 is a diagram illustrating an example of the result of converting the occurrence time of each HTTP-GET into a relative time. FIG. 7 shows the result of converting the HTTP-GET occurrence times of each trial shown in FIG. 6 into relative times. The GET-1 time is discarded because its relative time is always 0 in every trial and therefore carries no information. As a result, the number of columns for each trial is N-1.
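Steps S103 and S104 can be sketched together as follows. The sketch assumes the occurrence history is held as a mapping from trial number to a list of GET occurrence times; the function name `relative_time_rows` and the sample values are hypothetical, and rounding to six decimals is only there to suppress floating-point noise.

```python
from typing import Dict, List


def relative_time_rows(history: Dict[int, List[float]],
                       n: int) -> Dict[int, List[float]]:
    """Keep the first n GET times per trial and convert them to relative times."""
    rows = {}
    for trial, times in history.items():
        if len(times) < n:
            continue  # trials with fewer than N records are not used
        first_n = sorted(times)[:n]  # the N earliest GET requests
        base = first_n[0]            # GET-1 time of this trial
        # GET-1 itself is dropped (always 0), leaving N-1 values per trial.
        rows[trial] = [round(t - base, 6) for t in first_n[1:]]
    return rows


# Illustrative history: trial 2 has too few GETs for n = 3 and is skipped.
history = {
    1: [100.000, 100.032, 100.233],
    2: [200.0, 200.2],
    3: [300.5, 300.0, 301.0, 305.0],
}
rows = relative_time_rows(history, n=3)
```

Each surviving trial yields one row of N-1 relative times, which corresponds to one row of the table in FIG. 7.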

Subsequently, the estimation model generation unit 13 generates a vector of the transfer completion times or the display completion times of the trials (those in which the number of GETs was N or more) (step S105).

FIG. 8 is a diagram illustrating an example of the vector of transfer completion times or display completion times for the trials. In FIG. 8, (1) is a vector of transfer completion times generated from the learning data shown in FIG. 5, and (2) is a vector of display completion times generated from the learning data shown in FIG. 5. If the estimation target is the transfer completion time, only the vector of transfer completion times may be generated; if the estimation target is the display completion time, only the vector of display completion times may be generated.

Subsequently, the estimation model generation unit 13 generates an estimation model based on a preset estimation method (step S106). The present embodiment describes an example in which support vector regression (SVR) is set as the estimation method. However, the estimation method applicable to the present embodiment is not limited to SVR, and other methods may be used. For example, multiple regression may be used. Note that SVR itself is merely a known machine learning technique; the point of the present embodiment lies in how the input data is used and how it is mapped to the output.

Here, the learning data (X) is configured as follows.
X = (X_1, X_2, ..., X_(N-1))
where
X_1 = [vector of length M of the relative times of GET-2 in each trial]
= [0.032, 0.200, ..., 0.222] (example of FIG. 7)
X_(N-1) = [0.233, 0.668, ..., 1.223]
The learning data (Y) is configured as follows.
When the transfer completion time is the estimation target,
Y = [vector of length M of transfer completion times] = [2.5, 2.9, ..., 4.1]
When the display completion time is the estimation target,
Y = [vector of length M of display completion times] = [3.1, 3.6, ..., 4.7]
The estimation model generation unit 13 generates an SVR model E and trains it with the learning data (X) and the learning data (Y) configured as described above. That is, the model E is caused to learn the relationship between X and Y.

For example, following the usage of SVR in the scikit-learn library (http://scikit-learn.org), which provides SVR functionality, the model E can be generated and trained with the following code.
from sklearn.svm import SVR  # SVR class provided by scikit-learn
E = SVR()  # generate the model
E.fit(X, Y)  # train with the learning data
Subsequently, the estimation unit 14 applies the model E to observation data on the network and estimates the transfer completion time or the display completion time for that observation data (step S107).

Specifically, the estimation unit 14 acquires a history of the occurrence times of the HTTP-GETs for the evaluation target page observed in time series on the network. For these occurrence times, the estimation unit 14 calculates the relative occurrence time (relative time) of each HTTP-GET from the occurrence time of the first HTTP-GET. As a result, a vector X_new of relative times is obtained.

For example, suppose that the history of the observed HTTP-GET occurrence times is as follows.
GET occurrence time history: (2345.333, 2345.999, 2346.500, 2347.000)
In this case, the value of X_new is as follows.
X_new = [0.666, 1.167, ..., 1.667]
The estimation unit 14 applies the estimation model E to X_new and obtains an estimated value T_e.
T_e = E.predict(X_new)
The value of T_e estimated in this way (for example, 3.8) is the estimated value of the "transfer completion time" or the "display completion time" for the observation data. That is, if the learning data (Y) is a vector of transfer completion times, an estimated value of the transfer completion time is obtained. If the learning data (Y) is a vector of display completion times, an estimated value of the display completion time is obtained.
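Training and estimation can be sketched end to end as follows, assuming scikit-learn is installed. The training values are illustrative and not measured data, and N = 4 is chosen only so that the observed history above yields N-1 = 3 features. One practical point: scikit-learn expects the input as one row per sample (here, one row per trial with N-1 columns), which is the transpose of the column-vector notation X = (X_1, ..., X_(N-1)) used above, and `predict` likewise takes a 2-D input.

```python
from sklearn.svm import SVR

# Illustrative (not measured) relative times: one row per trial, N-1 = 3 columns.
X_train = [
    [0.032, 0.120, 0.233],
    [0.200, 0.410, 0.668],
    [0.150, 0.300, 0.500],
    [0.222, 0.700, 1.223],
]
# Illustrative transfer completion times, one per trial (same order as X_train).
Y_train = [2.5, 2.9, 2.7, 4.1]

E = SVR()                # generate the model with default parameters
E.fit(X_train, Y_train)  # learn the relationship between X and Y

# Observed GET occurrence times from the example in the text.
observed = [2345.333, 2345.999, 2346.500, 2347.000]
# Relative times from the first GET; the first GET itself (always 0) is dropped.
X_new = [[t - observed[0] for t in observed[1:]]]
T_e = E.predict(X_new)[0]  # estimated completion time for this observation
```

With only four toy trials the estimate has no practical meaning; the sketch only shows the shapes of the inputs and the order of the calls.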

Note that estimated values of both the transfer completion time and the display completion time may be obtained.

In addition, an estimation model may be generated in advance for each of a plurality of URLs, and the transfer completion time or the display completion time may be estimated using the estimation model that corresponds to the URL of the GET requests observed on the network.
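One way to organize per-URL models is a simple lookup table keyed by URL, sketched below. Everything here is hypothetical: the function `estimate_for_url`, the example URL, and the stand-in model (a plain function in place of a trained estimator) are for illustration only.

```python
from typing import Callable, Dict, List, Optional

# A model maps a vector of relative GET times to an estimated completion time.
Model = Callable[[List[float]], float]


def estimate_for_url(models: Dict[str, Model], url: str,
                     get_times: List[float]) -> Optional[float]:
    """Estimate the completion time with the model registered for this URL."""
    model = models.get(url)
    if model is None or not get_times:
        return None  # no model was trained for this URL, or nothing observed
    base = get_times[0]
    # Relative times from the first GET; the first GET itself is dropped.
    return model([t - base for t in get_times[1:]])


# Stand-in model for illustration only; a trained estimator would go here.
models: Dict[str, Model] = {"http://example.com/": lambda x: 2.0 + sum(x)}
estimate = estimate_for_url(models, "http://example.com/", [10.0, 10.5, 11.0])
```

Returning None for an unknown URL leaves the decision about untrained pages to the caller.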

As described above, according to the present embodiment, the comfort of a web browser can be estimated from observation data on a network.

In the present embodiment, the evaluation target page is an example of the certain web page. The transfer completion time and the display completion time are examples of the required time, measured for each trial, from when display of the certain web page is instructed until at least the transfer to the web browser of the data necessary for displaying the certain web page is completed.

Although an embodiment of the present invention has been described in detail above, the present invention is not limited to this specific embodiment, and various modifications and changes are possible within the scope of the gist of the present invention described in the claims.

DESCRIPTION OF SYMBOLS
10 Estimation apparatus
11 Learning data acquisition unit
12 Threshold determination unit
13 Estimation model generation unit
14 Estimation unit
15 Learning data storage unit
100 Drive device
101 Recording medium
102 Auxiliary storage device
103 Memory device
104 CPU
105 Interface device
B Bus

Claims

An estimation apparatus comprising:
an estimation model generation unit that causes a predetermined estimation model to learn, for each of a plurality of trials of displaying a certain web page in a web browser, the relative occurrence time within the trial of each of a predetermined number of GET requests taken in order of earliest occurrence, together with the required time, measured for each trial, from when display of the certain web page is instructed until at least the transfer to the web browser of the data necessary for displaying the certain web page is completed; and
an estimation unit that estimates the required time for the GET requests by applying the predetermined estimation model to the relative occurrence time of each GET request for the certain web page observed on a network.

The estimation apparatus according to claim 1, further comprising a threshold determination unit that determines a threshold based on the number of GET requests in each of the plurality of trials,
wherein the estimation model generation unit causes the predetermined estimation model to learn, for each trial in which the number of GET requests is equal to or greater than the threshold, the relative occurrence time within the trial of each of the threshold number of GET requests taken in order of earliest occurrence, together with the required time, measured for each trial, from when display of the certain web page is instructed until at least the transfer to the web browser of the data necessary for displaying the certain web page is completed.

An estimation method in which a computer executes:
an estimation model generation procedure of causing a predetermined estimation model to learn, for each of a plurality of trials of displaying a certain web page in a web browser, the relative occurrence time within the trial of each of a predetermined number of GET requests taken in order of earliest occurrence, together with the required time, measured for each trial, from when display of the certain web page is instructed until at least the transfer to the web browser of the data necessary for displaying the certain web page is completed; and
an estimation procedure of estimating the required time for the GET requests by applying the predetermined estimation model to the relative occurrence time of each GET request for the certain web page observed on a network.

The estimation method according to claim 3, wherein the computer further executes a threshold determination procedure of determining a threshold based on the number of GET requests in each of the plurality of trials, and
the estimation model generation procedure causes the predetermined estimation model to learn, for each trial in which the number of GET requests is equal to or greater than the threshold, the relative occurrence time within the trial of each of the threshold number of GET requests taken in order of earliest occurrence, together with the required time, measured for each trial, from when display of the certain web page is instructed until at least the transfer to the web browser of the data necessary for displaying the certain web page is completed.