JP6971053B2

JP6971053B2 - Data management equipment, data management methods, and programs

Info

Publication number: JP6971053B2
Application number: JP2017084326A
Authority: JP
Inventors: 康夫遠峯; 彰真吉野; 文十川; 一樹大利
Original assignee: Yahoo Japan Corp
Current assignee: Yahoo Japan Corp
Priority date: 2017-04-21
Filing date: 2017-04-21
Publication date: 2021-11-24
Anticipated expiration: 2037-04-21
Also published as: JP2018181234A

Description

The present invention relates to a data management device, a data management method, and a program.

Conventionally, systems that provide web pages acquire access logs, which record the history of users accessing web pages from terminal devices, and keep them in a storage device. Methods have also been proposed for analyzing the stored access logs to determine, for example, how many times a web page was accessed and information about the users who accessed it (see, for example, Patent Document 1).

Patent Document 1: Japanese Unexamined Patent Publication No. 2014-22821

When such access logs are kept for a long time, they can exhaust the capacity of the storage device that holds them. As a countermeasure, access logs are deleted periodically. However, an access log judged unnecessary at the time of deletion may still be needed for an analysis performed later. A method is therefore needed for keeping the access logs required for analysis while limiting the storage capacity they consume.

In addition, conventional access log analysis processes the access logs themselves. Processing a large access log increases the load on the processing device, and the analysis may take a long time. A method is therefore needed for keeping access logs in a format that is easy to analyze.

The present invention has been made in view of these circumstances, and one of its objects is to provide a data management device, a data management method, and a program capable of keeping log data in a format suited to analysis while limiting storage capacity.

One aspect of the present invention is a data management device comprising: an acquisition unit that acquires log data obtained in response to access by a terminal device; a sampling unit that, for data included in the log data acquired by the acquisition unit for which a first period has elapsed, performs a first sampling process focusing on a first item included in the log data to generate a first sampling log, and performs a second sampling process focusing on a second item included in the log data to generate a second sampling log; and an invalidation unit that invalidates the data, included in the log data acquired by the acquisition unit, for which the first period has elapsed, and that invalidates those of the first and second sampling logs generated by the sampling unit for which a second period longer than the first period has elapsed.

According to one aspect of the present invention, log data can be managed in a format suited to analysis while limiting storage capacity.

FIG. 1 is a configuration diagram showing an example of the data management system 1.
FIG. 2 is a diagram showing an example of the log data L.
FIG. 3 is a diagram showing the concept of data management by the data management device 7.
FIG. 4 is a block diagram showing an example of the functional configuration of the data management device 7.
FIG. 5 is a flowchart showing an example of the flow of processing by the data management device 7.
FIG. 6 is a diagram showing an example of the sampling log SL (behavior log).
FIG. 7 is a diagram showing an example of the sampling log SL (content log).
FIG. 8 is a diagram showing an example of the edit data ED (attribute data).
FIG. 9 is a diagram showing an example of the edit data ED (metadata).

Embodiments of the data management device, data management method, and program of the present invention are described below with reference to the drawings. The data management device of the present invention manages log data acquired in response to access to an electronic page such as a web page: by performing sampling, editing, invalidation, and similar processing on the log data, it sets the retention period and the contents of the retained data as appropriate. Electronic pages may include, in addition to web pages viewed in a browser, app pages viewed by an application program. The following description focuses on web pages.

[Overall configuration]
FIG. 1 is a configuration diagram showing an example of the data management system 1. The data management system 1 includes, for example, one or more terminal devices 3, one or more service providing devices 5, and one or more data management devices 7. The terminal devices 3, the service providing devices 5, and the data management devices 7 are connected to one another by a network NW and communicate over it. The network NW includes, for example, a WAN (Wide Area Network), a LAN (Local Area Network), the Internet, dedicated lines, wireless base stations, and providers.

[Terminal device]
The terminal device 3 is operated by a user of the service provided by the service providing device 5. The terminal device 3 is a computer device such as a personal computer, a mobile phone such as a smartphone, a tablet terminal, or a PDA (Personal Digital Assistant).

On receiving a predetermined operation from the user, the terminal device 3 accesses a web page provided by the service providing device 5 through a preinstalled browser. The web pages provided by the service providing device 5 are, for example, pages that make up a news site, a shopping site, a search site, an auction site, or an SNS (Social Networking Service) site.

[Service providing device]
The service providing device 5 may be a web server device that provides web pages on the Internet, such as the news sites and shopping sites mentioned above, or an application server device that communicates with a terminal device 3 on which an application is running and exchanges various information with it. The service providing device 5 outputs log data L acquired in response to access by the terminal device 3 to the web pages it provides.

FIG. 2 is a diagram showing an example of the log data L. The log data L includes, for example, a "user ID 10" that identifies the user of the terminal device 3, a "content ID 11" that identifies the content associated with the web page, a "date 12" indicating the date on which the access from the terminal device 3 was processed, a "target URL 13" indicating the URL (Uniform Resource Locator) of the web page accessed from the terminal device 3, and a "transition source URL 14" indicating the URL of the web page from which the user moved to the target URL.

The "user ID 10" includes, for example, the login ID when the user is logged in while accessing the service providing device 5 from the terminal device 3. Alternatively, the "user ID 10" includes information about a cookie (HTTP cookie) managed for each web browser on the terminal device 3, the IP address of the terminal device 3, or the like.

The "content ID 11" is, for example, an identifier for the product, service, or other content associated with a web page. For example, when the web page belongs to a shopping site where "product A" can be purchased, a content ID indicating "product A" is associated with that web page. Likewise, when the web page belongs to a news site that provides news about "sport A", a content ID indicating "sport A" is associated with that web page. Data indicating the association between web pages and content may be held, for example, in master tables stored in the service providing device 5 or in an external storage device.

One row of data is added to the log data L, for example, each time a terminal device 3 accesses a web page. The log data L is text data created at any chosen interval, for example daily or weekly.
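The row layout described above can be modelled as a small record type. The following is a minimal sketch, not the format used in the patent: the tab-separated layout, the ISO date format, and the names `LogRecord` and `parse_line` are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class LogRecord:
    """One row of log data L (field names are illustrative)."""
    user_id: Optional[str]       # "user ID 10"; None when the row has no user ID
    content_id: Optional[str]    # "content ID 11"
    accessed_on: date            # "date 12"
    target_url: str              # "target URL 13"
    referrer_url: Optional[str]  # "transition source URL 14"


def parse_line(line: str) -> LogRecord:
    """Parse one tab-separated text line (assumed layout) into a LogRecord."""
    user_id, content_id, day, target, referrer = line.rstrip("\n").split("\t")
    return LogRecord(
        user_id=user_id or None,
        content_id=content_id or None,
        accessed_on=date.fromisoformat(day),
        target_url=target,
        referrer_url=referrer or None,
    )
```

Representing a missing user ID or transition source URL as `None` keeps later filtering steps simple, since an empty field and an absent field are treated the same way.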

[Data management device]
The data management device 7 acquires, for example, the log data L for web pages provided by the service providing device 5 or for app pages viewed by an application program, and manages the acquired log data L.

FIG. 3 is a diagram showing the concept of data management by the data management device 7. The data management device 7 manages the data included in the log data L in three stages: for example, a short-term storage stage, a long-term storage stage, and an indefinite storage stage.

The short-term storage stage keeps the log data L itself. In this stage, for example, the log data L for the past year is kept.

The long-term storage stage keeps a sampling log SL obtained by applying a predetermined sampling process to the log data L. In this stage, for example, the sampling logs SL for the past three years are kept, a longer span than in the short-term storage stage. The sampling log SL includes, for example, a "behavior log AL" obtained by sampling the data in the log data L with a focus on items related to the "user", and a "content log CL" obtained by sampling with a focus on items related to the "content".

The indefinite storage stage keeps edit data ED obtained by applying a predetermined editing process to the log data L. In this stage, for example, no retention period is set and all past edit data ED is kept. The edit data ED includes, for example, "attribute data AD" obtained by editing the data in the log data L with a focus on items related to the "user", and "metadata MD" obtained by editing with a focus on items related to the "content".

FIG. 4 is a block diagram showing an example of the functional configuration of the data management device 7. The data management device 7 includes, for example, an acquisition unit 20, a sampling unit 22, an editing unit 24, an invalidation unit 26, and a storage unit 28. The storage unit 28 includes, for example, a log data storage unit 30, a sampling log storage unit 32, and an edit data storage unit 34. The functional units of the data management device 7 may be distributed across multiple devices. For example, the sampling unit 22 may be implemented on a device separate from the other functional units. The storage unit 28 may be a storage device such as a NAS (Network Attached Storage).

The acquisition unit 20, the sampling unit 22, the editing unit 24, and the invalidation unit 26 are implemented, for example, by a processor such as a CPU (Central Processing Unit) executing a program (software) stored in the storage unit 28. The program may be downloaded from an application server over the network NW, or may be preinstalled on the data management device 7. These functional units may also be implemented in hardware such as an LSI (Large Scale Integration), an ASIC (Application Specific Integrated Circuit), or an FPGA (Field-Programmable Gate Array), or by software and hardware working together.

The storage unit 28 may be implemented, for example, by a RAM (Random Access Memory), a ROM (Read Only Memory), an HDD (Hard Disk Drive), a flash memory, or a hybrid storage device combining two or more of these.

The acquisition unit 20 acquires the log data L from the service providing device 5 over the network NW and stores it in the log data storage unit 30.

The sampling unit 22 performs a sampling process on the portion of the log data L acquired by the acquisition unit 20 for which the first period has elapsed, and generates a sampling log SL. The sampling unit 22 stores the generated sampling log SL in the sampling log storage unit 32. For example, when the log data L is daily text data created each day by the service providing device 5, the sampling unit 22 samples the text data that is one year old or older.

(Sampling process)
The sampling unit 22 samples (thins out) the log data L for which the first period has elapsed. This sampling process includes, for example, a first sampling process that focuses on a first item related to the "user" and a second sampling process that focuses on a second item related to the "content".

The first sampling process, which focuses on the first item related to the "user", includes, for example, extracting from the log data L the rows in which the "user ID 10" item related to the "user" contains data, or extracting the rows whose "user ID 10" has a specific format.

The second sampling process, which focuses on the second item related to the "content", includes, for example, extracting from the log data L the rows containing a "content ID 11" associated with specific content.

The sampling unit 22 is also configured with the proportion of log data to be retained by the sampling process (hereinafter the "sampling rate"). For example, the sampling unit 22 adjusts the amount of the sampling log SL according to the sampling rate. In addition to the row-wise sampling of log data described above, the sampling unit 22 may perform column-wise sampling, item by item.

The sampling unit 22 may also sample the log data L in stages. For example, the sampling unit 22 may perform the first sampling process on the log data L for which the first period has elapsed, and then perform the second sampling process after a further predetermined period. The sampling unit 22 may additionally perform a third sampling process on the data that has undergone the second sampling process, once another predetermined period has elapsed. Sampling in two or more stages in this way makes it possible to set graduated retention periods for the sampling log SL, such as short-term, medium-term, and long-term storage.
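One way to picture staged sampling is as a schedule in which older data is thinned more aggressively. The sketch below is only an illustration under stated assumptions: the ages and fractions in `STAGES`, the use of a hash of a row key, and the function names are all hypothetical and are not taken from the patent.

```python
import hashlib

# Hypothetical schedule: (minimum age in days, fraction of rows kept).
STAGES = [(365, 0.5), (730, 0.25), (1095, 0.1)]


def keep_fraction(age_days: int) -> float:
    """Return the fraction of rows retained for data of the given age."""
    fraction = 1.0  # younger than the first stage: nothing is thinned yet
    for min_age, stage_fraction in STAGES:
        if age_days >= min_age:
            fraction = stage_fraction
    return fraction


def survives(row_key: str, age_days: int) -> bool:
    """Decide deterministically whether a row is still kept at this age.

    Each row gets a fixed pseudo-random bucket in [0, 1). Because the kept
    fraction only shrinks with age, a row dropped at one stage is never
    brought back at a later stage.
    """
    digest = hashlib.sha256(row_key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < keep_fraction(age_days)
```

A fixed per-row bucket is used here so that each stage thins the output of the previous stage, which matches the description of applying a later sampling process to already sampled data.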

The editing unit 24 performs an editing process on the log data L acquired by the acquisition unit 20 and generates edit data ED. The editing unit 24 stores the generated edit data ED in the edit data storage unit 34. For example, when the log data L is daily text data created each day by the service providing device 5, the editing unit 24 edits that text data.

(Editing process)
The editing process includes, for example, a first editing process that focuses on a first item related to the "user" and a second editing process that focuses on a second item related to the "content".

In the first editing process, which focuses on the first item related to the "user", the log data L is processed per "user ID 10" value: for example, the other items are aggregated into, or replaced with, meaning-based attribute columns (attributes), the other items are reordered or invalidated, and the log rows are reordered, so as to generate edit data ED (attribute data AD) in a predetermined format suited to the purpose.

In the second editing process, which focuses on the second item related to the "content", the log data L is processed per "content ID 11" value: for example, the other items are aggregated into, or replaced with, meaning-based attribute columns (attributes), the other items are reordered or invalidated, and the log rows are reordered, so as to generate edit data in a predetermined format suited to the purpose.

The invalidation unit 26 invalidates the data in the log data L stored in the log data storage unit 30 for which the first period has elapsed. Invalidating the log data L means, for example, physically or logically deleting it from the log data storage unit 30. The log data L invalidated here is the log data that has been subjected to the sampling process by the sampling unit 22. The first period is set to, for example, "1 year". When the log data L is daily text data created each day by the service providing device 5, the invalidation unit 26 invalidates the text data that is one year old or older.

The invalidation unit 26 also invalidates the sampling logs SL stored in the sampling log storage unit 32 for which the second period has elapsed. The second period is set longer than the first period, for example to "3 years". In that case, the invalidation unit 26 invalidates the sampling logs SL that were generated by the sampling unit 22 three or more years earlier. When the sampling unit 22 performs staged sampling as described above, the invalidation unit 26 may perform this invalidation for each sampling log SL according to the period set individually for that sampling log SL.
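The two retention periods can be expressed as a simple age check. This is a minimal sketch under assumptions: the example periods "1 year" and "3 years" are approximated as 365 and 1095 days, logs are represented as a mapping from a hypothetical file name to its creation date, and invalidation is modelled as dropping the entry from that mapping.

```python
from datetime import date, timedelta

FIRST_PERIOD = timedelta(days=365)       # raw log data L ("1 year", approximated)
SECOND_PERIOD = timedelta(days=3 * 365)  # sampling log SL ("3 years", approximated)


def expired(created_on: date, today: date, period: timedelta) -> bool:
    """True when the item is at least `period` old on `today`."""
    return today - created_on >= period


def invalidate(files: dict[str, date], today: date, period: timedelta) -> dict[str, date]:
    """Return only the entries that are still within the retention period."""
    return {name: created for name, created in files.items()
            if not expired(created, today, period)}
```

Calling `invalidate(raw_logs, today, FIRST_PERIOD)` and `invalidate(sampling_logs, today, SECOND_PERIOD)` in the same run would apply both rules. In a real system the raw data would need to be sampled before it is dropped, as the description states.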

The log data storage unit 30 stores the log data L acquired by the acquisition unit 20. The sampling log storage unit 32 stores the sampling logs SL generated by the sampling unit 22. The edit data storage unit 34 stores the edit data ED generated by the editing unit 24.

[Processing by the data management device]
Next, the operation of the data management device 7 is described with reference to FIGS. 5 to 9. FIG. 5 is a flowchart showing an example of the flow of processing by the data management device 7. The processing shown in the flowchart of FIG. 5 starts at a predetermined interval, such as daily, weekly, or monthly. Alternatively, it may start in response to an instruction from a user of the data management device 7.

First, the acquisition unit 20 acquires the log data L from the service providing device 5 over the network NW (step S101). The acquisition unit 20 stores the acquired log data L in the log data storage unit 30.

Next, either or both of the sampling process by the sampling unit 22 (step S103) and the editing process by the editing unit 24 (step S105) are performed. Which process is performed is determined by a preset schedule or by a user instruction.

(Sampling process)
The sampling unit 22 reads, from the log data L acquired by the acquisition unit 20 and stored in the log data storage unit 30, the log data L for which the first period has elapsed, performs the sampling process on it, and generates a sampling log SL (step S103). The sampling unit 22 stores the generated sampling log SL in the sampling log storage unit 32.

Here, the sampling unit 22 performs the first sampling process, focusing on the "user ID 10" item related to the "user" among the items in the log data L, and generates a behavior log AL (first sampling log) as the sampling log SL. For example, the sampling unit 22 samples, from the data in the log data L, the rows in which the user ID 10 item contains data.

FIG. 6 is a diagram showing an example of a sampling log SL obtained by sampling, from the log data L shown in FIG. 2, the rows in which the user ID 10 item contains data. As shown in FIG. 6, in the sampling log SL the third and sixth rows of the log data L, which have no data in the user ID 10 item, have been invalidated.
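The first sampling process described here amounts to a filter on the user ID field. The following minimal sketch uses made-up rows (the actual contents of FIG. 2 are not reproduced in the text); only the fact that rows 3 and 6 lack a user ID follows the description, and the name `behavior_log` is hypothetical.

```python
# Hypothetical rows standing in for FIG. 2; rows 3 and 6 have no user ID.
ROWS = [
    {"row": 1, "user_id": "aaaaa", "content_id": "00001"},
    {"row": 2, "user_id": "bbbbb", "content_id": "00001"},
    {"row": 3, "user_id": "", "content_id": "00002"},
    {"row": 4, "user_id": "ccccc", "content_id": "00003"},
    {"row": 5, "user_id": "bbbbb", "content_id": "00100"},
    {"row": 6, "user_id": None, "content_id": "00101"},
    {"row": 7, "user_id": "aaaaa", "content_id": "00102"},
]


def behavior_log(rows):
    """Keep only the rows whose user ID item contains data."""
    return [r for r in rows if r.get("user_id")]


kept = behavior_log(ROWS)
print([r["row"] for r in kept])  # rows 3 and 6 are dropped
```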

The sampling unit 22 also performs the second sampling process, focusing on the "content ID 11" item related to the "content" among the items in the log data L, and generates a content log CL (second sampling log) as the sampling log SL. For example, the sampling unit 22 samples, from the data in the log data L, the rows whose content ID 11 item has a specific format.

FIG. 7 is a diagram showing an example of a sampling log SL obtained by sampling, from the log data L shown in FIG. 2, the rows whose content ID 11 item is in the range "00001" to "00099". As shown in FIG. 7, in the sampling log SL the fifth to seventh rows of the log data L, whose content ID 11 item is outside the range "00001" to "00099", have been invalidated.
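The range condition in this example can be sketched as a numeric comparison on the content ID. The rows below are invented for illustration (only the range "00001" to "00099" and the fact that rows 5 to 7 fall outside it come from the text), and the helper names are hypothetical.

```python
# Hypothetical rows standing in for FIG. 2; rows 5 to 7 are out of range.
ROWS = [
    {"row": 1, "content_id": "00001"},
    {"row": 2, "content_id": "00002"},
    {"row": 3, "content_id": "00050"},
    {"row": 4, "content_id": "00099"},
    {"row": 5, "content_id": "00100"},
    {"row": 6, "content_id": "01234"},
    {"row": 7, "content_id": ""},
]


def in_range(content_id: str, low: int = 1, high: int = 99) -> bool:
    """True when the ID is all digits and its value lies in [low, high]."""
    return content_id.isdigit() and low <= int(content_id) <= high


def content_log(rows):
    """Keep only the rows whose content ID is inside the configured range."""
    return [r for r in rows if in_range(r.get("content_id") or "")]
```

Comparing the numeric value rather than the string avoids surprises if IDs of different lengths ever appear. Whether the original system compares strings or numbers is not stated.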

The sampling unit 22 also adjusts the amount of data in the sampling log SL according to a preset sampling rate. For example, when the sampling rate is set to "30%", the data is adjusted so that 30% of the data in the sampling log SL remains (that is, 70% of the data is invalidated).
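Applying a sampling rate can be illustrated with a uniform random selection of rows. This is a sketch under assumptions: the patent does not say how the surviving 30% is chosen, so uniform random choice with a fixed seed, and the function name `thin`, are illustrative only.

```python
import random


def thin(rows, rate, seed=0):
    """Keep roughly `rate` (0.0 to 1.0) of the rows, preserving their order."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0.0 and 1.0")
    keep = round(len(rows) * rate)
    rng = random.Random(seed)  # fixed seed makes the selection repeatable
    kept_indices = sorted(rng.sample(range(len(rows)), keep))
    return [rows[i] for i in kept_indices]


sampled = thin(list(range(100)), 0.30)
print(len(sampled))  # 30 of the 100 rows remain
```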

(Editing process)
The editing unit 24 performs the editing process on the log data L acquired by the acquisition unit 20 and stored in the log data storage unit 30, and generates edit data ED (step S105). The editing unit 24 stores the generated edit data ED in the edit data storage unit 34.

The editing unit 24 performs the first editing process, focusing on the "user ID 10" item related to the "user" among the items in the log data L, and generates attribute data AD (first edit data) as the edit data ED. The editing unit 24 performs the first editing process according to master data or predetermined processing logic. For example, the editing unit 24 tallies, from the log data L, whether the pages linked from the target URLs associated with each user ID are aimed at men or at women, as well as trends in the content accessed, and, using the user ID as a key, aggregates the other items into, or replaces them with, meaning-based attribute columns (attributes) such as "gender". The association between a "user ID" and its "attributes" may instead be made by referring to the data the user entered when registering as a member of the web page.

FIG. 8 is a diagram showing an example of edit data ED (behavior history data) obtained by performing the first editing process on the log data L shown in FIG. 2. FIG. 8 shows data in which the "user ID" "bbbbb" is associated with "male" as "attribute 1 (gender)" and "sports" as "attribute 2 (hobby)". Editing that focuses on the "user ID 10" item related to the "user" in this way makes it possible to reduce the data volume. Such edit data ED is also in a format suited to analysis, in which statistical data on user behavior is easy to grasp. In addition to or instead of "attribute 1 (gender)" and "attribute 2 (hobby)", the editing unit 24 may add various other attribute information, such as age and access frequency, to the edit data ED.
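The per-user aggregation can be sketched as a majority vote over labels attached to the pages each user visited. Everything concrete below is an assumption for illustration: the `PAGE_MASTER` table, its URLs and labels, the majority-vote rule, and the output field names are hypothetical and are not given in the patent.

```python
from collections import Counter, defaultdict

# Hypothetical master data: target URL -> labels for the linked page.
PAGE_MASTER = {
    "http://example.com/soccer": {"audience": "male", "topic": "sports"},
    "http://example.com/baseball": {"audience": "male", "topic": "sports"},
    "http://example.com/cosmetics": {"audience": "female", "topic": "beauty"},
}


def attribute_data(rows):
    """Collapse many log rows into one attribute row per user ID."""
    audience = defaultdict(Counter)
    topic = defaultdict(Counter)
    for r in rows:
        labels = PAGE_MASTER.get(r.get("target_url"))
        if not r.get("user_id") or labels is None:
            continue  # skip rows without a user ID or without master data
        audience[r["user_id"]][labels["audience"]] += 1
        topic[r["user_id"]][labels["topic"]] += 1
    return {
        uid: {
            "attr1_gender": audience[uid].most_common(1)[0][0],
            "attr2_hobby": topic[uid].most_common(1)[0][0],
        }
        for uid in audience
    }
```

The output holds one row per user regardless of how many accesses were logged, which is where the reduction in data volume comes from.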

The editing unit 24 also performs a second editing process that focuses on the "content ID 11" item, which relates to the "content", among the items included in the log data L, and generates metadata MD (second edited data) as edited data ED. The editing unit 24 performs the second editing process in accordance with master data or predetermined processing logic. For example, using the content ID included in the log data L as a key, the editing unit 24 aggregates or replaces the other items with semantically meaningful attribute columns (attributes) such as "product, service". The editing unit 24 also, for example, tallies the tendencies of the users who accessed each content ID and determines the "target user" attribute.

FIG. 9 is a diagram showing an example of edited data ED (access history data) obtained by performing the second editing process on the log data L shown in FIG. 2. In FIG. 9, the data whose "content ID" is "00001" is associated with "attribute 1 (product, service)" of "sports" and "attribute 2 (target user)" of "male". Performing an editing process that focuses on the "content ID 11" item related to the "content" in this way makes it possible to reduce the data volume. Such edited data ED is also in a format suited to analysis processing, because the state of user access to the web page associated with each content ID is easy to grasp from it. In addition to or instead of "attribute 1 (product, service)" and "attribute 2 (target user)", the editing unit 24 may add various other attribute information, such as the number of accesses, to the edited data ED.
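A comparable sketch for the second editing process follows. The field names, the `user_gender` and `content_category` lookup tables, the extra user IDs, and the majority-vote rule for "target user" are assumptions made for illustration. Only the content ID "00001" and the values "sports" and "male" are taken from FIG. 9.

```python
from collections import Counter, defaultdict


def second_editing_process(log_rows, user_gender, content_category):
    """Collapse per-access log rows into one metadata row per content ID."""
    audience = defaultdict(Counter)
    access_count = Counter()
    for row in log_rows:
        cid = row["content_id"]
        access_count[cid] += 1
        gender = user_gender.get(row["user_id"])
        if gender is not None:
            audience[cid][gender] += 1
    metadata = {}
    for cid, count in access_count.items():
        top = audience[cid].most_common(1)
        metadata[cid] = {
            # Assumed to come from master data keyed by content ID.
            "attribute1_product_service": content_category.get(cid, "unknown"),
            # Assumed rule: the most frequent visitor attribute wins.
            "attribute2_target_user": top[0][0] if top else "unknown",
            "access_count": count,
        }
    return metadata


user_gender = {"bbbbb": "male", "ccccc": "male", "ddddd": "female"}  # hypothetical
content_category = {"00001": "sports"}  # hypothetical master data
log_data = [
    {"user_id": "bbbbb", "content_id": "00001"},
    {"user_id": "ccccc", "content_id": "00001"},
    {"user_id": "ddddd", "content_id": "00001"},
]
metadata = second_editing_process(log_data, user_gender, content_category)
```

The `access_count` column corresponds to the optional "number of accesses" attribute mentioned above.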

When information related to the images displayed on each web page is stored in the log data storage unit 30, the editing unit 24 may generate the edited data ED (metadata MD) based on feature values of an image, such as its color, or on the type of image (for example, whether it is a landscape image or an image of a person).

Next, the invalidation unit 26 invalidates, among the log data L stored in the log data storage unit 30, the log data L for which the first period has elapsed, and invalidates, among the sampling logs SL stored in the sampling log storage unit 32, the sampling logs SL for which the second period has elapsed (step S107). This completes the processing of this flowchart.
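The two-tier retention in step S107 can be sketched as follows. This is an illustration under stated assumptions, not the described implementation: invalidation is modeled simply as dropping records, each record is assumed to carry a `timestamp` field from which both periods are measured, and the concrete period lengths (30 and 365 days) are hypothetical.

```python
from datetime import datetime, timedelta

FIRST_PERIOD = timedelta(days=30)    # assumed retention for log data L
SECOND_PERIOD = timedelta(days=365)  # assumed retention for sampling logs SL


def invalidate(records, now, period):
    """Keep only records for which `period` has not yet elapsed."""
    return [r for r in records if now - r["timestamp"] <= period]


def step_s107(log_data, sampling_logs, edited_data, now):
    kept_logs = invalidate(log_data, now, FIRST_PERIOD)
    kept_samples = invalidate(sampling_logs, now, SECOND_PERIOD)
    # Edited data ED is never invalidated, so it passes through unchanged.
    return kept_logs, kept_samples, edited_data


now = datetime(2021, 1, 1)
log_data = [
    {"id": 1, "timestamp": now - timedelta(days=10)},
    {"id": 2, "timestamp": now - timedelta(days=40)},
]
sampling_logs = [
    {"id": 2, "timestamp": now - timedelta(days=40)},
    {"id": 3, "timestamp": now - timedelta(days=400)},
]
edited_data = {"bbbbb": {"gender": "male"}}
kept_logs, kept_samples, kept_edited = step_s107(log_data, sampling_logs, edited_data, now)
```

In this example the 40-day-old record is gone from the log data but still present in the sampling logs, because the second period is longer than the first.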

According to the embodiment described above, log data can be managed in a format suited to analysis processing while storage volume is kept down. Because the sampling log SL is smaller than the log data L, the volume of data held in the storage unit can be reduced while the necessary information is retained. The sampling log SL can be used when generating a desired model from the log data L. The edited data ED has a format suited to analysis, which reduces the load on the processing device during analysis and also shortens processing time. Furthermore, because the edited data ED is never invalidated, the necessary information can be retained while storage volume is kept down.

In the embodiment described above, the acquisition unit 20, the sampling unit 22, and the editing unit 24 store the log data L, the sampling log SL, and the edited data ED, respectively, in the storage unit 28 provided in the data management device 7. However, these units may instead transmit the log data L, the sampling log SL, and the edited data ED to a storage device located in a region where electricity is inexpensive (for example, a foreign country or a rural area). They may also output the log data L, the sampling log SL, and the edited data ED to an external storage medium such as magnetic tape.

In the embodiment described above, the sampling unit 22 performs the sampling process based on a predetermined sampling rate. However, the sampling unit 22 may check the free space of the storage unit 28 and dynamically change the sampling rate according to that free space.
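One way to vary the sampling rate with free space is sketched below. The linear mapping, the rate bounds (1% to 10%), and the function names are assumptions chosen for illustration, since the specification does not state how the rate depends on free space. `shutil.disk_usage` is the Python standard library call that reports total, used, and free bytes for a path.

```python
import random
import shutil


def free_fraction(path):
    """Fraction of the volume containing `path` that is still free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def sampling_rate_for(fraction_free, min_rate=0.01, max_rate=0.10):
    """Assumed policy: scale the rate linearly between min_rate and max_rate."""
    fraction_free = min(max(fraction_free, 0.0), 1.0)
    return min_rate + (max_rate - min_rate) * fraction_free


def sample(rows, rate, rng):
    """Keep each row independently with probability `rate`."""
    return [row for row in rows if rng.random() < rate]


rows = [{"user_id": f"u{i}"} for i in range(1000)]
rate = sampling_rate_for(0.5)          # e.g. sampling_rate_for(free_fraction("."))
sampled = sample(rows, rate, random.Random(0))
```

With less free space the rate falls toward `min_rate`, so fewer rows are kept in the sampling log.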

In the embodiment described above, the sampling unit 22 performs the sampling process based on a predetermined "first period", and the invalidation unit 26 performs the invalidation process based on a predetermined "first period" and "second period". However, the sampling unit 22 or the invalidation unit 26 may check the free space of the storage unit 28 and dynamically change the "first period" and the "second period" according to that free space.

Modes for carrying out the present invention have been described above by way of embodiments, but the present invention is in no way limited to these embodiments, and various modifications and substitutions can be made without departing from the gist of the present invention.

1: data management system; 3: terminal device; 5: service providing device; 7: data management device; 20: acquisition unit; 22: sampling unit; 24: editing unit; 26: invalidation unit; 28: storage unit; 30: log data storage unit; 32: sampling log storage unit; 34: edited data storage unit; NW: network

Claims

A data management device comprising:
an acquisition unit that acquires log data obtained in response to access by a terminal device;
a sampling unit that, for data in the log data acquired by the acquisition unit for which a first period has elapsed, performs a first sampling process focusing on a first item included in the log data to generate a first sampling log, and performs a second sampling process focusing on a second item included in the log data to generate a second sampling log; and
an invalidation unit that invalidates the data in the log data acquired by the acquisition unit for which the first period has elapsed, and that invalidates, among the first sampling log and the second sampling log generated by the sampling unit, the first sampling log and the second sampling log for which a second period longer than the first period has elapsed,
wherein the first sampling process is a process of extracting, from the data for which the first period has elapsed, data in which data of the first item is present, or a process of extracting, from the data for which the first period has elapsed, data in which the data of the first item is in a specific format, and
the second sampling process is a process of extracting, from the data for which the first period has elapsed, data in which data of the second item is associated with specific content.

The data management device according to claim 1, further comprising an editing unit that performs, on the data included in the log data acquired by the acquisition unit, a first editing process focusing on the first item to generate first edited data, and a second editing process focusing on the second item to generate second edited data,
wherein the first editing process is a process of aggregating or replacing the data of the other items for each piece of data of the first item in the log data, and
the second editing process is a process of aggregating or replacing the data of the other items for each piece of data of the second item in the log data.

The data management device according to claim 1 or 2,
wherein the first item is an item for identifying a user who uses the terminal device, and
the second item is an item for identifying content associated with an electronic page accessed by the terminal device.

The data management device according to claim 2,
wherein the invalidation unit does not invalidate the first edited data and the second edited data generated by the editing unit.

The data management device according to any one of claims 1 to 4,
wherein the sampling unit performs the first sampling process and the second sampling process based on a predetermined sampling rate.

The data management device according to claim 2,
wherein the acquisition unit, the sampling unit, and the editing unit transmit the log data, the sampling log, and the edited data, respectively, to a storage device located in a region where electricity costs are lower than in the region where the data management device is located.

A data management device comprising:
an acquisition unit that acquires log data obtained in response to access by a terminal device; and
an editing unit that, for data in the log data acquired by the acquisition unit for which a first period has elapsed, performs a first editing process focusing on a first item included in the log data to generate first edited data, and performs a second editing process focusing on a second item included in the log data to generate second edited data,
wherein the first editing process is a process of aggregating or replacing the data of the other items for each piece of data of the first item in the log data, and
the second editing process is a process of aggregating or replacing the data of the other items for each piece of data of the second item in the log data.

A data management method in which a computer:
acquires log data obtained in response to access by a terminal device;
for data in the acquired log data for which a first period has elapsed, performs a first sampling process focusing on a first item included in the log data to generate a first sampling log, and performs a second sampling process focusing on a second item included in the log data to generate a second sampling log; and
invalidates the data in the acquired log data for which the first period has elapsed, and invalidates, among the generated first sampling log and second sampling log, the first sampling log and the second sampling log for which a second period longer than the first period has elapsed,
wherein the first sampling process is a process of extracting, from the data for which the first period has elapsed, data in which data of the first item is present, or a process of extracting, from the data for which the first period has elapsed, data in which the data of the first item is in a specific format, and
the second sampling process is a process of extracting, from the data for which the first period has elapsed, data in which data of the second item is associated with specific content.

A program that causes a computer to:
acquire log data obtained in response to access by a terminal device;
for data in the acquired log data for which a first period has elapsed, perform a first sampling process focusing on a first item included in the log data to generate a first sampling log, and perform a second sampling process focusing on a second item included in the log data to generate a second sampling log; and
invalidate the data in the acquired log data for which the first period has elapsed, and invalidate, among the generated first sampling log and second sampling log, the first sampling log and the second sampling log for which a second period longer than the first period has elapsed,
wherein the first sampling process is a process of extracting, from the data for which the first period has elapsed, data in which data of the first item is present, or a process of extracting, from the data for which the first period has elapsed, data in which the data of the first item is in a specific format, and
the second sampling process is a process of extracting, from the data for which the first period has elapsed, data in which data of the second item is associated with specific content.