JP2006185279A

JP2006185279A - Device and method for grasping accessing party

Info

Publication number: JP2006185279A
Application number: JP2004379553A
Authority: JP
Inventors: Toshiaki Ejiri; 俊章江尻
Original assignee: KAN KK
Current assignee: KAN KK
Priority date: 2004-12-28
Filing date: 2004-12-28
Publication date: 2006-07-13

Abstract

PROBLEM TO BE SOLVED: To provide a program that automatically performs marketing analysis of a website, in particular analysis of access sources.

SOLUTION: Access log information is acquired, and the host name of each access source is extracted from it and converted into a URL. The top page at that URL is fetched automatically, and its title character string is acquired and stored in association with the domain name. Separately, the host name of the access source is extracted from the access log information, the domain name is extracted from the host name, and the corresponding organization name is acquired by referring to a domain name/organization name correspondence table prepared in advance. The URL of the requested page is acquired from the access log information, and it is determined whether the link source is a search engine. If it is, the search keyword is acquired from the referrer of the data, and a cross tabulation of organization name, search keyword, and browsed page is executed.

COPYRIGHT: (C)2006,JPO&NCIPI

Description

The present invention relates to an apparatus and a method for grasping the access source of users who access a website.

There have long been businesses that provide website analysis as a service and offer consulting based on it.

However, there are few useful programs that perform this analysis automatically.
JP 2004-070677 A

An object of the present invention is to provide a program that automates website marketing, in particular the analysis of access sources.

The access source grasping device of the present invention exists on the Internet and analyzes a website to grasp information about access sources. It comprises: access log information acquisition means for acquiring access log information; access source host name extraction means for extracting the host name of an access source based on the information acquired by the access log information acquisition means; host name URL conversion means for extracting a domain name from the host name extracted by the access source host name extraction means and converting it into a URL; top page information acquisition means for automatically accessing the URL produced by the host name URL conversion means and acquiring top page information; title character string acquisition means for acquiring a title character string from the information acquired by the top page information acquisition means; and domain name/organization name correspondence table storage means for storing the character string acquired by the title character string acquisition means in association with the domain name.

The invention described in claim 2 is an access source grasping device that exists on the Internet and analyzes a website to grasp information about access sources. It comprises: access log information acquisition means for acquiring access log information; access source host name extraction means for extracting the host name of an access source based on the information acquired by the access log information acquisition means; domain name extraction means for extracting a domain name from the host name extracted by the access source host name extraction means; organization name acquisition means for acquiring the organization name corresponding to the domain name extracted by the domain name extraction means by referring to a domain name/organization name correspondence table prepared in advance; link source determination means for acquiring the URL of the requested page from the access log information and determining whether the link source is a search engine; search keyword acquisition means for acquiring a search keyword from the referrer of the data when the link source determination means determines that the link source is a search engine; and cross tabulation execution means for executing a cross tabulation of organization name, search keyword, and browsed page.

The access source grasping method of the present invention is a method performed by an access source grasping device that exists on the Internet and analyzes a website to grasp information about access sources. It comprises: an access log information acquisition step of acquiring access log information; an access source host name extraction step of extracting the host name of an access source based on the information acquired in the access log information acquisition step; a host name URL conversion step of extracting a domain name from the host name extracted in the access source host name extraction step and converting it into a URL; a top page information acquisition step of automatically accessing the URL produced in the host name URL conversion step and acquiring top page information; a title character string acquisition step of acquiring a title character string from the information acquired in the top page information acquisition step; and a domain name/organization name correspondence table storage step of storing the character string acquired in the title character string acquisition step in association with the domain name.

The invention described in claim 4 is an access source grasping method performed by an access source grasping device that exists on the Internet and analyzes a website to grasp information about access sources. It comprises: an access log information acquisition step of acquiring access log information; an access source host name extraction step of extracting the host name of an access source based on the information acquired in the access log information acquisition step; a domain name extraction step of extracting a domain name from the host name extracted in the access source host name extraction step; an organization name acquisition step of acquiring the organization name corresponding to the domain name extracted in the domain name extraction step by referring to a domain name/organization name correspondence table prepared in advance; a link source determination step of acquiring the URL of the requested page from the access log information and determining whether the link source is a search engine; a search keyword acquisition step of acquiring a search keyword from the referrer of the data when the link source is determined to be a search engine in the link source determination step; and a cross tabulation execution step of executing a cross tabulation of organization name, search keyword, and browsed page.

The advantage of the present invention is that website marketing can be carried out automatically.

FIG. 10 is a diagram showing the hardware configuration of the present invention. Assume that the website to be diagnosed is hosted on server 10. The access source grasping device according to the present invention is realized by server 20, which loads the necessary computer program and thereby provides the device's functions. Many users access the website on server 10 from terminal computers 101, 102, 103, 104, and so on, generally via host computers (not shown). In the access source grasping device, server 20 obtains the access log information held on server 10 and analyzes it to grasp the access sources.

FIG. 1 is a flowchart of the program that grasps the organization name of an access source. After the start (step 500), the access log information of the site to be diagnosed is acquired and input (step 510). The host computer name of the accessing computer is then extracted from the access log information (step 520). A company that operates a host computer often has a website at a URL that can be inferred from the host name. On that basis, the domain name is extracted from the host name and converted into a URL (step 530). Each such URL is accessed automatically, and the top page information is collected (step 540). If a top page exists (YES in step 550), the title character string of the page is acquired (step 560), and if it is acquired successfully (OK in step 570), the domain name and the character string are stored (step 580). If no top page was found in step 550, or no character string could be acquired in step 570, an error is recorded (step 610). These steps are repeated for every entry in the access log information (via steps 590 and 600, back to step 520). When all access log information has been processed, the program ends (step 620). The acquired organization names are accumulated as a table keyed to domain names and stored in a searchable form.
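The flow of FIG. 1 could be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the program disclosed here: it assumes each log line begins with the resolved host name of the client, it derives the domain name by simply dropping the first label of the host name, and the names `extract_title`, `fetch_top_page`, and `build_title_table` are hypothetical.

```python
from html.parser import HTMLParser
from typing import Callable, Iterable, Optional
from urllib.request import urlopen


class TitleParser(HTMLParser):
    """Collects the text of the first <title> element."""

    def __init__(self) -> None:
        super().__init__()
        self.in_title = False
        self.done = False
        self.parts: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self.done:
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self.in_title:
            self.in_title = False
            self.done = True

    def handle_data(self, data):
        if self.in_title:
            self.parts.append(data)


def extract_title(html: str) -> Optional[str]:
    """Steps 560/570: return the whitespace-normalized title, or None."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    title = " ".join("".join(parser.parts).split())
    return title or None


def fetch_top_page(url: str) -> Optional[str]:
    """Steps 540/550: fetch the top page, or None if it cannot be retrieved.

    Assumption: the page is decoded as UTF-8; real pages may need
    charset detection.
    """
    try:
        with urlopen(url, timeout=10) as response:
            return response.read().decode("utf-8", errors="replace")
    except (OSError, ValueError):
        return None


def build_title_table(
    log_lines: Iterable[str],
    fetch: Callable[[str], Optional[str]] = fetch_top_page,
) -> tuple[dict[str, str], list[str]]:
    """Return (domain -> title table, list of domains that produced an error)."""
    table: dict[str, str] = {}
    errors: list[str] = []
    seen: set[str] = set()
    for line in log_lines:                       # loop of steps 520 to 600
        if not line.strip():
            continue
        host = line.split()[0]                   # step 520 (assumed log layout)
        if host.replace(".", "").isdigit():
            continue                             # unresolved IPv4 address, no host name
        domain = host.split(".", 1)[-1]          # step 530, naive: drop the first label
        if domain in seen:
            continue
        seen.add(domain)
        html = fetch(f"http://www.{domain}/")    # step 540
        title = extract_title(html) if html is not None else None
        if title is None:
            errors.append(domain)                # step 610
        else:
            table[domain] = title                # step 580
    return table, errors
```

Passing the fetch function as a parameter keeps the loop testable without network access; in a real run the default `fetch_top_page` would be used. Dropping the first label is only a rough heuristic and will misfire on host names that are already bare domain names.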

FIG. 2 is a flowchart of the program that puts the organization names of access sources to use. After the start (step 700), data is input from the log (step 710). The host name of the access source is extracted from the data (step 720), and the domain name is extracted from the host name (step 730). If the extracted domain name is found in the domain name/organization name correspondence table (YES in step 740), the organization name is acquired (step 750) and the URL of the requested (browsed) page is acquired from the data (step 760). If the link source is a search engine (YES in step 770), the search keyword is acquired from the referrer of the data (step 780). These steps are repeated for all of the access log information (from step 790 via step 800 back to step 720). When all of the information has been processed, a cross tabulation of organization name against search keyword and browsed page is executed (step 810), and the program ends (step 820). This program makes it possible to grasp which organizations visit the website being diagnosed, which keywords bring them there, and which pages they browse.
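As a rough illustration of the FIG. 2 flow, the sketch below counts (organization name, search keyword, browsed page) combinations. It rests on several assumptions that are not part of this disclosure: the log is in a combined-log-like format with the referrer in quotes, the `SEARCH_ENGINES` table of hosts and query parameters is a hypothetical example, and the domain name is again taken by dropping the first label of the host name.

```python
import re
from collections import Counter
from typing import Iterable, Optional
from urllib.parse import parse_qs, urlsplit

# Assumed log layout: host, identity, user, [time], "METHOD page PROTO", status, size, "referrer"
LOG_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]*\] "\S+ (?P<page>\S+)[^"]*" \d{3} \S+ "(?P<referrer>[^"]*)"'
)

# Hypothetical table: search engine host -> query parameter that carries the keyword.
SEARCH_ENGINES = {
    "www.google.com": "q",
    "search.yahoo.co.jp": "p",
}


def search_keyword(referrer: str) -> Optional[str]:
    """Steps 770/780: return the keyword if the referrer is a known search engine."""
    parts = urlsplit(referrer)
    param = SEARCH_ENGINES.get(parts.hostname or "")
    if param is None:
        return None
    values = parse_qs(parts.query).get(param)
    return values[0] if values else None


def cross_tabulate(log_lines: Iterable[str], org_table: dict[str, str]) -> Counter:
    """Step 810: count occurrences of (organization, keyword, page)."""
    counts: Counter = Counter()
    for line in log_lines:                          # loop of steps 720 to 800
        match = LOG_RE.match(line)
        if match is None:
            continue
        domain = match["host"].split(".", 1)[-1]    # step 730, naive: drop the first label
        organization = org_table.get(domain)        # steps 740/750
        if organization is None:
            continue
        page = match["page"]                        # step 760
        keyword = search_keyword(match["referrer"]) # steps 770/780, None if not a search engine
        counts[(organization, keyword, page)] += 1
    return counts
```

Accesses that did not arrive through a listed search engine are kept with a keyword of `None`, so page views per organization can still be totaled from the same counter.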

FIG. 3 is a flowchart of the program that maintains the domain name/organization name correspondence table. The operator of the website being diagnosed may want to correct the table created automatically by the program of FIG. 1 before using it to obtain diagnosis results, so this program allows the table to be customized. After the start (step 900), the acquired domain names and character strings are displayed side by side (step 910). The website administrator judges whether each character string is the organization name corresponding to its domain name; if it is not (NO in step 920), the character string is changed to the appropriate organization name (step 930) and saved in the correspondence table (step 940). This is repeated until no data remains to be changed (if YES in step 950, return to step 920 via step 960). When all changes have been made, the program ends (step 970).

FIG. 4 is a diagram showing the screen transitions of this program. When the program is offered to users over the Internet, it needs a clear design and screen layout. The program's screens consist mainly of a main screen and a detail screen. In addition, there is a screen that lets the user register or modify domain names and a screen for registering or modifying categories. A category classifies an access source organization; possible categories include competitor, partner, prospect, customer, and other. The category assigned to each accessing company should be reviewed periodically and registered or revised as needed.

FIG. 5 is a screen capture of the main screen. The tabulation results obtained by running the program of FIG. 1 or FIG. 2 are displayed visually as tables and graphs. Periodic reports, for example monthly or weekly, can be sent to ongoing users of the program, and results can also be tabulated and displayed on request for one-time users.

FIG. 6 is an example of a graph of site analysis results by category. In this example, access source analysis results covering six months are shown for each category.

FIG. 7 is a screen capture of the domain rank table. The number of accesses from each organization within a given period can be ranked and displayed in descending order.
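A ranking of this kind could be computed along the following lines. This is a minimal sketch; the input shape (a sequence of date and organization name pairs) and the function name `domain_rank` are assumptions made for illustration.

```python
from collections import Counter
from datetime import date
from typing import Iterable


def domain_rank(
    accesses: Iterable[tuple[date, str]], start: date, end: date
) -> list[tuple[str, int]]:
    """Rank organizations by number of accesses between start and end (inclusive)."""
    counts = Counter(org for day, org in accesses if start <= day <= end)
    # most_common() returns (organization, count) pairs in descending order of count.
    return counts.most_common()
```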

FIG. 8 is a screen capture of the domain setting screen. Running the program shown in FIG. 1 automatically derives the URL of the corresponding organization's website from the host computer name and acquires the organization name from the title of that site's top page. This works because, in many cases, the website URL is obtained by replacing the leading part of the host computer name with "www" (for example, replacing "cs" in the host computer name with "www" yields the website URL directly) or by prefixing the domain name with "www". If the resulting domain name/organization name correspondence table is not appropriate in the eyes of the user of this program, that is, the operator of the site who wants the analysis and diagnosis, it can be corrected from the domain setting screen shown in FIG. 8.
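The host name heuristic described above might be sketched as follows. The function name `candidate_urls`, the `http://` scheme, and the idea of returning several candidates to be tried in order are assumptions for illustration; the text only states that substituting or prefixing "www" often yields the website URL.

```python
def candidate_urls(host: str) -> list[str]:
    """Return website URL candidates inferred from a host computer name."""
    labels = host.lower().rstrip(".").split(".")
    if labels[0] == "www":
        # The host name already looks like a website host name.
        return ["http://" + ".".join(labels) + "/"]
    candidates = []
    if len(labels) >= 3:
        # Replace the leading label with "www": cs.example.co.jp -> www.example.co.jp
        candidates.append("http://" + ".".join(["www"] + labels[1:]) + "/")
    # Prefix "www" to the name as given, in case it is already a bare domain name.
    candidates.append("http://www." + ".".join(labels) + "/")
    return candidates
```

Without a list of registrable domain suffixes, the code cannot tell whether the first label is a machine name or part of the domain name, which is why it returns more than one candidate rather than a single URL.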

FIG. 9 is a screen capture of the detail screen. It displays the tabulation results obtained by the programs of FIG. 1 and FIG. 2 for each access source: the access rank, category, number of accesses, and number of unique accesses of that access source, a ranking of search keywords for visits that arrived through a search engine, a ranking of link sources, and so on.

The invention can be applied not only as a function of the server itself but also to provide functions in response to requests from terminal computers.

FIG. 1 is a flowchart of the program that grasps the organization name of an access source.
FIG. 2 is a flowchart of the program that puts the organization names of access sources to use.
FIG. 3 is a flowchart of the program that maintains the domain name/organization name correspondence table.
FIG. 4 is a diagram showing the screen transitions.
FIG. 5 is a screen capture of the main screen.
FIG. 6 is an example of a graph of site analysis results by category.
FIG. 7 is a screen capture of the domain rank table.
FIG. 8 is a screen capture of the domain setting screen.
FIG. 9 is a screen capture of the detail screen.
FIG. 10 is a diagram showing the hardware configuration of the present invention.

Explanation of symbols

10, 20 Server
101, 102, 103, 104 Terminal computer

Claims

An access source grasping device that exists on the Internet and analyzes a website to grasp information about access sources, comprising:
Access log information acquisition means for acquiring access log information;
Access source host name extraction means for extracting the host name of an access source based on the information acquired by the access log information acquisition means;
Host name URL conversion means for extracting a domain name from the host name extracted by the access source host name extraction means and converting it into a URL;
Top page information acquisition means for automatically accessing the URL produced by the host name URL conversion means and acquiring top page information;
Title character string acquisition means for acquiring a title character string from the information acquired by the top page information acquisition means; and
Domain name/organization name correspondence table storage means for storing the character string acquired by the title character string acquisition means in association with the domain name.

An access source grasping device that exists on the Internet and analyzes a website to grasp information about access sources, comprising:
Access log information acquisition means for acquiring access log information;
Access source host name extraction means for extracting the host name of an access source based on the information acquired by the access log information acquisition means;
Domain name extraction means for extracting a domain name from the host name extracted by the access source host name extraction means;
Organization name acquisition means for acquiring the organization name corresponding to the domain name extracted by the domain name extraction means by referring to a domain name/organization name correspondence table prepared in advance;
Link source determination means for acquiring the URL of the requested page from the access log information and determining whether the link source is a search engine;
Search keyword acquisition means for acquiring a search keyword from the referrer of the data when the link source determination means determines that the link source is a search engine; and
Cross tabulation execution means for executing a cross tabulation of organization name, search keyword, and browsed page.

An access source grasping method of an access source grasping device that exists on the Internet and analyzes a website to grasp information about access sources, comprising:
An access log information acquisition step of acquiring access log information;
An access source host name extraction step of extracting the host name of an access source based on the information acquired in the access log information acquisition step;
A host name URL conversion step of extracting a domain name from the host name extracted in the access source host name extraction step and converting it into a URL;
A top page information acquisition step of automatically accessing the URL produced in the host name URL conversion step and acquiring top page information;
A title character string acquisition step of acquiring a title character string from the information acquired in the top page information acquisition step; and
A domain name/organization name correspondence table storage step of storing the character string acquired in the title character string acquisition step in association with the domain name.

An access source grasping method of an access source grasping device that exists on the Internet and analyzes a website to grasp information about access sources, comprising:
An access log information acquisition step of acquiring access log information;
An access source host name extraction step of extracting the host name of an access source based on the information acquired in the access log information acquisition step;
A domain name extraction step of extracting a domain name from the host name extracted in the access source host name extraction step;
An organization name acquisition step of acquiring the organization name corresponding to the domain name extracted in the domain name extraction step by referring to a domain name/organization name correspondence table prepared in advance;
A link source determination step of acquiring the URL of the requested page from the access log information and determining whether the link source is a search engine;
A search keyword acquisition step of acquiring a search keyword from the referrer of the data when the link source is determined to be a search engine in the link source determination step; and
A cross tabulation execution step of executing a cross tabulation of organization name, search keyword, and browsed page.