KR100449083B1 - Method of CRM on comparison with web-log and user ID

Info

Publication number: KR100449083B1
Application number: KR10-2001-0049107A
Authority: KR
Inventors: 김경재
Original assignee: (주) 이씨마이너
Priority date: 2001-08-14
Filing date: 2001-08-14
Publication date: 2004-09-16
Also published as: KR20030015061A

Abstract

The present invention relates to a customer relationship management method. It aims to provide improved customized service by enabling detailed analysis of members in addition to information on anonymous users.

To achieve this object, the invention proposes a customer relationship management method based on matching web logs against user IDs. The method is carried out by a web server and a mining server, with a plurality of visitor clients connected through the Internet, on a website configured with a login script for customer authentication, and comprises the steps of: extracting the login information of a visitor who accesses, through the web server, the website configured with the login script; storing the extracted login information and the log data generated by the visitor's access to the website in separate files; filtering the stored log file and storing the result; transmitting the stored login information file and the filtered log file to the mining server; extracting initial access information using the transmitted login information file and log file; and, taking the extracted initial access information as a reference, extracting from the log file and storing the visitor's website access information for a predetermined period. The method has the advantage of obtaining more reliable user information for each user who visits a given website, so that customized customer management service can be provided.

Description

Customer relationship management method through web log and user ID matching {Method of CRM on comparison with web-log and user ID}

The present invention relates to a method that can be used in customer relationship management (CRM), and more particularly to a method of extracting data for each individual customer so that higher-quality customer service can be realized.

The Internet-based business market grew to about 60 billion won in 1999 and is expected to reach about 2.06 trillion won by 2005, sustaining rapid growth of more than 200% per year. Worldwide, e-commerce transactions are reported to have amounted to 2.6 billion dollars in 1996, 7.4 billion dollars in 1997, and more than 23 billion dollars in 1998, and the US market research firm IDC (International Data Corporation) forecasts that the figure will exceed at least 220 billion dollars in 2001.

With this rapid growth in e-commerce users and the accompanying advances in technology, business owners have come to need more effective ways of operating a website in order to win customers and manage the customers they have won. Customer relationship management (CRM) was devised in response to this need.

Customer relationship management (CRM) is a customer management system that automates the customer management process. It is an integrated marketing solution that takes database marketing a step further by comprehensively analyzing information on existing customers, identifying the most valuable customers, and drawing on the various information held about them. It refers to an integrated customer relationship management process that identifies who the company's customers are and what they want, continuously provides the products and services they want, and thereby retains customers for longer, maximizes their lifetime value, and increases profitability.

In other words, CRM is the process of analyzing and integrating customer data and information in order to plan, support, and evaluate marketing activities based on the characteristics of individual customers. The process usually has four stages: analysis of customer information for marketing planning and strategy; marketing planning based on the analyzed information; customer contact through various contact channels; and analysis and refinement of the data generated through customer contact. The same concept is now applied to Internet-based e-commerce under the term e-CRM.

There are various ways to accumulate customer information for customer relationship management on the Internet. Among them, it has become necessary to build a process that manages information such as customers' behavior on the website and their purchase and transaction records in an integrated manner, so that customers can be understood from multiple angles and the information can be organized in real time. Technologies such as web log analysis and data mining have developed to meet this need.

A web log is the data recording Internet users' accesses to a particular website through a web server; it can be regarded as a record of every operation a user performs through that web server. The user log data stored in this way yields information such as page views for the website, page views per user, access location and method, page views by time, and number of visitors.

In general, a user connected to the Internet sends the web server a request to view a particular web page, and the web server then accesses the various files associated with that page.

Once a user has reached a particular web page in this way, every operation subsequently performed at the user's request is left as data at a predetermined location on the web server. Information is stored in the log file not only about the specific web page the user requested but also about the image files, image data, include files, and other resources associated with that page.

Such web logs are analyzed either with a web log analysis tool made by a company such as WebTrends or Jangwon Graphics (장원그래픽스), or with the analysis tool provided by the web server itself. These tools provide various kinds of information, such as numbers of connections and visitors, visitor classification, totals by visitors' ISP, statistics by homepage directory and file, and analysis by time (month, week, day of week, day, hour).

The general web log analysis method used in the prior art is described below.

FIG. 1 schematically shows a system configuration for explaining a web log analysis method for customer relationship management as applied in the prior art. It shows, as representatives, first, second, and third clients (1)(2)(3), each connected to the web server (10) through a modem (5) and an ISP server (6), through a proxy server (7), or directly.

Web log analysis requires several kinds of data. In the configuration above, the data can be obtained from the various data stored on the proxy server (7), from the web log file stored on the web server (10), and from the web data currently in operation.

The proxy server (7), commonly known as a firewall, is a server that performs caching to speed up web browsing for Internet users. It sits between the user's client (2) and the web server (10) that the user wants to reach, requests from the web server (10) the document the client (2) asked for on the client's behalf, and passes the document back to the client (2). The document is also stored on the proxy server (7) at this time.

Afterwards, when the same user requests the same content again, or another user requests it, the request is not sent to the site; instead, the proxy server (7) delivers its own stored copy to the user.

When the proxy server (7) is used in this way, requests that would otherwise reach the web server (10) are recorded on the proxy server (7) rather than in the log file of the web server (10). Even when the requested content is not held by the proxy server (7), the information recorded is that of the proxy server (7), not that of the user.

As already described, the data that can be obtained from the log file of the web server (10) for web log analysis includes numbers of connections and visitors, visitor classification, totals by visitors' ISP, statistics by homepage directory and file, and analysis by time (month, week, day of week, day, hour).

Finally, consider the information obtainable from the web data. Web data refers to the member data and content data actually operated by the website, while the log file stored on the web server (10) records the names of the files that users requested.

The process of performing web log analysis on the basis of the various information obtained as described above is as follows.

FIG. 2 is a flowchart outlining the general web log analysis process, described on the basis of the web logs stored by most web servers, such as Apache, IIS, and Netscape.

First, it must be decided in advance what analysis is needed to operate the website efficiently (S1). Once the target of the analysis has been clearly determined, user-level analysis suited to the characteristics of each site should be performed.

Next, the data stored in the log file must be converted into data that can be analyzed (S2). The data stored in the log file is raw data that supports only a narrow range of analysis, so it must be converted into a form that can be analyzed in more varied ways.

After these steps, the log is finally analyzed (S3).

Log analysis can be performed with various algorithms depending on the elements to be analyzed. Analysis using data mining algorithms, such as association rule mining, sequential pattern detection, cluster analysis, decision trees, and neural network models, is also useful.

The general web log analysis process has been outlined above; the web log analysis method is described in more detail below.

FIG. 3 is a diagram for explaining a general web log analysis method. It shows, in order, the internal processes performed by the web server (10) and the mining server (20).

Here, mining refers to the process of extracting useful information from a large amount of data, and a server that performs such data mining is called a mining server.

The web server (10) stores a log file containing data on the accesses made by the clients (1)(2)(3); the log file is usually recorded in units of one day.

For the log format, it is preferable to use the Combined format in the case of NCSA (National Center for Supercomputing Applications) and the Extended Log Format in the case of W3C (World Wide Web Consortium).

The log file stored daily on the web server (10) is transmitted by the web server (10) to the mining server (20) over FTP (File Transfer Protocol) at a set time. The mining server (20) loads the received log file into the database (22) at regular intervals, and then runs a database program, such as one written in PL/SQL, to load the data from a temporary table into the target table. PL/SQL (Procedural Language/SQL) is a language used in Oracle that extends SQL, the database language, with code organized in procedural units as in the C language.

However, the general web log analysis method described above cannot obtain more detailed information about individual users, and it cannot provide customer management service customized to each customer; only general services based on the log analysis results are offered to all customers. In addition, because the log file is transmitted to the mining server (20) and analyzed without being refined, large log files require a great deal of time and high-capacity computing power for transmission and analysis.

The present invention is intended to solve these customer service problems. Its main object is to obtain more reliable usage information for each user who visits a given website and thereby provide customized customer management service.

FIG. 1 is a diagram schematically showing a system configuration for explaining a web log analysis method for customer relationship management as applied in the prior art.

FIG. 2 is a flowchart outlining the general web log analysis process.

FIG. 3 is a diagram for explaining a general web log analysis method.

FIG. 4 is a diagram showing the internal operation of the system for explaining the customer relationship management method according to the present invention.

<Description of reference numerals for the main parts of the drawings>

1, 2, 3: client    5: modem

6: ISP server    7: proxy server

10: web server    11: mining server

To achieve the above object, the present invention proposes a customer relationship management method through web log and ID matching, for customer relationship management performed by a web server (10) and a mining server (20) on a website configured with a login script for customer authentication, the method comprising the steps of: extracting the login information of a visitor who accesses, through the web server, the website configured with the login script; storing the extracted login information and the log data generated by the visitor's access to the website in separate files; filtering the stored log file and storing the result; transmitting the stored login information file and the filtered log file to the mining server; extracting initial access information using the transmitted login information file and log file; and, taking the extracted initial access information as a reference, extracting from the log file and storing the visitor's website access information for a predetermined period.

The login information includes user information stored by a web program, that is, a CGI (or ASP, PHP, JSP) program, and information generated by the web server system. The method further includes transmitting the visitor login information file and the filtered log file stored on the web server to the mining server over FTP.

The method used for customer relationship management through web log and ID matching according to the present invention is described in detail below with reference to the accompanying drawings.

FIG. 4 shows, in order, the internal operations performed in each device in order to explain the customer relationship management method according to the present invention; it shows the processes performed by the clients (1)(2)(3), the web server (10), and the mining server (20).

Consider first the operations performed on the web server (10). To allow more detailed log analysis for individual customers, the present invention additionally uses login information. Login information is information selectively extracted from the data generated when a customer visits a particular web page and is verified as a member in order to use member services, that is, data such as the ID and password.

The customer enters an ID and password on a web page configured for login, and the login page looks up the ID and password in the member database in which member information is recorded. A customer confirmed as a member by the member database is granted permission to use member services.

This login process is performed by a program such as a CGI (or ASP, PHP, JSP) program, and the invention uses such a program with additional program code for extracting the customer's login information. The added code operates as follows. In general, the ID and password that a visitor enters in order to log in on a website written with a web program such as CGI (or ASP, PHP, JSP) are assigned to specific variables by the web program and stored on the web server. The visitor's login information is then generated by combining these with the date, time, and visitor IP address provided by the web server system or web program that the visitor accessed, and with the browser information and operating system (OS) information transmitted by the visitor's web browser. The user information collected from each of these sources by the added code takes a form such as the following example.

alivekim 211.207.52.40 2001-07-09 00:54:25 Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)

That is, the record states that the member with the ID alivekim went through the login process and was verified as a member at 00:54:25 on July 9, 2001, from the address 211.207.52.40, using the browser Mozilla/4.0 (compatible; MSIE 5.5; Windows 98).
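
The record layout above can be illustrated with a minimal Python sketch. The helper names `format_login_record` and `parse_login_record` are hypothetical and do not come from the patent, which describes the added code only as part of a CGI (or ASP, PHP, JSP) login program; the field order simply follows the example record, in which no password appears.

```python
from datetime import datetime


def format_login_record(user_id: str, ip: str, when: datetime, user_agent: str) -> str:
    """Build one line of the login information file (hypothetical helper)."""
    # Field order follows the example record: ID, IP address, date, time, browser/OS string.
    return f"{user_id} {ip} {when:%Y-%m-%d %H:%M:%S} {user_agent}"


def parse_login_record(line: str) -> dict:
    """Split a login record into fields; the browser string may itself contain spaces."""
    user_id, ip, date, time, user_agent = line.split(" ", 4)
    return {
        "user_id": user_id,
        "ip": ip,
        "timestamp": datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S"),
        "user_agent": user_agent,
    }


# Values taken from the example record in the description.
record = format_login_record(
    "alivekim",
    "211.207.52.40",
    datetime(2001, 7, 9, 0, 54, 25),
    "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)",
)
print(record)
```

In this sketch one such line would be appended to the login information file each time a member is verified.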

This login information is accumulated in a file each time a customer goes through the login process and is verified as a member. This file is hereinafter called the login information file. It is stored per fixed time unit (for example, per day), separately from the log file stored on the web server (10).

At the same time as the login information file is written for a customer's login, the web server (10) stores information about the login page the customer requested in the log file.

Next, the log file stored daily on the web server (10) is filtered so that only the data needed for log analysis remains. Log file filtering is a known technique and is run at fixed intervals (for example, once a day) using the scheduler of the web server (10).

Filtering refers to removing unnecessary data so that large log files are easier to process. Whether the log format is the NCSA Combined format or the W3C Extended Log Format, the data is filtered while being converted into the following common format, which is suited to log analysis.

2001-07-09 01:16:20 210.101.41.91 G /cgi/quiz/quiz1.php3 - 200 - - mozilla/4.0+(compatible;+MSIE+5.0;+Windows+98;+DigExt)

This record means that a website visitor with the IP address 210.101.41.91 requested the page /cgi/quiz/quiz1.php3 at 01:16:20 on July 9, 2001, using the web browser identified as mozilla/4.0+(compatible;+MSIE+5.0;+Windows+98;+DigExt) (Internet Explorer 5.0 on Windows 98), and that the page was transferred by the GET method ('G') without error (the meaning of '200').
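
The filtering and format unification can be sketched as follows for NCSA Combined input. This is only an illustration under stated assumptions: the patent does not say which requests count as unnecessary, so dropping static assets by file suffix is an assumption, as are the function name `normalize`, the sample input lines (their byte counts, time zone, and HTTP version are invented), and the treatment of the three `-` columns, of which the first is filled with the query string and the other two are left as fixed placeholders.

```python
import re
from datetime import datetime
from typing import Optional

# NCSA Combined: host ident authuser [date] "request" status bytes "referer" "user-agent"
COMBINED = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

# Assumption: requests for static assets are the "unnecessary data" to drop.
SKIPPED_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png", ".css", ".js")


def normalize(line: str) -> Optional[str]:
    """Return the line in the common analysis format, or None if it is filtered out."""
    m = COMBINED.match(line)
    if m is None:
        return None  # unparseable lines are dropped in this sketch
    path, _, query = m["url"].partition("?")
    if path.lower().endswith(SKIPPED_SUFFIXES):
        return None
    when = datetime.strptime(m["ts"], "%d/%b/%Y:%H:%M:%S %z")
    return " ".join([
        when.strftime("%Y-%m-%d %H:%M:%S"),
        m["ip"],
        m["method"][0],                 # 'G' for GET, as in the example record
        path,
        query or "-",
        m["status"],
        "-", "-",                       # placeholder columns; meaning not given in the text
        m["agent"].replace(" ", "+"),   # spaces encoded as '+', as in the example record
    ])


# Hypothetical Combined-format input lines.
sample_lines = [
    '210.101.41.91 - - [09/Jul/2001:01:16:20 +0900] "GET /cgi/quiz/quiz1.php3 HTTP/1.0" '
    '200 1234 "-" "mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt)"',
    '210.101.41.91 - - [09/Jul/2001:01:16:21 +0900] "GET /images/logo.gif HTTP/1.0" '
    '200 512 "-" "mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt)"',
]
filtered = [out for out in map(normalize, sample_lines) if out is not None]
print("\n".join(filtered))
```

Only the page request survives; the image request is removed, which is the kind of size reduction the filtering step is meant to achieve before the file is sent to the mining server.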

The GET method is one of the data transmission methods used by CGI programs. It is mainly used when there are few input values, or when the values are simply appended to the URL and passed to the CGI program.

The log file filtered on the web server (10) and the login information file are transmitted to the mining server (20) over FTP (File Transfer Protocol) at a set time.

The process performed on the mining server (20) is described below.

The following process is also preferably run at set times by the scheduler of the mining server (20), and preferably after the filtered log file and the login information file have been transferred from the web server (10) to the mining server (20) over FTP.

Having received the filtered log file and the user login information file from the web server (10) over FTP, the mining server (20) reads each file and stores it in its own memory. One log file or several may be stored at this point.

The mining server (20) then reads the login information from the stored login information file and compares it with the log data for the same time period stored in the log file in order to find each visitor's navigation path through the website. The process of finding each customer's navigation path is as follows.

First, the IP address, browser information, and other fields in the login information file are compared with the IP address and browser information of the log data in the log file held in memory, in order to find in the log data the point at which the customer first visited. The web page contained in the log data found in this way is the first page the customer visited on the navigation path through the website.

The path is then traced forward for a predetermined period from the log data found above, again by comparing the IP address and browser information in the same way as when finding the first visit. Here the predetermined period is the average time from when a visitor enters the website until the visitor finishes using the service. About 30 minutes is commonly used as the average visit time per visitor, but the value can be changed by the administrator and may differ from website to website.

The process of finding each customer's navigation path described above is explained in detail with an example. Suppose the login information file contains the following visitor login record:

alivekim 211.207.52.40 2001-07-09 00:54:25 mozilla/4.0 (compatible; MSIE 5.5; Windows 98)

and the log file contains the following visitor log data:

1001 : 2001-07-09 00:54:20 210.101.41.91 GET /cgi/quiz/quiz1.php3 - 200 - - mozilla/4.0 (compatible; MSIE+5.0; Windows+98; DigExt)

1002 : 2001-07-09 00:54:22 211.207.52.40 GET /cgi/quiz/quiz1.php3 - 200 - - mozilla/4.0 (compatible; MSIE 5.5; Windows 98)

1003 : 2001-07-09 00:54:26 210.101.41.91 GET /cgi/quiz/quiz1.php3 - 200 - - mozilla/4.0 (compatible; MSIE+5.0; Windows+98; DigExt)

Given these records, the IP address and browser information in alivekim's login record match those of log line 1002 among lines 1001, 1002, and 1003, so the log data on line 1002 is identified as one step in alivekim's navigation path.

That is, the member alivekim requested the page /cgi/quiz/quiz1.php3 at 00:54:22 on July 9, 2001, and the request completed normally (the meaning of the '200' code).

The numbers 1001, 1002, and 1003 are line numbers added arbitrarily for the purpose of explanation.
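
The matching step in this example can be sketched in Python as below. It is a sketch under assumptions rather than the patented procedure itself: the class and function names are hypothetical, browser strings are compared after normalizing case and `+` encoding, and a 60-second slack before the login time is an invented parameter that allows for the fact that the matching log line (00:54:22) precedes the login record (00:54:25). The 30-minute window is the typical value mentioned in the description.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass(frozen=True)
class Login:
    user_id: str
    ip: str
    time: datetime
    agent: str


@dataclass(frozen=True)
class Hit:
    time: datetime
    ip: str
    path: str
    status: int
    agent: str


def _norm(agent: str) -> str:
    """Make browser strings comparable regardless of case and '+' encoding."""
    return agent.replace("+", " ").lower()


def session_path(
    login: Login,
    hits: List[Hit],
    window: timedelta = timedelta(minutes=30),  # typical visit time from the description
    slack: timedelta = timedelta(seconds=60),   # assumed tolerance before the login time
) -> List[Hit]:
    """Return the hits attributed to this login, starting from the first matching hit."""
    same_client = sorted(
        (
            h for h in hits
            if h.ip == login.ip
            and _norm(h.agent) == _norm(login.agent)
            and h.time >= login.time - slack
        ),
        key=lambda h: h.time,
    )
    if not same_client:
        return []
    first = same_client[0]  # the initial access information
    return [h for h in same_client if h.time - first.time <= window]


# Records 1001-1003 and the login record from the example above.
login = Login("alivekim", "211.207.52.40", datetime(2001, 7, 9, 0, 54, 25),
              "mozilla/4.0 (compatible; MSIE 5.5; Windows 98)")
hits = [
    Hit(datetime(2001, 7, 9, 0, 54, 20), "210.101.41.91", "/cgi/quiz/quiz1.php3", 200,
        "mozilla/4.0 (compatible; MSIE+5.0; Windows+98; DigExt)"),
    Hit(datetime(2001, 7, 9, 0, 54, 22), "211.207.52.40", "/cgi/quiz/quiz1.php3", 200,
        "mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"),
    Hit(datetime(2001, 7, 9, 0, 54, 26), "210.101.41.91", "/cgi/quiz/quiz1.php3", 200,
        "mozilla/4.0 (compatible; MSIE+5.0; Windows+98; DigExt)"),
]
path = session_path(login, hits)
print([(h.time.isoformat(sep=" "), h.path) for h in path])
```

Matching on IP address and browser string alone cannot distinguish two members behind the same proxy with identical browsers, so a production version would likely need further disambiguation; the description does not address that case.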

The customer information extracted by comparing the filtered log file with the login information file, that is, each customer's navigation path through the website, is stored as a file on the mining server (20), preferably in the following form.

alivekim 20010709123436 - - 0 search.naver.com /search.naver -

alivekim 20010709123436 200 G 2 www.ecminer.com /index.html -

alivekim 20010709123438 200 G 58 www.ecminer.com /introduction.html -

To explain: a member with the ID alivekim searched on the search site www.naver.com and arrived at the site www.ecminer.com at 12:34:36 AM on July 9, 2001; stayed on /index.html for 2 seconds starting at 12:34:36 AM on July 9, 2001; and then stayed on /introduction.html for 58 seconds starting at 12:34:38 AM on July 9, 2001, before leaving the site.
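As a rough illustration of how such path records could be read back, the sketch below splits each record into fields. The field names (`user_id`, `visited_at`, `status`, `method`, `dwell_seconds`, `host`, `path`) are inferred from the explanation above and are assumptions, as is the reading of the timestamp as YYYYMMDDhhmmss; the meaning of the trailing `-` field is not stated in the text, so it is ignored.

```python
from datetime import datetime
from typing import NamedTuple


class PathRecord(NamedTuple):
    user_id: str
    visited_at: datetime
    status: str
    method: str
    dwell_seconds: int
    host: str
    path: str


def parse_path_record(line: str) -> PathRecord:
    """Split one whitespace-separated path record into typed fields."""
    user_id, stamp, status, method, dwell, host, path = line.split()[:7]
    return PathRecord(
        user_id,
        datetime.strptime(stamp, "%Y%m%d%H%M%S"),  # assumed timestamp layout
        status,
        method,
        int(dwell),
        host,
        path,
    )


lines = [
    "alivekim 20010709123436 - - 0 search.naver.com /search.naver -",
    "alivekim 20010709123436 200 G 2 www.ecminer.com /index.html -",
    "alivekim 20010709123438 200 G 58 www.ecminer.com /introduction.html -",
]

records = [parse_path_record(line) for line in lines]

# Total time spent on www.ecminer.com pages in this visit.
site_seconds = sum(r.dwell_seconds for r in records if r.host == "www.ecminer.com")
```

With the three records above, `site_seconds` is the sum of the 2-second and 58-second stays.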

The analysis file, in which every customer's navigation path is stored in the above format, is loaded into a temporary table by the loader of a DBMS (Oracle, MS-SQL, etc.). A database program written in PL/SQL then transforms the data into a form suitable for per-customer log analysis and mining, and the result is stored in the log analysis table of the mining server (20).
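The load-then-transform flow can be sketched as below. This is only an analogy: Python's standard `sqlite3` module stands in for the DBMS loader and the PL/SQL program named in the text, and the table names (`tmp_path`, `log_analysis`), column names, and the particular transformation (typing the dwell time and normalizing the timestamp) are assumptions made for the example.

```python
import sqlite3

# Sample records in the analysis-file format shown earlier in the description.
ANALYSIS_LINES = [
    "alivekim 20010709123436 - - 0 search.naver.com /search.naver -",
    "alivekim 20010709123436 200 G 2 www.ecminer.com /index.html -",
    "alivekim 20010709123438 200 G 58 www.ecminer.com /introduction.html -",
]

conn = sqlite3.connect(":memory:")

# Step 1: bulk-load the raw text fields into a temporary table (the "loader" step).
conn.execute(
    "CREATE TABLE tmp_path (user_id TEXT, stamp TEXT, status TEXT, method TEXT, "
    "dwell TEXT, host TEXT, path TEXT)"
)
conn.executemany(
    "INSERT INTO tmp_path VALUES (?, ?, ?, ?, ?, ?, ?)",
    [line.split()[:7] for line in ANALYSIS_LINES],
)

# Step 2: transform into an analysis-friendly table (the "transformation" step).
conn.execute(
    "CREATE TABLE log_analysis (user_id TEXT, visited_at TEXT, host TEXT, "
    "path TEXT, dwell_seconds INTEGER)"
)
conn.execute(
    """
    INSERT INTO log_analysis
    SELECT user_id,
           substr(stamp, 1, 4) || '-' || substr(stamp, 5, 2) || '-' || substr(stamp, 7, 2)
               || ' ' || substr(stamp, 9, 2) || ':' || substr(stamp, 11, 2)
               || ':' || substr(stamp, 13, 2),
           host,
           path,
           CAST(dwell AS INTEGER)
    FROM tmp_path
    """
)

result = conn.execute(
    "SELECT visited_at, path, dwell_seconds FROM log_analysis"
).fetchall()
```

Keeping the raw load separate from the transformation means malformed rows can be inspected in the temporary table before anything reaches the analysis table.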

The log analysis information stored in the log analysis table can be applied to analysis for customer relationship management together with the core operational databases run by the website, such as the customer (member) database and the purchase (transaction) database.

That is, it becomes possible to perform basic log analysis, such as counts of connections and visitors, visitor classification, totals by visitors' access ISP, statistics by homepage directory and file, and time-based analysis (month, week, day of week, day, hour); member-level analysis, such as each member's popular pages, visit times, visit frequency, and navigation paths; and, in combination with the core operational databases, analysis using data mining algorithms such as association rule discovery, sequential pattern discovery, cluster analysis, decision trees, and neural network models.
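One of the member-level analyses mentioned above, finding the page on which a member spent the most time, might be sketched as follows. The row layout `(member_id, page, dwell_seconds)` and both helper functions are hypothetical; the two sample rows reuse the alivekim path records from the description.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# (member_id, page, dwell_seconds) rows, as they might be read from the log analysis table.
rows = [
    ("alivekim", "/index.html", 2),
    ("alivekim", "/introduction.html", 58),
]


def dwell_by_member_page(rows: Iterable[Tuple[str, str, int]]) -> Dict[str, Dict[str, int]]:
    """Sum dwell time per member and per page."""
    totals: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for member, page, dwell in rows:
        totals[member][page] += dwell
    return totals


def longest_viewed_page(totals: Dict[str, Dict[str, int]], member: str) -> str:
    """Return the page on which the given member accumulated the most dwell time."""
    pages = totals[member]
    return max(pages, key=pages.get)


totals = dwell_by_member_page(rows)
favorite = longest_viewed_page(totals, "alivekim")
```

This kind of aggregation is only possible once log lines have been attributed to a member ID; with anonymous logs alone, the same totals could be computed per page but not per member.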

According to the customer relationship management through web log and ID matching proposed above, whereas existing log analysis methods show only simple statistics about anonymous visitors, such as the number of visitors, the number of visits, and statistics by date, the present invention enables detailed analysis of members as well as information on anonymous users.

In addition, because data mining analysis is performed in conjunction with the core operational databases, customer relationship management benefits from a more multidimensional analysis. Furthermore, because the filtering performed on the web server removes unnecessary data, the time required for analysis is also reduced.

Claims

A customer relationship management method through web log and ID matching, performed by a plurality of visitor clients connected via the Internet, a web server, and a mining server, on a website provided with a login script for customer authentication, the method comprising:

Extracting login information of a visitor who connects to the web server and accesses the website provided with the login script;

Storing the extracted login information and the log data generated by the visitor's access to the website as separate files;

Filtering the stored log file and storing the result;

Transmitting the stored login information file of the visitor and the filtered log file to the mining server;

Extracting initial access information using the transmitted login information file and log file; and

Extracting, from the log file, the visitor's website access information for a predetermined time based on the extracted initial access information, and storing it.

The method according to claim 1,

wherein the login information is obtained using user information stored in a web program such as CGI (or ASP, PHP, JSP) and information generated by the system of the web server.

The method according to claim 1,

wherein the visitor's login information file and the filtered log file stored on the web server are transmitted to the mining server via FTP.