JP2007164378A5

Info

Publication number: JP2007164378A5
Application number: JP2005358328A
Authority: JP
Filing date: 2005-12-12
Publication date: 2008-07-17
Anticipated expiration: 2025-12-12

Claims

A related word extraction device for associating mutually related advertising vocabulary data from among a plurality of advertising vocabulary data, so that a Web document provider can analyze vocabulary frequently used in a specific type of business or industry, the device comprising:
A receiving unit that receives a Web document stored in a storage device connected via a communication line;
A Web document storage unit that stores the Web document received by the receiving unit;
An input unit that receives input of first advertising vocabulary data related to the advertising vocabulary data to be extracted;
A Web document extraction unit that extracts, from the Web document storage unit, Web documents including the first advertising vocabulary data input via the input unit;
An extraction unit that extracts second advertising vocabulary data commonly included in the Web documents extracted by the Web document extraction unit;
A domain generation unit that generates a domain in which the second advertising vocabulary data extracted by the extraction unit is associated with the first advertising vocabulary data; and
A domain storage unit that stores the domain generated by the domain generation unit.
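
The units recited in claim 1 can be pictured with a minimal Python sketch. It is only an illustration under stated assumptions, not the implementation described in the specification: the class name `RelatedWordExtractor`, whitespace tokenization, and the reading of "commonly included" as "present in every extracted document" are all hypothetical choices, and the receiving unit is reduced to a method that accepts text that has already been downloaded.

```python
from dataclasses import dataclass, field


def tokens(document: str) -> set[str]:
    # Assumption: vocabulary items are whitespace-separated, case-insensitive words.
    return set(document.lower().split())


@dataclass
class RelatedWordExtractor:
    """Hypothetical in-memory stand-in for the units of claim 1."""

    documents: list[str] = field(default_factory=list)          # Web document storage unit
    domains: dict[str, set[str]] = field(default_factory=dict)  # domain storage unit

    def receive(self, document: str) -> None:
        # Receiving unit: network transfer is out of scope, so the caller
        # passes in the document text directly.
        self.documents.append(document)

    def extract_documents(self, first_word: str) -> list[str]:
        # Web document extraction unit: keep documents containing the input word.
        return [d for d in self.documents if first_word.lower() in tokens(d)]

    def extract_common_words(self, first_word: str, docs: list[str]) -> set[str]:
        # Extraction unit: assumed here to mean words present in every document.
        if not docs:
            return set()
        common = set.intersection(*(tokens(d) for d in docs))
        common.discard(first_word.lower())
        return common

    def generate_domain(self, first_word: str) -> set[str]:
        # Domain generation unit: associate the second words with the first
        # word, then store the result in the domain storage.
        docs = self.extract_documents(first_word)
        second_words = self.extract_common_words(first_word, docs)
        self.domains[first_word] = second_words
        return second_words


extractor = RelatedWordExtractor()
extractor.receive("hotel booking discount travel")
extractor.receive("travel insurance hotel deals")
extractor.receive("used car loan rates")
print(extractor.generate_domain("hotel"))  # {'travel'}
```

In this toy corpus only the first two documents contain "hotel", and "travel" is the only other word they share, so the stored domain for "hotel" is the single word "travel".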

The related word extraction device according to claim 1, wherein the domain generation unit associates a domain, generated from second advertising vocabulary data extracted on the basis of other first advertising vocabulary data different from the first advertising vocabulary data, with a domain already stored in the domain storage unit.
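
The association of a newly generated domain with an already stored one can be sketched as follows. The claim does not state the criterion for association, so the rule used here (link two domains when they share at least one second vocabulary item) is purely an assumption, as are the function name `associate_domains` and the `stored` and `links` dictionaries.

```python
def associate_domains(
    stored: dict[str, set[str]],
    links: dict[str, set[str]],
    first_word: str,
    second_words: set[str],
) -> None:
    """Store a new domain and link it to stored domains with overlapping words.

    Assumption: "associating" is modelled as a symmetric link between the
    first words of two domains whose second-word sets intersect.
    """
    for other_word, other_words in stored.items():
        if other_word != first_word and second_words & other_words:
            links.setdefault(first_word, set()).add(other_word)
            links.setdefault(other_word, set()).add(first_word)
    stored[first_word] = set(second_words)


stored = {"hotel": {"travel", "booking"}}  # domain already in storage
links: dict[str, set[str]] = {}

associate_domains(stored, links, "flight", {"travel", "airline"})
associate_domains(stored, links, "loan", {"interest"})
print(links)  # {'flight': {'hotel'}, 'hotel': {'flight'}}
```

The "flight" domain shares "travel" with the stored "hotel" domain, so the two are linked; "loan" shares nothing and is stored without links.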

The related word extraction device according to claim 1 or 2, wherein the extraction unit, when extracting the second advertising vocabulary data commonly included in the Web documents extracted by the Web document extraction unit, preferentially extracts second advertising vocabulary data having a high frequency.
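
Frequency-prioritized extraction can be illustrated with a short sketch. The claim does not define how frequency is measured, so this example assumes document frequency (the number of extracted documents in which a word appears); the function name `extract_by_frequency` and the parameters `top_n` and `min_docs` are hypothetical.

```python
from collections import Counter


def extract_by_frequency(
    docs: list[str], first_word: str, top_n: int = 3, min_docs: int = 2
) -> list[str]:
    """Return up to top_n words shared by at least min_docs documents,
    ordered from most to least frequent."""
    counts: Counter[str] = Counter()
    for doc in docs:
        # Count each word once per document (document frequency).
        counts.update(set(doc.lower().split()))
    counts.pop(first_word.lower(), None)  # the input word is not its own related word
    shared = [(word, n) for word, n in counts.items() if n >= min_docs]
    # Sort by descending frequency, breaking ties alphabetically for determinism.
    shared.sort(key=lambda item: (-item[1], item[0]))
    return [word for word, _ in shared[:top_n]]


docs = [
    "hotel travel booking",
    "hotel travel deals",
    "hotel travel booking spa",
]
print(extract_by_frequency(docs, "hotel"))  # ['travel', 'booking']
```

Here "travel" appears in all three documents and "booking" in two, so they are returned in that order, while "deals" and "spa" fall below the `min_docs` threshold.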

A related word extraction method for associating mutually related advertising vocabulary data from among a plurality of advertising vocabulary data, so that a Web document provider can analyze vocabulary frequently used in a specific type of business or industry, the method comprising:
A receiving step of receiving a Web document stored in a storage device connected via a communication line;
A storing step of storing the Web document received in the receiving step;
An input step of receiving input of first advertising vocabulary data related to the advertising vocabulary data to be extracted;
An extraction step of extracting Web documents including the first advertising vocabulary data input in the input step;
A second advertising vocabulary data extracting step of extracting second advertising vocabulary data commonly included in the Web documents extracted in the extraction step;
A domain generating step of generating a domain in which the second advertising vocabulary data extracted in the second advertising vocabulary data extracting step is associated with the first advertising vocabulary data; and
A domain storage step of storing the domain generated in the domain generating step.

The related word extraction method according to claim 4, wherein the domain generating step associates a domain, generated from second advertising vocabulary data extracted on the basis of other first advertising vocabulary data different from the first advertising vocabulary data, with a domain already stored in the domain storage step.