TW200636504A - Method of using Web Page template to analyze Web Page document for extracting data

Method of using Web Page template to analyze Web Page document for extracting data

Info

Publication number
TW200636504A
Authority
TW
Taiwan
Prior art keywords
web page
template
analyze
extracting data
document
Prior art date
Application number
TW094111727A
Other languages
Chinese (zh)
Other versions
TWI292104B (en)
Inventor
Jian-Shing Wang
Original Assignee
Gimefi Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2005-04-13
Filing date
2005-04-13
Publication date
2006-10-16
Application filed by Gimefi Corp
Priority to TW094111727A
Publication of TW200636504A
Application granted
Publication of TWI292104B

Landscapes

  • Information Transfer Between Computers (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to a method of using a web page template to analyze web page documents and extract data. A web page template is first established. A web page parser then analyzes the content read from a web page document according to the settings of that template. The data parsed from the content are extracted and recorded in a database, so that the information contained in web page documents is extracted automatically.
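
The three steps in the abstract (establish a template, parse the page according to it, record the extracted data in a database) can be illustrated with a minimal Python sketch. This is not the patented implementation: the template format (a field name mapped to a tag and a class value), the names `TEMPLATE`, `TemplateParser`, `extract`, and `save`, and the choice of SQLite as the database are all assumptions made for illustration.

```python
import sqlite3
from html.parser import HTMLParser

# Hypothetical template: field name -> (tag, class attribute value).
TEMPLATE = {
    "title": ("h1", "product-title"),
    "price": ("span", "price"),
}


class TemplateParser(HTMLParser):
    """Collects the text of the first element matching each template rule."""

    def __init__(self, template):
        super().__init__()
        self.template = template
        self.record = {}
        self._field = None  # field currently being captured
        self._tag = None    # tag name that opened the capture
        self._depth = 0     # nesting depth of that tag name

    def handle_starttag(self, tag, attrs):
        if self._field is not None:
            # Inside a capture: only track nesting of the same tag name.
            if tag == self._tag:
                self._depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        for field, (want_tag, want_class) in self.template.items():
            if tag == want_tag and want_class in classes and field not in self.record:
                self._field, self._tag, self._depth = field, tag, 1
                self.record[field] = ""
                break

    def handle_endtag(self, tag):
        if self._field is not None and tag == self._tag:
            self._depth -= 1
            if self._depth == 0:
                # Capture finished: trim surrounding whitespace.
                self.record[self._field] = self.record[self._field].strip()
                self._field = None
                self._tag = None

    def handle_data(self, data):
        if self._field is not None:
            self.record[self._field] += data


def extract(html, template=TEMPLATE):
    """Parse one page according to the template and return the extracted fields."""
    parser = TemplateParser(template)
    parser.feed(html)
    parser.close()
    return parser.record


def save(conn, record):
    """Record the extracted fields as (field, value) rows in the database."""
    conn.execute("CREATE TABLE IF NOT EXISTS extracted (field TEXT, value TEXT)")
    conn.executemany("INSERT INTO extracted VALUES (?, ?)", list(record.items()))
    conn.commit()
```

A caller would pass the page source to `extract` and hand the resulting dictionary to `save` together with a `sqlite3` connection. Keeping the template as plain data means a different page layout needs only a new template, not a new parser, which is the separation the abstract describes.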

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW094111727A TW200636504A (en) 2005-04-13 2005-04-13 Method of using Web Page template to analyze Web Page document for extracting data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW094111727A TW200636504A (en) 2005-04-13 2005-04-13 Method of using Web Page template to analyze Web Page document for extracting data

Publications (2)

Publication Number Publication Date
TW200636504A (en) 2006-10-16
TWI292104B (en) 2008-01-01

Family

ID=45067419

Family Applications (1)

Application Number Priority Date Filing Date Title
TW094111727A TW200636504A (en) 2005-04-13 2005-04-13 Method of using Web Page template to analyze Web Page document for extracting data

Country Status (1)

Country Link
TW (1) TW200636504A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI399653B (en) * 2008-11-06 2013-06-21

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI512505B (en) * 2010-05-20 2015-12-11 Alibaba Group Holding Ltd The method, device and e-commerce system of crawling web pages
CN103971244B (en) 2013-01-30 2018-08-17 阿里巴巴集团控股有限公司 A kind of publication of merchandise news and browsing method, apparatus and system
CN108090076B (en) * 2016-11-22 2021-01-22 北京国双科技有限公司 Page character processing method and device

Also Published As

Publication number Publication date
TWI292104B (en) 2008-01-01

Similar Documents

Publication Publication Date Title
BR0306749A (en) Computer readable method and medium for importing and exporting hierarchically structured data
WO2007063547A3 (en) System and method for appending security information to search engine results
TW200603632A (en) Methods and apparatus for identifying media content
WO2012116208A3 (en) Apparatus, method, and computer-accessible medium for explaining classifications of documents
WO2009051939A3 (en) Automatically instrumenting a set of web documents
WO2005109178A3 (en) Extracting information from web pages
EP1909194A4 (en) Information processing device, feature extraction method, recording medium, and program
TW200629150A (en) File formats, methods, and computer program products for representing workbooks
WO2007144853A3 (en) Method and apparatus for performing customized paring on a xml document based on application
TW200500890A (en) Method and apparatus for analyzing claims in portfolios automatically
ZA200509352B (en) File formats, methods, and computer program products for representing documents
ATE503312T1 (en) APPARATUS AND METHOD FOR STORING AND READING A FILE COMPRISING A MEDIA DATA CONTAINER AND MEDIA DATA CONTAINER
TW200636504A (en) Method of using Web Page template to analyze Web Page document for extracting data
HK1123478A1 (en) Method and apparatus for sequenced extraction from electrocardiogramic waveforms
EP1993052A3 (en) Data processing apparatus and method, program, and storage medium for the identification of content
WO2009060888A1 (en) Author's influence determination system, author's influence determination method, and program
WO2006115908A3 (en) User-driven media system in a computer network
TW200707316A (en) System and method for speedily obtaining material changes in motherboard design
Harding et al. Population Ageing and Government Age Pension Outlays: using microsimulation models to inform policy making
TW200627197A (en) Method of building personal relationship network
Hutchins et al. Analysis of lagoon samples from different concentrated animal feeding operations (CAFOs) for estrogens and estrogen conjugates
TW200622735A (en) Web page information extracting module and method having on-line learning mechanism
Saragih ANALYZING FACTORS THAT INFLUENCE STOCK BEHAVIOR IN SMALL CAPITALIZATION EXCHANGE (2013)
Hong et al. Analysis of time-domain maximum likelihood method and sample maximum likelihood method for errors-in-variables
Sánchez-Rebull et al. The diversity of the top management team and the survival and success of international companies: The case of Spanish companies with foreign direct investment in China