GB2356070A - Character set auto-filtering browser - Google Patents

Info

Publication number
GB2356070A
Authority
GB
United Kingdom
Prior art keywords
web page
browser
response data
character set
analyzing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
GB9926181A
Other versions
GB9926181D0 (en)
Inventor
Sheng-Hong Yang
Cheng-Shing Lai
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
INVENTEC ELECTRONICS
Original Assignee
INVENTEC ELECTRONICS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by INVENTEC ELECTRONICS
Priority to GB9926181A
Publication of GB9926181D0
Publication of GB2356070A
Withdrawn

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/957 Browsing optimisation, e.g. caching or content distillation
    • G06F16/9577 Optimising the visualization of content, e.g. distillation of HTML documents

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Transfer Between Computers (AREA)
  • Computer And Data Communications (AREA)

Abstract

An auto-filter-page browser that can automatically filter out pages whose characters are not supported by the present browser when utilizing the HTTP protocol to open Web pages. The browsing method first provides a browser. Then, the browser link to the web page is used to retrieve response data, which comprises a StatusLine, an Entityheader and an Entitytext from the web page. Next, the procedure confirms the reception of the response data by analyzing the first three characters of the StatusLine. Furthermore, the procedure judges the character set used by the web page by analyzing the Entityheader and the Entitytext. Finally, if the character set used by the web page is not supported by the browser, the procedure displays a dialog frame asking whether or not the web page should continue being downloaded.

Description

AUTO-FILTER-PAGE BROWSER AND ITS BROWSING METHOD
The present invention relates to a browser for use when accessing pages on the World-Wide Web or similar information networks, and more specifically to a browser which can automatically filter pages. The present invention also relates to a browsing method of this browser.
The Internet has become a very important communication tool. Users can acquire information they need through browsers. However, the transmission speed of the Internet is limited. Further, web pages utilizing all kinds of character sets (English, Chinese, French, etc.) are downloaded when a conventional browser opens them, regardless of whether the browser supports the character set. If the browser does not support the character set used by the web pages, the pages are displayed as garbled characters.
Consequently, a great deal of time and money is wasted downloading unreadable pages. While some browsers today can support various character sets used by web pages, others can only support a few character sets. Because many web pages use a wide variety of character sets, there is always a possibility of downloading non-supported characters and displaying garbled text as a result.
An embodiment of the present invention can provide an auto-filter-page browser and its browsing method for automatically filtering Web pages whose characters are not supported by the present browser, when utilizing the HTTP protocol upon opening pages. Consequently, a great deal of download time and money may be saved.
The browsing method of the invention first provides a browser. Then, the browser link to the web page is used to retrieve response data, which comprises a StatusLine, an Entityheader and an Entitytext from the web page. Next, the procedure confirms the reception of the response data by analyzing the first three characters of the StatusLine. Furthermore, the procedure judges the character set used by the web page by analyzing the Entityheader and the Entitytext.
Finally, if the character set used by the web page is not supported by the browser, the procedure displays a dialog frame asking whether or not the web page should continue being downloaded.
The browser provided in the present invention comprises three modules: a control module, an analyzing StatusLine module and an analyzing Entityheader module.
When utilizing HTTP protocol on opening pages, the control module will link to the web page for retrieving response data, including a StatusLine, an Entityheader and an Entitytext from the web page. The analyzing StatusLine module will confirm the reception of the response data by analyzing the status line. The analyzing Entityheader module will judge whether or not the character set used by the web page is supported by the browser and, if the character set used by the web page is not supported by the browser, displays a dialog frame asking whether or not the web page should continue being downloaded.
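The three parts of the response data named above can be illustrated with a minimal Python sketch. It assumes an HTTP/1.x response held fully in memory; the names `ResponseData` and `split_response` are hypothetical and do not come from the specification.

```python
from typing import NamedTuple


class ResponseData(NamedTuple):
    status_line: str      # the StatusLine, e.g. "HTTP/1.1 200 OK"
    entity_header: dict   # the Entityheader: lower-cased field names -> values
    entity_text: bytes    # the Entitytext: body bytes, not yet decoded


def split_response(raw: bytes) -> ResponseData:
    """Split a raw HTTP/1.x response into StatusLine, Entityheader, Entitytext."""
    # A blank line (CRLF CRLF) separates the header section from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    headers = {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep:  # ignore malformed lines that have no colon
            headers[name.strip().lower()] = value.strip()
    return ResponseData(lines[0], headers, body)


raw = (b"HTTP/1.1 200 OK\r\n"
       b"Content-Type: text/html; charset=Big5\r\n"
       b"\r\n"
       b"<html></html>")
parts = split_response(raw)
```

The body is kept as bytes on purpose: it cannot be decoded safely until the character set has been determined.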
The present invention further relates to an internet access device employing the above browser.
The present invention will be described in detail with reference to the illustrated embodiment and the accompanying drawings, in which:
Fig. 1 is a flow chart showing the steps for the control module of the auto-filter-page browser in this invention;
Fig. 2 is a flow chart showing the steps for the analyzing StatusLine module of the auto-filter-page browser in this invention; and
Fig. 3 is a flow chart showing the steps for the analyzing Entityheader module of the auto-filter-page browser in this invention.
In the present invention, an auto-filter-page browser is disclosed. Under the HTTP protocol, a "GET" or "POST" command is sent to the web server for retrieving the web page's content when the user wants to open a certain page on the web server. The web server then sends response data back to the user's side, where the "GET" or "POST" command originated.
The response data usually comprises a StatusLine, an Entityheader and an Entitytext. The Entityheader and the Entitytext may have data indicating the character set used by the web page.
Therefore, the present invention determines the character set used by the web page by analyzing the response data. When the character set used by the web page is supported by the browser, the web page continues downloading; otherwise a dialog frame is displayed asking whether or not the web page should continue being downloaded.
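One way this decision could look in practice is sketched below in Python. The specification does not say where in the Entityheader or Entitytext the character set is indicated, so the sketch assumes the `charset` parameter of a `Content-Type` header and, failing that, an HTML `<meta>` tag near the start of the body. The set `SUPPORTED_CHARSETS` and all function names are hypothetical.

```python
import re
from email.message import Message
from typing import Optional

# Hypothetical list of character sets this browser can render.
SUPPORTED_CHARSETS = {"us-ascii", "iso-8859-1", "utf-8"}

# Matches both <meta charset="..."> and the older http-equiv form.
META_CHARSET = re.compile(rb"<meta[^>]*charset\s*=\s*[\"']?([\w.:-]+)",
                          re.IGNORECASE)


def charset_from_header(content_type: Optional[str]) -> Optional[str]:
    """Read the charset parameter of a Content-Type value, if any."""
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()  # lower-cased name, or None


def charset_from_text(entity_text: bytes) -> Optional[str]:
    """Look for a <meta> charset declaration in the first 1024 body bytes."""
    match = META_CHARSET.search(entity_text[:1024])
    return match.group(1).decode("ascii").lower() if match else None


def detect_charset(content_type: Optional[str],
                   entity_text: bytes) -> Optional[str]:
    """Prefer the header; fall back to the body."""
    return charset_from_header(content_type) or charset_from_text(entity_text)


def should_prompt(charset: Optional[str]) -> bool:
    """True when the dialog frame should be shown before downloading further."""
    return charset is not None and charset not in SUPPORTED_CHARSETS
```

When no character set is indicated at all, the sketch does not prompt, so the page downloads as it would in a conventional browser.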
In a presently preferred embodiment of the present invention, in order to achieve automatic filtering of a web page before browsing, three modules, a control module, an analyzing StatusLine module and an analyzing Entityheader module are provided in the browser for controlling and analyzing purposes. The control module links to the web page for retrieving response data, including a StatusLine, an Entityheader and an Entitytext from the web page when the HTTP protocol is used to deliver pages. The analyzing StatusLine module confirms the reception of the response data by analyzing the status line. The analyzing Entityheader module will judge whether or not the character set used by the web page is supported by the browser and, if the character set used by the web page is not supported by the browser, displays a dialog frame asking whether or not the web page should continue being downloaded. In addition, the browser will set three Boolean variables consisting of bStatusLineGot, bEntityHeaderGot and bContinueVisit for recording the analyzing condition of the StatusLine, the Entityheader and the Entitytext.
The variable bStatusLineGot records whether or not the StatusLine has been received. The variable bEntityHeaderGot records whether or not the Entityheader has been received. The variable bContinueVisit records whether or not the user continues visiting the web page (i.e., proceeds with downloading it).
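As a small illustration, the three Boolean variables could be grouped into one state object. The Python sketch below uses a hypothetical class name `VisitState`, and its default values are the ones assigned at initialization in step S1-1.

```python
from dataclasses import dataclass


@dataclass
class VisitState:
    status_line_got: bool = False    # bStatusLineGot: StatusLine received
    entity_header_got: bool = False  # bEntityHeaderGot: Entityheader received
    continue_visit: bool = True      # bContinueVisit: keep downloading the page


state = VisitState()
```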
Referring to Fig. 1, a flow chart showing the steps for the control module of the auto-filter-page browser in this invention is illustrated. The control module links to the web page for retrieving response data, including a StatusLine, an Entityheader and an Entitytext, from the web page when utilizing the HTTP protocol to open a page. In step S1-1, when the user utilizes the HTTP protocol to open Web pages, the control module will default the bStatusLineGot as false (0), the bEntityHeaderGot as false (0) and the bContinueVisit as true (1) at initializing.
In step S1-2, the control module receives the URL address of the web page input by the user.
In step S1-3, the control module retrieves an IP address of the web page according to the URL address.
In step S1-4, the control module sends a linking request to a web server having the web page according to the IP address of the web page.
In step S1-5, the control module judges whether the transmission of the linking request is successful.
If the link is successful, the control module will execute step S1-6; otherwise step S1-14 is executed to end the control module.
In step S1-6, the control module sends a downloading request to the web server according to the "GET" or "POST" command of the HTTP protocol.
In step S1-7, the control module sends a "GET" or "POST" command to the web server where the web page is located.
In step S1-8, the control module judges whether the response data, including a StatusLine, an Entityheader and an Entitytext, has been retrieved from the web page. If the response data has been retrieved, the control module executes step S1-9; otherwise, it executes step S1-13 for interrupting the link.
In step S1-9, the control module receives the response data.
In step S1-10, the control module transmits the response data to the analyzing StatusLine module and the analyzing Entityheader module for judging whether the character set used by the web page is supported by the browser. If the character set used by the web page is not supported by the browser, a dialog frame is displayed asking whether or not the web page should continue being downloaded.
In step S1-11, the control module judges whether or not the web page should continue being downloaded according to the user's decision. If the user wants to download the web page, step S1-12 is executed; otherwise step S1-13 is executed to interrupt the link.
In step S1-12, the control module downloads and displays the portion of the web page according to the user's decision. Then, after the downloading is completed, the procedure returns to S1-8 for judging whether or not the web server still has additional response data.
In step S1-13, the control module interrupts the link with the web server. Then, the procedure executes step S1-14.
In step S1-14, the control module terminates.
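The control flow of steps S1-1 to S1-14 can be summarized in a short Python sketch. Networking is deliberately left out: `open_link`, `analyze` and `render` are hypothetical callables supplied by the caller, standing in for the connection steps (S1-2 to S1-7), the two analyzing modules (S1-10) and the display (S1-12). This is one reading of the flow chart, not the patented implementation.

```python
from typing import Callable, Iterable, Optional


def run_control_module(url: str,
                       open_link: Callable[[str], Optional[Iterable[bytes]]],
                       analyze: Callable[[bytes, dict], None],
                       render: Callable[[bytes], None]) -> dict:
    # S1-1: initialize the three flags.
    state = {"bStatusLineGot": False,
             "bEntityHeaderGot": False,
             "bContinueVisit": True}
    # S1-2 to S1-7: resolve the address, connect and send GET or POST.
    chunks = open_link(url)
    if chunks is None:                   # S1-5: link failed, go to S1-14
        return state
    for chunk in chunks:                 # S1-8, S1-9: more response data arrived
        analyze(chunk, state)            # S1-10: run the analyzing modules
        if not state["bContinueVisit"]:  # S1-11: user declined or status error
            break
        render(chunk)                    # S1-12: download and display this part
    close = getattr(chunks, "close", None)
    if callable(close):                  # S1-13: interrupt the link if possible
        close()
    return state                         # S1-14: terminate


shown = []


def fake_link(url):
    # Stand-in for a real connection: yields three chunks of response data.
    return iter([b"a", b"b", b"c"])


def fake_analyze(chunk, state):
    # Pretend the user declines once the second chunk has been analyzed.
    if chunk == b"b":
        state["bContinueVisit"] = False


result = run_control_module("http://example.test/", fake_link,
                            fake_analyze, shown.append)
```

In the usage above only the first chunk is rendered, because the download stops as soon as `bContinueVisit` becomes false.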
Fig. 2 illustrates a flow chart showing the steps for the analyzing StatusLine module of the auto-filter page browser in this invention. The analyzing StatusLine module is used for confirming the reception of the response data by analyzing the StatusLine.
In step S2-1, the analyzing StatusLine module gets the analyzed data i.e. the response data from the web page.
In step S2-2, the analyzing StatusLine module judges whether or not the StatusLine of the response data from the web page has been received completely. If it has been fully received, the analyzing StatusLine module will execute step S2-3; otherwise step S2-9 is executed to end the analyzing StatusLine module.
In step S2-3, the analyzing StatusLine module sets the variable bStatusLineGot as true (1).
In step S2-4, the analyzing StatusLine module gets the first three characters from the StatusLine.
In step S2-5, the analyzing StatusLine module transforms the first three characters into a status code of a decimal integer.
In step S2-6, the analyzing StatusLine module compares the status code with a predefined code (200) for confirming the reception of the response data. If the status code is correct, it will execute step S2-7; otherwise step S2-8 is executed to show an error message.
In step S2-7, the analyzing StatusLine module enters the analyzing Entityheader module for analyzing the Entityheader and Entitybody. Then, the procedure executes step S2-9.
In step S2-8, the analyzing StatusLine module shows an error message and sets the bContinueVisit as false (0). Then, the procedure executes step S2-9.
In step S2-9, the analyzing StatusLine module terminates.
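Steps S2-2 to S2-8 might be sketched in Python as follows. The sketch assumes that the StatusLine handed to the module begins with the three-digit status code, as the description implies. A full HTTP/1.x status line such as `HTTP/1.1 200 OK` starts with the protocol version, so a caller would first have to pass on only the text after the first space. The function names are hypothetical.

```python
from typing import Optional

EXPECTED_STATUS = 200  # the predefined code of step S2-6


def status_code(status_line: str) -> Optional[int]:
    """S2-4, S2-5: turn the first three characters into a decimal integer."""
    digits = status_line[:3]
    # isascii() plus isdigit() rejects text such as "HTT" and non-ASCII digits.
    if len(digits) == 3 and digits.isascii() and digits.isdigit():
        return int(digits)
    return None


def analyze_status_line(status_line: str, complete: bool, state: dict) -> bool:
    """Return True when the Entityheader analysis should run next (S2-7)."""
    if not complete:                     # S2-2: StatusLine not fully received
        return False
    state["bStatusLineGot"] = True       # S2-3
    if status_code(status_line) == EXPECTED_STATUS:  # S2-6
        return True                      # S2-7
    print(f"error: unexpected status line {status_line!r}")  # S2-8
    state["bContinueVisit"] = False
    return False
```

Returning a Boolean rather than calling the next module directly keeps the sketch self-contained; the description instead has the StatusLine module enter the Entityheader module itself.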
Fig. 3 illustrates a flow chart showing the steps for (processing by) the analyzing Entityheader module of the auto-filter-page browser in this invention. The analyzing Entityheader module is used for analyzing the Entityheader and Entitybody and judging whether or not the character set used by the web page is supported by the browser and, if the character set used by the web page is not supported by the browser, displaying a dialog frame asking whether or not the web page should continue being downloaded.
In step S3-1, the analyzing Entityheader module gets the analyzed data i.e. the response data from the web page.
In step S3-2, the analyzing Entityheader module judges whether or not the Entityheader of the response data from the web page has been received completely. If it has not been fully received, the analyzing Entityheader module will execute step S3-3; otherwise step S3-13 is executed to analyze Entitybody (as in a traditional browser).
In step S3-3, the analyzing Entityheader module judges whether or not the response data has an Entityheader. If the response data has the Entityheader, the analyzing Entityheader module executes step S3-4; otherwise it executes step S3-14 to end the analyzing Entityheader module.
In step S3-4, the analyzing Entityheader module gets the Entityheader of the response data from the web page.
In step S3-5, the analyzing Entityheader module judges whether the Entityheader contains information for indicating the character set used by the web page.
If the Entityheader contains information for indicating the character set, it will execute step S3-6; otherwise it executes step S3-11 to judge whether or not the Entityheader of the response data has been received completely.
In step S3-6, the analyzing Entityheader module judges whether or not the character set used by the web page is supported by the browser. If the character set is not supported by the browser, step S3-7 is executed to display a dialog frame asking whether or not the web page should continue being downloaded; otherwise it executes step S3-11 to judge whether or not the Entityheader of the response data has been received completely.
In step S3-7, the analyzing Entityheader module displays a dialog frame asking whether or not the web page should continue being downloaded.
In step S3-8, the analyzing Entityheader module judges whether the user continues downloading the web page. If the user decides to terminate the downloading, it will execute step S3-9 to set the bContinueVisit as false (0) and S3-14 to end the analyzing Entityheader module; otherwise step S3-10 is executed to set the bContinueVisit as true (1).
In step S3-9, the analyzing Entityheader module sets the bContinueVisit as false (0).
In step S3-10, the analyzing Entityheader module sets the bContinueVisit as true (1).
In step S3-11, the analyzing Entityheader module judges whether or not the Entityheader of the response data has been received completely. If the Entityheader of the response data has been received completely, it will execute S3-12 to set the bEntityHeaderGot as true (1); otherwise step S3-14 is executed to end the analyzing Entityheader module.
In step S3-12, the analyzing Entityheader module sets the bEntityHeaderGot as true (1).
In step S3-13, the analyzing Entityheader module analyzes whether or not the Entitytext contains information for indicating the character set used by the web page. Then, the procedure executes step S3-14.
The action of step S3-13 is similar to that in a traditional browser, so it is not described in detail here.
In step S3-14, the analyzing Entityheader module terminates and returns to the control module.
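The branching of steps S3-2 to S3-14 can be followed more easily in code. The Python sketch below is one interpretation: it assumes the character set is carried in a `content-type` field of the Entityheader, that the dialog frame is represented by a callable `ask_user` returning True to continue, and that step S3-10 proceeds to step S3-11 (the description does not state the successor). The set `SUPPORTED` and all names are hypothetical, and step S3-13 is only signalled, not implemented.

```python
import re
from typing import Callable, Optional

SUPPORTED = {"us-ascii", "iso-8859-1", "utf-8"}  # hypothetical capability list
CHARSET_PARAM = re.compile(r"charset\s*=\s*\"?([\w.:-]+)", re.IGNORECASE)


def header_charset(entity_header: dict) -> Optional[str]:
    """S3-5: find a charset indication in the Entityheader, if present."""
    match = CHARSET_PARAM.search(entity_header.get("content-type", ""))
    return match.group(1).lower() if match else None


def analyze_entity_header(state: dict, entity_header: dict,
                          header_complete: bool,
                          ask_user: Callable[[str], bool]) -> str:
    """Update the flags in `state`; return the name of the branch taken."""
    if state["bEntityHeaderGot"]:            # S3-2: header already handled
        return "analyze-entitytext"          # S3-13, left to the caller
    if not entity_header:                    # S3-3: no Entityheader yet
        return "end"                         # S3-14
    charset = header_charset(entity_header)  # S3-4, S3-5
    if charset is not None and charset not in SUPPORTED:  # S3-6
        if not ask_user(charset):            # S3-7, S3-8: show the dialog frame
            state["bContinueVisit"] = False  # S3-9
            return "end"                     # S3-14
        state["bContinueVisit"] = True       # S3-10
    if header_complete:                      # S3-11
        state["bEntityHeaderGot"] = True     # S3-12
    return "end"                             # S3-14
```

Passing the dialog in as a callable keeps the decision logic separate from any user interface, which also makes the branches easy to exercise without a real dialog frame.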
While the invention has been particularly shown and described with reference to the preferred embodiment thereof, it will be understood by those skilled in the art that various changes in form and details may be made without departing from the scope of the invention.
The browser of the present invention can be embodied in various kinds of internet access device such as a personal computer, Internet-capable cellphone, Internet-capable television receiver, and so on.
The browser of the present invention is not restricted to use with the World-Wide Web, but can also be used to access corporate intranets and other forms of information networks based on WWW standards.

Claims (14)

CLAIMS:
1. A method for automatically filtering a web page before browsing, comprising the steps of:
providing a browser; linking, with the browser, to the web page for retrieving response data from the web page; determining a character set used by the web page by analyzing the response data; and deciding whether or not the web page should continue being downloaded according to the character set used by the web page.
2. A method as recited in claim 1 wherein the response data comprises a StatusLine, an Entityheader and an Entitytext.
3. A method as recited in claim 2, wherein the step of determining the character set used by the web page comprises the steps of:
confirming the reception of the response data by analyzing the StatusLine; and judging the character set used by the web page by analyzing the Entityheader and the Entitytext.
4. A method as recited in claim 3, wherein the step of confirming the reception of the response data comprises the steps of:
receiving the StatusLine of the response data from the web page; acquiring the first three characters from the StatusLine; transforming the first three characters into a status code; and comparing the status code with a predefined code for confirming the reception of the response data.
5. A method as recited in claim 4, wherein the step of transforming the first three characters into the status code comprises a step of transforming the first three characters into a decimal integer.
6. A method as recited in claim 4 or 5, wherein the predefined code is 200.
7. A method as recited in claim 3, 4, 5 or 6, wherein the step of judging the character set used by the web page by analyzing the Entityheader and the Entitytext comprises the steps of:
receiving the Entityheader of the response data from the web page; judging whether or not the Entityheader contains information for indicating the character set used by the web page; analyzing whether or not the Entitytext contains information for indicating the character set used by the web page; judging whether the character set used by the web page is supported by the browser; and if the character set used by the web page is not supported by the browser, displaying a dialog frame asking whether or not the web page should continue being downloaded.
8. A browser for automatically filtering a web page, comprising:
a control module for linking to the web page for retrieving response data from the web page, the response data including a StatusLine, an Entityheader and an Entitytext; a first analyzing module for confirming the reception of the response data by analyzing the StatusLine; and a second analyzing module for deciding a character set used by the web page, wherein whether or not the web page should continue being downloaded is decided according to the character set used by the web page.
9. The browser as recited in claim 8, wherein the control module retrieves an IP address of the web page, sends a linking request to a web server having the web page, judges whether the transmission of the linking request is successful, sends a downloading request to the web server when the transmission of the linking request is successful and retrieves the response data from the web server.
10. The browser as recited in claim 8 or 9, wherein the first analyzing module judges whether or not the StatusLine of the response data from the web page has been received, retrieves the first three characters from the StatusLine, transforms the first three characters into a status code and compares the status code with a predefined code for confirming the reception of the response data.
11. The browser as recited in claim 10, wherein the first three characters of the StatusLine are transformed into the status code of a decimal integer.
12. The browser as recited in claim 10 or 11, wherein the predefined code is 200.
13. The browser as recited in any of claims 8 to 12, wherein the second analyzing module judges whether or not the Entityheader of the response data from the web page has been received, judges whether the Entityheader contains information for indicating the character set used by the web page, analyzes whether or not the Entitytext contains information for indicating the character set used by the web page, judges whether or not the character set used by the web page is supported by the browser and, if the character set used by the web page is not supported by the browser, displays a dialog frame asking whether or not the web page should continue being downloaded.
14. A browser substantially as hereinbefore described with reference to the accompanying drawings.
15. An internet access device incorporating the browser of any of claims 8 to 14.
GB9926181A 1999-11-04 1999-11-04 Character set auto-filtering browser Withdrawn GB2356070A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
GB9926181A GB2356070A (en) 1999-11-04 1999-11-04 Character set auto-filtering browser

Publications (2)

Publication Number Publication Date
GB9926181D0 GB9926181D0 (en) 2000-01-12
GB2356070A true GB2356070A (en) 2001-05-09

Family

ID=10863984

Family Applications (1)

Application Number Title Priority Date Filing Date
GB9926181A Withdrawn GB2356070A (en) 1999-11-04 1999-11-04 Character set auto-filtering browser

Country Status (1)

Country Link
GB (1) GB2356070A (en)

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
http://andrew2.andrew.cmu.edu/rfc/rfc2068.html, January 1997Req. for Com. 2068, Net. Working Group *
http://www.flora.org/lynx-dev/lynx-dev/9509/0113.html, Sep95"Support of Content-Type", Lynx dev grp *

Also Published As

Publication number Publication date
GB9926181D0 (en) 2000-01-12

Similar Documents

Publication Publication Date Title
US6507867B1 (en) Constructing, downloading, and accessing page bundles on a portable client having intermittent network connectivity
US7810049B2 (en) System and method for web navigation using images
US6027024A (en) Hand-held portable WWW access terminal with visual display panel and GUI-based WWW browser program integrated with bar code symbol reader
US6625447B1 (en) Method and architecture for an interactive two-way data communication network
KR100266937B1 (en) Web browser method and system for display and management of server latency
JPH10326244A (en) Method for transmitting data and server used for the same
US6915328B2 (en) Web content format for mobile devices
US20020017566A1 (en) Hand-held www access device with gui-based www browser program integrated with bar code symbol reader for automatically accessing and displaying html-encoded documents by reading bar code symbols.
US20050066037A1 (en) Browser session mobility system for multi-platform applications
US20010044824A1 (en) System for using wireless web devices to store web link codes on list server for subsequent retrieval
US20030195963A1 (en) Session preservation and migration among different browsers on different devices
US20060004775A1 (en) Method and system for sharing the browser
US6799300B1 (en) Document processor
US7143181B2 (en) System and method of sending chunks of data over wireless devices
US20030187952A1 (en) System and method for formatting information requested by a mobile device
US6942150B2 (en) Web-based mobile information access terminal
US20060106837A1 (en) Parsing system and method of multi-document based on elements
JP2001292270A (en) Communication terminal equipment
KR20020031691A (en) Method and system for real-time transforming internet contents
WO2005045699A1 (en) Method and system for delivering documents to terminals with limited display capabilities, such as mobile terminals
JP2987355B2 (en) Hypertext display system and hypertext display method
JP2005513647A (en) Hypermedia access function
WO2008132706A1 (en) A web browsing method and system
EP1071024A2 (en) Method and apparatus for splitting markup flows into discrete screen displays
JPH10149372A (en) Information display device

Legal Events

Date Code Title Description
WAP Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1)