CN101539914A - Technical proposal for readable customization conversion of web pages - Google Patents

Info

Publication number
CN101539914A
Authority
CN
China
Prior art keywords
data
rule
web
content
reader
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN200810084688A
Other languages
Chinese (zh)
Inventor
韩露
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual filed Critical Individual
Priority to CN200810084688A priority Critical patent/CN101539914A/en
Publication of CN101539914A publication Critical patent/CN101539914A/en
Pending legal-status Critical Current

Abstract

A readable customization conversion technique comprises the following steps: processing web page data described in an existing text-based web page language; presenting to the reader the content selected according to specific rules; and displaying that content in a way that better matches the reader's reading habits. The rules for processing the content are determined by the source URL of the web page data or by a URL associated with it. The rules work by actively selecting specific content and by redefining the display format.

Description

Technical scheme for readable customization conversion of web pages
Technical field
The present invention is a technical scheme for content selection and readable customization conversion of Internet web pages described in a text data format.
Background
The web pages widely used on the Internet are data described in a text format and composed according to the specification of a particular computer language. Such languages are markup languages designed for creating web pages and for viewing information in a web browser. Among them, the Hypertext Markup Language (HTML) is the most widely used and supported. It is an international standard maintained by the World Wide Web Consortium (W3C).
HTML describes every aspect of a web page as plain text data, including the textual content, the page layout, the presentation style, and references to other types of content (such as images, video, and sound). Based on this description, the browser displays the page content in the user interface in the manner specified by the W3C standard. Because the HTML language is standardized, a page can be presented to the reader in the style that the page author designed in advance. HTML is supported and used by all mainstream browsers and is the basic, core technology behind reading web pages on the Internet.
On top of standard HTML, browser developers and vendors have introduced additional markup, such as the VBScript scripting language proposed by Microsoft and the JavaScript scripting language proposed by Netscape. When this additional markup is attached to HTML-based information, the corresponding browsers can produce display effects and extra functions that the HTML standard does not provide.
In addition, there are other text-based, non-HTML language specifications. They are usually derived from HTML and share all the characteristics of HTML described above.
Standard HTML data, non-standard HTML extension data, and web page description languages in other non-HTML text data formats all fall within the scope of the present invention. In this description they are collectively referred to as web page languages and web page data.
The typical scenario in which a web page language is used is as follows. Through the browser's interface, the reader requests web page data stored at a specified network location by typing the address manually, selecting a saved bookmark, or clicking a link in a web page. The network location is described by a Uniform Resource Locator (URL) and usually refers to a web page file maintained by a web server. After the web page data is delivered from the source network location to the browser, the browser presents the information contained in the page to the reader in the manner defined by the web page language, using the presentation and format described in the data. The web page data provided by the web server may be static data that already contains the complete page information, or dynamic data that initially contains only part of the page information and is completed by a program in real time before it is delivered to the browser. Whether the data is static or dynamic, its display format and its selection of content are determined in advance by the page author and organizer.
Another common scenario is as follows. Web page data that has been read is saved as a static file in a location that the reader's computer can access directly, such as the local hard disk, a network storage device that the computer can access directly, or removable storage media attached to the computer. In this case the browser reads the file content directly and displays the page in the same way as in the typical scenario above. Here too, the display format and the selection of content are determined in advance by the page author and organizer.
As described above, the content and presentation that the reader sees are designed and specified in advance by the page author, and the reader accepts them passively. The page author, in turn, can only design presentation styles within the range supported by the web page language standard. In general, the reader can neither control nor adjust the page content or the page layout.
As a result, the following problems arise when web pages are displayed:
1. To meet the needs of most of its target readers, the page author often produces a page whose scope exceeds what any single reader needs. The reader then has to sift through a large amount of content to find what is needed or interesting and ignore the rest. When the time spent on this sifting, and its share of total reading time, reach a certain level, the reading experience deteriorates.
2. To serve its own purposes, the page author often adds content that departs from the needs of the target readers, and sometimes uses technical means to make viewing that content a precondition for reading the rest. To obtain the part of the page that is needed, the reader often has to view content that is unwanted or even disliked. When the time spent on this reluctant reading, and its share of total reading time, reach a certain level, the reading experience deteriorates.
3. Because of the limits of the web page language standard itself and the corresponding limits or defects of browsers, and because readers differ from one another and from the page author in reading habits, eyesight, and aesthetic preferences, the page produced by the author often cannot give some readers the presentation and reading experience they want, nor can it give all readers the same level of reading experience.
Some existing technologies offer partial solutions to these problems:
1. Full web page blocking: The basic principle is that an access controller specifies a set of page blocking conditions, which generally include URL features and blocking keywords. When the reader attempts to access a page, the blocking program first filters the URL of the page; if the address does not meet the pass conditions, access to the whole page is blocked. It then filters the page data that passed URL filtering against the blocking keywords; if a blocking keyword matches, access to the whole page is blocked. This technology can partly solve problem 2, but because the whole page is blocked, the content the reader wants to see is blocked as well, so it does not improve the reading experience. In its usual application, moreover, the access controller is the party that controls blocking and is often not the reader, so the technology serves information control more than it improves reading. It also does little to solve problems 1 and 3.
2. Partial web page filtering: The basic principle is that an access controller specifies a set of page blocking conditions, which generally include URL features and blocking keywords. When the reader attempts to access a page, the blocking program first filters the URLs accessed by the page; for an access whose address does not meet the pass conditions, it filters out that access and blanks the relevant part of the page. It then filters the page data that passed URL filtering against the blocking keywords; if a blocking keyword matches, it blanks the relevant part of the page. This technology improves on full web page blocking and can be combined with it. It solves problem 2 better and also offers a simple solution to problem 1. However, keyword blocking is a simple exclusion method; in many cases it cannot accurately pick out the content the reader is interested in, and it easily filters the wrong content. Because this technique keeps the presentation style information in the web page data, it cannot solve problem 3. In addition, blanking the area of the filtered content (usually by leaving it empty or replacing it with a warning notice) easily disrupts the layout and visual balance of the original page and may further harm the reading experience.
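The difference between the two prior-art techniques above can be illustrated with a minimal Python sketch. It is not taken from any real blocking product: the host list, the keyword list, the `[blocked]` placeholder, and both function names are assumptions made for illustration.

```python
from typing import List, Optional
from urllib.parse import urlparse

# Hypothetical blocking conditions set by an access controller.
BLOCKED_HOSTS = {"ads.example.com"}   # URL features
BLOCKED_KEYWORDS = {"casino"}         # blocking keywords


def block_whole_page(url: str, page_text: str) -> Optional[str]:
    """Full blocking: return None (nothing is shown) if the URL or any keyword matches."""
    if urlparse(url).hostname in BLOCKED_HOSTS:
        return None
    if any(word in page_text for word in BLOCKED_KEYWORDS):
        return None
    return page_text


def blank_matching_parts(paragraphs: List[str]) -> List[str]:
    """Partial filtering: keep the page but blank every paragraph that matches a keyword."""
    return [
        "[blocked]" if any(word in p for word in BLOCKED_KEYWORDS) else p
        for p in paragraphs
    ]
```

With these sample conditions, a page that mentions both the weather and a casino is lost entirely under full blocking, while partial filtering keeps the weather paragraph and leaves a placeholder where the other paragraph used to be. Both approaches exclude content; neither picks out what the reader actually wants.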
The conclusion from the above analysis is that the prior art does not provide a complete solution to the web reading experience problems identified above.
Summary of the invention
The present invention aims to overcome the limits of web page languages themselves and the negative effect that current practice in page authoring and publishing has on the reader's experience. It provides a readable customization conversion technique that processes existing web page data, presents to the reader the content selected by specific rules, and displays it in a way that better matches the reader's reading habits, thereby improving the reading experience.
The technical scheme adopted by the present invention to solve the above technical problem is to implement a browser with the following behavior. When the reader accesses a page through it, the browser first obtains the web page data through the standard process. It then determines the applicable rule group from the source network address of the data and, following that rule group, screens the content of the web page data by active selection based on feature strings, keeping only the selected content and discarding the other content and the style information of the page. The retained content may contain no original page style information, or only incomplete style information. The browser ignores any residual original style information and displays the content according to the format reconstruction rules defined in the rule group. The content selection rules and format reconstruction rules are either preset by the browser or defined by the reader. After rule processing, the web page data may still be described in HTML, or it may be described in another format or in the browser's internal data structures.
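As one possible illustration of how residual style information in a selected fragment might be ignored, the following Python sketch keeps only the text nodes of an HTML fragment. The class name `TextOnly` and the function `strip_residual_style` are hypothetical; the patent does not prescribe any particular parsing method.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collects text nodes; drops all tags, attributes, and <style>/<script> bodies."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0  # greater than 0 while inside <style> or <script>

    def handle_starttag(self, tag, attrs):
        if tag in ("style", "script"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("style", "script") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.parts.append(text)


def strip_residual_style(fragment):
    """Return the text pieces of a selected fragment with all original styling removed."""
    parser = TextOnly()
    parser.feed(fragment)
    parser.close()
    return parser.parts
```

For example, a fragment containing an inline `style` attribute, a `<font>` tag, and a `<style>` element is reduced to its plain text pieces, which can then be laid out afresh by whatever format reconstruction rule applies.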
The beneficial effect of the present invention is that when the reader reads a page, the browser no longer presents the complete content of the original page in the style designed by the page author. Instead it presents a form in which the content has been analyzed and screened and the format has been custom-modified. With precise selection and optimized reconstruction of the format, the reader spends less time and effort screening for information and obtains the relevant data more directly and accurately, while the presentation of the content is more readable and can be controlled by the reader. As a result, the reader reads web pages more efficiently and has a better reading experience.
Compared with the prior art, its advantages are:
1. Content selection operates on parts of the web page data, so the whole page is not blocked because of part of its content.
2. Content is chosen by active selection rather than simple exclusion, which improves the focus and accuracy of the selection.
3. The content selection rules are determined by the source network address. Different rules can be applied to different websites, or to different pages of the same website, which further improves the focus and accuracy of the selection.
4. The selected content is laid out and displayed according to rules determined by the source network address. This eliminates the blank areas left in the original layout after content is filtered out, and the disordered layout caused by format information that is lost during filtering.
5. The selected content is laid out and displayed according to rules determined by the source network address, and these rules can include the reader's own customizations. The display of the page is therefore no longer limited to the page organizer's design but is decided by the reader. This eliminates the loss of reading quality and reading experience caused by differences in reading habits, eyesight, and aesthetic preferences.
Description of drawings
The present invention is further described below with reference to the drawings and an embodiment.
Fig. 1. Typical external interaction of the present invention
Fig. 2. Logical function modules of the present invention and how they operate
In the figures:
1. Reader
2. Browser
3. Network
4. Web server
5. Web page data saved as a file
101. Web page browsing request initiated by the reader
102. Web page data request sent by the browser to the web server over the network
103. Web page data returned by the web server to the browser over the network
104. Web page presented by the browser to the reader
201. Web page data read by the browser directly from a file
301. Web page request located by a URL (may carry attached rule information)
302. Requested web page information (may carry an attached URL and rule information)
303. Rule-processed web page data described in the browser's internal data organization format
304. Rule-processed web page data described in an intermediate data format
501. Network address acquisition module
502. Web page data acquisition module
503. Web page rule processing module
504. Web page presentation module
505. Rule selection module
506. Address mapping module
507. Intermediate format conversion module
Embodiment
A typical external interaction, or application scenario, of the present invention is shown in Fig. 1. Its processing flow is described below:
Steps:
1. The reader (1) initiates a web page browsing request (101).
2. The browser (2) accepts and processes the request (101) and forms a corresponding request for external web page data. This may be a network request (102) to the web server (continue with step 3) or a local access operation (201) on web page data saved as a file (5) (skip to step 6).
3. The network request (102) is transmitted over the network (3) to the web server (4).
4. The web server (4) processes the network request (102) and sends the requested web page data back to the browser (2) in a network response (103).
5. The network response (103) is transmitted over the network (3) to the browser (2).
6. The browser (2) analyzes and processes the obtained web page data (including the necessary content selection and format reconstruction) to form the final web page display (104) and presents it to the reader (1).
Note: besides the core steps described above, the actual flow may include other additional steps and exception handling.
All of the present invention is contained in the browser (2) and is carried out in step 6.
The technical scheme of the present invention implements a web browser. In addition to a range of common browser functions, when the feature is enabled the browser can perform the following operations of the present invention (as shown in Fig. 2):
Steps:
1. The network address acquisition module (501) resolves the user's page request into the form of a Uniform Resource Locator (URL) and provides the address information (301) for subsequent operations. This URL resolution may be based on information obtained from a domain name server (DNS), on domain name mapping information predefined on the local machine, or on URL mappings preset inside the software.
1.5 (Optional step) The rule selection module (505) finds the corresponding content selection rules and format reconstruction rules according to the address information (301). The address information, the rule information, and the correspondence between them are preconfigured and maintained by the browser.
2. The web page data acquisition module (502) obtains the web page data according to the address information (301). As specified by the address information (301), the data source may be the web server (4) or file content (5) accessible to the local machine. If the data source is a file (5), the address mapping module (506) may be needed to map the local file location to the original network address (URL) and attach it to the web page data (201) passed to the web page data acquisition module (502), to provide the basis for subsequent rule selection. After obtaining the web page data and the corresponding URL information, the web page data acquisition module (502) passes this set of data (302) to the web page rule processing module (503).
2.5 (Optional step) The rule selection module (505) finds the corresponding content selection rules and format reconstruction rules according to the address information associated with the web page data (302). The address information, the rule information, and the correspondence between them are preconfigured and maintained by the browser. (Note: at least one of the optional steps 1.5 and 2.5 must be used.)
3. After obtaining the page data together with the selected content selection rules and format reconstruction rules (302), the web page rule processing module (503) first screens the page content as the rules require and then reconstructs the format of the resulting target content according to the format reconstruction rules. Depending on the specific design of the browser, the output may be produced directly in the browser's internal data organization format (303) or in an intermediate data format (304). The intermediate data format may be exported as a file or in another form for use by other software or by this browser. If the internal data organization format (303) is used, skip to step 5.
4. The intermediate format conversion module (507) parses the intermediate web page data format (304), converts it to the browser's internal data organization format (303), and passes it to the web page presentation module (504).
5. After obtaining the web page data described in the browser's internal data organization format (303), the web page presentation module (504) presents the page content (104) in the format reconstruction manner designed for this browser.
Note: the modules above are divided and described by logical function; they do not restrict the actual architecture of the browser or the way its code modules are built.
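Step 2 relies on the address mapping module (506) when the page comes from a saved file, so that URL-based rule selection still has something to match. A minimal Python sketch of that idea follows. The lookup table, the sample path, and the function name `attach_original_url` are assumptions made for illustration; the patent does not say how the mapping is stored.

```python
import posixpath
from typing import Dict, Optional, Tuple


def attach_original_url(
    local_path: str, page_data: str, address_map: Dict[str, str]
) -> Tuple[str, Optional[str]]:
    """Pair saved page data with the URL it originally came from.

    Returns (page_data, original_url); original_url is None when the file
    has no known origin, in which case no URL-specific rule can be chosen.
    """
    key = posixpath.normpath(local_path)  # normalise './' and '..' segments
    return page_data, address_map.get(key)


# Hypothetical mapping kept by the browser: saved file -> original URL.
SAMPLE_MAP = {
    "/home/reader/saved/article.html": "http://news.example.com/article.html",
}
```

The returned pair corresponds loosely to the data set (302): the page data plus the URL that the rule selection module (505) uses in step 2.5.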
Among the modules of the present invention, the internal processing of the rule selection module (505) and the web page rule processing module (503) forms the distinctive technical combination.
The processing mechanism of the rule selection module (505) is as follows:
This module has access to a class of internal data that describes the mapping between URL features and rule groups. When it receives a URL from an external module, it searches the mapping data for a condition that matches the features of this URL. When the URL is recognized as matching a preset URL feature, the rule group mapped to that feature is associated with the URL for use in subsequent processing.
The rule group described above consists of the following:
Content selection rules: these rules contain the information needed to select content from the target web page data. The selection method adopted by the present invention is active selection rather than exclusion. That is, the part of the content to be looked for in the web page data is determined in advance, and an identification and location description of that content is provided, so that the web page rule processing module can extract the required content from the original web page data. The identification and location description mainly consists of 1) feature strings that identify this part of the content in the web page data, and 2) the position of the content relative to certain feature strings.
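A content selection rule of this kind might be sketched in Python as a pair of feature strings plus an offset. The field names, the choice of a character offset as the "relative position", and the function `select_content` are assumptions for illustration, not the patent's own data layout.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SelectionRule:
    start_marker: str          # feature string that precedes the wanted content
    end_marker: str            # feature string that follows the wanted content
    skip_after_start: int = 0  # relative position: characters to skip after start_marker


def select_content(page: str, rule: SelectionRule) -> Optional[str]:
    """Actively select the text between two feature strings; None if either is missing."""
    start = page.find(rule.start_marker)
    if start == -1:
        return None
    start += len(rule.start_marker) + rule.skip_after_start
    end = page.find(rule.end_marker, start)
    if end == -1:
        return None
    return page[start:end]
```

Applied to a page with a navigation block, a story block, and an advertisement block, a rule whose start marker is `<div id="story">` returns only the story text. Nothing else has to be listed for exclusion, which is the point of active selection.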
Format reconstruction rules: these rules describe how to form the display format of the data obtained by content selection, such as display position, font style, font size, and font color. A format reconstruction rule may be customized by the rule maker according to the logical semantics of the original pages associated with the relevant URL feature, or it may follow a general rule. The rule maker may be a developer or maintainer of this browser, or a user of this browser.
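As a rough illustration of a format reconstruction rule, the Python sketch below lays out already-selected paragraphs with reader-chosen font settings and emits HTML, which is one of the output forms the description allows. The `FormatRule` fields, their default values, and the function `rebuild_page` are hypothetical.

```python
from dataclasses import dataclass
from html import escape
from typing import List


@dataclass
class FormatRule:
    font_family: str = "serif"
    font_size_px: int = 18
    color: str = "#222222"


def rebuild_page(paragraphs: List[str], rule: FormatRule) -> str:
    """Lay out selected text using only the rule's styling, none of the original page's."""
    style = (
        f"font-family:{rule.font_family};"
        f"font-size:{rule.font_size_px}px;"
        f"color:{rule.color}"
    )
    body = "".join(f"<p>{escape(p)}</p>" for p in paragraphs)
    return f'<div style="{style}">{body}</div>'
```

A reader with weaker eyesight could, for instance, use `FormatRule(font_size_px=24)` for every site, while a rule maker might define a site-specific rule for pages whose structure is known.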
The processing mechanism of the web page rule processing module (503) is as follows:
Given web page data and the content selection rules and format reconstruction rules determined by its associated URL feature, this module selects content from the original data by active selection, then applies the given format reconstruction rules to the selected data to produce its target data. The target data of this module can take two forms:
1. The internal data format of this browser, which can be used directly by the web page presentation module (504);
2. An agreed intermediate data format, such as an XML format that can be exported. The intermediate format conversion module (507) of this browser, or a corresponding module of other software, can restore this format to the internal data format for later page display.
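The second output form can be sketched as a round trip through XML in Python. The element names `page`, `format`, and `para`, and the tuple used as the "internal" form, are invented for this example; the patent names XML only as one possible intermediate format and defines no schema.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Tuple


def to_intermediate(source_url: str, style: Dict[str, str], paragraphs: List[str]) -> str:
    """Serialise rule-processed page data to an XML string (hypothetical schema)."""
    root = ET.Element("page", {"source": source_url})
    ET.SubElement(root, "format", style)
    for text in paragraphs:
        ET.SubElement(root, "para").text = text
    return ET.tostring(root, encoding="unicode")


def from_intermediate(xml_text: str) -> Tuple[str, Dict[str, str], List[str]]:
    """Restore the internal form, as the intermediate format conversion module (507) would."""
    root = ET.fromstring(xml_text)
    style = dict(root.find("format").attrib)
    paragraphs = [p.text or "" for p in root.findall("para")]
    return root.get("source"), style, paragraphs
```

Because the intermediate form is plain text, it could be written to a file and read back later by the same browser or by other software, as the description suggests.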

Claims (4)

1. A readable customization conversion technique, characterized in that: web page data described in an existing text-based web page language is processed, the content selected according to specific rules is presented to the reader, and it is displayed in a way that better matches the reader's reading habits.
2. The readable customization conversion technique according to claim 1, characterized in that: the corresponding content selection rules and format reconstruction rules are selected according to the source URL of the web page data being converted or a URL associated with it.
3. The readable customization conversion technique according to claim 1, characterized in that: in the step of selecting content from the original page data, the active selection method described by the content selection rules is used.
4. The readable customization conversion technique according to claim 1, characterized in that: in the step of displaying the converted data, the method described by the format reconstruction rules is used.
CN200810084688A 2008-03-18 2008-03-18 Technical proposal for readable customization conversion of web pages Pending CN101539914A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN200810084688A CN101539914A (en) 2008-03-18 2008-03-18 Technical proposal for readable customization conversion of web pages

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN200810084688A CN101539914A (en) 2008-03-18 2008-03-18 Technical proposal for readable customization conversion of web pages

Publications (1)

Publication Number Publication Date
CN101539914A true CN101539914A (en) 2009-09-23

Family

ID=41123105

Family Applications (1)

Application Number Title Priority Date Filing Date
CN200810084688A Pending CN101539914A (en) 2008-03-18 2008-03-18 Technical proposal for readable customization conversion of web pages

Country Status (1)

Country Link
CN (1) CN101539914A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101894138A (en) * 2010-06-25 2010-11-24 优视科技有限公司 Visual page content subscription processing method and system thereof
CN102096542A (en) * 2009-12-14 2011-06-15 深圳速浪数字技术有限公司 Business and operation support system (BOSS) personalized display method and personalized BOSS
CN102595366A (en) * 2011-01-07 2012-07-18 中国移动(深圳)有限公司 Generating method and system of roaming protocol file
CN102955800A (en) * 2011-08-25 2013-03-06 腾讯科技(深圳)有限公司 Method, system and terminal for website access
CN103455601A (en) * 2013-09-03 2013-12-18 小米科技有限责任公司 Webpage processing method and device, and terminal equipment

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102096542A (en) * 2009-12-14 2011-06-15 深圳速浪数字技术有限公司 Business and operation support system (BOSS) personalized display method and personalized BOSS
CN101894138A (en) * 2010-06-25 2010-11-24 优视科技有限公司 Visual page content subscription processing method and system thereof
CN101894138B (en) * 2010-06-25 2012-11-07 优视科技有限公司 Visual page content subscription processing method and system thereof
CN102595366A (en) * 2011-01-07 2012-07-18 中国移动(深圳)有限公司 Generating method and system of roaming protocol file
CN102595366B (en) * 2011-01-07 2014-10-08 中国移动(深圳)有限公司 Generating method and system of roaming protocol file
CN102955800A (en) * 2011-08-25 2013-03-06 腾讯科技(深圳)有限公司 Method, system and terminal for website access
CN102955800B (en) * 2011-08-25 2016-03-16 腾讯科技(深圳)有限公司 website access method, system and terminal
CN103455601A (en) * 2013-09-03 2013-12-18 小米科技有限责任公司 Webpage processing method and device, and terminal equipment

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20090923