EP3602352A1 - Transformation of marked-up content into a file format that enables automated browser based pagination - Google Patents
Info
- Publication number
- EP3602352A1 (application EP18776286.9A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- file format
- content
- marked
- page
- ffts
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/957—Browsing optimisation, e.g. caching or content distillation
- G06F16/9577—Optimising the visualization of content, e.g. distillation of HTML documents
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/80—Information retrieval; Database structures therefor; File system structures therefor of semi-structured data, e.g. markup language structured data such as SGML, XML or HTML
- G06F16/84—Mapping; Conversion
- G06F16/88—Mark-up to mark-up conversion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/103—Formatting, i.e. changing of presentation of documents
- G06F40/114—Pagination
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/12—Use of codes for handling textual entities
- G06F40/14—Tree-structured documents
- G06F40/143—Markup, e.g. Standard Generalized Markup Language [SGML] or Document Type Definition [DTD]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/12—Use of codes for handling textual entities
- G06F40/151—Transformation
- G06F40/154—Tree transformation for tree-structured or markup documents, e.g. XSLT, XSL-FO or stylesheets
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/80—Information retrieval; Database structures therefor; File system structures therefor of semi-structured data, e.g. markup language structured data such as SGML, XML or HTML
- G06F16/81—Indexing, e.g. XML tags; Data structures therefor; Storage structures
Definitions
- a typical markup language document is made of different types of content, for example, textual content, images, videos, etc., and carries syntax information that instructs a browser how to render different types of content in the markup language document to a user.
- the syntax information comprises a set of markup language tags that are executed on the browser.
- rendering a document on a browser can be controlled, for example, by using cascading style sheets (CSS) that describe the formatting of a document written in a markup language.
- CSS cascading style sheets
- a CSS document is typically attached, embedded, or linked to a markup language document. The CSS defines how each element, for example, font size of text, color of a background or text, position and alignment of content elements, etc., in the markup language document appears on the browser.
- markup language documents are typically displayed as continuous running documents without any page breaks. These continuous running documents are not print-friendly.
- a typical markup language document can accommodate a large amount of content, whereas a standard print ready page has, for example, 8.5" x 11" dimensions with margins that reduce the space available for accommodation of a large amount of content during a print operation.
- the content has to be broken at two levels, that is, a horizontal level or page width and a vertical level or page height.
- the page width relates to a line break, and the page height relates to a page break.
- Content rendering on a browser can have loose lines, and spaces are often distributed in ways that make a page appear to have rivers of blanks flowing through the page.
- Line breaks rendered by the browser can be discerned as belonging to four different types, namely, word space breaks (wsbr), soft hyphen breaks (wshbr), hard breaks (wbr), and paragraph breaks (wsp).
- Word space breaks are discerned by finding which spaces are quashed to a zero width. The word space breaks are then interpreted as an end of a line or a line break.
- soft hyphens: if a line breaks at a soft hyphen, then the soft hyphen attains a non-zero width, which is also interpreted as the end of the line or as a line break.
- a hard line break can be discerned when an offset decrease is encountered. Therefore, any markup language content that falls outside a printing area needs to be resized and repositioned accordingly for an optimal print output without losing any data when a print operation is performed.
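The four break types described above can be sketched as a small classifier over measured token boxes. This is a minimal illustration under stated assumptions, not the method claimed in the patent: the `Token` record, its field names, and the premise that the browser has already reported each token's rendered width and left offset are all hypothetical. How a paragraph break is detected is not spelled out in the text, so the sketch simply assumes a flag marking the last token of a block.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Token:
    """Hypothetical record of one rendered token (not from the source)."""
    kind: str                 # "word", "space", or "shy" (soft hyphen)
    width: float              # rendered width, assumed measured by the browser
    left: float               # rendered left offset of the token
    ends_block: bool = False  # assumed flag: token closes a paragraph/block


def classify_breaks(tokens: List[Token]) -> List[Optional[str]]:
    """Return, for each token, the break type that follows it, or None."""
    result: List[Optional[str]] = []
    for i, tok in enumerate(tokens):
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        if tok.ends_block:
            result.append("wsp")    # paragraph break
        elif tok.kind == "space" and tok.width == 0:
            result.append("wsbr")   # word space quashed to zero width
        elif tok.kind == "shy" and tok.width > 0:
            result.append("wshbr")  # soft hyphen became visible
        elif nxt is not None and nxt.left < tok.left:
            result.append("wbr")    # offset decreased: hard break
        else:
            result.append(None)
    return result
```

The order of the checks matters: a collapsed space or a visible soft hyphen is also followed by an offset decrease, so those cases are tested before the hard-break rule.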
- One method for printing continuous running pages involves introducing page breaks based on a vertical height equal to a page of printing media upon which the content is to be printed.
- the problem with relying on page breaks introduced purely by vertical height is that text lines and other content are cut partway through at the page boundary and are printed in that disrupted state.
- markup language documents for example, hypertext markup language (HTML) documents
- word spaces and line breaks are not explicitly tagged.
- the word spaces and the line breaks remain anonymous, for example, as generic word spaces and line breaks, and hence are difficult to read and understand for printing accurately.
- handheld devices for example, smartphones, tablets, etc.
- the non-print-friendly documents, page numbering issues, and other page layout problems still exist in fluid pages.
- Markup language documents are typically interactive and dynamic in nature, whereas the print is essentially static in nature.
- hypertext markup language (HTML) documents contain free flowing or reflowing content. Images, paragraphs, videos and other similar content are arranged in an HTML document as tags.
- HTML documents are adaptable to different devices. That is, if an HTML document is viewed in a web browser, then the HTML document adapts to the web browser and displays content of the HTML document as per the specifications of the web browser. If this HTML document is viewed on a mobile browser of a mobile device, then the HTML document adapts to the specifications of the mobile browser.
- the HTML content is not suitable to print. Since the HTML content is not fixed, a printer would interpret specific elements of the HTML content inaccurately and therefore print the HTML content inaccurately.
- the output page loses the original text anchoring position of the anchored floats in the markup language document.
- a float such as a table is broken across more than one page. The table is rotated for fitting in one page. On rotating, the table is disassembled and difficult to restore to its original layout in the output page.
- Other page elements for example, a line break, a page break, footnotes, etc., introduce artificial rigidity to an original fluid flow of text stream in the output page.
- the actions performed on the markup language document of the page output need to be tracked.
- Tracking results in increased programming complexity and reduced efficiency for rendering the markup language document in a reversible file format.
- Increased programming complexity results in a need for increased processing power and memory. Therefore, there is a need for reducing some levels of reversibility of the output page without losing the ability to generate a paginated output page, and hence there is a need for rendering the markup language document in a reversible file format, or a partially reversible file format, or a non-reversible file format based on a selected level of reversibility, in runtime.
- Markup language content and associated content elements are interpreted and defined using markup language tags on any standard web browser.
- the tags included in a markup language document are typically executed on a server or on a web browser on a user's client device. Scripts or tags that run directly on a web browser have less latency time compared to a server side execution of tags.
- Pagination of a hypertext markup language (HTML) document involves partitioning content of the HTML document and presenting the partitioned content on individual pages.
- Conventional solutions include pagination of HTML documents based either on cut-off markers or the number of items to be displayed per page. These solutions are typically implemented using server side technologies.
- US Patent No. 7,647,553 B2 provides a hypertext markup language view template that allows a hypertext markup language content document to flow into a series of containers. This is performed by identifying the layout of the hypertext markup language document by using view templates.
- a hypertext markup language authorship is provided that takes a bottomless continuous running hypertext markup language page and positions the content in a series of predefined containers within the display media. The content is flowed into the predefined containers. This method does not handle the positioning of footnotes on the same page where respective footnote citations reside, which makes it difficult for a user to refer to citations.
- This method also does not place floats proximate to their corresponding citations, which makes it difficult for the user to access floats corresponding to the citations. Furthermore, this method does not address header and footer conversion issues.
- US Patent No. 6,789,229 B1 addresses issues with pagination that involves more processor intensive tasks. This method uses pagination techniques that involve determining reproducible pages followed by numbering individual pages based on hard breaks. This method requires a predetermined list of hard breaks occurring in the document being processed, which requires a lot of processing time to display page numbers. Therefore, there is a need for a faster and more efficient technique to process page numbers.
- a publication by Hewlett-Packard Laboratories titled "Automatic Pagination of HTML Documents in a Web Browser" discloses automatic pagination of hypertext markup language (HTML) documents on the client side.
- the methods disclosed in this publication utilize a built-in library of JavaScript functions in a browser and size attributes to format an HTML page.
- the paginations are performed through extensible stylesheet language transformation (XSLT).
- XSLT extensible stylesheet language transformation
- These pagination techniques render page numbers in tabs which occupy more space if the number of pages is large.
- These methods do not handle page numbers when a print operation is initiated.
- these methods do not position floats and footnotes on the same page where their respective citations reside.
- These methods transform a regular HTML page into individual pages with paginated tabs, but do not efficiently handle a journal or a novel style HTML page which translates to hundreds or even thousands of individual pages.
- Conventional file formats for example, the portable document format (PDF) of Adobe
- the portable document format is based on a fixed layout and does not support a fluid layout. Page numbers in the portable document format are forced and not based on the content.
- the ePub file format is designed with reflowable content, which can optimize text and graphics according to a display device.
- the ePub file format does not support header and footer at a conversion stage, places floats at random locations, and does not proxy floats, for example, videos and long tables to a linked source, thereby hindering the user experience.
- a computer implemented method and a file format transformation system deployed on a client device, or in an embodiment, on a server, that transform marked-up content in a first file format, for example, a hypertext markup language (HTML) format to a second file format that enables automated browser based pagination and that can be stored offline, executed with less latency and improved performance speed, and can be restored to a continuous page.
- a computer implemented method and a file format transformation system that render the marked-up content in the second file format on demand and ahead of demand based on a selected level of reversibility to reduce
- a first file format for example, a hypertext markup language (HTML) format
- HTML hypertext markup language
- the method and the FFTS deployed on a client device, and in an embodiment on a server render the marked-up content in the second file format on demand and ahead of demand based on a selected level of reversibility to reduce programming complexity and increase rendering efficiency, without losing the ability to generate a paginated output page.
- the second file format is therefore a reversible file format, or a partially reversible file format, or a non-reversible file format.
- the reversible file format allows the marked-up content to be reversed to the first file format to restore the continuous page.
- the method and the file format transformation system (FFTS) disclosed herein implement document tagging of all content including spaces and line breaks to transform fluid pages to fixed pages that are print-friendly and provide a fixed page view that captures document elements, for example, line breaks, floats, footnotes or end notes, page numbers, headers and footers, captions, etc., which are expressed relationally and assigned page appropriate placement.
- the client side implementation of the method and the file format transformation system (FFTS) disclosed herein allows a user of a document to be presented with an alternate presentation of the document without additional communication costs between a server and the user's client device.
- the client side and server side implementation of the method and the FFTS disclosed herein enables automated browser based pagination of markup language documents, for example, hypertext markup language (HTML) documents based on the dimensions of a web browser's window and the rendered size of components.
- the reversible file format allows a user to view the page-broken document as a continuous document on a browser. The user can switch between the two views.
- the partially reversible format or the non-reversible format reduces the programming complexity involved in generating a paginated output.
- the computer implemented method and the FFTS disclosed herein position floats and footnotes on the same page where their respective citations reside, support headers and footers at a conversion stage, place floats at appropriate locations, and proxy floats, for example, videos and long tables to a linked source, thereby enhancing user experience.
- the computer implemented method disclosed herein is minimalistic in terms of document object model (DOM) manipulation and performs minimum manipulation to create pages.
- DOM document object model
- the computer implemented method disclosed herein employs the file format transformation system (FFTS)
- FFTS file format transformation system
- the FFTS receives the marked-up content of the first file format.
- the FFTS reflows the received marked-up content of the first file format into a continuous page having a configurable page width.
- the FFTS identifies spaces and block elements in the reflown marked-up content of the first file format.
- the FFTS generates and appends tags to the identified spaces and the identified block elements in the reflown marked-up content of the first file format.
- the FFTS determines line breaks in the reflown marked-up content of the first file format based on preconfigured criteria associated with the appended tags and tags the determined line breaks.
- the file format transformation system identifies anchored floats, for example, figures, tables, images, videos, etc., in the reflown marked-up content of the first file format and tags the identified anchored floats.
- the FFTS positions the tagged anchored floats on a current page based on availability of space for the tagged anchored floats on the current page.
- the FFTS identifies footnotes in the reflown marked-up content of the first file format and tags the identified footnotes.
- the FFTS positions the tagged footnotes at a footnote section on the current page based on availability of space for the tagged footnotes on the current page.
- the FFTS groups the marked-up content with the positioned anchored floats and the positioned footnotes on each page.
- the FFTS inserts one or more of multiple pagination elements, for example, page numbers, a header, a footer, etc., on each page containing the grouped marked-up content.
- the FFTS renders the grouped marked-up content with the inserted pagination elements in the second file format based on the selected level of reversibility.
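The positioning and grouping steps above can be sketched as a greedy page-filling loop. This is only an illustration under assumptions, not the FFTS algorithm itself: the `Line` and `Page` records, the use of plain heights, and the rule that a float which does not fit is deferred to the top of the next page are all hypothetical. The sketch also assumes that any single float plus one line fits on an empty page.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Line:
    """Hypothetical measured line; heights are in arbitrary units."""
    height: float
    float_height: float = 0.0     # float anchored to this line (0 = none)
    footnote_height: float = 0.0  # footnote cited on this line (0 = none)


@dataclass
class Page:
    number: int                   # page number inserted as a pagination element
    lines: List[int] = field(default_factory=list)
    floats: List[int] = field(default_factory=list)
    footnotes: List[int] = field(default_factory=list)
    used: float = 0.0


def paginate(lines: List[Line], page_height: float) -> List[Page]:
    """Group line indices, floats, and footnotes into pages."""
    pages = [Page(number=1)]
    deferred: List[int] = []      # floats waiting for a page with room

    def start_page() -> Page:
        page = Page(number=len(pages) + 1)
        pages.append(page)
        for idx in deferred:      # deferred floats go on the new page first
            page.floats.append(idx)
            page.used += lines[idx].float_height
        deferred.clear()
        return page

    for i, line in enumerate(lines):
        page = pages[-1]
        # A footnote must share the page with the line that cites it.
        needed = line.height + line.footnote_height
        if page.lines and page.used + needed > page_height:
            page = start_page()
        page.lines.append(i)
        page.used += line.height
        if line.footnote_height:
            page.footnotes.append(i)
            page.used += line.footnote_height
        if line.float_height:
            if page.used + line.float_height <= page_height:
                page.floats.append(i)   # space available on current page
                page.used += line.float_height
            else:
                deferred.append(i)      # assumed policy: defer to next page
    if deferred:
        start_page()
    return pages
```

Keeping the line and its footnote in one space check is what keeps a citation and its footnote on the same page in this sketch; the float policy is deliberately simple and would need refinement for floats taller than a page.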
- the data-ph5 attribute described above pertains to hypertext markup language5 (HTML5).
- HTML5 hypertext markup language5
- the "class" attribute can be used instead of the data-ph5 attribute.
- Class attribute expressions in legacy HTML impose certain limitations to reversibility compared to the data-ph5 attribute in HTML5.
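As a rough illustration of tagging word spaces with either attribute, the sketch below wraps each run of spaces in a `span`. It assumes plain text input, and the attribute value `"ws"` and the function name `tag_spaces` are hypothetical; the actual attribute values used by the second file format are not given in this section.

```python
import re
from html import escape


def tag_spaces(text: str, html5: bool = True) -> str:
    """Wrap every run of spaces in a tagged span.

    Uses data-ph5 for HTML5 output and falls back to the class
    attribute for legacy HTML. The value "ws" is a placeholder.
    """
    attr = "data-ph5" if html5 else "class"
    parts = re.split(r"( +)", text)  # the capture group keeps the space runs
    out = []
    for part in parts:
        if part.startswith(" "):
            out.append(f'<span {attr}="ws">{part}</span>')
        else:
            out.append(escape(part))  # escape markup characters in words
    return "".join(out)
```

A dedicated `data-ph5` attribute can be stripped later without touching any class names the author already used, which is one plausible reason the class-based fallback is less reversible.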
- the file format transformation system tracks positions of the identified anchored floats and the identified footnotes in the reflown marked-up content of the first file format, and positions of the page breaks in the continuous page prior to grouping the marked-up content and inserting the pagination elements on each page for rendering the grouped marked-up content with the inserted pagination elements in the second file format based on the selected level of reversibility.
- the FFTS tracks positions of the inserted pagination elements for rendering the grouped marked-up content with the inserted pagination elements in the second file format based on the selected level of reversibility.
- related systems comprise circuitry and/or programming for effecting the methods disclosed herein.
- the circuitry and/or programming can be any combination of hardware, software, and/or firmware configured to effect the methods disclosed herein depending upon the design choices of a system designer. Also, various structural elements can be employed depending on the design choices of the system designer.
- FIGS. 1A-1B illustrate a computer implemented method for transforming marked-up content in a first file format to a second file format that enables automated browser based pagination.
- FIG. 2 exemplarily illustrates an interpretation of marked-up content in a second file format.
- FIGS. 3A-3F exemplarily illustrate a flowchart comprising the steps performed by a file format transformation system for transforming marked-up content in a first file format to a reversible second file format.
- FIGS. 4A-4B exemplarily illustrate screenshots showing edit views of marked-up content.
- FIG. 4C exemplarily illustrates a screenshot showing a proof view of the marked-up content rendered in a reversible file format.
- FIG. 4D exemplarily illustrates a screenshot showing a partial source code of the marked-up content rendered in a reversible file format.
- FIG. 5 exemplarily illustrates a flow diagram showing a process flow implemented by an embodiment of the file format transformation system deployed on a server for transforming marked-up content in a first file format to a second file format based on a selected level of reversibility.
- FIGS. 6A-6B exemplarily illustrate a flowchart comprising the steps performed by the file format transformation system for transforming marked-up content in a first file format to a second file format based on a selected level of reversibility.
- FIG. 7A exemplarily illustrates a system comprising the file format transformation system deployed on a client device for transforming marked-up content in a first file format to a second file format that enables automated browser based pagination.
- FIG. 7B exemplarily illustrates an embodiment of the system comprising the file format transformation system deployed on a server for transforming marked-up content in a first file format to a second file format that enables automated browser based pagination.
- FIGS. 8A-8Q exemplarily illustrate screenshots showing transformation of marked-up content in a first file format to a reversible file format in edit views and proof views.
- FIGS. 9A-9F exemplarily illustrate screenshots showing transformation of marked-up content in a first file format to a second file format based on a selected level of reversibility in edit views and proof views.
- the computer implemented method and the file format transformation system (FFTS) deployed on a client device as disclosed in the co-pending non-provisional patent application with application number 3348/CHE/2015 titled "Transformation Of Marked-up Content To A Reversible File Format For Automated Browser Based Pagination", filed in the Indian Patent Office on 1 July 2015 and incorporated herein by reference in its entirety, transform marked-up content in a first file format to a reversible second file format.
- the FFTS disclosed herein is not limited to be deployed on a client device.
- the FFTS is also deployable on a server to render the marked-up content ahead of demand to the client device.
- the computer implemented method and the file format transformation system (FFTS) deployed on the client device or the server transform the marked-up content in a first file format to a second file format based on a selected level of reversibility.
- the FFTS allows rendering of the marked-up content in the second file format in different levels of reversibility.
- "different levels of reversibility" refer to extents to which the transformed marked-up content in the second file format can be reversed to an original layout, that is, to the first file format to restore a continuous page.
- the different levels of reversibility of the second file format are completely reversible, or partially reversible, or non-reversible.
- a user selects a desired level of reversibility of the second file format and the FFTS transforms the marked-up content from a first file format to the second file format based on the selected level of reversibility.
- the co-pending non-provisional patent application with application number 3348/CHE/2015 discloses transformation of the marked-up content from the first file format to a reversible second file format
- the present patent of addition application discloses transformation of the marked-up content from the first file format to a reversible second file format, or a partially reversible second file format, or a non-reversible second file format based on the selected level of reversibility, on demand or ahead of demand by the FFTS deployed on the client device or the server.
- marked-up content refers to content having markups or appended tags that indicate the type of content, for example, a header, a footer, a caption, a table, a figure, an image, a video, a line break, etc.
- line break refers to a pagination element representing the end of a line of text.
- the second file format disclosed herein is named, for example, as "PH5" that represents pagination with hypertext markup language 5 (HTML5) and comprises a set of properties including tags that are generated in accordance with structural semantics of documents in the first file format, for example, hypertext markup language (HTML) documents, and recognizes scripts that shape the PH5 output.
- HTML5 hypertext markup language
- the scripts that shape the PH5 output vary.
- the client device is a computing device, for example, a personal computer, a tablet computing device, a mobile computer, a mobile phone, a smart phone, etc.
- the FFTS converts web content seamlessly using document tagging.
- the FFTS receives 101 marked-up content of a first file format, for example, a hypertext markup language (HTML) format or an extensible hypertext markup language format (XHTML).
- XHTML extensible hypertext markup language format
- the marked-up content of the first file format is processed, transformed, and executed by an algorithm in the FFTS for rendering the marked-up content in the second file format based on a selected level of reversibility.
- the user selects the desired level of reversibility of the second file format by declaring a token corresponding to the desired level of reversibility.
- the FFTS receives document contents, for example, in the HTML format.
- the first file format is an extensible markup language (XML) format.
- the FFTS converts a document from the XML format to an HTML format and then transforms the marked-up content in the HTML format to the second file format with the selected level of reversibility, for example, as a reversible file format, or a partially reversible file format, or a non-reversible file format.
- reversible file format refers to a file format that can be back transformed into the first file format.
- partially reversible file format refers to a file format where a few aspects of the marked-up content can be back transformed into the first file format.
- non-reversible file format refers to a file format with rigidity that does not allow back transforming of the marked-up content to the first file format.
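The three levels defined above can be illustrated with a toy renderer. Everything concrete here is an assumption made for the example: the token strings, the `data-ph5="page"` wrapper, the `data-n` attribute, and the choice to model the partially reversible level as "page wrappers without page numbers" are hypothetical and not taken from the patent. Page bodies are assumed to contain no nested `div` elements.

```python
import re
from enum import Enum
from typing import List


class Reversibility(Enum):
    REVERSIBLE = "reversible"
    PARTIAL = "partial"
    NONE = "non-reversible"


def parse_token(token: str) -> Reversibility:
    """Map a user-declared token (hypothetical spelling) to a level."""
    return Reversibility(token.strip().lower())


def render(pages: List[str], level: Reversibility) -> str:
    if level is Reversibility.NONE:
        # No tracking information is kept; page boundaries become bare rules.
        return "<hr>".join(pages)
    if level is Reversibility.PARTIAL:
        # Page boundaries are tracked, page numbers are not.
        return "".join(f'<div data-ph5="page">{p}</div>' for p in pages)
    return "".join(
        f'<div data-ph5="page" data-n="{n}">{p}</div>'
        for n, p in enumerate(pages, 1)
    )


def restore(html: str) -> str:
    """Strip tracked page wrappers to recover the continuous page."""
    pattern = r'<div data-ph5="page"(?: data-n="\d+")?>(.*?)</div>'
    return re.sub(pattern, r"\1", html, flags=re.S)
```

In this sketch the non-reversible output cannot be restored reliably, because an inserted `<hr>` is indistinguishable from one that was part of the original content; that is the trade-off the untracked level accepts in exchange for less bookkeeping.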
- a browser that loads the marked-up content of the first file format inserts code points, for example, soft hyphens in the marked-up content of the first file format based on dictionary elements, for example, dictionary syllables such as - im-por-tant, con-se-quence, ap-pear-ance, etc.
- soft hyphens refer to code points reserved in coded character sets that mark permissible break positions within words; a visible hyphen is shown only where a word is actually broken across lines.
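A minimal sketch of dictionary-based soft hyphen insertion, assuming a tiny syllable dictionary built from the example words above. The dictionary contents and the function name are illustrative assumptions; the character U+00AD is the standard soft hyphen code point.

```python
import re

SOFT_HYPHEN = "\u00ad"

# Hypothetical syllable dictionary, using the example words from the text.
SYLLABLES = {
    "important": ["im", "por", "tant"],
    "consequence": ["con", "se", "quence"],
    "appearance": ["ap", "pear", "ance"],
}


def insert_soft_hyphens(text: str) -> str:
    """Insert U+00AD between known syllables; unknown words are left as-is."""

    def hyphenate(match):
        word = match.group(0)
        parts = SYLLABLES.get(word.lower())
        if parts is None:
            return word
        # Re-slice the original word so its capitalization is preserved.
        pieces, start = [], 0
        for part in parts:
            pieces.append(word[start:start + len(part)])
            start += len(part)
        return SOFT_HYPHEN.join(pieces)

    return re.sub(r"[A-Za-z]+", hyphenate, text)
```

Because the inserted code points are invisible until a line breaks at them, removing every U+00AD from the result gives back the input text.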
- the browser is a headless browser implemented as a server side application, for example, a command line server application.
- headless browser refers to a web browser without a graphical user interface.
- the headless browser is a piece of software that accesses web pages without a display.
- the headless browser provides automated control of webpages and, in an embodiment, provides the content of web pages to other programs.
- the headless browser is executed via a command-line interface or using a network communication. Examples of headless browsers comprise PhantomJS with WebKit® of Apple Inc., or Selenium® WebDriver of Software Freedom Conservancy, Inc., as a Firefox® extension of Mozilla Foundation.
- the FFTS produces pages of marked-up content on demand in the second file format with improved performance by executing browser-based pagination scripts on the client side, that is, on the client device. In an embodiment, when the pages of marked-up content need to be rendered ahead of demand, the FFTS implemented on the server side, that is, on the server, runs the same browser-based pagination scripts using a headless browser. For example, the FFTS renders a fixed page to which no alterations were made over time ahead of time, for speedy delivery.
- the FFTS also maintains archival copies in a fixed layout for facilitating a restore of a paginated hypertext markup language (HTML) document to the fixed page using the archival copies.
- the marked-up content received by the file format transformation system is transformed as disclosed in the following method steps 102-115.
- the FFTS reflows 102 the received marked-up content of the first file format into a continuous page having a configurable page width.
- the term "reflow" refers to a browser process of recalculating positions of hypertext markup language (HTML) elements in the HTML content and re-rendering the HTML elements with new positions.
- a generic computer using a generic program cannot reflow the received marked-up content of the first file format into a continuous page having a configurable page width in accordance with the method steps disclosed above.
- the FFTS identifies 103 spaces and block elements in the reflown marked-up content of the first file format.
- the FFTS identifies existing break elements, for example, hard breaks such as soft hyphen breaks, line breaks, and paragraph breaks in the reflown marked-up content of the first file format.
- the FFTS also identifies unanchored or uncited floats in the reflown marked-up content of the first file format.
- the block elements are content elements that create blocks or large groupings of content and generally begin new lines of text. The block elements expand to fill a parent container containing text, inline elements, etc., and can have margins and/or padding, fitting child elements.
- a <div> element is a block element in the HTML.
- the block elements, for example, <div>, <h1>-<h6>, and <p>, in a document start on a new line and take up the full width available.
- a generic computer using a generic program cannot identify spaces and block elements in the reflown marked-up content of the first file format in accordance with the method steps disclosed above.
- the file format transformation system (FFTS) generates and appends 104 tags to the identified spaces and the identified block elements in the reflown marked-up content of the first file format.
- the FFTS generates tags in accordance with structural semantics of the marked-up content, which then helps the scripts recognize the tags.
- word space refers to a single space between two words.
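The tagging of word spaces and block elements can be illustrated with a short sketch. It assumes a plain-text paragraph as input, wraps the paragraph in a `data-ph5="wsp"` container, and wraps each single word space in a `data-ph5="ws"` span; the `ws` attribute value and the function name are assumptions made for this example.

```python
import html


def tag_paragraph(text: str) -> str:
    """Wrap a paragraph and each of its word spaces in PH5-style tags.

    The words themselves are left untouched, so removing the added tags
    would give back the original paragraph text.
    """
    words = [html.escape(word) for word in text.split(" ")]
    word_space = '<span data-ph5="ws"> </span>'  # hypothetical tag for a word space
    return '<div data-ph5="wsp">' + word_space.join(words) + "</div>"
```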
- floats and footnotes have prior representation in an input document of the first file format, for example, the HTML format and need no specific tagging.
- floats refers, for example, to images, videos, audio content, tables, figures, etc., that float unhinged from the main content flow, except in their relationship to their citations as available in the input document.
- Image floats have, for example, <img> tags.
- Table floats can be recognized by the presence of various tag elements, for example, <td>, <tr>, etc.
- footnotes refers to content that is intended to be placed at the bottom of a page and used to cite references to content on the page. Footnotes are in a number series and are shown as superscript <sup> numbers that are assigned to specific locations in the main content flow, and these superscripts reference notes appended to the main content, for example, at the bottom in a continuous page.
- a generic computer using a generic program cannot generate and append tags to the identified spaces and the identified block elements in the reflown marked-up content of the first file format in accordance with the method steps disclosed above.
- the file format transformation system determines 106 line breaks in the reflown marked-up content of the first file format based on preconfigured criteria associated with the appended tags as disclosed in the detailed description of FIG. 3C, and tags the determined line breaks.
- the line breaks retain integrity of the second file format by hyphenating and adjusting spaces in the marked-up content rendered in the second file format.
- the FFTS identifies the line breaks through JavaScript®, a trademark originally registered by Sun Microsystems, Inc.
- the file format transformation system identifies 108 anchored floats in the reflown marked-up content of the first file format and tags the identified anchored floats.
- the FFTS positions 109 the tagged anchored floats on a current page based on availability of space for the tagged anchored floats on the current page.
- the FFTS positions the tagged anchored floats proximal to associated float citations on the current page based on availability of space for the tagged anchored floats on the current page.
- the FFTS identifies 110 footnotes in the reflown marked-up content of the first file format and tags the identified footnotes.
- the FFTS places the footnotes initially as "line notes" immediately below the cited line, works out the available space after flowing the main text, and then reflows the footnotes to the bottom of the same page.
- page break refers to a marker that indicates that content which follows the marker is part of a new page.
- the FFTS groups 113 the marked-up content with the positioned anchored floats and the positioned footnotes on each page.
- the FFTS inserts 114 one or more of multiple pagination elements, for example, page numbers, a header, a footer, a footnote ruler, fillers, etc., on each page containing the grouped marked-up content.
- the FFTS tags the line breaks, for example, as <span data-ph5="wsbr">.
- the FFTS represents the lines ending with hyphenations, for example, as <span data-ph5="wshbr">.
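A greedy line breaker gives a rough picture of how break candidates at word spaces and soft hyphens become tagged line breaks. This is a sketch under stated assumptions: character count stands in for rendered pixel width, the tag names `wsbr`, `wshbr`, and `wsp` are reused as plain labels, and the function names are hypothetical.

```python
SOFT_HYPHEN = "\u00ad"


def split_pieces(text):
    """Return (piece, separator_after) pairs; separator is ' ', U+00AD or ''."""
    pieces = []
    words = text.split(" ")
    for wi, word in enumerate(words):
        syllables = word.split(SOFT_HYPHEN)
        for si, syllable in enumerate(syllables):
            if si < len(syllables) - 1:
                sep = SOFT_HYPHEN
            elif wi < len(words) - 1:
                sep = " "
            else:
                sep = ""
            pieces.append((syllable, sep))
    return pieces


def break_lines(text, max_chars):
    """Greedily fill lines; return (line_text, break_label) pairs."""
    lines = []
    line, prev_sep = "", ""
    for piece, sep in split_pieces(text):
        joiner = " " if prev_sep == " " else ""
        # Keep room for the visible hyphen if the line may end after this piece.
        reserve = 1 if sep == SOFT_HYPHEN else 0
        candidate = line + joiner + piece
        if line and len(candidate) + reserve > max_chars:
            if prev_sep == SOFT_HYPHEN:
                lines.append((line + "-", "wshbr"))  # break at a soft hyphen
            else:
                lines.append((line, "wsbr"))  # break at a word space
            line = piece
        else:
            line = candidate
        prev_sep = sep
    lines.append((line, "wsp"))  # the paragraph end forces the last break
    return lines
```

A browser-based script would measure rendered widths instead of counting characters, but the decision at each candidate is the same: keep filling the line, or close it and label the break.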
- paragraph break refers to a pagination element representing the end of a paragraph.
- the paragraph break is a non-intrusive data model that preserves an original data model of the hypertext markup language (HTML).
- the FFTS tags the paragraph breaks, for example, as <div data-ph5="wsp">.
- the file format transformation system FFTS
- the FFTS initially positions the floats near their anchors and then moves the floats to the bottom or top of the current page, or to one of the following pages according to the availability of space similar to footnotes.
- the FFTS positions floats, for example, images, tables, text boxes, pull-outs, etc., in proximity to the anchors and ensures that grouped elements such as captions for the floats, if any, appear immediately before or after the floats, and that the captions are not widowed or orphaned.
- the FFTS handles the grouped elements comprising, for example, a float and a caption associated with the float in the second file format at a position assigned in the marked-up content of the first file format to the float.
- the file format transformation system declares uniform resource locator (URL) breaks to a paging engine.
- the FFTS couples expressions such as footnotes to page breaks.
- the page break breaks a web page into a predefined length and delivers cut pages, while ensuring that headings and words at the beginning and end of paragraphs are not widowed or orphaned.
- the FFTS introduces page breaks when a script cookie cuts the fluid page to a reference dimension.
- the FFTS initially positions footnotes next to the corresponding citations.
- the FFTS moves the footnotes to the footnote section of the page after introduction of the page breaks.
- the FFTS numbers the footnotes and positions the footnotes at the bottom of the relevant page.
- the file format transformation system inserts page numbers, a header, a footer, a footnote ruler, fillers, etc., or any combination thereof in one or more pages in the second file format.
- the FFTS inserts page numbers on the pages based on a predefined numbering style.
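As an illustration of a predefined numbering style, the sketch below formats a page label either as a decimal number or as a lower-case roman numeral. The two style names are assumptions borrowed from common typesetting practice; the description does not list the styles the FFTS supports.

```python
def to_roman(number: int) -> str:
    """Convert 1..3999 to a lower-case roman numeral."""
    if not 0 < number < 4000:
        raise ValueError("roman numerals are defined here for 1..3999")
    numerals = [
        (1000, "m"), (900, "cm"), (500, "d"), (400, "cd"),
        (100, "c"), (90, "xc"), (50, "l"), (40, "xl"),
        (10, "x"), (9, "ix"), (5, "v"), (4, "iv"), (1, "i"),
    ]
    out = []
    for value, symbol in numerals:
        count, number = divmod(number, value)
        out.append(symbol * count)
    return "".join(out)


def page_label(page_index: int, style: str = "decimal") -> str:
    """Return the label for a zero-based page index in the chosen style."""
    number = page_index + 1
    if style == "decimal":
        return str(number)
    if style == "lower-roman":
        return to_roman(number)
    raise ValueError(f"unsupported numbering style: {style!r}")
```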
- the FFTS inserts the footnote ruler, for example, as a horizontal line to separate running text and the footnotes.
- the FFTS tags the footnote ruler, for example, as <div data-ph5="footNoteRuler">.
- the FFTS allows the footnote ruler to be toggled on and off in the cut pages.
- the FFTS uses filler compensation for eliminating orphans, widows, and divorce between couples, for example, a section heading and a paragraph, a figure and a table, a table heading and a table, etc.
- the FFTS automatically deploys fillers, for example, line spaces, if needed, to fill a page to increase aesthetics.
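Filler compensation can be sketched as a keep-with-next rule: a heading is only placed on a page if the block that follows it also fits, and any space left behind is recorded as a filler. The block kinds, the tuple layout, and the function name are assumptions for this example, and only the heading-and-paragraph couple is modelled.

```python
def paginate_blocks(blocks, page_height):
    """Place (kind, height) blocks on pages, keeping headings with their partner.

    Returns a list of pages; each page is a list of (kind, height) tuples,
    with a ("filler", height) entry where space was deliberately left empty.
    """
    pages, current, used = [], [], 0
    for index, (kind, height) in enumerate(blocks):
        needed = height
        if kind == "heading" and index + 1 < len(blocks):
            needed += blocks[index + 1][1]  # the following block must fit too
        if current and used + needed > page_height:
            if page_height - used > 0:
                current.append(("filler", page_height - used))
            pages.append(current)
            current, used = [], 0
        current.append((kind, height))
        used += height
    pages.append(current)
    return pages
```

With a 500 pixel page, a 300 pixel paragraph followed by a 50 pixel heading and a 200 pixel paragraph yields a first page closed with a 200 pixel filler, so that the heading opens the second page together with its paragraph.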
- a generic computer using a generic program cannot determine and tag the line breaks in the reflown marked-up content of the first file format; identify, tag, and position the anchored floats on the current page; identify, tag, and position the footnotes at the footnote section on the current page; position the page breaks in the continuous page; group the marked-up content with the positioned anchored floats and the positioned footnotes on each page; and insert the pagination elements on each page containing the grouped marked-up content in accordance with the method steps disclosed above.
- the file format transformation system renders 115 the grouped marked-up content with the inserted pagination elements in the second file format based on the selected level of reversibility, for example, as a reversible file format, or a partially reversible file format, or a non-reversible file format.
- Different levels of reversibility of the second file format reduce programming complexity and increase rendering efficiency of the FFTS to render the marked-up content.
- the FFTS renders the grouped marked-up content in the second file format by sacrificing some aspects of reversibility, without losing the ability to generate a paginated output.
- a user of the FFTS indicates to the FFTS whether to restore the grouped marked-up content in the second file format to a continuous page to retain the full richness of the marked-up content or not.
- the user provides this indication to the browser of the FFTS using tokens.
- the FFTS uses the tokens to determine whether to retain the marked-up content in the second file format or reverse the marked-up content in the second file format to the first file format to restore the continuous page to the earlier state of the marked-up content before the transformation of the marked-up content by the FFTS. If the token corresponds to a reversible file format, the FFTS back transforms the marked-up content of the second file format to the first file format.
- the FFTS reverses the marked-up content with document elements, for example, the identified anchored floats, the identified footnotes, the inserted pagination elements, etc., to original positions in the first file format based on the level of reversibility desired.
- a token indicates if some parts of the marked-up content need to be filtered out for security reasons in some reader use case, such as in legal documents, or when required by a publisher of the continuous page.
- the FFTS retains the marked-up content with the document elements, for example, the identified anchored floats, the identified footnotes, the inserted pagination elements, etc., in the second file format.
- the FFTS performs hyphenation and justification of the rendered marked-up content in the second file format to provide kerning based on aesthetics, for example, for avoidance of loose lines and blank rivers.
- the FFTS achieves lossless reversibility of the marked-up content from the reversible file format to the first file format.
- lossless reversibility refers to reversibility where the marked-up content is completely back transformed from the reversible file format to the first file format, that is, to the continuous page.
- the partially reversible file format allows the marked-up content to be partially reversed to the first file format to partially restore the continuous page.
- the rendered marked-up content in the second file format is accessible on multiple browsers on multiple operating systems.
- the fixed page in the second file format to which the marked-up content in the first file format is transformed is expressed, for example, as a pixel dimension equivalent of a paper size or a device size.
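The pixel dimension equivalent of a paper size follows from the CSS definition of a pixel as 1/96 of an inch. The sketch below converts millimetres to pixels on that basis; the function name and the rounding to whole pixels are assumptions.

```python
CSS_PX_PER_INCH = 96  # CSS defines 1px as 1/96 of an inch
MM_PER_INCH = 25.4


def paper_to_pixels(width_mm, height_mm, px_per_inch=CSS_PX_PER_INCH):
    """Return the (width, height) of a paper size in whole pixels."""

    def to_px(mm):
        return round(mm / MM_PER_INCH * px_per_inch)

    return to_px(width_mm), to_px(height_mm)


# A4 paper is 210 mm x 297 mm.
a4_width_px, a4_height_px = paper_to_pixels(210, 297)  # (794, 1123)
```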
- the data model of the second file format for example, referred to as the PH5 format transforms a fluid page, for example, in a hypertext markup language (HTML) format to a fixed page, for example, in the reversible file format or the PH5 format, where the
- the file format transformation system interprets a fluid page and delivers a fixed page.
- the tagged input allows the transformation of the fluid page to the fixed page.
- the enriched inheritance comprises page breaks.
- the other elements are defined in terms of the page breaks.
- the extension of the fixed page in the PH5 format is, for example, .PH5.
- the FFTS bridges fluid web content and fixed page typesetting, originating as a fluid HTML, without a reference printer at the destination.
- the PH5 format is similar, for example, to a zip file format such as an electronic publication (ePub) format and can be opened in a common browser on any operating system in a fixed page view.
- ePub electronic publication
- a PH5 file can be back-transformed into a standard HTML file from which the PH5 file was generated with the fluidity of the HTML file restored.
- a generic computer using a generic program cannot render the grouped marked-up content with the inserted pagination elements in the second file format based on the selected level of reversibility in accordance with the method steps disclosed above.
- the file format transformation system performs document intelligence tagging. Tagging the spaces or blanks effects visible content for emulation and standardization.
- line break candidates are identified and marked up as page breaks. With this method, implicit statements in the document are understood and tagged for downstream machine reading or paging.
- the transformation from a fluid file format to the second file format, for example, the PH5 format is accomplished subject to the availability of a tag set that exposes an understanding of document semantics to scripts that generate the PH5 package. Creation of the tag set allows creation of a fixed page view that captures document elements that are expressed relationally and that are then assigned page-and-context-appropriate placement and styling.
- a PH5 file as a portable document anticipates the tag set in a work queue and defines a standard for creating the same.
- the PH5 files do not need reference printers, driver installations, configuration of printer settings, etc., and also do not need a reader application or a browser plug-in. Furthermore, the PH5 files allow offline storage of information.
- the file format transformation system tracks positions of the identified anchored floats and the identified footnotes in the reflown marked-up content of the first file format, and positions of the page breaks in the continuous page prior to grouping the marked-up content and inserting the pagination elements on each page for rendering the grouped marked-up content with the inserted pagination elements in the second file format based on the selected level of reversibility.
- the FFTS tracks original positions of floats and footnotes in the marked-up content in the first file format, that is, the continuous page, before moving the floats and the footnotes to new positions in the marked-up content of the second file format.
- the FFTS further tracks positions of the inserted pagination elements for rendering the grouped marked-up content with the inserted pagination elements in the second file format based on the selected level of reversibility.
- the FFTS also tracks pagination elements, for example, a header, a footer, a page-number-folio, page breaks, borders, etc., that were not present in the first file format, that is, the continuous page. Tracking positions of the floats, the footnotes, the pagination elements, etc., allows the FFTS to reverse the marked-up content from the second file format to the first file format to restore the continuous page completely. In an embodiment, the FFTS degrades some levels of reversibility of the second file format to the continuous page.
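Position tracking is what makes the reversal possible. A minimal sketch, assuming the continuous page is modelled as a list of dictionaries with a `kind` field: elements that are moved out of the main flow are stored with their original index, and restoring them in ascending index order rebuilds the original sequence. The data layout and function names are hypothetical.

```python
def extract(elements, kind):
    """Take elements of `kind` out of the flow, remembering where they were."""
    flow, moved = [], []
    for index, element in enumerate(elements):
        if element["kind"] == kind:
            moved.append((index, element))
        else:
            flow.append(element)
    return flow, moved


def restore(flow, moved):
    """Put moved elements back at their tracked positions."""
    restored = list(flow)
    # Ascending order guarantees every earlier element is already in place.
    for index, element in sorted(moved, key=lambda pair: pair[0]):
        restored.insert(index, element)
    return restored
```

Pagination elements that never existed in the continuous page, such as headers or page numbers, would simply be dropped on reversal rather than restored.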
- the focus of the computer implemented method and the file format transformation system (FFTS) disclosed herein is on an improvement to automated browser based pagination, and not on tasks for which a generic computer is used in its ordinary capacity. Accordingly, the method and the FFTS disclosed herein are not directed to an abstract idea. Rather, the method and the FFTS disclosed herein are directed to a specific improvement to the way the processor in the client device or the server deploying the FFTS operate, embodied in, for example, rendering the grouped marked-up content with the inserted pagination elements in the second file format on demand or ahead of demand based on a selected level of reversibility.
- the design and the flow of data and interactions between a web browser on the client device or the headless browser on the server and the file format transformation system (FFTS) are deliberate, designed, and directed.
- the FFTS processes the marked-up content of the first file format to steer the FFTS towards a finite set of predictable outcomes.
- the FFTS implements one or more specific computer programs to transform the marked-up content in the first file format to the second file format on demand or ahead of demand based on the selected level of reversibility.
- the interactions between the web browser on the client device or the headless browser on the server and the FFTS allow the FFTS to receive the marked-up content of the first file format.
- from this marked-up content, the FFTS, through the use of other, separate and autonomous computer programs, transforms the marked-up content from the first file format to the second file format. This transformation requires twelve or more separate computer programs and subprograms, the execution of which cannot be performed by a person using a generic computer with a generic program.
- the steps performed by the FFTS disclosed above are tangible, provide useful results, and are not abstract.
- the combination of software and hardware implementation of the FFTS on the client device or the server is an improvement in computer related technology.
- the computer implemented method disclosed herein improves the functionality of the computer, that is, the client device or the server, and provides an improvement in computer related technology as follows: While pagination was typically performed outside the browser through external components residing external to the browser at the operating system level, and directed towards a device such as a printer, the method and the file format transformation system (FFTS) disclosed herein achieve such a pagination within the context of the browser generated hypertext markup language (HTML) rendering, while also making the HTML document compatible with external pagination devices such as print drivers of printers. Thus, the FFTS avoids the problem of incompatibility issues with external print drivers in rendering a page.
- Pagination within the context of the browser generated HTML rendering helps publishers to create print ready deliverables directly through a cloud without external applications comprising, for example, existing desktop publishing (DTP) software.
- DTP desktop publishing
- the method and the FFTS disclosed herein are diverse in converting paginated outputs for rendering document fragments within handheld devices, or within devices with larger screen widths and form factors such as in projection display devices. The FFTS therefore tweaks pagination to achieve a desired end use of rendering on handheld devices or larger screens.
- FIG. 2 exemplarily illustrates an interpretation of marked-up content in a second file format, for example, the PH5 format.
- a typical hypertext markup language (HTML) page does not have tags specified for spaces.
- the HTML page comprises a header, a footer, footnotes, floats such as figures, tables, images, video, audio, etc.
- the file format transformation system (FFTS) loads a hypertext markup language (HTML) page with associated cascading style sheets and transforms the HTML page to a PH5 page 201 of the PH5 format as exemplarily illustrated in FIG. 2.
- the FFTS identifies word spaces and block elements in the HTML page and appends the identified word spaces and the identified block elements with appropriate PH5 format tags.
- the FFTS performs tagging without replacing the original HTML tags, thereby preserving the original HTML tags to allow the final output second file format to be reverted back into the HTML page, if a user wants to suppress the changes and revert back to the HTML page.
- the tagged HTML page that is, the PH5 page 201 exemplarily illustrated in FIG. 2, contains all the content of the original input HTML page along with the PH5 format tags.
- the tagging process allows the FFTS to transform a fluid HTML page into fixed HTML pages.
- a fluid HTML page contains responsive content elements that resize their position and geometry according to a web browser width.
- the FFTS places an additional class tag after the first footnote "firstFootnote" and tags the following footnotes with an additional tag
- the FFTS renders the PH5 page 201 with the PH5 tags disclosed above.
- the file format transformation system (FFTS) performs PH5 tag recognition for automated browser based pagination and generates output pages 202.
- the FFTS recognizes the PH5 format tags appended in the PH5 tagged hypertext markup language (HTML) page 201.
- the PH5 tagged HTML page 201 comprises two figures labeled as "FIG 1" and "FIG 2" along with another float.
- the FFTS encounters the float tag of the first float, that is, FIG 1, in the PH5 tagged HTML page 201 and positions "FIG 1" proximal to the corresponding citation until the FFTS encounters a page break tag.
- the FFTS positions the float "FIG 1" at the top or bottom of the page close to the respective citation.
- the FFTS then allows the reflow of the HTML content to fit in the specified page width.
- the FFTS upon recognizing the footnote tag, introduces footnote matter at the bottom of the page in close proximity to the respective citation.
- the FFTS introduces a footnote ruler to separate the main content from the footnote matter upon recognition of the footnote tag.
- the FFTS further encounters a page number tag and introduces a page number at the bottom of the page after the footnote matter.
- the FFTS then breaks the page into an individual page after encountering the page break tag which is placed based on the reference page height.
- the file format transformation system then proceeds to the next section after the page break tag, proxies "FIG 2" and other floats, for example, audio, video, tables, etc., to a linked source, positions these floats according to the availability of space, positions page breaks according to a pixel dimension of the page, and inserts a page number for the current page.
- the FFTS then proceeds to the next section after the page break tag, positions the remaining footnotes on the next page, and inserts a page number for the next page.
- the FFTS performs the page transformation process until the last page break tag is recognized.
- FIGS. 3A-3F exemplarily illustrate a flowchart comprising the steps performed by the file format transformation system (FFTS) for transforming marked-up content in a first file format, for example, a hypertext markup language (HTML) format to a reversible second file format, hereafter referred to as a "reversible file format".
- as exemplarily illustrated in FIG. 3A, the FFTS loads 301 HTML content with cascading style sheets (CSS) in a browser and examines 302 the loaded HTML content.
- the FFTS analyzes and describes syntactic roles of the HTML content.
- the FFTS introduces 303 hidden code points, for example, hidden soft hyphens into the HTML content based on popular dictionary elements, for example, dictionary syllables.
- the FFTS then reflows 304 the HTML content to fit a desired page width with a running continuous page height.
- the reflow process is used in a markup language document to render the markup language document to different types of user devices.
- the FFTS performs word spacing according to a kerning of a selected font.
- the FFTS also identifies 307 block elements in the reflown HTML content and introduces 308 a tag, for example, a <div class="WSP"> tag for each of the identified block elements in the reflown HTML content, where "WSP" refers to word space paragraph.
- <div> refers to a markup language tag that defines a container for holding content elements.
- the file format transformation system iteratively processes the generated tags and identifies, for each of the identified word spaces and the identified block elements, one or more pagination elements in the reflown hypertext markup language (HTML) content.
- the FFTS identifies pagination elements such as line breaks, floats, and footnotes as exemplarily illustrated in FIGS. 3C-3E respectively.
- FIG. 3B exemplarily illustrates iteration steps performed by the file format
- the FFTS iterates 309 the steps of determining and assignment of line breaks for every occurrence of the <WS> tag and the <WSP> tag as exemplarily illustrated in FIG. 3C, until all the <WS> tags and the <WSP> tags are processed 310.
- the FFTS after processing all the <WS> and <WSP> tags, iterates 311 all the line breaks and then proceeds to the steps exemplarily illustrated in FIGS. 3D-3E.
- FIG. 3C exemplarily illustrates determination and assignment of line breaks at appropriate positions in the reversible file format.
- the file format transformation system determines and assigns line breaks upon encountering any one of the following conditions: If the word space <WS> equals zero, then the FFTS identifies 312 the word space as a line break; if a soft hyphen <SHY> is not equal to zero, then the FFTS identifies 313 the soft hyphen as a line break; and if the FFTS identifies a paragraph break, the FFTS forces 314 a line break. After assigning the line breaks, the FFTS iterates 311 all the line breaks as exemplarily illustrated in FIG. 3B.
- FIG. 3D exemplarily illustrates positioning of floats proximate to a first citation in the reflown hypertext markup language (HTML) content.
- the file format transformation system identifies 315 where the floats are cited in the reflown HTML content and checks 316 whether a current page can accommodate one or more floats. If the current page cannot accommodate one or more floats, the FFTS positions 317 one or more floats on the next available page proximate to their respective citation. If the current page can accommodate one or more floats, the FFTS inserts 318 one or more floats on the current page.
- the FFTS examines each line from top to bottom until the FFTS reaches the specified page height of 500 pixels, while keeping track of the pixels covered. If the FFTS encounters a float before the 500 pixel height, the FFTS analyzes the float pixel dimension and the pixels covered so far and determines the sum of the float pixel dimension and the pixels covered till the point where the float was cited.
- if the sum of the float pixel dimension and the pixels covered till the point where the float was cited exceeds the specified page height, the FFTS inserts the float on the next available page after the page break, in a way that the float follows the citation but does not precede the citation; if the sum is less than the specified page height, the float is inserted on the same page proximate to its citation.
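The float placement test reduces to one comparison against the page height. The sketch below uses the 500 pixel example height; the function name and return labels are assumptions, and a float that exactly fills the remaining space is treated as fitting, a case the description does not address.

```python
def place_float(pixels_covered, float_height, page_height=500):
    """Decide whether a float goes on the current page or the next one.

    pixels_covered: height used on the page up to the float's citation.
    """
    if pixels_covered + float_height <= page_height:  # equality: assumed to fit
        return "current"
    return "next"
```

For example, a 150 pixel figure cited after 300 pixels of content stays on the current page (450 does not exceed 500), while the same figure cited after 400 pixels moves to the next page.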
- the FFTS proceeds to the steps exemplarily illustrated in FIG. 3F.
- FIG. 3E exemplarily illustrates positioning of footnotes at relevant pages in the reversible file format.
- the file format transformation system identifies footnotes in the reflown hypertext markup language (HTML) content and checks 319 whether space is available for a footnote cited in the reflown HTML content in the current page. If space is not available in the current page, the FFTS positions 320 the footnote, that is, the citation point's sentence and matter, on the next page. If there is enough space available in the current page, the FFTS positions 321 the footnote matter on a page footnote section as the footnote is cited. For example, for a page with 500 pixels of fixed height, the FFTS examines each line from top to bottom until the FFTS reaches the specified page height of 500 pixels, keeping track of the pixels covered.
- the FFTS analyzes the corresponding footnote pixel dimensions and the pixels covered so far and determines the sum of the footnote pixel dimension and the pixels covered till the point where the footnote was cited. If the sum exceeds the specified page height, for example, 500 pixels, the FFTS accommodates the footnote along with its citation on the next available page after the page break, and if the sum of the footnote pixel dimension and the pixels covered till the point where the footnote was cited is less than the specified page height, then the footnote is accommodated proximate to its citation in the same page at the bottom.
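The footnote rule can be sketched as a small paginator that walks the lines from top to bottom while tracking the pixels covered. Each line carries its own height and the height of the footnote it cites (zero if none); when the line plus its footnote would exceed the page height, both move to the next page together. The input layout, the dictionary keys, and the function name are assumptions for this example.

```python
def paginate(lines, page_height=500):
    """Assign lines and their footnotes to pages.

    lines: list of (line_height, footnote_height) tuples in reading order.
    Returns a list of pages with the indexes of the lines and footnotes placed.
    """
    pages = [{"lines": [], "footnotes": [], "used": 0}]
    for index, (line_height, note_height) in enumerate(lines):
        page = pages[-1]
        needed = line_height + note_height
        # A citation and its footnote matter travel to the next page together.
        if page["lines"] and page["used"] + needed > page_height:
            page = {"lines": [], "footnotes": [], "used": 0}
            pages.append(page)
        page["lines"].append(index)
        if note_height:
            page["footnotes"].append(index)
        page["used"] += needed
    return pages
```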
- FIG. 3F exemplarily illustrates the rendering of the hypertext markup language (HTML) content in the reversible file format.
- the file format transformation system (FFTS) compares 322 the HTML content with a specified page height and introduces page breaks appropriately into the HTML content.
- the page breaks break the HTML content into individual pages of a predefined length.
- the FFTS groups 323 the HTML content on each page from header to footer using a <div> element.
- the FFTS inserts 324 page numbers into the individual pages based on a predefined numbering style, a header, a footer, and a footnote ruler wherever necessary.
- the FFTS checks 325 whether all the line breaks are processed. If all the line breaks are not processed, the FFTS iterates 311 all the line breaks as exemplarily illustrated in FIGS. 3B-3C. The FFTS then delivers 326 the marked-up content in the reversible file format. The FFTS provides an option to revert 327 the changes made in the reversible file format to the first file format, for example, the HTML format. If a user wants to revert from the reversible file format to the HTML format, the FFTS suppresses 328 the changes by hiding the changes in a background and displays the input HTML page having the input HTML content. Based on the user's token declaration disclosed in the detailed description of FIGS.
- if the FFTS transforms the input HTML page to a partially reversible file format, and the user wants to revert from the partially reversible file format to the HTML format, the FFTS suppresses 328 some aspects of reversibility of the second file format of the input HTML page.
- the FFTS reduces some levels of reversibility, while producing the same paginated output. Based on the user's token declaration not to revert to the HTML format, the FFTS transforms the input HTML page to a non-reversible format and the process ends.
- FIGS. 4A-4B exemplarily illustrate screenshots showing edit views of marked-up content.
- FIG. 4A exemplarily illustrates a screenshot of an input hypertext markup language (HTML) page containing marked-up content without an edit window 402 in a right pane of a graphical user interface (GUI) 401.
- FIG. 4B exemplarily illustrates a screenshot of the input HTML page containing marked-up content showing the edit window 402 in the right pane of the GUI 401.
- FIG. 4C exemplarily illustrates a screenshot showing a proof view of the marked-up content rendered in a reversible file format.
- the file format transformation system transforms the input hypertext markup language (HTML) page exemplarily illustrated in FIGS. 4A-4B, to an output page in the reversible file format, that is, the PH5 format as exemplarily illustrated in FIG. 4C.
- the FFTS hyphenates words where appropriate.
- the FFTS retains the original HTML tags and appends the PH5 format tags to the marked-up content.
- FIG. 4D exemplarily illustrates a screenshot showing a partial source code of the marked-up content rendered in the reversible file format, that is, the PH5 format.
- FIG. 5 exemplarily illustrates a flow diagram showing a process flow implemented by an embodiment of the file format transformation system (FFTS) deployed on a server for transforming marked-up content in a first file format to a second file format based on a selected level of reversibility.
- the FFTS receives a document in an extensible markup language (XML) format 501.
- the FFTS deployed on the server converts the document from the XML format 501 to an HTML format 502.
- the FFTS sets 503 a predefined condition for reversibility based on a selection of one of the different levels of reversibility, that is, complete or full reversibility, or partial reversibility, or non-reversibility indicated by a user, for example, via a graphical user interface (GUI) provided by the FFTS.
- the FFTS implemented on a headless browser 504 then transforms the document in the HTML format 502 to the second file format, for example, a reversible file format, or a partially reversible file format, or a non-reversible file format based on the set predefined condition.
- the FFTS captures 505 a document object model (DOM) of the HTML document and performs minimalistic manipulation to the DOM to generate a pre-processed HTML output document 506.
- the FFTS runs browser-based pagination scripts using the headless browser and renders the pre-processed HTML output document 506 having a level of reversibility selected by the user on a client device.
- the FFTS renders the marked-up content in the HTML document ahead of demand to the client device. For example, a typical HTML page rendering on the client device takes about 15 seconds. With the server side implementation using the headless browser, the time consumed to render the HTML page on the client device is reduced to about 2 seconds.
- FIGS. 6A-6B exemplarily illustrate a flowchart comprising the steps performed by the file format transformation system (FFTS) for transforming marked-up content in a first file format to a second file format based on a selected level of reversibility.
- a user selects 601 a level of reversibility for completely or partially reversing the marked-up content of the second file format to the first file format by declaring a token for a reversible file format or a partially or custom reversible file format.
- the FFTS, using the token, determines whether to render the marked-up content in the second file format such that the marked-up content in the second file format can be completely or partially restored to the first file format.
- the FFTS identifies 602 document elements or artifacts that have to be added for pagination of the marked-up content that is in the first file format.
- the document elements or artifacts for pagination are the pagination elements, for example, page borders, header and footer placeholders, etc.
- the FFTS determines 603 whether reversibility of the marked-up content from the second file format to the first file format is required from the token declared by the user. If the declared token is for reversibility of the second file format to the first file format, the FFTS tracks the positions of the pagination elements for rendering the marked-up content with the pagination elements in the second file format with a desired level of reversibility.
- the FFTS marks 604 the document elements or the artifacts that have been added to paginate the marked-up content and proceeds to step 605. If reversibility of the second file format is not required, that is, if reversibility of the pagination elements in the second file format to the first file format is not required, the FFTS proceeds to identify 605 floating artifacts, that is, anchored floats, for example, figures, tables, etc., that already exist in the original document, that is, in the marked-up content of the first file format but need to be moved during pagination of the grouped marked-up content of the first file format.
- the file format transformation system determines 606 whether reversibility of the floating artifacts in the marked-up content from the second file format to the first file format is required. If a reversible file format or a partially reversible file format is required, the FFTS inserts 607 a hidden tag to mark the original position and sequence of each floating artifact in the first file format. The hidden tag guides reversal of the marked-up content with the pagination elements from the second file format to the first file format based on the selected level of reversibility as defined by the declared token and then proceeds to step 608.
- the FFTS identifies 608 other floating artifacts, that is, footnotes, for example, citations and references, whose links will be repositioned in the marked-up content of the second file format.
- the file format transformation system determines 609 whether reversibility of the footnotes in the marked-up content from the second file format to the first file format is required. If reversibility of the footnotes in the marked-up content of the second file format is required, the FFTS inserts 610 a hidden tag to mark the original position of each floating artifact, that is, each footnote.
- the FFTS tracks positions of the identified footnotes in the marked-up content of the first file format to guide reversal of the marked-up content in the second file format to the first file format based on the selected level of reversibility and proceeds to step 611.
- the FFTS stores 611 the hidden tags that will be used in reversing the paginated output in the second file format to the original layout, that is, to the first file format, within the source of the processed and paginated output in the second file format. That is, the FFTS stores the hidden tags that will be used to reverse the paginated output in the second file format to the original layout of the continuous page within the client device.
- the FFTS determines 612 whether reversibility of the paginated output in the second file format to the first file format is required.
- the FFTS carries 613 out the reversal according to the selected level of reversibility using the inserted hidden tags and ends the process. If reversibility of the marked-up content from the second file format to the first file format is not required, the FFTS ends the process.
- a user selects the level of reversibility of the marked-up content in the second file format to be completely reversible to the first file format, for example, a hypertext markup language (HTML) format, that is, to be completely reversible to the continuous page with the original HTML layout.
- the file format transformation system determines to render the marked-up content in the selected reversible file format such that the marked-up content in the reversible file format can be completely restored to the HTML format.
- the FFTS identifies the pagination elements that have to be added for pagination of the marked-up content in the HTML format.
- the FFTS determines that the declared token is for the reversible file format and tracks the positions of the pagination elements for rendering the marked-up content with the pagination elements in the reversible file format.
- the FFTS further marks the document elements or the artifacts, for example, the page borders, the header and footer place holders, etc., that have been added to paginate the marked-up content that is in the HTML format.
- the FFTS then identifies floating artifacts, that is, anchored floats, for example, figures, tables, etc., that already exist in the marked-up content of the HTML format but need to be moved during pagination of the grouped marked-up content of the HTML format.
- the FFTS determines that reversibility of the floating artifacts in the marked-up content from the reversible file format to the HTML format is required from the declared token.
- the FFTS inserts a hidden tag to mark the original position and sequence of each floating artifact in the HTML format.
- the hidden tag guides reversal of the marked-up content in the reversible file format with the pagination elements to the HTML format as defined by the declared token.
- the file format transformation system proceeds to identify other floating artifacts, that is, footnotes, for example, citations and references, whose links will be repositioned in the marked-up content of the reversible file format.
- the FFTS determines that the reversibility of the footnotes in the marked-up content from the reversible file format to the HTML format is required.
- the FFTS inserts a hidden tag to mark the original position of each floating artifact, that is, each footnote.
- the FFTS tracks positions of the identified footnotes in the marked-up content of the HTML format to guide reversal of the marked-up content in the reversible file format to the HTML format.
- the FFTS stores the hidden tags that will be used in reversing the paginated output in the reversible file format to the original layout, that is, the HTML format, within the source of the processed and paginated output in the reversible file format. That is, the FFTS stores the hidden tags that will be used to reverse the paginated output in the reversible file format to the original layout of the continuous page within the client device.
- for reversal of the reversible file format to the HTML format, the FFTS carries out the reversal using the inserted hidden tags for the floating artifacts and the positions of the pagination elements.
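- the role of the hidden tags in a complete reversal can be illustrated with the following sketch, in which the marked-up content is modelled as a flat array of plain objects rather than a browser document object model. The node shapes (`para`, `float`, `marker`) and the function names `moveFloatWithMarker` and `restoreFloats` are assumptions made for the example and do not appear in the FFTS source.

```javascript
// Hypothetical model: content is an array of { type, id } nodes.
// Moving a float leaves a hidden marker at its original position.
function moveFloatWithMarker(nodes, floatId, targetIndex) {
  const from = nodes.findIndex((n) => n.type === "float" && n.id === floatId);
  if (from === -1) throw new Error("No float with id " + floatId);
  const result = nodes.slice();
  // Replace the float with a hidden marker that records where it came from.
  const [moved] = result.splice(from, 1, { type: "marker", hidden: true, forId: floatId });
  // targetIndex refers to the array after the marker has taken the float's slot.
  result.splice(targetIndex, 0, moved);
  return result;
}

// Reversal: each marker is replaced by its float; the moved copy is dropped.
function restoreFloats(nodes) {
  const floats = new Map(nodes.filter((n) => n.type === "float").map((n) => [n.id, n]));
  const markedIds = new Set(nodes.filter((n) => n.type === "marker").map((n) => n.forId));
  const result = [];
  for (const node of nodes) {
    if (node.type === "marker") {
      const original = floats.get(node.forId);
      if (original) result.push(original); // float returns to its marked position
    } else if (node.type === "float" && markedIds.has(node.id)) {
      continue; // skip the float at its paginated position
    } else {
      result.push(node);
    }
  }
  return result;
}
```

Floats that were never moved carry no marker and are left where they are, so the round trip restores the original sequence of the continuous page in this simplified model.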
- the marked-up content in the second file format comprises anchored floats in original positions, footnotes in original positions, and no pagination elements.
- the FFTS inserts pagination elements into the marked-up content of the HTML format. Due to pagination of the marked-up content in the HTML format, the FFTS moves the anchored floats to new positions in the second file format. Also, the FFTS repositions the footnotes in the second file format.
- the user selects a level of reversibility for partially reversing the marked-up content of the second file format to the first file format by declaring a token for a partially reversible file format.
- the FFTS, using the token, determines to render the marked-up content in the partially reversible file format such that the marked-up content in the partially reversible file format can be partially restored to the HTML format.
- partially restoring the marked-up content in the partially reversible file format to the HTML format comprises removing the inserted pagination elements and moving the footnotes to the original positions as in the HTML format, while retaining the anchored floats in their new positions in the HTML format.
- the file format transformation system identifies the pagination elements, for example, the page borders, the header and footer placeholders, etc., that have to be added for pagination of the marked-up content in the hypertext markup language (HTML) format.
- the FFTS determines that reversibility of the pagination elements from the partially reversible file format to the HTML format is required from the declared token and tracks the positions of the pagination elements for rendering the marked-up content with the pagination elements in the partially reversible file format.
- the FFTS also marks the pagination elements that have been added to paginate the marked-up content in the HTML format.
- the FFTS proceeds to identify the anchored floats, for example, figures, tables, etc., that already exist in the marked-up content of the HTML format but need to be moved during pagination of the grouped marked-up content of the HTML format.
- the file format transformation system determines whether reversibility of the anchored floats in the marked-up content of the partially reversible file format is required.
- the FFTS proceeds to identify footnotes, for example, citations and references, whose links will be repositioned in the marked-up content of the partially reversible file format.
- the FFTS does not insert tags to mark the original positions and sequence of the anchored floats in the HTML format.
- the file format transformation system (FFTS) determines that reversibility of the footnotes in the marked-up content from the partially reversible file format to the hypertext markup language (HTML) format is required and inserts a hidden tag to mark the original position of each footnote.
- the FFTS tracks positions of the identified footnotes in the marked-up content of the HTML format to guide reversal of the marked-up content in the partially reversible file format to the HTML format.
- the FFTS stores the hidden tags for the pagination elements and the footnotes that will be used in partially reversing the paginated output in the partially reversible file format to the original layout of the continuous page within the client device.
- the FFTS determines that partial reversibility of the paginated output in the partially reversible file format to the HTML format is required and carries out the partial reversal accordingly using the inserted hidden tags.
- the marked-up content in the HTML format comprises the footnotes in their original positions, the anchored floats in their new positions, and the inserted pagination elements removed.
- accordingly, the user declares a token for a non-reversible file format.
- the file format transformation system (FFTS), using the token, determines to render the marked-up content in the non-reversible file format such that the marked-up content is retained in the non-reversible file format.
- the FFTS identifies document elements or artifacts, that is, pagination elements, for example, the page borders, the header and footer place holders, etc., that have to be added for pagination of the marked-up content in the HTML format.
- the FFTS determines that reversibility of the pagination elements in the marked-up content of the non-reversible file format to the HTML format is not required and proceeds to identify floating artifacts, that is, anchored floats, for example, figures, tables, etc., that already exist in the marked-up content of the HTML format but need to be moved during pagination of the grouped marked-up content of the HTML format.
- the file format transformation system identifies other floating artifacts, that is, footnotes, for example, citations and references, whose links will be repositioned in the marked-up content of the HTML format.
- the FFTS does not insert or store hidden tags used for reversing the paginated output to the original layout, that is, the HTML format, within the source of the processed and paginated output in the non-reversible file format.
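- the three embodiments described above differ only in which artifacts are marked or tagged for later reversal. The sketch below summarises that mapping as a lookup from a declared token to a tracking plan. The token strings, the property names, and the function `trackingPlanFor` are hypothetical; the partially reversible entry follows the example embodiment above (pagination elements and footnotes reversible, anchored floats left in their new positions), and a custom token could select a different combination.

```javascript
// Hypothetical mapping from a declared token to what the FFTS tracks.
const TRACKING_PLANS = new Map([
  ["reversible", { markPaginationElements: true, tagFloats: true, tagFootnotes: true }],
  // Example partial embodiment: floats keep their new positions on reversal.
  ["partially-reversible", { markPaginationElements: true, tagFloats: false, tagFootnotes: true }],
  ["non-reversible", { markPaginationElements: false, tagFloats: false, tagFootnotes: false }],
]);

function trackingPlanFor(token) {
  const plan = TRACKING_PLANS.get(token);
  if (!plan) throw new Error("Unknown reversibility token: " + token);
  return { ...plan }; // copy so callers cannot alter the shared table
}
```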
- FIG. 7A exemplarily illustrates a system 700 comprising the file format transformation system (FFTS) 702 deployed on a client device 701 for transforming marked-up content in a first file format to a second file format that enables automated browser based pagination.
- the client device 701 can be, for example, a personal computer, a tablet computing device, a mobile computer, a mobile phone, a smart phone, a portable computing device, a laptop, a personal digital assistant, a touch centric device, a workstation, a portable electronic device, a network enabled computing device, an interactive network enabled communication device, any other suitable computing equipment, combinations of multiple pieces of computing equipment, etc.
- the client device 701 is a computer system that is programmable using a high level computer programming language.
- the FFTS 702 is implemented on the client device 701 using programmed and purposeful hardware. In an embodiment, the FFTS 702 is implemented as a standalone software application on the client device 701. In the client device 701, the FFTS 702 employs a browser as a client application. In an embodiment, the FFTS 702 is accessible to a user through a broad spectrum of technologies and devices, for example, cellular phones, tablet computing devices, etc., with access to the internet.
- the FFTS 702 comprises a content reception module 702a, a content reflow module 702b, a space and block identification module 702c, a tagging module 702d, a pagination element processing module 702e, a position tracking module 702f, and a compiler 702g.
- the client device 701 comprises a non-transitory computer readable storage medium such as a memory unit 703 for storing computer programs and data, a processor 704 communicatively coupled to the non-transitory computer readable storage medium, a display unit 705, a data bus 706, a network interface 707, an input/output (I/O) controller 708, input devices 709, a fixed media drive 710 such as a hard drive, a removable media drive 711 for receiving removable media, output devices 712, etc.
- non-transitory computer readable storage medium refers to all computer readable media, for example, non-volatile media, volatile media, and transmission media, except for a transitory, propagating signal.
- Non-volatile media comprise, for example, solid state drives, optical discs or magnetic disks, and other persistent memory.
- Volatile media comprise, for example, a register memory, a processor cache, a random access memory (RAM) such as a dynamic random access memory (DRAM), which typically constitutes a main memory, etc.
- Transmission media comprise, for example, coaxial cables, copper wire, fiber optic cables, modems, etc., including wires that constitute a system bus coupled to the processor 704.
- the non-transitory computer readable storage medium stores computer program instructions defined by modules, for example, 702a, 702b, 702c, 702d, 702e, 702f, 702g, etc., of the file format transformation system (FFTS) 702.
- the memory unit 703 is used for storing computer programs, applications, and data.
- the content reception module 702a, the content reflow module 702b, the space and block identification module 702c, the tagging module 702d, the pagination element processing module 702e, the position tracking module 702f, the compiler 702g, etc., of the file format transformation system (FFTS) 702 exemplarily illustrated in FIG. 7A are stored in the memory unit 703 of the client device 701.
- the memory unit 703 is, for example, a random access memory (RAM) or another type of dynamic storage device that stores information and instructions for execution by the processor 704.
- the memory unit 703 also stores temporary variables and other intermediate information used during execution of the instructions by the processor 704.
- the client device 701 further comprises a read only memory (ROM) or another type of static storage device that stores static information and instructions for the processor 704.
- the processor 704 executes the computer program instructions defined by the modules, for example, 702a, 702b, 702c, 702d, 702e, 702f, 702g, etc., of the file format transformation system (FFTS) 702.
- the processor 704 refers to any one or more microprocessors, central processing unit (CPU) devices, finite state machines, computers, microcontrollers, digital signal processors, logic, a logic device, an electronic circuit, an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a chip, etc., or any combination thereof, capable of executing computer programs or a series of commands, instructions, or state transitions.
- the processor 704 is implemented as a processor set comprising, for example, a programmed microprocessor and a math or graphics co-processor.
- the processor 704 is selected, for example, from the Intel ® processors such as the Itanium ® microprocessor or the Pentium ® processors, Advanced Micro Devices (AMD ® ) processors such as the Athlon ® processor, UltraSPARC processors, microSPARC processors, hp processors, International Business Machines (IBM ® ) processors such as the PowerPC ® microprocessor, the MIPS ® reduced instruction set computer (RISC) processor of MIPS Technologies, Inc., RISC based computer processors of ARM Holdings, Motorola ® processors, Qualcomm ® processors, etc.
- the FFTS 702 disclosed herein is not limited to employing a processor 704.
- the FFTS 702 employs a controller or a microcontroller.
- the processor 704 executes the modules, for example, 702a, 702b, 702c, 702d, 702e, 702f, 702g, etc., of the FFTS 702.
- the content reception module 702a of the file format transformation system (FFTS) 702 receives the marked-up content of the first file format, for example, the hypertext markup language (HTML) format.
- An example of a pseudocode of the content reception module 702a executed by the processor 704 of the client device 701 for receiving the marked-up content of the first file format is provided below:
    function receiveContent(self, container, source) {
      var innerContainer = null;
      paginator = null;
      var content = null;
      paginator = domHelper.create('div');
      innerContainer.innerHTML = source;
- the content reflow module 702b of the file format transformation system (FFTS) 702 reflows the received marked-up content of the first file format into a continuous page having a configurable page width.
- the space and block identification module 702c of the FFTS 702 identifies spaces and block elements in the reflown marked-up content of the first file format.
- An example of a pseudocode of the space and block identification module 702c executed by the processor 704 of the client device 701 for identifying and tagging spaces and block elements in the reflown marked-up hypertext markup language (HTML) content is provided below:
    function putSpanForWordSpace(self, content) {
      str = jQ(this).text()
- the tagging module 702d of the file format transformation system (FFTS) 702 generates and appends tags to the identified spaces and the identified block elements in the reflown marked-up content of the first file format. For each of the identified spaces and the identified block elements, the pagination element processing module 702e determines line breaks in the reflown marked-up content of the first file format based on preconfigured criteria associated with the appended tags. The tagging module 702d tags the determined line breaks.
- An example of the pseudocode of the pagination element processing module 702e executed by the processor 704 of the client device 701 for determining the line breaks is provided below: function determineLineBreaks() {
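- the preconfigured criteria used to determine line breaks are not reproduced in this passage. As one plausible illustration only, the sketch below assumes that every tagged space or word span carries its rendered top offset in pixels and that a line break is reported wherever that offset increases from one span to the next. The function name `findLineBreaks` and the `{ top }` shape are assumptions for the example.

```javascript
// Hypothetical sketch: report the index of every span that starts a new line.
// Each span is { top }, the rendered vertical offset in pixels (assumed shape).
function findLineBreaks(spans) {
  const breaks = [];
  for (let i = 1; i < spans.length; i++) {
    // A larger top offset than the previous span means the text has wrapped.
    if (spans[i].top > spans[i - 1].top) breaks.push(i);
  }
  return breaks;
}
```

In a browser, the top offset of each span could be read with the standard `getBoundingClientRect()` method of the element; the sketch takes the offsets as plain numbers so that it runs without a document.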
- for each of the determined line breaks, the pagination element processing module 702e identifies anchored floats in the reflown marked-up content of the first file format. The tagging module 702d tags the identified anchored floats. Further, for each of the determined line breaks, the pagination element processing module 702e positions the tagged anchored floats on a current page based on availability of space for the tagged anchored floats on the current page. The pagination element processing module 702e positions the tagged anchored floats proximal to associated float citations on the current page based on the availability of space for the tagged anchored floats on the current page.
- the pagination element processing module 702e of the file format transformation system (FFTS) 702 identifies footnotes in the reflown marked-up content of the first file format.
- the tagging module 702d tags the identified footnotes.
- the pagination element processing module 702e positions the tagged footnotes at a footnote section on the current page based on availability of space for the tagged footnotes on the current page.
- the pagination element processing module 702e positions the tagged footnotes proximal to associated footnote citations on the current page based on the availability of space for the tagged footnotes on the current page.
- footnoteHeight = getFootnoteHeight(footnoteItem);
- the pagination element processing module 702e positions page breaks in the continuous page based on a configurable page height and the determined line breaks for the positioning of the tagged anchored floats and the tagged footnotes on a subsequent page on nonavailability of space on the current page.
- the compiler 702g of the file format transformation system (FFTS) 702 groups the marked-up content with the positioned anchored floats and the positioned footnotes on each page.
- the pagination element processing module 702e inserts one or more pagination elements, for example, page numbers, a header, a footer, a footnote ruler, fillers, etc., on each page containing the grouped marked-up content.
- An example of the pseudocode of the compiler 702g executed by the processor 704 of the client device 701 for performing the steps of grouping and insertion of page numbers is provided below:
    function makePageBlocks() {
      wrapPageWithNumber("<div class='page" + i + "'>", i, startPage, endPage);
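- a self-contained sketch of the grouping and numbering step is given below, assuming that each page is already available as a string of HTML. It wraps every page in a <div> element whose class is `page` followed by the page index, as in the fragment above, and appends a folio formatted in a selectable numbering style. The function names `wrapPages` and `toLowerRoman`, the style names, and the use of a `page-number-folio` span for the folio are assumptions for the example.

```javascript
// Hypothetical folio formatter for a lower-case roman numbering style.
function toLowerRoman(n) {
  const table = [
    [1000, "m"], [900, "cm"], [500, "d"], [400, "cd"], [100, "c"], [90, "xc"],
    [50, "l"], [40, "xl"], [10, "x"], [9, "ix"], [5, "v"], [4, "iv"], [1, "i"],
  ];
  let out = "";
  for (const [value, numeral] of table) {
    while (n >= value) {
      out += numeral;
      n -= value;
    }
  }
  return out;
}

// Hypothetical sketch: wrap each page's HTML in a numbered <div> and add a folio.
function wrapPages(pages, style = "decimal") {
  return pages.map((html, index) => {
    const number = index + 1;
    const folio = style === "lower-roman" ? toLowerRoman(number) : String(number);
    return (
      "<div class='page" + number + "'>" + html +
      "<span class='page-number-folio'>" + folio + "</span></div>"
    );
  });
}
```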
- the compiler 702g renders the grouped marked-up content with the inserted pagination elements in the second file format based on a selected level of reversibility. That is, the second file format is a reversible file format, or a partially reversible file format, or a non-reversible file format.
- the reversible file format allows the marked-up content to be completely reversed to the first file format to restore the continuous page.
- the pagination element processing module 702e handles grouped elements comprising, for example, a float and a caption associated with the float in the second file format at a position assigned in the marked-up content of the first file format to the float.
- the compiler 702g completely or partially reverses the marked-up content in the second file format, for example, the reversible file format or the partially reversible file format to the first file format to completely or partially restore the continuous page.
- An example of the pseudocode of the compiler 702g executed by the processor 704 of the client device 701 for reversing the marked-up content in the PH5 format to the marked-up content in the original input hypertext markup language (HTML) format is provided below:
    function removePaginationArtifacts() {
      var headerFooter = content.find(".page-header-footer");
      headerFooter.remove()
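- the fragment above relies on jQuery and a live document. The same idea can be shown without a browser by modelling the content as a tree of plain objects, as in the following sketch. The node shape `{ tag, classes, children }` and the function name `removeByClass` are assumptions for the example; only the class name `page-header-footer` comes from the fragment above.

```javascript
// Hypothetical sketch: return a copy of the tree with every descendant that
// carries the given class removed. Nodes are { tag, classes, children }.
function removeByClass(node, className) {
  const children = (node.children || [])
    .filter((child) => !(child.classes || []).includes(className))
    .map((child) => removeByClass(child, className));
  // The input tree is left untouched; a new node is returned at each level.
  return { ...node, children };
}
```

Calling `removeByClass(tree, "page-header-footer")` mirrors the effect of `content.find(".page-header-footer").remove()` on this simplified model, except that it returns a new tree instead of changing the existing one.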
- the position tracking module 702f of the file format transformation system (FFTS) 702 tracks positions of the identified anchored floats and the identified footnotes in the reflown marked-up content of the first file format, and positions of the page breaks in the continuous page prior to the grouping of the marked-up content by the compiler 702g and the insertion of the pagination elements on each page by the pagination element processing module 702e for rendering the grouped marked-up content with the inserted pagination elements in the second file format based on the selected level of reversibility.
- the position tracking module 702f tracks positions of the inserted pagination elements for rendering the grouped marked-up content with the inserted pagination elements in the second file format based on the selected level of reversibility.
- An example of a pseudocode of the position tracking module 702f executed by the processor 704 of the client device 701 for tracking the pagination elements, for example, header, footer, page-number-folio, page-break, borders, etc., in the reflown marked-up content of the first file format is provided below: function tagPageElements (foos) {
- the display unit 705 of the client device 701, via the graphical user interface (GUI) 401, displays information such as the marked-up content, display interfaces, user interface elements such as text fields, etc., for allowing a user of the file format transformation system (FFTS) 702 to view an input page in the first file format and a processed, transformed and paginated output page in the second file format.
- the display unit 705 comprises, for example, a liquid crystal display, a plasma display, an organic light emitting diode (OLED) based display, etc.
- the data bus 706 permits communications between the modules, for example, 703, 704, 705, 707, 708, 709, 710, 711, 712, etc., of the client device 701.
- the network interface 707 enables connection of the client device 701 to a network, for example, a short range network or a long range network.
- the network interface 707 is provided as an interface card also referred to as a line card.
- the network interface 707 is, for example, one or more of an infrared (IR) interface, an interface implementing Wi-Fi ® of Wi-Fi Alliance Corporation, a universal serial bus (USB) interface, a FireWire ® interface of Apple Inc., an Ethernet interface, a frame relay interface, a cable interface, a digital subscriber line (DSL) interface, a token ring interface, a peripheral controller interconnect (PCI) interface, a local area network (LAN) interface, a wide area network (WAN) interface, interfaces using serial protocols, interfaces using parallel protocols, Ethernet communication interfaces, asynchronous transfer mode (ATM) interfaces, a high speed serial interface (HSSI), a fiber distributed data interface (FDDI), interfaces based on transmission control protocol (TCP)/inter
- the I/O controller 708 controls input actions and output actions performed by the FFTS 702.
- the input devices 709 are used for inputting data into the client device 701. Users of the client device 701 use the input devices 709 to provide inputs to the file format transformation system (FFTS) 702. For example, a user may enter a file format, declare a token to select a level of reversibility of the marked-up content from the second file format to the first file format, or edit an input page on the graphical user interface (GUI) 401 using the input devices 709.
- the input devices 709 are, for example, a keyboard such as an alphanumeric keyboard, a microphone, a joystick, a pointing device such as a computer mouse, a touch pad, a light pen, a physical button, a touch sensitive display device, a track ball, a pointing stick, any device capable of sensing a tactile input, etc.
- Computer applications and computer programs are used for operating the FFTS 702.
- the computer programs are loaded onto the fixed media drive 710 and into the memory unit 703 of the client device 701 via the removable media drive 711.
- the computer applications and computer programs are loaded directly via a network.
- Computer applications and computer programs are executed by double clicking a related icon displayed on the display unit 705 using one of the input devices 709.
- the output devices 712, for example, a printer, output the results of operations performed by the FFTS 702.
- the FFTS 702 renders the paginated output page in the second file format using the output devices 712.
- the processor 704 of the client device 701 executes an operating system selected, for example, from the Linux ® operating system, the Unix ® operating system, any version of the Microsoft ® Windows ® operating system, the Mac OS of Apple Inc., the IBM ® OS/2, VxWorks ® of Wind River Systems, Inc., QNX Neutrino ® developed by QNX Software Systems Ltd., Palm OS ® , the Solaris operating system developed by Sun Microsystems, Inc., the Android operating system, the Windows Phone ® operating system of Microsoft Corporation, the BlackBerry ® operating system of BlackBerry Limited, the iOS operating system of Apple Inc., the Symbian TM operating system of Symbian Foundation Limited, etc.
- the file format transformation system (FFTS) 702 employs the operating system for performing multiple tasks.
- the operating system is responsible for management and coordination of activities and sharing of resources of the client device 701.
- the operating system further manages security of the FFTS 702, peripheral devices connected to the client device 701, and network connections.
- the operating system employed on the client device 701 recognizes, for example, inputs provided by the users using one of the input devices 709, the output display, files, and directories stored locally on the fixed media drive 710.
- the operating system on the client device 701 executes different computer programs using the processor 704.
- the processor 704 and the operating system together define a computer system for which application programs in high level programming languages are written.
- the processor 704 of the client device 701 retrieves instructions defined by the content reception module 702a, the content reflow module 702b, the space and block identification module 702c, the tagging module 702d, the pagination element processing module 702e, the position tracking module 702f, and the compiler 702g, for performing respective functions disclosed above.
- the processor 704 retrieves instructions for executing the modules, for example, 702a, 702b, 702c, 702d, 702e, 702f, 702g, etc., of the FFTS 702 from the memory unit 703.
- a program counter determines the location of the instructions in the memory unit 703 of the client device 701.
- the program counter stores a number that identifies the current position in the computer program of each of the modules, for example, 702a, 702b, 702c, 702d, 702e, 702f,
- the instructions fetched by the processor 704 from the memory unit 703 after being processed are decoded.
- the instructions are stored in an instruction register in the processor 704.
- the processor 704 executes the instructions, thereby performing one or more processes defined by those instructions.
- the instructions stored in the instruction register are examined to determine the operations to be performed.
- the processor 704 then performs the specified operations.
- the operations comprise arithmetic operations and logic operations.
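The fetch, decode and execute steps described above can be illustrated with a toy interpreter. This is only a sketch: the instruction set (`LOAD`, `ADD`, `AND`, `HALT`) and the `run` function are hypothetical and are not part of the disclosure.

```python
# Toy fetch-decode-execute loop. The opcodes below are assumptions made
# purely for illustration of the cycle described in the text.

def run(program):
    """Execute a list of (opcode, operand) tuples and return the accumulator."""
    program_counter = 0   # identifies the current position in the program
    accumulator = 0
    while program_counter < len(program):
        instruction_register = program[program_counter]  # fetch
        program_counter += 1
        opcode, operand = instruction_register            # decode
        if opcode == "LOAD":                               # execute
            accumulator = operand
        elif opcode == "ADD":                              # arithmetic operation
            accumulator += operand
        elif opcode == "AND":                              # logic operation
            accumulator &= operand
        elif opcode == "HALT":
            break
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return accumulator


result = run([("LOAD", 6), ("ADD", 4), ("AND", 12), ("HALT", 0)])
```

Here the program counter plays the role of the number that identifies the current position in the program, and the tuple held in `instruction_register` is examined to decide which operation to perform.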
- the operating system performs multiple routines for performing a number of tasks required to assign the input devices 709, the output devices 712, and memory for execution of the modules, for example,
- the tasks performed by the operating system comprise, for example, assigning memory to the modules, for example, 702a, 702b, 702c, 702d, 702e, 702f, 702g, etc., of the FFTS 702, and to data used by the FFTS 702, moving data between the memory unit 703 and disk units, and handling input/output operations.
- the operating system performs the tasks on request by the operations and after performing the tasks, the operating system transfers the execution control back to the processor 704.
- the processor 704 continues the execution to obtain one or more outputs.
- the outputs of the execution of the modules, for example, 702a, 702b, 702c, 702d, 702e, 702f, 702g, etc., of the FFTS 702 are displayed to the user on the display unit 705.
- FIG. 7B exemplarily illustrates an embodiment of the system 700 comprising the file format transformation system (FFTS) 702 deployed on a server 723 for transforming marked-up content in a first file format to a second file format that enables automated browser based pagination.
- the FFTS 702 is deployed on the server 723 using programmed and purposeful hardware as exemplarily illustrated in FIG. 7B.
- the server 723 communicates with a client device 701 over a network 724.
- the server 723 is, for example, a personal computer, a tablet computing device, a mobile computer, a portable computing device, a laptop, a touch centric device, a workstation, a portable electronic device, a network enabled computing device, an interactive network enabled communication device, any other suitable computing equipment, combinations of multiple pieces of computing equipment, etc.
- the FFTS 702 is implemented as a standalone software application on the server 723.
- the FFTS 702 employs a headless browser as a command-line server application.
- the FFTS 702 is accessible to a user through a broad spectrum of technologies and devices, for example, cellular phones, tablet computing devices, etc., with access to the internet.
- the FFTS 702 is implemented in a cloud computing environment.
- cloud computing environment refers to a processing environment comprising configurable computing physical and logical resources, for example, networks, servers, storage media, applications, virtual machines, services, etc., and data distributed over the network 724.
- the cloud computing environment provides on-demand network access to a shared pool of the distributed computing physical and logical resources.
- the network 724 is, for example, one of the internet, an intranet, a wired network, a wireless network, a communication network that implements Bluetooth ® of Bluetooth Sig, Inc., a network that implements Wi-Fi ® of Wi-Fi Alliance Corporation, an ultra-wideband
- UWB ultra-wideband
- USB wireless universal serial bus
- GSM global system for mobile
- CDMA code division multiple access
- 3G third generation
- 4G fourth generation
- 5G fifth generation
- LTE long-term evolution
- the server 723 comprises a non-transitory computer readable storage medium such as a memory unit 713 for storing computer programs and data, a processor 714 communicatively coupled to the non-transitory computer readable storage medium, a display unit 715, a data bus 716, a network interface 717, an input/output (I/O) controller 718, input devices 719, a fixed media drive 720 such as a hard drive, a removable media drive 721 for receiving removable media, output devices 722, etc., similar to the memory unit 703, the processor 704, the display unit 705, the data bus 706, the network interface 707, the I/O controller 708, the input devices 709, the fixed media drive 710, the removable media drive 711, the output devices 712, etc., of the client device 701 respectively, disclosed in the detailed description of FIG.
- the structure and functions of the modules 713, 714, 715, 716, 717, 718, 719, 720, 721, and 722 of the server 723 are similar to the structure and functions of the corresponding modules 703, 704, 705, 706, 707, 708, 709, 710, 711, and 712 of the client device 701 respectively, disclosed in the detailed description of FIG. 7A.
- the processor 714 of the server 723 retrieves instructions defined by the content reception module 702a, the content reflow module 702b, the space and block identification module 702c, the tagging module 702d, the pagination element processing module 702e, the position tracking module 702f, and the compiler 702g for performing respective functions disclosed in the detailed description of FIG. 7A.
- the pseudocodes of the content reception module 702a, the content reflow module 702b, the space and block identification module 702c, the pagination element processing module 702e, the position tracking module 702f, and the compiler 702g disclosed in the detailed description of FIG. 7A are executed by the processor 714 of the server 723 for performing their respective functions.
- the scope of the computer implemented method and system 700 disclosed herein is not limited to the FFTS 702 being run locally on a single computer system via the operating system and the processor 704 or 714 exemplarily illustrated in FIGS. 7A-7B, but may be extended to run remotely over the network 724 by employing a web browser and a remote server, a mobile phone, or other electronic devices.
- one or more portions of the FFTS 702 are distributed across one or more computer systems (not shown) coupled to the network 724.
- the non-transitory computer readable storage medium of the client device 701 exemplarily illustrated in FIG. 7A, or of the server 723 exemplarily illustrated in FIG. 7B, disclosed herein stores computer program codes comprising instructions executable by at least one processor 704 or 714 respectively, for transforming marked-up content in a first file format to a second file format that enables automated browser based pagination, where the second file format is a reversible file format, or a partially reversible file format, or a non-reversible file format.
- the computer program codes comprise a first computer program code for receiving the marked-up content of the first file format; a second computer program code for reflowing the received marked-up content of the first file format into a continuous page having a configurable page width; a third computer program code for identifying spaces and block elements in the reflown marked-up content of the first file format; and a fourth computer program code for generating and appending tags to the identified spaces and the identified block elements in the reflown marked-up content of the first file format.
- the computer program codes further comprise a fifth computer program code for determining line breaks in the reflown marked-up content of the first file format based on preconfigured criteria associated with the appended tags; and a sixth computer program code for tagging the determined line breaks.
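The tagging of spaces and the determination of line breaks for a configurable page width can be sketched as follows. This is a minimal illustration, not the disclosed implementation: it handles the text of a single block, measures width in characters, and assumes a greedy criterion (break at the last tagged space before a word would overflow the line). The names `tag_tokens` and `find_line_breaks` are hypothetical.

```python
import re


def tag_tokens(text):
    """Split text into word and space tokens and tag each one with its kind."""
    tokens = []
    for match in re.finditer(r"\s+|\S+", text):
        kind = "space" if match.group().isspace() else "word"
        tokens.append({"kind": kind, "text": match.group(), "start": match.start()})
    return tokens


def find_line_breaks(tokens, page_width):
    """Return the indices of the space tokens at which a line break is taken.

    Assumed criterion: a run of whitespace counts as one character, and a
    break is placed at the space before the first word that would overflow.
    """
    breaks = []
    line_length = 0        # characters already placed on the current line
    pending_space = None   # index of the space token before the next word
    for index, token in enumerate(tokens):
        if token["kind"] == "space":
            pending_space = index
            continue
        word_length = len(token["text"])
        gap = 1 if line_length else 0
        if line_length and line_length + gap + word_length > page_width:
            breaks.append(pending_space)   # tag this space as a line break
            line_length = word_length
        else:
            line_length += gap + word_length
    return breaks


tokens = tag_tokens("the quick brown fox jumps")
line_breaks = find_line_breaks(tokens, page_width=10)
```

With a page width of 10 characters, the breaks fall at the spaces after "quick" and after "fox", so changing `page_width` changes where the content reflows.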
- the computer program codes further comprise a seventh computer program code for identifying anchored floats in the reflown marked-up content of the first file format; an eighth computer program code for tagging the identified anchored floats; a ninth computer program code for positioning the tagged anchored floats on a current page based on availability of space for the tagged anchored floats on the current page; a tenth computer program code for identifying footnotes in the reflown marked-up content of the first file format; an eleventh computer program code for tagging the identified footnotes; a twelfth computer program code for positioning the tagged footnotes at a footnote section on the current page based on availability of space for the tagged footnotes on the current page; a thirteenth computer program code for positioning page breaks in the continuous page based on a configurable page height and the determined line breaks for positioning the tagged anchored floats and the tagged footnotes on a subsequent page on non-availability of space on the
- the ninth computer program code positions the tagged anchored floats proximal to associated float citations on the current page based on the availability of space for the tagged anchored floats on the current page.
- the twelfth computer program code positions the tagged footnotes proximal to associated footnote citations on the current page based on the availability of space for the tagged footnotes on the current page.
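The space-based positioning of floats and page breaks described above can be sketched as a simple page-filling loop. The sketch makes several assumptions that the source does not state: every item has a known height, a float that does not fit is deferred to the top of the next page while the text keeps flowing, and footnotes are not modelled. The function name `paginate` and the item tuples are hypothetical.

```python
from collections import deque


def paginate(items, page_height):
    """Place ("line" | "float", name, height) items on pages of page_height.

    A line that does not fit starts a new page (a page break). A float that
    does not fit is deferred to the top of the following page.
    """
    pages, used, deferred = [[]], 0, deque()

    def new_page():
        nonlocal used
        pages.append([])
        used = 0
        # Deferred floats go first on the fresh page, for as long as they fit.
        while deferred and used + deferred[0][1] <= page_height:
            name, height = deferred.popleft()
            pages[-1].append(name)
            used += height

    for kind, name, height in items:
        if height > page_height:
            raise ValueError(f"{name} is taller than one page")
        if used + height <= page_height:
            pages[-1].append(name)          # space is available on this page
            used += height
        elif kind == "float":
            deferred.append((name, height))  # no space: move to a later page
        else:
            new_page()
            while used + height > page_height:
                new_page()
            pages[-1].append(name)
            used += height
    while deferred:                          # flush floats still waiting
        new_page()
    return pages


pages = paginate(
    [("line", "l1", 4), ("line", "l2", 4), ("float", "fig1", 5),
     ("line", "l3", 2), ("line", "l4", 3), ("line", "l5", 3)],
    page_height=10,
)
```

In this run the float `fig1` has no room on the first page, so it opens the second page, while the line `l3` still fills the remaining space on the first.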
- the computer program codes further comprise a sixteenth computer program code for rendering the grouped marked-up content with the inserted pagination elements in the second file format based on a selected level of reversibility.
- the computer program codes further comprise a seventeenth computer program code for tracking positions of the identified anchored floats and the identified footnotes in the reflown marked-up content of the first file format, and positions of the page breaks in the continuous page prior to the grouping of the marked-up content and the insertion of the pagination elements on each page for rendering the grouped marked-up content with the inserted pagination elements in the second file format based on the selected level of reversibility; and an eighteenth computer program code for tracking positions of the inserted pagination elements for rendering the grouped marked-up content with the inserted pagination elements in the second file format based on the selected level of reversibility.
- the computer program codes further comprise one or more additional computer program codes for performing additional steps that may be required and contemplated for transforming marked-up content in a first file format to a second file format that enables automated browser based pagination.
- a single piece of computer program code comprising computer executable instructions performs one or more steps of the computer implemented method disclosed herein for transforming marked-up content in a first file format to a second file format that enables automated browser based pagination.
- the computer program codes comprising computer executable instructions are embodied on the non-transitory computer readable storage medium.
- the processor 704 of the client device 701 exemplarily illustrated in FIG. 7A, or in an embodiment, the processor 714 of the server 723 exemplarily illustrated in FIG. 7B, retrieves these computer executable instructions and executes them.
- When the computer executable instructions are executed by the processor 704 or 714, the computer executable instructions cause the processor 704 or 714 to perform the steps of the computer implemented method for transforming marked-up content of a first file format to a second file format that enables automated browser based pagination.
- FIGS. 8A-8Q exemplarily illustrate screenshots showing transformation of marked-up content in a first file format to a reversible second file format in edit views and proof views.
- the file format transformation system (FFTS) 702 is configured as a software application on a client device 701 as exemplarily illustrated in FIG. 7A, or in an embodiment, on a server 723 as exemplarily illustrated in FIG. 7B.
- a user of the client device 701 may want to edit and review a technical document of, for example, a hypertext markup language (HTML) format that is viewed as a running continuous page.
- the user invokes the FFTS 702 on the client device 701 and loads the input HTML document into the FFTS 702.
- the FFTS 702 allows the user to view the input HTML document via a graphical user interface (GUI) 401 of the FFTS 702.
- FIG. 8A exemplarily illustrates a screenshot of an opening page of the loaded input HTML document without an edit window 402 in the right pane of the GUI 401.
- FIG. 8B exemplarily illustrates a screenshot of the opening page of the loaded input HTML document, showing the edit window 402 in the right pane of the GUI 401.
- the edit window 402 allows the user to edit the input HTML document or accept suggested changes made by other users to the input HTML document in an edit view as exemplarily illustrated in FIG. 8B.
- FIG. 8C exemplarily illustrates a screenshot of the output HTML page transformed by the FFTS 702 to a reversible file format, showing a header 801 and a footer 802 entered on the opening page in a proof view.
- the FFTS 702 positions the marked-up content in an appropriate location close to their respective citations in the proof view.
- the opening page in the reversible file format can be reversed to the first file format in the edit view.
- FIG. 8D exemplarily illustrates a screenshot without the edit window 402 in the right pane of the graphical user interface (GUI) 401, showing hyphenations 803 entered in a page of the input hypertext markup language (HTML) document.
- FIG. 8E exemplarily illustrates a screenshot with the edit window 402 in the right pane of the GUI 401, showing the hyphenations 803 entered in the input HTML page.
- the user can edit the input HTML page using the edit window 402 in the right pane of the GUI 401 as exemplarily illustrated in FIG. 8E.
- the edit window 402 allows the user to edit the input HTML page with the hyphenations 803.
- FIG. 8F exemplarily illustrates a screenshot of the output HTML page with the hyphenations 803 exemplarily illustrated in FIGS. 8D-8E, transformed by the file format transformation system (FFTS) 702 exemplarily illustrated in FIG. 7A or FIG. 7B, to a reversible file format, showing the header 801 entered in the output HTML page in a proof view.
- FIG. 8G exemplarily illustrates a screenshot of a page of the input hypertext markup language (HTML) document containing a float, for example, a figure 804, without the edit window 402 in the right pane of the graphical user interface (GUI) 401.
- FIG. 8H exemplarily illustrates a screenshot of the page of the input HTML document containing the figure 804, showing the edit window 402 in the right pane of the GUI 401.
- the edit window 402 allows the user to edit the input HTML page containing the figure 804.
- FIG. 8I exemplarily illustrates a screenshot of the output HTML page containing the figure 804 transformed by the file format transformation system (FFTS) 702 exemplarily illustrated in FIG. 7A or FIG.
- FIG. 8J exemplarily illustrates a screenshot of a page of the input hypertext markup language (HTML) document containing a float, for example, a table 806, without the edit window 402 in the right pane of the graphical user interface (GUI) 401.
- FIG. 8K exemplarily illustrates a screenshot of the page of the input HTML document containing the table 806, showing the edit window 402 in the right pane of the GUI 401.
- FIG. 8L exemplarily illustrates a screenshot of the output HTML page transformed by the file format transformation system (FFTS) 702 exemplarily illustrated in FIG. 7A or FIG. 7B, to a reversible file format, showing the header 801, the footer 802, and a page number 805 entered on the page in a proof view.
- the FFTS 702 positions the table 806 in an appropriate location close to a respective citation in the proof view.
- FIG. 8M exemplarily illustrates a screenshot of a page of the input hypertext markup language (HTML) document containing footnotes 807, without the edit window 402 in the right pane of the graphical user interface (GUI) 401.
- FIG. 8N exemplarily illustrates a screenshot of the page of the input HTML document containing the footnotes 807, showing the edit window 402 in the right pane of the GUI 401.
- the edit window 402 allows the user to edit the page containing the footnotes 807.
- FIG. 8O exemplarily illustrates a screenshot of the output HTML page transformed by the file format transformation system (FFTS) 702 exemplarily illustrated in FIG. 7A or FIG. 7B, to a reversible file format, showing the header 801, the footer 802, a page number 805 entered on the page, and the footnotes 807 positioned in the footnote section below a footnote ruler 808 in a proof view.
- FIG. 8P exemplarily illustrates a screenshot of an output hypertext markup language (HTML) page transformed by the file format transformation system (FFTS) 702 exemplarily illustrated in FIG. 7A or FIG. 7B, to a reversible file format, showing the header 801 and the footer 802 at the top and the bottom of the page respectively, in a proof view.
- the output HTML page also contains a page number 805 and a footnote 807 positioned in the footnote section below the footnote ruler 808 in the proof view.
- FIG. 8Q exemplarily illustrates a screenshot of output hypertext markup language (HTML) pages transformed by the file format transformation system (FFTS) 702 exemplarily illustrated in FIG. 7A or FIG. 7B, to a reversible file format, showing a page break 809 in a proof view.
- the FFTS 702 breaks the running continuous input HTML page into individual reversible file format pages containing a header 801 and a footer 802, and renders the output on the graphical user interface (GUI) 401.
- FIGS. 9A-9F exemplarily illustrate screenshots showing transformation of marked-up content in a first file format to a second file format based on a selected level of reversibility in edit views and proof views.
- FIG. 9A exemplarily illustrates a screenshot of the graphical user interface (GUI) 401 provided by the file format transformation system (FFTS) 702 exemplarily illustrated in FIG. 7A or FIG. 7B, showing marked-up content containing a float, for example, a figure 901 in the first file format prior to inserting pagination elements in the marked-up content in an edit view.
- the figure 901 has a citation 902 in the marked-up content.
- the FFTS 702 tracks the position of the figure 901 in the marked-up content of the first file format prior to inserting the pagination elements for rendering the marked-up content in the second file format.
- FIG. 9B exemplarily illustrates a screenshot of the GUI 401 showing the marked-up content, after pagination, with the float, that is, the figure 901 moved to a new position in a proof view.
- the FFTS 702 moves the figure 901 to the new position in the marked-up content of the second file format and positions the figure 901 on the same page where the citation 902 resides.
- the selected level of reversibility of the marked-up content in the second file format to the first file format is complete reversibility and therefore FIG. 9B exemplarily illustrates the marked-up content in the reversible file format. If the selected level of reversibility is non-reversibility of the marked-up content in the second file format to the first file format, the FFTS 702 does not track the position of the figure 901.
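The effect of tracking a float's original position can be sketched as follows. This is an illustrative model only: the marked-up content is reduced to a list of block identifiers, a single float is moved, and the names `move_float`, `restore_floats` and the `track` flag are hypothetical stand-ins for the selected level of reversibility.

```python
def move_float(blocks, float_id, new_index, track=True):
    """Move a float to new_index; optionally record where it came from."""
    original_index = blocks.index(float_id)
    moved = blocks[:original_index] + blocks[original_index + 1:]
    moved.insert(new_index, float_id)
    # With tracking disabled (non-reversible output) nothing is recorded.
    tracking = {float_id: original_index} if track else {}
    return moved, tracking


def restore_floats(blocks, tracking):
    """Put every tracked float back at its recorded original index."""
    restored = [block for block in blocks if block not in tracking]
    for float_id, original_index in sorted(tracking.items(), key=lambda kv: kv[1]):
        restored.insert(original_index, float_id)
    return restored


edit_view = ["p1", "fig901", "p2", "cite902", "p3"]
proof_view, tracking = move_float(edit_view, "fig901", new_index=3)
round_trip = restore_floats(proof_view, tracking)

untracked_view, no_tracking = move_float(edit_view, "fig901", new_index=3, track=False)
```

When the position is tracked, `round_trip` equals the edit view again. When it is not tracked, `restore_floats` has nothing to act on and the moved layout cannot be undone from the output alone.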
- FIG. 9C exemplarily illustrates a screenshot of the graphical user interface (GUI) 401 showing the marked-up content 903 in the first file format, before pagination, without a header and a footer in an edit view.
- FIG. 9D exemplarily illustrates a screenshot of the GUI 401 showing the marked-up content 903 after pagination with a header 904 and a footer 905 in a proof view.
- the file format transformation system (FFTS) 702 exemplarily illustrated in FIG. 7A or FIG. 7B, reflows the marked-up content 903 with the introduction of the header 904 and the footer 905.
- the FFTS 702 positions the marked-up content 903 in an appropriate position close to respective citations.
- the FFTS 702 tracks the positions of the inserted header 904 and the inserted footer 905 that were not present in the first file format exemplarily illustrated in FIG. 9C. Tracking the positions of the inserted header 904 and the inserted footer 905 allows the FFTS 702 to reverse the marked-up content 903 from the second file format to the first file format.
- the selected level of reversibility of the marked-up content 903 in the second file format to the first file format is complete reversibility and therefore the FFTS 702 tracks the positions of the inserted header 904 and the inserted footer 905 in the second file format such that the marked-up content 903 in the second file format can be completely restored to the first file format. If the selected level of reversibility is non-reversibility of the marked-up content 903 in the second file format to the first file format, the FFTS 702 does not track the positions of the inserted header 904 and the inserted footer 905.
- FIG. 9E exemplarily illustrates a screenshot of the graphical user interface (GUI) 401 showing the marked-up content 906 in the first file format, before pagination, without a page break in an edit view.
- FIG. 9F exemplarily illustrates a screenshot of the GUI 401 showing the marked-up content 906, after pagination, with a page break 907 in a proof view.
- the file format transformation system (FFTS) 702 exemplarily illustrated in FIG. 7A or FIG. 7B, tracks the position of the page break 907 that was not present in the first file format exemplarily illustrated in FIG. 9E. Tracking the position of the page break 907 allows the FFTS 702 to reverse the marked-up content 906 from the second file format to the first file format.
- the selected level of reversibility of the marked-up content 906 in the second file format to the first file format is complete reversibility and therefore the FFTS 702 tracks the position of the page break 907 in the second file format such that the marked-up content 906 in the second file format can be completely restored to the first file format. If the selected level of reversibility is non-reversibility of the marked-up content 906 in the second file format to the first file format, the FFTS 702 does not track the position of the page break 907.
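How tracking the inserted pagination elements enables reversal can be sketched as below. The sketch is an assumption-laden illustration: pages are plain lists of block names, only the reversible and non-reversible levels are modelled (the partially reversible level is left out), and `Reversibility`, `insert_pagination` and `reverse_pagination` are hypothetical names.

```python
from enum import Enum


class Reversibility(Enum):
    REVERSIBLE = "reversible"
    NON_REVERSIBLE = "non-reversible"


def insert_pagination(pages, header, footer, level):
    """Flatten pages into one list, inserting headers, footers and page breaks.

    Positions of inserted elements are tracked only for the reversible level.
    """
    output, tracked = [], set()

    def emit(element, inserted):
        if inserted and level is Reversibility.REVERSIBLE:
            tracked.add(len(output))   # position of an element absent from the input
        output.append(element)

    for number, page in enumerate(pages, start=1):
        emit(header, True)
        for block in page:
            emit(block, False)
        emit(footer, True)
        if number < len(pages):
            emit("<page-break>", True)
    return output, tracked


def reverse_pagination(output, tracked):
    """Drop every tracked element to recover the original continuous flow."""
    return [element for position, element in enumerate(output) if position not in tracked]


pages = [["p1", "p2"], ["p3"]]
paginated, tracked = insert_pagination(pages, "H", "F", Reversibility.REVERSIBLE)
restored = reverse_pagination(paginated, tracked)

flat, untracked = insert_pagination(pages, "H", "F", Reversibility.NON_REVERSIBLE)
```

With the reversible level, removing the tracked positions yields the original blocks. With the non-reversible level the tracked set is empty, so the headers, footers and page break stay in the output.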
- non-transitory computer readable storage media participate in providing data, for example, instructions that are read by a computer, a processor or a similar device.
- the "non-transitory computer readable storage media" also refer to a single medium or multiple media, for example, a centralized database, a distributed database, and/or associated caches and servers that store one or more sets of instructions that are read by a computer, a processor or a similar device.
- non-transitory computer readable storage media also refer to any medium capable of storing or encoding a set of instructions for execution by a computer, a processor or a similar device and that causes a computer, a processor or a similar device to perform any one or more of the methods disclosed herein.
- non-transitory computer readable storage media comprise, for example, a floppy disk, a flexible disk, a hard disk, magnetic tape, a laser disc, a Blu-ray Disc ® of the Blu-ray Disc Association, any magnetic medium, a compact disc-read only memory (CD-ROM), a digital versatile disc (DVD), any optical medium, a flash memory card, punch cards, paper tape, any other physical medium with patterns of holes, a random access memory (RAM), a programmable read only memory (PROM), an erasable programmable read only memory (EPROM), an electrically erasable programmable read only memory (EEPROM), a flash memory, any other memory chip or cartridge, or any other medium from which a computer can read.
- the computer programs that implement the methods and algorithms disclosed herein are stored and transmitted using a variety of media, for example, the computer readable media, in a number of manners.
- hard-wired circuitry or custom hardware is used in place of, or in combination with, software instructions for implementation of the processes of various embodiments. Therefore, the embodiments are not limited to any specific combination of hardware.
- the computer program codes comprising computer executable instructions can be implemented in any programming language that runs on an internet browser,
- the computer implemented method and the file format transformation system (FFTS) 702 disclosed herein can be configured to work in a network environment comprising one or more computers that are in communication with one or more devices via a network.
- the computers communicate with the devices directly or indirectly, via a wired medium or a wireless medium such as the Internet, a local area network (LAN), a wide area network (WAN) or the Ethernet, a token ring, or via any appropriate communications mediums or combination of communications mediums.
- Each of the devices comprises processors, examples of which are disclosed above, that are adapted to communicate with the computers.
- each of the computers is equipped with a network communication device, for example, a network interface card, a modem, or other network connection device suitable for connecting to a network.
- Each of the computers and the devices executes an operating system, examples of which are disclosed above. While the operating system may differ depending on the type of computer, the operating system provides the appropriate communications protocols to establish communication links with the network. Any number and type of machines may be in communication with the computers.
- The computer implemented method and the file format transformation system (FFTS) 702 disclosed herein are not limited to a particular computer system platform, processor, operating system, or network.
- One or more aspects of the computer implemented method and the FFTS 702 disclosed herein are distributed among one or more computer systems, for example, servers configured to provide one or more services to one or more client computers, or to perform a complete task in a distributed system.
- One or more aspects of the computer implemented method and the FFTS 702 disclosed herein are performed on a client-server system that comprises components distributed among one or more server systems that perform multiple functions according to various embodiments. These components comprise, for example, executable, intermediate, or interpreted code that communicates over a network using a communication protocol.
- The computer implemented method and the FFTS 702 disclosed herein are not limited to execution on any particular system or group of systems, and are not limited to any particular distributed architecture, network, or communication protocol.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Information Transfer Between Computers (AREA)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
IN201743011293 | 2017-03-30 | ||
PCT/IN2018/050156 WO2018179002A1 (en) | 2017-03-30 | 2018-03-20 | Transformation of marked-up content into a file format that enables automated browser based pagination |
Publications (2)
Publication Number | Publication Date |
---|---|
EP3602352A1 (en) | 2020-02-05 |
EP3602352A4 (en) | 2020-10-28 |
Family
ID=63677729
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP18776286.9A Ceased EP3602352A4 (en) | 2017-03-30 | 2018-03-20 | Transformation of marked-up content into a file format that enables automated browser based pagination |
Country Status (2)
Country | Link |
---|---|
EP (1) | EP3602352A4 (en) |
WO (1) | WO2018179002A1 (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111104556B (en) * | 2019-11-19 | 2023-09-15 | 泰康保险集团股份有限公司 | Service processing method and device |
CN111444452B (en) * | 2020-02-21 | 2023-06-23 | 广州杰赛科技股份有限公司 | Webpage conversion method and device and storage medium |
CN114118007B (en) * | 2021-12-02 | 2022-07-08 | 江苏中威科技软件系统有限公司 | Method for converting format data stream file into OFD file |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2017002130A1 (en) * | 2015-07-01 | 2017-01-05 | Tnq Books And Journals Private Limited | Transformation of marked-up content to a reversible file format for automated browser based pagination |
2018
- 2018-03-20: EP application EP18776286.9A, published as EP3602352A4 (en); status: not active, ceased
- 2018-03-20: WO application PCT/IN2018/050156, published as WO2018179002A1 (en); status: unknown
Also Published As
Publication number | Publication date |
---|---|
EP3602352A4 (en) | 2020-10-28 |
WO2018179002A1 (en) | 2018-10-04 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US10318614B2 (en) | Transformation of marked-up content into a file format that enables automated browser based pagination | |
US8819028B2 (en) | System and method for web content extraction | |
US10289649B2 (en) | Webpage advertisement interception method, device and browser | |
EP2399234B1 (en) | Font handling for viewing documents on the web | |
US20160117412A1 (en) | Recursive extraction and narration of nested tables | |
CN108710490B (en) | Method and device for editing Web page | |
RU2579888C2 (en) | Universal presentation of text to support various formats of documents and text subsystem | |
EP3602352A1 (en) | Transformation of marked-up content into a file format that enables automated browser based pagination | |
CN103970750A (en) | Method and device for generating HTML (Hypertext Markup Language) web pages | |
US9535880B2 (en) | Method and apparatus for preserving fidelity of bounded rich text appearance by maintaining reflow when converting between interactive and flat documents across different environments | |
CN111831384A (en) | Language switching method and device, equipment and storage medium | |
KR102574306B1 (en) | dynamic typesetting | |
CN104281589A (en) | Mathematical formula searching method and device | |
CN112527291A (en) | Webpage generation method and device, electronic equipment and storage medium | |
CN112463152A (en) | Webpage adaptation method and device based on AST | |
Sikos | Web Standards: Mastering HTML5, CSS3, and XML | |
US10157238B2 (en) | Transformation of marked-up content to a reversible file format for automated browser based pagination | |
US20180246857A1 (en) | Markup code generator | |
CN114625996A (en) | Webpage content paging method and device, electronic equipment and readable storage medium | |
Turčić et al. | Dynamic mathematical layout in e-books | |
Rahman | Jump Start Bootstrap: Get Up to Speed with Bootstrap in a Weekend | |
CN104216868A (en) | Adaptation method and device for document display format | |
US9594737B2 (en) | Natural language-aided hypertext document authoring | |
US9223762B2 (en) | Encoding information into text for visual representation | |
US9984053B2 (en) | Replicating the appearance of typographical attributes by adjusting letter spacing of glyphs in digital publications | |
Legal Events
Code | Title | Description |
---|---|---|
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE |
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE |
17P | Request for examination filed | Effective date: 20191028 |
AK | Designated contracting states | Kind code of ref document: A1 Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
AX | Request for extension of the european patent | Extension state: BA ME |
DAV | Request for validation of the european patent (deleted) | |
DAX | Request for extension of the european patent (deleted) | |
REG | Reference to a national code | Ref country code: DE Ref legal event code: R079 Free format text: PREVIOUS MAIN CLASS: G06F0017300000 Ipc: G06F0016840000 |
A4 | Supplementary search report drawn up and despatched | Effective date: 20200925 |
RIC1 | Information provided on ipc code assigned before grant | Ipc: G06F 40/14 20200101ALI20200921BHEP Ipc: G06F 40/114 20200101ALI20200921BHEP Ipc: G06F 16/957 20190101ALI20200921BHEP Ipc: G06F 40/154 20200101ALI20200921BHEP Ipc: G06F 16/84 20190101AFI20200921BHEP |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: EXAMINATION IS IN PROGRESS |
17Q | First examination report despatched | Effective date: 20220215 |
REG | Reference to a national code | Ref country code: DE Ref legal event code: R003 |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: THE APPLICATION HAS BEEN REFUSED |
18R | Application refused | Effective date: 20231215 |